Showing posts with label performance. Show all posts

Monday, 24 January 2011

Bad code go away now

Some days, I get so wrapped up in Moviestorm as a movie-making tool that I forget it's a piece of software made out of hundreds of thousands of lines of hand-written code. And even once we've got those lines of code to actually do what we want them to do, that's not the same thing as making them do it efficiently. This cartoon from xkcd explains why.



So every so often, we go through the code and see if we can figure out what it's actually doing. We find loads of places where it's taking two steps forward and one step back, and try to see if we can make it just take one step forward instead of doing the hokey-cokey all over the place. Here's an email Julian sent me at the end of last week which will give you an insight into what this actually entails.
Well, I've had fun this week. I did some profiling and found that there were some very odd things going on. Why, for example, were there over 160,000 bounding boxes in a particular - fairly simple - scene? Why did the performance problems and crashes only happen when I switched views? And the like...

You will be glad to know that I have answers to both of these and more. A few accidents, going back to last summer, resulted in some innocuous code being checked in. Innocent it looked. But it had the side-effect of adding multiple copies of objects to the scene when it was loaded or changed view. If you stayed in the set workshop though, you'd never see it. And in fact, I found two completely different areas of code where this was happening, leading to a seriously exponential over-allocation, gobbling of RAM, and also CPU cycles. I have removed the first offending item as it was an accident; and I have added extra bullet-proofing to stop the second happening.

Thirdly, those 160,000 boxes came about as a result of some debug code that was used to test our snapping sockets. When you stacked objects, the collision detection system got a bit confused trying to partition things and got stuck in recursive hell. The code was unused anyway, and I have removed it.

Lastly, I have made further modifications - I have speeded up some of the critical loops in the code, and also cut out a bunch of redundant work. The result is that movies load a bit faster, take significantly less memory, and render faster. We think that the performance reduction happened in the summer for release 1.4.1 - we had a few reports of sluggishness. If so, we should be nimbler now than we were then. You should see the results for the next release of the product.

Toodle pip!
If that made no sense, here's a translation for non-programmers. "Unused code made it go slow when you did stuff. Took out kludges & spaghetti and made it better."

Go Jules!

Thursday, 24 June 2010

A handy performance tip from Overman

Here's a little twitter exchange between myself and Phil Rice earlier:

zsoverman: Had to restart master render of First today. 200% speed boost after blowing dust off heat sink. (!) Sony Vegas says 1 hr to go... tick tock

mattkelland: @zsoverman 200%? blimey!

zsoverman: @MattKelland I know, right? Dang thing was crawling, crashing... a little compressed air and voila!

So, give it a go. Pfft!

(Strangely, I did the same thing to my air conditioning last week, and that works much better now too.)

Thursday, 12 November 2009

Take thirty-seven... action!

Over in the forums, there have been some discussions about performance issues, particularly why things sometimes run slowly with relatively simple sets, no mods, few characters, etc. Julian posted this detailed reply, which we figured was worth reposting in its entirety.

As you may know, it's my job to keep tabs on Moviestorm performance. My tests this week have been on what we call "retakes" - this is the process by which we turn the commands that you give our objects into actual chains of animations, and it is quite a time-consuming process. For example, consider a walk - it consists of many different snippets of animations stitched together in as seamless a fashion as possible. When you move a target point, the walk gets recalculated. In fact during a retake everything is recalculated, and this may seem like overkill; if you only change something near the end, why do you have to process the entire scene? Well the answer is that determining which bits of the scene are (un)affected by which operations can be a more time-consuming process than recalculating the lot anyway! It can also be more error-prone, and more complex (hence bug-prone too). So for now, when we retake (and we retake a lot), we do the entire scene, from time = 0 to the end of the scene.

In order to mitigate the cost of the retake, we've put some effort into making sure we are reducing the amount of recalculation done. This is done "locally" to each activity (rather than globally as above). One of the areas where this is harder to optimise is when the activity depends on some resource - which could be eg a texture or a data file lurking on your hard drive. In order to be robust, we have to put checks in to see if these resources have changed - so if you change an animation (directly or indirectly), that change propagates into your scene. Large (ie long) scenes result in lots of resources, and hence Moviestorm spends a lot of time in the retake gathering info about resources. At the moment, for reasons I won't bore you with the details of, this is more costly than we'd like. In essence there are multiple file systems in place, and they all have their costs. We intend to fix the inefficiencies as soon as we can; but they are actually quite insidious because we do a lot of file-related activity and so the changes need to be made in lots of places. But the good news is that we are on the case, and with v1.2 due out soon, we'll have time to take stock of these issues, hopefully to come in a subsequent release asap.

Wednesday, 14 October 2009

Here Come The Girls!

You wanted a version of the crowd scene with the ladies? Here you go.


The sweet thing about machinima is that this only took a couple of hours to put together. All I did was start up my movie, save it as a new movie, and go through the cast list one by one, converting each of the 60-odd characters into their nearest female equivalent. Then, just for the hell of it, I made some small changes to the set and the lighting, adjusted the colour of the intro text, and hit render. Lovely.

I also did some quick calculations as to how long it would take me to get those shots in real life. Say, a few hours booking the hall, and sorting out the people to do the shoot, making sure they knew when to arrive. Another hour or so making sure the more outlandish costumes were ready. A crew of three or four would spend a couple of hours putting out chairs and getting the room ready before the cast arrive. That's about 13 or 14 man-hours already. Then the cast arrive, we get them changed, run them through what's needed, and shoot it a few times. Say, if we're really lucky, an hour each for the male and female versions. That's another 125 man-hours for the cast, and another 10 or so for the crew. All in all, about 140 man-hours to get those two shots filmed, then we take the footage home and edit it. I'd have to put in, say, about 16 hours.

And that, of course, assumes everything went swimmingly. I didn't even think about insurance costs, actor releases or other legal stuff. And what's the likelihood of getting 60 people to perform flawlessly in an hour, even if all they have to do is walk onto a stage?

Doing it in machinima took me about 12 hours for the first one, and another 2 for the second one. That's about the same effort on my side as doing it for real, but only 10% of the total number of man-hours and a fraction of the hassle.

Tuesday, 13 October 2009

C'mon, everybody!

We've gone on and on about the performance improvements in Moviestorm 1.1.7 for the last few months, and we've quoted all sorts of numbers to show you how clever we are. What you really want to know, though, is what this means in terms of the movies you can make and how easy it is to use Moviestorm. So, in my usual destructive manner, I decided to push Moviestorm to its limits and see what it could do. It was a simple test plan. Put as many characters as possible on the set, all different (in order to maximise the number of textures being used), animate them all at once, and see how big a crowd I could create before Moviestorm exploded. For reference, in the previous version, I had difficulty with more than ten characters, and twenty was out of the question.

Try this, then...



And, just for extra sweetness, that movie loads from the desktop into the Director's View in about 35 seconds.

To be fair, some disclaimers. It wasn't all fun'n'games. I did have to save frequently with this number of characters on set. Once I had over 50 characters in there, it ran out of memory after about 10-15 minutes of work, especially if it involved scrubbing around in Director's View, changing the duration of animations, or dragging things on the timeline. After 60, it did get quite painful, and was certainly more work than fun - though maybe that was just because I'd been repeatedly adding a character, choreographing them, and rendering the result for more hours than I could count. I also did resort to the trick of switching shaders off while I worked, and then back on again for rendering. I could probably have put a few more on, but by this point, I decided I'd reached Moviestorm's practical limitations.

However, dealing with ten or twenty characters at a time now seems trivial. I don't have to worry about whether putting an extra character or two in the background will cause Moviestorm to choke on its breakfast. I can do a small crowd if I need to - maybe not an entire stadium, but certainly a pub crowd.

It also came as quite a revelation to see just how many costumes we have now. There are over 60 unique outfits on the screen, and that wasn't all of them. Many of them are customisable, which means there are even more possibilities than you can see on screen. And that's just the guys!

It'll certainly be interesting to see what possibilities 1.1.7 opens up for you - we're looking forward to seeing your next crop of movies!

Tuesday, 15 September 2009

Moviestorm 1.1.7

We're currently getting ready for the release of Moviestorm version 1.1.7. This is quite a major upgrade, with a lot of stuff going on under the hood. In many ways, it's a preparation for an even bigger upgrade to be shipped in the fall, Moviestorm 1.2.

The most obvious thing you'll see is the new user interface, designed by no less than our multi-talented CEO, Jeff "Babyface" Zie. Out with the old icons, and in with a whole bunch of stylish black and white ones. Moviestorm feels like a completely new app, more professional, and more finished.

You'll also notice much better performance. As we've been promising for a while, we've massively speeded up the load times, and Moviestorm also runs at a considerably better framerate on some hardware.

More subtly, we've made huge changes to the underlying animation system. This should eliminate the ugly "popping" that you get when blending animations, and result in much smoother character motion. We've also got new walk code in there, which gives much more natural and controllable movement around the set. We'll tell you more about that later in the week, as it's a huge, huge change.

We'll also post some more about some of the other new features over the next few days, including the new catalog and asset structure.

1.1.7 will be shipped in two stages. In a week or so, maybe less if all goes well, we'll let the pioneers get their hands on it and give us early feedback. We've been testing the bejesus out of it for a while, but there are still only a few of us, and it's a surefire bet that you lot will find a whole mess of stuff we missed. This may not contain absolutely everything in the final release, but it'll be close, and it'll include the most important things. Then, when we've fixed whatever howlers you find (or taken out the bits we can't fix fast enough), we'll put it out to everyone else, probably at the very end of September or the beginning of October.

Tuesday, 21 July 2009

Run, Stormer, Run!

With the release of the Special Effects pack imminent, and the code support for the next few packs already complete (music and hair), the engineering team are now hammering away at Moviestorm 1.1.7.

The three key issues we're tackling for Moviestorm 1.1.7 are performance, performance, and performance. In other words, you probably won't see any major new features, but you'll see Moviestorm running noticeably faster, particularly in terms of load times. We're now at the stage where we're trimming fractions of a second off things, and while progress feels painfully slow at times, the end result is already feeling very different.

August is, of course, holiday season, so we're aiming to ship 1.1.7 some time in September.

Thursday, 14 May 2009

Faster, Pussycat! Kill! Kill!

One of the unending background tasks round here is to make Moviestorm run faster. I managed to persuade Julian to take a few minutes out to tell you what he's been doing. If you get bogged down in the techspeak, just skip to the last paragraph. ;)

Recently, I have used the small amounts of time not devoted to watching Russ Meyer films to attempt to make Moviestorm run faster. I don't even want to count the number of changes I've made, but it's quite a number.

One of the things about performance improvement is that not everything you think makes things faster actually does. In a previous job I was reliably informed by a Games Industry "Guru" that "I shouldn't need to profile my code because I should know where the bottlenecks are". Let's leave aside the distinct possibility that this was to avoid him having to buy me a profiler. The real moral of the tale is "what you assume makes an ASS out of U and ME". I'm frequently wrong about such things - most devs I know are, to err is human etc, and in a complex app with threads and the like such as Moviestorm, it is very easy for us mortal non-Gurus to miss the wrong end of the wrong stick.

So my life has involved the use of Java profilers for a 30,000ft view of what's going on, and an OpenGL debugger to see what the individual molecules are up to within our favourite application. First off, I've identified some of the critical code, the 10% that runs 90% of the time. (I say "some of" because Moviestorm is an open system, and adding props and characters with new materials and activities can drastically change the balance of the call state.) Once I know what they are, there are a number of strategies I've employed to speed things up:

* OpenGL state management. I've written a GL wrapper that tracks state and rejects redundant changes. Using my magic tools, I've seen that the graphics card on the test machine was hardly breaking a sweat on a scene with 7 characters. Therefore I don't expect to see this optimisation making a big impact YET. I guess it's good news - we can do lots more pretties without hurting the frame rate provided they don't cost a lot of CPU to set up.

* Workspace variables in bottleneck systems. You'll see a lot of maths code using static variables named things like vWork1 to avoid the cost of doing eg

Vector3f pos = new Vector3f();

Yes, the code is less readable, but that's optimisation for ya (in more complex cases I've preserved the semantics so that instead of writing

Vector3f pos = new Vector3f();

I write

Vector3f pos = vWork1;

and hopefully the JIT can optimise, but who knows?)

Why is this inefficient? Firstly, the memory allocation takes time; there are also THREE constructors called when you construct a Vector3f for instance - Vector3f, Tuple3f and Object; plus there is time to zero the x/y/z fields AND there is a hit on the GC side because lots of small short-lived objects can be a nightmare. In frequently-called code, this can accumulate and deferring the calculations to pre-allocated workspace saves these overheads.

As a general point of advice, it's only worth doing this if you know the code is causing a performance problem - ie you've profiled it in some way (either using NetBeans' profiler or timing method calls manually). Premature optimisation is the root of all evil, remember! (Donald Knuth, Mr Computer Science, circa 197x).

* In-place calculations. In some functions, the results are allocated dynamically (which exacerbates the problems mentioned above). So I've written versions that use pre-allocated arguments, and inlined the relevant code where appropriate.

* Removed redundant field settings in constructors. Eg in Vector3f(), the fields were set to 0 even though Java guarantees to have done this already.

* Caching of invariant state. Many bits of code do searches for things which never change, wasting valuable cycles.

* Lazy evaluation of dynamic state. Some states change far less often than they are polled, for instance the world transform of a scene object. This gives us an opportunity to only recalculate such state values when we really need to (in the SceneObject's case, when its local transform is modified in some way).

* Map iteration. I've changed some iterations over hash maps to use the entry sets rather than the key sets as then the values and keys come for free (as opposed to doing a lookup for the value).

* Many of our skeleton operations are faster now that I've speeded up traversal of bones, and eliminated a number of tests that only need to be performed once.
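For the curious, here is a rough sketch of how three of the CPU-side strategies in the list above might look in Java: a pre-allocated workspace vector, lazy evaluation behind a dirty flag, and map iteration over entry sets. This is not Moviestorm's code. The names Vec3, LazyNode, CpuSketch, parentOffset and recomputeCount are all invented for illustration, and only vWork1 is borrowed from the text. The "world transform" is reduced to a simple position offset to keep things short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CpuSketch {
    /** Sums map values via entrySet(), so no extra get(key) lookup is made per element. */
    static float totalWeight(Map<String, Float> weights) {
        float total = 0f;
        for (Map.Entry<String, Float> e : weights.entrySet()) {
            total += e.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        LazyNode a = new LazyNode();
        LazyNode b = new LazyNode();
        a.setLocal(4f, 0f, 0f);
        b.setLocal(1f, 0f, 0f);
        a.getWorld();
        a.getWorld();   // second poll reuses the cached value
        System.out.println("recomputes: " + a.recomputeCount);
        System.out.println("distance squared: " + a.distanceSquaredTo(b));

        Map<String, Float> weights = new LinkedHashMap<>();
        weights.put("hip", 1.5f);
        weights.put("knee", 2.5f);
        System.out.println("total weight: " + totalWeight(weights));
    }
}

/** Hypothetical stand-in for a 3-component vector; not Moviestorm's real Vector3f. */
final class Vec3 {
    float x, y, z;   // Java already zeroes these, so no explicit "= 0" is needed

    Vec3 set(float x, float y, float z) {
        this.x = x;
        this.y = y;
        this.z = z;
        return this;
    }
}

/** Hypothetical scene node: lazy world position plus a reused scratch vector. */
final class LazyNode {
    // Pre-allocated workspace, reused on every call instead of "new Vec3()".
    // Assumes single-threaded use and that no caller keeps a reference to it.
    private static final Vec3 vWork1 = new Vec3();

    private final Vec3 local = new Vec3();
    private final Vec3 parentOffset = new Vec3();
    private final Vec3 world = new Vec3();
    private boolean worldDirty = true;
    int recomputeCount = 0;   // instrumentation for this example only

    void setLocal(float x, float y, float z) {
        local.set(x, y, z);
        worldDirty = true;    // mark stale; the real work waits until someone asks
    }

    void setParentOffset(float x, float y, float z) {
        parentOffset.set(x, y, z);
        worldDirty = true;
    }

    /** Recomputes the cached world position only if an input changed since the last call. */
    Vec3 getWorld() {
        if (worldDirty) {
            world.set(local.x + parentOffset.x, local.y + parentOffset.y, local.z + parentOffset.z);
            worldDirty = false;
            recomputeCount++;
        }
        return world;
    }

    /** Squared distance to another node, computed in the shared scratch vector. */
    float distanceSquaredTo(LazyNode other) {
        Vec3 a = getWorld();
        Vec3 b = other.getWorld();
        Vec3 d = vWork1.set(a.x - b.x, a.y - b.y, a.z - b.z);
        return d.x * d.x + d.y * d.y + d.z * d.z;
    }
}
```

The catch with a shared static workspace is that it is only safe while one thread uses it at a time and nobody hangs on to the reference, which is part of why the advice above is to do this only where a profiler says it matters.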

The first item aside, these are all on the CPU side of things. I'm currently working on some graphics optimisation. I've changed the way our primitives sort and collected them up into batches. Batches require only one setup and teardown, so we win when the batches are bigger than 1 prim, and never lose (the overhead for batch generation being small). It's going well though it has required a big overhaul of the render code.
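As a very rough illustration of the two graphics-side ideas mentioned above (rejecting redundant state changes, and sorting primitives into batches that share one setup), here is a minimal Java sketch. It makes no real OpenGL calls. StateFilter, Prim, BatchSketch, the "material" state name and the forwarded counter are all hypothetical, and grouping by a material id is an assumption about what a batch might share, not a description of the actual render code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BatchSketch {
    /** Sorts primitives by material and groups each run of equal materials into one batch. */
    static List<List<Prim>> batchByMaterial(List<Prim> prims) {
        List<Prim> sorted = new ArrayList<>(prims);
        sorted.sort(Comparator.comparingInt(p -> p.materialId));
        List<List<Prim>> batches = new ArrayList<>();
        List<Prim> open = null;
        for (Prim p : sorted) {
            if (open == null || open.get(0).materialId != p.materialId) {
                open = new ArrayList<>();   // material changed: start a new batch
                batches.add(open);
            }
            open.add(p);
        }
        return batches;
    }

    /** "Draws" every batch, setting the material once per batch rather than once per prim. */
    static int render(List<Prim> prims, StateFilter state) {
        int drawn = 0;
        for (List<Prim> batch : batchByMaterial(prims)) {
            state.set("material", batch.get(0).materialId);   // one setup per batch
            drawn += batch.size();                            // placeholder for the draw calls
        }
        return drawn;
    }

    public static void main(String[] args) {
        List<Prim> prims = Arrays.asList(
                new Prim(2, "chair"), new Prim(1, "table"),
                new Prim(2, "stool"), new Prim(1, "desk"));
        StateFilter state = new StateFilter();
        int drawn = render(prims, state);
        System.out.println(drawn + " prims drawn with " + state.forwarded + " material changes");
    }
}

/** Hypothetical primitive: just a material id and a name. */
final class Prim {
    final int materialId;
    final String name;

    Prim(int materialId, String name) {
        this.materialId = materialId;
        this.name = name;
    }
}

/** Hypothetical state tracker; a real wrapper would forward accepted changes to the driver. */
final class StateFilter {
    private final Map<String, Integer> current = new HashMap<>();
    int forwarded = 0;   // how many changes would actually have reached the driver

    /** Returns true if the change was new and forwarded, false if it was redundant. */
    boolean set(String stateName, int value) {
        Integer old = current.put(stateName, value);
        if (old != null && old == value) {
            return false;   // already in this state: skip the driver call
        }
        forwarded++;        // placeholder for the real driver call
        return true;
    }
}
```

With batches of one primitive each, this does the same number of setups as drawing unsorted, which matches the point above that batching wins when batches are bigger than one prim and otherwise costs only the small overhead of building them.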

OK, so far, so much techno-babble. How has this improved speed? Well, on our test machine, we were rendering at 10.6fps on my test scene with 7 characters, lights and quite a few props. After 2 weeks of optimisation, the same scene ran at 32fps. Cool! But the machine is a monster, and it remains a priority to get the speed up on lesser beasts. Optimisation is a war of attrition: lots of small changes can be hardly noticeable; but put one more in, and suddenly things start to flow. Then again, when one bottleneck is removed, another will surely rise to take its place. Slowly slowly catchee pussycat (oh, how I love mixing antonymic metaphors). There are a number of areas we are aware of where we can make yet more speed gains; my notional target is 20-25fps for a moderate scene on an average PC / Mac, and I'm optimistic we can hit this. The only way is Up!


So there you have it. With luck, I'll get Dave to tell you something next week about the work he's been doing on the launcher, which is already giving us much faster load times on our test machines. We've obviously got to do a shedload of testing on the new code before we ship it to you, as there's always a risk that we've broken something major, so we've got this pencilled in for the release after next, assuming all goes well.

Oh, and please don't ask me what any of this means. All I know is Moviestorm runs a lot faster now.