Tuesday, November 08, 2011

Data-Oriented Design : look at trends and values

Quite a good post from Niklas Frykholm. Not unexpected, as he's a good gamedev from what I've read:

http://bitsquid.blogspot.com/2011/11/example-in-data-oriented-design-sound.html

But he does skip the data-oriented design, which is a shame. I see this a lot in data-oriented design posts: the poster writes about how to remove the bad C++ stuff that menaces the code with potential cache misses, which is definitely important, but then doesn't extend the work into data-oriented design.

In Niklas' post, he makes the layout of the data much better for the hardware, which can be seen as data-oriented design, but data-oriented design goes further than this and is about looking at the actual data, not just the schema.

One indicator of this is the words "Say you have" in front of "512 sounds". This may sound trivial, but the point of data-oriented design is that it isn't architecture astronaut programming; it's not about maybes, it's about facts. You can only get so far with good guesses.

For example, he rightly says that you might need to set MAX_INSTANCE_PARAMETERS pretty high, but that is only true if you don't know what you're coding for. It should be mentioned that if the code is middleware, then data-oriented design is hard, as there is little knowledge about the data. It's difficult to nail down requirements at the best of times, but when you're not even involved in the final utilisation of a library, you can't really make any good guesses about what shape the data will be in.

To turn this into a data-oriented post, we need to put qualifiers around it. How many sounds will there ever be? 447 at once? With up to 11 parameters? From a set of 315 different parameters? Also, there is no mention of how the parameters are used. You can't do a data flow analysis if you don't know when or where the data is flowing.
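Numbers like these have to come from somewhere, and the usual source is the running game. As a rough sketch of how you might collect them, assuming you can call a hook once per frame, something like the following would do. The SoundUsageStats type and its field names are made up for illustration and are not from Niklas' post.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical instrumentation: records the high-water marks that answer
// "how many sounds at once?" and "how many parameters on one sound?".
struct SoundUsageStats {
    std::size_t peak_live_sounds = 0;
    std::size_t peak_parameters_per_sound = 0;

    // Call once per frame with that frame's observed values.
    void record_frame(std::size_t live_sounds,
                      std::size_t max_parameters_on_any_sound) {
        peak_live_sounds = std::max(peak_live_sounds, live_sounds);
        peak_parameters_per_sound =
            std::max(peak_parameters_per_sound, max_parameters_on_any_sound);
    }
};
```

Run the game through its worst-case levels, dump the peaks, and you have facts to size the pools with instead of guesses.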

Let's assume that the parameters are used to set up the sound, and pulled in by a function that chooses a wav based on the material and the weapon, and only then pulls in the force parameter for volume and pitch tuning. The sound instances that have weapon or material hashes outnumber the ones that don't, so there's a good reason to include them as first-class members of the structure.

So, we know we always want to access the weapon and material at the same time, which means that the weapon and material can be combined into one hash, the wav selector hash. Now we have a statically allocated array pool of 450 sound instances, each with space for an index into the linked parameter pool plus one free hash that represents the material-weapon combo, saving us two parameters and the overhead of another cache-line load.
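To make that concrete, here is a minimal sketch of what the layout could look like. Every name in it (SoundInstance, SoundPool, wav_selector_hash, kNoParameters) is my own invention, the hash combiner is an arbitrary choice, and the 16-bit parameter index is an assumption about how big the parameter pool gets. It is one possible shape under the numbers assumed above, not Niklas' code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sized from the assumed peak of 447 live sounds, rounded up a little.
constexpr std::size_t kMaxSoundInstances = 450;
// Sentinel meaning "no entry in the linked parameter pool" (assumption).
constexpr std::uint16_t kNoParameters = 0xFFFF;

// Fold weapon and material into one key, since they are always read
// together when choosing a wav. The mixing constant is arbitrary; any
// reasonable hash combiner would do.
constexpr std::uint32_t wav_selector_hash(std::uint32_t weapon,
                                          std::uint32_t material) {
    return weapon ^ (material + 0x9e3779b9u + (weapon << 6) + (weapon >> 2));
}

struct SoundInstance {
    std::uint32_t wav_selector;     // combined weapon + material hash
    std::uint16_t first_parameter;  // index into the linked parameter pool
};

// Statically sized pool: no allocation, no per-instance parameter table.
struct SoundPool {
    std::array<SoundInstance, kMaxSoundInstances> instances{};
    std::size_t count = 0;

    // Adds an instance and returns its index in the pool.
    std::size_t add(std::uint32_t weapon, std::uint32_t material) {
        assert(count < kMaxSoundInstances && "pool was sized from measured data");
        instances[count] =
            SoundInstance{wav_selector_hash(weapon, material), kNoParameters};
        return count++;
    }
};
```

The wav lookup only touches wav_selector, so the common path never follows first_parameter into the parameter pool; that only happens later, when the force parameter is needed for volume and pitch.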

Now that's data-oriented design. We took what we knew about the data, including domain knowledge, and turned out a solution that takes advantage of facts that the compiler could not guess at.