Saturday, December 04, 2010

Data is more important than code.

In summary, Dino Dini's post on data-oriented-development has me worried that even old school developers aren't aware of the problems of putting code ahead of data. On the one hand he's quite fluent in hardware appreciation, and on the other he thinks it's a good idea to cache the length of a vector. I've not yet seen a performance-oriented reason to do that on modern machines, though I have done it recently on a very small processor. This might be the reason he's leaning against the idea of data oriented development. I don't know, but I can only assume he hasn't had to do much highly cache-sensitive work, or much work on in-order processors.
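For anyone unsure what "caching the length of a vector" means in practice, here is a minimal sketch. The function names sum_uncached and sum_cached are mine, made up for illustration, and this is not code from Dino Dini's post. Both functions return the same result; the only difference is whether size() is read once or in every loop condition. The sketch measures nothing, and whether the two differ in speed at all depends on the compiler and the target processor.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: size() is called in the loop condition each time round.
int sum_uncached(const std::vector<int>& values) {
    int total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        total += values[i];
    }
    return total;
}

// Hypothetical example: size() is read once into a local, i.e. the length
// is "cached" before the loop starts.
int sum_cached(const std::vector<int>& values) {
    int total = 0;
    const std::size_t count = values.size();
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
```

Calling either function on a vector holding 1, 2, 3, 4 returns 10. The point of contention is only which one is worth writing by hand.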

He mentions that both John Carmack and Mike Acton are trying to promulgate the destruction of abstraction through Data Oriented Design, and whether or not he is right about those particular cases, the fact remains that abstraction is not the enemy of data-oriented-development. The real enemy of data-oriented-development is data-driven-control-flow development, also known as object-oriented-development.

Dino Dini is right that the DOD approach is as old as the hills, but it's not the "old as the hills" he mentions being used to: not the days of the Spectrum, Amiga, ST or Megadrive, where the CPU was strong but did not far outclass the memory it worked with, and saving instructions would save you cycles pretty much no matter how you saved them. No, it's a lot older than that. Data oriented development has its strongest roots in a time when the gap between memory bandwidth and CPU power was the same as it is now, namely when the memory in question was tape storage in giant cabinets, and the CPUs had local memory (similar in scope to the cache of modern architectures). We had a blip of OO-friendly time in the 90s when memory was getting big enough to hold all our working data AND our CPUs could get hold of it fast enough to work on it. But either side of that blip we've either been trying to read stuff out of slow RAM into fast cache to use on our super CPUs, or we've been loading off slow disk or tape into our tiny RAMs to use on our similarly speedy CPUs.

So what did we do back then? Back when data was on petrifyingly slow media such as magnetic tape or, worse, punch cards? We processed things as streams. Which, at the heart of it, is what DOD is all about. No more random access to code by way of reading some data. No more requesting random data or dereferencing multiple times just to get to the one thing you want to work on.
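To make the contrast concrete, here is a rough sketch of the two styles. All the names in it (Entity, Mover, Movers, update_objects, update_stream) are hypothetical, invented for this illustration and not taken from any engine or from Dino Dini's post. The first half reaches each object through a pointer and picks its code through a virtual call, so the data decides which code runs. The second half keeps each field in one contiguous array and walks it front to back, like reading a tape.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Object-oriented layout (hypothetical): every entity is its own heap
// allocation, and update() is selected through a virtual call.
struct Entity {
    virtual ~Entity() = default;
    virtual void update(float dt) = 0;
};

struct Mover : Entity {
    float position = 0.0f;
    float velocity = 0.0f;
    Mover(float p, float v) : position(p), velocity(v) {}
    void update(float dt) override { position += velocity * dt; }
};

void update_objects(std::vector<std::unique_ptr<Entity>>& entities, float dt) {
    for (auto& e : entities) {
        e->update(dt);  // follow the pointer, then look up the function to call
    }
}

// Stream layout (hypothetical): one contiguous array per field.
// Assumes position and velocity always have the same length.
struct Movers {
    std::vector<float> position;
    std::vector<float> velocity;
};

void update_stream(Movers& m, float dt) {
    const std::size_t n = m.position.size();
    for (std::size_t i = 0; i < n; ++i) {
        m.position[i] += m.velocity[i] * dt;  // sequential reads and writes only
    }
}
```

Both versions compute the same positions. What changes is the memory access pattern: the stream version touches memory in order and never has to read a piece of data to find out which code to run next. This sketch only shows the shape of the two approaches; it doesn't measure anything.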

I'm having to cut this post short, but I hope that's at least some insight into why many advocates of DOD might sound mad, but actually, they're just trying to make us think.