Monday, December 06, 2010

All your bases

when you commonly work in base ten (that is 9+1, not F+1 or 1+1), you can sometimes miss interesting artefacts of numbers or dismiss them as being just facts or oddities.

One thing that's interested me over the years was the childhood realisation that the nine times table "went up" one side, and "came down" the other.

It wasn't until I grew up and started trying to think about problems from different points of view that I stumbled across the fact that this situation is true for all number bases.

so, in hexadecimal, if you take the nine as being ten-1, the F times table looks much like the nine times table you had as a kid.

1 x F = F
2 x F = 1E
3 x F = 2D
4 x F = 3C
5 x F = 4B
6 x F = 5A
7 x F = 69
8 x F = 78
9 x F = 87
A x F = 96
B x F = A5
C x F = B4
D x F = C3
E x F = D2
F x F = E1
10 x F = F0

This can be quite a revelation for some people. Funnily enough, it works right down to binary.

1 x 1 = 1
10 x 1 = 10

okay, that was a poor maths joke, so now I'll just get my coat.

Saturday, December 04, 2010

Data is more important than code.

In summary, Dino Dini's post on data-oriented-development has me worried that even old school developers aren't aware of the problems of putting code ahead of data. On the one hand he's quite fluent in hardware appreciation, and on the other he's thinking that it's a good idea to cache a length of a vector. I've not yet seen a performance oriented reason to do that on modern machines, but I have done it recently on a very small processor. This might be the reason he's leaning against the idea of data oriented development. I don't know, but I can only assume he hasn't had to do too much highly cache sensitive work, or much work on in-order processors.

He mentions that both John Carmack and Mike Acton are trying to promulgate the destruction of abstraction through Data Oriented Design, and whether or not he is right about those particular cases, the fact remains that abstraction is not the enemy of data-oriented-development. The real enemey of data-oriented-development is data-driven-control-flow development, also known as object-oriented-development.

Dino Dini is right in that the DOD approach is as old as the hills, but it's not the old as the hills that he mentions being used to, not the days of the Spectrum or Amiga or ST or Megadrive where the CPU was strong, but not far outclassing the memory it worked with, and saving instructions would save you cycles pretty much no matter how you saved it. No, it's a lot older than that. Data oriented development has it's strongest roots in a time when the memory bandwidth to cpu power had the same gap as it has now, namely when the memory in question was tape storage on giant cabinets, and the cpus has local memory (similar in scope to the cache of modern architectures). We had a blip of OO friendly time in the 90s when memory was getting big enough to hold all our working data AND our CPUs could get hold of it fast enough to work on it. But either side of that blip we've been either trying to read stuff out of slow ram into fast cache to use on our super CPUs, or we've been loading off slow disk or tape into our tiny rams to use on our simlarly speedy CPUs.

So what did we do back then? Back when data was on petrifyingly slow media such as megnetic tape or worse, punch cards? We processed things as steams. Which, at the heart of it, is what DOD is all about. No more random access to code by way of reading some data. No more requesting random data or dereferencing multiple times just to get to the one thing you want to work on.

I'm having to cut this post short, but I hope that's at least some insight into why many advocates of DOD might sound mad, but actually, they're just trying to make us think.

Tuesday, November 23, 2010

Pitfalls of Data-Oriented (DO) development

Object oriented development is ubiquitous in the games industry and it's quite hard to remove your brain from thinking along those lines if you've been doing games for a reasonable amount of time. I just read this post and thought about the problems I faced when I started doing DO. The whole entity ID problem that finally faded away with practice and the general "no, wait, XYZ said that was bad". I thought that it was about time to point out some of the problems I have with data-oriented development after I started doing it in my day job.

Firstly, data-oriented is different. This makes it harder for seniors to look at your code and appreciate that it even works. Sometimes you might get a "what is that? why did you do it like that? can't you just wrap it up in a class to make it easier to read?" These are disheartening to hear, especially from people you should be learning from.

Secondly, there's the stress of not being able to write data oriented. You still have to work daily with object oriented code bases, so if you're a good developer, one that work well with others, then you're constantly chomping at the bit to write it better, but hampered by style guides and already in place systems that do everything in an OO style.

Thirdly, there's the solitary feeling. It's not just a feeling of being alone with the concept of DO, but also the feeling that there might be groups or gatherings of others that do it the DO way, but you're not part of them, yet. It can be draining, and you may want to shout out about it, but no-one will have time to listen to a completely alien methodology when they're all concentrating on just getting stuff done. Which admittedly is your job too, so you have to suck it up.

Friday, November 19, 2010

Nuclear power vs renewables

Okay, so I've listened to both sides of the debate here, and to be honest, I'm sure both could have done a better job arguing their point.

Firstly, the anti nuclear pro renewable guy (Mark Z. Jacobson) probably should have gone into a lot more on the impact of the lag of installing nuclear power plants, this is a real threat, as opposed to the threat of terrorist activity. It's about the only thing that has any real hard hitting negative impact as far as I can see for nuclear approach.

Secondly the nuclear fella (Stewart Brand) seemed constrained artificially to try to solve the problem posed without extrapolating. Don't know why, but he's not pointing out that nuclear power provides a surplus opportunity.

This is the reason why I'm pro nuclear.

I believe that surplus energy provides the opportunity for good things to happen. If energy becomes cheap, then a lot of things become easier or possible. With cheap energy comes invention and innovation, which in turn are self propelling. I like things that are self propelling (see my work on taking advantage of innovation by random creation and critical selection) and with a world of cheap energy, creatives will have the opportunity to do more before the bills cause them to do a day job. Creatives are what keeps progress going, what keeps our next level of comfort approaching. Without surplus we're doomed to live a subtractive experience, cutting away at what is bad, not adding more good. That's not what I was brought up with, and it's worrying that my poor privileged brain might have to put up with it.

What if we had infinite computing power?

Just a little mind exercise, what do you think would happen if we found a trick and managed to create something like quantum computers that took virtually no power to operate and would run any program to completion as soon as it was run?

  • don't forget: with infinite CPU power, comes a massive reduction in clever coding. We can all write games in python / C# / natural language maybe even.
  • this isn't just about games, what about your daily commute, or your breakfast choices?
  • what about the rest of the world? How do you think it will affect the third world?

Wednesday, October 20, 2010

Is the future of graphics FPGAs or CPLDs?

Well, it is the ultimate conclusion for pixel/vertex/geometry shaders...

They provide a difficult to program interface, but provide a way to approach the fastest possible solution to any massively repeating logical manipulation (such as a shader).

I'm an advocate of turning your game code into streams of data, and this seems like a really good way of writing data transforms in a super efficient manner.

Thursday, October 07, 2010


A long time ago I wrote up a piece on how important it is to try to work faster.
Not smarter or cleaner, just faster.
The faster you go, the better you get as experience accumulates and decisions find their outcomes sooner and you learn more about why and how things happen.
The more work you do, the better you get at doing it.
The faster you get, the more work you can do, so you know you're going to get better quicker too, which in turn makes you faster again. Concentrating on being productive is something I tried to promote at Broadsword, but we were already scuppered by the time I really started pushing this so it fell a bit flat. What have I learned since? Anyone that doesn't embrace speed over cautiousness is a large lumbering beast that takes one hit to kill. Anyone that does embrace speed seems to do well even in a dangerously volatile market like games development.
Faster games are easier to debug.
When your game runs faster it means it's doing less, doing less means there's less code for bugs to exist in. But doing less is what coders like to do anyway, so why isn't there less code already?
Special cases make a game what it is.
My opinion is that all games are generic games with special cases. This is either obvious to you, or absurd, either way, it spells out what's really in our code bases: exceptions to the norm. So games that do stuff other than just sitting there blankly have the special case that they take input. The moment that happens, you have an exception that needs to be handled by some code, and therein lies the opportunity for bugs. The amount of work you do to implement your exceptions is your productivity metric. If one programmer builds a magnificent framework that lets him add features at a rate of one per minute, whereas another programmer doesn't and spends ten minutes per feature, we might assume that the first programmer was more productive. This false conclusion is perpetrated because we forget about the cost of building the framework, debugging the framework (as there is also a framework to hide bugs in), and also the time taken for another programmer to get used to the framework before adding features themselves.
Feature creep never goes where you expect it.
Another problem with a framework for solving exceptions is that the framework was probably built for the list of problems available at the time it was conceived, and sometimes for some that the programmer thought of while developing the framework too, but not for the unexpected exceptions. These unexpected exceptions are now just as hard to solve with code as they were before, but now they have to work alongside a lot of code not just some smaller, to the point, exception code that came about from the approach the second programmer took.
Code less to cause less code.
If you code less, it's easier to refactor. If you find you need to refactor, or in fact if you find that refactoring would solve some new exceptions without adding more code, then you've already won. But if you architect from the start, your invite performance loss of both your team and your game. If you prepare for the worst and write a brilliant all encompassing architecture, you're also in danger of hiding your bottlenecks. Imagine you had a load balancing system that automatically reduced asset quality, AI processing, physics model accuracy, all on the fly. If you have this, then it's great achievement, but you've probably just hidden all the stupid behind this barrier of guaranteed consistency. How will you know that AreaX is really badly put together if you can't tell that you're not seeing the highest level graphics because they never came online?
Fast games makes smarter coders.
Wisdom is the product of failure, observation and reason. You need to do things and fail to get wise. Good developers are wise from effort, and to expend effort, there must have been an opportunity to develop. The more time you get to try things, the wiser you can get. The faster your code compiles, runs, gets to the bit your testing (in other words, your iteration time), the more you can learn and the more you can do.
Wasted cycles are not wasted if they're spare.
In optimization there is the concept of wasted cycles, that is cycles of the CPU or GPU that it's wasting while waiting for data to load, or waiting for some comparison operation to figure out if it needs to branch. This is real waste as it's trying to do something and it can't because it's stalled. There is another kind of wasted cycles though, and it's the cycles spent waiting for the VSync or some other real-time event. These are not wasted though, they're spare.
Why the distinction? Surely cycles that aren't being productive are a waste of resources either way. The result is the same for the consumer, a game with the same quality graphics running at the same fps with the same features.
Recoverable time.
The reason lies in whether they are recoverable. If your code is running faster than your frame rate then you can run unit tests of game play in accelerated time, you can add in logging features without affecting frame rate, you can try out mediocre quality implementations of new ideas and see how they would affect the game. Also, normally, when a game is running within it's frame, it's easier to have more than one game running at once, and easier to run on battery powered devices like laptops or phones.
Battery use = efficiency of code.
Limited battery life affects us but so does download size or computer spec. If you are a web games portal, would you prefer the game that ran at 60fps on a low end machine and only had a 1mb download, or the larger 2mb game that only ran great on good hardware? It's obvious, you choose the one that gets players looking at your ads, so the smaller one that costs less in bandwidth and runs on more people's machines.
Time is money.
Even if web portals isn't your demographic, or battery life on phones isn't important to you, or whether you can run or develop the game on your mum's laptop she bought from PCWorld, or even if you don't care about logging or adding features, the simple fact is, if you plan to be fast, you'll get more done with your time.

Fitter, happier, more productive.

Tuesday, September 21, 2010

Compile Times - The Mangling

I was thinking about how we normally go about optimising our compile times and wonder if anyone's actually taken the crazy step towards trying to optimise the source files themselves...

If we want to be able to parse text files really quickly, why not rename all your game identifiers so they are much smaller. Not human readable, but really really short, but still unique identifiers (much like the mini links you get from sites like tinyurl, or

I can't imaging we'll end up any larger than a 32 bit value for each identifier even sticking to the simple rules ([_a-zA-Z][_a-zA-Z0-9]*) you get 13,252,491 identifiers.

I might give this a go on a large project and see if it can actually increase the compilation speed. I can't think why it would increase the linking speed as it's already in machine form by that point, but it might still be using mangled names rather than pure lookup stuff, so might do something... worth a try, especially on really old and trusted stuff that doesn't need to be legible any more. And, I suppose as long as there is a reverse operation, it could be used for every day coding as a pre-parse before loading into and saving files out from your editor.

Still, sounds nuts anyway.

Tuesday, July 13, 2010

Game Logic XIII: Components imply entities

At first, using a component based architecture feels awkward. I remember having issues when I started doing it many years ago. I wanted to communicate between components, but that was bad practice caught from doing so many years of Object Oriented development.

As my understanding of component based development grew, my opportunity to do it lessened, which was unfortunate as I feel that because of that, my appreciation never fully matured until last year when I started totally rethinking every technique I had used from the perspective of Data Oriented development.

I'd been stumbling along the ideas and benefits of Data Oriented without knowing they were related to each other while working on many different areas of engine development, and looking back I can see a few points that I could have made the leap, but didn't. But even then, after figuring it all out, I wouldn't have been able to make this latest leap.

From the beginning of this series of posts on data oriented development, I've been opting for a component-implies-entity approach to development, which is maintained by the components having primary-key like values to identify which entity they belong. Moving away from this would seem pointless, however there is a good reason to make one more step away from object orientation.

If components imply entities, does that make entities the cosmic base class of data oriented development? I think it does, and removing this crutch helps out in some odd ways.

Firstly, if you don't need an entity to identify your component, it becomes an entity in it's own right and can offer more flexibility in how it's used. Consider the humble path find. If your path finding nodes, the resultant chain of nodes that is, is allowed to exist outside of the initiator of the path request, then not only can the chain be used by the requester in the normal manner, but also used as a virtual signpost for groups of entities. Consider a path find that leaves the path find nodes hanging in space, waiting for other entities to pass by and tell them where to go if they happen to want to go in the same direction. Surely this would be quite a CPU saving if there was a common goal (which there often is).

Secondly, not using the entity as a basis gives you a better opportunity for runtime template objects. If an object is not tied to an entity, it can be used by multiple entities. Consider the component Weapon, with a instance for Handgun that has all the targeting techniques and rendering info attached, but that handgun can be used by many many entities. Using binding objects (which are your many-to-many tables of database science), lets you connect any amount of template objects to each other without actually making them coupled. Now your entities are instanced through couplings, but your code isn't coupled.

To this end, I'm giving up on calling the components of an entity in my new book components. I'm going to be calling them Elements, as any entity component should be a table row that binds an Element through existence, implying a component. The actual component might be a template object that has more information and doesn't need to be bound to the entity. Also, a component might not even belong to an entity, or might belong to multiple entities, or might be multiply owned by just one entity (consider the healing potion you have ten of in your inventory).

So there you have it. Component based architectures are great, but you can take another step and have element based architectures with bindings where necessary.

I believe that the dungeon siege component system had a form of this (through some default state template objects) but was still stuck to it's entities. I can't find the original documents that I read that stuff in so can't be sure, but I do remember it being very important thinky stuff at the time.

Wednesday, May 26, 2010

Content aware

Adobe's (kind of) technology for content aware fixes and adjustments gave me an idea that I really would like to see implemented in Tivo's and Sky+ boxes.

I was watching someone else watch a tennis match a moment ago and thought: "if he pauses that, that is just going to go on forever and ever", thinking that watching TV at work actually makes it last about ten times as long as you're never really watching it and you're not often able to either.

But then the thought struck me, if it's live, you could compress the amount of video you have to watch to catch up. Not in some 1.5x speed playback, but by having an actual interest aware culling and squashing system. Like content aware resizing, but for time.

Then, I thought, wouldn't creators of TV shows love it if their shows got cut in the right places, and the right bits cut out to make space for adverts? Probably, so there you go. We need content aware like timeline adjustments but for a simpler input stream, the importance value stream.

I'd like to have this on many you tube videos that I try to watch, but fear that many of them would never have a non-zero value in their meta data.

Once upon a compile

Once upon a time, I wrote a font renderer for OpenGL, last night in fact.
I was happy and done.
I had checked in my changes, and wanted to install the latest version of Ubuntu, so I went about switching over hard disks to get some data off my windows partition before I finally killed it dead by installing Ubuntu over it.

But, an evil witch was near and she had blanked my memory.

When I formatted the hard drive to copy stuff off, I was not aware of the fate that awaited me.
A number of minutes later, my snazzy SSD (which will soon be up on ebay) was sitting there filled with just over 10gb of data, safely stored now. Stored over what was my old installation of Ubuntu.

Now, I had taken precautions and copied all my home directory that I thought mattered onto my 1.5tb drive, so I didn't care about the fate of the ubuntu partition.

This was my downfall, as the witch had cast her spell before I'd even shutdown the linux OS.

As I rebooted onto the live CD, I was happily unaware of the fate before me. I installed and installed until finally I had code::blocks up and running, and rapid svn was downloading my source that had merely minutes before been uploaded. I installed SDL and g++ and prepared to shout tada, when:
compilation failed.
Can't find file to make FontRenderer.o went the output.
Code::blocks could not find the FontRenderer i had just written
I checked SVN once
I checked it twice
I'd forgotten (cursed to forget) to add the new files to the checkin
weep I did not, for a time long gone a wise man once said to me:
"that what yuv coded once, is thrice as easy the second time"
remembering that it was me that said it, i thought, "Well, I'm always right, so it must be true."
and so, in my lunchtime today, a beautiful font renderer was born again. in about 8 minutes.
and they all rendered happily ever after.

the end

Wednesday, May 19, 2010


I was told by someone, not sure who, but I think that person told me they were told by someone who "manned the scanners", that the full nude image of lenna was a fake.

Now, for a person who knows photoshop quite well, I was very prepared to take this as fact, and that was fine until I stumbled across this article because a work-mate showed me the "fake" that wasn't. Photoshop is powerful, and there's no reason why I wouldn't beleive it was a fake if told it was. In the same way we all believed that people eat spiders in their sleep.

The full nude image, the one that was in playboy, the one that was scanned and used for years as one of the main continuous tone, face artefact, compression technology test images. Was the one I knew to be a fake. How come? Information, bad information, flows just as free as good.

I'm reminded of a quote from a film I recently watched:

"A woman was gossiping with her friend about a man whom they hardly knew - I know none of you have ever done this. That night, she had a dream: a great hand appeared over her and pointed down on her. She was immediately seized with an overwhelming sense of guilt. The next day she went to confession. She got the old parish priest, Father O' Rourke, and she told him the whole thing. 'Is gossiping a sin?' she asked the old man. 'Was that God All Mighty's hand pointing down at me? Should I ask for your absolution? Father, have I done something wrong?' 'Yes,' Father O' Rourke answered her. 'Yes, you ignorant, badly-brought-up female. You have blamed false witness on your neighbor. You played fast and loose with his reputation, and you should be heartily ashamed.' So, the woman said she was sorry, and asked for forgiveness. 'Not so fast,' says O' Rourke. 'I want you to go home, take a pillow upon your roof, cut it open with a knife, and return here to me.' So, the woman went home: took a pillow off her bed, a knife from the drawer, went up the fire escape to her roof, and stabbed the pillow. Then she went back to the old parish priest as instructed. 'Did you cut the pillow with a knife?' he says. 'Yes, Father.' 'And what were the results?' 'Feathers,' she said. 'Feathers?' he repeated. 'Feathers; everywhere, Father.' 'Now I want you to go back and gather up every last feather that flew out onto the wind,' 'Well,' she said, 'it can't be done. I don't know where they went. The wind took them all over.' 'And that,' said Father O' Rourke, 'is gossip!'" - from "Doubt"

And yet we still tell lies. Lies for absolutely no reason at all (in the case of Lenna, I can see no reason at all). The feathers travel for miles and years. I'm glad that Google is here today to find the truth. All I need now is to start mistrusting everything that I've been told, and wear a citation needed hat.

Friday, May 14, 2010

Game Logic XII: Know your user

One of the neat things about Google Mail is that it uses a push architecture. The idea is simple really (the implementation isn't though) youn can infer that people are interested in updates as long as they have asked for data before.

Once you know someone has asked a question or submitted a query, like an html request for a page, you can provide them with up to date diffs or events that can affect the current state of their query. They can figure out what it would be, without actually doing a new query.

This is another form of compression. The assumption of prior knowledge, and assumption of interest.

In games development, when we set up how we're going to do network traffic for multiplayer, normally we initialise a game state, schedule a sync point, then keep each of the players / clients up to date with network events, or sync ups for positioning and other things that are less deterministic.

This same thing can be used in other ways too. Some resource managers remember who's had what off them, and if the resource is updated externally, it get propagated through the system.

There are quite a few other places where this technique works well, but boil it down to its basic component and you've got a query that is kept alive by the data source. The query is the user, and the data source knows about it, and wants to keep it happy and up to date.

This is quite simple, but how often to entities in games "poll" for some world states? Lots! And all of these every frame pollings (apart from the really esoteric ones) can be replaced with live queries. Queries that live past their call and response, get updated whenever anything matters, and can probably self optimise in many cases.
Difficult to syncronise queries like "what is near me" can be made live, by updating the parameters whenever the entity moves.

All this means that senses are more continuous. I'd speculate that it won't help with memory usage by much, but should save CPU time by not creating and destroying queries all the time.

GPHI, by Insomniac?!?

Ok, for those of you who worked with me at broadsword: doesn't this sound an awful lot like GPhi, circa 2005? It's got the "pools of components" (that would be gphi behaviour containers), "async update", that would be the behaviour container centric update, "implicit usage via existence", that would be the entity ID reference into behaviour container thing.

I'm reading it and I'm not seeing anything new apart from maybe the get component by interface implementation, but isn't that just as bad as dynamic casting in OO? Please correct me if I'm wrong, but surely you know what type of interface something has if you are updating its container in this system? If not, are you now doubly adding virtual overhead?

Also, in Dynamic Components (at least from this presentation) there's no mention of DLLs in conjunction with this stuff! That was one of the main selling points in GPHI, you don't need to know what a component/behaviour does, only that you could register a behaviour container and add the behaviour to your entity and let it get on with it.

But, and this is a big one, they implemented it. In a production game, they implemented the cool. GPHI never made it into any BS games, just an editor I used for animation adjustments and working with shaders, so they beat me there. The Ascendance project (which included the DB asset server framework, the crash proof editor, and game deployment tools) never got finished. Looking back, that was Jan 2005, just before first work on Snow Queen. That's a long time ago now.

One last thing too, at least mainstream thinking is catching up with what I've been doing. This should mean that when I've finished my tests and written my book on row oriented development, people's heads will be in the right place to fully appreciate it.

Wednesday, May 12, 2010

Game Logic XI: Current Concerns

All this clever stuff is mighty clever, and though implementing some of it was easy, actually implementing it all as some form of framework has turned out to abuse C++ too much for my way of thinking.

I don't like clever code, it's often the start of a deeper relationship with a debugger than I want to dedicate myself to. I have a wife and kids, they're my dependants, not some crazy meta-coding template macro system.

So, looking at how things have progressed so far, I can say that clever type safe templated table heading things are probably not the way to go to make this stuff fly, no, oddly, I think that going back to how sql does things might be a way forward. I'm thinking that the base command of an SQL query (select/insert/modify etc) is my basic object for the table control, and any arguments that would have come after that in SQL can be written out as a text string constructor argument.

Welcome back printf style arguments? I'm not sure I need to go this far, but we'll see. Again, it all depends on actual implementation experience. If I can get one game out written in this stuff, then I'll at least have a good basis for building a better way of working. Right?

Monday, April 19, 2010

Game Logic X : First draft

Having promised to write this stuff up better, here goes:

Have fun, please comment.

Tuesday, April 06, 2010

Freedom of speech

This is the reality of how my liberal side sees the world. An important side effect of this attitude is that it is up to the state or the church to censor for it's people. If your government or religion finds something abhorent, then it should take the same stance parents do with their children and hard-core porn. Just keep your wards away from the stuff.

It is not the fault of the author that you can and are shocked or appalled by anything seen or heard. In complete contradiction to this way of thinking about responsibility, is the scandalous way that some crazy politicians are using Islam to make the world bend to their own perverted morals. If a religious organisation thinks something is corrupting, or offensive, then it's up to the religion to stop it's patrons from being able to observe the influence.

Friday, March 26, 2010

Game Logic IX : Generic Optimisations

While thinking about how to do general optimisations on this data-oriented stuff, like good sorted views on tables and fast set existence checks, something that struck me was that I've been used to using a non standard container for a very long time that helps with both of these operations.

Judy arrays have been around a long time, and I remember at the time I first found them, I thought they were magical and cool. Now, they're looking even more useful as they offer a very fast sorted insert operation. They can also be optimised to return early from existence checks. I'm not sure I'd like to try and make a branchless version of it, but there are many places where the version of Judy arrays that I implemented could be made branch free, without any degradation of performance on out of order processors. Judy arrays were designed to mostly allocate 16 byte lumps, or regular multiples of. This on it's own is probably worth caring about for the sake of the cache line size.

Judy arrays are really worth reading up on. So go do it.

The other thing I thought of when talking about existence based behaviour was that sometimes you wanted to be able to check if an implicit entity existed through their components. This is quite easy on a sorted container, but what about an unsorted container? Do you have to run through all the elements to find out if a certain one exists? Well, no, not all the time. A certain Mr Bloom invented a very clever use of a bit field and a hash function. The Bloom filter lets you check to see if an entry exists. It sometimes returns false positives, but the bloom filter operation is so quick and so lacking in memory usage that it's a really good trade off to run it quick, then do a make sure afterwards.

Judy Arrays
Bloom Filters

And so you now have some generic optimisations for your row-oriented development.

Monday, March 22, 2010

Branchless Development

So, onto making the way we get around the problem of dynamic stuff in row-oriented development. One of the things that did worry me for a while was the possibility of row oriented being a memory thrasher, but then, as homogeneous data is notoriously easy to pool, all i needed for a slick quick row allocator was a risc processor friendly branch free pool container.

Branchless Containers

Along the way I came up with some other stuff too. A stack, queue, linked list and a couple of helper functions. Mail me if you want my source in it's single header file incarnation.

Sunday, March 21, 2010

How real people do it.

GDC presentation by the makers of PixelJunk:Shooter


Ties in quite nicely with the other article

Friday, March 19, 2010

Game Logic VIII : optimising

When optimising code from an OO base, I find that one of the problems I face more often than not is the tendency for the code to have just in time value calculation. This was really cool when I started out in coding, even a year ago I would have advocated lazy evaluation whenever possible, but now I'm not so sure. The issue with lazy evaluation is the inherent additional state data about the value, and the I cache miss for when it does need evaluating.

Pushing your data around in rows and transforming it from one state to another, you'll never hit the dreaded I-cache fail that lazy evaluation necessitates, but, you can still have lazy evaluation. Huh?

Simple, it's existence based processing again. One of the input tables you need for a process can be a dependency table, this dependency table is your lazy evaluator friend. All it needs to do is check to see if a value doesn't already exist, then add to the table of values to calculate, then process that table, then voila, you have your lazy values without having to actually break away from your main thread, because your main thread doesn't do too much before it demands the values.

Also, optimising is quite a bit easier because you know what to optimise. How often have you looked at some code and had an "aha" moment, because you've noticed that the real bottleneck was something daft like "if( gDoDebugRenderingOfThisOneThing )", which is not a const bool, so it doesn't compile out of your release build? For me, often enough to say that debug code and runtime code (processing of the game state vs processing of views on the data) should be kept separate.

Optimising processes or transforms is also a lot easier than optimising great spiderweb like connected runtime dynamic function messes that I've found in scene-graphs, game-logic and AI code paths. Humans can reason about single transforms a lot easier than they can reason about complex nested dependencies.

Finally, ifs are really bad for your code, and processing data through existence lists gives you a lot of the power of ifs without the failure overhead. The principle behind my process for removing ifs is to provide a ifless list structure that uses branchless value update techniques to write to link->next pointers. The only ifs you have to use are the "is list terminated yet" type.

By the way, thanks for reading all these posts. I hope it helps. Its helped me understand my own idea better to write it out. I think it's time for me to write up a complete document now.

Wednesday, March 17, 2010

Game Logic VII : Beautiful Homogeneity

Going back a bit now, from the start there's been other work done on stream processing and the structure of arrays approach to developing objects that can be manipulated at high velocity, but something that doesn't crop up much in their talk is the idea of processing from one source into destination of a different size. Most of my realisations have been of those kind, where a row is processed into becoming more rows (LOD up and increase the AI count), or less rows (process items and find them no longer necessary, so revert to defaults and drop the row).

This leads to a slightly different problem than the ones faced by the pure stream processing crowd. If my stuff relies on tables that can grow and shrink, surely it's going to be either just as bad at random access (if I use a linked list for my rows), or horrible for insert or delete (if I use dynamically sized arrays for my structs of arrays). Well, as I've learned from Mike Acton, you don't have to do stuff the way everyone expects.

Most of the different uses of rows fall into one of a few categories, and solving the performance of each of them can be thought of as a separate task. As far as I can tell, there are only a few types of table. I'm going to try to explain what I might do to stop things from getting bad.

There is the very simple type of table that just needs to have deletions and insertions, ignoring order, but maintaining its current-content state during a pass of the processing. By that I mean that you can guarantee that an iterator gets to see all the rows that existed at the beginning of the iteration run. Deletions and insertions are guaranteed not to affect the iterator. I think this is just a case of following Mike's lead. I'd use a similar system, but when the deletion scan passes through, rather than collapse the table back, try to put a new insertion in the gap first. That should save a few memory ops. Extending the table in case of an out of capacity exception would probably be simplest to implement as an extension mechanism (akin to that of many good dynamic pool allocators), the actual iterators will need to bear this in mind, but it may be quite a small price to pay for a no-brainer access model. Always remember, we're after maximal throughput of data, not avoiding the occasional spike.

Implicit entity tables, in some circumstances, might need to be sorted. One example would be if they need to be merged/joined for connecting sensory input into the decision tables for changing which state table their FSM exists in. A sorted table would be better than an unsorted one, however, most of the time, this could be solved by having a dedicated sorted view into the general container for the tables. If the table really does need to be sorted, then initially I'd suggest a batch based representation. Put batches of N in some self balancing tree, and whenever you need to delete an item, just mark it as deleted, so skipping is simple, and when you need to insert mid sequence, split the batch into two batches and insert into the now obvious space in the middle. If you have the safety net of an insertion and delete sync point, you can optimise this really well. By the way, I would choose a tree representation so to maintain the power of the data set being sorted. The size of N is important, but I don't think it will be easy to guess at. I think you would nee to tune per use, as access patterns and the size of the data will change the optimal value for N.

Obviously, if you're doing this, you're not going to be thrashing memory, as it's all simply poolable. This is cool as one of the problems with highly active databases in the past was disk thrashing for high throughput systems.

Tuesday, March 16, 2010

Game Logic VI : Finite State machines and Decision Tables

Finite state machines can easily be defined by existence in a table, this much should be apparent from the previous articles, but the alphabet (triggers that cause state change) can also be represented by row information. Decision tables are quite common in database tech, and they provide an easy to read, simple to reason about, really easy to modify setup for making decisions about multiple outcomes given multiple incoming data.

I'd recommend reading some stuff.

With a decision table per state, finding the next state is quite a simple task, and one that can be optimised algorithmically (using standard techniques for query optimisation, or specific ones for optimising join order on decision tables).

Also coming to mind as I type, things that are frame coherent, like renderables that have a state of "potentially visible" can use existence based processing and be ticked only when necessary. Using count-lodding to drop in sub-objects from larger size cullable groups when they become visible also allows for the traditional hierarchical approach to culling, while maintaining the benefits of FSM simplicity.

Oh and another thing, on debugging, I saw this on a data-oriented development post

"When debugging why a specific mesh is being culled, though, it really helps to see all the data about that entity in one place. Otherwise, you spend a lot of time traversing data structures in the watch window, just to find out what state an object got in that caused it to flip its visible bit to false. So the view of the data that humans need is another important piece of the puzzle"

No, think bigger dude!

As you can see, the preference for all the data in one place comes about because we're thinking about things as objects that have a state, but rethink these words with decision tables and we end up having a very good possibility of keeping tabs on what decisions were made. We can keep the resultant decision row of the culled items, and even the calculated values that lead to that row choice. This actually lets us look back in time rather than have to try to reproduce the same scenario multiple times. You could say that we're naturally stuck thinking that the only way we can debug something is by thinking about the system rather than just asking it questions directly. Microsoft have just released some cool stuff in Studio 2010 that lets you look at the history of a value, and there was a talk about a JVM that let you roll back the whole state machine, and now we can do the same without having to use managed code.

This actually makes debugging easier, not harder.

Think how often you've wondered "why" something happened. When you're homogenising the techniques you use to make decisions like this, debugging will be easier as logging decisionscan be written once and reused often. Ever heard of a reusable debugging framework before? You have now!

Monday, March 15, 2010

Game Logic V : Count Lodding

This whole thing about entities being implicit on their attributes does allow something else quite cool to emerge from data-oriented development. I'm calling it count-lodding until someone tells me what we really call it.

Take your natural squadron of space ships, visible on radar from the ground. The player has a radar and is picking up the squadron, which in game space is actually represented as an entity of type squadron, with attributes number_of_ships=4.
As the squadron comes into visible range, the distance lod requests that a squadron renderer be instantiated, suddenly there is now a new entity in the render lists, a mesh render of four simple ships.
As the squadron gets within good visible range, the squadron lods out into a single ship (that is inserts a row into the squadron leader ship table and deletes it's row from the low-lod squadron table), and spawns three more entities that are squadron-ships, all formation flying following their leader.
The number of entities has gone up from 1, to 4.
Now, anti-aircraft weaponry has come on-line and blown up two of the space ships, causing the pilot of one to eject, and the wings of the other to fall off.
1: Add row to "disabled aircraft", delete row from "squadron-ships", add row to "ejected pilots"
2: Add row to "disabled aircraft", add two rows to "broken wings"
The number of entities has gone up from 4, to 7.

as the wings and planes com crashing to the ground, their physics stabilises and they turn into mementos and simple render calls. Entities goes back down to 3, unless the ejected pilot's shute doesn't open, in which case, we can drop to 2, but a splatter render row might need to be added.

Count lodding, therefore, is about increasing the count based on lod, or decreasing it.

Consider AIs getting into a car, as they get in, their "car passenger" behaviour can be a dead behaviour, no table, and they get held only as rendering objects. The car turns from "passive parked" (non-processing table), to "driving car". At a certain distance, the people in the car can have their rendering rows removed too, meaning we've gone from 4 active, 1 passive entity, to 1 active entity with 4 mementos for rebuilding their render or other attributes in case of going high lod, or getting out of the car once they've reached their destination.

This saves us time, and memory. Both these things sound good to me.

Game Logic IV : Existence based processing

Existence based processing is something that has been missing from many data-oriented development articles. This isn't just a side note, this is a really important thing. If you have lists of stuff that need to get done, then you get dynamic runtime polymorphism. Let me explain.

Runtime polymorphism normally means runtime binding of code using virtual tables and the like, which cost you a double dereference at best. We are aware of the extra space taken up by these signatures at the beginning of all of our polymorphic structures, and the extra time involved in double dereferencing the function pointer on virtual calls. We also think it's impossible for a class to exhibit runtime binding without these costs. In games we normally use virtuals to allow us to specialise a class into a particular role, such as a person being either a player character, a normal badguy, or maybe just an innocent bystander. How they get around, their mode of transport, or even down to how they path find, it's different for each one. The player just uses controller input most of the time, sometimes controlled during cut-scenes, for a badguy, it's his template or setup from the level, for an innocent bystander it might be a canned animation for the most part, or some flocking algorithm. The important thing to remember is that whenever anything does a movement tick, it normally goes through a process of hunting down the virtual table by accessing the vtableptr at the beginning of the child class of the "person", then offsetting into the table to find the function pointer for "TickMovement()".

Normally I say because in my experience, people do it this way.

Existence based processing handles it differently. For virtuals that define a behaviour like in the case above, the table your person belongs to affects how they move. Literally, the table the person is in defines what processing will be done on them to get them to move. If the person has their motion data in the pedestrian table, they get handled like a pedestrian, looking for a random path somewhere. If they're in the badguy table, they use their start points and their patrol routes to choose their motion. If they're in the player table (which might not just be one entity, think about network games), then they'll have some input to process, then get on with moving too. But in each of these cases, there is no need for a double dereferencing vtable pointer.

That's one benefit. The other benefit is using existence to provide an alternative to bools.

Summarising compression technology: encoding probability. All known compression technologies work on the simple premise that if it's likely, use less bits to descibe that it happened. Existence based processing offers us a little of that, but for our CPU time. If most of the people in the world are at full health, only process those who have non-full health. This is simple, whenever someone gets hurt, insert a row in the "am hurt" table, and process the table it every time you would have processed health. If that person dies, then you could convert them into a deadperson (who also doesn't need an "am hurt" as all dead people have zero health, presumably). So, for things that are sometimes not default, you can reduce the amount of time spent on their processing and reduce their memory usage.
  • anim: in a modal animation for some task, like throwing a grenade, or changing weapon
  • anyone: hurt, but regenerating health
  • player: under water, losing oxygen or just surfaced and replenishing
  • people: heard some gunfire, so actively looking out for suspicious activity
  • shops: player interacted with me so keep track of what happened so I don't repeat myself or suddenly forget he sold me a +5 longsword in the ten seconds he's been away.
The idea of patching default data with modifications has been around a long time, notably with landscape systems that start from a fractal and let artists affect it then save the modifications. But this is much more runtime.

Now, let me reiterate, this gives you runtime changable polymorphism, costs you a pointer probably due to point back to parent class, which is equivalent to vtable pointer cost, but saves you the double dereference for accessing the polymorphic qualities. Also, you get to change your "type" at runtime quite easily because your type is defined by your existence in a set / table.

Game Logic III

A while back I posted about a new way of thinking about games logic and game entities. Now, I've been writing up a doc on how to make your game data-oriented using ideas from database engineering.

I feel like I've come full circle, right back to university, where I first learned about databases, about data-flow and about entity relationship diagrams.

So, onto the important. Data-oriented development: taking the old OO and throwing out the random access patterns, bringing in the structs of arrays. Graphics cards are very good stream processors, and so will all the big CPUs in our near future. Data-oriented development will let us stream through this era and give us a high possibility of utilising our machines to their fullest.

This article explains the basics better than I can, but it uses a lot of words...

Extending that, think about bools being replaced by existence in tables, using table rows for attributes but their existence to indicate that the attribute is somehow not default. We do it in many markup languages. Also think about entities as implicit through their behaviours, you can switch type as quick as you can delete a row from one table and insert it into another. This gives you runtime dynamic polymorphism without even the overhead of a normal virtual call.

If entities are implicit, then you also get what I call count-lodding, on the cheap. Count-lodding is where the LOD determines how many entities there are, such as a single "squadron" entity turning into individual planes once they're close enough, or passengers in a car, driver too, turning in their IDs to the car itself, which went from being a car-passive, to a car-driven, taking some attributes from a pedestrian that got into the driving seat, now merely a rendering task and a memento of their pedestrian selves.

This methodology leads us to a minimal amount of memory used for all tasks, keeps our allocations in regular sized pieces (because rows are regular), and gives us direction for optimisation, and provides a simple to parallelise code base, in fact, probably generically threadable. Serialising becomes a doddle, and networking should be quite a lot simpler too. Also, using this data-oriented tabled approach gives us access to years of development experience by database engineers. It finally gives us some really useful science to utilise in our day to day job.

Many times I've wanted there to be games-design-patterns, but never came across much more than the GameSessionLevel pattern, or the ObjectModelController pattern.

Tuesday, February 23, 2010

The Progress Driven Life

The reason to do what you do could come from any one of many different sources. Some people do what they do because it's what their parents did. Some do it because they think it's right to do it. Some people do what they do because it's what they think they're good at. Some people do what they do because they think they're having the best of times doing it.

I'm in the last two camps, I think I'm good at what I do (and think that it is fun), and get upset when I am asked to do things I'm bad at. This can be frustrating sometimes.

In reference to an argument between myself and a colleague at Broadsword, I'd like to propose that everyone should strive to find out what they're best at and do it, rather than work towards being a lawyer, doctor, dentist, stock broker.

Some time ago I posted (might not even be here) that I felt a little bad sometimes that being a games programmer was a bit like being a drug dealer, in that we make things that actively try to addict people into playing them. Especially with the advent of the MMO, we're seeing people losing their lives to the drug that is games. I felt that what I was doing was somehow wrong.

In the argument, which was started over the question of "what science should we do more of to make the best progress for the human race?", I came to the conclusion that pharmaceutical cosmetic drug research, though not obviously useful in any way other than to beauticians, was still an active science, and therefore should still be on the list of things that humans should be proud of in the name of progress. Accidents happen in random labs and there's no saying that cosmetic chemists won't be the ones to find the next big thing.

Which leads me back to games. One the objections I have to large scale games publishers is their fear of the odd, of the new. This lack of innovation, though there is plenty of work done very efficiently, may stifle new emergent technologies or strategies for progress in games development. I fear that unless these companies engage in a 20% time of active innovation, they're going to remain large market share, but with little growth.

Games development is odd, it's smaller and larger than other software development at the same time. We have a much lower expectation of robustness of code than most of the other industries, as no-one's going to lose a life over badly written code, but our software has to take into account so many more variables than anything else. Where else do you find that the baby science of artificial intelligence is so commonly used and performance pushed to the limit of hardware?

From this furnace of rapid development we find many improvements to technology that otherwise just wouldn't exist. If it wasn't for games, there would be very little of the technology that runs the beautiful Pixar films, there wouldn't be the large TVs that now clutter our living rooms, there wouldn't be Blu-Ray or film download services, and there wouldn't be easy to drive military-tanks or our beautiful camera phones.

Every form of technology feeds into every other form of technology, and the sooner we realise this the sooner we can all feel proud that Mario helped you survive cancer.
Do what you do best, never assume that what you do will not help the human race. Just do what you must, because you can. For the good of us all. Except those who are dead.

Monday, February 22, 2010


I found a very good definition of the amount of control we have over ourselves in the book Happiness Hypothesis. It describes us as being two minds, the rider (who has a whip and a good sense of tactics and strategy), and the elephant (not slow, but hard to change direction, and generally knows what it's doing and follows the path of least resistance without input from the rider).

This idea of rider and mount doesn't just work for people, it also works for any other large scale operation involving a fast moving mind with a slower moving ultimate decision process. Business is definitely one area that it links very well (based on readings of Built to Last and Good to Great), government also, I think.

The importance of value in this view of life is not to be unappreciated. Value is the way of setting the goals of the elephant. If you want to make yourself do something, you can either use up a lot of energy (will power in constant use) thrashing your elephant to move in the direction you want, or you can change the elephants values (one shot use of will power to change values).
In business, you can adjust the direction by adding incentives and fines for certain types of behaviour, and the same for government.

So, if you want to change what you do, you have to change your values. But firstly, you have to find out what your values are. How do you do that? For starters, your rational brain won't really know what your elephant's values are, so you have to do a bit of observing. Watch what you spend time on, because that's a good indicator of values. If you find it easy to commit to something, then it's probably true that your elephant holds that in high regard. Once you have a handle on what your current set of values are, then you can start to think about what you might want to change to make your life a nicer one.

Values can be changed, but you really have to work at understanding the ones you already have, otherwise you might end up trying to add oppositional but orthogonal forces. A reasonable example could be smokers. Some won't give up because they really value their cigarette time, and even if they are rationally aware of all the dangers of smoking and how much better off they will be financially, they can't really give up because they enjoy the chatting with other smokers element of their life. A good way to move into a better quitting position might be to always stand away from the pack of other smokers, in order to reduce the positive effect of smoking. Take out the positive side and the elephant disconnects and can be taught to appreciate the healthier living afterwards.

Another might be television, if you feel you're addicted, then begin by recording TV, and turning off the television until a show you want to watch has finished recording, then skip all the adverts and announcements. This might help reduce the effect of "just one more show" that is prevalent amongst the couch potato generation. Also, plan to do something, anything, some DIY or read a book, just so you have something to do while you're not watching telly. Hopefully, the values of time spent can be changed through distracting your elephant.

Basically, you're trying to change the rules of physics. Any time you change the laws of physics, you change the optimal solution. Your elephant naturally tries to find the path of least resistance. The elephants in businesses are even easier to divert, just use money. Fine businesses for things you don't want them doing. Reward them for things you do.

Okay, so that sounds easy... why hasn't anyone done that before? Well, they have, but the corporations of the U.S. found a loophole. It's the same loophole that anyone that's done genetics to any degree has found. The Selfish Gene points out that many highly complex interactions can come about through the relatively simple process of reproduction with inheritance. Sometimes cheating is actually the best fit in a complex landscape. You have to be careful that you don't create artificial fitness landscape that fits cheaters better than people how are doing the right thing. Sometimes it's possible to find a local maxima that matches the true intention of the law givers, but more often than not, a badly thought out law can offer a terrifying maximal solution, that by all accounts, is judged fair on paper.
Fixing this usually means using less variables, lower the complexity of the fitness landscape. Omit any naturally occurring fitness rewards or fines (such as, transport costs or pr). Keep the laws simple, and massively promote open books so the media can contribute to the morality of the corporations involved.

Government has the hardest job, it's an attempt to change the actions of its people, by changing the values of its people, by changing laws that affect both its people and itself. To do this it has to find value in change, and change the opinion of its members to allow the introduction or change in the laws, which might mean it needs to create a situation with which to convince its audience. Hard work working a slow system from the second or third removed driving seat.

Happiness Hypothesis
by Jonathan Haidt
Built to Last and Good to Great
by James C Collins and Jerry I Porras
The Selfish Gene
by Richard Dawkins