Tuesday, November 08, 2011

Data-Oriented Design : look at trends and values

Quite a good post from Niklas Frykholm. Not unexpected; he's a good gamedev from what I've read.


But he does skip the data-oriented design, which is a shame. I see this a lot in data-oriented design posts: the poster writes about how to remove the bad C++ stuff that menaces the code with potential cache misses, which is definitely important, but then doesn't extend the work into data-oriented design.

In Niklas' post, he makes the layout of the data much better for the hardware, which can be seen as data-oriented design, but data-oriented design goes further than this and is about looking at the actual data, not just the schema.

One indicator of this is the words "Say you have" in front of "512 sounds". This may sound trivial, but the point of data-oriented design is that it isn't architecture astronaut programming; it's not about maybes, it's about facts. You can only get so far with good guesses.

For example, he rightly says that you might need to set MAX_INSTANCE_PARAMETERS pretty high, but that is only true if you don't know what you're coding for. It should be mentioned that if the code is middleware, then data-oriented design is hard, as there is little knowledge about the data. It's difficult to nail down requirements at the best of times, but when you're not even involved in the final utilisation of a library, you can't really make any good guesses about what shape the data will be in.

To turn this into a data-oriented post, we need to put qualifiers around it. How many sounds will there ever be? 447 at once? with up to 11 parameters? from a set of 315 different parameters? Also, there is no mention of how the parameters are used. You can't do a data flow analysis if you don't know when or where the data is flowing.

Let's assume that the parameters are used to set up the sound, and pulled in by a function that chooses a wav based on the material and the weapon, and only then pulls in the force parameter for volume and pitch tuning. The number of sound instances that have weapon or material hashes outweighs the number that don't, so there's a good reason to include them as first-class members of the structure.

So, we know we always want to access the weapon and material at the same time, which means that the weapon and material can be combined into one hash, the wav selector hash. Now we have a statically allocated array pool of 450 sound instances, with space for an index into the linked parameter pool plus one free hash that represents the material/weapon combo, saving us two parameters and the overhead of another cache-line load.

Now that's data-oriented design. We took what we knew about the data, including domain knowledge, and turned out a solution that takes advantage of facts that the compiler could not guess at.
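
As a rough illustration of that layout, here is a minimal sketch in Python. It is not Niklas' code or anything from a real engine: the names (SoundPool, combine_hashes, NO_PARAMS) and the particular hash mix are assumptions made up for the example. Only the pool size of 450 and the idea of one combined wav selector hash plus one parameter-pool index per sound come from the text above.

```python
from array import array

MAX_SOUNDS = 450      # pool size taken from the post
NO_PARAMS = 0xFFFF    # hypothetical sentinel meaning "no linked parameters"


def combine_hashes(weapon_hash, material_hash):
    """Fold two 32-bit hashes into one 32-bit wav selector hash.

    The multiply-and-xor mix is an arbitrary choice for illustration.
    """
    return ((weapon_hash * 0x9E3779B1) ^ material_hash) & 0xFFFFFFFF


class SoundPool:
    """Fixed-size pool stored as parallel arrays, one entry per sound."""

    def __init__(self, capacity=MAX_SOUNDS):
        self.capacity = capacity
        self.count = 0
        # One combined hash per sound instead of two separate parameters.
        self.wav_selector = array("L", [0] * capacity)
        # Index into a separate pool holding any remaining parameters.
        self.param_index = array("H", [NO_PARAMS] * capacity)

    def add(self, weapon_hash, material_hash, param_index=NO_PARAMS):
        if self.count == self.capacity:
            raise MemoryError("sound pool is full")
        slot = self.count
        self.wav_selector[slot] = combine_hashes(weapon_hash, material_hash)
        self.param_index[slot] = param_index
        self.count += 1
        return slot
```

The function that picks a wav only ever walks wav_selector; the parameter pool is touched later, and only for sounds that actually have extra parameters.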

Friday, August 12, 2011

How many machines do you need to develop a multiplayer game?

Assumption : You're working on an internet PC game.
  • As long as you can run multiple instances of your application, then you can do all the network multiplayer testing you need.
This would be fine if you didn't want to test if your game grinds to a halt due to network traffic not being 100% reliable (like it is if you use local loopback).
  • You can run the game on two machines to do your testing, that would be indicative of network traffic in reality.
Except it's not, as you're testing on a LAN, which is about a thousand times more reliable than any real world internet connection.
  • You can run the game on two machines, but make sure they are located on different ISPs.
Much better, and would work for almost all cases of network errors such as NAT tunnelling.

But, what if you're working on an action game? They normally have match-making, and no dedicated servers any more apart from login and match-making. Most games are actually host-client, but have more state information held in the clients so they can reliably survive a host drop. This P2P network gaming has its own problems.
  • Two machines, one is virtual host, one is client.
Great, what about client vs client interaction?
  • Three machines, one is virtual host, two are clients, do most of your testing on a client vs client basis as host would probably know enough to be right anyway.
Good, but what about buggy situations where there are three clients involved in a dispute (such as when you get shot by two people, both client, you're a client, and you don't know who shot first and who shot second.)
  • Three or more machines depending on how many people are involved in each of the logical gameplay elements. (#involved + 1host == needed machines)
But what about network throughput, connection dropping and net-splits?
  • As many machines as you have decided maximum players in the game, and some funky network-switches/hubs you can pull plugs on.
Which is impossible for MMOs... no-one could get that many machines together, let alone pay all the testers to operate the game.

Well, here's what I think you should probably do:
  • For initial development, three is a must.
  • Once the basic game is done, make sure you have some form of bot, and get as many computers as you can afford.
This works for everything but human error and tunnelling problems. You must have real humans play the game in their homes, so you must do an alpha/beta before release. If you don't, there will be game-breaking issues. This isn't just likely, it's a guarantee on all but the simplest multiplayer games.

Tuesday, August 09, 2011

"first draft of the GPHI doc / wiki"

That was the first line of an email. About what was to be my new approach to writing games. The email was sent on the 17th of April 2006.

That's how long I've been doing data-oriented development without knowing its name, or even having any other people to talk to about it.

GPHI was a new layer in our code base that took the idea of classes and turned it on its head. I found out in 2007 that what I'd invented was quite similar to the Dungeon Siege component model. I took all the components we normally had in our classes, or inherited from, and made them into managers. This meant that we had a manager update loop, not an entity update loop....
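
To make the difference concrete, here is a small Python sketch of a manager update loop. It is not GPHI itself, whose details aren't given here; the manager names and their fields are invented for illustration. Each manager holds the data for one kind of component across all entities, and the frame tick walks the managers rather than the entities.

```python
class MovementManager:
    """Holds position and velocity for every entity that can move."""

    def __init__(self):
        self.positions = []
        self.velocities = []

    def add(self, position, velocity):
        self.positions.append(position)
        self.velocities.append(velocity)
        return len(self.positions) - 1  # handle for the new component

    def update(self, dt):
        for i, velocity in enumerate(self.velocities):
            self.positions[i] += velocity * dt


class LifetimeManager:
    """Holds the remaining lifetime for every entity that can expire."""

    def __init__(self):
        self.remaining = []

    def add(self, seconds):
        self.remaining.append(seconds)
        return len(self.remaining) - 1

    def update(self, dt):
        self.remaining = [max(0.0, t - dt) for t in self.remaining]


def tick(managers, dt):
    """The manager update loop: one pass per component type per frame."""
    for manager in managers:
        manager.update(dt)
```

An entity in this arrangement is little more than the set of handles the managers gave back when its components were added.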

Fast forward a short while and I went on to work on (but not finish before the company went belly up) a new rendering engine that didn't use a scene graph, but instead stored all the renderables in an array and sorted by whatever criteria were necessary (material/mesh/shader/shader-constants), even changing per frame.
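
A sketch of that idea in Python, with the caveat that the Renderable fields and the sort_for_frame helper are made up for the example and say nothing about how the original engine was written: the renderables sit in one flat list, and the sort key is just data that can be swapped each frame.

```python
from typing import NamedTuple


class Renderable(NamedTuple):
    # Hypothetical fields; a real engine would carry handles or packed keys.
    shader: int
    material: int
    mesh: int
    depth: float


def sort_for_frame(renderables, key_fields):
    """Return the renderables ordered by the given fields, most significant first."""
    return sorted(
        renderables,
        key=lambda r: tuple(getattr(r, name) for name in key_fields),
    )


scene = [
    Renderable(shader=2, material=1, mesh=5, depth=0.3),
    Renderable(shader=1, material=9, mesh=2, depth=0.9),
    Renderable(shader=1, material=3, mesh=7, depth=0.1),
]

# One frame might minimise state changes, the next might need depth order.
by_state = sort_for_frame(scene, ("shader", "material", "mesh"))
by_depth = sort_for_frame(scene, ("depth",))
```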

Fast forward to now... I still haven't been involved in the release of a data-oriented game, but I'm still plugging along, and pushing component oriented development, entity systems, stream and transform oriented approaches to game object processing and even writing a new language to help teach and maybe even be useful and productive with.

It's been a long time since I had that epiphany back in early 2006, the world has moved on a long way, but I feel quite happy that the road I started on is the one leading to the future.

Monday, August 08, 2011

The reason why, and what it's not.

The language I'm working on (Transformative C or TC) is designed to force the programmer into doing things in such a way as it becomes hard to make software that is bad for the hardware.

Here is a list of things it provides in order to persuade the programmer to play with it:
  • Cannot memory leak. Behind the scenes memory management without a need for garbage collection.
  • All data manipulation is done through shader like transforms which take one or more input and output streams. (more than one input stream is not normally recommended though)
  • Duck-typing (kind of, you can provide a struct translation to use a non-conforming struct in a transform)
Here is the list of things it doesn't allow in order to restrict the programmer from breaking the simplicity:
  • No capacity for object oriented development built-in (no types other than the user defined streams and the elements provided by the language)
  • No pointers (and therefore, to some extent, no memory); there is no real need for them.
  • No recursive function calls. This is just so it plays well with FPGAs and GPUs
How does this compare to OpenCL?
I think OpenCL might be something that this language can compile to, but at present, TC is much higher level. OpenCL gives the programmer all they need to make good parallel code run on various parallel architectures, whereas TC restricts you to be sure that whatever you write can be run on anything. This distinction can be summed up by my interpretation of the goals of the two languages:
  • OpenCL was developed to give a unified API to access GPU and beyond by restricting only where necessary, while providing fine control over how jobs are issued and to where.
  • Transformative C is being developed to stop the programmer from being asked to think about how transforms are being run, and instead just think about how to write transforms for their data.
Under the hood of OpenCL, developers can shoot themselves in the foot, because it allows the data to be organised in any way they see fit. Luckily, most hardware can use striding to get around the norm of programmers dropping things into structs. However, this does not help older hardware that can't stride, or newer hardware designs that might have an advantage if they could guarantee non-sparse reads.

Under the hood of Transformative C, everything will be the simplest array possible with inputs and outputs being more like Structs of Arrays and all data streaming handled by the management layer, which can choose which layout fits the data usage pattern best.
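
For anyone who hasn't met the term, here is a small Python sketch of the difference between dropping things into structs and a structs-of-arrays layout. It is only an illustration of the layout idea, not of how TC's management layer works; the helper names and the particle fields are assumptions.

```python
def to_struct_of_arrays(records, fields):
    """Turn a list of per-item dicts (array of structs) into one list per field."""
    return {name: [record[name] for record in records] for name in fields}


def damp(velocities, factor):
    """A transform: one input stream in, one output stream out."""
    return [v * factor for v in velocities]


# Array of structs: every item drags all of its fields along with it.
particles = [
    {"x": 0.0, "vx": 1.0, "name": "a"},
    {"x": 10.0, "vx": -2.0, "name": "b"},
]

# Struct of arrays: each field is its own dense stream.
streams = to_struct_of_arrays(particles, ("x", "vx", "name"))

# The transform reads only the velocity stream and never touches x or name.
streams["vx"] = damp(streams["vx"], 0.5)
```

With the dense layout, a transform that only needs velocities reads nothing but velocities, which is the non-sparse read mentioned above.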

Wednesday, August 03, 2011

And so I finally publish

Three years after I finished writing the book, I have finally decided to publish through Kindle and be done with it.


Tuesday, August 02, 2011

Transformative C

The most important part of the project is over. I have given the language a name.

Presently I'm at the stage where I'm still playing with the lexer, parser, and the code generator, trying to make it do anything at all, but there are some things that have become more obvious through the short time it's been in development.

  • a program is made up of transforms that operate on streams of data
  • the selection of, and order in which the transforms are called is the program.
  • the transforms know nothing about what created the data they are operating on
  • the transforms know nothing about what will use the data they create
  • you can reuse code in two main ways
    1: sub functions called by transforms
    2: using programs of transforms given streams as arguments
  • everyone is going to love or hate this language
Once I have a program that prints hello world, I'll start putting up source code snippets.
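
Until there is real TC source to show, the shape of the idea can be sketched in Python. This is not TC syntax, and every name in it (scale, keep_positive, run_program) is invented for the illustration. The transforms are plain functions from one stream to another, and the program is nothing more than the ordered list of transforms to apply.

```python
def scale(stream, factor):
    """Transform: multiply every element by a constant."""
    for value in stream:
        yield value * factor


def keep_positive(stream):
    """Transform: pass through only the elements greater than zero."""
    for value in stream:
        if value > 0:
            yield value


def run_program(program, stream):
    """Apply each (transform, arguments) pair in order to the stream."""
    for transform, kwargs in program:
        stream = transform(stream, **kwargs)
    return list(stream)


# The selection and order of transforms is the program.
double_then_filter = [(scale, {"factor": 2}), (keep_positive, {})]

# Reuse by handing the same program different streams as arguments.
first = run_program(double_then_filter, [-1, 2, 3])
second = run_program(double_then_filter, [0, -5, 10])
```

Neither transform knows where its input came from or what will consume its output; only the program list ties them together.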

Monday, August 01, 2011

Patently obvious

Patents are there to keep inventors safe, yet the people they keep safe are hardly ever inventors.

Surely that fact alone must mean it's time for change?

Could there be a workable alternative, or is this just indicative that any invention needs to be backed up with actually doing something about it in order for you to deserve some reward?

I used to believe that patents would protect my ideas (if I were to have any of note) but now I ponder:
should I be allowed to rest on my laurels after having a good idea?

Why should one moment of inspiration award me a lifetime of cash?
Why shouldn't the second person to have the same idea also be allowed the same award?
Why should anyone be rewarded continually for only one act?

So, an idea is born, of not one mind, but of the collection of minds that helped bring that mind to the singular point at which the possibility of its non-existence reaches zero. Whose mind it falls from is but a lottery that rewards unfairly in these new times of infinite communication. The reward for such a feat of inevitable invention is to curse all those else who thought of it and instead of acting, pursued a course of engineering, made it material with effort rather than made it legal with uttered words or written letters.
Being rewarded continually for one act reminds me of the very undemocratic method of ruling known as despotism. The cruel world where you could be happily self-sufficient, except you have to pay a self-proclaimed king because you didn't claim the idea in your name first. But then you would be king, and they the peasant rightfully objecting.
Are you an invention despot? If you have an idea, is it moral to keep it to yourself? Is it moral not to free ideas, to keep ideas to yourself, to try to earn a living by keeping back the human race?
Are we not, as a race, made grand by our ability to copy, share, remix and create from the ashes and fallen bodies of half-baked ideas and inventions? Made more than mere animals by our unique ability to be transformed by the memes that we hold so dear?

But what about the inventors that did try? The web site owners that had their idea taken wholesale (such as Apple vs Instapaper). Well, I say, Apple spent their money to make a product. So did the inventor. Hopefully they will have another good idea and make more money, but if they are living on the cash from one idea, then "all your eggs in one basket" comes to mind. If you're big enough that a company like Apple takes notice and steals from you, then you're probably not doing too badly anyway.

Ideas are cheap; doing the work costs time and effort. If you can't be bothered to invest your time or money in an idea, do you really deserve it?

Monday, July 25, 2011

Regarding Lana

I know Jim from working at Broadsword, and yes, he does like to invent languages and this one seems to be quite good so far. Just like me, he seems to have started out on a project and headed off in whatever direction his requirements drag him (at least that's why I suspect he added GC) and he's generating a language that looks up to scratch for general scripting work. But my mind keeps wandering off into what else is possible, especially in light of the last couple of years of experience I've had within my profession.

A language that provides virtually the same benefits as Python or Ruby, but small enough to run on a games machine (which is what Lana looks like to me) is very welcome, but my brain has been mulling over the possibility of a new language. I've been thinking about a low level language, one deeply entrenched in promoting the right way to develop software given the features of modern hardware. One driven by the fact of a future with thousands of cores at our disposal, driven by the need for cache coherency, driven by the possibility of reconfigurable computing on high-bandwidth vector machine architectures.

The basics for me seem to be data descriptions, data stores, streams, and functions on streams. Obviously I've not fleshed this out in any detail yet, but from my experience of tearing down and refactoring code into a data oriented approach: there is little else to programming than having data and transforming it into other data. To this end, the shader architecture seems perfect for building upon to start this new language. The only shortfall I can see immediately is support for branchy data such as XML files or other hierarchical data formats. If you were only allowed to use shaders from now on, how would you parse an XML file?

Monday, July 18, 2011

Sounds like...

I went to Gamefest 2011, and although I'm no longer really an audio programmer, I attended two of the audio presentations on the second day. The first one was on voice recognition on the 360, and even though it was actually a kind of middleware tech speak advertising moment, it was quite interesting.

For me the highlights were the different things I hadn't really come across before:
  • Use of Mel-frequency cepstral coefficients for phoneme detection.
  • Use of the Fisher kernel or polynomial kernel to make the support vector machine learning work.
  • Use of the Viterbi algorithm to provide a good result from the Hidden Markov Model of the phonemes.

I've only previously used Markov Chains for compression and mistakenly thought a Hidden Markov Model was the same thing. Back to my university books on ANNs I guess.

Wednesday, April 06, 2011

Code ownership

When working in object oriented games, a developer often has to think about what object owns some data. This is because we tend to write code that implements data as being managed by objects, handled by them, processed by them, and contained by them. Sometimes it can seem pretty obvious what should own the data, such as when you realise that you need a new member in all your entities to say whether or not it is active. The obvious answer is that the entity owns that data. It's wrong, but we'll get back to that later. Other times it can be a difficult question to answer, like where to store the pathing information for an ad-hoc group of AIs. Should they determine a leader and store the pathing in the leader? Should there be a pathing manager that they can all subscribe to? Should it be part of the road / map manager seeing as the A* is running on that data more than anything else?

This is one of those things that keeps coming back to bite us when we try to solve real world problems following real world patterns in a computer simulation written in a computer language. The dysfunction there is that we're trying to solve a real world problem when in fact we should never submit that we're solving a real world problem at all. We're solving the wrong problem. We should never try to solve a real world problem in code. That's usually impossible.

What we do is simulate. We should first appreciate that no matter how much our codebase contains nice identifiers like Ally, or emotive words like Aggressive, or real world nouns like Car, or state adjectives like Dead, what we're actually doing, always, is processing data.

The idea of code owning data seems to have survived in spite of many reasons to not trust it. I've seen rooms owning their portals, even though they obviously share the portal with another room. I've seen player structures own their position data, even if there is not a world for that position to exist in (think about the last time you moved a player to a safehouse somewhere just so they wouldn't be rendered in a cutscene). I've seen players own their guns (so they become tiresome to drop), and because of limitations, bullets are owned by the world (and then become tiresome to fire from the gun).

Code doesn't need to own data; it needs to process it into more data.

So, going back, your entity needs a new state for when it's active. Who owns this state data? No-one owns it. Ownership is for people or corporations. You can simulate ownership in your game, but code cannot own data. Got it yet? No? Object oriented approaches might pretend that this is possible, but it's not. Data is just data. You can lock yourself away from being able to transform it in simple and direct ways by hiding the data, but no code can actually own the data. Instead of focusing on who owns the data, focus on what needs it and how it's used to make the game run. In this case, rather than adding a new bool to a possibly overcrowded and messy class, don't add it at all. Add accessors for your crackpot brethren who think it should be inside the class, but instead of accessing a member variable, redirect to existence in a set.

That's right: the active state may as well be a list of entities that are active. If you want to only do something if an entity is active, by moving to a list of active entities you no longer have to check to see if the entity is active before doing that thing. Instead, you just commit that transform for all the entities in the list.
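
Here is a minimal Python sketch of that, under the assumption that entities are identified by plain integer ids and that health lives in its own table; the function names are invented for the example. The accessors keep the old "is it active?" question answerable, but the state itself is nothing more than membership of a set.

```python
def set_active(active, entity_id, is_active):
    """Accessor for code that still thinks in flags: redirects to the set."""
    if is_active:
        active.add(entity_id)
    else:
        active.discard(entity_id)


def is_active(active, entity_id):
    """Existence in the set is the state; there is no bool member anywhere."""
    return entity_id in active


def heal_active(active, health, amount):
    """Transform applied to every active entity, with no per-entity check."""
    for entity_id in active:
        health[entity_id] += amount


health = {1: 10, 2: 20, 3: 30}
active = set()
set_active(active, 1, True)
set_active(active, 3, True)
heal_active(active, health, 5)
```

Entity 2 is never visited by the transform at all, rather than being visited and skipped.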

Pathing is equally silly. No, pathing data does not belong to anyone. It is requested by whatever event caused the AIs to start pathing, it is used by the AIs, and it goes away once the path is no longer required (probably cleared up by the termination of the state the AIs are currently in).

Initially there is a transform to generate the path that converts the world data, the AI position and ergonomics and goal position into a set of pathing instructions (waypoints for example, easy to follow breadcrumbs), then there is the next transform that converts the current AI and the pathing information into an entry on the IsFollowingPath list. Entries might take the form of a pair of AIPathFollowing and AIEntity. Then there is a per tick transform that transforms the AIEntity.Movement and AIPathFollowing elements based on timestep and current value for AIEntity.Position. Sometimes this transform would notify that the AIPathFollowing element needs to be deleted, and the pair should be removed from the IsFollowingPath list. Finally, there could be a per tick check to see if the IsFollowingPath list reaches zero, at which point, the path is deleted.
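
A very stripped-down Python sketch of those transforms follows. It makes some loud assumptions: the world is a one-dimensional line, generate_path is a stand-in for a real A* search over world data, and each entry on the following list is a small dict rather than a real AIPathFollowing/AIEntity pair. Only the overall flow (generate, enlist, tick, delete when the list is empty) is taken from the description above.

```python
def generate_path(start, goal, step=1.0):
    """Transform: start and goal positions into a list of waypoints."""
    waypoints = []
    position = start
    direction = 1.0 if goal >= start else -1.0
    while abs(goal - position) > step:
        position += direction * step
        waypoints.append(position)
    waypoints.append(goal)
    return waypoints


def start_following(is_following_path, entity_id):
    """Transform: an AI plus a path becomes an entry on the following list."""
    is_following_path.append({"entity": entity_id, "next_waypoint": 0})


def tick_following(is_following_path, path, positions, speed, dt):
    """Per tick transform: move each follower, return the entries still pathing."""
    survivors = []
    max_move = speed * dt
    for entry in is_following_path:
        entity_id = entry["entity"]
        delta = path[entry["next_waypoint"]] - positions[entity_id]
        if abs(delta) <= max_move:
            positions[entity_id] += delta      # arrive at the waypoint
            entry["next_waypoint"] += 1
        else:
            positions[entity_id] += max_move if delta > 0 else -max_move
        if entry["next_waypoint"] < len(path):
            survivors.append(entry)            # finished entries simply drop out
    return survivors


positions = {7: 0.0}
path = generate_path(0.0, 3.0)
following = []
start_following(following, 7)
ticks = 0
while following:
    following = tick_following(following, path, positions, speed=1.0, dt=0.5)
    ticks += 1
path = None  # nobody is following any more, so the path goes away
```

Nothing here owns the path: it exists while the list has entries and is dropped when the list is empty.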

Data is just a source or a destination, not something to be owned. Code can transform data from source to destination, but should not be considered as owning data. Let's stop turning computer problems into computer-language descriptions of real world problems. Let's admit that we're programming computers, not reality.

If you love it.

If you love it or like it, you have chosen it.
If you've chosen it, you've preferred it to something else.
If you do it, or have done it, you have not done something else in turn.

If you don't do something or haven't done something, you've chosen not to do it.
If you've chosen not to do something, you've preferred something else to it.
If you've preferred something else, then no matter how much you say you like it, or want to do it, you don't.

I would like to say that I love making computer games, and to some extent that is true. I think that most games developers, even if it's not true, at least want to say that they love making computer games. But, the people that actually love making computer games just make them. I do a day job that lets me do something that I enjoy, that is, write code that makes computer games go. I am part of a development team that makes games. I am a games developer, but do I love making computer games?

Going by the statements at the top, I would have said that a few years ago I wouldn't count as someone that loves to make games. Then, late in 2009 I started making games again, not for sale, not for work, just because. I used it as a means to the end of learning about STL and C#. To this end I would say that I was making a game because I wanted something other than a game at the end of it. So, even at this time I wouldn't quite say I loved to make games.

Things changed when I moved to my latest games company, mostly because my family was put in an extremely stressful situation; I am working in London, while my family is living 250 miles away in Wales. I stopped making the game for the sake of learning and instead started writing a non-fiction book on programming. So it turns out that at the time I was more interested in games programming than I was in making games.

But, after a few months of doing the weekend run to Wales, only spending a little time with my kids, I decided I'd try to do something creative with them. The obvious choice for me was to start up a kid safe Dungeons and Dragons campaign. I diluted the rules, made them overpowered, then sent them on mostly story driven hack-em-up missions. This worked out really well, and eventually it reminded me of why I wanted to make games in the first place. I wanted to DM to a much larger audience. I wanted to make games so I could DM without the hassle of the dice and paper. I never really loved making games, I loved games programming and loved seeing people play games.

So, I started making games again, first a 2D Minecraft game (using Python and pygame) which grew and grew until it stood up as a game that my wife would play for hours, making castles and other pretty things (as people are known to do in pure sandbox games). I then realised that I was already not sure where to go with this game, as it was starting to feel like work, so dropped it and began working on an isometric version of the Minecraft game. My kids enjoyed it, but it was severely limited as I couldn't think of a simple and robust way of adding digging without causing all sorts of horrible rendering bugs, so it became boring to do again. So, that's when I decided to move on again and wrote a very simple platformer. All this time, the kids have been playing the games and telling me about ideas of what to do next or giving me level designs to put into the games.

And that is what I started making games for. The feedback from my audience (my family) has so far been the most rewarding experience I've ever had in my 11 years of making games. I don't love making games.

I love people experiencing what I've created, and I create games.

Now think. What do you love?

Wednesday, February 23, 2011

It looks like my wife still loves me

I decided to spend a little time writing a game with my kids, and of course, they wanted to do a Minecraft clone. So, after saying no, I said "Hey, but we could do a 2D Minecraft"... and that was when Flatcraft was born.

So, we've been working on it for two weeks, spending about an hour on it per day on average, teaching the eldest how to code in Python and all but the youngest of them how to do art in a limited pixel area. The best bit has been being able to work on the game even though I'm a very long way away from them. We use Dropbox to sync, and I can often see a chat window response to my changes before I tell them that I've made the change!

The kids have enjoyed seeing their art in the game and pygame lets me turn around a 2D game in little to no time, even to the point that I'm now deciding to try out libraries rather than stick with what I know (what I tend to do when coding for my job). And that's made working on the game more exciting from a learning perspective even for me.

We've added a zoom out, which you'll see the result of below, skinning (so that the boys have Minecraft-style textures while mummy has pink fairy princess world), saving (which means I can inspect the worlds they have been creating), and there are a couple of creatures hanging around as proof it can be done (one passive, one aggressive).

Not only do the kids and I like it, so does my wife. She never tried Minecraft, and I'm thinking that it was probably a good thing, as the patience and determination to build that castle, and dig the words out would probably mean I'd never be able to talk to her again.

Well... I think she still loves me.

Each of the pixels is a world square...

So, that's a lot of digging, and a lot of filling in. Yeah. I think she still loves me.

Monday, February 07, 2011

? Backwards Crunch Is

I've noticed over the years that during crunch periods, although people can often lose their minds slightly, they'll also do some of their most creative thinking... and also, during crunch, it's the time when people are least likely to be allowed to implement any new and crazy (or brilliant) ideas.

Now, I once thought that this might be because you tend to think more about the ideas you haven't been able to do anything about, but it is also true that once you enter a certain level of stress, you actually see more stuff going on around you than you would normally. This is called a lowered state of latent inhibition.

Allowing these crunch maddened people to work on their ultra nu ideas would be project suicide, but, surely this must be wasting the talent?

I have tried to make sure I have a log of all the crazy I've come up with over the years, and sometimes act on it when I finally have some time to myself. Do you?

Monday, January 24, 2011

Expect the unexpected

What slows development down more than anything else?

One thing that provides endless amusement to me and I'm sure many others is when libraries or the existing source you have to maintain do something unexpected. When I say unexpected, I mean something that no-one would expect given the standard practices that you've come to understand, or standards you've been lulled into appreciating by the first class obvious nature.

Why would you expect a function titled "addThing" to remove the last thing that was added and replace it with the new one? Why would you expect "addThing" to add to a list, but that list to be handled as if there was only ever the first item that was added in the list?

I think that most of the time I spend debugging something, it's because some common sense led the coder into thinking that a function behaved a certain way, and that was wrong.

Does this mean that the coder responsible for the original code is wrong, or the user of the code? To me it is hard to say as we are all the makers of code, and also users, but why do we end up blaming the coders instead of something else? What else could there be that makes it possible for us to misinterpret the meaning or use of a function? This problem is normally covered by the Ronseal rule, but when functionality needs to change, sometimes people forget to change the name of a function / class.

The simplest solution is to peer review, but that doesn't always work as the peers have to be aware of the dangers too.
Maybe submitting misleading names as bugs?

What would you suggest?