Monday, August 08, 2011

The reason why, and what it's not.

The language I'm working on (Transformative C, or TC) is designed to steer the programmer towards a style in which it is hard to write software that is bad for the hardware.

Here is a list of things it provides in order to persuade the programmer to play with it:
  • No memory leaks. Memory is managed behind the scenes, without the need for garbage collection.
  • All data manipulation is done through shader-like transforms, each of which takes one or more input and output streams (though more than one input stream is not normally recommended).
  • Duck typing (kind of: you can provide a struct translation to use a non-conforming struct in a transform).
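To give a feel for the shader-like transform idea, here is a rough sketch in plain C rather than TC. Everything named here (`Vec2`, `length_squared`, `run_transform`) is made up for illustration, and the pointers only exist because C needs them; in TC the management layer would own the streams.

```c
#include <stddef.h>

/* Hypothetical element type for an input stream (not actual TC syntax). */
typedef struct { float x, y; } Vec2;

/* A "transform" kernel: a pure function from one input element
   to one output element, with no access to anything else. */
typedef float (*Vec2ToFloat)(Vec2 in);

float length_squared(Vec2 v) {
    return v.x * v.x + v.y * v.y;
}

/* Stand-in for the management layer: applies the kernel to every
   element of the input stream. Because the kernel only ever sees one
   element, the elements could be processed in any order or in parallel. */
void run_transform(Vec2ToFloat kernel, const Vec2 *in, float *out, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        out[i] = kernel(in[i]);
    }
}
```

Calling `run_transform(length_squared, in, out, count)` fills `out` with one result per input element. The kernel never learns where its element lives or which element comes next, which is the property that lets the same code target a CPU loop, a GPU or an FPGA.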
Here is the list of things it doesn't allow in order to restrict the programmer from breaking the simplicity:
  • No built-in capacity for object-oriented development, and no types other than the user-defined streams and the elements provided by the language.
  • No pointers (and therefore, to some extent, no memory); there is no real need for them.
  • No recursive function calls. This is just so it plays well with FPGAs and GPUs.
How does this compare to OpenCL?
I think OpenCL might be something that this language can compile to, but at present TC is much higher level. OpenCL gives programmers all they need to make good parallel code run on various parallel architectures, whereas TC restricts you so that whatever you write can run on anything. This distinction can be summed up by my interpretation of the goals of the two languages:
  • OpenCL was developed to give a unified API to access GPUs and beyond, restricting only where necessary while providing fine control over how jobs are issued and to where.
  • Transformative C is being developed so that the programmer never has to think about how transforms are run, and can instead just think about how to write transforms for their data.
Under the hood of OpenCL, developers can shoot themselves in the foot, because it allows the data to be organised in any way they see fit. Luckily, most hardware can use striding to get around programmers' habit of dropping things into structs. However, this does not help older hardware that can't stride, or newer hardware designs that might have an advantage if they could guarantee non-sparse reads.

Under the hood of Transformative C, everything will be the simplest array possible, with inputs and outputs being more like Structs of Arrays, and all data streaming handled by the management layer, which can choose whichever layout best fits the data usage pattern.
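The difference between the two layouts can be sketched in plain C. This is only an illustration of Array of Structs versus Struct of Arrays, not how TC stores anything internally; `ParticleAoS`, `ParticlesSoA`, `aos_to_soa`, `total_mass` and the fixed `COUNT` are all hypothetical names.

```c
#include <stddef.h>

#define COUNT 4

/* Array of Structs: what programmers tend to write. A transform that
   only needs `mass` has to stride over x, y and z for every element. */
typedef struct { float x, y, z, mass; } ParticleAoS;

/* Struct of Arrays: each field is its own dense array, so a transform
   that only needs `mass` reads one contiguous block. */
typedef struct {
    float x[COUNT];
    float y[COUNT];
    float z[COUNT];
    float mass[COUNT];
} ParticlesSoA;

/* The kind of re-layout a management layer could do on the programmer's behalf. */
void aos_to_soa(const ParticleAoS *in, ParticlesSoA *out) {
    for (size_t i = 0; i < COUNT; ++i) {
        out->x[i] = in[i].x;
        out->y[i] = in[i].y;
        out->z[i] = in[i].z;
        out->mass[i] = in[i].mass;
    }
}

/* Reads only the dense `mass` array: no strided or sparse access. */
float total_mass(const ParticlesSoA *p) {
    float sum = 0.0f;
    for (size_t i = 0; i < COUNT; ++i) {
        sum += p->mass[i];
    }
    return sum;
}
```

In the Array of Structs version, consecutive `mass` values sit 16 bytes apart (assuming 4-byte floats and no padding), so hardware that can't stride ends up reading three unwanted floats for each one it needs. In the Struct of Arrays version the same values are adjacent.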

1 comment:

Eli Ford said...

This is really interesting. I'm especially curious about how you choose an efficient order of operations, and what kind of memory model you use for storing intermediate results.