Jonas Lund

2014-09-24

A response on Jonathans Blow's first talk about programming languages.

(The talk is here: https://www.youtube.com/watch?v=TH9VCN6UkyQ )

It was a long talk full of thoughts and summarizing all implications of it wasn't possible without writing a long text so i put it into a few sections and i might also appear terse so please correct me and let me expand on things if needed. I'll also try to refer to other peoples work if possible so be prepared to do a little leg work to keep track of my points.

Part 1, something similiar

Having dabbled a lot in programming languages for games during a couple of years i've been on roughly the same train of thought as Jonathan presents in the talk. I actually got as far as implementing a preprocessor on top of C that did some of the main parts mentioned in the talk,

Back when i did this preprocessor i did a small sample that illustrated most of the features.

int main(int argc,char **argv) {
 List* l=@[ @Test{auto free char *str=strdup("hello"),int num=1},@Test{auto free str=strdup("world"),num=2} ];
 List_For(l,@void(Test* t) {
  printf("%s ",t->str);
 });
 printf("\n");
 List_Destroy(l); // should delete the list and the sub entries and their strings
 return 0;
}

Apart from being mostly regular C code there was a few additions.

The @[] syntax creates a "List" object.
The @Test{ } syntax declares the type and allocates an instance of the type Test, (This feature was meant for throwaway and intra-module objects but the syntax also allows one to construct objects with custom property values as shown in the second instance)
The "auto free" annotation inside the declaration tells the preprocessor to invoke the function "free" on the contents of str when destroying a Test object.
The @returnType(Args) { body } syntax creates a lambda that's passed as an argument to the List_For function
And finally the List_Destroy(l) will destroy the list and all contents inside thanks to the auto syntax

I also stored full type information to aid auto-serialization as can be seen in this bigger sample that uses annotations embedded in the objects to load a hierarchical 2D sprite animation from a JSON file (the functionality is similar to Spriter). The lambda used in the get function with JSONUnpack is there to make correct type selection (allowing for some dynamism in terms of object loading)

The motivation for doing this pre-processor was a notion of wanting to write C as i wrote my JavaScript code (not worrying about memory deallocations, constant type specifications, lambdas and far simpler serialization) without losing too many benefits of having an environment like MSVC. For these parts it worked fairly well. Being a C preprocessor also bought with it a couple of nice side benefits.

I had hot code loading working (Via TCC) and actually used that part for iteratively making some jam game.
Since it was just a C preprocessor i was also able to debug the code fairly well from MSVC since the types was more or less C types and with #line directives i had correct source mappings (Closed over variables in lambdas could be tricky since they got embedded inside objects rather than be on the regular stack).
I had some dynamic dispatching code added for objects (similar to Objective C) since it was a relatively trivial addition compared to what i had already. Didn't find myself using it that much though.

So where did it go wrong? Since i didn't go further than being a C preprocessor i skimped on making proper modules. I added a module syntax where Anim::get("some.json") would refer to function get in the the module compiled from Anim.shi and while this was a good solution it became troublesome when referring to "structs" that were changing within another module when coexisting in a MSVC project that compiled each module when changed but didn't have any notion of dependencies (Looking at it in retrospect i should have probably just gone for full program compilation or at least linking as a pre-link step instead of trying to fit everything into separate modules).

So the direction of the talk has valid points and by handling a few things i did wrong the proposed language could be there already (with the exception of required tooling).

As a little side note that during this project i did remember finding myself thinking that i was reimplementing Objective C at many times. Not doing C/C++/ObjC all over when doing a similiar lowlevel language is kind of hard. :)

Part 2, general memory management

One of the first points Jonathan raised was to avoid GC's, yet he quickly comes back to it as the main point of automated management (the debug build double free detection would be composed of GC like code). I think the points he mostly thinks about is some kind of declarative hierarchy of memory management that automates a lot of C++ boilerplate similar to SQL's cascading deletes. While i fully agree that many parts of C++ are overly complexity much of it stems from being flexible in terms of how one can code while at the same time allowing for custom memory handling.

A prime example is actually mentioned in the talk, namely lambdas and the different forms of declaration (And i think this partially refers to the capture rules), they exist due to the funarg problem of lambdas. You usually have 2 principal kinds of uses of lambdas, either as a means of composing functionality within a local context (apply code X to every node in an arbitrary tree hierarchy) and the delayed callbacks. The first example of compositions will get bad performance hits if you start allocating data needlessly while the second one of delayed callback will cause faults if things wrongly exist on the stack. With a GC'd language this problem is usually shunted onto the GC as a performance problem (this one of the primary reasons for having a generational or a collector with similar characteristics) or you end up with a situation like in C++ where syntax has to be introduced to handle the details manually.

And, outside of this the talk also skirts in at times on problems readily solved by GC's. So let's go into some technical parts about GC's before throwing away the baby with the bath water to see what exists, how it works and finally what kind of solutions we could look at. Because by saying no to GC's directly we are 100%'ing on performance while possibly skimping on happiness.

There are 2 basic approaches to garbage collection, reference counting and tracing methods. To recap, ref counting slows down mutation since all reference handling slows down the mutator (program code), makes the compiler more complex and has problems with cycles, pure tracing on the other hand causes GC pauses that are highly undesirable for games. A lot of optimization work has gone into removing the worst problems associated with tracing and as long as memory isn't too short and the language/runtime helps out then soft realtime targets are attainable. These optimizations work by merging characteristics from ref counting with tracing as proven by some guys who did a lot of research into GC's.

Also an incremental GC (the minimal you want to avoid the worst pauses) either requires compiler support in terms of write barriers OR OS support for write barriers via paging (This method MIGHT be possible with some custom code hacks via MMAP).

While i like GC's the fact is that for a full program with many allocations you will either need to have ample memory available at all times or suffer from GC pauses. And we all know that artists won't allow for ample amounts of memory to be available ;)

Part 3, value memory management

Rust is mentioned a few times since it takes a fairly novel approaches to memory management in the low level world. In many ways Rust is an attempt to popularize and make practical some of the desirable characteristics for the low level world found in Erlang, funnily enough Erlang itself in may ways was an attempt at making features from languages like Prolog and ML practical (although not the same features).

In Erlang you work with immutables (this helps the Erlang GC's work as you are guaranteed that older objects are ALWAYS referenced by newer object), an observation of immutables also leads to the fact that you can guarantee that a linear language as first defined by Baker will work (one reference only, as in Rust). I actually did 2 linear dialects meant for demos/games long ago, one very fast JIT interpreter of something quite near Bakers paper and another far slower one that ran on J2ME to support saving the entire program state for a Lucasarts style adventure game, Having done these i kinda agree on the fact that working with such constraints puts a bit of unnecessary burden when doing application programming and this was what we were out to avoid in the first place.

Part 4, hammers and nails

Full system GC's can be a problem performance wise, yet the ways around them are troublesome or potentially buggy (setting aside programmer competence for a second). With one language we couldn't (perhaps shouldn't) try to do everything with one hammer. What of the option of system and scripting/application languages? Code performance of scripting languages sucks, they come with GC's and writing bindings for them has been a big on the level of managing loading and saving?

V8 and LuaJIT are examples of modern interpreters with dynamic typing that is several times faster than the systems of old, if you further restrict yourself to static typing via inference you could actually gain even more speed (more on this later). Like Erlang has shown (and this has inspired Rust) local heaps can be collected much quicker without affecting the entire system, an analogue to this is that application/scripting languages could live in a smaller part of the "world" that is garbage collected but due to the smaller size any pauses would be manageable. Finally the pain of writing bindings since you usually in the end usually need to access about every part of the project via scripting or move it to C/C++ code, notice how i mentioned loading and saving in the last paragraph? As serialization is solved by introspection so can bindings be with compilers that give us more information, sadly the single biggest omission of C++11 is the total lack of standard enforced type information and it's holding us back. (See for example how Nashorn and Rhino before it is able to interact with Java code while not seamlessly atleast deeply enough).

In terms of memory management it could be summarized as the low level part having a restrictive system, something in terms of declared ownership structure like my preprocessor auto and the talk's ! annotation weighted in a value like system of Rust exposed in an uniform way to GC'd application code that can easily follow the more restrictive rules when accessing lowlevel parts. (And also allow the application GC's to introspect system stacks to enable better GC when interacting through system-app codepaths).

Part 5, other ideas

As for concurrent compilation, yes and no. Many features are helped by that but benefits from full program analysis are lost (There might be a middle ground here). Eclipse is a great example of a compiler that takes the worst parts out of compiling by incrementally compiling files as they are saved.

Language concurrency has been getting far better by the years, C++ AMP is a good example right in the mainstream already and of course the actor model that many Erlang buffs swear by.

A idea i think gets an extra look in this context is coroutines/continuations since they could replace use for callback lambdas(compare it to people talking of function stacks of doom in the Node.JS community) and constrain the usage lambdas to functional compositions instead.

Type inference was bought up (partly in the context of duck typing), one very exciting project in terms of performance is Shed Skin Python but it has very long compile times (apparently 10 minutes for 6000 lines!). That project however tries to solve a fairly hard problem of trying to mold an uncooperative language into a static C++ form and has to deal with n-expansion. (Imagine writing a Generic binary tree container, when you use that container with different types an inferring compiler must figure out that the tree container object isn't generic itself but that the tree nodes contained within it are generic and that it should fix itself to the nodes types).

Languages like Ocaml, Rust,etc suits better to type inference and when doing constrained type inference one can results fairly quickly compared to when working with an uncooperative language, runtime updating of running code should be fairly easy also since you only need to allow updates of code that doesn't violate the existing constraints and updating a constraint net isn't too bad in a bigger context of type inference.

Conclusion

My belief is that while we could easily do things proposed in the talk, my C preprocessor was done as a hack in 2 weeks (refined over a month while doing another project) and got half of the things mentioned in (and added other things).

However i'm leaning more on that we really need to address the lack of reflection in C/C++ as a stumbling block to future languages that has C relegated to "inline assembly" style tasks for performance and figure out a powerful but usable language on roughly the same level of abstraction as Rust, Haxe and Swift. I'm not entirely sure if the language should have constructs to help compact memory for cache efficiency or if that should be left to C code, some minimal constructs would probably solve most used cases.

Wow, that was a monster post and i hope that at some point in time someone manages to get through this post :)

And another long hiatus yet again.

In the previous post i noted that i had started looking at making games again, doing that however bought me back to languages yet again and i've been working on static compilation of javascript for the past year now (with breaks for ludum dares and smaller projects like my GBJam 3 project that i might post more about later).

This static compilation project is now my thesis work so i'm hoping that i'll be able to finish it in the near future, a small experimental project showed that statically compiled code with definite types can surpass even V8 in some specific cases when it comes to performance since a custom static compiler has less to worry about hostile code than a regular browser does.

More on this project later. Next up is a response to Jonathan Blow.

2011-10-08

Long time no posting

Silly of me not to update this blog more often. Since i last posted (and was active on this blog) i've completely changed the direction of my company, basically i had plans to make a Java Bytecode-> C compiler to enable game companies to port their J2ME games to various platforms. Then the iPhone really gained traction and most people abandoned J2ME game development (not negative really!), i also heard from some friends at a company doing some similiar tools about how their time is mostly spent on support (i went on my own to be able to focus on doing products and not support/consulting in the long term) and then Oracle happened (no need to elaborate on that) and that was the final nail in that coffin.

During the time since the last blogging i've also been on leave with my kids and spent some time trying to figure out what to focus on. In the end the most sane choice (even with all competition) seemed to get back into actual game development and try to make some money.

During this time i've made a few game prototypes i'm now in the process of porting one of them to the iPhone. The link to the playable Ant General prototype is here (the performace is sadly bad on FF3 but it should work pretty well on most modern browsers).

2009-03-25

Bureaucracy

My companys first fiscal year ended in August last year, that means that i must submit a fully audited financial report to the government within 7 months or face a fine. As this was my first time doing this it took a while to get it done and when i finally came around to submitting it the auditing firm about a month ago they apperantly were short staffed due to sickness. Finally i got a call on monday where they said that they'll be finished this week.

Still i'm only one week away from the final submission date and as the agency(Bolagsverket) is warning about a 5 day "handling" time i'm really on a short limb here. After reading up on the rules regarding online submission i really wasn't sure about how they do things as it seems like incorporated firms must use various kinds of software to generate files in some kind of format, so i decided to call them up and ask about applies if you don't want to spend money just to send something (or by some ideological or other choice would be running an operating system that doesn't have these apps).

The conversation went something like this:
Me: Hi, i'm wondering if i could submit a scanned PDF through an authenticated service instead of sending in papers? (Afaik everything will be scanned to pdf by them)
Lady on the phone: Hang on a second It says here that you have to have this program..
Me: yes, i read your page but is there a specific reason you can't accept a scanned document for an incorporated company but you will accept it for private business?
Lady on the phone: um... um.... um.... well ... umm ... umm.. well because you have to use one of these programs.
Me: <*sigh*> Ok nevermind

I don't think that this type of behaviour is an exclusivly Swedish thing, but i suspect it's more common here due to labour laws and the way the Swedish government tries to subsidize cities that looses jobs due to big companies shutting factories or the government closing down things like military bases.

Now before any politically correct Swede comes around and accuses me of being a "big-city-slicker", i'd like to point out the fact that i grew up in a small town in the northern part of the country (Haparanda) and i actually belive that you can make a financial case of moving many service jobs out to smaller towns with lower living costs.

But as with everything you actually have to have a plan and do stuff properly, the Swedish model of providing a job to people to keep them employed for the sake of employment and reducing unemployment numbers doesn't help ANYBODY the slightest bit.

2009-02-19

Late, but better that than never.

When i initially "started" this blog i was kinda swamped with my creating the reports for my previous fiscal year, so the promised post didn't show up. And after finishing that i've been sick and had to catch up a bit on other work. Regardless of that, here's my second post.

24hour business camp has passed (yeah it passed when i wrote the first post) and i gotta say that it intrigued me in several ways.

Partly as someone else already commented it did remind me of the demoparties i used to attend, people comming together and grinding out something before some arbitary deadline. Many people (i'm a prime example myself) tend to have this idea about doing stuff really well and forgetting that stuff needs to be out to be relevant at all. The counter argument is usually that you might end up with other headaches such as supporting buggy behaviour,scaling and such, but before you release you really don't know what those problems are going to be.

As for the sites people released at the event i gotta say that it was quite a mixed bag in my opinion. It wasn't that exciting to see a bunch of me-too that were some of the first ones i spotted. The other thing that saddened me was that some people were just there to get publicity for rather generic sites.

On the upside there was a bunch of really cool ideas and some sites that weren't original but certainly filled a niche that given time, exposure and luck oughta be profitable. I've actually forgot what ones i voted for but 2 of them were tampongdax and barnamun simply because i think they should be able to turn a profit big enough to sustain themselves. I think i missed the point of 90sec when voting earlier but i would definetly say that it goes within that same category.

I think Voicly was the truly most unconventional idea of the entire camp and somehow i wasn't surprised that one of the authors was Fredrik Berglund that also was in on the initial idea of Squace. As cool as the idea is i don't think it'd ever be truly successful, but i for one would be glad to be proven wrong as Fredrik is a cool guy.

Of all entries for 24hbc my definite favourite idea was SiteGuard. I don't know if it tells more about me or the crowd voting at the 24hbc site that SiteGuard was actually in the absolute bottom.

Malware and spam on the net isn't going to decrease, on the contrary at the same rate as more customized software(webapps and sites) is deployed on the net there are going to be exploits targeting those. Already today spamming and malware scripts targets most login sites and sites protected with captchas, providing us with a situation where you will end up with nasty things on your pages regardless of the measures you put in. Also as a developer you always don't get exactly the view of your site that your customers do most of the time, it's quite common to find problems with sites during off-hours just because you're in a more "common" environment when surfing from home for example.

Because of these reasons i feel that SiteGuard has the potential to protect the site that your customers see, saving both manual effort required to remove things and embarassment when you do miss things that your customers get exposed to simply because they work as an external resource. The main problem of SiteGuard is that the site doesn't work(?), the site is also very light on information about what malware is detected and finally there seems to be a total lack of customization for your own site. F.ex. it would be a good idea to be able to provide a sample-login for the scanner to be able to access the "inner" site. Regardless of all this, the idea is brilliant.

PS: I'm at full speed working on my first public product, i'm hoping to release more information and a beta soon.