The Optimization Game

I guess I'll play this game for a bit… my response to Joseph's response (Overcoming Swift Resistance Explored) to my blog article about debug build performance, especially regarding the performance of arrays.

First I want to acknowledge that Swift Debug builds are hideously slow.

We should be able to stop right there. We both agree that non-optimized builds are hideously slow. But…

I still don't think it is an absolute block on any real development because there are fairly simple workarounds.

Maybe, just maybe, we have different "real" development that we're doing. Could there possibly be a significant difference between the types of apps we are attempting to build?

The answer is obviously: yes.

I'm trying to prototype a game layer that must run at a minimum of 30Hz. Most apps are not real-time systems; in a game, every millisecond counts. With a handful of milliseconds you could have better particle effects, better game AI, or more realistic physics. You only get 0.033s (33ms) per frame (@30Hz) to do your work. The more frames you have to spend computing things, the worse the overall experience your players are going to have.

The other problem with this statement (and the solution presented) is that it assumes we can optimize the slow parts of the game by putting the code into a framework. Ok… besides the annoying mess of pulling stuff in and out of there, and completely ignoring that this is a trivial example with just one algorithm to move, it completely misses the fact that the very slow part is exactly what I need to debug because something strange is going on. If my debug build is now running at 2 or 3Hz, it becomes extremely difficult (impossible, really) to actually play the game and reproduce the error that needs to be debugged.

So while it is most definitely not that huge of a deal for most app developers, it most certainly is for anyone processing a large amount of data or building a real-time system. It creates a tremendous developer-workflow problem.

These are the developers I'm talking to. These are the ones the blog post is for. Do not use arrays (as of Xcode 6.1.1) because they will attempt to suck your soul from your very body during development if you process a lot of data from them in tight loops.

If you are not iterating over a lot of data frequently, don't worry too much about it. But if you're building a SpriteKit game… don't use an array to store all of your elements that you'll be iterating over because it's going to suck if you have a bunch of them.

Again, my focus is on developer workflow. That is extremely important to me. Every slowdown I hit is an impediment to me and to solving my code problems.

Swift is Fast!

We get it, you think Swift is fast enough. Great!

So, I downloaded your code and ran both the ObjC and Swift versions (CMD+R within Xcode).

-Onone: [Pixel] avg time: 0.0175069s, stddev: 0.000430218s, diff: 82%
-O:     [Pixel] avg time: 0.0177036s, stddev: 0.000400837s, diff: 11%

Ironically, the -Onone build kept coming out slightly faster, but there is some variance here, so all good. 82% faster than the ObjC debug version and 11% faster than the ObjC release build. Great! The workarounds suck, but ok.

There's one major oversight in these numbers though: the ObjC version is nowhere near optimal. We can help the compiler out in a very trivial way by changing the ObjC[1] render function to this:

void RenderGradient(RenderBufferRef buffer, int offsetX, int offsetY)
{
    int width = buffer->width + offsetX;
    int height = buffer->height + offsetY;
    uint32_t *pixel = (uint32_t *)&buffer->pixels[0];

    for (int y = offsetY; y < height; ++y) {
        for (int x = offsetX; x < width; ++x) {
            *pixel++ = 0xFF << 24 | (x & 0xFF) << 16 | (y & 0xFF) << 8;
        }
    }
}
That's just taking 5 minutes to break out the common operations. Maybe more can be done, but I wouldn't want to go further than that without knowing there was a real need, because the code above is still clear and readable.

The timings after:

-Onone: [Pixel] avg time: 0.0178055s, stddev: 0.000500147s, diff: 53%
-O:     [Pixel] avg time: 0.0184924s, stddev: 0.00152481s, diff: -94%

The debug speed is good. You beat me there. Of course, yours is the optimized version. Fortunately, I can do the same hack you did, but much more easily, since per-source-file compiler flags are supported for ObjC.

So, pulling RenderGradient out and telling the compiler to always compile it with the -Os flag gives us this result instead:

-Onone: [Pixel] avg time: 0.018294s, stddev: 0.00158887s, diff: -90%
-O:     [Pixel] avg time: 0.0180182s, stddev: 0.00145248s, diff: -89%

So no, Swift is not faster than ObjC if you take the time to optimize the code path as you did with the Swift version. In fact, Swift is now twice as slow as the more optimized version of the ObjC code.

The pull request showing what I did is here:

  1. I keep calling it ObjC code because it's in a .m file, but this is just C.