Swift Namespaces

I've seen some people complaining about the lack of namespaces in Swift… well, I'm not sure if I've seen this anywhere else, but you can fake namespaces if you really must.

enum std {
    static func add(x: Int, _ y: Int) -> Int { return x + y }

    enum more {
        static func sub(x: Int, _ y: Int) -> Int { return x - y }
    }

    struct apples {
        var count: Int;
    }
}

std.add(1, 2)
std.more.sub(1, 2)

var apples = std.apples(count: 12)
apples.count = 10

Now, I don't really recommend this, but it's kind of fun to hack around sometimes.

Update

Oh… and we can mimic using too:

typealias m = std.more
m.sub(1, 2)

I think you still need to have it qualified though, so no naked use of sub.
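To make that concrete (this reuses the std and m declarations above; the error text is roughly what the compiler reports):

```swift
m.sub(1, 2)    // works: shorter, but still qualified
// sub(1, 2)  // error: use of unresolved identifier 'sub'
```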


Swift v1.2 β2 Performance

I've been taking a look at performance of Swift for a while now, with two very specific focus points:

  1. Developer workflow (non-optimized builds)
  2. Real-time applications (fully-optimized builds)

I'm very happy to report that Swift v1.2 β2 seems to have made some really great improvements in performance, even over Swift v1.2 β1.

Performance Numbers1

Language: C, Optimization: -O0                    ┃  6.3 β2  ┃  6.3 β1  ┃  6.1.1   ┃
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━┩
RenderGradient (Pointer Math)                     │   38.651 │   38.266 │   38.780 │
──────────────────────────────────────────────────┴──────────┴──────────┴──────────┘

Language: C, Optimization: -Os                    ┃  6.3 β2  ┃  6.3 β1  ┃  6.1.1   ┃
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━┩
RenderGradient (Pointer Math)                     │   10.374 │   10.042 │   10.162 │
──────────────────────────────────────────────────┴──────────┴──────────┴──────────┘

Language: C, Optimization: -Ofast                 ┃  6.3 β2  ┃  6.3 β1  ┃  6.1.1   ┃
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━┩
RenderGradient (Pointer Math)                     │    3.112 │    3.022 │    2.976 │
──────────────────────────────────────────────────┴──────────┴──────────┴──────────┘

Language: Swift, Optimization: -Onone             ┃  6.3 β2  ┃  6.3 β1  ┃  6.1.1   ┃
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━┩
RenderGradient ([Pixel])                          │ 3684.914 │ 4012.283 │ 29134.32 │
RenderGradient (UnsafeMutablePointer)             │ 171.3359 │ 162.9272 │ 152.2643 │
RenderGradient ([Pixel].withUnsafeMutablePointer) │ 386.9801 │ 523.7199 │ 1159.273 │
──────────────────────────────────────────────────┴──────────┴──────────┴──────────┘

Language: Swift, Optimization: -O                 ┃  6.3 β2  ┃  6.3 β1  ┃  6.1.1   ┃
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━┩
RenderGradient ([Pixel])                          │ 26.91437 │ 46.57198 │ 66.40995 │
RenderGradient (UnsafeMutablePointer)             │ 22.71384 │ 21.74668 │ 22.21477 │
RenderGradient ([Pixel].withUnsafeMutablePointer) │ 22.33094 │ 39.78517 │ 39.62590 │
──────────────────────────────────────────────────┴──────────┴──────────┴──────────┘

Language: Swift, Optimization: -Ounchecked        ┃  6.3 β2  ┃  6.3 β1  ┃  6.1.1   ┃
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━┩
RenderGradient ([Pixel])                          │ 19.66853 │ 40.28331 │ 66.04476 │
RenderGradient (UnsafeMutablePointer)             │ 20.02236 │ 19.53774 │ 19.51404 │
RenderGradient ([Pixel].withUnsafeMutablePointer) │ 20.25013 │ 41.26989 │ 38.34509 │
──────────────────────────────────────────────────┴──────────┴──────────┴──────────┘

This is great! The worst case, [Pixel], in non-optimized builds was 29s back in Xcode 6.1.1 (the latest official build). In the latest beta, that's down to 3.6s! Sure, that's still horribly slow, but it's a massive improvement.

The other thing to note is that nearly all of the optimized builds are now roughly twice as fast as before. Most importantly, arrays in optimized builds are now just as fast as the pointer versions. That's a big win for code clarity. Once the debug performance of arrays is fully addressed, this will be even better.

Hopefully 6.3 β3 shows just as remarkable improvements.

Also, I've started a GitHub repo to start tracking this stuff better:

https://github.com/owensd/swift-perf.

Feel free to send any pull requests for items you'd like to see tracked here. I do plan on doing some updates to the infrastructure piece of it to get some easier delta reporting.

  1. The times should be read as: "the amount of time, in milliseconds, it took to compute 30 iterations".

Swift’s Protocol Extension

I have been working on porting over my ObjC platform layer from Handmade Hero, and I have a bit of code that looks like this:

self.aboutMenuItem.title = [self.aboutMenuItem.title
                            stringByReplacingOccurrencesOfString:DEADZONE_TITLE
                            withString:APP_NAME];
self.hideMenuItem.title = [self.hideMenuItem.title
                           stringByReplacingOccurrencesOfString:DEADZONE_TITLE
                           withString:APP_NAME];
self.quitMenuItem.title = [self.quitMenuItem.title
                           stringByReplacingOccurrencesOfString:DEADZONE_TITLE
                           withString:APP_NAME];
self.helpMenuItem.title = [self.helpMenuItem.title
                           stringByReplacingOccurrencesOfString:DEADZONE_TITLE
                           withString:APP_NAME];
self.window.title = [self.window.title
                     stringByReplacingOccurrencesOfString:DEADZONE_TITLE
                     withString:APP_NAME];

The point of this code is to go through and replace all of the text that has the target name in it with a string defined by a config file. But you can easily imagine that this is any code that is starting to look fairly redundant across a set of various differently typed items.

This is really just a simple case of iterating over the items in the collection and setting each title.

for (var item) in items {
    item.title = item.title.replace(placeholder, withString: title)
}

The astute readers out there may have caught an issue with this approach though: the window is of type NSWindow and the other items are of type NSMenuItem. That's not going to work, as they share no base class that contains a title property.

We have two choices here:

  1. Handle window on its own separate line, or
  2. Create a protocol extension so they do have a common interface with a title property.

I want my code to capture the intent as much as possible with as little redundancy as possible, so option #1 is out.

So, let's think about option #2. What I really want to say is that both NSWindow and NSMenuItem share a common set of behaviors, or more specifically in this case, a single behavior: title.

Now, it seems that they actually already do share this behavior and that it is just not codified. However, if you look at the declaration of each, you'll notice that NSWindow actually returns a String? for title and NSMenuItem returns a String (note the optional vs. non-optional difference).

Ok… that's annoying, but we can work around that!

protocol HasTitle {
    var ht_title: String { get set }
}

extension NSMenuItem: HasTitle {
    var ht_title: String {
        get { return self.title }
        set { self.title = newValue }
    }
}

extension NSWindow : HasTitle {
    var ht_title: String {
        get { return self.title ?? "" }
        set { self.title = newValue }
    }
}

With the protocol extension HasTitle, we have now codified what the behavior semantics are. It's then trivial to implement that appropriately for each type that you want to share this HasTitle codification.

The final code to make use of this now looks like:

let items: [HasTitle] = [aboutItem, hideItem, quitItem, helpItem, window]
for (var item) in items {
    item.ht_title = item.ht_title.replace(placeholder, withString: title)
}

I almost forgot… you'll probably paste this code in and wonder why it is not working for you. I've created an extension on String that adds the replace function. It simply forwards the parameters into stringByReplacingOccurrencesOfString.
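That helper is roughly this (a sketch; the exact signature is my guess at matching the call sites above):

```swift
import Foundation

// Hypothetical String.replace helper assumed by the loop above; it just
// forwards to Foundation's stringByReplacingOccurrencesOfString.
extension String {
    func replace(target: String, withString replacement: String) -> String {
        return self.stringByReplacingOccurrencesOfString(target,
            withString: replacement)
    }
}
```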

Now, unfortunately, Swift needed a little bit of type annotation help to make it all work 100%, but we've done it. With one simple protocol and two easy class extensions, our usage code has become significantly cleaner, easier to update, and less error prone.

Could we bring this technique back to ObjC? Of course. Just write the simple for-loop to begin with. However, you'll probably stop there because that is all that is needed to make the ObjC side happy as we can send messages to any object instance; we can completely forgo the HasTitle protocol.

However, that is the part that I think is vastly more important, as it's that protocol that leads us into thinking about better composition models. I hope to do a longer blog post about that at another time.

Happy Swifting!

Update, February 16th: I removed the usage1 of map as it was an abuse of the function. I used it because I didn't want to use the ugly version of the for-loop and I was having issues with the iterative version. Turns out I just need (var item) instead of var item. Seems like a compiler bug to me…

  1. Ok… it was really a very bad abuse of the map() function.

Swift v1.2 Performance

In a series of posts I wrote a few weeks back, I looked at the performance of Swift in the context of debug (e.g. non-optimized) builds. The developer workflow pain that I was having was significant.

Of course, today a new build of Xcode dropped that finally (I jest a little) contained a new build of Swift: Swift v1.2.

So, is it any better?

I've spent about ten minutes playing around with my projects (described in the blog posts mentioned above). I ran into the following issues:

  • Xcode crashed opening a source file (/sigh)
  • The Swift v1.2 conversion missed updating two cases:
      • Byte was removed and not automatically changed to UInt8 in the initializer case.
      • UnsafeMutablePointer<Pixel>.null() was not replaced with its proper usage: let pixels: UnsafeMutablePointer<Pixel> = nil.

The crash was annoying. However, the Swift conversion process went well and I don't really care that they missed those two cases.
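For the record, the two manual fixes amount to this (a sketch; the Pixel layout is taken from my test code):

```swift
// Swift v1.2 spellings of the two missed conversions:
struct Pixel {
    var red: UInt8      // was `Byte` in Swift v1.1; Byte was removed
    var green: UInt8
    var blue: UInt8
    var alpha: UInt8
}

// `.null()` is gone; a null pointer is now written with nil:
let pixels: UnsafeMutablePointer<Pixel> = nil
```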

So…?

SO MUCH BETTER!! BUT…

The optimized ObjC RenderGradient() function looked like this:

void RenderGradient(RenderBufferRef buffer, int offsetX, int offsetY)
{
    int width = buffer->width + offsetX;
    int height = buffer->height + offsetY;
    uint32_t *pixel = (uint32_t *)&buffer->pixels[0];

    for (int y = offsetY; y < height; ++y) {
        for (int x = offsetX; x < width; ++x) {
            *pixel = 0xFF << 24 | (x & 0xFF) << 16 | (y & 0xFF) << 8;
            ++pixel;
        }
    }
}

The timings I saw on my machine were:

-O0   0.038723s
-Os   0.010972s

I tested against three different Swift loops:

Raw UnsafeMutablePointer

func RenderGradient(var buffer: RenderBuffer, offsetX: Int, offsetY: Int)
{
    var offset = 0
    for (var y = 0, height = buffer.height; y < height; ++y) {
        for (var x = 0, width = buffer.width; x < width; ++x) {
            let pixel = Pixel(
                red: 0,
                green: UInt8((y + offsetY) & 0xFF),
                blue: UInt8((x + offsetX) & 0xFF),
                alpha: 0xFF)
            buffer.pixels[offset] = pixel;
            ++offset;
        }
    }
}

Pixel Array

func RenderGradient(inout buffer: RenderBuffer, offsetX: Int, offsetY: Int)
{
    // I truly hope you have turned down the number of iterations or you have picked
    // up a new build of Swift where this is not dog slow with -Onone.
    var offset = 0
    for (var y = 0, height = buffer.height; y < height; ++y) {
        for (var x = 0, width = buffer.width; x < width; ++x) {
            let pixel = Pixel(
                red: 0,
                green: UInt8((y + offsetY) & 0xFF),
                blue: UInt8((x + offsetX) & 0xFF),
                alpha: 0xFF)
            buffer.pixels[offset] = pixel;
            ++offset;
        }
    }
}

Pixel Array with UnsafeMutablePointer

func RenderGradient(inout buffer: RenderBuffer, offsetX: Int, offsetY: Int)
{
    buffer.pixels.withUnsafeMutableBufferPointer { (inout p: UnsafeMutableBufferPointer<Pixel>) -> () in
        var offset = 0
        for (var y = 0, height = buffer.height; y < height; ++y) {
            for (var x = 0, width = buffer.width; x < width; ++x) {
                let pixel = Pixel(
                    red: 0,
                    green: UInt8((y + offsetY) & 0xFF),
                    blue: UInt8((x + offsetX) & 0xFF),
                    alpha: 0xFF)
                p[offset] = pixel
                ++offset;
            }
        }
    }
}

With -Onone, the timings are (the diff % is compared against the optimized ObjC code presented above):

Swift v1
Swift Rendering Tests: 30 iterations per test
---------------------
UnsafeMutablePointer<Pixel>     avg time: 0.186799s
[Pixel]                         avg time: 27.9754s
[Pixel].unsafeMutablePointer()  avg time: 1.18535s
Swift v1.2
Swift Rendering Tests: 30 iterations per test
---------------------
UnsafeMutablePointer<Pixel>     avg time: 0.185078s, stddev: 0.00442746s, diff: -377%
[Pixel]                         avg time: 3.92371s, stddev: 0.0777589s, diff: -10032%
[Pixel].unsafeMutablePointer()  avg time: 0.518955s, stddev: 0.00921328s, diff: -1239%

With -O, the timings are:

Swift v1
Swift Rendering Tests: 30 iterations per test
---------------------
UnsafeMutablePointer<Pixel>     avg time: 0.0223397s
[Pixel]                         avg time: 0.0287606s
[Pixel].unsafeMutablePointer()  avg time: 0.0402743s
Swift v1.2
Swift Rendering Tests: 30 iterations per test
---------------------
UnsafeMutablePointer<Pixel>     avg time: 0.022574s, stddev: 0.00115951s, diff: -105%
[Pixel]                         avg time: 0.0268918s, stddev: 0.00143757s, diff: -144%
[Pixel].unsafeMutablePointer()  avg time: 0.0414112s, stddev: 0.00207772s, diff: -276%

There are a few large takeaways here:

  1. The raw UnsafeMutablePointer<Pixel> is still significantly faster in all scenarios; no performance change between Swift versions.
  2. The [Pixel] version has significantly better performance than before. It seems the optimized build even got slightly faster. HOWEVER, the array solution is still not good enough, as it can only render at about 7 to 8Hz (target is 30Hz).
  3. Keeping the [Pixel] array and using the unsafeMutablePointer() call for "hot areas" of your code has also gotten much better. It now gets us to the target 30Hz in non-optimized builds, but it is still nearly 3x slower than simply using an UnsafeMutablePointer<Pixel> to begin with.

My key takeaway:

Swift is getting better and faster, but it's still not suitable in the domain space I'm currently working in for this project: high-performance, real-time systems.

Eagerly awaiting Swift vNext!


Swift, Protocols, and Composition

Ash Furrow wrote up this post yesterday: http://ashfurrow.com/blog/protocols-and-swift/. It's a good post, you should read it. The problem he outlines is this:

"Here's the problem: you have two types that want to talk to each other."

Ash's solution to the problem was to use protocols to essentially turn the problem inside out. His solution definitely works and it's better than his original way of doing it. It did get me thinking about solving the same problem in maybe a slightly more flexible way.

The drawback with Ash's solution, in my opinion, is that it still takes a heavy leaning on an object-oriented paradigm. I don't think we need that, especially in this case.

If we break down the problem into parts, I think we end up with this:

  1. A set of nutritional information (calories and hydration level).
  2. Types of food (wet and dry) that can affect that nutritional information.
  3. An animal type (cat) that has a set of nutritional information and the ability to eat food.

The issue with the object-oriented approach is that it coupled the notion of eating with the animal type (see the eat extension). I don't think we really want that. Instead, we want to capture the relationship between food and nutrition through some eat function.

Let's setup the types:

enum Food {
    case Dry(caloriesPerGram: Double)
    case Wet(caloriesPerGram: Double, hydrationPerGram: Double)
}

protocol Nutrition {
    var calorieCount: Double { get set }
    var hydrationLevel: Double { get set }
}

struct Cat: Nutrition {
    var calorieCount: Double = 0
    var hydrationLevel: Double = 0
}

Ok, so what do we have:

  • The Food type is an enum because we have clear restrictions on the type of food available and how it impacts Nutrition.
  • The Nutrition item is a protocol that defines the data that it needs. This is done because we are saying that a cat "has a" set of nutrition information.
  • The Cat is simply a data type that is made up of Nutrition data.

The last thing we are missing is the ability to eat. If you are thinking about objects, then I think the obvious thing to do is to add an eat function on Nutrition. There is another approach though: thinking instead about the transformations of the data.

Because I care more about transforming data, I'm going to create a top-level function eat:

func eat(var nutrition: Nutrition, food: Food, grams: Double) -> Nutrition {
    switch food {
    case let .Dry(caloriesPerGram):
        nutrition.calorieCount += caloriesPerGram * grams
    case let .Wet(caloriesPerGram, hydrationPerGram):
        nutrition.calorieCount += caloriesPerGram * grams
        nutrition.hydrationLevel += hydrationPerGram * grams
    }

    return nutrition
}

func eat<T: Nutrition>(nutrition: T, food: Food, grams: Double) -> T {
    return eat(nutrition, food, grams) as T
}

There are a few things to note:

  1. The var usage in the parameter list just lets us get a copy of that parameter and use it locally as a new variable.
  2. There is absolutely no coupling with any of the types that could possibly contain Nutrition information.
  3. There is a generic version so we do not have to cast when dealing with concrete implementations of Nutrition, such as Cat.

The usage of the code looks like this:

let kibble = Food.Dry(caloriesPerGram: 40)
let fancyFeast = Food.Wet(caloriesPerGram: 80, hydrationPerGram: 0.2)

var dave = Cat()
dave = eat(dave, kibble, 30)
dave = eat(dave, fancyFeast, 20)

And then we can do some other more functional oriented operations a bit more naturally:

let cats = [Cat(), Cat(), Cat(), Cat(), Cat(), Cat(), Cat()]
let caloriesEatenByCats = reduce(map(cats) { cat in
    eat(cat, kibble, 30) }, 0) { sum, cat in
        sum + cat.calorieCount }

But here is where the really nice part of this choice comes along: if I want to add a Dog type, I only need to create the struct for it and I'm done.

struct Dog: Nutrition {
    var calorieCount: Double = 0
    var hydrationLevel: Double = 0
}

let dogs = [Dog(), Dog(), Dog(), Dog()]
let caloriesEatenByDogs = reduce(map(dogs) { dog in
    eat(dog, kibble, 30) }, 0) { sum, dog in
        sum + dog.calorieCount }

In Ash's implementation, you then have to go add an eat function on each new type you are dealing with. We lose no flexibility either, because if a Dog is supposed to eat differently, we can just create an overload for it.

The other nice part is we can mix and match our Nutrition types like this:

let animals: [Nutrition] = [Dog(), Cat()]
let caloriesEatenByAnimals = reduce(map(animals) { animal in
    eat(animal, kibble, 30) }, 0) { sum, animal in
        sum + animal.calorieCount }

It's just another way of thinking about the problem. Swift seems to be a language that values immutability and composition over mutability and if we start to think about solving problems that way, I think we end up with more solutions like I presented where it's really just about how we manipulate the data in composable ways.
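For example, because eat is a pure transformation over Nutrition, a whole feeding schedule composes as a simple fold. This sketch builds on the Food, Cat, kibble, fancyFeast, and eat definitions above; the meals list is my own invention:

```swift
// A day's meals as (food, grams) pairs.
let meals: [(Food, Double)] = [(kibble, 30), (fancyFeast, 20)]

// Folding eat over the meals yields the fed cat:
// 40 cal/g * 30 g + 80 cal/g * 20 g = 2800 calories.
let fedCat = reduce(meals, Cat()) { cat, meal in eat(cat, meal.0, meal.1) }
```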

Source: https://gist.github.com/owensd/7bda3f7b3f91245e3542


The Optimization Game

I guess I'll play this game for a bit… my response to Joseph's response (Overcoming Swift Resistance Explored) to my blog article about debug build performance, especially regarding the performance of arrays.

First I want to acknowledge that Swift Debug builds are hideously slow.

We should be able to stop right there. We both agree that non-optimized builds are hideously slow. But…

I still don't think it is an absolute block on any real development because there are fairly simple workarounds.

Maybe, just maybe, we have different "real" development that we're doing. Could there possibly be a significant difference between the types of apps we are attempting to build?

The answer is obviously: yes.

I'm trying to prototype a game layer that must run at a minimum of 30Hz. Most apps are not real-time systems; every millisecond counts in a game. With a handful of milliseconds you could have better particle effects, better game AI, or more realistic physics. You only get 0.033s (33ms) in each frame (@30Hz) to do your work. The more frames you have to use to compute things, the worse overall experience your players are going to have.

The other problem with this statement (and the solution presented) is that it assumes that we can optimize the slow parts of the game by putting the code into a framework. Ok… besides the annoying mess it is to pull stuff in and out of there, and completely ignoring that this is a trivial example of one algorithm that would need to be moved, it completely misses the fact that it's that very slow part that I need to debug because something strange is going on. If my debug build is now running at 2 or 3Hz, it becomes extremely difficult (impossible, really) to actually play the game to get the error that is to be debugged.

So while it is most definitely not that huge of a deal for most app developers, it most certainly is to anyone processing a large amount of data or building a real-time system. It creates a tremendous developer workflow problem.

These are the developers I'm talking to. These are the ones the blog post is for. Do not use arrays (as of Xcode 6.1.1) because they will attempt to suck your soul from your very body during development if you process a lot of data from them in tight loops.

If you are not iterating over a lot of data frequently, don't worry too much about it. But if you're building a SpriteKit game… don't use an array to store all of your elements that you'll be iterating over because it's going to suck if you have a bunch of them.

Again, my focus is on developer workflow. That is extremely important to me. Every slow down I hit is an impediment to me and solving my code problems.

Swift is Fast!

We get it, you think Swift is fast enough. Great!

So, I downloaded your code and ran both the ObjC and Swift versions (CMD+R within Xcode).

-Onone: [Pixel] avg time: 0.0175069s, stddev: 0.000430218s, diff: 82%
-O:     [Pixel] avg time: 0.0177036s, stddev: 0.000400837s, diff: 11%

Ironically, I kept getting the -Onone build to be slightly faster, but there is some variance here, so all good. 82% faster than the ObjC debug version and 11% faster than the ObjC release build. Great! The workarounds suck, but ok.

There's one major oversight in these numbers though: the ObjC version is nowhere near the optimal code. We can help the compiler out in a very trivial way by changing the ObjC1 render function to this:

void RenderGradient(RenderBufferRef buffer, int offsetX, int offsetY)
{
    int width = buffer->width + offsetX;
    int height = buffer->height + offsetY;
    uint32_t *pixel = (uint32_t *)&buffer->pixels[0];

    for (int y = offsetY; y < height; ++y) {
        for (int x = offsetX; x < width; ++x) {
            *pixel = 0xFF << 24 | (x & 0xFF) << 16 | (y & 0xFF) << 8;
            ++pixel;
        }
    }
}

That's just taking 5 minutes to break out the common operations. Maybe more can be done, but I wouldn't want to go further than that without understanding if there was really a need because the code above is still clear and readable.

The timings after:

-Onone: [Pixel] avg time: 0.0178055s, stddev: 0.000500147s, diff: 53%
-O:     [Pixel] avg time: 0.0184924s, stddev: 0.00152481s, diff: -94%

The debug speed is good. You beat me there. Of course, yours is the optimized version. Fortunately, I can do the same hack you did, but much more easily, since per-source-file compiler flags are supported for ObjC.

So, pulling RenderGradient out and telling the compiler to always compile with the -Os flag gives us this result instead:

-Onone: [Pixel] avg time: 0.018294s, stddev: 0.00158887s, diff: -90%
-O:     [Pixel] avg time: 0.0180182s, stddev: 0.00145248s, diff: -89%

So no, Swift is not faster than ObjC if you take the time to optimize the code path as you did with the Swift version. In fact, Swift is now twice as slow as the more optimized version of the ObjC code.

Pull request to see what I did is here: https://github.com/josephlord/OvercomingSwiftResistance/pull/1.

  1. I keep calling it ObjC code because it's in a .m file, but this is just C.

Swift Resistance Explored

Evidently I struck a nerve with some people on this one, while others simply missed the entire point of the blog post. Let's revisit Swift Resistance and dig deeper into the problem.

I'm going to put my claim right up here so that it will not be missed:

Claim: The performance of DEBUG (that is, -Onone optimization) builds of Swift can have vastly different performance characteristics depending on the type of Swift intrinsics and foundational types that are being used. Some of these choices can lead down a path where your DEBUG builds are all but useless to use during development.

I'll talk about the implications of this towards the end.

Goals

Problem Statement: Design an algorithm that fills in a buffer of pixel data with a gradient starting with green in the bottom right corner and turning into blue in the top left corner. RGB colors should be used with values in the range [0, 255].

This algorithm must be written in Swift. In addition to that, it is meant to be used in a game loop with a desired framerate of 30 FPS for the software rendered algorithm at a resolution of 960×540 (this is 1/8th the desired rate of the hardware accelerated algorithm at 1920×1080@60Hz).

Additional Information: An algorithm already exists in an ObjC program; we will take the algorithm from there and use it as a baseline for both performance and functionality. When rendered to screen, your image should look something like the image below:

A picture of a window with green-to-blue gradient squares.

In addition, the data must be processed sequentially; parallelization of this algorithm is not allowed.

Strategy

Given that an algorithm already exists, the first attempt should be a straight port ignoring the language features. After that algorithm is working, it would be good to explore algorithms that may be more natural to the language.

So the two approaches that will be used are:

  1. Use an UnsafeMutablePointer to create a buffer of memory that will be manipulated.
  2. Use an Array to act as the buffer so that we are not using "unsafe" Swift code.

ObjC Baseline

Here is the algorithm for the ObjC version of the code:

#import <Foundation/Foundation.h>
#import <mach/mach_time.h>

typedef struct {
    uint8_t red;
    uint8_t blue;
    uint8_t green;
    uint8_t alpha;
} Pixel;

typedef struct {
    Pixel *pixels;
    int width;
    int height;
} RenderBuffer, *RenderBufferRef;

RenderBufferRef RenderBufferCreate(int width, int height)
{
    assert(width > 0);
    assert(height > 0);

    RenderBufferRef buffer = malloc(sizeof(RenderBuffer));
    assert(buffer);

    buffer->pixels = malloc(width * height * sizeof(Pixel));
    assert(buffer->pixels);

    buffer->width = width;
    buffer->height = height;

    return buffer;
}

void RenderBufferRelease(RenderBufferRef buffer)
{
    if (buffer->pixels) {
        free(buffer->pixels);
        buffer->pixels = NULL;
    }

    buffer->width = 0;
    buffer->height = 0;

    free(buffer);  /* the RenderBuffer struct itself was malloc'd in RenderBufferCreate */
}

void RenderGradient(RenderBufferRef buffer, int offsetX, int offsetY)
{
    int offset = 0;
    for (int y = 0, height = buffer->height; y < height; ++y) {
        for (int x = 0, width = buffer->width; x < width; ++x) {
            Pixel pixel = { 0, y + offsetY, x + offsetX, 0xFF };
            buffer->pixels[offset] = pixel;
            ++offset;
        }
    }
}

int main(int argc, const char * argv[]) {
    uint64_t start = mach_absolute_time();

    RenderBufferRef buffer = RenderBufferCreate(960, 540);

    const int NUMBER_OF_ITERATIONS = 30;
    for (int i = 0; i < NUMBER_OF_ITERATIONS; ++i) {
        RenderGradient(buffer, i, i * 2);
    }

    RenderBufferRelease(buffer);

    uint64_t elapsed = mach_absolute_time() - start;
    printf("elapsed time: %fs\n", (float)elapsed / NSEC_PER_SEC);

    return 0;
}

This code was compiled as a command-line tool under two different optimization flags: -O0 and -Os. These are the default "debug" and "release" configs.

  • The timing output for -O0 (debug) was: 0.099769s
  • The timing output for -Os (release) was: 0.020427s

Both of these timings fall well within the target goal of 30Hz1.

Swift Implementations

We already know that we are going to need to have multiple algorithms for the Swift version, so it's important to set up our test harness in a reusable way. One of the things to note about Swift is that while types and functions can be private to a file, their names still collide in usage. This means that each test implementation we write will need to be wrapped in a function.

The first thing we'll do is create a command-line tool in Swift, create the two schemes (for debug and release) to make timing easier.

Ok, let's start with what our test rig will look like:

import Foundation

let NUMBER_OF_ITERATIONS = 30

#if DEBUG
let BASELINE: Float = 0.099769
#else
let BASELINE: Float = 0.020427
#endif

func timing(samples: Int, iterations: Int, fn: (Int) -> Float) -> (avg: Float, stddev: Float, diff: Int) {
    var timings = [Float](count: samples, repeatedValue: 0.0)
    for s in 0..<samples {
        timings[s] = fn(iterations)
    }

    let avg = reduce(timings, 0.0, +) / Float(samples)

    let sums = reduce(timings, 0.0) { sum, x in ((x - avg) * (x - avg)) + sum }
    let stddev = sqrt(sums / Float(timings.count - 1))
    let diff = Int(((BASELINE - avg) / BASELINE * 100.0) + 0.5)
    return (avg, stddev, diff)
}

println("Swift Rendering Tests: \(NUMBER_OF_ITERATIONS) iterations per test")
println("---------------------")

There is a simple timing function that allows us to capture the average time across some number of samples and some iterations of the rendering function.

NOTE: You'll need to add a custom build flag (-D DEBUG) to your Swift compiler options so the #if DEBUG will match.

The Unsafe Swift Approach

The naïve approach is to simply copy and paste the ObjC code into main.swift, then make the necessary updates to get it to compile.

The result is this (remember, we need to wrap everything in a function so that the names do not collide as we add more tests that may want to use the same name for a struct but lay it out a little differently):

import Foundation

func unsafeMutablePointerTest(iterations: Int) -> Float {
    struct Pixel {
        var red: Byte
        var green: Byte
        var blue: Byte
        var alpha: Byte
    }

    struct RenderBuffer {
        var pixels: UnsafeMutablePointer<Pixel>
        var width: Int
        var height: Int

        init(width: Int, height: Int) {
            assert(width > 0)
            assert(height > 0)

            // Note: alloc() takes a number of Pixel elements, not bytes, so
            // this actually over-allocates by a factor of sizeof(Pixel). The
            // matching dealloc() below keeps the accounting consistent.
            pixels = UnsafeMutablePointer.alloc(width * height * sizeof(Pixel))

            self.width = width
            self.height = height
        }

        mutating func release() {
            pixels.dealloc(width * height * sizeof(Pixel))
            width = 0
            height = 0
        }
    }

    func RenderGradient(var buffer: RenderBuffer, offsetX: Int, offsetY: Int)
    {
        var offset = 0
        for (var y = 0, height = buffer.height; y < height; ++y) {
            for (var x = 0, width = buffer.width; x < width; ++x) {
                let pixel = Pixel(
                    red: 0,
                    green: Byte((y + offsetY) & 0xFF),
                    blue: Byte((x + offsetX) & 0xFF),
                    alpha: 0xFF)
                buffer.pixels[offset] = pixel;
                ++offset;
            }
        }
    }

    let start = mach_absolute_time()

    var buffer = RenderBuffer(width: 960, height: 540)

    for (var i = 0; i < iterations; ++i) {
        RenderGradient(buffer, i, i * 2);
    }

    buffer.release()

    return Float(mach_absolute_time() - start) / Float(NSEC_PER_SEC)
}

Here are the timings:

  • DEBUG: avg time: 0.186799s, stddev: 0.0146862s, diff: -86%
  • RELEASE: avg time: 0.0223397s, stddev: 0.00101094s, diff: -8%

The timing code is here (add this to main.swift):

let timing1 = timing(10, NUMBER_OF_ITERATIONS) { n in unsafeMutablePointerTest(n) }
println("UnsafeMutablePointer<Pixel> avg time: \(timing1.avg)s, stddev: \(timing1.stddev)s, diff: \(timing1.diff)%")

This is not looking too bad; both configurations are well within our target rate. However, we can see that both timings are still slower than the ObjC version.

Takeaway: While this implementation is slower than the ObjC version, there is nothing blocking us at this time from being able to maintain a solid 30Hz in both debug and release builds. This is great news.
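For comparison, here's roughly what the unsafe-pointer version maps to in C: one allocation up front, raw index math in the inner loop, and an explicit free when you're done (a sketch mirroring the Swift code above, not the actual benchmark source):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { uint8_t red, green, blue, alpha; } Pixel;
typedef struct { Pixel *pixels; int width, height; } RenderBuffer;

/* The same row-major gradient fill as RenderGradient above,
 * writing through a raw pointer with no bounds checks. */
static void render_gradient(RenderBuffer *buffer, int offsetX, int offsetY) {
    int offset = 0;
    for (int y = 0; y < buffer->height; ++y) {
        for (int x = 0; x < buffer->width; ++x) {
            Pixel p = { 0, (uint8_t)((y + offsetY) & 0xFF),
                           (uint8_t)((x + offsetX) & 0xFF), 0xFF };
            buffer->pixels[offset++] = p;
        }
    }
}
```

The caller owns the buffer, exactly like the Swift version's release() contract.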

The "Safe" Swift Approach

UPDATE: I made a pretty obvious (well, easy to overlook, but still should have been obvious) and significant error in this section… of course that would happen in a post where I try to better show the issues. I used var instead of inout on the buffer… which, of course, creates a copy of the buffer array each time… yeah, it was that bad. Ironically, it didn't affect the debug performance, but it did help the release build.

The func RenderGradient(var buffer: RenderBuffer, offsetX: Int, offsetY: Int) should have been defined as: func RenderGradient(inout buffer: RenderBuffer, offsetX: Int, offsetY: Int).

The original implementation created a copy of the buffer each time, which left us with an empty buffer outside of the function call. Not what we wanted. Again, this mistake only had two repercussions: incorrect functionality and a 2x regression on the release build. The debug build is still as painfully slow.

Let me know if you spot any other mistakes.

Now, there is another way that Swift allows us to access a region of contiguous memory: arrays. After all, that is really what the semantics are. So let's give it a try:

import Foundation

func pixelArrayTest(iterations: Int) -> Float {
    struct Pixel {
        var red: Byte
        var green: Byte
        var blue: Byte
        var alpha: Byte
    }

    struct RenderBuffer {
        var pixels: [Pixel]
        var width: Int
        var height: Int

        init(width: Int, height: Int) {
            assert(width > 0)
            assert(height > 0)

            let pixel = Pixel(red: 0, green: 0, blue: 0, alpha: 0xFF)
            pixels = [Pixel](count: width * height, repeatedValue: pixel)

            self.width = width
            self.height = height
        }
    }

    func RenderGradient(inout buffer: RenderBuffer, offsetX: Int, offsetY: Int)
    {
        var offset = 0
        for (var y = 0, height = buffer.height; y < height; ++y) {
            for (var x = 0, width = buffer.width; x < width; ++x) {
                let pixel = Pixel(
                    red: 0,
                    green: Byte((y + offsetY) & 0xFF),
                    blue: Byte((x + offsetX) & 0xFF),
                    alpha: 0xFF)
                buffer.pixels[offset] = pixel;
                ++offset;
            }
        }
    }

    let start = mach_absolute_time()

    var buffer = RenderBuffer(width: 960, height: 540)

    for (var i = 0; i < iterations; ++i) {
        RenderGradient(&buffer, i, i * 2);
    }

    return Float(mach_absolute_time() - start) / Float(NSEC_PER_SEC)
}

The nice thing about this change is that it was super easy to do; it was really only a handful of changes. This method also has the benefit that I should never leak the buffer by forgetting to call dealloc on the UnsafeMutablePointer value.

Let's check out the timings:

  • DEBUG: avg time: 27.9754s, stddev: 0.0333994s, diff: -27939%
  • RELEASE: avg time: 0.0287606s, stddev: 0.00180078s, diff: -40%

What on earth just happened… 27.9 seconds to compute that loop above 30 times… this loop, right here:

int offset = 0;
for (int y = 0, height = buffer->height; y < height; ++y) {
    for (int x = 0, width = buffer->width; x < width; ++x) {
        Pixel pixel = { 0, y + offsetY, x + offsetX, 0xFF };
        buffer->pixels[offset] = pixel;
        ++offset;
    }
}

That's 960 * 540 (518,400) iterations. This is unacceptable. This is what my previous blog post was entirely about. There is not a SINGLE argument that you can make where you can justify this performance characteristic. Not one.

Now, had this performance been like 300% slower, I might have been able to take that and the safety justifications… maybe. At least if it was only 300% slower I'd still be in a spot where I could run my game at 30Hz with reasonable head room left over for other logic to run.

But no… we are talking about this loop taking 1 entire SECOND to compute. It was nearly 28,000% slower…

Here's a screenshot of the profile with the NUMBER_OF_ITERATIONS dropped down to 2 (there's no way I was going to sit through another full 10 samples of 30 iterations).

A screenshot of the Instruments 'profile of death' for this monstrosity.

Ok… now, I'm left with a few choices:

  1. Say screw it and leave Swift on the table, convulsing from the seizure this basic loop setting a value in an array just caused it.
  2. Go back to the UnsafeMutablePointer method that was back in the land of all things sane, but then I get to risk all of my consumers forgetting to call release(). But you know… we're all adults here (in spirit at least), we should be able to handle our own memory. And really, if I'm going to need to resort to naked pointers, I might as well stick with C, yeah?
  3. Create another wrapper around the array so that I can use a backing array to keep track of the memory for me, but expose an unsafe pointer to that array.
  4. Say screw it with the non-optimized builds and live in the land of crap(pier) debugging which completely breaks the logic flow of your algorithms. Yes, there is a time for this land, but that time is not in the beginning of your project when you are still prototyping, scaffolding, and shaping your program into what it will one day be.

Of these options, #3 is the worst choice, in my opinion. It's a choice where you explicitly said you want a "safe" array but, in this part of the code, you're just going to go hog wild. Still, the API exists for a reason; I'm guessing that one of those reasons is because array performance sucks.

Here's the code update for that option:

buffer.pixels.withUnsafeMutableBufferPointer { (inout p: UnsafeMutableBufferPointer<Pixel>) -> () in
    var offset = 0
    for (var y = 0, height = buffer.height; y < height; ++y) {
        for (var x = 0, width = buffer.width; x < width; ++x) {
            let pixel = Pixel(
                red: 0,
                green: Byte((y + offsetY) & 0xFF),
                blue: Byte((x + offsetX) & 0xFF),
                alpha: 0xFF)
            p[offset] = pixel
            ++offset;
        }
    }
}

Let's check out the timings:

  • DEBUG: avg time: 1.18535s, stddev: 0.00684964s, diff: -1087%
  • RELEASE: avg time: 0.0402743s, stddev: 0.0018447s, diff: -96%

Ok… at least I cannot literally go to the bathroom, get a drink from the kitchen, come back to my computer, and still wait another few minutes while all the iterations finish. However, it's still too slow in DEBUG mode and it's still twice as slow as the ObjC version in RELEASE mode.

Conclusion

OK, so let's be explicitly clear here: this post is not about how Swift is horrendously slow in builds you'll be giving your customers. No, it's about how terribly slow and painful your life as a developer will be while trying to write any amount of Swift code that works on any reasonable amount of data inside of arrays. That said, you can see that none of the Swift options are faster than, or even as fast as, the C version. And frankly, none of them are really that much clearer… but that's a different topic.

And don't tell me to open another Swift bug; I have. I have opened many Swift bugs since last WWDC. This post is a way for me to better reflect the current state of Swift to others. It's a way to let the people at Apple see the real impact developers like myself are feeling when trying to do even the most basic of things in Swift, so that when people do run into these issues, they won't have to bang their heads against the wall trying to figure out how to solve the problem.

Here is the source for both the Swift and the ObjC versions: SwiftResistance.zip. I release it all in the public domain; do whatever you want with it.

  1. Hz, or hertz, is simply a measurement of cycles per second.
Swift Resistance Explored

Swift Resistance

I feel like every time I sit down to actually start using Swift, I'm constantly battling with the language. My latest challenge: creating a backing buffer of pixel data that will be rendered to the screen.

This really shouldn't be a difficult thing to do, so hopefully someone can point out the obvious thing I'm doing wrong in Swift so that my performance is so terrible.

Here's the basic code that I wanted to profile (of course, there is a version for each of the languages):

void RenderWeirdGradient(BackingBuffer *buffer)
{
    for (int y = 0; y < buffer->height; ++y) {
        int row = y * buffer->width;
        for (int x = 0; x < buffer->width; ++x) {
            Pixel p = { 0, y + GreenOffset, x + BlueOffset, 255 };
            buffer->data[row + x] = p;
        }
    }
}

- (void)drawRect:(NSRect)dirtyRect
{
    CFTimeInterval start = CACurrentMediaTime();

    if (buffer.data) {
        RenderWeirdGradient(&buffer);

        CGContextRef contextRef = [[NSGraphicsContext currentContext] graphicsPort];
        assert(contextRef != NULL);

        NSData *data = [NSData dataWithBytesNoCopy:buffer.data
                                            length:buffer.width * buffer.height * sizeof(Pixel)
                                      freeWhenDone:NO];
        CGDataProviderRef providerRef = CGDataProviderCreateWithCFData((CFDataRef)data);

        CGImageRef imageRef = CGImageCreate(buffer.width, buffer.height,
                                            buffer.bitsPerComponent,
                                            buffer.bitsPerPixel,
                                            sizeof(Pixel) * buffer.width,
                                            buffer.colorSpace,
                                            buffer.bitmapInfo,
                                            providerRef,
                                            NULL,
                                            true,
                                            kCGRenderingIntentDefault);

        CGContextDrawImage(contextRef, NSMakeRect(0, 0, buffer.width, buffer.height), imageRef);

        CGImageRelease(imageRef);
        CGDataProviderRelease(providerRef);
    }

    CFTimeInterval elapsed = CACurrentMediaTime() - start;
    NSLog(@"elapsed: %f", elapsed);
}

Hopefully the code is straightforward; basically, it just renders a window that looks like the picture below, with the pattern moving at each update call.

![window with blue-green gradient squares][window_pic]

Let's just start with the timings…

| Language               |   Timing |
|:-----------------------|---------:|
| ObjC (NO ARC, -O0)     | 0.01137s |
| ObjC (NO ARC, -Os)     | 0.01051s |
| ObjC (ARC, -O0)        | 0.01159s |
| ObjC (ARC, -Os)        | 0.00983s |
| Swift * (ARC, -Onone)  | 0.03005s |
| Swift * (ARC, -O)      | 0.01707s |
| Swift [] (ARC, -Onone) | 1.19796s |
| Swift [] (ARC, -O)     | 0.02701s |

The "Swift *" is a version that uses UnsafeMutablePointer<UInt8> to handle the buffer. The "Swift []" is one that uses an array of pixel data, where pixel is simply a struct:

struct Pixel {
    var red: Byte       = 0x00
    var green: Byte     = 0x00
    var blue: Byte      = 0x00
    var alpha: Byte     = 0xFF
}
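The buffer indexing above (row * width + x) implicitly assumes that a pixel is exactly four bytes with no padding. In C terms, the layout is:

```c
#include <assert.h>
#include <stdint.h>

/* Four consecutive one-byte components; the alignment of uint8_t is 1,
 * so sizeof(Pixel) == 4 and pixels pack tightly in the buffer. */
typedef struct {
    uint8_t red;
    uint8_t green;
    uint8_t blue;
    uint8_t alpha;
} Pixel;
```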

OK… that is not a freaking typo up there. It literally takes 1.2 SECONDS to render each frame. Both ObjC versions did it in 0.01s and even the pointer version of Swift only took 0.03 seconds.

THAT IS 120 TIMES SLOWER!!!!

I can hear you now: oh, that's a debug build, you should always do your performance profiling with optimizations on.

My answer is this: I'm not trying to profile the app, I'm simply trying to build a small game. I cannot do it because the performance is so flipping terrible in debug mode. Trying to debug your app with optimized code is just a pain. Sure, at the end of the project, that's probably the right thing to be doing, but until you get there, builds without optimizations are just so much easier to work with.

So what are we to do?

Well… if you want to use Swift, you have to bang your head against your desk and try different things until something gets better. In this case, I can make the performance significantly better (though, let's be honest here, it's still 37 TIMES slower than the ObjC version):

func renderWeirdGradient(blueOffset: Int, _ greenOffset: Int) {
    for var y = 0; y < buffer.height; y++ {
        let row = y * buffer.width
        for var x = 0; x < buffer.width; x++ {
            // Simply using the "FAST" code block changes the timing from 1.2s
            // per call to 0.37s per call.

            // ---- SLOW CODE BLOCK
            buffer.pixels[row + x].green = Byte((y + greenOffset) & 0xFF)
            buffer.pixels[row + x].blue = Byte((x + blueOffset) & 0xFF)
            // ---- END SLOW CODE BLOCK

            // ---- FASTER CODE BLOCK
            let pixel = Pixel(
                red: 0,
                green: Byte((y + greenOffset) & 0xFF),
                blue: Byte((x + blueOffset) & 0xFF),
                alpha: 255)

            buffer.pixels[row + x] = pixel
            // ---- END FASTER CODE BLOCK
        }
    }

    self.needsDisplay = true
}

I ran into this same type of stuff when I was working on the JSON parser. The fundamental problem with Swift, with regards to performance, is that it is impossible to reason about what the performance of your code is going to be. It is impossible to know if your code is just horrendously slow in debug, or if the optimizer is going to get rid of all of the extra crap that is going on.

This sucks.

What I'm Left With

Seriously, what are we supposed to be doing here? Is it actually possible to build highly responsive apps that process a lot of complex information in Swift?

If most of your code is Swift code that is simply bridging into ObjC or C, you will probably read this post and not have a clue what I'm talking about. You are benefiting from the speed of ObjC.

The sad state of affairs is that, today, Swift is slow. I don't see that changing by next WWDC.

Swift Source: https://github.com/owensd/handmade-swift/tree/day005_badperf

ObjC Projects: BackingBufferObjC.zip

Swift Resistance

Response to Monads (and Functional Programming)

NOTE: I've posted an update at the end based on some of the Twitter conversations.

Rui Peres asked me for my opinion about a piece he wrote about Monads, partially in the context of Swift. Sure!

I'm by no means an expert in functional programming, so be sure to question all things I say. =) Chris Eidhof knows this stuff much, MUCH better than I do and could probably offer a much different perspective.

So, my thoughts… well. It's more complicated than that. =) It's more complicated because it actually feeds into something I've been feeling for many years and only recently really been trying to understand and figure out if there is something tangible I can do to make it better, if for no one else, at least for myself.

So, general critique of the article: I think it was a fair overview of monads and somewhat of an introduction to functional programming in general. I would have maybe added some mention about monads and their use for capturing side-effects in programs, but that's about it. Oh, and make sure the code samples work and state the version of Xcode the Swift samples work on as that's a moving target.

As for functional programming in general: I think there is a tendency to treat it as a magic bullet for solving many of the problems in computer science. Now, this is not to knock on functional programming in the slightest; I believe that it's a paradigm that can teach us many good things. But like everything, it must be used in proper moderation and with care in order to be truly effective.

When I think of functional programming, I see it as trying to apply some core principles to your code base. I would call these out as:

  1. Functional evaluation – everything is a function and should be composable.
  2. Clean state – side effects should be scoped to their context.

And that's really it; everything else stems from those two things. Composable functions like map are just a satisfaction of goal #1. Monads are a way to make goal #2 happen. At the end of the day, these are all abstractions.

Here's the thing to remember though: there's no such thing as zero-cost abstractions (slides, video). I'll touch on this a bit more later.

Back to the article… let's look at one of Rui's examples (modified to make Xcode 6.1.1 happy):

public final class Box<T>
{
    let value : T

    public init(_ value : T) {
        self.value = value
    }
}

func flatten<T>(box : Box<Box <T>>) -> Box <T> {
    return box.value
}

func map<T,U>(box : Box<T>, f : T -> U) -> Box<U> {
    return Box(f(box.value))
}

infix operator  >>=  {associativity left}
func >>= <T,U>(box : Box<T>, f : T -> Box<U>) -> Box<U> {
    return flatten(map(box,f))
}

let box = Box(4)
let sum3 = { Box($0 + 3) }
let toString = { Box(String($0 + 0)) }
let iReallyLike = { Box("I really like the number " + $0) }

Each line is perfectly understandable. Each is also fairly near to its most generic abstraction for each case (at least in terms to make Swift happy). But it's the next line that gets me:

let luckyNumber7 = box >>= sum3 >>= toString >>= iReallyLike

These are the questions that I ask myself for most of my code:

  1. What does it do?
  2. Is it maintainable?
  3. Is the performance OK?
  4. How about memory usage?
  5. Is it safe?
  6. Reentrant?
  7. Does it need to be?

Now, #1 can be tricky. It's a pretty common convention to define the >>= as composition type operator (or more formally, the bind operator), so it's fairly safe to say what it does without explicitly checking. However, you still need to do so if you really want to be sure.

The basic point of the operator is to take something like this (from an imperative view)1:

iReallyLike(toString(sum3(box.value).value).value).value

And turn it into the most composable solution presented above. Granted, if we were not boxing the types we'd simply have:

iReallyLike(toString(sum3(box)))

Of course, if we didn't have the boxed types, well, then we wouldn't have the monads, and that is, after all, the point of the post.
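To see the plumbing without generics, the same chain can be monomorphized by hand in C (all names here are hypothetical; each bind_* helper is flatten(map(box, f)) collapsed into one step):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Box<Int> and Box<String> specialized by hand. */
typedef struct { int value; } BoxInt;
typedef struct { char value[64]; } BoxStr;

/* bind unwraps the value, applies f, and returns the already-wrapped
 * result — one specialization per pair of concrete types. */
static BoxInt bind_ii(BoxInt b, BoxInt (*f)(int)) { return f(b.value); }
static BoxStr bind_is(BoxInt b, BoxStr (*f)(int)) { return f(b.value); }
static BoxStr bind_ss(BoxStr b, BoxStr (*f)(const char *)) { return f(b.value); }

static BoxInt sum3(int x) { return (BoxInt){ x + 3 }; }

static BoxStr to_string(int x) {
    BoxStr r;
    snprintf(r.value, sizeof r.value, "%d", x);
    return r;
}

static BoxStr i_really_like(const char *s) {
    BoxStr r;
    snprintf(r.value, sizeof r.value, "I really like the number %s", s);
    return r;
}
```

Chaining bind_ss(bind_is(bind_ii(box, sum3), to_string), i_really_like) is the C spelling of box >>= sum3 >>= toString >>= iReallyLike.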

What you need to realize is that a monad is an abstraction. Again, that doesn't make it a bad abstraction.

In the example above, we have multiple considerations to make:

  1. How much memory does each Monad copy take?
  2. Can the memory be safely shared to reduce the amount of memory necessary?
  3. How much function overhead are we actually imposing with every call?
  4. Can we write a compiler that is smart enough to know when it is safe to deconstruct the unwrapping of the monad?
  5. If we hit a performance issue, do I have a way that I can make this better without getting rid of the abstractions? Without making the code difficult to maintain?

These are just some of the basic questions we need to ask ourselves.

You really should be asking yourself these questions. I feel like we have regressed significantly as an industry because we've gotten fat and are used to fast machines with lots of memory. Even our smart phones have gigs of RAM. Too often we blindly make the trades without even understanding we are doing it.

Here's an example that I saw recently from an email conversation I was having with another individual about a blog post I had written:

func functionalCount(sentence: String) -> Int {
    let words = sentence.componentsSeparatedByString(" ")
    return words
        .map { countElements($0) }
        .filter { $0 > 3 }
        .reduce(0, combine: +)
}

The purpose was to count the number of characters in all of the words that had more than three characters in them. Now, maybe this is a naïve approach to writing this solution, but it is a straight-forward implementation of using the individual components to glue together a solution.

Here's the same, basic algorithm in the imperative form:

func imperativeCount(sentence: String) -> Int {
    var count = 0
    let words = sentence.componentsSeparatedByString(" ")

    for word in words {
        let characters = countElements(word)
        if characters > 3 {
            count += characters
        }
    }

    return count
}

The rough timing and memory estimates I did showed basically a 2x improvement for the imperative version and a 70KB memory reduction. The sample text was 10 instances of the Gettysburg Address. I'm sure we could spend a good deal of time fine tuning the performance of each of them. But, if we are in somewhat of an agreement that this is idiomatic code for the two styles, it's a fair question to ask: what's the point of the functional version and is it inherently better?
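As an aside, the imperative algorithm translates almost directly to C. This sketch fuses the split and the count into one pass over bytes, so it only agrees with countElements for ASCII input:

```c
#include <assert.h>

/* Sum the lengths of space-separated words longer than three
 * characters. Counts bytes, not grapheme clusters, so this is an
 * ASCII-only approximation of the Swift versions. */
static int imperative_count(const char *sentence) {
    int count = 0, size = 0;
    for (const char *p = sentence; ; ++p) {
        if (*p == ' ' || *p == '\0') {
            if (size > 3) count += size;  /* flush the current word */
            size = 0;
            if (*p == '\0') break;
        } else {
            ++size;
        }
    }
    return count;
}
```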

Here's my high-level, meta point: the functional version doesn't offer any intrinsic value at understanding the problem as a whole. In fact, both are really just a means to the same end: counting those characters.

Update: January 19th, 2015

I received a bit of feedback on Twitter, and since Twitter is a terrible place to actually have a conversation of any substance, I'll put my replies here.

First off, I'm not surprised to see the recommendations for improvements to the functional implementation. While that misses the point somewhat, especially since that code was not mine to begin with, I'll make the updates so we can see some of the potential benefits.

func lazyFunctionalCount(sentence: String) -> Int {
    let words = sentence.componentsSeparatedByString(" ")
    return reduce(
        lazy(words)
            .map { countElements($0) }
            .filter { $0 > 3 },
        0, +)
}

This is faster than the initial version, and essentially the same memory footprint as the imperative version. I'd argue that readability has suffered though.

Next, my "imperative" version wasn't really all that imperative. I agree. But that is one of the points of the article: we should let the good ideas of the functional world flow back into the imperative world. I think that a synergy can be had when we merge the two instead of attempt to stay wholly in one of the realms.

Rob Napier mentions that "Binding (as in for-in) rather than storing vars is the heart of FP. Bind vs. store is the whole argument." I agree with that. I think another way of looking at it is in the context of "what languages" vs. "how languages". I think there is great value in expressing your logic as a description of what you want to do instead of how you want to do it.

Take for example this imperative approach to the problem:

func reallyImperativeCount(sentence: String) -> Int {
    var count = 0, size = 0
    let utf16 = sentence.utf16

    for var idx = utf16.startIndex; idx != utf16.endIndex; idx++ {
        let c = utf16[idx]

        if c == 32 {
            if size > 3 {
                count += size
            }
            size = 0
        }
        else {
            size++
        }
    }

    if size > 3 {
        count += size
    }

    return count
}

This is ugly… but boy is it fast. About 26x faster than the initial imperative solution. The above solution tells you the "how to do it" and has lost much of the "what to do". It's also error-prone; I had to spend about 10 minutes tracking down a couple of bugs in this implementation that were gracefully handled in the other two solutions above.

But I think this illustrates one of my main points: abstractions have a cost. It's not always important to write the truly imperative solution above, but when it is necessary, it should be possible, and preferably, fit into the same domain space as the original problem and not feel foreign in the language.

  1. Yeah, yeah. The bind operator is for transferring one monadic value to another monadic value via a function that takes the non-monadic value as a parameter.

Response to Monads (and Functional Programming)

Swift Makefiles – Take 2

A couple of days ago I started a basic Makefile for compiling Swift code. It was fairly minimal and didn't actually result in a usable executable or library at the end.

Today we'll remedy that.

Makefile

Below is a Makefile that is starting to pull in more of the options and settings that Xcode uses. My intention is to create a Makefile that can capture most of those settings and allow me to simply clone it for my other projects.

# A more complicated build file that is starting to look a lot like how
# Xcode internally structures its builds. The goal here is to show how
# one might actually go about building a full Xcode-capable build system
# without all of the requirements of Xcode, xcodebuild, and xcproj files.
#
# There will likely be limitations to this process, but let's see how far
# the wind takes us.

## USER CONFIGURABLE SETTINGS ##
CONFIG       = debug
PLATFORM     = macosx
ARCH         = x86_64
MODULE_NAME  = tool
MACH_O_TYPE  = mh_execute

## GLOBAL SETTINGS ##
ROOT_DIR            = $(shell dirname $(realpath $(lastword $(MAKEFILE_LIST))))
BUILD_DIR           = $(ROOT_DIR)/bin
SRC_DIR             = $(ROOT_DIR)/src
LIB_DIR             = $(ROOT_DIR)/lib

TOOLCHAIN           = Toolchains/XcodeDefault.xctoolchain/usr/lib/swift/$(PLATFORM)
TOOLCHAIN_PATH      = $(shell xcode-select --print-path)/$(TOOLCHAIN)

SWIFT               = $(shell xcrun -f swift) -frontend -c -color-diagnostics

## COMPILER SETTINGS ##
CFLAGS       = -g -Onone
SDK_PATH     = $(shell xcrun --show-sdk-path -sdk $(PLATFORM))

## LINKER SETTINGS ##
LD           = $(shell xcrun -f ld)
LDFLAGS      = -syslibroot $(SDK_PATH) -lSystem -arch $(ARCH) \
                -macosx_version_min 10.10.0 \
                -no_objc_category_merging -L $(TOOLCHAIN_PATH) \
                -rpath $(TOOLCHAIN_PATH)
OBJ_EXT      = 
OBJ_PRE      =

ifeq (mh_dylib, $(MACH_O_TYPE))
    OBJ_EXT  = .dylib
    OBJ_PRE  = lib
    LDFLAGS += -dylib
endif

## BUILD LOCATIONS ##
PLATFORM_BUILD_DIR    = $(BUILD_DIR)/$(MODULE_NAME)/bin/$(CONFIG)/$(PLATFORM)
PLATFORM_OBJ_DIR      = $(BUILD_DIR)/$(MODULE_NAME)/obj/$(CONFIG)/$(PLATFORM)
PLATFORM_TEMP_DIR     = $(BUILD_DIR)/$(MODULE_NAME)/tmp/$(CONFIG)/$(PLATFORM)

SOURCE = $(notdir $(wildcard $(SRC_DIR)/*.swift))

## BUILD TARGETS ##
tool: setup $(SOURCE) link

## COMPILE RULES FOR FILES ##

%.swift:
    $(SWIFT) $(CFLAGS) -primary-file $(SRC_DIR)/$@ \
        $(addprefix $(SRC_DIR)/,$(filter-out $@,$(SOURCE))) -sdk $(SDK_PATH) \
        -module-name $(MODULE_NAME) -o $(PLATFORM_OBJ_DIR)/$*.o -emit-module \
        -emit-module-path $(PLATFORM_OBJ_DIR)/$*~partial.swiftmodule

main.swift:
    $(SWIFT) $(CFLAGS) -primary-file $(SRC_DIR)/main.swift \
        $(addprefix $(SRC_DIR)/,$(filter-out $@,$(SOURCE))) -sdk $(SDK_PATH) \
        -module-name $(MODULE_NAME) -o $(PLATFORM_OBJ_DIR)/main.o -emit-module \
        -emit-module-path $(PLATFORM_OBJ_DIR)/main~partial.swiftmodule

link:
    $(LD) $(LDFLAGS) $(wildcard $(PLATFORM_OBJ_DIR)/*.o) \
        -o $(PLATFORM_BUILD_DIR)/$(OBJ_PRE)$(MODULE_NAME)$(OBJ_EXT)

setup:
    $(shell mkdir -p $(PLATFORM_BUILD_DIR))
    $(shell mkdir -p $(PLATFORM_OBJ_DIR))
    $(shell mkdir -p $(PLATFORM_TEMP_DIR))

While there seems to be a lot going on in this file, it is all fairly straightforward. Basically, it just does the following:

  1. Configures the settings for the build.
  2. Compiles each individual file, treating main.swift as special.
  3. Links the object files together to create a combined binary.

The Makefile doesn't support everything yet; for instance, there are no optimization flags being set for release builds. But it is a start.
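One way to fill that gap would be to switch CFLAGS on CONFIG, something like the following (the exact flag choices here are an assumption, mirroring Xcode's -Onone/-O split):

# ## COMPILER SETTINGS ##
ifeq (release, $(CONFIG))
    CFLAGS   = -O
else
    CFLAGS   = -g -Onone
endif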

Testing Our Work

To test it out, create our go-to foo.swift and main.swift files:

foo.swift

public func foo() -> String {
    let message = testing()
    return "Foo - \(message)"
}

main.swift

import Foundation

public func testing() -> String { return "testing!" }

println("Hello, World!")
println("testing: \(testing())")
println("foo: \(foo())")

As you can see, there is a bit of circular usage, as both foo and testing are used in both files. This gives us a basic check that each file can see the other's symbols and invoke them.

Running make gets us the following output (raw output):

$ make
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swift -frontend -c -color-diagnostics -g -Onone -primary-file /Users/dowens/Projects/Playground/SwiftOptions/SwiftTool/SwiftTool/src/foo.swift \
        /Users/dowens/Projects/Playground/SwiftOptions/SwiftTool/SwiftTool/src/main.swift -sdk /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk \
        -module-name tool -o /Users/dowens/Projects/Playground/SwiftOptions/SwiftTool/SwiftTool/bin/tool/obj/debug/macosx/foo.o -emit-module \
        -emit-module-path /Users/dowens/Projects/Playground/SwiftOptions/SwiftTool/SwiftTool/bin/tool/obj/debug/macosx/foo~partial.swiftmodule
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/swift -frontend -c -color-diagnostics -g -Onone -primary-file /Users/dowens/Projects/Playground/SwiftOptions/SwiftTool/SwiftTool/src/main.swift \
        /Users/dowens/Projects/Playground/SwiftOptions/SwiftTool/SwiftTool/src/foo.swift -sdk /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk \
        -module-name tool -o /Users/dowens/Projects/Playground/SwiftOptions/SwiftTool/SwiftTool/bin/tool/obj/debug/macosx/main.o -emit-module \
        -emit-module-path /Users/dowens/Projects/Playground/SwiftOptions/SwiftTool/SwiftTool/bin/tool/obj/debug/macosx/main~partial.swiftmodule
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld -syslibroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk -lSystem -arch x86_64 -macosx_version_min 10.10.0 -no_objc_category_merging -L /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/swift/macosx -rpath /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/swift/macosx /Users/dowens/Projects/Playground/SwiftOptions/SwiftTool/SwiftTool/bin/tool/obj/debug/macosx/foo.o /Users/dowens/Projects/Playground/SwiftOptions/SwiftTool/SwiftTool/bin/tool/obj/debug/macosx/main.o \
        -o /Users/dowens/Projects/Playground/SwiftOptions/SwiftTool/SwiftTool/bin/tool/bin/debug/macosx/tool

The thing to note is that a tool executable has been built as can be seen in that last line.

$ bin/tool/bin/debug/macosx/tool 
Hello, World!
testing: testing!
foo: Foo - testing!

Running it, we can see that everything seems to be in working order.

So that's it, a somewhat basic Makefile that can be used to create executables and libraries without having to rely on Xcode. Now, should we be doing this? I'll leave that question as an exercise to the reader.

Swift Makefiles – Take 2