Error Handling – Take Two

Make sure to see the update below for a bit more information on the causes of the memory usage.

In my seemingly never-ending and not-quite-achievable goal of beating NSJSONSerialization in both performance and memory utilization when parsing a JSON string, I've come across another pearl of wisdom about Swift: ignore my Error Handling in Swift piece and others that recommend using an Either<T, U> type as in other languages (at least for the current version of Swift, as of Beta 6).

I have been able to get my parsing speed to within 0.01s of NSJSONSerialization; while my goal is domination, I am also pragmatic (at times). Next up was memory utilization. Unfortunately, I was (and still am) far behind the total memory usage of the ObjC version. So, like a good little software engineer, I fired up Instruments and started investigating.

When investigating memory usage, there are three primary concerns to watch out for:

  1. Total amount of memory used over the life of the scenario
  2. Total amount of memory actually in use at any given time
  3. Highest spike in memory used over the life of the scenario

Instruments visualizes this data pretty nicely for us:

screenshot of instruments with multiple memory profiles visualized in the editor

The picture above shows the results of the NSJSONSerialization code path. My implementation actually has a better overall "total persistent bytes": 1.92MB vs. the 2.51MB shown above. However, my implementation's total memory usage was about 6.5MB, while NSJSONSerialization only used about 4.7MB.

Taking a Dive

There are a couple of approaches we can take to tracking down and solving memory issues:

  1. Examine the code
  2. Examine the profiles

Unfortunately, the profiles were not really helping me track down the root cause of the issues, but they were illustrative in helping me understand that I was creating many, many copies of objects all over the place.

Examining the Error type

I first took a quick look over my code to see if I could spot anything obvious. There was one thing I noticed right off the bat: FailableOf<T> stores an Error value in its Failure case. The Error type is a struct with three stored values, and since I return a FailableOf<T> from all of my parsing calls, I need to return a copy of that Error, even when it's empty, every single time.

Knowing that the Error value is going to be copied so many times throughout the call chain, we can instead make Error a public final class.

When we do this, the total memory usage drops to 6.06MB.

The other option is to create a backing class to store all of the data; that version looks like this:

public struct Error {
    public typealias ErrorInfoDictionary = [String:String]

    class ErrorInfo {
        let code: Int
        let domain: String
        let userInfo: ErrorInfoDictionary?

        init(code: Int, domain: String, userInfo: ErrorInfoDictionary?) {
            self.code = code
            self.domain = domain
            self.userInfo = userInfo
        }
    }

    var errorInfo: ErrorInfo

    public var code: Int { return errorInfo.code }
    public var domain: String { return errorInfo.domain }
    public var userInfo: ErrorInfoDictionary? { return errorInfo.userInfo }

    public init(code: Int, domain: String, userInfo: ErrorInfoDictionary?) {
        self.errorInfo = ErrorInfo(code: code, domain: domain, userInfo: userInfo)
    }
}

However, that seems a lot more complicated than simply doing this:

public final class Error {
    public typealias ErrorInfoDictionary = [String:String]

    public let code: Int
    public let domain: String
    public let userInfo: ErrorInfoDictionary?

    public init(code: Int, domain: String, userInfo: ErrorInfoDictionary?) {
        self.code = code
        self.domain = domain
        self.userInfo = userInfo
    }
}

And since all my values are immutable to begin with, I'm not sure why I would choose the struct approach for this problem.
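To make the trade-off concrete, here is a minimal sketch (in current Swift syntax, with hypothetical type names) of the semantic difference: assigning a struct copies the entire value, while assigning a class instance only copies a reference.

```swift
// Hypothetical illustration of value vs. reference semantics for an
// immutable error type; the names are made up for this sketch.
struct StructError {
    let code: Int
    let domain: String
}

final class ClassError {
    let code: Int
    let domain: String
    init(code: Int, domain: String) {
        self.code = code
        self.domain = domain
    }
}

let s1 = StructError(code: 1, domain: "parser")
let s2 = s1                     // copies the entire value (all stored members)

let c1 = ClassError(code: 1, domain: "parser")
let c2 = c1                     // copies only a reference
print(c1 === c2)                // true: both names point at one instance
```

A single small-struct copy is cheap in isolation; the cost in the parser came from copying at every level of a deep call chain, on every call.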

Investigating the FailableOf<T>

Since I'm having copying issues with the Error (gist) type, it is only logical to look at the FailableOf<T> type next. Instead of using my JSON parser as the test ground, I decided to create a little sample app that loops many times calling a function that returns one of the following types:

  • FailableOf<T> – my implementation of the Either<T, U> concept (gist)
  • Either<T, U> – a more generic solution to my FailableOf<T> problem (gist)
  • (T, Error) – a tuple that contains the two pieces of information
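The gists are not reproduced here, but a minimal sketch of the three shapes, in current Swift syntax with hypothetical case names, might look like:

```swift
// Hypothetical sketches of the three return shapes being compared.
// The real implementations live in the linked gists.
final class ParseError {               // stand-in for the final-class Error
    let code: Int
    init(code: Int) { self.code = code }
}

enum FailableOf<T> {                   // success-or-failure, error type fixed
    case value(T)
    case failure(ParseError)
}

enum Either<L, R> {                    // fully generic two-case container
    case left(L)
    case right(R)
}

typealias TupleResult<T> = (value: T?, error: ParseError?)   // the tuple shape
```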

The sample program is straightforward:

func either<T>(value: T) -> Either<T, Error> {
    return Either(left: value)
}

// test: either
for var i = 0; i < 100_001; i++ {
    let r = either(i)
    if r.right != nil {
        println("error at \(i)")
    }
}

Each of the different constructs has the same form (gist).
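For comparison, the tuple-returning variant might look like this (reconstructed here in current Swift syntax; the original used println and a C-style for loop):

```swift
// Hypothetical tuple-based variant of the same benchmark loop.
final class ParseError {
    let code: Int
    init(code: Int) { self.code = code }
}

func tupled<T>(_ value: T) -> (value: T?, error: ParseError?) {
    return (value, nil)        // success path: no error instance is allocated
}

// test: tuple
for i in 0..<100_001 {
    let r = tupled(i)
    if r.error != nil {
        print("error at \(i)")
    }
}
```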

This is where I found something interesting: both the FailableOf<T> and Either<T, U> versions take up about 3MB of memory, while the (T, Error) test takes only 17KB. Clearly, there must be some missed compiler optimizations in Swift. Regardless, the tuple approach is the one we should be taking, at least for now, if we really care about every ounce of memory.

In order to work with it better in my code, I created a typealias and used a named tuple:

/// The type that represents the result of the parse.
public typealias JSParsingResult = (value: JSValue?, error: Error?)

After updating all of the JSON.parse code to return this new type, memory usage is down to 5.33MB!! Simply by switching from the struct-based approach to this named-tuple approach (which, frankly, I think is just as good an API), I was able to shave off another 700KB of unnecessary memory allocation.
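A caller-side sketch of working with such a named-tuple result (with a simplified stand-in for JSValue, since the real type lives in the parser library):

```swift
// Hypothetical parse function returning a named-tuple result.
final class ParseError {
    let code: Int
    let domain: String
    init(code: Int, domain: String) {
        self.code = code
        self.domain = domain
    }
}

// Simplified stand-in for (JSValue?, Error?): parses an Int from a String.
typealias ParsingResult = (value: Int?, error: ParseError?)

func parseNumber(_ s: String) -> ParsingResult {
    if let n = Int(s) {
        return (n, nil)
    }
    return (nil, ParseError(code: 1, domain: "parser.example"))
}

let result = parseNumber("42")
if let error = result.error {
    print("failed with code \(error.code)")
} else if let value = result.value {
    print("parsed \(value)")       // prints "parsed 42"
}
```

The named fields mean callers read `result.value` and `result.error` rather than `result.0` and `result.1`, which keeps call sites self-documenting.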

I'm not done investigating other opportunities right now, but things are starting to look really promising here.

UPDATE After some more investigating, I realized why the enum cases were causing such memory bloat: all of the types stored in them need to be boxed until Swift implements proper generics support for enums.
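At the time, the common workaround was the well-known Box pattern: wrap the payload in a reference type so the enum case carries only a fixed, pointer-sized payload. A sketch in current syntax (modern Swift enums support generic payloads directly, so this is no longer needed):

```swift
// The classic early-Swift "Box" workaround, sketched in current syntax.
// Wrapping the payload in a class gives the enum case a fixed,
// pointer-sized payload instead of an arbitrarily sized generic value.
final class Box<T> {
    let value: T
    init(_ value: T) { self.value = value }
}

enum Failable<T> {
    case success(Box<T>)
    case failure(String)
}

let result = Failable.success(Box(42))
if case let .success(box) = result {
    print(box.value)               // prints "42"
}
```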
