Looking at my JSON Parsing Performance

I’ve been working on my JSON parser lately in the hopes of fixing two major issues:

  1. Correctness
  2. Performance

A lot of the design of the parser was influenced by early Swift limitations, so I was able to go through and get rid of a bunch of the weird boxing and internal backing stores I needed to use back then. Sadly, removing those didn’t really help performance much.

However, there is one piece that is hit by pretty much every part of the parser all of the time: my ReplayableGenerator type. The idea was that I’d simply call next() and replay() in the parsing code, which moves a lot of that bookkeeping logic out of the parser itself.

The current implementation requires a Sequence. This is fine, except that the way I had things set up, I needed to turn the string into an array of UInt8. It turns out that is relatively expensive. Even when creating the generator from string.utf8 and using that iterator directly, performance was still 10x worse than JSONSerialization.
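To make the shape of this concrete, here is a rough sketch of a Sequence-backed replayable generator. It is not the actual JSONLib code: the type name SequenceReplayableGenerator and the exact semantics (replay() makes the following next() return the previous element again) are assumptions for illustration.

```swift
// Hypothetical sketch; the name and semantics are assumptions, not the JSONLib source.
struct SequenceReplayableGenerator<S: Sequence>: IteratorProtocol {
    private var base: S.Iterator
    private var previous: S.Element?   // last element handed out by next()
    private var shouldReplay = false   // set by replay(), cleared by next()

    init(_ sequence: S) {
        base = sequence.makeIterator()
    }

    mutating func next() -> S.Element? {
        if shouldReplay {
            // Hand the previous element out a second time.
            shouldReplay = false
            return previous
        }
        previous = base.next()
        return previous
    }

    mutating func replay() {
        shouldReplay = true
    }
}

// Usage: iterate the UTF-8 bytes of a string directly.
var generator = SequenceReplayableGenerator("[1]".utf8)
let first = generator.next()    // "[" as a UTF-8 byte
generator.replay()
let again = generator.next()    // the same byte again
let second = generator.next()   // "1" as a UTF-8 byte
```

Every byte goes through the generic iterator plus the replay check, which is likely part of why this path shows up so heavily in the parser.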

Uck!

All was not lost though! Instead of using a Sequence.Iterator to back my ReplayableGenerator, I figured I’d just straight up use an UnsafeBufferPointer<UInt8>.
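A minimal sketch of what a buffer-backed version might look like is below. Again, this is an assumption about the shape, not the real implementation: BufferReplayableGenerator is a made-up name, and replay() here simply steps the read index back by one.

```swift
// Hypothetical sketch; not the JSONLib source.
struct BufferReplayableGenerator: IteratorProtocol {
    private let buffer: UnsafeBufferPointer<UInt8>
    private var index = 0   // position of the next byte to hand out

    init(_ buffer: UnsafeBufferPointer<UInt8>) {
        self.buffer = buffer
    }

    mutating func next() -> UInt8? {
        let current = index
        // Allow the index to move one step past the end so that
        // replay() after hitting end-of-input yields nil again.
        if index <= buffer.count { index += 1 }
        return current < buffer.count ? buffer[current] : nil
    }

    mutating func replay() {
        // Step back one position; the next call to next() repeats it.
        if index > 0 { index -= 1 }
    }
}

// Usage: the buffer is only valid inside the closure.
let bytes = Array("[1]".utf8)
let seen = bytes.withUnsafeBufferPointer { buffer -> [UInt8?] in
    var generator = BufferReplayableGenerator(buffer)
    let first = generator.next()
    generator.replay()
    return [first, generator.next(), generator.next()]
}
```

With a raw buffer, next() and replay() reduce to an index bump and a bounds check, with no iterator state to carry around.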

Results:

NSJSONSerialization:
performance results: min: 0.0126, max: 0.0215, avg: 0.014

JSONLib:
performance results: min: 0.0364, max: 0.050, avg: 0.0392

Yay! Getting there: that is roughly 2.8x slower than NSJSONSerialization on average, down from 10x. There is still more work to be done and some correctness issues to work out, but I’m getting happier with things now.

Just one more quick thing to note: one of the biggest perf gains was changing how I was getting the string content.


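    // Encode the string as UTF-8 once, then parse straight from the raw bytes.
    let data = string.data(using: String.Encoding.utf8, allowLossyConversion: false)!
    return try data.withUnsafeBytes { (ptr: UnsafePointer<UInt8>) -> JSValue in
        // The pointer is only valid inside this closure, so all parsing happens here.
        let buffer = UnsafeBufferPointer(start: ptr, count: data.count)
        let generator = ReplayableGenerator(buffer)
        let value = try parse(generator)
        // Check whatever is left in the input after the parsed value.
        try validateRemainingContent(generator)
        return value
    }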

One thing I maybe should have tried, but forgot to, was getting a lazy string.utf8 back. That might have made some difference.
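For what it is worth, newer Swift 5 toolchains also offer String.withUTF8, which exposes the string's UTF-8 bytes as an UnsafeBufferPointer<UInt8> without going through Data. I have not measured this, so treat the following as an untested idea rather than a recommendation; the helper name withUTF8Bytes is made up for the example.

```swift
// Hypothetical helper: run `body` over the UTF-8 bytes of `string`.
func withUTF8Bytes<R>(
    of string: String,
    _ body: (UnsafeBufferPointer<UInt8>) throws -> R
) rethrows -> R {
    // withUTF8 is mutating (it may first move the string into
    // contiguous UTF-8 storage), so work on a local copy.
    var copy = string
    return try copy.withUTF8(body)
}

// Usage: count the bytes of a small JSON document.
let byteCount = withUTF8Bytes(of: "{\"a\": 1}") { $0.count }
```

The buffer handed to the closure could be passed to the generator in the same way as the Data-backed one above.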
