Parsing Package.swift

As I’m working on my Swift language server, one of the things I need to do is to parse the Package.swift file. There is a PackageDescription library, but that’s unable for use within your own program if you are actually using SwiftPM.

Yeah…

So what are we to do? HACK IT UP!

Basically, we need to run this command:

$ swift -I /Library/Developer/Toolchains/swift-latest.xctoolchain/usr/lib/swift/pm \
    -L /Library/Developer/Toolchains/swift-latest.xctoolchain/usr/lib/swift/pm \
    -lPackageDescription Package.swift -fileno 1

Not really a big deal, just annoying.

Here’s the snippet to get the output:

Curious note: when you use string literal interpolation, even if your variable is an implicitly unwrapped optional, it will get output as an optional. Hence you see: "\(projectPath!)/package.swift". Go figure.

Of course, the output is JSON. If you need a parser, you can use json-swift or any of the many others.

 

The shell function is just a helper that I add to my SwiftFixes.swift file:

Anyhow, if you too need to parse that pesky Package.swift file, here you go.

Parsing Package.swift

vscode-json swift v2.0 released

I just published version 2.0 of my JSON parser library for Swift: json-swift. Version 2 brings in a lot of changes, mostly around RFC-7159 compliance and performance. As of this writing, I am not aware of any issues with the compliance tests.

As for performance, well, it’s gotten markedly faster! One of the test files I have is large-dict.json. It’s about 26 megs and contains lot of various data in it. This took a really long time to parse, I believe around 14s. However, now we’re looking pretty good!

NSJONSerialization:
performance results: min: 0.427, max: 0.484, avg: 0.446

JSONLib:
performance results: min: 0.966, max: 1.060, avg: 0.985

Freddy Results:
performance results: min: 0.875, max: 0.920, avg: 0.897

 

I put the comparison with Freddy in there too for good measure. There’s still a bit more I think can squeeze out, but there is a lot of retain/release overhead too that I need to figure out before I’ll be able to get down to beating NSJSONSerialization.

There are some breaking changes in this release as well, namely if you used the error and hasValue properties. Those are now gone as the entire API surface has been changed to use throws as the mechanism to promote errors. This cleaned up the code quite a bit and unified some of the concepts better.

This means that code like this: json["hello"][123]["would"] would normally require ? at each index. However, I provided Optional<JSValue> wrappers for all of those. This wasn’t possible in the early versions of Swift, so it was nice to clean up this design workaround.

Happy JSON parsing!

P.S. When Swift 4 becomes a bit more stabilized, I’ll be providing the Encoder and Decoder implementations to work with the new Codable protocol.

vscode-json swift v2.0 released

Looking at my JSON Parsing Performance

I’ve been working on my JSON parser lately in the hopes to fix two major issues:

  1. Correctness
  2. Performance

A lot of the design of the parser was influenced by early Swift limitations, so I was able to go through and get rid of a bunch of the weird boxing and internal backing stores I needed to use back then. Sadly, removing those didn’t really help performance much.

However, there is one piece that is hit by pretty much every part of the parser all of the time: my ReplayableGenerator type. The idea of this was that I’d simply call next() and replay() in the parsing code to remove the a lot of that logic out.

The current implantation requires a Sequence. This is fine except that the way I had things setup, I needed to turn the string into an array of UInt8. It turns out, that is relatively expensive. Even when creating the generator by using string.utf8 and using that iterator directly, performance was still 10x worse than JSONSerialization.

Uck!

All was not lost though! Instead of using a Sequence.Iterator to back my ReplayableGenerator, I figured I’d just straight up use an UnsafeBufferPointer<UInt8>.

Results:

NSJONSerialization:
performance results: min: 0.0126, max: 0.0215, avg: 0.014

JSONLib:
performance results: min: 0.0364, max: 0.050, avg: 0.0392

Yay! Getting there. There is still more work to be done and some correctness issues to work out, but getting happier with things now.

Just one more quick thing to note: one of the biggest perf gains was changing how I was getting the string content.

One thing I maybe should have tried, but forgot to, was getting a lazy string.utf8 back. That might have made some difference.

Looking at my JSON Parsing Performance

Making Mistakes: print()

I’m implementing a Swift version of the Language Server Protocol. The way that it integrates within Visual Studio Code (VS Code) is via stdin and stdout. That’s all fine and dandy. It also makes uses of a modified JSON-RPC message construct for its communication.

While testing out my server’s ability to handle commands coming in from stdin, I was simply using print() to output the response message.  Anyhow, input a message, and the output was working great.

However, when I went to test it within VS Code, I would get the initialize request, send a message back, and nothing. What I expected to have happen was for VS Code to start sending me more messages.

So what was the problem? I honestly had no idea.

Problem #1: From the LSP spec, it isn’t immediately obvious what the response messages should look like. Should it include the message header? Should it just have the JSON-RPC part? Is my message even formatted correctly? The spec calls for \r\n instead of \n, did I mess that up?

I go through and validate the message and output in all of the different permutations I can think of, but nothing. After spending some time digging around other LSP implementations, I come to the conclusion that I am indeed sending back the right message format, so what could it be?

Problem #2: Esoteric history and undocumented (or implied) behavior.

Ok, so if the message format is correct, maybe the output isn’t actually working as it looks like it is. So I run mkfifo output, mkfifo input, and tail output. Let’s see what is happening.

Running cat initialize.lsp > input (my saved message content for an initialize request) gets my language server to handle the message, but no output.

images

It turns out that Swift’s print() simply routes to the underlying stdio output. Which, if you don’t know, does buffered or unbuffered output depending on what its actually being output to. In the case of the console, it’s output immediately. In the case of a file descriptor, it’s buffered.

Solution (temporary):

setbuf(stdout, nil)

It’s temporary because I actually need to write the proper version of the output code to ensure that I’m writing the content correct and only the number of bytes that are specified in the response message.

Retrospective

Here’s the thing, I actually knew about how stdio buffers it’s output. However, when I looked at the print() documentation, I simply became complacent and assumed since it didn’t mention buffering that it indeed immediately wrote the content out. Later testing, of course, would prove otherwise.

 

The problem here is somewhat systemic of our programming culture. It came from a combination of unclear documentation (from two sources, nonetheless), and an assumption of knowledge that, even if the person knows, can forget to apply in certain contexts.

Hopefully this radar gets fixed:

Making Mistakes: print()