Version Info in Swift CLI tools

I have a simple task… I just want to output a version number when I run langsrv -v. Is that so hard? Well, if you’re using Swift, yes.

I make mistakes, so manual steps are both error-prone and tiresome for me. When I update the version, I also need to make sure my git tag will be correct. So… how do we accomplish this in Swift?

Just follow these easy steps!

  1. Create a Makefile – yep, just get used to it. SwiftPM should be thought of as simply the mechanism to save you from writing some of the swiftc arguments directly. Sure, it also makes downloading some dependencies a bit easier too.
  2. Create a new module, say VersionInfo.
  3. Create a template Swift file, say VersionInfo.swifttemplate
  4. Create a version info file, say VersionInfo.yaml
  5. Create a script that will parse that VersionInfo.yaml file and generate the appropriate VersionInfo.swift file.
  6. Import your VersionInfo module in your main.swift file.
  7. Reference that version info however you’d like.
  8. Update your Makefile to include a tag target that creates the correct git tag.

Yeah… I think that is all of the steps.

Here’s my Makefile:

And my genvers.sh script:

And my VersionInfo.swifttemplate file:

And finally, my VersionInfo.yaml file:

version: 0.9.2
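
To illustrate step 5, here is a minimal sketch in Swift of what such a generator could do. The real genvers.sh is a shell script and may work quite differently. The `{{VERSION}}` placeholder, the function names, and the template contents below are all assumptions made for this example.

```swift
import Foundation

/// Pulls the value of the `version:` key out of a very simple YAML file.
/// This is not a YAML parser; it only handles a flat `key: value` line.
func parseVersion(fromYAML yaml: String) -> String? {
    for line in yaml.split(separator: "\n") {
        let trimmed = line.trimmingCharacters(in: .whitespaces)
        guard trimmed.hasPrefix("version:") else { continue }
        let value = trimmed.dropFirst("version:".count)
            .trimmingCharacters(in: .whitespaces)
        return value.isEmpty ? nil : value
    }
    return nil
}

/// Fills the (hypothetical) `{{VERSION}}` placeholder in the template text.
func renderVersionInfo(template: String, version: String) -> String {
    return template.replacingOccurrences(of: "{{VERSION}}", with: version)
}

// Example: the template text here stands in for VersionInfo.swifttemplate.
let template = "public let version = \"{{VERSION}}\""
if let version = parseVersion(fromYAML: "version: 0.9.2\n") {
    print(renderVersionInfo(template: template, version: version))
}
```

The rendered text would then be written to VersionInfo.swift before the build runs.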

Now when I want to update the version, I simply edit the VersionInfo.yaml file, build, and run make tag before I push my changes.
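
Steps 6 and 7 might look roughly like the sketch below. The `VersionInfo` enum stands in for the generated module so the example is self-contained, and the `versionBanner` helper and its output format are assumptions, not the actual langsrv code.

```swift
// Stand-in for the generated VersionInfo module (hypothetical shape).
enum VersionInfo {
    static let version = "0.9.2"
}

/// Returns the text to print when `-v` is passed, or nil otherwise.
func versionBanner(for arguments: [String], toolName: String = "langsrv") -> String? {
    guard arguments.contains("-v") else { return nil }
    return "\(toolName) \(VersionInfo.version)"
}

// In main.swift: skip the executable path, then check for the flag.
if let banner = versionBanner(for: Array(CommandLine.arguments.dropFirst())) {
    print(banner)
}
```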

Of course, if we had compiler variables in Swift, I could have avoided nearly all of this work…


Protocols, Generics, and Enums

So I have a problem… my Language Server Protocol implementation has a pluggable API surface. The transport mechanism and how you encode the data within the VS Code message format are both abstracted out so you can plug in a different implementation, say an IPC transport mechanism instead of stdin/stdout.

Anyhow, the spec is a bit under-restricted. That is, there are places that allow a value of Any type to be stored. Now… the spec ties its implementation to JSON-RPC, so the actual potential types of Any must be one of the valid JSON types.

I handle encoding and decoding via two very simple interfaces:

public protocol Encodable {
  associatedtype EncodableType
  func encode() -> EncodableType
}

public protocol Decodable {
  associatedtype EncodableType
  static func decode(_ data: EncodableType?) throws -> Self
}

public typealias Codeable = Encodable & Decodable

OK, pretty straightforward. The first problem though is the associated type. This already caused me some trouble earlier with my default extension for encode(): I couldn’t figure out how to rewrite it now that it used an associated type.

The next design choice I have is that all LSP commands are modeled within an enum.

public enum LanguageServerCommand {
  case initialize(requestId: RequestId, params: InitializeParams)
  case initialized
 ///...

  case workspaceDidChangeConfiguration(params: DidChangeConfigurationParams)
 
  /// ...
}

I really like this as it makes it clear what commands have not been implemented yet.

So here’s where the problem really comes in: DidChangeConfigurationParams is one of those APIs that has an Any type as one of its members. Updating that type looks like this:

public struct DidChangeConfigurationParams<SettingsType> {
  public init(settings: SettingsType) {
    self.settings = settings
  }
  public var settings: SettingsType
}

But this requires changes to the LanguageServerCommand now.

public enum LanguageServerCommand<SettingsType> {
  case workspaceDidChangeConfiguration(params: DidChangeConfigurationParams<SettingsType>)
}

And now everywhere that uses LanguageServerCommand needs to be updated… not to mention that I need to do this for each type that uses Any.

What to do?

I basically have a handful of options:

  1. Re-design the Encodable and Decodable interfaces to remove the associated type or merge with the Swift 4 design. I don’t really like the approach of the Swift 4 model much though, so I’m not really keen on doing this until absolutely necessary.
  2. Re-design how I’m handling my responses and not use an enum. However, I don’t really like this either.

What did I do?

Well… I said, “Screw you type system! I know what I want and when I want it!”. Seriously… I cannot express what I want to express in a simple manner, so I created this:

public protocol AnyEncodable {
    func encode() -> Any
}

Then I updated my type definitions to look like this:

The change is that registerOptions is now AnyEncodable? instead of JSValue?. This removes the leaky abstraction of the serialization mechanism and moves it back into the serialization layer itself.
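
To make the trade-off concrete, here is a minimal sketch of how erasing the type through AnyEncodable can keep LanguageServerCommand non-generic. Applying it to DidChangeConfigurationParams and the `EditorSettings` type are assumptions for illustration only; the actual change is the one in the commit linked in this post.

```swift
public protocol AnyEncodable {
    func encode() -> Any
}

// Hypothetical settings payload, invented for this example.
struct EditorSettings: AnyEncodable {
    var tabSize: Int
    func encode() -> Any {
        return ["tabSize": tabSize]
    }
}

// No generic parameter: the concrete settings type is erased.
public struct DidChangeConfigurationParams {
    public init(settings: AnyEncodable) {
        self.settings = settings
    }
    public var settings: AnyEncodable
}

// The enum stays non-generic, so existing call sites are unaffected.
public enum LanguageServerCommand {
    case initialized
    case workspaceDidChangeConfiguration(params: DidChangeConfigurationParams)
}

let command = LanguageServerCommand.workspaceDidChangeConfiguration(
    params: DidChangeConfigurationParams(settings: EditorSettings(tabSize: 4)))

if case .workspaceDidChangeConfiguration(let params) = command {
    // The serialization layer decides what to do with the Any it gets back.
    print(params.settings.encode())
}
```

The cost is that the compiler no longer checks what encode() returns; that check moves to runtime in the serialization layer.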

Is this dirty? Is this bad? Eh… I don’t really care. It’s what I needed to do so I did it.

Is there a better way? I don’t know…

You can see the full change here if you’re interested: https://github.com/owensd/swift-lsp/commit/8e2de14124fae91cf0d02c873acd16f9e93f5ef2

 


Parsing Package.swift

As I’m working on my Swift language server, one of the things I need to do is to parse the Package.swift file. There is a PackageDescription library, but that’s unavailable for use within your own program if you are actually using SwiftPM.

Yeah…

So what are we to do? HACK IT UP!

Basically, we need to run this command:

$ swift -I /Library/Developer/Toolchains/swift-latest.xctoolchain/usr/lib/swift/pm \
    -L /Library/Developer/Toolchains/swift-latest.xctoolchain/usr/lib/swift/pm \
    -lPackageDescription Package.swift -fileno 1

Not really a big deal, just annoying.

Here’s the snippet to get the output:

Curious note: when you use string literal interpolation, even if your variable is an implicitly unwrapped optional, it will get output as an optional. Hence you see: "\(projectPath!)/package.swift". Go figure.
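
A small sketch of that behavior, assuming a recent Swift compiler. The `projectPath` value is made up, and String(describing:) is used to mirror what interpolation produces without triggering the compiler's warning about interpolating an optional.

```swift
// An implicitly unwrapped optional, as in the note above (value is hypothetical).
let projectPath: String! = "/tmp/project"

// Without the explicit `!`, the value is described as an Optional.
let described = String(describing: projectPath)

// Force-unwrapping inside the interpolation gives the plain string.
let packagePath = "\(projectPath!)/Package.swift"

print(described)
print(packagePath)
```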

Of course, the output is JSON. If you need a parser, you can use json-swift or any of the many others.

 

The shell function is just a helper that I add to my SwiftFixes.swift file:
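
A minimal sketch of what such a helper might look like, assuming Foundation's Process API. The name `shell` matches the post, but the signature and body here are assumptions, not the original code.

```swift
import Foundation

/// Runs a tool found on PATH and returns its exit status and captured stdout.
func shell(_ tool: String, _ arguments: [String]) throws -> (status: Int32, output: String) {
    let process = Process()
    // /usr/bin/env resolves the tool name against PATH.
    process.executableURL = URL(fileURLWithPath: "/usr/bin/env")
    process.arguments = [tool] + arguments

    let pipe = Pipe()
    process.standardOutput = pipe

    try process.run()
    // Read before waiting so a large output cannot fill the pipe and stall the child.
    let data = pipe.fileHandleForReading.readDataToEndOfFile()
    process.waitUntilExit()

    return (process.terminationStatus, String(decoding: data, as: UTF8.self))
}

do {
    let result = try shell("echo", ["hello"])
    print(result.status, result.output)
} catch {
    print("failed to launch: \(error)")
}
```

For the Package.swift case, the arguments would be the ones from the command shown earlier (the `-I`, `-L`, `-lPackageDescription`, `Package.swift`, and `-fileno 1` values), passed to `swift` as the tool.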

Anyhow, if you too need to parse that pesky Package.swift file, here you go.


vscode-json swift v2.0 released

I just published version 2.0 of my JSON parser library for Swift: json-swift. Version 2 brings in a lot of changes, mostly around RFC-7159 compliance and performance. As of this writing, I am not aware of any issues with the compliance tests.

As for performance, well, it’s gotten markedly faster! One of the test files I have is large-dict.json. It’s about 26 megs and contains a lot of varied data. This took a really long time to parse, I believe around 14s. However, now we’re looking pretty good!

NSJSONSerialization:
performance results: min: 0.427, max: 0.484, avg: 0.446

JSONLib:
performance results: min: 0.966, max: 1.060, avg: 0.985

Freddy Results:
performance results: min: 0.875, max: 0.920, avg: 0.897

 

I put the comparison with Freddy in there too for good measure. There’s still a bit more I think I can squeeze out, but there is a lot of retain/release overhead too that I need to figure out before I’ll be able to get down to beating NSJSONSerialization.

There are some breaking changes in this release as well, namely if you used the error and hasValue properties. Those are now gone as the entire API surface has been changed to use throws as the mechanism to propagate errors. This cleaned up the code quite a bit and unified some of the concepts better.

This means that code like this: json["hello"][123]["would"] would normally require ? at each index. However, I provided Optional<JSValue> wrappers for all of those. This wasn’t possible in the early versions of Swift, so it was nice to clean up this design workaround.
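
The general technique can be sketched as follows. This is not json-swift's actual code: the `JSON` enum and its cases are a toy stand-in, and only the idea of adding subscripts to the Optional wrapper is taken from the paragraph above.

```swift
// Toy JSON value type, invented for this example.
enum JSON: Equatable {
    case string(String)
    case number(Double)
    case array([JSON])
    case object([String: JSON])

    subscript(key: String) -> JSON? {
        if case .object(let dict) = self { return dict[key] }
        return nil
    }

    subscript(index: Int) -> JSON? {
        if case .array(let items) = self, items.indices.contains(index) {
            return items[index]
        }
        return nil
    }
}

// Subscripts on Optional<JSON> let lookups chain without `?` at each step.
extension Optional where Wrapped == JSON {
    subscript(key: String) -> JSON? {
        return self.flatMap { $0[key] }
    }

    subscript(index: Int) -> JSON? {
        return self.flatMap { $0[index] }
    }
}

let doc = JSON.object(["hello": .array([.object(["would": .string("work")])])])
let found = doc["hello"][0]["would"]      // no `?` needed between subscripts
let missing = doc["nope"][123]["would"]   // any failed step yields nil
```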

Happy JSON parsing!

P.S. When Swift 4 becomes a bit more stabilized, I’ll be providing the Encoder and Decoder implementations to work with the new Codable protocol.


Looking at my JSON Parsing Performance

I’ve been working on my JSON parser lately in the hope of fixing two major issues:

  1. Correctness
  2. Performance

A lot of the design of the parser was influenced by early Swift limitations, so I was able to go through and get rid of a bunch of the weird boxing and internal backing stores I needed to use back then. Sadly, removing those didn’t really help performance much.

However, there is one piece that is hit by pretty much every part of the parser all of the time: my ReplayableGenerator type. The idea of this was that I’d simply call next() and replay() in the parsing code to factor a lot of that logic out.

The current implementation requires a Sequence. This is fine except that the way I had things set up, I needed to turn the string into an array of UInt8. It turns out that is relatively expensive. Even when creating the generator by using string.utf8 and using that iterator directly, performance was still 10x worse than JSONSerialization.

Uck!

All was not lost though! Instead of using a Sequence.Iterator to back my ReplayableGenerator, I figured I’d just straight up use an UnsafeBufferPointer<UInt8>.
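
A minimal sketch of the idea, not the real ReplayableGenerator: the type name, the stored properties, and the exact meaning of replay() (here, "hand back the last byte again on the next call to next()") are assumptions for illustration.

```swift
/// Walks a byte buffer with one byte of push-back.
struct ReplayableByteGenerator {
    private let buffer: UnsafeBufferPointer<UInt8>
    private var index = 0
    private var current: UInt8? = nil
    private var shouldReplay = false

    init(buffer: UnsafeBufferPointer<UInt8>) {
        self.buffer = buffer
    }

    /// Returns the next byte, or the previous one again if replay() was called.
    mutating func next() -> UInt8? {
        if shouldReplay {
            shouldReplay = false
            return current
        }
        guard index < buffer.count else {
            current = nil
            return nil
        }
        current = buffer[index]
        index += 1
        return current
    }

    /// Marks the most recently returned byte to be returned once more.
    mutating func replay() {
        shouldReplay = true
    }
}

// The buffer pointer is only valid inside the closure, so parsing happens there.
let bytes = Array("[1]".utf8)
bytes.withUnsafeBufferPointer { buffer in
    var generator = ReplayableByteGenerator(buffer: buffer)
    while let byte = generator.next() {
        print(byte)
    }
}
```

Indexing straight into the buffer avoids the per-element iterator overhead, at the price of having to keep the underlying storage alive for the duration of the parse.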

Results:

NSJSONSerialization:
performance results: min: 0.0126, max: 0.0215, avg: 0.014

JSONLib:
performance results: min: 0.0364, max: 0.050, avg: 0.0392

Yay! Getting there. There is still more work to be done and some correctness issues to work out, but getting happier with things now.

Just one more quick thing to note: one of the biggest perf gains was changing how I was getting the string content.

One thing I maybe should have tried, but forgot to, was getting a lazy string.utf8 back. That might have made some difference.


Making Mistakes: print()

I’m implementing a Swift version of the Language Server Protocol. The way that it integrates within Visual Studio Code (VS Code) is via stdin and stdout. That’s all fine and dandy. It also makes use of a modified JSON-RPC message construct for its communication.

While testing out my server’s ability to handle commands coming in from stdin, I was simply using print() to output the response message. I’d input a message, and the output was working great.

However, when I went to test it within VS Code, I would get the initialize request, send a message back, and nothing. What I expected to have happen was for VS Code to start sending me more messages.

So what was the problem? I honestly had no idea.

Problem #1: From the LSP spec, it isn’t immediately obvious what the response messages should look like. Should it include the message header? Should it just have the JSON-RPC part? Is my message even formatted correctly? The spec calls for \r\n instead of \n, did I mess that up?

I go through and validate the message and output in all of the different permutations I can think of, but nothing. After spending some time digging around other LSP implementations, I come to the conclusion that I am indeed sending back the right message format, so what could it be?

Problem #2: Esoteric history and undocumented (or implied) behavior.

Ok, so if the message format is correct, maybe the output isn’t actually working as it looks like it is. So I run mkfifo output, mkfifo input, and tail output. Let’s see what is happening.

Running cat initialize.lsp > input (my saved message content for an initialize request) gets my language server to handle the message, but no output.


It turns out that Swift’s print() simply routes to the underlying stdio output. Which, if you don’t know, buffers its output differently depending on what it’s actually being output to. In the case of the console, it’s line-buffered, so each line shows up immediately. In the case of a pipe or a file, it’s fully buffered, so nothing is written until the buffer fills or is flushed.

Solution (temporary):

setbuf(stdout, nil)

It’s temporary because I actually need to write the proper version of the output code to ensure that I’m writing the content correctly and only the number of bytes that are specified in the response message.
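
A rough sketch of what that proper version could look like, assuming the standard LSP framing of a Content-Length header followed by a blank line and the JSON body. The function names are hypothetical; writing through FileHandle goes directly to the file descriptor, so stdio buffering is not involved.

```swift
import Foundation

/// Wraps a JSON body in the LSP header. Content-Length counts UTF-8 bytes,
/// not characters.
func frameMessage(_ json: String) -> Data {
    let body = Data(json.utf8)
    let header = "Content-Length: \(body.count)\r\n\r\n"
    return Data(header.utf8) + body
}

/// Writes exactly the framed bytes to the given handle (stdout by default).
func send(_ json: String, to handle: FileHandle = .standardOutput) {
    handle.write(frameMessage(json))
}

send("{\"jsonrpc\":\"2.0\",\"id\":1,\"result\":null}")
```

If print() stays in the picture for other output, calling fflush(stdout) after each message is another way to avoid the buffering surprise.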

Retrospective

Here’s the thing, I actually knew about how stdio buffers its output. However, when I looked at the print() documentation, I simply became complacent and assumed since it didn’t mention buffering that it indeed immediately wrote the content out. Later testing, of course, would prove otherwise.

 

The problem here is somewhat symptomatic of our programming culture. It came from a combination of unclear documentation (from two sources, no less) and an assumption of knowledge that, even if the person has it, they can forget to apply in certain contexts.

Hopefully this radar gets fixed:
