Performance Matters: O(N)

In college and other performance books, we learned about Big O notation. We were taught that the thing you really need to be careful of is the algorithmic complexity of your algorithms. And while that is true to some extent, it leads to being extremely wasteful.

The problem isn’t so much what you’ve been told about algorithmic complexity, the problem is that the comparison is done in the abstract, typically with no consideration of how the computer actually works.

And that’s a huge problem!

Take for instance a problem of inserting items into a the middle of an sequence of values.  When we look at the comparison between a linked list and an array of items, the algorithmic complexity tells us that both operations are going to be O(N).

Of course, with a linked list, we only need to:

  1. Find the location to insert at
  2. Create the new node
  3. Update the next pointer of the new new node
  4. Update the next pointer of the previous node

That sounds pretty cheap!

For an array, we only need to:

  1. Resize the contiguous piece of memory making up the array
  2. Copy all the data over
  3. Add our new value in the correct offset

Well crap, that sounds really expensive!

Spoiler: you’ve been misled.

Yep, that’s right. What if I told you that even in an array of 10,000 items and a linked list of 10,000 items, insertion into the middle is still going to be faster in the array? Would you believe me?

The timings are for inserting 10,000 items into the middle of the vector each time.

  • Insertion into a std::vector: 1.35ms
  • Insertion into a std::list:  104.89ms

(side note: the std::list was 4x faster than my own linked list implementation, so it’s definitely not an STL issue)

What the heck is going on here!? Ok, this was a 32-bit size structure. What about a bigger struct, like 2048-bits?

  • Insertion into a std::vector: 155.15ms
  • Insertion into a std::list:  211.38ms

Ok, as expected, the size of the data structure has a much greater impact on the array than the list. But what’s the deal with the original timings?

The problem is: data locality.

This is a key thing that is missed when talking about performance. Regardless of how you implement a list of items, you’re going to have something like this:

node = node->next;

The problem is that the -> operator (de-referrencing) is a relatively expensive operation.

For comparison, here are the timings for insertion into the beginning of the array/list (32-bit structure):

  • Insertion into a std::vector: 2.41ms
  • Insertion into a std::list: 0.52ms

Conclusion

Remember, computers are extremely fast at dealing with data that is sequential. A lot of that data just gets pulled into the processor’s cache lines and it’s like a shark at a buffet line. When you have to follow pointers around, it’s like having to drive to the grocery store every time you want a bite to eat.

Further Reading:

The above article has a much more thorough investigation and provides some excellent advice about increasing the performance of your game loop without messing up your data structures. It continues this same concept of “data locality” and shows just how important it is.

Swift Comparison

Here’s an interesting Swift comparison on the 32-bit sized data structure for a linked list and an array:

  • Insertion into a Array: 1.22ms
  • Insertion into a LinkedList: 396ms

If you’ve been wondering where my Swift posts have been… this is why. Honestly, I’m tired of trying to figure out the performance issues in the language (mostly between debug and release builds) and have been spending most of my hobby programming time in C++. ¯\_(ツ)_/¯

UPDATE: There was an error in the Swift version, it’s been fixed and the array times are SIGNIFICANTLY better. It’s still slower than the C++ version, but only by a factor of 2 this time. WAY better.

UPDATE2: When Ints are 64bits… duh! Sorry, Swift’s Int type is 64 bits. Using Int32, as I should have been, get’s us all on even playing field! GO SWIFT! 🙂

All code for this can be found and verified here: https://gist.github.com/owensd/94490781a0ce1dce6737834876ff3a02.

Performance Matters: O(N)

Confederacy, Heroes, and Romans 14

I’m simply going to start with this, because it’s the point of all of this really:

Do not, for the sake of food, destroy the work of God. Everything is indeed clean, but it is wrong for anyone to make another stumble by what he eats. It is good not to eat meat or drink wine or do anything that causes your brother to stumble. – Roman 14:20-21

Let’s say, for the sake of argument, that everything you are arguing about the Confederacy and the “heroes” (in your words) of that time is true. Let’s also say, for the sake of argument, that the only part of that time in history you are trying to celebrate is your heritage, Let’s also say, for the sake of argument, that the history of slavery and oppression of people is not part of that heritage you are trying to remember and honor. With all that, how do you still answer Paul’s charge about stumbling blocks? How can you, knowing what the imagery looks like to other people, especially those not of the caucasian pigmentation, continue to hold it up and fight for it?

Paul says that it is better to not eat meat or to abstain from drinking for the benefit of those that might struggle with those things and their pursuit of God. So as a Christian, how can you go about trying to uphold something that is so much more grave and important then eating meat and drinking wine when you know it causes a stumbling block to your brother?

So please, if you consider yourself a Christian, I implore you to wrestle with the words that Paul wrote: “But rather decide never to put a stumbling block or hindrance in the way of a brother.” – Romans 14:13b

Confederacy, Heroes, and Romans 14

Swift for Visual Studio Code – Early Preview

Well, it’s finally here… at least in a very early beta stage: Swift for Visual Studio Code (aka vscode-swift). This is my new code editor experience for Visual Studio Code. The project is actually made up of multiple projects:

  1. json-swift: a cross-platform JSON parsing library written completely in Swift.
  2. swift-lsp: an implementation of the Language Server Protocol put out by Microsoft. This allows anyone to use this project to write a language server for any language they want. It essentially creates an interface to talk between any code editor that implements the LSP and your language server.
  3. vscode-swift: This is simply the host extension that handles all of the magic. This is the least interesting part, but will be the project you use to actually get all of this stuff working within VS Code.
  4. swift-langsrv: This is the language server for Swift. Currently it is backed by SourceKit, though this is going to have to change or some SourceKit reckoning will need to happen.

Getting Started

First things first, this is still very early in the development cycle. What I have working right now is:

  1. Better syntax highlighting
  2. Basic snippets
  3. Code completion, which should work for all imported modules as well
  4. Linking to Swift bugs in your comments:
    // SwiftBug(SR-2688)
    

    That will be a hyperlink in VS Code to: https://bugs.swift.org/browse/SR-2688

There has only been ad-hoc, smoke testing done as of right now, so if you want to try it out, you can, but don’t get mad at me if it works just as well as Xcode.

macOS

Getting things to work on macOS is the as simple as downloading the extension from the marketplace. The Swift Language Server binaries are included for you. The only external requirement is that you need to have Swift 3.1 installed in one of the standard locations. If it is not, you can update the path in your settings.json file.

Linux

This will happen soon… I have this working on local builds, but it requires a bunch of hackery because SourceKit isn’t build with the Linux builds of Swift 3.

Windows (Linux Subsystem)

Similar story as Linux above but there are additional complexities here, including the reliability of swift build itself. As such, this is not supported, but I will post some instructions on how to do this later.

Go try it out! https://marketplace.visualstudio.com/items?itemName=kiadstudios.vscode-swift.

Bugs

If you find bugs, which I’m sure there are still many as it’s early in the release cycle, please log them here: vscode-swift. Also, if you’re on macOS 10.12, you can find log data in the Console under the subsystem com.kiadstudios. That should be helpful information to attach.

Swift for Visual Studio Code – Early Preview

SE-110 Fallout… I’m Confused

I was going to put this in a comment on the bug, but it really belongs on swift-evolution, but I left that a year or more ago now… so I’ll just put it here, I suppose:


I’m confused how this change also requires removing destructuring of tuple elements in these closures?

let a : ((Int, Int, Int)) -> Int = { x in return x.0 + x.1 + x.2 }

That makes sense that x is of type (Int, Int, Int), but it makes no sense that I cannot write this:

let a : ((Int, Int, Int)) -> Int = { (x, y, z) in return x + y + z }

The proposal also completely glosses over these changes by referring to them as “minor changes to user code may be required if this proposal is accepted.”

It’s further even more confusing to me why this syntax (note the (x) vs. x):

let a : ((Int, Int, Int)) -> Int = { (x) in return x.0 + x.1 + x.2 }

Would even be allowed to be the same as the entire change is about clarifying tuple vs. non-tuple arguments. Further, allowing this syntax creates yet-another-breaking-change if the destructuring convenience is brought back in the future.

Ugh.

SE-110 Fallout… I’m Confused

Version Info in Swift CLI tools

I have a simple task… I just want to output a version number when I run langsrv -v. Is that so hard? Well, if you’re using Swift, yes.

I make mistakes, so manual steps for me are both error prone and tiresome. When I update the version, I also need to make sure my git tag will be correct. So… how do we accomplish this in Swift?

Just follow these easy steps!

  1. Create a Makefile – yep, just get used to it. SwiftPM should be though of as simply the mechanism to save you from writing some of the swiftc arguments directly. Sure, it also makes downloading some dependencies a bit easier too.
  2. Create a new module, say VersionInfo.
  3. Create a template Swift file, say VersionInfo.swifttemplate
  4. Create a version info file, say VersionInfo.yaml
  5. Create a script that will parse that VersionInfo.yaml file and generate the appropriate VersionInfo.swift file.
  6. Import your VersionInfo module in your main.swift file.
  7. Reference that version info however you’d like.
  8. Update your Makefile to include a tag  to create the correct git tag.

Yeah… I think that is all of the steps.

Here’s my Makefile:

Any my genvers.sh script:

And my VersionInfo.swifttemplate file:

And finally, my VersionInfo.yaml file:

version: 0.9.2

Now when I want to update the version, I simply edit the VersionInfo.yaml file, build, and run make tag before I push my changes.

Of course, if we had compiler variables in Swift, I could have avoided nearly all of this work…

Version Info in Swift CLI tools

Protocols, Generics, and Enums

So I have a problem… my Language Server Protocol implementation has a pluggable API surface. The transport mechanism and how you encode the data within the VS Code message format are both abstracted out so you can plug in a different implementation, say an IPC transport mechanism instead of stdin/stdout.

Anyhow, the spec is a bit under restricted. That is, there are places that allow for Any type to be stored within there. Now… for the spec, it ties its implementation to JSONRPC, so the actual potential types of Any must be one of the valid JSON types.

The way I handle encoding and decoding are via two very simple interfaces:

public protocol Encodable {
  associatedtype EncodableType
  func encode() -> EncodableType
}

public protocol Decodable {
  associatedtype EncodableType
  static func decode(_ data: EncodableType?) throws -> Self
}

public typealias Codeable = Encodable & Decodable

OK, pretty straight forward. The first problem though is the associated type. This already caused me some troubles earlier with my default extension for encode(): I couldn’t figure out how to re-write it now that it used an associated type.

The next design choice I have is that all LSP commands are modeled within an enum.

public enum LanguageServerCommand {
  case initialize(requestId: RequestId, params: InitializeParams)
  case initialized
 ///...

  case workspaceDidChangeConfiguration(params: DidChangeConfigurationParams)
 
  /// ...
}

I really like this as it makes it clear what commands have not been implemented yet.

So here’s where the problem really comes in: DidChangeConfigurationParams is one of those APIs that has an Any type as one of its members. Updating that type looks like this:

public struct DidChangeConfigurationParams<SettingsType> {
  public init(settings: SettingsType) {
    self.settings = settings
  }
  public var settings: SettingsType
}

But this requires changes to the LanguageServerCommand now.

public enum LanguageServerCommand<SettingsType> {
  case workspaceDidChangeConfiguration(params: DidChangeConfigurationParams<SettingsType>)
}

And now everywhere that uses LanguageServerCommand needs to be updated… not to mention that I need to do this for each time that uses Any.

What to do?

I basically have a handful of options:

  1. Re-design the Encodable and Decodable interfaces to remove the associated type or merge with the Swift 4 design. I don’t really like the approach of the Swift 4 model much though, so I’m not really keen on doing this until absolutely necessary.
  2. Re-design how I’m handling my responses and not use an enum. However, I don’t really like this either.

What did I do?

Well… I said, “Screw you type system! I know what I want and when I want it!”. Seriously… I cannot express what I want to express in a simple manner, so I created this:

public protocol AnyEncodable {
    func encode() -> Any
}

Then I updated my type definitions to look like this:

The change is registerOptions is now AnyEncodable? instead of JSValue?. This removes the leaky abstraction of the serialization mechanism and moves it back into the serialization layer itself.

Is this dirty? Is this bad? Eh… I don’t really care. It’s what I needed to do so I did it.

Is there a better way? I don’t know…

You can see the full change here if you’re interested: https://github.com/owensd/swift-lsp/commit/8e2de14124fae91cf0d02c873acd16f9e93f5ef2

 

Protocols, Generics, and Enums

Parsing Package.swift

As I’m working on my Swift language server, one of the things I need to do is to parse the Package.swift file. There is a PackageDescription library, but that’s unable for use within your own program if you are actually using SwiftPM.

Yeah…

So what are we to do? HACK IT UP!

Basically, we need to run this command:

$ swift -I /Library/Developer/Toolchains/swift-latest.xctoolchain/usr/lib/swift/pm \
    -L /Library/Developer/Toolchains/swift-latest.xctoolchain/usr/lib/swift/pm \
    -lPackageDescription Package.swift -fileno 1

Not really a big deal, just annoying.

Here’s the snippet to get the output:

Curious note: when you use string literal interpolation, even if your variable is an implicitly unwrapped optional, it will get output as an optional. Hence you see: "\(projectPath!)/package.swift". Go figure.

Of course, the output is JSON. If you need a parser, you can use json-swift or any of the many others.

 

The shell function is just a helper that I add to my SwiftFixes.swift file:

Anyhow, if you too need to parse that pesky Package.swift file, here you go.

Parsing Package.swift