Efficient Mutation

Here's the thing – I'm not very good at the whole concept of dealing with non-mutating data. Partly that's because I have a really hard time understanding how you can write programs that way, as the primary function of a program is to mutate data. So it's hard for me to grok the reasons for design decisions such as "no pointers" and type-defined value/reference semantics. To me, they are actually a detriment to understanding how a program works.

So, I thought I'd post this to see if I can get some help from the community. I'll be glad to be corrected on this issue as I genuinely do not understand why there is this huge push for immutable data everywhere.

Disclaimer: I completely understand why portions of your code would use immutable data; protecting yourself or others from accidentally causing side-effects is usually a good thing. That's not what I'm asking about though.

Ok, so what's the problem? That's easy: a syntax tree. I always find it best to ground my examples in the real world, and as I'm working on a programming language (Proteus) in my spare time, this seems like a great candidate.

The question is: how can I build up a tree over time efficiently without mutating but by retrieving new copies of the data? Further, how can I modify just a portion of the tree without needing to perform a copy of the entire tree?

It's the latter question that I'm really more interested in. I can completely see how the compiler could optimize away the copy while a type is being additively modified, especially since there are no other references to it. However, let's say you're building a code editor and you change the name of a function. There's no reason to re-parse the entire file; you know the location of the function in the file and where it sits in the AST, so the parser should be able to just parse the surrounding change and apply the delta to the tree.

Another way that I think about this problem is through the lens of a stream of changes that happen to the tree over time. So let's say we have the following stream:

  1. Parse top-level node (delta: add root node)
  2. Parse func decl (delta: add child to root node)
  3. Parse func body (delta: add children to func decl node)

Right, these are some very high-level deltas. In the non-mutable world, I don't see how to avoid copying the top-level node multiple times with these three deltas. In the world of mutation, I'd simply append the child to the parent for each of these deltas.
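To make the three deltas concrete, here is a minimal sketch of both worlds. It is only an illustration, not Proteus code: `MutableNode`, `Node`, and `appending(_:at:)` are hypothetical names, and it uses current Swift syntax. The mutating version appends in place. The non-mutating version rebuilds only the nodes on the path from the root to the change and reuses every other subtree by reference (often called path copying).

```swift
// Mutating version: each delta appends a child in place.
final class MutableNode {
    let name: String
    var children: [MutableNode] = []
    init(_ name: String) { self.name = name }
}

let root = MutableNode("root")                    // delta 1
let decl = MutableNode("func decl")
root.children.append(decl)                        // delta 2
decl.children.append(MutableNode("func body"))    // delta 3

// Non-mutating version: a node never changes after it is created.
final class Node {
    let name: String
    let children: [Node]

    init(_ name: String, children: [Node] = []) {
        self.name = name
        self.children = children
    }

    // Returns a new tree with `child` added under the node reached by
    // following `path` (a list of child indices). Only the nodes on that
    // path are rebuilt; all other subtrees are shared by reference.
    func appending(_ child: Node, at path: ArraySlice<Int>) -> Node {
        guard let index = path.first else {
            return Node(name, children: children + [child])
        }
        var updated = children
        updated[index] = children[index].appending(child, at: path.dropFirst())
        return Node(name, children: updated)
    }
}

let v1 = Node("root")                                      // delta 1
let v2 = v1.appending(Node("func decl"), at: [])           // delta 2
let v3 = v2.appending(Node("func body"), at: [0])          // delta 3

// A later, unrelated change gets a new root but shares the first subtree.
let v4 = v3.appending(Node("second func decl"), at: [])
```

Each delta here still copies the child arrays along the path, so it is not free; the point of the sketch is only that the cost follows the depth of the change rather than the size of the whole tree, and that older versions (`v1` through `v3`) remain valid.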

Am I just misunderstanding something?

To me, the efficiency of the program is extremely important. That's the time that the user is going to spend waiting for your code to run; it's important.

Swift and Visual Studio Code

If you've been living under a rock, you might not know that Microsoft released a new editor (focused on web development): Visual Studio Code.

So… is it really just for web development? Honestly, it doesn't matter if it starts out that way because the thing that Microsoft tends to always get right is extensibility. We'll be able to start building plug-ins as the product matures to enable all sorts of development, including Swift!

Syntax Highlighting

Ok, the first thing that we need are the nice colors! This is actually fairly trivial to do:

  1. Open up the package contents of the Visual Studio Code app
  2. Navigate to Contents/Resources/app/plugins
  3. Create a new folder: vs.languages.swift
  4. Place these three files into the created folder:
    1. swiftDef.js
    2. swiftMain.js
    3. ticino.plugin.json
  5. Restart Visual Studio Code and open a Swift file!

You should see something like this:

Swift with basic syntax highlighting.

Note that this is just the most basic of highlighting that supports basic comments, strings, and keywords. More to come later!

Building

Next up, build errors! Again, this is also quite trivial to set up. You'll need to create a new build task. You can do this in a couple of ways. The easiest is:

  1. ⌘P to bring up the command palette
  2. Type build and press RETURN
  3. There will be a prompt that you have no tasks; create the task
  4. This will create a .settings folder with a tasks.json file

The contents of the file should be:

{
    "version": "0.1.0",
    "command": "swiftc",
    "showOutput": "silent",

    "args": ["main.swift"],

    "problemMatcher": {
        "pattern": {
            "regexp": "^(.*):(\\d+):(\\d+):\\s+(warning|error):\\s+(.*)$",
            "file": 1,
            "line": 2,
            "column": 3,
            "severity": 4,
            "message": 5
        }
    }
}
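To see what the problemMatcher pattern captures, here is a small sketch that applies the same regular expression to one diagnostic line. The sample line and the `parseDiagnostic` helper are assumptions made up for illustration (current Swift syntax, using Foundation's NSRegularExpression); the group numbers line up with the file, line, column, severity, and message fields in the task file.

```swift
import Foundation

// Same expression as in tasks.json (the escaping is identical in Swift).
let pattern = "^(.*):(\\d+):(\\d+):\\s+(warning|error):\\s+(.*)$"

// Hypothetical helper: splits one compiler diagnostic into its parts.
func parseDiagnostic(_ line: String)
    -> (file: String, line: Int, column: Int, severity: String, message: String)? {
    guard let regex = try? NSRegularExpression(pattern: pattern),
          let match = regex.firstMatch(in: line,
                                       range: NSRange(line.startIndex..., in: line))
    else { return nil }

    // Returns the text of capture group `index`.
    func group(_ index: Int) -> String {
        guard let range = Range(match.range(at: index), in: line) else { return "" }
        return String(line[range])
    }

    guard let lineNumber = Int(group(2)), let column = Int(group(3)) else { return nil }
    return (group(1), lineNumber, column, group(4), group(5))
}

// Made-up sample in the file:line:column: severity: message shape.
let sample = "main.swift:3:9: error: use of unresolved identifier 'foo'"
if let diagnostic = parseDiagnostic(sample) {
    print(diagnostic.file, diagnostic.line, diagnostic.column,
          diagnostic.severity, diagnostic.message)
}
```

Lines that don't have that shape, such as plain progress output, simply don't match and are ignored by the matcher.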

Now you can press ⇧⌘B to build. If you have an error, press ⇧⌘M to show the error pane.

Swift with a build error shown.

Next Steps

I'll probably be taking a look at making the editing experience better by playing around with the options for syntax highlighting. I also want to take a look at building a language service, though, officially this is not yet supported.

Goodie Bin

Ok… if you want to contribute to this, I've actually posted a full GitHub repo here: https://github.com/owensd/vscode-swift.

What I actually have done, instead of creating a directory, is create a link within the Visual Studio Code package to my repo location on disk.

  1. cd /Applications/Visual\ Studio\ Code.app/Contents/Resources/app/plugins
  2. ln -s <path/to/repo> vs.languages.swift
  3. Restart Visual Studio Code

Voilà!

Switch Statement Matching

What is the output of this program?

let x = 5, y = 10

switch (x, y) {
case (_, _):
    println("_, _")

case (5, _):
    println("5, _")

case (_, 10):
    println("_, 10")

case (5, 10):
    println("5, 10")
}

If you said, "_, _", then great! The first case matches every pair, so none of the later cases is ever reached. If you're sad that case (5, 10): doesn't match… well, you're not alone.

Friendly reminder: always read your switch-statements like a chain of if-statements. With pattern matching in Swift, this could be the source of some inadvertent bugs, especially when adding new case clauses.
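Reading the switch as an if-chain also suggests the fix: order the cases from most specific to least specific. A minimal sketch in current Swift syntax (the `describe` function is just a hypothetical wrapper so the result can be checked):

```swift
// Cases are tried top to bottom, so the catch-all goes last.
func describe(_ x: Int, _ y: Int) -> String {
    switch (x, y) {
    case (5, 10): return "5, 10"   // most specific first
    case (5, _):  return "5, _"
    case (_, 10): return "_, 10"
    case (_, _):  return "_, _"    // matches everything that is left
    }
}

print(describe(5, 10))
print(describe(1, 2))
```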

UPDATE: I filed rdar://21001503 about having a warning for case statements that cannot be reached.

Brent’s Week 3 Bike Shed

If you're not following Brent Simmons, then go here first: http://inessential.com/langvalue.

Ok, now that you've read that, you should probably stop reading here until you go and try to solve the problems there. For extra points, only pick one of the struct/enum choices and pick one of the protocol based choices.

I'll wait…


If you've been following along with any of Brent's posts, you might notice that he has (intentionally?) created a problem that will help you uncover some of the limitations of Swift's type system based on some of his own frustrations (see his KVC diary posts). Nonetheless, it is good to compare what options are on the table when a problem like he presented is put before us.

Now, if you've only played with Swift on the surface, you might not know the gotchas yet. Also, you've probably heard a bit about "PROTOCOLS, PROTOCOLS, PROTOCOLS!". Naturally, you might say to yourself, "AHA! The protocol solution must be the 'right one', 'best one', or the 'easiest one' to implement."

However, if you have been around Swift for a while, I'm sure you noticed that Brent has fiendishly put a requirement of Set<T>.

Meme - Train on bridge falling off, Caption: Use Protocols They Said... It's Awesome They Said...

So there's a dirty little (apparent) problem with protocols today… once they get a Self requirement, heterogeneous usage of them goes out the window for any strongly typed collection.

But… maybe this is a sign. Maybe the "protocols first" approach tells us something. After all, implementing the protocol version of Brent's question is actually not even supported today in Swift (option 4 at least).

Solving the Problem

Let me suggest something: the spec artificially makes the problem harder and implies that protocols should be used in a way that I do not think Swift intends you to use them. Protocols are not types, they do not behave like types, and they cannot be used like types.

I think the better option to solving the problem presented is option #5: using enums and protocols.

Enums

The problem statement is very clear: there are three, and only three, different type constructs that are allowed. Namely:

  • Integer
  • String
  • Table

To me, this clearly implies that an enum is going to be the right structure to use.

enum LangValue {
    case IntegerValue(Int)
    case StringValue(String)
    indirect case TableValue([String:LangValue])
}

Beautiful. Seriously, I think that is an extremely clear and precise model of exactly what we want.

Now, we need to add a type function. So the complete, initial model looks like this:

enum LangValueType {
    case Integer
    case String
    case Table
}

enum LangValue {
    case IntegerValue(Int)
    case StringValue(String)
    indirect case TableValue([String:LangValue])

    var type: LangValueType {
        switch self {
        case .IntegerValue(_): return .Integer
        case .StringValue(_): return .String
        case .TableValue(_): return .Table
        }
    }
}

Protocols

OK, now here is where I think protocols should be used. You see, the spec now asks for four different classes of behaviors:

  • Convertible to an Integer
  • Convertible to a String
  • Addable
  • Storable (my term for the dictionary-like operations)

To me, this screams protocols.

protocol IntegerConvertible {
    func integerValue() throws -> Int
}

protocol StringConvertible {
    func stringValue() throws -> String
}

protocol Addable {
    typealias AddableType
    func add(other: AddableType) throws -> AddableType
}

protocol Storeable {
    typealias ValueType
    typealias KeyType

    mutating func set(object object: ValueType, forKey key: KeyType) throws
    mutating func remove(forKey key: KeyType) throws
    func object(forKey key: KeyType) throws -> ValueType?
    func keys() throws -> [KeyType]
}

Notice something: none of these protocols are tied to the LangValue type. If a type is needed, a typealias is used to make this generic so that these protocols can be applied to any type. This is key (and goes a bit against some of my previous recommendations about not making protocols generic).
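As a quick check that nothing here is tied to LangValue, the sketch below applies Addable to an unrelated type. `Meters` is a made-up type for illustration, and the code uses current Swift syntax, where the placeholder inside a protocol is spelled `associatedtype` rather than `typealias` and the parameter takes an unlabeled first argument.

```swift
protocol Addable {
    associatedtype AddableType
    func add(_ other: AddableType) throws -> AddableType
}

// Hypothetical type that has nothing to do with LangValue.
struct Meters: Addable {
    let value: Double

    // AddableType is inferred to be Meters from this signature.
    func add(_ other: Meters) throws -> Meters {
        return Meters(value: value + other.value)
    }
}

if let total = try? Meters(value: 1.5).add(Meters(value: 2.0)) {
    print(total.value)
}
```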

Now it's simply a matter of applying these protocols to our type:

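import Foundation  // for NSString

// LangCoercionError is not shown in this post; it is assumed to be an
// ErrorType enum defined alongside LangValue (see the full source below).

extension LangValue : IntegerConvertible {
    func integerValue() throws -> Int {
        switch self {
        case let .IntegerValue(value): return value
        case let .StringValue(value): return (value as NSString).integerValue
        default: throw LangCoercionError.InvalidInteger
        }
    }
}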

extension LangValue : StringConvertible {
    func stringValue() throws -> String {
        switch self {
        case let .IntegerValue(value): return NSString(format: "%d", value) as String
        case let .StringValue(value): return value
        default: throw LangCoercionError.InvalidString
        }
    }
}

extension LangValue : Addable {
    func add(other: LangValue) throws -> LangValue {
        switch (self, other) {
        case let (.IntegerValue(lvalue), .IntegerValue(rvalue)):
            return .IntegerValue(lvalue + rvalue)

        case (.StringValue(_), .IntegerValue(_)): fallthrough
        case (.IntegerValue(_), .StringValue(_)): fallthrough
        case (.StringValue(_), .StringValue(_)):
            return try LangValue.StringValue(self.stringValue() + other.stringValue())

        default: throw LangCoercionError.InvalidAddition
        }
    }
}

extension LangValue : Storeable {
    mutating func set(object object: LangValue, forKey key: String) throws {
        switch self {
        case let .TableValue(table):
            var copy = table
            copy[key] = object
            self = LangValue.TableValue(copy)

        default: throw LangCoercionError.NotOfTypeTable
        }
    }

    mutating func remove(forKey key: String) throws {
        switch self {
        case let .TableValue(table):
            var copy = table
            copy.removeValueForKey(key)
            self = LangValue.TableValue(copy)

        default: throw LangCoercionError.NotOfTypeTable
        }
    }

    func object(forKey key: String) throws -> LangValue? {
        switch self {
        case let .TableValue(table):
            return table[key]

        default: throw LangCoercionError.NotOfTypeTable
        }
    }

    func keys() throws -> [String] {
        switch self {
        case let .TableValue(table):
            return Array(table.keys)

        default: throw LangCoercionError.NotOfTypeTable
        }
    }
}
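For a feel of how this reads at the call site, here is a trimmed-down sketch of the same idea in current Swift syntax. It is not the code from the gist: the lowercase case names, the `LangCoercionError` cases, and the choice to keep only `stringValue` and `add` are assumptions made to keep the example short.

```swift
// Assumed error type; the post does not show its definition.
enum LangCoercionError: Error {
    case invalidString
    case invalidAddition
}

enum LangValue {
    case integer(Int)
    case string(String)
    indirect case table([String: LangValue])

    func stringValue() throws -> String {
        switch self {
        case .integer(let value): return String(value)
        case .string(let value): return value
        case .table: throw LangCoercionError.invalidString
        }
    }

    func add(_ other: LangValue) throws -> LangValue {
        switch (self, other) {
        case let (.integer(lvalue), .integer(rvalue)):
            return .integer(lvalue + rvalue)
        case (.string, .integer), (.integer, .string), (.string, .string):
            // Mixed or string operands are concatenated as strings.
            return .string(try stringValue() + other.stringValue())
        default:
            // Tables cannot take part in addition.
            throw LangCoercionError.invalidAddition
        }
    }
}

do {
    let sum = try LangValue.integer(1).add(.integer(2))
    let text = try LangValue.string("a").add(.integer(5))
    print(try sum.stringValue(), try text.stringValue())
    _ = try LangValue.table([:]).add(.integer(1))
} catch {
    print("coercion failed: \(error)")
}
```

The multi-pattern case stands in for the fallthrough chain above; both express the same "any string involved means concatenation" rule.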

Conclusion

So that's it, that's the basic approach to the problem that I think is a concise, clear model of the specification. It doesn't exactly fit the asks from the four options, but I think it is the better approach.

Full source code here: https://gist.github.com/owensd/33d85872c15c2b496515

Beware of the Enum

So apparently enums are still a flaky beast (Xcode 7.0 GM).

enum Awesome {
    case HowAwesome(String)
    case KindaAwesome
    case NotReallyAwesome
}

let a = Awesome.HowAwesome("THIS AWESOME")
let b = Awesome.KindaAwesome
let c = Awesome.NotReallyAwesome

print(a)
print(b)
print(c)

Ok, this code seems fine, and the output is:

HowAwesome("THIS AWESOME")
KindaAwesome
NotReallyAwesome

Now, let's imagine a new case statement is desired…

enum Awesome {
    case HowAwesome(String)
    case KindaAwesome
    case NotReallyAwesome
    case WithOutMeYouWouldBeHappy(String)
}

let a = Awesome.HowAwesome("THIS AWESOME")
let b = Awesome.KindaAwesome
let c = Awesome.NotReallyAwesome

print(a)
print(b)
print(c)

Here's the output:

HowAwesome("THIS AWESOME")
KindaAwesome
KindaAwesome

Yeah, not really that awesome. rdar://22915709.

Here's the clip from last night: https://youtu.be/DA-wjc6hwME?t=51m30s

UPDATE: 9:30am PDT, Sept 30, 2015

So yay! Maybe I should just switch to the beta…

Associated Enum Cases as Types

This is a bit of a follow-up from my Enums with Associated Data vs. Structs. If you haven’t seen that, some of this might be a little out-of-context.

Another issue I run into with working with enums with associated data is actually getting the data out of them! Let’s take a look at the following code sample:

enum Token  {
    case Keyword(keyword: String, offset: Int)
    case Identifier(identifier: String, offset: Int)
}

//
// The declarations; no real issues here.
//

let importkey = Token.Keyword(keyword: "import", offset: 0)
let funckey = Token.Keyword(keyword: "func", offset: 0)

//
// The ways to match on a value and get some data out. Ugh!
//

if case Token.Keyword("import", _) = importkey {
    print("import keyword")
}

if case let Token.Keyword(keyword, _) = importkey {
    if keyword == "import" { print("import keyword") }
    if keyword == "func" { print("func keyword") }
}

switch importkey {
case let .Keyword(keyword): print("\(keyword) keyword")
case let .Identifier(identifier): print("\(identifier)")
}

All of those suck, in my opinion. There are a few problems:

  1. If I want to store the result, I end up with a really stupid construct.
  2. If there are multiple associated data fields, I get lots of _ in the matches.
  3. They use this = somevar syntax that is completely unlike anything else in Swift.

What I wish had happened was to treat the case values as types. I’ve modelled what that would look like here:

enum Token  {
    case Keyword(keyword: String, offset: Int)
    case Identifier(identifier: String, offset: Int)

    var Keyword_ : (keyword: String, offset: Int)? {
        if case let .Keyword(keyword, offset) = self {
            return (keyword, offset)
        }
        return nil
    }

    var Identifier_ : (identifier: String, offset: Int)? {
        if case let .Identifier(identifier, offset) = self {
            return (identifier, offset)
        }
        return nil
    }
}

Imagine that Keyword_ and Identifier_ were implicitly created by the compiler. The idea is to make use of the other constructs of the language, in this case, Optional to allow us to access the inner details of the enum.

// It's a tuple, so access by index is allowed.
if importkey.Keyword_?.0 == "import" {
    print("import keyword")
}

// These are named tuples, so access by name is allowed.
if importkey.Keyword_?.keyword == "import" {
    print("import keyword")
}

if let keyword = importkey.Keyword_?.keyword {
    print("\(keyword) keyword")
}

Now, here’s the part I really don’t understand. Swift is already essentially doing this; we just don’t have access to it in any way other than the case pattern matching stuff. I think it would have been preferable to do this:

if case ("import", _) = importkey.Keyword {
    print("import keyword")
}

To me, this makes it clear that tuples are all matched the exact same way. Instead, we have associated data in enums (which is essentially a tuple) matched one way, and tuples, which may have the exact same structure, matched in a completely different way.

rdar://22704262

On Demand Resources and Games

Alright, this article over on imore.com, linked by Daring Fireball, about how on-demand resources isn’t going to be a problem for games got me a bit riled up.

On-demand resources is fine for some classes of games. However, this is not true for games like XCOM. The desktop version of this game clocks in at 20GB (Enemy Within) [1]. There is no amount of tagging, stripping, or slicing that is going to get a company like Firaxis Games to deliver a desktop quality game on Apple’s supposed desktop class hardware because desktop (and console) quality games are bigger than 2GB.

Let’s take the defense of this, from the article:

But: You have a 4GB game! How do you get those other 25 levels?

Easily, thanks to the power of background processing. On-Demand Resources works in conjunction with whatever your user is actively accessing, and will flush older, unused content to make room for additional resources. If a user is playing level 24 of your game, the system automatically flushes a few 100MB tags of old levels (say, 1-5) to make room for levels 25-30. As the user gets further into your game, older levels drop off and get deleted from the Apple TV, and your new levels (also in tag bundles) get installed – all in the background.

Let’s just play out this scenario: I am playing a game and I’m on level 25 (older levels are now purged because 2GB is enough for any game). Now, my son or daughter comes in and they want to play, but they need to start a new character because they haven’t played before. Ok, they go to play the game…

beach ball of death

Oh… pardon me, I need to download those levels… Meanwhile, while this is downloading, other resources are being dumped out (I sure hope they are not the levels that I’m playing). You see, there’s no telling how much overlap in game assets there is going to be between levels 1-5 and 25-30. If a game is 10-15GB, it’s reasonable to assume that there is not a lot of overlap of resources between levels as you progress through the game.

The kids get tired of waiting for the levels to download, so they go do something else. I then go and try to play my levels, and of course… data has been purged and more assets are coming in. This is fun!

There is another description for this phenomenon in computer science: thrashing.
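The scenario can be played out in a toy simulation. Everything in it is an assumption for illustration: the `TagCache` type, a cache that holds exactly five level tags, and a least-recently-used purge policy (how the system actually picks what to purge isn't something I know). The sketch only counts how many level downloads happen when one player keeps playing versus when two players alternate.

```swift
// A cache of downloaded level tags with least-recently-used eviction.
struct TagCache {
    let capacity: Int
    private(set) var resident: [Int] = []   // least recently used first
    private(set) var downloads = 0

    init(capacity: Int) { self.capacity = capacity }

    mutating func play(levels: ClosedRange<Int>) {
        for level in levels {
            if let index = resident.firstIndex(of: level) {
                resident.remove(at: index)   // already on the device
            } else {
                downloads += 1               // has to be fetched again
                if resident.count == capacity { resident.removeFirst() }
            }
            resident.append(level)           // mark as most recently used
        }
    }
}

// One player, four sessions on the same levels.
var solo = TagCache(capacity: 5)
for _ in 1...4 { solo.play(levels: 25...29) }

// Two players taking turns, four sessions in total.
var shared = TagCache(capacity: 5)
for _ in 1...2 {
    shared.play(levels: 25...29)   // me
    shared.play(levels: 1...5)     // the kids, starting a new character
}

print("solo downloads: \(solo.downloads), shared downloads: \(shared.downloads)")
```

In this toy setup the single player downloads each level once, while the shared device re-downloads every level on every turn, since each session evicts exactly what the next one needs.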

Now, this probably works better on iOS devices because those are mostly single-user devices. However, the TV is centralized and consumed by multiple individuals.

It’s decisions like this and the game controller decision (which is a fascinating case of stealth documentation changes) that tell me Apple just doesn’t care to really enable high-quality gaming on tvOS. Instead, companies are basically going to bring their iOS versions over, which I find so disheartening, especially since disk space is so cheap these days. Put a 1TB fusion drive in the device and charge $50 more, or stop teasing us with actually making the iOS and tvOS platforms a contender for more than just casual games.

  1. Now, Firaxis stripped out a LOT and was able to get the iOS version down to 2GB. However, it took a big hit on what it could actually deliver.

Enums with Associated Data vs. Structs

Maybe everyone can help me out here: I'm trying to find out what the value of enums with associated values is over structs that conform to a protocol.

Here's the context: I'm playing around with the parser code and I'm using an enum to describe what statements might look like in Proteus.

Let's say it looks like this:

enum Statement {
    case Import(keyword: Token, packageName: Token)
    case Assignment(binding: Token, value: Expression)
} 

Ok, so here's the issue. When I'm writing the parser, I want to have a function parseImport() to handle the actual parsing of the import construct. However, Swift doesn't allow me to define the function definition that I actually want:

func parseImport() throws -> Statement.Import { ... }

The thing is, I don't want parseImport to return any of the possible values for Statement; I only want the function to be able to return the specific case of Import. Furthermore, the data stored within each of the associated values isn't related.

So the question is, when is an associated value enum a better choice than structs? You see, I could define the above this way too:

protocol Statement {}

struct Import : Statement {
    let keyword: Token
    let packageName: Token
}

struct Assignment : Statement {
    let binding: Token
    let value: Expression
}

func parseImport() throws -> Import { ... }

The only clear win I see is that the enum version is more terse. At the end of the day, the usage code is pretty similar in the fact that each needs to determine what specific type of Statement is being dealt with, except I can remove the switch necessary to unpack the Statement.Import version.
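Here is a rough sketch of what the usage code for the struct version could look like, in current Swift syntax. `Token` and `Expression` are hypothetical stand-ins for the real parser types, `parseImport` is stubbed out (and doesn't throw, to keep things short), and `describe` is a made-up consumer.

```swift
// Hypothetical stand-ins for the parser's real types.
struct Token { let text: String }
struct Expression { let text: String }

protocol Statement {}

struct Import: Statement {
    let keyword: Token
    let packageName: Token
}

struct Assignment: Statement {
    let binding: Token
    let value: Expression
}

// The return type says exactly what this function produces.
func parseImport() -> Import {
    return Import(keyword: Token(text: "import"),
                  packageName: Token(text: "stdlib"))
}

// Code that accepts any Statement still has to work out which one it got.
func describe(_ statement: Statement) -> String {
    switch statement {
    case let s as Import:
        return "import of \(s.packageName.text)"
    case let s as Assignment:
        return "assignment to \(s.binding.text)"
    default:
        // Needed because any type may conform to Statement.
        return "unknown statement"
    }
}

let parsed = parseImport()
print(parsed.packageName.text)   // no unpacking needed here
print(describe(parsed))
```

The `default:` clause is the cost of this version: the set of statements is open, so the compiler cannot check the switch for exhaustiveness the way it can with the enum.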

There has to be some benefit, right? Right now, I only see a downside, especially given the fact that I cannot model the desire to have a function return only a specific type of enum value.

UPDATE Sunday, September 13th, 2015 @ 11:49 PM PDT

I should have been a little less cavalier about the "some benefit". I understand that enums limit the potential values while protocols and structs allow this to be unbound. That's not really the problem though. The problem is about the coupling of the parts that make up the associated value of the enum with the notion of the actual type itself.

One suggestion to my dilemma was to essentially do the following:

typealias ImportStatement = (keyword: Token, packageName: Token)
typealias AssignmentStatement = (binding: Token, value: Expression)

enum Statement {
    case Import(ImportStatement)
    case Assignment(AssignmentStatement)
}

This allows the parseImport to return an ImportStatement instead. Of course, I then need to turn that into a Statement in my calling code. That kinda works; though it's much more tedious to actually do that, and it puts the construction of the actual type in the wrong location. Another problem with that approach is the assumption that I own the code. What if I was using a library that provided the enum, such as the one in Apple's Swift documentation:

enum Barcode {
    case UPCA(Int, Int, Int, Int)
    case QRCode(String)
}

Let's say I'm writing a parser for that and I want to write the parseUPCA function. What is the return type? Do I provide the loosely-typed version, parseUPCA() -> (Int, Int, Int, Int), and then have to push the creation of the Barcode.UPCA instance to the caller? (And if the UPCA data changes, it's more than a usage code update: I need to update all places that used the parts too, which is potentially a much more difficult search-and-replace fix.) Or do I return a Barcode and hope that I only ever get the UPCA version out of it?

I guess I just don't find any of the options available to us particularly good. In my code, Barcode.UPCA is altogether a different type than Barcode.QRCode; I'd simply like to be able to actually model that.

So I need to ask myself: allow for the proper modelling of the type signatures or allow for the proper modelling of the closed nature of the types; I don't know how to do both in Swift today.

Language Design: Declarations

In case you haven’t caught on, the “Language Design” posts are basically me rambling about ideas during the development of the Proteus language.

I’m implementing the first part of the parser that actually handles declarations… The question on my mind is whether to use a simplified syntax or introduce keywords. I’m going to put this out here first so everyone can gasp at once: I don’t like the idea of const; I’m not even sure the concept will be in the language.

So let’s look at some declarations:

// With a keyword (option 1)
var foo: i32 = 12    // declare and assign; type is explicit
var foo = 12         // declare and assign; type is inferred
var foo: i32         // declare; type is explicit

// No keyword used (option 2)
foo : i32 = 12    // declare and assign; type is explicit
foo := 12         // declare and assign; type is inferred
foo : i32         // declare; type is explicit
foo :: i32        // declare; type is explicit (two :: for clarity)

Option 1: Using a Keyword

At first look, I’m not really sure that the keyword is worth it. For one, I’m not sure if I’m going to have a corresponding thing like let. For another, it’s just more typing. Is there a true benefit to it?

Option 2: No Keyword

There is something nice about this approach; it’s clean, fewer characters, and still plays nice with type inference. Ok, so this is a clear winner, yeah? Well, I’m not so sure.

Defining Other Types

Ok, so what about other types, like functions, structs, and enums? Now it starts to get a bit more interesting.

Functions
// With a keyword (option 1)
func foo(i32, i32) -> i32 = (x, y) { return x + y }      // declare and assign; type is explicit
var foo = func (x: i32, y: i32) -> i32 { return x + y }  // declare and assign; type is inferred
var foo: func (i32, i32) -> i32                          // declare; type is explicit

var foo = (x: i32, y: i32) -> i32 { return x + y }       // declare and assign; type is inferred
var foo: (i32, i32) -> i32                               // declare; type is explicit

// No keyword used (option 2)
foo : (i32, i32) -> i32 = (x, y) { return x + y }      // declare and assign; type is explicit
foo := (x: i32, y: i32) -> i32 { return x + y }        // declare and assign; type is inferred
foo : (i32, i32) -> i32                                // declare; type is explicit
foo :: (i32, i32) -> i32                               // declare; type is explicit (two :: for clarity)

With functions, it looks like option 2 is coming out ahead. I really like that there is explicit consistency throughout. With the keyword version, it starts to get really cumbersome. In order to fix that, special cases need to be introduced to start dropping some of the verbosity.

Also, the var keyword in option 1 is pretty problematic. Can foo really be assigned to something else? Now it seems that a let keyword has to be introduced just to support functions. Though, option 2 has this problem as well to some extent. However, the foo that are just declarations could be treated as a typealias instead, and a variable that points to a function would be declared as: foo :: *(i32, i32) -> i32.

Again, option 2 looks like it’s pulling ahead as the cleanest choice.

Structs
// With a keyword (option 1)
struct Foo {
    f: i32
}

// No keyword used (option 2)
Foo :: {
    f: i32
    defaultValue := 32
}

Ok, option 2 seems a bit strange in this scenario, but that could just be my eyes are so used to the first style.

Enums
// With a keyword (option 1)
enum Foo {
    ChooseMe
    SpecificValue
}

// No keyword used (option 2)
Foo :: {
    SomeValue
    SpecificValue   := 32
}

Ok… that option 2 doesn’t work; it’s ambiguous now. Hmm… something has to change here, and while some syntactic sugar could be added, like using | to separate the cases, this is starting to look like a general problem of making declarations less ambiguous at a quick glance.

Foo :: enum {
    SomeValue
    SpecificValue   := 32
}

If an approach like that is taken, then why is a function treated differently? Right, it should be foo :: func (x: i32) -> i32. If a keyword is going to need to be used, well, then it seems like the keyword-first approach is both the easier construct to use and easier to extend.

Conclusion

So where does this leave me? Well… I think I’m actually going to take the hybrid approach for now and treat variable declarations fundamentally differently from type declarations. We’ll see how that goes for now.

Here is a short summary of the basic decision:

foo := 12

func add(x: i32, y: i32) -> i32 { return x + y }

// This code, for now at least, will be invalid.
foo := (x: i32, y: i32) -> i32 { return x + y }

// Instead, this would need to be used.
foo := &add

struct BigFoo {
    littleFoo: i32
}

enum ChooseFoo {
    DefaultChoice
    MyChoice := 5
}

As always, feel free to leave any comments in the Proteus Discussion Group.

Sidebar: Yeah, sure, technically the let says that the value it holds is constant, and yes, it’s technically true that only requires that the pointer address cannot be changed. However, this is what I mean by “semantically incorrect”. How do you say that not only is the pointer immutable, but the value that I point to is also immutable? This starts to add a bunch of complexity that I don’t want to deal with at the moment.

Proteus – The First Couple of Weeks

Here it is, the first update for September. I’ve been thinking a bunch about what I want from the language and what I think I want the shape of it to be. I’ve come to the following basic conclusions:

  1. The syntax will gently evolve from C to Proteus. We’ll take changes a few at a time while the language starts to develop.
  2. To ensure we’re building something real, I’ll be porting the code from Handmade Hero from C to Proteus [1].

So, the last language design article had this snippet:

import stdlib

let arguments = environment.arguments
guard if arguments.count != 2:
    println "Invalid usage!"; exit -1

let filename = arguments[1]
println "compiling file: {0}\n" filename
guard let file = sys.io.fopen filename sys.io.flags.read:
    println "Unable to open file."; exit -1

guard let content = sys.io.read file:
    println "Unable to read the entire file."; exit -1

println "file size: {0}" file.size
println "file contents: \n{0}" content

The more I played around with this style, the more concerned I became that there simply isn’t enough lexical information to make it easily scannable and parseable by humans. Also, I do wonder about the block distinction; I’ve gone back and forth on this. In one sense, I really like the significant whitespace, but on the other, I like the visual clarity of { and }; they are easier to grep while scanning quickly [2].

With that in mind, the following will be the focus of the syntactical changes from C in the first iteration:

  1. Switch the order of type declaration
  2. Stub out the import mechanism
  3. Remove the ; for statement termination; it’s only used for multiple statements per line now.

Note: One other benefit of doing the migration from C to Proteus like this is the ability to convert a C file to a Proteus file (for the most part).

This means that the above sample will look like this:

import stdlib

let arguments = environment.arguments
guard if arguments.count != 2 {
    println("Invalid usage!"); exit(-1)
}

let filename = arguments[1]
println("compiling file: {0}\n", filename)
guard let file = sys.io.fopen(filename,sys.io.flags.read) {
    println("Unable to open file."); exit(-1)
}

guard let content = sys.io.read(file) {
    println("Unable to read the entire file."); exit(-1)
}

println("file size: {0}", file.size)
println("file contents: \n{0}", content)

The work items over the next couple of weeks boil down to:

  1. Finish up the initial lexer (scanner and tokenizer are basically there already)
  2. Start generating executable code (the first round of output will generate C code)
  3. Start getting the C interop in place

Also, I’ve got a spot set up for the location of the Proteus documentation: https://owensd.io/proteus.


  1. Handmade Hero is currently closed source. As Proteus matures, I may look into adding an official fork in the GitHub repo. However, until that time, I’ll only talk about my experiences and show a few snippets; I will not be releasing the Proteus version of Handmade Hero until it’s officially released. 
  2. Of course, if the starting { is offscreen, then the } might not be as helpful… I don’t know yet.