Language Design: Declarations

In case you haven’t caught on, the “Language Design” posts are basically me rambling about ideas during the development of the Proteus language.

I’m implementing the first part of the parser that actually handles declarations… The question on my mind is whether to use a simplified syntax or introduce keywords. I’m going to put this out here first so everyone can gasp at once: I don’t like the idea of const; I’m not even sure the concept will be in the language.

So let’s look at some declarations:

// With a keyword (option 1)
var foo: i32 = 12    // declare and assign; type is explicit
var foo = 12         // declare and assign; type is inferred
var foo: i32         // declare; type is explicit

// No keyword used (option 2)
foo : i32 = 12    // declare and assign; type is explicit
foo := 12         // declare and assign; type is inferred
foo : i32         // declare; type is explicit
foo :: i32        // declare; type is explicit (two :: for clarity)

Option 1: Using a Keyword

At first look, I’m not really sure that the keyword is worth it. For one, I’m not sure if I’m going to have a corresponding keyword like let. For another, it’s just more typing. Is there a true benefit to it?

Option 2: No Keyword

There is something nice about this approach; it’s clean, fewer characters, and still plays nice with type inference. Ok, so this is a clear winner, yeah? Well, I’m not so sure.

Defining Other Types

Ok, so what about other types, like functions, structs, and enums? Now it starts to get a bit more interesting.

// With a keyword (option 1)
func foo(i32, i32) -> i32 = (x, y) { return x + y }      // declare and assign; type is explicit
var foo = func (x: i32, y: i32) -> i32 { return x + y }  // declare and assign; type is inferred
var foo: func (i32, i32) -> i32                          // declare; type is explicit

var foo = (x: i32, y: i32) -> i32 { return x + y }       // declare and assign; type is inferred
var foo: (i32, i32) -> i32                               // declare; type is explicit

// No keyword used (option 2)
foo : (i32, i32) -> i32 = (x, y) { return x + y }      // declare and assign; type is explicit
foo := (x: i32, y: i32) -> i32 { return x + y }        // declare and assign; type is inferred
foo : (i32, i32) -> i32                                // declare; type is explicit
foo :: (i32, i32) -> i32                               // declare; type is explicit (two :: for clarity)

With functions, it looks like option 2 is coming out ahead. I really like that there is explicit consistency throughout. With the keyword version, it starts to get really cumbersome. In order to fix that, special cases need to be introduced to start dropping some of the verbosity.

Also, the var keyword in option 1 is pretty problematic. Can foo really be assigned to something else? Now it seems that a let keyword has to be introduced just to support functions. Option 2 has this problem as well to some extent. However, the foo forms that are just declarations could be treated as a typealias instead, and a variable that points to a function would be declared as: foo :: *(i32, i32) -> i32.

Again, option 2 looks like it’s pulling ahead as the cleanest choice.

// With a keyword (option 1)
struct Foo {
    f: i32
}

// No keyword used (option 2)
Foo :: {
    f: i32
    defaultValue := 32
}

Ok, option 2 seems a bit strange in this scenario, but that could just be because my eyes are so used to the first style.

// With a keyword (option 1)
enum Foo {
}

// No keyword used (option 2)
Foo :: {
    SpecificValue   := 32
}

Ok… that option 2 doesn’t work; it’s ambiguous now. Hmm… something has to change here, and while some syntactic sugar could be added, like using | to separate the cases, this is starting to look like a general problem of making declarations less ambiguous at a quick glance.

Foo :: enum {
    SpecificValue   := 32
}

If an approach like that is taken, then why is a function treated differently? Right, it should be foo :: func (x: i32) -> i32. If a keyword is going to need to be used, well, then it seems like the keyword-first approach is both the easier construct to use and easier to extend.


So where does this leave me? Well… I think I’m actually going to take the hybrid approach and treat variable declarations fundamentally differently from type declarations. We’ll see how that goes for now.

Here is a short summary of the basic decision:

foo := 12

func add(x: i32, y: i32) -> i32 { return x + y }

// This code, for now at least, will be invalid.
foo := (x: i32, y: i32) -> i32 { return x + y }

// Instead, this would need to be used.
foo := &add

struct BigFoo {
    littleFoo: i32
}

enum ChooseFoo {
    MyChoice := 5
}
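
One way to picture the hybrid rule is as a dispatch on the first word of a declaration: a leading keyword means a function or type declaration, and a bare name followed by `:` or `:=` means a variable. The Swift sketch below is only an illustration under that assumption. `DeclarationKind` and `classify` are made-up names, and splitting on spaces stands in for a real scanner.

```swift
// Hypothetical sketch: classify a declaration by its leading words.
// `DeclarationKind` and `classify` are illustrative names, not Proteus parser APIs.
enum DeclarationKind: Equatable {
    case function, structure, enumeration, variable, unknown
}

func classify(_ line: String) -> DeclarationKind {
    // Splitting on spaces is a shortcut; a real scanner would produce tokens.
    let words = line.split(separator: " ").map(String.init)
    guard let first = words.first else { return .unknown }

    switch first {
    case "func":   return .function
    case "struct": return .structure
    case "enum":   return .enumeration
    default:
        // No leading keyword: expect `name : type` or `name := value`.
        if words.count > 1, [":", ":="].contains(words[1]) { return .variable }
        return .unknown
    }
}

print(classify("foo := 12"))
print(classify("enum ChooseFoo {"))
```

The appeal of this shape is that the parser never has to look past the first token or two to know which kind of declaration it is in, which is the ambiguity the keyword-free enum ran into.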

As always, feel free to leave any comments in the Proteus Discussion Group.

Sidebar: Yeah, sure, technically the let says that the value it holds is constant, and yes, it’s technically true that this only requires that the pointer address cannot be changed. However, this is what I mean by “semantically incorrect”. How do you say that not only is the pointer immutable, but the value it points to is also immutable? This starts to add a bunch of complexity that I don’t want to deal with at the moment.


Proteus – The First Couple of Weeks

Here it is, the first update for September. I’ve been thinking a bunch about what I want from the language and what I think I want the shape of it to be. I’ve come to the following basic conclusions:

  1. The syntax will gently evolve from C to Proteus. We’ll take changes a few at a time while the language starts to develop.
  2. To ensure we’re building something real, I’ll be porting the code from Handmade Hero from C to Proteus [1].

So, the last language design article had this snippet:

import stdlib

let arguments = environment.arguments
guard if arguments.count != 2:
    println "Invalid usage!"; exit -1

let filename = arguments[1]
println "compiling file: {0}\n" filename
guard let file = filename
    println "Unable to open file."; exit -1

guard let content = file:
    println "Unable to read the entire file."; exit -1

println "file size: {0}" file.size
println "file contents: \n{0}" content

The more I played around with this style, the more concerned I became that there simply isn’t enough lexical information to make it easily scannable and parseable by humans. Also, I do wonder about the block distinction; I’ve gone back and forth on this. In one sense, I really like the significant whitespace, but on the other, I like the visual clarity of { and }; they are easier to grep while scanning quickly [2].

With that in mind, the first iteration will focus on the following syntactic changes from C:

  1. Switch the order of type declaration
  2. Stub out the import mechanism
  3. Remove the ; for statement termination; it’s only used for multiple statements per line now.

Note: One other benefit of doing the migration from C to Proteus like this is the ability to convert a C file to a Proteus file (for the most part).

This means that the above sample will look like this:

import stdlib

let arguments = environment.arguments
guard if arguments.count != 2 {
    println("Invalid usage!"); exit(-1)
}

let filename = arguments[1]
println("compiling file: {0}\n", filename)
guard let file =, {
    println("Unable to open file."); exit(-1)
}

guard let content = {
    println("Unable to read the entire file."); exit(-1)
}

println("file size: {0}", file.size)
println("file contents: \n{0}", content)

The work items over the next couple of weeks boil down to:

  1. Finish up the initial lexer (scanner and tokenizer are basically there already)
  2. Start generating executable code (the first round of output will generate C code)
  3. Start getting the C interop in place

Also, I’ve got a spot set up for the location of the Proteus documentation:

  1. Handmade Hero is currently closed source. As Proteus matures, I may look into adding an official fork in the GitHub repo. However, until that time, I’ll only talk about my experiences and show a few snippets; I will not be releasing the Proteus version of Handmade Hero until it’s officially released. 
  2. Of course, if the starting { is offscreen, then the } might not be as helpful… I don’t know yet.

Language Design: Building a Modern C, Round 1

When I think about building a modern C language, there are really just a few things I’m thinking about:

  1. Cleaner, more consistent syntax
  2. Proper meta-programming (e.g. a real preprocessor and introspection)
  3. Consistency across types
  4. Better built-in types (e.g. a proper string type and bounded arrays)

So let’s just start with that.

Here is a very basic C program that was the start of the Proteus compiler:

#include <stdio.h>
#include <stdlib.h>

int main(int ArgumentCount, char** ArgumentValues) {
    if (ArgumentCount != 2) {
        printf("Invalid usage!\n");
        return -1;
    }

    printf("compiling file: %s\n", ArgumentValues[1]);
    FILE *InputFile = fopen(ArgumentValues[1], "r");
    if (InputFile == 0) {
        printf("Unable to open file.\n");
        return -1;
    }

    // Seek to the end to measure the file, then rewind so fread starts at the beginning.
    fseek(InputFile, 0L, SEEK_END);
    long FileLength = ftell(InputFile);
    rewind(InputFile);

    char *FileBuffer = (char *)malloc(sizeof(char) * (FileLength + 1));
    if (FileBuffer == 0) {
        printf("Unable to allocate memory for the file buffer.\n");
        return -1;
    }

    if (fread(FileBuffer, 1, FileLength, InputFile) != (size_t)FileLength) {
        printf("Unable to read the entire file.\n");
        return -1;
    }
    FileBuffer[FileLength] = '\0';

    printf("file size: %ld\n", FileLength);
    printf("file contents: \n%s\n", FileBuffer);

    free(FileBuffer);
    fclose(InputFile);
    return 0;
}

This is a translation of some of the items I mentioned above:

import stdlib

let arguments = environment.arguments
guard if arguments.count != 2:
    println "Invalid usage!"; exit -1

let filename = arguments[1]
println "compiling file: {0}\n" filename
guard let file = filename
    println "Unable to open file."; exit -1

guard let content = file:
    println "Unable to read the entire file."; exit -1

println "file size: {0}" file.size
println "file contents: \n{0}" content

This includes:

  • module system
  • improvements to the standard library
  • type inference
  • syntactical noise reduction (the : is the “open scope” syntactic sugar)
  • error handling mechanism (guard, which forces the code to exit scope if the condition fails)
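
For comparison, Swift has a `guard` statement with the same "must leave the scope" rule, although its condition is inverted: the `else` block runs when the condition is false. The sketch below mirrors the sample in Swift rather than Proteus. The function name `run` and the choice to return an exit code instead of calling `exit` are assumptions made for illustration.

```swift
import Foundation

// Returns the process exit code; `run` is an illustrative name.
func run(_ arguments: [String]) -> Int32 {
    // `guard` requires its else-block to leave the enclosing scope.
    guard arguments.count == 2 else {
        print("Invalid usage!")
        return -1
    }

    let filename = arguments[1]
    print("compiling file: \(filename)")

    // Opening and reading are folded into one step here.
    guard let content = try? String(contentsOfFile: filename, encoding: .utf8) else {
        print("Unable to read the entire file.")
        return -1
    }

    print("file size: \(content.utf8.count)")
    print("file contents: \n\(content)")
    return 0
}
```

After each guard, the happy path continues at the same indentation level, which is the readability win the Proteus sample is after.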

I’m not entirely convinced about dropping the () to invoke functions, but they seem fairly superfluous. Removing them also makes this type of code clearer, in my opinion: foo 2 (3 + 2) vs. foo(2, (3 + 2)). Reading the second option requires a context shift to understand that the first ( opens a function call, whereas the second ( groups work that should be done first. Plus, who doesn’t want to be more like Haskell?


Proteus Development Starts

Today, development on Proteus officially starts!

I’ve created a Patreon if you’re keen to support the development. The plan is simple: I’ll work on it weekly, posting updates every two weeks. Backers at different levels will get a few additional perks.

The goals of the language are simple: create a modern-day C language, and on top of that, create a modern-day Objective-C language.

I’ll be writing the parser using Swift, so other platform support will come when Swift is open sourced and working on various platforms. I expect to see a lot of interesting tangential posts comparing the two languages.

This is an experiment. At the end of it, I hope to have a language more streamlined and “safe” than C without sacrificing the raw hardware access that it provides. On top of that will be the dynamic runtime that will serve as the modern-day Objective-C implementation.

Eventually, the project will be completely open-sourced.

Although this project was inspired by Handmade Hero, it will not be “handmade”. I’ll be writing the lexer and parser for the language by hand, likely with few other tools. However, the compilation steps will leverage LLVM. So, I expect we’ll see some interesting Swift-to-LLVM C API challenges along the way.

More details to come later!


We Can Do Better


It is the idea that we can break things down into nice, cohesive parts. Once we do that, we start to build more and more abstractions on top of what we have until we are left with what we have today: obscurity.

It saddens me that we are equating “modern programming” with abstracting away the fundamental parts of programming. Many languages coming out today are trying to get away from what programming fundamentally is: manipulating memory.

Now, I get it. Memory management can be a pain. However, as I reflect on where our industry has gone, it seems to me as if we’ve thrown the baby out with the bathwater. We’ve introduced highly involved and complicated systems to attempt to solve this memory management problem.

And have we succeeded?

For some domains, we have gotten “good enough”. So when I see languages coming out that are so focused on getting rid of memory management as a core principle, it just feels wrong to me. It feels like we are solving the wrong problem.

I think the problem comes down to two things:

  1. Clarity of expression – how well can we author our code, understand our code, and maintain our code?
  2. Introspective ability – how good are our tools that we use to help us diagnose our problems?

The first, I think, is a language problem. The second is a tooling problem. If we could track the lifetime of every allocation in our program, we could understand exactly where and why memory is being leaked. We could track who is overwriting memory that they don’t own. We could know when memory is being accessed that has been reclaimed.

Yes, to some extent we’ve invested in some of these tools. But most of them suck, and the language does little to help diagnose any of these problems. The language also does little to help you prevent these problems, and when languages do try to help, they make doing many things much more difficult. High friction [1].

That’s what I want to try and do. Create a better language that allows you to be expressive, pragmatic, and get stuff done without handcuffing and treating you like a child.

Hello Proteus.

  1. The “high friction” term is something that Jonathan Blow talks about in his exploration of a new programming language for games. If you have not seen his videos, they are great; I highly recommend you check them out.

Stop to Smell the Roses

I would love to build a better Objective C language. It would be a language that embraces the functional programming world and the imperative programming world, the world of dynamic runtimes and static types, and builds a bridge to the future from the past.

The following is taken from some of my notes, so the code samples might be a bit rough.

Class definitions would be simple and straightforward:

' All classes derive from `NSObject` by default. Use the `derives <class>` syntax
' to subclass differently. Also, use `implements <type>, <type>, ...` for protocols.
interface Person
    ' The `let` keyword creates readonly properties.
    let firstName :: NSString
    let lastName :: NSString

    ' The `var` keyword creates read/write properties.
    ' The `[copy]` is the attribute applied to the property, in this case
    ' all writes to `emailAddress` will create a copy of the incoming value.
    var emailAddress :: NSString [copy]

    ' initializer with parameters, defaults to returning `instancetype` in ObjC
    def initWithFirstName::NSString lastName::NSString
        _firstName = firstName
        _lastName = lastName

    ' a message name with a single parameter, defaults to returning `void` in ObjC
    def say message::NSString
        log("\(firstName) says \"\(message)\"")

    ' A simple message that returns a `NSString`
    def fullName -> NSString
        return "\(firstName) \(lastName)"

All function definitions would allow for partial application:

def sum x: Number -> y: Number -> Number
    return x + y

let sum1 := sum 1
let sum2 := sum1 2
let full := sum 3 4
log "sum of 1 + 2 = \(sum2)"
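
The same idea can be written in Swift today by having a function return another function. This is only a rough equivalent of the notes above, not Proteus syntax, and `Int` stands in for the `Number` type.

```swift
// Curried form: `sum` takes `x` and returns a function that waits for `y`.
func sum(_ x: Int) -> (Int) -> Int {
    return { y in x + y }
}

let sum1 = sum(1)      // partially applied: still needs the second argument
let sum2 = sum1(2)     // fully applied
let full = sum(3)(4)   // both arguments supplied in one expression
print("sum of 1 + 2 = \(sum2)")   // prints "sum of 1 + 2 = 3"
```

The difference is that Swift makes the currying explicit in the signature, whereas the sketch in the notes would give every function this behavior for free.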

Syntax would be light and free of unnecessary tokens:

' No difference between message calls and function calls
let david := Person newWithFirstName:"David" lastName:"Owens"
david.emailAddress := ""

' Use () to group nested calls for ordering
let sally := (Person alloc) initWithFirstName:"Sally" lastName:"Sue"
sally say:"Hello all the peoples!"

let ages = {
    "Tim":  53,  "Angela": 54,  "Craig":   44,
    "Jony": 47,  "Chris":  37,  "Michael": 34,
}

let people = filter (key => ages[key] < 50) (sort < (ages allKeys))
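
For reference, here is roughly how the last two statements read in Swift, where the calls chain left to right instead of nesting. This is a comparison sketch, not output from any Proteus tool.

```swift
let ages = [
    "Tim": 53, "Angela": 54, "Craig": 44,
    "Jony": 47, "Chris": 37, "Michael": 34,
]

// Sort the keys alphabetically, then keep the people younger than 50.
let people = ages.keys.sorted(by: <).filter { ages[$0, default: 0] < 50 }
print(people)   // prints ["Chris", "Craig", "Jony", "Michael"]
```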

All of this code can be re-written to ObjC (by a tool). I have prototypes of some of it and manual constructions of others. I know it’s possible and it allows for full interop between this language dubbed Proteus and ObjC.

But… what’s the point?

I don’t mean to ask that as a submission or a throw-in-the-towel remark. But really, what’s the point? Where do I want to go with it? It would be fun to go through designing this out and getting all of the main scenarios working – I haven’t done that since my college days. I don’t even think it’s a super amount of work. But the edge cases will suck. The considerations of certain C constructs will get a bit dodgy. And in order to really get the performance up to par, I’m likely going to need to emit LLVM code instead of going through the ObjC source code, or worse (in my opinion), need to make changes to LLVM or Clang like the Eero project did. And this is before I even get into editor support, code highlighting, debugging, etc…

Swift has its warts. It’s a baby that’s trying to grow up quickly in a world that is harsh. The question I’ve been asking myself, both through these posts and throughout the weeks, is really this: are Swift’s warts and ugly places worse than Objective C’s? Will they be tomorrow?

There will be a lot I miss about ObjC. Hopefully we’ll see more of that over time. But I think it’s time for this station to get back to its regularly scheduled program.


Building a Better Objective C

If you’ve been following along with my recent posts regarding enums and Objective C, you’ll notice a theme: Objective C is quite capable of expressing what we want, it’s just extremely verbose, especially compared to the improved syntax that Swift brings to the table.

There are many significant drawbacks to Swift, especially when you need to interop with ObjC code. In this world, we need something better. We need to be able to transparently bridge between the world of modern syntax and conveniences without being shackled to the days of C programming, while at the same time, being able to maintain 100% interop with all of our existing ObjC code.

We could continue to suffer through the ObjC syntax, or, we can do something about it.

Framing the Solution

One solution to the problems I exposed in the enum series is to use a bunch of macros. That works, but it doesn’t get us nearly where we want to be: a modern, sleek syntax that is both easy to write, and more importantly, easy to read and reason about.

This is where we want to be:

enum CompassPoint: North | South | East | West

That’s it. That’s all it should take to define one of the “traditional” enums.

enum Barcode:
  | UPCA(numberSystem :: int, manufacturer :: int, product :: int, checkInt :: int)
  | QRCode(code :: NSString)

And above is what the “associative value” enum looks like. I’d like to call out something here: we actually have more information encoded in the enum than Swift does, because we capture the names of the components [1]. This adds much more clarity to what the values actually are.

And finally, the raw values:

enum ASCIIControlCharacter :: NSString:
  | Tab := "\t"
  | LineFeed := "\n"
  | CarriageReturn := "\r"

You might be wondering how we are going to accomplish this. Well, it’s actually pretty straightforward. Here are the items we’ll be building:

  1. A parser for our enum syntax
  2. A tool to convert our parse tree to ObjC code

That’s it. We don’t actually need anything else [2]. I’ll show how we can leverage Xcode’s built-in extensibility points to give us nice error reporting and seamless integration of our new enum definition.

I will be calling this new language: Proteus.

Building the Parser

There are basically two ways to generate the parser: handwrite one or use some tools to generate one from a grammar. I’m going to handwrite our parser, primarily because our enum grammar is quite simple and it’s really easy to provide very friendly error messages with a handwritten parser.

Note: I’ll only be showing how to build the simplest of parsers, one that accepts only the basic version of the enum. The full GitHub link will be posted at the end. Over time I’ll continue to add support for more features.

The work is going to be broken up into the following components:

  1. Scanner – this is going to break up the input file into a set of tokens.
  2. Analyzer – this is going to take the tokens from the Scanner and apply meaning to them, returning an array of Constructs.
  3. Construct – a representation of the different type of language constructs, such as enum or func.
  4. Rewriter – a function that can take a Construct and turn it into the appropriate Objective C header and implementation files.

I will be writing the components in Swift – I like the irony of building a better Objective C in a language that I think didn’t live up to that promise. =)

The full source for the project can be found here:

If you take a look at the code for the scanner (in lexer.swift), the output for the CompassPoint declaration is the following tokens:

  1. Token(Keyword) “enum”
  2. Token(Identifier) “CompassPoint”
  3. Token(Colon) “:”
  4. Token(Identifier) “North”
  5. Token(Pipe) “|”
  6. Token(Identifier) “South”
  7. Token(Pipe) “|”
  8. Token(Identifier) “East”
  9. Token(Pipe) “|”
  10. Token(Identifier) “West”
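
To make the token list concrete, here is a minimal scanner sketch in modern Swift that yields the same ten tokens for the CompassPoint line. It is not the code from lexer.swift. The `Token` enum and the `scan` function are assumptions made for illustration, and it only understands words, `:` and `|`.

```swift
// Hypothetical token type; the real lexer.swift may model tokens differently.
enum Token: Equatable {
    case keyword(String)
    case identifier(String)
    case colon
    case pipe
}

func scan(_ source: String) -> [Token] {
    let keywords: Set<String> = ["enum"]
    var tokens: [Token] = []
    var word = ""

    // Emit the pending word, if any, as a keyword or an identifier.
    func flushWord() {
        guard !word.isEmpty else { return }
        let token: Token = keywords.contains(word) ? .keyword(word) : .identifier(word)
        tokens.append(token)
        word = ""
    }

    for character in source {
        switch character {
        case ":":
            flushWord()
            tokens.append(.colon)
        case "|":
            flushWord()
            tokens.append(.pipe)
        case _ where character.isLetter || character.isNumber || character == "_":
            word.append(character)
        default:
            flushWord()   // whitespace or anything unrecognized ends the current word
        }
    }
    flushWord()
    return tokens
}

let compassTokens = scan("enum CompassPoint: North | South | East | West")
print(compassTokens.count)   // prints 10
```

A real scanner would also record the line and column of each token so that later stages can point at the exact spot when something goes wrong.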

This input is passed into the Analyzer which outputs an Enum construct.

  typeName = "enum"
  identifier = "CompassPoint"
  options = [
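
The analysis step for this simple grammar can be sketched as a straight walk over the tokens: keyword, name, colon, then option names separated by pipes. The Swift below is a guess at the shape, not the project's Analyzer. `Token`, `EnumConstruct`, `ParseError` and `analyzeEnum` are all hypothetical names, and options are modeled as plain strings.

```swift
// Hypothetical types; the real Analyzer and Enum construct may differ.
enum Token: Equatable {
    case keyword(String), identifier(String), colon, pipe
}

struct EnumConstruct: Equatable {
    let typeName = "enum"
    var identifier: String
    var options: [String]
}

struct ParseError: Error, Equatable {
    let message: String
}

func analyzeEnum(_ tokens: [Token]) throws -> EnumConstruct {
    var remaining = tokens[...]   // a slice, so tokens can be consumed from the front

    guard remaining.popFirst() == Token.keyword("enum") else {
        throw ParseError(message: "expected 'enum'")
    }
    guard case .identifier(let name)? = remaining.popFirst() else {
        throw ParseError(message: "expected the enum's name")
    }
    guard remaining.popFirst() == Token.colon else {
        throw ParseError(message: "expected ':' after '\(name)'")
    }

    // One or more option names, separated by `|`.
    var options: [String] = []
    repeat {
        guard case .identifier(let option)? = remaining.popFirst() else {
            throw ParseError(message: "expected an option name")
        }
        options.append(option)
    } while remaining.popFirst() == Token.pipe

    return EnumConstruct(identifier: name, options: options)
}

let construct = try analyzeEnum([
    .keyword("enum"), .identifier("CompassPoint"), .colon,
    .identifier("North"), .pipe, .identifier("South"),
])
print(construct.identifier, construct.options)
```

Because every step knows exactly what it expected, each failure can carry a specific message, which is the main argument for handwriting a parser this small.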

Then there is the rewriteEnumToObjC rewriter function. The full code for that is below:

private func rewriteEnumToObjC(value: Enum) -> (header: String, implementation: String) {
    var header = "@interface \(value.identifier) : NSObject\n\n"

    var implementation = "#include \"\(value.identifier).h\"\n\n"
    implementation += "#define RETURN_ENUM_INSTANCE() \\\n"
    implementation += "    static \(value.identifier) *instance = nil;\\\n"
    implementation += "    static dispatch_once_t onceToken;\\\n"
    implementation += "    dispatch_once(&onceToken, ^{\\\n"
    implementation += "        instance = [[\(value.identifier) alloc] init];\\\n"
    implementation += "    });\\\n"
    implementation += "    return instance;\n\n"

    implementation += "@implementation \(value.identifier)\n\n"

    // One class method per option. Assumes each option exposes its identifier as `name`.
    for option in value.options {
        header += "+ (\(value.identifier) *)\(option.name);\n"
        implementation += "+ (\(value.identifier) *)\(option.name) { RETURN_ENUM_INSTANCE(); }\n"
    }

    header += "\n+ (NSArray *)values;\n"
    implementation += "\n+ (NSArray *)values\n"
    implementation += "{\n"
    implementation += "    static NSArray *values = nil;\n"
    implementation += "    static dispatch_once_t onceToken;\n"
    implementation += "    dispatch_once(&onceToken, ^{\n"
    implementation += "        values = @[ "

    // Comma-separated list of every option, e.g. CompassPoint.North, CompassPoint.South
    for (idx, option) in value.options.enumerated() {
        implementation += "\(value.identifier).\(option.name)"
        if (idx != value.options.count - 1) {
            implementation += ", "
        }
    }

    implementation += " ];\n"
    implementation += "    });\n\n"
    implementation += "    return values;\n"
    implementation += "}\n"

    header += "\n@end\n"
    implementation += "\n@end\n"

    return (header, implementation)
}


The basic concept is to literally just write the header and implementation files as you would normally.

With each of those components in place, it’s time to hook it up!

Integration with Xcode

If you download the project I linked, there will be a command-line tool called protc. It takes two parameters:

  1. -file – the path to the .prot file that contains our enum definition
  2. -output – the path that the .h and .m files will be written to

So here’s the magic… Xcode has these things called “Build Rules”. It’s basically how anything gets compiled within Xcode. Well, we can create our own build rules for our .prot files.

Step 1: Build Rules

To do that:

  1. Select your project in the project navigator
  2. Select the target for your app/tool
  3. Select the “Build Rules” tab in the project editor
  4. Add a new build rule

The script will be filled in with this content:

rm "${DERIVED_FILE_DIR}/${INPUT_FILE_BASE}.h" 2> /dev/null
rm "${DERIVED_FILE_DIR}/${INPUT_FILE_BASE}.m" 2> /dev/null
/Users/owensd/Library/Developer/Xcode/DerivedData/protc-ecehehjngfljckcirxwdltnqqayl/Build/Products/Debug/protc -file ${INPUT_FILE_PATH} -output ${DERIVED_FILE_DIR}

Then be sure to set the “Output Files” to:


Here’s a screenshot of what it looks like:

Custom build rules allow us great flexibility.


Step 2: Create your .prot file

Next, create your .prot file just as you would any file. The trick is to add it to your “Compile Sources”.

  1. Select your project in the project navigator
  2. Select the target for your app/tool
  3. Select the “Build Phases” tab in the project editor
  4. Add your .prot file to the list of “Compile Sources”

Don’t forget this step or nothing will seem to be working!


Step 3: Use your new enum!

That’s really it. Now when you build your project, the .prot file will be processed and in turn the generated .m file will be compiled into your project’s target.

To use it, simply add the header file as normal:

#import <Foundation/Foundation.h>
#import "CompassPoint.h"

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        for (CompassPoint *point in CompassPoint.values) {
            if (point == CompassPoint.North) {
                NSLog(@"direction is north");
            }
            else if (point == CompassPoint.South) {
                NSLog(@"direction is south");
            }
            else if (point == CompassPoint.East) {
                NSLog(@"direction is east");
            }
            else if (point == CompassPoint.West) {
                NSLog(@"direction is west");
            }
        }
    }

    return 0;
}


If you set your “Output Files” to be ${DERIVED_FILE_DIR}, the header paths have no trouble finding the generated header.

Step 4: Handling Build Errors

Of course, sometimes you might make a mistake in your .prot file. It would suck if there was no way to handle this or to be notified of this… but of course that’s not the case!

Change your .prot file to this:

enum CompassPoint North | South | East | West

Notice the missing : between CompassPoint and North. Re-build your project, and voila!

Errors are reported and clickable!


Not only can the error be reported, but it will cause your project to report a build failure. Those errors are also linked back to your source, so double-clicking will open your .prot file and show the error there.
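
The post does not show how protc formats its diagnostics, so the following is an assumption: Xcode picks up build-script output written in the clang style, `file:line:column: error: message`, and turns it into a clickable issue. A tool can produce that with very little code. `Diagnostic` and `report` are hypothetical names.

```swift
import Foundation

// Hypothetical diagnostic type; the field names are illustrative.
struct Diagnostic {
    let file: String
    let line: Int
    let column: Int
    let message: String

    // Clang-style layout: file:line:column: error: message
    var formatted: String {
        "\(file):\(line):\(column): error: \(message)"
    }
}

func report(_ diagnostic: Diagnostic) {
    // Diagnostics go to stderr; the tool should also exit non-zero so the build fails.
    FileHandle.standardError.write(Data((diagnostic.formatted + "\n").utf8))
}

let missingColon = Diagnostic(
    file: "CompassPoint.prot", line: 1, column: 19,
    message: "expected ':' after 'CompassPoint'"
)
report(missingColon)
```

In a real build rule the file would be the full input path rather than a bare file name, so that clicking the issue can open the right source file.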

Pretty cool.

The only missing thing is code completion and colorization. Unfortunately, to the best of my knowledge, those two features require full-blown Xcode plug-ins – way out of scope for this blog entry.

Next Steps

Over the years, Objective C has attempted to get facelifts. Way back in the 1997 timeframe, there was a project to create a “modern syntax” for ObjC. It didn’t go well. Then there was the ObjC-Java bridge that came in 2001. That also didn’t go well.

However, in 2006 we got ObjC 2.0. That was a pretty good step forward for the Objective C language. And of course, in 2014, we got Swift.

I think the biggest disservice we can do to the Cocoa developer community is remove the underpinnings of the ObjC runtime. It is the language’s, and I truly believe, the platform’s greatest strength.

I believe if we hide the complexities of C from our source code and focus on letting the power of the ObjC runtime shine through in our code, we can create a new language that provides the great flexibility of the ObjC runtime while still accomplishing many of the goals that Swift is attempting to achieve, namely safer code by default.

So I guess this is the start of the project I’m calling Proteus. We’ll see how far it gets.

  1. Note that you can put labels here in Swift as well, they just are not required.
  2. Now, we could go the extra mile on step #2 and simply emit the LLVM code, but honestly, I don’t have time to go down that route. It’s also not necessary for what I want to accomplish here.
  3. Here’s the repo link for the state at the time of authoring this blog entry: