Beware of the Enum

So apparently enums are still a flaking beast (Xcode 7.0 GM).

enum Awesome {
    case HowAwesome(String)
    case KindaAwesome
    case NotReallyAwesome
}

let a = Awesome.HowAwesome("THIS AWESOME")
let b = Awesome.KindaAwesome
let c = Awesome.NotReallyAwesome

print(a)
print(b)
print(c)

Ok, this code seems fine, and the output is:

HowAwesome("THIS AWESOME")
KindaAwesome
NotReallyAwesome

Now, let's image a new case statement is desired…

enum Awesome {
    case HowAwesome(String)
    case KindaAwesome
    case NotReallyAwesome
    case WithOutMeYouWouldBeHappy(String)
}

let a = Awesome.HowAwesome("THIS AWESOME")
let b = Awesome.KindaAwesome
let c = Awesome.NotReallyAwesome

print(a)
print(b)
print(c)

Here's the output:

HowAwesome("THIS AWESOME")
KindaAwesome
KindaAwesome

Yeah, not really that awesome. rdar://22915709.

Here's the clip from last night: https://youtu.be/DA-wjc6hwME?t=51m30s

UPDATE: 9:30am PDT, Sept 30, 2015

//platform.twitter.com/widgets.js

So yay! Maybe I should just switch to the beta…

Beware of the Enum

Associated Enum Cases as Types

This is a bit of a follow-up from my Enums with Associated Data vs. Structs. If you haven’t seen that, some of this might be a little out-of-context.

Another issue I run into with working with enums with associated data is actually getting the data out of them! Let’s take a look at the following code sample:

enum Token  {
    case Keyword(keyword: String, offset: Int)
    case Identifier(identifier: String, offset: Int)
}

//
// The declarations; no real issues here.
//

let importkey = Token.Keyword(keyword: "import", offset: 0)
let funckey = Token.Keyword(keyword: "func", offset: 0)

//
// The ways to match on a value and get some data out. Ugh!
//

if case Token.Keyword("import", _) = importkey {
    print("import keyword")
}

if case let Token.Keyword(keyword, _) = importkey {
    if keyword == "import" { print("import keyword") }
    if keyword == "func" { print("func keyword") }
}

switch importkey {
case let .Keyword(keyword): print("\(keyword) keyword")
case let .Identifier(identifier): print("\(identifier)")
}

All of those suck, in my opinion. There are a few of problems:

  1. If I want to store the result, I end up with a really stupid construct.
  2. If there are multiple associated data fields, I get lots of _ in the matches.
  3. They use this = somevar syntax that is completely unlike anything else in Swift.

What I wish had happened was to treat the case values as types. I’ve modelled what that would look like here:

enum Token  {
    case Keyword(keyword: String, offset: Int)
    case Identifier(identifier: String, offset: Int)

    var Keyword_ : (keyword: String, offset: Int)? {
        if case let .Keyword(keyword, offset) = self {
            return (keyword, offset)
        }
        return nil
    }

    var Identifier_ : (identifier: String, offset: Int)? {
        if case let .Identifier(identifier, offset) = self {
            return (identifier, offset)
        }
        return nil
    }
}

Imagine that Keyword_ and Identifier_ were implicitly created by the compiler. The idea is to make use of the other constructs of the language, in this case, Optional to allow us to access the inner details of the enum.

// It's a tuple, so access by index is allowed.
if importkey.Keyword_?.0 == "import" {
    print("import keyword")
}

// These are named tuples, so access by name is allowed.
if importkey.Keyword_?.keyword == "import" {
    print("import keyword")
}

if let keyword = importkey.Keyword_?.keyword {
    print("\(keyword) keyword")
}

Now, here’s the part I really don’t understand. Swift is already essentially doing this, we just don’t have access to it in any way other than the case pattern matching stuff. I think it would have been more preferable to do this:

if case ("import", _) = importkey.Keyword {
    print("import keyword")
}

To me, this makes it clear that tuples are all matched the exact same way. Instead, we have associated data in enums (which is essentially a tuple) matched one way, and tuples, that may have the exact same structure, matched a completely differently way.

rdar://22704262

Associated Enum Cases as Types

On Demand Resources and Games

Alright, this article over on imore.com that was linked by Darring Fireball talking about how on-demand resources isn’t going to be a problem for games got me a bit riled up.

On-demand resources is fine for some classes games. However, this is not true for games like XCOM. The desktop version of this game clocks in at 20GB (Enemy Within)1. There is no amount of tagging, stripping, or slicing that is going to get a company like Firaxis Games to deliver a desktop quality game on Apple’s supposed desktop class hardware because desktop (and console) quality games are bigger than 2GB.

Let’s take the defense of this, from the article:

But: You have a 4GB game! How do you get those other 25 levels?

Easily, thanks to the power of background processing. On-Demand Resources works in conjunction with whatever your user is actively accessing, and will flush older, unused content to make room for additional resources. If a user is playing level 24 of your game, the system automatically flushes a few 100MB tags of old levels (say, 1-5) to make room for levels 25-30. As the user gets further into your game, older levels drop off and get deleted from the Apple TV, and your new levels (also in tag bundles) get installed – all in the background.

Let’s just play out this scenario: I am playing a game and I’m on level 25 (older levels are now purged because 2GB is enough for any game). Now, my son or daughter comes in and they want to play, but they need to start a new character because they haven’t played before. Ok, they to go play the game…

beach ball of death

Oh… pardon me, I need to download those levels… Meanwhile, while this is downloading, other resources are being dumped out (I sure hope they are not the levels that I’m playing). You see, there’s no telling how much overlap between game assets are going to be between levels 1-5 and 25-30. If a game is 10~15GB, it’s reasonable to assume that there is not a lot of overlap of resources between levels as you progress through the game.

The kids get tired of waiting for the levels to download, so they go do something else. I then go and try to play my levels, and of course… data has been purged and more assets are coming in. This is fun!

There is another description for this phenomenon in computer science: thrashing.

Now, this probably works better of iOS devices because those are mostly single-user devices. However, the TV is centralized and consumed by multiple individuals.

It’s decisions like this and the game controller decision (which is a fascinating case of stealth documentation changes) that tell me Apple just doesn’t care to really enable high-quality gaming on tvOS. Instead, companies are going to basically bringing their iOS versions over, which I find so disheartening. Especially since disk space is so cheap these days; put a 1TB fusion drive in the device and charge $50 more or stop teasing us with actually making the iOS and tvOS platforms a contender for more than just casual games.

  1. Now, Firaxis stripped out a LOT and was able to get the iOS version down to 2GB. However, it took a big hit on what it could actually deliver.
On Demand Resources and Games

Enums with Associated Data vs. Structs

Maybe everyone can help me out here: I'm trying to find out what the value of enums with associated values is over structs that conform to a protocol.

Here's the context: I'm playing around with the parser code and I'm using an enum to describe what statements might look like in Proteus.

Let's say it looks like this:

enum Statement {
    case Import(keyword: Token, packageName: Token)
    case Assignment(binding: Token, value: Expression)
} 

Ok, so here's the issue. When I'm writing the parser, I want to have a function parseImport() to handle the actual parsing of the import construct. However, Swift doesn't allow me to define the function definition that I actually want:

func parseImport() throws -> Statement.Import { ... }

The thing is, I don't want parseImport to return any of the possible values for Statement; I only want the function to be able to return the specific case of Import. Furthermore, the data stored within each of the associated values isn't related.

So the question is, when is an associated value enum a better choice than structs? You see, I could define the above this way too:

protocol Statement {}

struct Import : Statement {
    let keyword: Token
    let packageName: Token
}

struct Assignment : Statement {
    let binding: Token
    let value: Expression
}

func parseImport() throws -> Import { ... }

The only clear win I see is that the enum version is more terse. At the end of the day, the usage code is pretty similar in the fact that each needs to determine what specific type of Statement is being dealt with, except I can remove the switch necessary to unpack the Statement.Import version.

There has to be some benefit right? Right now, I'm only see a downside, especially given the fact that I cannot model the desire to have a function return only a specific type of enum value.

UPDATE Sunday, September 13th, 2015 @ 11:49 PM PDT

I should have been a little less cavalier about the "some benefit". I understand that enums limit the potential values while protocols and structs allow this to be unbound. That's not really the problem though. The problem is about the coupling of the parts that make up the associated value of the enum with the notion of the actual type itself.

One suggestion to my dilemma was to essentially do the following:

typealias ImportStatement = (keyword: Token, packageName: Token)
typealias AssignmentStatement = (binding: Token, value: Expression)

enum Statement {
    case Import(ImportStatement)
    case Assignment(AssignmentStatement)
}

This allows the parseImport to return an ImportStatement instead. Of course, I then need to turn that into a Statement in my calling code. That kinda works; though it's much more tedious to actually do that, and it puts the construction of the actual type in the wrong location. Another problem with that approach is the assumption that I own the code. What if I was using a library that provided the enum, such as the one in Apple's Swift documentation:

enum Barcode {
    case UPCA(Int, Int, Int, Int)
    case QRCode(String)
}

Let's say I'm writing a parser for that and I want to write the parseUPCA function. What is the return type? Do I provide the loosely-typed version of parseUPCA() -> (Int, Int, Int, Int) and then have to pust the creation of the Barcode.UPCA instance at the caller (and if the UPCA data changes, it's more than a usage code update, I need to update all places that used the parts too, which is a potentially much more difficult search-and-replace fix)? Or return a Barcode and hope that I only ever get the UPCA version out of it?

I guess I just don't find any of the options available to us particularly good. In my code, Barcode.UPCA is all together a different type than Barcode.QRCode; I'd simply like to be able to actually model that.

So I need to ask myself: allow for the proper modelling of the type signatures or allow for the proper modelling of the closed nature of the types; I don't know how to do both in Swift today.

Enums with Associated Data vs. Structs

Language Design: Declarations

In case you haven’t caught on, the “Language Design” posts are basically me rambling about ideas during the development of the Proteus language.

I’m implementing the first part of the parser that actually handles declarations… The question my mind is using a simplified syntax or introducing keywords. I’m going to put this out here first so everyone can all gasp at once: I don’t like the idea of const; I’m not even sure the concept will be in the language.

So let’s look at some declarations:

// With a keyword (option 1)
var foo: i32 = 12    // declare and assign; type is explicit
var foo = 12         // declare and assign; type is inferred
var foo: i32         // declare; type is explicit

// No keyword used (option 2)
foo : i32 = 12    // declare and assign; type is explicit
foo := 12         // declare and assign; type is inferred
foo : i32         // declare; type is explicit
foo :: i32        // declare; type is explicit (two :: for clarity)

Option 1: Using a Keyword

At first look, I’m not really sure what that keyword is worth it. For one, I’m not sure if I’m going to be having a corresponding thing like let. For another, it’s just more typing. Is there a true benefit for it?

Option 2: No Keyword

There is something nice about this approach; it’s clean, fewer characters, and still plays nice with type inference. Ok, so this is a clear winner, yeah? Well, I’m not so sure.

Defining Other Types

Ok, so what about other types, like functions, structs, and enums? Now it starts to get a bit more interesting.

Functions
// With a keyword (option 1)
func foo(i32, i32) -> i32 = (x, y) { return x + y }      // declare and assign; type is explicit
var foo = func (x: i32, y: i32) -> i32 { return x + y }  // declare and assign; type is inferred
var foo: func (i32, i32) -> i32                          // declare; type is explicit

var foo = (x: i32, y: i32) -> i32 { return x + y }       // declare and assign; type is inferred
var foo: (i32, i32) -> i32                               // declare; type is explicit

// No keyword used (option 2)
foo : (i32, i32) -> i32 = (x, y) { return x + y }      // declare and assign; type is explicit
foo := (x: i32, y: i32) -> i32 { return x + y }        // declare and assign; type is inferred
foo : (i32, i32) -> i32                                // declare; type is explicit
foo :: (i32, i32) -> i32                               // declare; type is explicit (two :: for clarity)

With functions, it looks like option 2 is coming out ahead. I really like that there is explicit consistency throughout. With the keyword version, it starts to get really cumbersome. In order to fix that, special cases need to be introduced to start dropping some of the verbosity.

Also, the var keyword in option 1 is pretty problematic. Can foo really be assigned to something else? Now it seems that a let keyword has to be introduced just to support functions. Though, option 2 has this problem as well to some extent. However, the foo that are just declarations could be treated as a typealias instead, a and a variable that points to a function would be declared as: foo :: *(i32, i32) -> i32.

Again, option 2 is looking like it’s pulling ahead to the cleanest choice.

Structs
// With a keyword (option 1)
struct Foo {
    f: i32
}

// No keyword used (option 2)
Foo :: {
    f: i32
    defaultValue := 32
}

Ok, option 2 seems a bit strange in this scenario, but that could just be my eyes are so used to the first style.

Enums
// With a keyword (option 1)
enum Foo {
    ChooseMe
    SpecificValue
}

// No keyword used (option 2)
Foo :: {
    SomeValue
    SpecificValue   := 32
}

Ok… that option 2 doesn’t work; it’s ambiguous now. Hmm… something has to change here, and while some syntactic sugar could be added, like using | to separate the cases, this is starting to look like a general problem of making declarations less ambiguous at a quick glance.

Foo :: enum {
    SomeValue
    SpecificValue   := 32
}

If an approach like that is taken, then why is a function treated differently? Right, it should be foo :: func (x: i32) -> i32. If a keyword is going to need to be used, well, then it seems like the keyword-first approach is both the easier construct to use and easier to extend.

Conclusion

So where does this leave me? Well… I think I’m actually going to take the hybrid approach for now and treat variable declarations fundamentally different than type declarations. We’ll see how that goes for now.

Here is a short summary of the basic decision:

foo := 12

func add(x: i32, y: i32) -> i32 { return x + y }

// This code, for now at least, will be invalid.
foo := (x: i32, y: i32) -> i32 { return x + y }

// Instead, this would need to be used.
foo := &add

struct BigFoo {
    littleFoo: i32
}

enum ChooseFoo {
    DefaultChoice
    MyChoice := 5
}

As always, feel free to leave any comments in the Proteus Discussion Group.

Sidebar: Yeah, sure, technically the let says that the value it holds is constant, and yes, it’s technically true that only requires that the pointer address cannot be changed. However, this is what I mean by “semantically incorrect”. How do you say that not only is the pointer immutable, but the value that I point to is also immutable? This starts to add a bunch of complexity that I don’t want to deal with at the moment.

Language Design: Declarations

Proteus – The First Couple of Weeks

Here it is, the first update for September. I’ve been thinking a bunch about what I want from the language and what I think I want the shape of it to be. I’ve come the following basic conclusions:

  1. The syntax will gently evolve from C to Proteus. We’ll take changes a few at a time while the language starts to develop.
  2. To ensure we’re building something real, I’ll be porting the code from Handmade Hero from C to Proteus1.

So, the last language design article had this snippet:

import stdlib

let arguments = environment.arguments
guard if arguments.count != 2:
    println "Invalid usage!"; exit -1

let filename = arguments[1]
println "compiling file: {0}\n" filename
guard let file = sys.io.fopen filename sys.io.flags.read:
    println "Unable to open file."; exit -1

guard let content = sys.io.read file:
    println "Unable to read the entire file."; exit -1

println "file size: {0}" file.size
println "file contents: \n{0}" content

The more I played around with this style, I’m a bit concerned that there simply isn’t enough lexical information to make it easily scannable and parseable by humans. Also, I do wonder about the block distinction; I’ve gone back and forth on this. In one sense, I really like the significant whitespace, but on the other, I like the visual clarity of { and }; they are easier to grep while scanning quickly2.

With that in mind, the following will be the focus on the syntactical changes from C in the first iteration:

  1. Switch the order of type declaration
  2. Stub out the import mechanism
  3. Remove the ; for statement termination; it’s only used for multiple statements per line now.

Note: One other benefit of do the migrations from C to Proteus like this is the ability to convert a C file to a Proteus file (for the most part).

This means that the above sample will look like this:

import stdlib

let arguments = environment.arguments
guard if arguments.count != 2 {
    println("Invalid usage!"); exit(-1)
}

let filename = arguments[1]
println("compiling file: {0}\n", filename)
guard let file = sys.io.fopen(filename,sys.io.flags.read) {
    println("Unable to open file."); exit(-1)
}

guard let content = sys.io.read(file) {
    println("Unable to read the entire file."); exit(-1)
}

println("file size: {0}", file.size)
println("file contents: \n{0}", content)

The work items over the next couple of weeks boil down to:

  1. Finish up the initial lexer (scanner and tokenizer are basically there already)
  2. Start generating executable code (the first round of output will generate C code)
  3. Start getting the C interop in place

Also, I’ve got a spot setup for the location of the Proteus documentation: https://owensd.io/proteus.


  1. Handmade Hero is currently closed source. As Proteus matures, I may look into adding an official fork in the GitHub repo. However, until that time, I’ll only talk about my experiences and show a few snippets; I will not be releasing the Proteus version of Handmade Hero until it’s officially released. 
  2. Of course, if the starting { is offscreen, then the } might not be as helpful‚Ķ I don’t know yet. 
Proteus – The First Couple of Weeks

Modern Web Development… /sigh

Yesterday I thought it would be a good idea to see just how big an article on my blog was…

Profile Picture: 4.5MB for the size

Um… 4.5MB for an article about coding? That seems, a bit excessive?

So I went digging in to see what the heck was going on. Over half of that was for simply adding Disqus support. Seriously? Over 2MB for a comment system? And a comment system that can't even properly format posts? WHY?

Alright, so that's gone.

Where's the rest of it? Well… another 400KB or so was for the webfonts that I was using. Do I think the previous font looked better? Yeah. But come one! 400KB for slightly better looking prose. Nope, sorry, gotta go.

Another culprit of wasted space was the "me" picture on the left. It was a PNG file that was roughly 600KB. Yikes! The PNG version is down to 29KB.

After a bit more tweaking, I got the site down to about 360KB (for articles, content with images is obviously larger). This is still stupidly large for what is actually going on. Also, the page load time has gone from 1.42s down to 143ms.

Profile Picture showing the latest results

But I still have a bunch of cruft in there. Unfortunately, this last layer, to actually stream-line out, requires a complete rewrite of everything in there.

This is where I have the biggest issue with development today, and it's not just web developers: developers have become complacent in being wasteful for the benefit of their own development ease instead of considerate of the time and resources they are taking from customers. Every article I had on my page was costing you 4MB of bandwidth (minus any local caching). It took me about an hour of time to reduce the size and render time to about 10% of what it used to be.

We should all strive to do much better than we currently are.

Modern Web Development… /sigh