Language Design: Building a Modern C, Round 1

When I think about building a modern C, there are really only a few things I have in mind:

  1. Cleaner, more consistent syntax
  2. Proper meta-programming (e.g. a real preprocessor and introspection)
  3. Consistency across types
  4. Better built-in types (e.g. a proper string type and bounded arrays)

So let’s just start with that.

Here is a very basic C program that was the start of the Proteus compiler:

#include <stdio.h>
#include <stdlib.h>

int main(int ArgumentCount, char** ArgumentValues) {
    if (ArgumentCount != 2) {
        printf("Invalid usage!\n");
        return -1;
    }

    printf("compiling file: %s\n", ArgumentValues[1]);
    // Binary mode, so the length reported by ftell matches the count fread returns.
    FILE *InputFile = fopen(ArgumentValues[1], "rb");
    if (InputFile == 0) {
        printf("Unable to open file.\n");
        return -1;
    }

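    fseek(InputFile, 0L, SEEK_END);
    long FileLength = ftell(InputFile);
    rewind(InputFile);
    if (FileLength < 0) {
        // ftell reports failure as -1; bail out before sizing the buffer with it.
        printf("Unable to determine the file size.\n");
        fclose(InputFile);
        return -1;
    }

    char *FileBuffer = (char *)malloc(sizeof(char) * (FileLength + 1));
    if (FileBuffer == 0) {
        printf("Unable to allocate memory for the file buffer.\n");
        fclose(InputFile);
        return -1;
    }

    // fread returns a size_t, so compare against the length as a size_t.
    if (fread(FileBuffer, 1, (size_t)FileLength, InputFile) != (size_t)FileLength) {
        printf("Unable to read the entire file.\n");
        free(FileBuffer);
        fclose(InputFile);
        return -1;
    }
    FileBuffer[FileLength] = '\0';

    printf("file size: %ld\n", FileLength);
    printf("file contents: \n%s\n", FileBuffer);

    free(FileBuffer);
    fclose(InputFile);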

    return 0;
}

Here is the same program translated into the new language, applying some of the items I mentioned above:

import stdlib

let arguments = environment.arguments
guard if arguments.count != 2:
    println "Invalid usage!"; exit -1

let filename = arguments[1]
println "compiling file: {0}" filename
guard let file = sys.io.fopen filename sys.io.flags.read:
    println "Unable to open file."; exit -1

guard let content = sys.io.read file:
    println "Unable to read the entire file."; exit -1

println "file size: {0}" file.size
println "file contents: \n{0}" content

This includes:

  • module system
  • improvements to the standard library
  • type inference
  • syntactical noise reduction (the : is the “open scope” syntactic sugar)
  • error handling mechanism (guard, which forces the code to exit scope if the condition fails)
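
For readers coming from C, one way to picture guard is as a binding that either succeeds or forces an immediate exit from the current scope. The sketch below is only an illustration of that idea in plain C, not a description of how Proteus actually implements guard. The names OptionalLong, parse_long, and parse_or are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical optional value: `present` says whether `value` is usable. */
typedef struct {
    bool present;
    long value;
} OptionalLong;

/* Parse a base-10 integer; an empty, malformed, or out-of-range
   string yields an "absent" result instead of a sentinel number. */
OptionalLong parse_long(const char *text) {
    OptionalLong result = { false, 0 };
    char *end = NULL;

    assert(text != NULL);
    errno = 0;
    long parsed = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0') {
        return result;
    }
    result.present = true;
    result.value = parsed;
    return result;
}

/* Roughly the shape of a guard-style binding with an early exit:
   bind on success, otherwise leave the scope right away. */
long parse_or(const char *text, long fallback) {
    OptionalLong n = parse_long(text);
    if (!n.present) {
        return fallback; /* the failure branch must exit the scope */
    }
    /* Past this point n.value is known to be valid. */
    return n.value;
}
```

In C nothing stops the failure branch from falling through and using n.value anyway. The appeal of a dedicated guard construct is that the compiler can enforce the exit.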

I’m not entirely convinced about dropping the () for function calls, but they seem fairly superfluous. Removing them also makes this type of code clearer, in my opinion: foo 2 (3 + 2) vs. foo(2, (3 + 2)). Reading the second option requires a context shift to understand that the first ( opens a function call’s argument list, whereas the second ( groups work that should be done first. Plus, who doesn’t want to be more like Haskell?
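
To make the two roles of the parentheses concrete, here is a small C sketch. The function foo and its body are hypothetical stand-ins. In C the inner pair is redundant, because the comma already separates the arguments. In the juxtaposition style the grouping pair is what keeps 3 + 2 together as one argument. In Haskell, for instance, function application binds tighter than any operator, so foo 2 3 + 2 would parse as (foo 2 3) + 2.

```c
#include <assert.h>

/* Hypothetical two-argument function standing in for `foo`. */
int foo(int a, int b) {
    return a * 10 + b;
}

int call_with_grouping(void) {
    /* Outer ( ) delimit the argument list; inner ( ) only group 3 + 2. */
    return foo(2, (3 + 2));
}

int call_without_grouping(void) {
    /* The comma already ends the first argument, so no grouping is needed. */
    return foo(2, 3 + 2);
}

void check_equivalence(void) {
    /* Both spellings pass the same arguments: 2 and 5. */
    assert(call_with_grouping() == call_without_grouping());
}
```

So the same character does two jobs in C, and only one of them survives in the paren-free form.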
