JSON Parsing

Update June 21st: I've seen the error of my ways: JSON Parsing Reborn. The content of this article still holds if you use straight-up Swift, but Swift gives us a much better way to handle this.


In my previous article, I got some flak about many things, including JSON parsing and how the Swift code is no worse because of the generics. Here's an example of why I think it is worse. If any of the code is incorrect or if it could be written better, let me know and I'll update accordingly. Maybe I'll be proven wrong, and I just need to do it a different way to make the problems go away. If that's the case, that would be great.

Here is the JSON that we will be parsing:

{
  "stat": "ok",
  "blogs": {
    "blog": [
      {
        "id" : 73,
        "name" : "Bloxus test",
        "needspassword" : false,
        "url" : "http://remote.bloxus.com/"
      },
      {
        "id" : 74,
        "name" : "Manila Test",
        "needspassword" : true,
        "url" : "http://flickrtest1.userland.com/"
      }
    ]
  }
}

Both the ObjC code and the Swift code work under the following conditions (that is, they print nothing when the input is invalid, print the blog entries when it is valid, and never crash):

  1. The JSON response is nil
  2. The JSON response is an unexpected type, such as an array or string
  3. The JSON response does not have the blogs key
  4. The JSON response has the blogs key, but not the blog key
  5. The JSON response has the keys, but blog is the wrong type
  6. The JSON response has the keys and correct type, but blog is empty
  7. The JSON response has the keys, the correct types, and data in blog, but missing some keys (such as id or name)
  8. The JSON response is fully filled out

Here is the ObjC code that is needed to safely parse the JSON:

if ([json isKindOfClass:[NSDictionary class]]) {
    NSDictionary *dict = json[@"blogs"];
    if ([dict isKindOfClass:[NSDictionary class]]) {
        NSArray *blogs = dict[@"blog"];
        if ([blogs isKindOfClass:[NSArray class]]) {
            for (NSDictionary *blog in blogs) {
                if ([blog isKindOfClass:[NSDictionary class]]) {
                    NSLog(@"Blog ID: %@", blog[@"id"]);
                    NSLog(@"Blog Name: %@", blog[@"name"]);
                    NSLog(@"Blog Needs Password: %@", blog[@"needspassword"]);
                    NSLog(@"Blog URL: %@", blog[@"url"]);
                }
            }
        }
    }
}

Here is the Swift code to do the same.

if let dict = json as? NSDictionary {
    if let blogs = dict["blogs"] as? Dictionary<String, AnyObject> {
        if let blogItems : AnyObject = blogs["blog"] {
            if let collection = blogItems as? Array<AnyObject> {
                for blog : AnyObject in collection {
                    if let blogInfo = blog as? Dictionary<String, AnyObject> {
                        let id : AnyObject? = blogInfo["id"]
                        let name : AnyObject? = blogInfo["name"]
                        let needspassword : AnyObject? = blogInfo["needspassword"]
                        let url : AnyObject? = blogInfo["url"]

                        println("Blog ID: \(id)")
                        println("Blog Name: \(name)")
                        println("Blog Needs Password: \(needspassword)")
                        println("Blog URL: \(url)")
                    }
                }
            }
        }
    }
}

I wrote unit tests for both Swift and ObjC to validate the claims I made above; both sets of code run without crashing under the error conditions and print the output successfully.

Both sets of code have to do similar checks to validate the types, but the Swift code is cluttered with meaningless type annotations. Also, the retrieval of items out of the dictionaries seems more complicated than it needs to be. Maybe I'm simply doing something wrong…
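For comparison, here is a rough sketch of how the same checks might be flattened in later versions of Swift. It is not from the original post: it assumes Swift 5 and Foundation's JSONSerialization, and the helper name blogSummaries is made up for illustration. It only reads id and name to keep the example short.

```swift
import Foundation

// Hypothetical helper (not part of the original article): returns one line
// per blog entry, or an empty array when the JSON has an unexpected shape.
func blogSummaries(from json: Any?) -> [String] {
    // Each `as?` cast yields nil on a missing key or a type mismatch,
    // so every invalid shape ends up at the early return.
    guard let root = json as? [String: Any],
          let blogs = root["blogs"] as? [String: Any],
          let items = blogs["blog"] as? [Any] else {
        return []
    }
    var lines: [String] = []
    // Entries that are not dictionaries are skipped by the pattern match.
    for case let blog as [String: Any] in items {
        let id = (blog["id"] as? Int).map { "\($0)" } ?? "nil"
        let name = (blog["name"] as? String) ?? "nil"
        lines.append("id=\(id) name=\(name)")
    }
    return lines
}

let data = Data("""
{"stat": "ok", "blogs": {"blog": [{"id": 73, "name": "Bloxus test"}]}}
""".utf8)
let json = try? JSONSerialization.jsonObject(with: data)
for line in blogSummaries(from: json) {
    print(line)
}
```

The guard statement collapses the first four levels of nesting into one, at the cost of not telling you which check failed.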

Of course, this isn't the canonical use of JSON. Really, the ObjC version of the code should actually boil down to this:

for (NSDictionary *blog in json[@"blogs"][@"blog"]) {
  NSLog(@"Blog ID: %@", blog[@"id"]);
  NSLog(@"Blog Name: %@", blog[@"name"]);
  NSLog(@"Blog Needs Password: %@", blog[@"needspassword"]);
  NSLog(@"Blog URL: %@", blog[@"url"]);
}

Why? Simple: when dealing with JSON, the structure is well-defined as we are simply parsing out the results.

The simplest I could get the Swift code was this:

let dict = json as Dictionary<String, AnyObject>
let blogs : AnyObject? = dict["blogs"]?["blog"]
let collection = blogs! as Array<Dictionary<String, AnyObject>>
for blog in collection {
  let id : AnyObject? = blog["id"]
  let name : AnyObject? = blog["name"]
  let needspassword : AnyObject? = blog["needspassword"]
  let url : AnyObject? = blog["url"]

  println("Blog ID: \(id)")
  println("Blog Name: \(name)")
  println("Blog Needs Password: \(needspassword)")
  println("Blog URL: \(url)")
}

All of the type gymnastics are off-putting, especially since they convey no meaning and significantly reduce the clarity of the code. The bigger the JSON blob to parse, the more this issue is exacerbated.

Note that you need AnyObject? to remove the compiler warnings.

Update: June 18th

I was asked on Twitter by @jtjoelson why I didn't use typed dictionaries if I knew the schema. Well, I didn't think it helped the code. Here are two versions using fully typed information based on the JSON schema.

let dict = json as Dictionary<String, Dictionary<String, Array<Dictionary<String, AnyObject>>>>
let blogs = dict["blogs"]!["blog"]!
for blog in blogs {
  let id : AnyObject? = blog["id"]
  let name : AnyObject? = blog["name"]
  let needspassword : AnyObject? = blog["needspassword"]
  let url : AnyObject? = blog["url"]

  println("Blog ID: \(id)")
  println("Blog Name: \(name)")
  println("Blog Needs Password: \(needspassword)")
  println("Blog URL: \(url)")
}

Of course, we can make that monster of a type look better with typealiases.

typealias Blog = Dictionary<String, AnyObject>
typealias BlogCollection = Array<Blog>
typealias BlogsDictionary = Dictionary<String, BlogCollection>
typealias FlickrResponse = Dictionary<String, BlogsDictionary>

let dict = json as FlickrResponse
let blogs = dict["blogs"]!["blog"]!
for blog in blogs {
  let id : AnyObject? = blog["id"]
  let name : AnyObject? = blog["name"]
  let needspassword : AnyObject? = blog["needspassword"]
  let url : AnyObject? = blog["url"]

  println("Blog ID: \(id)")
  println("Blog Name: \(name)")
  println("Blog Needs Password: \(needspassword)")
  println("Blog URL: \(url)")
}

The type information is certainly more legible. But really, the only choice from here to make this more readable and filled with less type info is to create classes to represent all of this and parse the values yourself. That's a lot of code to write for very little value.
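To give a feel for what the "create classes" route could look like, here is a minimal sketch using the Decodable protocol and JSONDecoder from later Swift versions (Swift 4 and up), neither of which existed when this was written. The type names BlogEntry, BlogsContainer, and BlogsResponse and the function decodeBlogs are hypothetical, chosen only for this illustration.

```swift
import Foundation

// Hypothetical model types mirroring the sample JSON above.
struct BlogEntry: Decodable {
    let id: Int
    let name: String
    let needspassword: Bool
    let url: String
}

struct BlogsContainer: Decodable {
    let blog: [BlogEntry]
}

struct BlogsResponse: Decodable {
    let stat: String
    let blogs: BlogsContainer
}

// Hypothetical helper: throws a DecodingError when a key is missing
// or a value has the wrong type.
func decodeBlogs(_ data: Data) throws -> [BlogEntry] {
    return try JSONDecoder().decode(BlogsResponse.self, from: data).blogs.blog
}

let sample = Data("""
{"stat": "ok", "blogs": {"blog": [
{"id": 73, "name": "Bloxus test", "needspassword": false, "url": "http://remote.bloxus.com/"},
{"id": 74, "name": "Manila Test", "needspassword": true, "url": "http://flickrtest1.userland.com/"}
]}}
""".utf8)

do {
    for blog in try decodeBlogs(sample) {
        print("Blog ID: \(blog.id)")
        print("Blog Name: \(blog.name)")
        print("Blog Needs Password: \(blog.needspassword)")
        print("Blog URL: \(blog.url)")
    }
} catch {
    // Schema mismatches land here instead of crashing.
    print("Failed to parse: \(error)")
}
```

Note that this version is strict: an entry missing id or name fails the whole decode rather than printing a partial record, so it behaves differently from the hand-written checks under condition 7.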

The ObjC code will throw an exception if the schema is invalid. I can catch that exception and log it, but there is little error recovery that I can really do at that point. The rigid class will ultimately have to do the same thing, or report an error status of some kind. In ObjC, I had to write far less code, and more readable code, than I did in Swift to arrive at the same outcome: failed parsing or correct output.
