Guess the Protocol Behavior

In today's post (well, this evening's, really), it's time to play "guess what happens".

Let's start with the given code:

public protocol Animal {
    func speak()
}

extension Animal {
    public func speak() {
        print("rawr?")
    }
}

Simple enough: all animals say "rawr?" by default.

Let's add a bit more to the puzzle:

public struct Dog : Animal {
    public func speak() {
        print("ruff ruff!")
    }

    public init() {} // yes, we must define this because of "good reasons"
}

This is all pretty straightforward; nothing really interesting here. So let's create a little function to get the animals talking.

public func talk(animals: [Animal]) {
    animals.forEach { $0.speak() }
}

This is where things are going to start to get interesting:

public struct Sheep : Animal {
    public init() {}
}

And then:

let animals : [Animal] = [Dog(), Sheep()]
talk(animals)

The output is:

ruff ruff!
rawr?

But what if this is added:

extension Sheep {
    func speak() {
        print("bah!")
    }
}

What if I told you the output was still:

ruff ruff!
rawr?

Can you tell me why that is?

If I did this:

Sheep().speak()

The output would, correctly, be this:

bah!

The issue here is where the talk() function is defined. If talk() is defined within the same module as the extension for Sheep, then the output is the following:

ruff ruff!
bah!

However, if the talk() function is defined outside of that module, well then the output is:

ruff ruff!
rawr?

This behavior is unsettling to me. On the one hand, it makes some sense that you cannot change the functionality of another module with extensions to types belonging to that module. On the other hand, if I provide an extension for Sheep in my module, I'll be able to use the new functionality just fine there, but any time the type gets used in another module, the functionality will fall back to the original behavior.
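
To make the setup concrete, here's a rough sketch of the two-module layout I'm describing (the module names are my own; the two halves obviously live in separate targets):

// Module "Animals" (the framework) – defines the protocol, the default, Sheep, and talk():
public protocol Animal { func speak() }
extension Animal { public func speak() { print("rawr?") } }
public struct Sheep : Animal { public init() {} }
public func talk(animals: [Animal]) { animals.forEach { $0.speak() } }

// Module "App" (my code) – imports Animals and adds the extension:
import Animals

extension Sheep { func speak() { print("bah!") } }

Sheep().speak()     // bah!  – the extension wins for direct calls
talk([Sheep()])     // rawr? – the framework's talk() never sees my extension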

This just sounds like a scary source of bugs waiting to happen. I think the solution might be to simply disallow extensions to protocols that are not defined within the same module. I'd rather lose out on potential functionality to maintain certain guarantees in my program.

Thoughts?

Update: August 20th, 2015 @ 7:30am HST

The above explanation of talk() is a bit incorrect; here's the version I meant to copy in:

public func talkOf(animals: [Sheep]) {
    animals.forEach { $0.speak() }
}

The issue with the original talk() function is that, because the type is defined within another module, the extension will never be used when going through the base protocol Animal.

Here are my three talk()-like functions I used:

public func talk(animals: [Animal]) {
    print("in-module: talk")
    animals.forEach { $0.speak() }
}

public func talkOf<T : Animal>(animals: [T]) {
    print("in-module: talkOf")
    animals.forEach { $0.speak() }
}

public func sheepTalk(animals: [Sheep]) {
    print("in-module: sheepTalk")
    animals.forEach { $0.speak() }
}


public func talk(animals: [Animal]) {
    print("out-of-module: talk")
    animals.forEach { $0.speak() }
}

public func talkOf<T : Animal>(animals: [T]) {
    print("out-of-module: talkOf")
    animals.forEach { $0.speak() }
}

public func sheepTalk(animals: [Sheep]) {
    print("out-of-module: sheepTalk")
    animals.forEach { $0.speak() }
}

The in-module versions are "my code"; that is, the code where the Sheep extension is defined (making the extension public or internal had no effect). The out-of-module code is where the Animal protocol and Sheep type are defined. The thing to note is that even within my own module, where I've defined the extension for Sheep, if I use the base type Animal, I'll not see my extension's behavior:

print("\nin-module: Sheep()")
let s = Sheep()
s.speak()

print("\nin-module: Animal - Sheep()")
let a: Animal = Sheep()
a.speak()

The output is:

in-module: Sheep()
bah!

in-module: Animal - Sheep()
rawr?

In any case, a simple error or warning when defining extensions on types defined in a different module would alleviate this problem.

Project File: ProtocolDispatch.zip


Protocols – My Current Recommendations

The big talk about Swift lately is around protocols. Everything should be a protocol, they say! Well, that's great in theory; in practice, however, that attitude can lead to some really unfortunate side effects.

Here are the top two things that I always try to keep in mind when working with protocols in my code:

1. Don't treat protocols as a type

A lot of solutions I see (and what I did initially) basically treat protocols as a base class in your type hierarchy. I don't think this is really where protocols shine. This design pattern is still the "object-oriented" way of thinking about the problem.

To put it another way, if your protocol really only has meaning within your type hierarchy, ask yourself if it really makes sense to make it a protocol. I don't think an answer of, "well, I want my type to be a struct so I need to use a protocol here instead" is a good reason. Decompose it and make it more applicable if that's really the case.

Further validation of this: http://swiftdoc.org/swift-2/. Notice anything about all of those protocols (well, all of the ones not prefixed with _)? All of them can be applied to multiple different types regardless of type hierarchy.

2. Don't make your protocols generic unless you really have to!

Hopefully this is just a point-in-time problem, but as soon as you make your protocols generic, you lose the ability to have a typed collection of heterogeneous instances of the protocol. I consider this a serious design limitation. For instance, all of the non-Self-constrained functionality of a protocol should be safely callable from any place that protocol wants to be used.

This also applies to having your protocol adhere to generic protocols, such as Equatable. Generics are infectious.

Doing this:

protocol Foo : Equatable {}

Is almost certainly going to cause you some significant grief down the line.
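
Continuing from that Foo, here's a minimal sketch of the grief (the type A is mine, and the error wording is approximately what the Swift 2 compiler reports):

struct A : Foo {}
func ==(lhs: A, rhs: A) -> Bool { return true }

// Foo can no longer be used as a standalone type:
let foos: [Foo] = [A(), A()]
// error: protocol 'Foo' can only be used as a generic constraint
// because it has Self or associated type requirements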

Here's a practical example:

Let's say we want to model an HTTP response and we want to support two different types of response types: strings and JSON data.

It might be tempting to do something like this:

class HTTPResponse<ResponseType> {
    var response: ResponseType
    init(response: ResponseType) { self.response = response }
}

I think this is a bad approach. As soon as this happens, we have artificially limited our ability to use this type; it cannot be used in collections in a heterogeneous fashion, for example. Now, why might I want a collection of these with different ResponseType representations? Well, let's say I want to build a response/request playback engine for testing. The collection of responses I get back will be of my supported types: string and JSON data.

One option to address this is to simply use AnyObject. This works, but it pretty much sucks.

Another approach to address this problem is with protocols. However, instead of just creating a ResponseType protocol, let's think about what we really want from this. What I really care about is that any ResponseType that is provided to an HTTPResponse can be represented as a String.

With that in mind, we end up with something like this:

protocol StringRepresentable {
    var stringRepresentation: String { get }
}

class HTTPResponse {
    var response: StringRepresentable
    init(response: StringRepresentable) { self.response = response }
}

To me, this is vastly superior, as it allows the consumers of the API to be much more flexible while still maintaining some type clarity.

Of course, this doesn't come without its own drawbacks, and I'd be remiss not to point them out. If you actually want to deal with the specific type for the response, you need to cast it.

class JSONResponse : StringRepresentable {
    var stringRepresentation: String = "{}"
}

let http = HTTPResponse(response: JSONResponse())
let json = http.response as? JSONResponse

This is still significantly better, though. I, the caller, know what the response type is supposed to be, or what the possible values could be. This is starkly different from looping over the collection, pulling out the responses, and wanting to get the value of each response, because the consumer of the code could have created other response types, such as XMLResponse, and my code would have no way of knowing about it.
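
To make the playback-engine idea concrete, here's a small sketch; conforming String itself to StringRepresentable is my own addition for illustration, not something from the API above:

extension String : StringRepresentable {
    var stringRepresentation: String { return self }
}

// One typed collection holding both kinds of recorded responses:
let recorded: [HTTPResponse] = [
    HTTPResponse(response: JSONResponse()),
    HTTPResponse(response: "plain text body")
]

recorded.forEach { print($0.response.stringRepresentation) }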

In a perfect world, we could do this:

class HTTPResponse<ResponseType : StringRepresentable> {
    var response: ResponseType
}

let responses = [json, string]  // responses is an array of HTTPResponse where ResponseType is unrealized

You would still need to cast the response in the collection use case; however, using the json instance directly would still give you full type validation.

Until we can get there though, I'll take the collection type of [HTTPResponse] over [AnyObject] every time.


Goto Fail and Swift

So this is a blog post about a pet-peeve of mine. The claim: "Swift cannot have bugs like Apple's goto-fail bug."

This is rubbish!

The biggest problem I have with much of the analysis of this bug is the focus on the missing braces around the if-statements. No, the problem is that the code was terrible to begin with, and it obviously had no tests, even though they would have been trivial to implement.

So, we have to start with the following assumptions to see just how we get this code in Swift:

  1. The code structure was just poor to begin with.
  2. Compiler settings were disabled so the "unreachable code paths" warning didn't show (or was ignored).
  3. Evidently tests were thought to be optional.

So, here's basically the same code in Swift:

import Foundation  // for OSStatus

enum SSLHashSHA1 {
    static func update(inout hashCxt: Int, _ call: Int) -> OSStatus {
        hashCxt = call
        if call == 4 { return -1 }
        return 0
    }
}

func isAnyOneSafe() -> OSStatus {
    var err: OSStatus = 0
    var hashCtx: Int = 0

    err = SSLHashSHA1.update(&hashCtx, 0)
    if (err != 0) {
        return err
    }
    err = SSLHashSHA1.update(&hashCtx, 1)
    if (err != 0) {
        return err
    }
    err = SSLHashSHA1.update(&hashCtx, 2)
    if (err != 0) {
        return err
    }
    err = SSLHashSHA1.update(&hashCtx, 3)
    if (err != 0) {
        return err
    }
    return err
    err = SSLHashSHA1.update(&hashCtx, 4)
    if (err != 0) {
        return err
    }

    return err
}

isAnyOneSafe()  // returns 0, should return -1 though

The point of this post is to debunk the myth that you are immune to this same type of stupid error just because you are writing code in Swift. That's simply not true. A "merge error" or a "copy and paste error" like the one above is pretty easy to make, and easy to miss, if people aren't paying attention and aren't code reviewing changes.

Moral of the story: the compiler tells you jack squat about the correctness of your code; it only tells you that your code follows the language rules well enough to generate machine code. You still need to write tests to make sure that your code is actually functioning correctly.
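
For instance, even a trivially small test would have flagged the sample above; the test fails against the buggy version, which is exactly the point. A sketch (the names are mine):

import XCTest

class GotoFailTests : XCTestCase {
    func testAllUpdatesAreChecked() {
        // The fifth update call is rigged to fail, so the function
        // should report that failure rather than success.
        XCTAssertEqual(isAnyOneSafe(), -1)
    }
}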

P.S. If your counter-argument is that this is terrible code and you shouldn't write it that way to begin with: of course it's terrible code! However, the fact is that the C version is also terrible code, and it shouldn't have been written that way either. Unfortunately, bad code happens regardless of language.

P.P.S. Swift does provide us a nice way to write this code that is, in my opinion, even better than some of the cleaned up C-versions from the links above:

enum ResultCode : ErrorType {
    case Error
}

enum SSLHashSHA1 {
    static func update(inout hashCxt: Int, _ call: Int) throws {
        hashCxt = call
        if call == 4 {
            throw ResultCode.Error
        }
        // Success is just a normal return; only the failure is thrown.
    }
}

func isAnyOneSafe() throws {
    var hashCtx: Int = 0

    try SSLHashSHA1.update(&hashCtx, 0)
    try SSLHashSHA1.update(&hashCtx, 1)
    try SSLHashSHA1.update(&hashCtx, 2)
    try SSLHashSHA1.update(&hashCtx, 3)
    try SSLHashSHA1.update(&hashCtx, 4)
}

do {
    try isAnyOneSafe()
}
catch {
    print("yay!")
}

Be Mindful of Your Filters

let items = 1...100
for i in items {
    if i % 2 != 0 { continue }
    print("\(i)")
}


let items = 1...100
items
    .filter() { $0 % 2 == 0 }
    .forEach() { print("\($0)") }

Two loops, two ways of looking at the problem. The second is better, yeah? It's cleaner, easier to read, easier to understand. All of those lead to more maintainable and less buggy code. So what's the problem?

Performance.

Sure, maybe you won't actually run into any particular issue with this usage, but what if you want to add some more filters?

let items = 1...100
items
    .filter() { $0 % 2 == 0 }
    .filter() { $0 % 3 == 0 }
    .filter() { $0 % 5 == 0 }
    .forEach() { print("\($0)") }

Now, again, in this specific case, it might not be too bad. After all, the first filter() loops through all 100 items, the second filter() only needs to go through 50 (all of the even numbers), and the last filter() only needs to run through 16 values. Finally, the forEach() is really only working on a collection of 3 items.

This version of the construct doesn't have the performance problem though:

let items = 1...100
for i in items {
    if i % 2 != 0 { continue }
    if i % 3 != 0 { continue }
    if i % 5 != 0 { continue }
    print("\(i)")
}

If you missed it, the performance problem is that every call to filter() is a potential O(n) operation. If you want to apply three filter() calls and a forEach(), that is going to be four times through the collection. In addition, each filter() creates a new array of your filtered items.

Bad mojo.

Now, you might be muttering to yourself: premature optimization! You haven't even profiled it! To that I say: why write code that you know has a good likelihood of being a performance problem? Especially if you don't even need to sacrifice the coding approach to make it better from the start?

Of course, we don't want to just throw away the chained filters because that style is a lot cleaner. Thankfully, there is already a Swift type that helps us out here: LazySequence (and its LazyCollection friend).

let items = 1...100
lazy(items)
    .filter() { $0 % 2 == 0 }
    .filter() { $0 % 3 == 0 }
    .filter() { $0 % 5 == 0 }
    .forEach() { print("\($0)") }

Simply wrapping items in a lazy() call will convert our Sequence into a LazySequence. This gives us the performance benefits of the more iterative-style approach, along with the benefits of the semantically broken-out operations.

This is pretty interesting to watch in a playground as well, with a large collection: you'll be able to see the filters being applied as an iteration over each new collection (the non-lazy version) or in sequence as the collection is iterated (the lazy version).
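
Here's a tiny sketch of that difference (assuming the same lazy() call used above); the prints make the order of evaluation visible:

let tiny = 1...4

print("eager:")
tiny
    .filter() { print("  filtering \($0)"); return $0 % 2 == 0 }
    .forEach() { print("  visiting \($0)") }

print("lazy:")
lazy(tiny)
    .filter() { print("  filtering \($0)"); return $0 % 2 == 0 }
    .forEach() { print("  visiting \($0)") }

// The eager version filters 1–4 first and only then visits 2 and 4; the lazy
// version interleaves: filter 1, filter 2, visit 2, filter 3, filter 4, visit 4.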

Update: August 11th, 2015 @ 2:15pm

Just to clarify, the above performance gains that we are getting with the use of lazy() are from the following:

  1. Reducing the number of times each element in the sequence is visited.
  2. Removing the intermediate copies of the filtered collection for each filter() or map() call.

This does not reduce the number of filter() calls, because those still need to be made per element; thus we are not really changing the time complexity, per se.

Here are some quick and dirty perf tests (2012 rMBP, Release build):

import Foundation

let items = 1...100000000

func measure(fn: () -> ()) -> NSTimeInterval {
    let start = NSDate().timeIntervalSince1970
    fn()
    return NSDate().timeIntervalSince1970 - start
}

var counter = 0

let time = measure() {
    items
        .filter() { $0 % 2 == 0 }
        .filter() { $0 % 3 == 0 }
        .filter() { $0 % 5 == 0 }
        .forEach() { counter = counter &+ $0 }
}

let lazyTime = measure() {
    lazy(items)
        .filter() { $0 % 2 == 0 }
        .filter() { $0 % 3 == 0 }
        .filter() { $0 % 5 == 0 }
        .forEach() { counter = counter &+ $0 }
}

print("counter: \(counter)")
print("time: \(time)")
print("lazy time: \(lazyTime)")

Output:

counter: 333333366666660
time: 0.795416116714478
lazy time: 0.286408185958862

Another run this time incorporating some map() calls:

let time = measure() {
    items
        .filter() { $0 % 2 == 0 }
        .map() { $0 * 2 }
        .filter() { $0 % 3 == 0 }
        .map() { $0 + 1 }
        .filter() { $0 % 5 == 0 }
        .forEach() { counter = counter &+ $0 }
}

let lazyTime = measure() {
    lazy(items)
        .filter() { $0 % 2 == 0 }
        .map() { $0 * 2 }
        .filter() { $0 % 3 == 0 }
        .map() { $0 + 1 }
        .filter() { $0 % 5 == 0 }
        .forEach() { counter = counter &+ $0 }
}

Output:

counter: 666666500000010
time: 1.12964105606079
lazy time: 0.129108905792236

The Matrix

I know, sadly, that I'm getting older, as there are new hires who haven't even seen The Matrix.

Neo saying 'Woah'

If you haven't seen it, well, I think you should; it is a fantastic movie. It's nice that Hollywood didn't try and turn it into a trilogy…

So, what's this have to do with Swift? Excellent question!

I'll submit this for my trolling enjoyment and your pondering: the Matrix is built with a static type system.

Spoon boy: Do not try and bend the spoon. That's impossible. Instead… only try to realize the truth.

Neo: What truth?

Spoon boy: There is no spoon.

Neo: There is no spoon?

Spoon boy: Then you'll see, that it is not the spoon that bends, it is only yourself.

Happy Thursday! And welcome performSelector. 🙂


Dynamic Swift

Xcode 7 Beta 4 is out, and it is a doozy! One of the changes is that performSelector is now available from Swift. Now, this isn't going to make your Swift types dynamic all of a sudden. What it does open the door for, however, is writing both your ObjC-style code and your Swift code all in the same language: Swift.

That's huge.

Here's some really ugly code to demonstrate:

import Foundation

@objc (Foo)
class Foo : NSObject {
    func bar() -> String {
        return "bar"
    }
}

let foo = Foo()
let value = foo.performSelector(Selector("bar")).takeUnretainedValue()
let result = value as? String
print("result: \(result)")

if let c = NSClassFromString("Foo") {
    let newFoo = c.alloc().performSelector(Selector("init")).takeUnretainedValue() as? Foo
    print("newFoo: \(newFoo?.bar())")
}

I'm actually pretty excited by this. This is another step closer, in my opinion, to winning in the pragmatic realm. There are certain types of problems (like a plug-in architecture) that are suited to this type of dynamic invocation.

This seems like baby steps to a great merging of the two worlds.

Update: July 23rd

It turns out, this is possible without using @objc as well:

class Bar : CustomStringConvertible {
    required init() {}
    var description: String { return "hahaha" }
}
if let c = NSClassFromString("PerformSelector.Bar") as? Bar.Type {
    let i = c.init()
    print("interesting: \(i)")
}

Just update "PerformSelector" to your module name, and voilà! For generic classes, you need to use it fist before it gets registered.

Good stuff!


Brent’s Feed Protocol Problem

Brent has a problem, posed on Twitter: protocols are seemingly messing with his mojo. It happens, and Equatable with its Self requirement can be quite annoying at times. However, his problem, at least as I understand it, is easily solvable, albeit with a bunch of code.

The key thing to note about Brent's problem is that the Folder and the Feed are tightly coupled; they have a strong type relationship with each other. So, we can use generics to help with this:

protocol Feed {
    var url: String {get}
}

protocol Folder {
    typealias FeedType

    var feeds: [FeedType] {get}
    func addFeeds(feedsToAdd: [FeedType])
}

class LocalFeed: Feed, Equatable {
    var url: String
    init(url: String) {
        self.url = url
    }
}

func ==(lhs: LocalFeed, rhs: LocalFeed) -> Bool {
    return lhs.url == rhs.url
}

class LocalFolder: Folder {
    var feeds = [LocalFeed]()

    func addFeeds(feedsToAdd: [LocalFeed]) {
        for oneFeed in feedsToAdd {
            if !feeds.contains(oneFeed) {
                feeds += [oneFeed]
            }
        }
    }
}

Basically: move Equatable to the concrete LocalFeed type (and every other feed type), and create a typealias on the Folder protocol so a specific type of Feed can be used.

We can go one step further, but we will see a slight problem when we do:

protocol Feed {
    var url: String {get}
}

protocol Folder {
    typealias FeedType : Equatable

    var feeds: [FeedType] { get set }
    mutating func addFeeds(feedsToAdd: [FeedType])
}

extension Folder {
    mutating func addFeeds(feedsToAdd: [FeedType]) {
        for oneFeed in feedsToAdd {
            if !feeds.contains(oneFeed) {
                feeds.append(oneFeed)
            }
        }
    }
}

class LocalFeed: Feed, Equatable {
    var url: String
    init(url: String) {
        self.url = url
    }
}

func ==(lhs: LocalFeed, rhs: LocalFeed) -> Bool {
    return lhs.url == rhs.url
}

class LocalFolder: Folder {
    var feeds = [LocalFeed]()
}

Now all Folder implementations get the addFeeds functionality for free. BUT, we have to clearly mark what could potentially be mutating. The big side effect of this is that feeds becomes mutable outside of addFeeds, which is not good.
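
For example, nothing stops a caller from bypassing addFeeds entirely (a hypothetical misuse of the LocalFolder just defined):

let sneaky = LocalFolder()
sneaky.feeds = [LocalFeed(url: "http://example.com/feed"),
                LocalFeed(url: "http://example.com/feed")]
// The uniquing logic in addFeeds never ran, so the duplicate slips straight in.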

The way to fix this, or the only way I know how to fix it, is to create a protocol that, by convention, people agree not to use. Something like this:

protocol Feed {
    var url: String {get}
}

protocol Folder : _Folder {
    typealias FeedType : Equatable

    var feeds: [FeedType] { get }
    mutating func addFeeds(feedsToAdd: [FeedType])
}

protocol _Folder {
    typealias FeedType : Equatable
    var _feedStore: [FeedType] { get set }
}

extension Folder {
    mutating func addFeeds(feedsToAdd: [FeedType]) {
        for oneFeed in feedsToAdd {
            if !feeds.contains(oneFeed) {
                _feedStore.append(oneFeed)
            }
        }
    }
}

class LocalFeed : Feed, Equatable {
    var url: String
    init(url: String) {
        self.url = url
    }
}

func ==<T : Feed>(lhs: T, rhs: T) -> Bool {
    return lhs.url == rhs.url
}

class LocalFolder : Folder {
    var _feedStore : [LocalFeed] = []
    var feeds: [LocalFeed] { return _feedStore }
}

The _Folder protocol hides the mutation exposure of _feedStore. Oh, and while we're at it, let's change == to be generic and work for any type of Feed by default by comparing their respective url members.

Anyhow, hope that helps somewhat.


Interview Questions

Interviewing is hard work. Somehow, in a single hour (or some other arbitrary span of time), you are supposed to give an opinion of yay! or nay! on whether or not to hire someone. Sure, sometimes they get N chances, but if one of those goes wrong, then what?

Unfortunately, there really are not a lot of good ways to measure a candidate's ability. Sure, a GitHub history might be OK, but honestly, I'm not going to spend hours looking through that for each candidate. Also, sometimes the code we push up to GitHub is terrible, especially as we are iterating toward a solution. So how do you separate the good from the bad?

Work experience… well, that's nice, but it really only spins things how the candidate wants them spun. I've had candidates who clearly embellished their involvement and contributions to a project. So it's not really enough to say, "Wow, cool, you worked at Google for three years doing X! You are clearly a hire!" I've also had candidates who undersell the work they've done (I typically fall into this category too).

So basically we are back to a set of interview questions, where we bombard a candidate with hopefully interesting and reflective questions. Some questions, though, are just not that great.

Take this classic problem: "You have 8 balls, identical in every way except that one of them is heavier than the others. You are given an old-school balance scale; your job is to find the heavier ball."

One thing I really dislike about this question is that the interviewer is typically biased toward finding the answer in the fewest number of weighings, which, to me, makes it a bit of a trick question (2 weighings vs. 3). Also, answers that take an iterative approach are usually looked down upon as not being correct, even though they are actually the better approach when generalizing the problem and moving it to code (and arguably in the real-world scenario as well, depending on the speed of your balance scale).

Here are the actual considerations of the problem:

  1. The cost of counting the initial set of balls
  2. The cost of partitioning the balls into various groups
  3. The cost of determining the weight of each partition of balls
  4. The act of "weighing" the two partitions of balls that can go on the scale

These basic steps need to be repeated until the answer is found.

Now, when dealing with physical objects in the real world, the cost breakdown is: O(n) (or O(1) if we are "told"), O(n), O(1), and O(1). We'll assume that you put the balls into a tray of some sort, so moving them to and from the scale takes some small constant time. There's also the time it takes the balance to move, but that should also be a small constant (after all, the weight difference needs to be large enough that a random distribution of balls in a tray wouldn't cause a false negative).

If we are asked to implement this solution in code, the cost breakdown is this: O(n) (need to count them at least once to know), O(1) (in the best case, just index offset updates), O(n), O(1).

Notice anything in particular? You always have to touch every single ball to find the answer. The only question is: do any of these operations have a constant-time cost that dwarfs the apparent O(1) cost? I'd argue no. Maybe you could make that case in the real world, but most definitely not when asked to generalize this solution to N weighted items and write the algorithm to do it.
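
Here's a minimal sketch of the generalized iterative answer, assuming a "weighing" just means comparing numbers:

func heavierIndex(weights: [Int]) -> Int? {
    if weights.isEmpty { return nil }
    var maxIndex = 0
    // A single pass: every item gets touched exactly once.
    for i in 1..<weights.count {
        if weights[i] > weights[maxIndex] {
            maxIndex = i
        }
    }
    return maxIndex
}

heavierIndex([5, 5, 5, 7, 5, 5, 5, 5])   // 3 – the heavier "ball"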

If I were to ask this question, I would ask it exactly as I framed it above. Regardless of the path the candidate chose, I would expect them to be able to give reasons why they chose it. However, if you gave me the binary-search answer and simply told me that it is the best because it has the fewest weighings, I would simply look at you and say, "I believe I can solve it faster with more weighings."


Protocols and Hidden Details

Protocols (and their extensions)… those glorious things that they are. Well, mostly. I'm still finding a few places where they come up a bit short, or at least do not provide a good long-term solution.

Let's say you want to create a protocol, like I did with the Key Value Coding post.

protocol KeyValueCodable {
    mutating func setValue<T>(value: T, forKey: String) throws
    func getValue<T>(forKey: String) throws -> T
}

What this says is that we want a protocol that defines a setValue() and a getValue() function. No problems here.

Now, it's time to implement this protocol.

struct Person : KeyValueCodable {
    private var kvcstore: [String:Any] = [:]

    mutating func setValue<T>(value: T, forKey: String) throws {
        // a bunch of logic to validate the correct type for the key and value; throw on error.
    }

    func getValue<T>(forKey: String) throws -> T {
        // a bunch of logic to validate the correct type for the key and value; throw on error.
    }

    var name: String {
        get { return try! getValue("name") as String }
        set { try! setValue(newValue, forKey: "name") }
    }
}

Well, what happens when we want to add a new type, Address, that also conforms to the KeyValueCodable protocol? We are going to need to:

  1. Implement the kvcstore backing store.
  2. Duplicate the setValue() function.
  3. Duplicate the getValue() function.

Ok, that definitely sucks. There are two ways to get around this, namely:

  1. Throw away the protocol approach and use a base class with default implementations.
  2. Use protocol extensions with default implementations.

Unfortunately, both have significant drawbacks. The class version forces you into reference semantics for all KeyValueCodable types, which is less than desirable. With protocol extensions, we are forced to make our internal implementation publicly exposed (well, exposed at the same access level as the protocol).

The basic problem is this: there is no way to create one protocol for consumers of an API and another protocol for implementors of that API. rdar://21850541

This is one of the fundamental arguments about Swift's public, internal, and private access modifiers: they do not allow for this type of design pattern, and it's a necessary one.

So, in this strongly typed language of ours, we solve this problem by convention!

protocol KeyValueCodable : _KeyValueCodable {
    mutating func setValue<T>(value: T, forKey: String) throws
    func getValue<T>(forKey: String) throws -> T
}

protocol _KeyValueCodable {
    static var _codables: [KVCTypeInfo] { get }
    var _kvcstore: Dictionary<String, Any> { get set }
}

extension KeyValueCodable {
    mutating func setValue<T>(value: T, forKey: String) {
        // default implementation goes here...
    }

    func getValue<T>(forKey: String) -> T {
        // default implementation goes here...
    }
}

That's right: create another protocol prefixed with _, and everyone knows you are up to some dirty little secret tricks that you unfortunately need to expose to everyone so they can violate all of your assumptions. Good times.

This leads me to my next problem: these default implementations are going to need to work on some data. It would be really nice if we could also create a default store in our extension so that implementors do not have to do this each time to get the default behavior (rdar://21844730):

struct Person : KeyValueCodable {
    var _kvcstore: Dictionary<String, Any> = [:]
}

Just copy/paste those member fields (sure, it's just one in this example) for each type that conforms to KeyValueCodable. I sure hope the protocol doesn't change how it needs to store that backing information; otherwise you're out of luck.
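
For instance, a hypothetical second conforming type (the Address mentioned above), mirroring the Person sketch, ends up repeating that same field:

struct Address : KeyValueCodable {
    var _kvcstore: Dictionary<String, Any> = [:]   // the same boiler-plate, copied again
}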

Protocols are pretty great, and made even better with extensions. However, there is still more needed for them to reach their full potential.


Key Value Coding in Swift

I was having a discussion on Twitter with someone about KVC and creating a typed version. Well, I do not think we can create a fully type-safe version, at least not at compile time. However, we should be able to come close at runtime, I think.

Anyway, here is a stab at an implementation.

The basic protocol is simple, just setValue() and getValue() functions:

protocol KeyValueCodable : _KeyValueCodable {
    mutating func setValue<T>(value: T, forKey: String) throws
    func getValue<T>(forKey: String) throws -> T
}

However, some implementation details are necessary to support a default implementation via protocol extensions:

protocol _KeyValueCodable {
    static var _codables: [KVCTypeInfo] { get }
    var _kvcstore: Dictionary<String, Any> { get set }
}

struct KVCTypeInfo : Hashable {
    let key: String
    let type: Any.Type

    // Terrible hash value, just FYI.
    var hashValue: Int { return key.hashValue &* 3 }
}

func ==(lhs: KVCTypeInfo, rhs: KVCTypeInfo) -> Bool {
    return lhs.key == rhs.key && lhs.type == rhs.type
}

As you might be gathering, the basic premise is to use a backing store to maintain our values and type information. The implementation can then verify that the data coming in is correct.

extension KeyValueCodable {
    mutating func setValue<T>(value: T, forKey: String) {
        for codable in Self._codables {
            if codable.key == forKey {
                if value.dynamicType != codable.type {
                    fatalError("The stored type information does not match the given type.")
                }

                _kvcstore[forKey] = value
                return
            }
        }

        fatalError("Unable to set the value for key: \(forKey).")
    }

    func getValue<T>(forKey: String) -> T {
        guard let stored = _kvcstore[forKey] else {
            fatalError("The property is not set; default values are not supported.")
        }

        guard let value = stored as? T else {
            fatalError("The stored value does not match the expected type.")
        }

        return value
    }
}

Of course, the errors could be more meaningful, but I'll leave that as an exercise for the reader.

Initially I looked at using throws to capture the errors. However, the usage of the code becomes quite annoying. Also, there is a fairly big limitation: computed properties (which are what we'll need for the next section) do not support throwing (http://www.openradar.me/21820924).

Ok, finally, let's implement this in a type:

struct Person : KeyValueCodable {
    static var _codables: [KVCTypeInfo] { return [ _idKey, _fnameKey ]}
    var _kvcstore: Dictionary<String, Any> = [:]
}

extension Person {
    private static let _idKey = KVCTypeInfo(key: "id", type: Int.self)
    private static let _fnameKey = KVCTypeInfo(key: "fname", type: String.self)

    init(id: Int, fname: String) {
        self.id = id
        self.fname = fname
    }

    var id: Int {
        get { return getValue("id") as Int }
        set { setValue(newValue, forKey: "id") }
    }

    var fname: String {
        get { return getValue("fname") as String }
        set { setValue(newValue, forKey: "fname") }
    }
}

All of the stored properties are put into the non-extended type. If we were using classes, this could live happily in a base class. The extension contains all of the meat and, unfortunately, all the boilerplate code required to make this work.

And the usage code:

var p = Person(id: 123, fname: "David")
p.id
p.fname

let id: Int = p.getValue("id")
let fname: String = p.getValue("fname")

p.setValue(21, forKey: "id")
p.setValue("Sally", forKey: "fname")

let id1: Int = p.getValue("id")
let fname1: String = p.getValue("fname")

The full playground source can be found here: https://gist.github.com/owensd/82af8e362273e46d70f9.

I'll leave it up to you on how useful this is.
