Xcode, Frameworks, and Embedded Frameworks

Last week I spent the better part of a day trying to figure out what exactly was going on when I tried to build a component using SourceKittenFramework.

It turns out that not all frameworks are created equal in Xcode. Honestly, this wouldn’t be such a big deal if Swift properly supported static libraries; the rabbit hole for this problem is rooted in a bunch of hacks needed to make Swift command-line tools with dependencies work properly.

I’ll provide a little story about my experience this week.

Embedded Frameworks

Xcode supports the concept of embedding frameworks into your bundle. This is essentially the same thing as the old “Copy Files” build phase, where you can copy a dependency into your app bundle under a particular directory, such as “Frameworks”.

However, there is an extremely important distinction between the “Copy Files” build phase and the “Embed Frameworks” option.

The normal output of a framework looks like this:

├── Headers -> Versions/Current/Headers
├── Modules -> Versions/Current/Modules
├── MyFramework -> Versions/Current/MyFramework
├── Resources -> Versions/Current/Resources
└── Versions
    ├── A
    │   ├── Headers
    │   │   ├── MyFramework-Swift.h
    │   │   └── MyFramework.h
    │   ├── Modules
    │   │   ├── MyFramework.swiftmodule
    │   │   │   ├── x86_64.swiftdoc
    │   │   │   └── x86_64.swiftmodule
    │   │   └── module.modulemap
    │   ├── MyFramework
    │   └── Resources
    │       └── Info.plist
    └── Current -> A

This provides all of the necessary content to be able to use this framework both at runtime and as a developer-friendly framework; it has the headers and the module definitions necessary when building and linking against the library.

However, the frameworks that get embedded strip out all of that information and you end up with something like this:

├── MyFramework -> Versions/Current/MyFramework
├── Resources -> Versions/Current/Resources
└── Versions
    ├── A
    │   ├── MyFramework
    │   ├── Resources
    │   │   └── Info.plist
    │   └── _CodeSignature
    │       └── CodeResources
    └── Current -> A

This contains only the content required at runtime. It makes sense why Xcode would do this: after all, the framework is being packaged up for use within a target, so it’s already built. It also helps reduce the size of bundles by removing everything that simply isn’t necessary at runtime.
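A quick way to tell the two flavors apart is to look for the build-time directories. The sketch below uses current Swift syntax rather than the Swift 2.2 of the time, and the function name missingBuildTimeContent(inFrameworkAt:) is hypothetical. It assumes that a framework with no Headers or Modules entry at its top level is the stripped, embedded variant.

```swift
import Foundation

/// Returns the build-time entries ("Headers", "Modules") that are missing
/// from the framework bundle at `url`. An empty result suggests the
/// framework can be linked against; a non-empty one suggests it was embedded.
/// (Hypothetical helper, not part of Xcode or SourceKitten.)
func missingBuildTimeContent(inFrameworkAt url: URL) -> [String] {
    let fileManager = FileManager.default
    return ["Headers", "Modules"].filter { name in
        // fileExists(atPath:) follows the top-level symlinks into Versions/Current.
        !fileManager.fileExists(atPath: url.appendingPathComponent(name).path)
    }
}
```

Running it against the copy under /usr/local/Frameworks and against the copy in the target's build output should make the difference between the two obvious.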

Had I only known this before…

Unexpected Outcomes

Now… the problem with all this, of course, shows up when we start doing hacks to make things work in the ever-changing landscape of Swift and Xcode.

I ran into this when trying to use SourceKitten. As Swift doesn’t really have a good way to build testable command-line tools or static libraries, SourceKitten follows the pattern that a lot of other tools do: it builds an app target and then copies out the CLI tool and packages up its dependencies.

The start of the problems…

I don’t use Carthage or CocoaPods… but the reasons for that are outside of the scope of this post. Needless to say, I simply cloned the repo and ran make install as the ReadMe told me to do.

Everything is happily pulled down and everything is built properly. Then, SourceKitten is installed into my /usr/local path. Great!

Specifically, it creates a structure like this:

/usr/local/Frameworks/SourceKittenFramework.framework
/usr/local/bin/sourcekitten

The @rpath for sourcekitten is then set to @executable_path/../Frameworks.

There is nothing wrong with this setup. It works great.

However… remember, my intention is to now take SourceKittenFramework.framework, link it into my own app, and start converting my prototype that was shelling out to sourcekitten to directly use the API.

So I do what seems reasonable:

  1. Create my app
  2. Create a lib folder
  3. Copy the SourceKittenFramework.framework into lib
  4. Link the framework
  5. Add import SourceKittenFramework
  6. Build

And then I’m greeted with this:

<path/to/file:line:column> error: no such module 'SourceKittenFramework'
import SourceKittenFramework
       ^

Um… what!?

So I spend some time making sure my framework search paths are correct, inspecting the SourceKitten project, searching the web… no idea what is going on.

Ok… so I tried building from within Xcode and looking at the output of the SourceKittenFramework target itself. I still don’t understand the problem yet.

I copy over the version of SourceKittenFramework that is built from the target and I get this:

<path/to/file:line:col>: error: missing required modules: 'SWXMLHash', 'Yaml'
import SourceKittenFramework
       ^

Ok… what is happening!? I still haven’t figured out all of the specialness of “embedded frameworks” at this point.

I tried mucking with the framework search paths to point to the Yaml and SWXMLHash frameworks that I clearly see are within the frameworks, but nothing seems to be working… Including copying the Yaml and SWXMLHash frameworks to be siblings of the SourceKittenFramework.framework.

Ok… maybe source doesn’t work!? I download the framework from the releases page… SAME FLIPPING ERROR!

At this point, I’m quite frustrated, have no idea what is going on, and decide to call it for the day.

/ragequit

Taking Time

I come back the next day. OK, I’m going to get this to work!

I do what I thought were mostly identical steps to the ones above.

Build success… ok, what the flying flip?

The key difference that I did this time was that I copied the Yaml.framework, SWXMLHash.framework, and SourceKittenFramework.framework from the target output of the SourceKittenFramework target.

At this point I was curious as to what had happened. This is where I started doing an analysis of what is different between each of the frameworks. See the “Embedded Frameworks” section above.

Conclusion

If you are providing frameworks to people that you expect to be able to develop with and not just use at runtime, please be sure to distribute the non-embedded framework version! Otherwise, well, all of your consumers will face the above issues.

SourceKitten tracking issue: https://github.com/jpsim/SourceKitten/issues/232

SourceKitten potential fix: https://github.com/jpsim/SourceKitten/pull/233

SE-0117 – The Proposal of Doom

Proposal SE-0117 is causing quite the ruckus. It’s actually a bit amusing to watch as people clamor against preventing inheritance by default.

In case you haven’t noticed… Swift really wants to move as much as possible to being statically defined. This isn’t about “static is better” and “dynamic sucks”, it’s about the Core Team’s belief that being able to author code that the compiler can help you reason about is vastly safer than having code the compiler cannot reason about.

There are two very real and tangible side effects here:

  1. Resiliency
  2. Performance

Resiliency

By default, Swift wants the code you write to put you in a spot where you are not pigeonholed into a design that you do not want in the future. Today, when you create a class, if you forget to add the final keyword, then that’s it! It’s a breaking change to later close down that class.

That is exactly the type of design error that Swift is trying to prevent when providing sensible defaults.
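As a minimal sketch of that default (the Counter type is hypothetical, and the syntax is current Swift): a class that ships closed can be opened up in a later release without breaking anyone, whereas closing a class that shipped open breaks every subclass already out there.

```swift
// Version 1 of a hypothetical library type: closed from the start.
final class Counter {
    private(set) var count = 0
    func increment() { count += 1 }
}

// Clients can still use the type; they just cannot subclass it.
let counter = Counter()
counter.increment()
counter.increment()

// class LoudCounter: Counter {}   // rejected: Counter is final
```

Dropping final in version 2 is purely additive. Adding it in version 2 is the breaking change described above.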

Performance

If the compiler can remove all of the virtual dispatching required in a dynamic type hierarchy, performance can get measurably better. Even with all of the optimizations of objc_msgSend and friends, it’s still magnitudes slower than static dispatch.

I get the argument: pragmatic programmers claim that they need the ability to work around bugs in frameworks that ship because developers make mistakes. Yes, they do… and code bases that use inheritance for this are writing hacks in their code. Should this be prevented? No… however, this proposal doesn’t actually say that it should be prevented either.

The problem is this: Swift is still in its infancy. Sure, version 3 is coming out soon, but really, let’s call it what it is closer to: version 0.3. It’s suitable for certain classes of applications, but it’s certainly not suitable for large-scale teams (100s of developers) to author large swaths of code in it (I’m talking in the hundreds of thousands or millions of lines of code here). Sure… you can, but it’s painful and expensive (e.g. see the upcoming Swift 2 to Swift 3 migration!).

What’s most important for Swift to get right up front is the underlying model. What’s the right set of defaults for everything to produce code that can benefit from tools to help analyze for correctness? Also, what is the right set of defaults to guard against code authors doing things wrong?

Taking a Step Back

Swift has two fundamental ways of expressing types: structs and classes. The primary difference is not that classes support inheritance and structs do not. The primary difference is that structs are value types and classes are reference types.

Swift could completely remove the idea of inheritance out of classes and they would still be important to have in the Swift type system.

Why is this important? Simple. The way in which we start to model future Swift types is not around do I need inheritance or not, it’s around do I need a raw value type, a reference type, or value semantics (usually a mixture of a struct API surface and an internal backing reference type store).
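The last of those options, a struct API surface over an internal reference-type store, is the copy-on-write pattern. Here is a rough sketch in current Swift; IntBag and Storage are hypothetical names, and isKnownUniquelyReferenced is the standard library call that makes the pattern work.

```swift
// Reference-type backing store (hypothetical).
final class Storage {
    var values: [Int]
    init(values: [Int]) { self.values = values }
}

// Struct API surface: copies share storage until one of them mutates.
struct IntBag {
    private var storage = Storage(values: [])

    var values: [Int] { return storage.values }

    mutating func append(_ value: Int) {
        // Copy the store only if another IntBag still points at it.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(values: storage.values)
        }
        storage.values.append(value)
    }
}

var a = IntBag()
a.append(1)
var b = a        // shares the same Storage instance for now
b.append(2)      // triggers the copy, so a is unaffected
```

Nothing here involves inheritance, which is the point: the class exists for its reference semantics, not for its hierarchy.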

As an API author, if I need to have a value type, I cannot use inheritance. And more important, especially with regards to many of the arguments against this proposal, you cannot fix any issues with these struct types with inheritance either. It’s at this point that I find your arguments extremely weak: if it is so crucial that inheritance be enabled by default for class types, why are you not more concerned about Swift’s focus on value types and using value semantics for APIs? After all, there are pretty much no APIs in Swift’s libraries that you will be able to patch this way.

[Un]Safely Breaking Out

Let’s be honest here… the crux of the argument is that there is some error that has happened in the library you’re consuming and it’s either closed source or you are unable to modify the source for any number of reasons… so how do you fix it?

With unsafe code.

There’s absolutely nothing in this proposal that prevents Swift from providing tools to get you access to what you need. For example, imagine a world where you could download a developer Swift module that contains all of the unoptimized code.

This would allow you to write something like this:

class SuperHack : @unsafe YoDontSubclassMe {
    @unsafe override func dontTouch() {
        // Implement your super hack
    }
}

Why is this better? Well, I think it’s better for a variety of reasons:

  1. The code author is able to ship vastly more performant code, with the design they intended, to all consumers
  2. The specific use case that you needed to “fix” is clearly documented in your code as being unsafe; future maintainers will know this in a compiler verifiable way (e.g. not a comment)
  3. Future updates to a framework can easily be audited (e.g. all hacks can be used and logged against specific versions of frameworks)

So let’s maybe not get so melodramatic that providing no inheritance by default on classes is the end of the world?

Looping with iterate() and takeWhile()

There’s a funny thing that happens when you remove a language construct that actually provides value: you need to re-invent ways to support that construct.

The proposal Add scan, takeWhile, dropWhile, and iterate to the stdlib provides a basic way to get back the lost functionality of the C-style for-loop, specifically with iterate() and takeWhile().

The key thing to remember for the implementation is that we must have a lazy version of iterate() in order for this to be semantically comparable to the C-style for-loop that is being replaced. Further, we need to be extremely careful when using the proposed takeWhile() (and other) extensions to be sure we’re getting the lazy versions when we need them.

So let’s look at what an implementation might look like (this is using Swift 2.2). We are going to want to replicate the following loop:

for var n = 1; n < 10; n = n * 2 {
    print("\(n)", terminator: ", ")
}

This loop simply outputs: 1, 2, 4, 8, 

Ok, first we need to define the iterate function:

// Creates a lazy sequence that begins at start. The next item in the
// sequence is calculated using the stride function.
func iterate<T>(initial from: T, stride: T throws -> T) -> StridingSequence<T> {
    return StridingSequence(initial: from, stride: stride)
}

This is going to require that we return a SequenceType (this is renamed to Sequence in Swift 3). But remember, we want this to be lazy, so we really need to conform to the LazySequenceType protocol. That type is going to need to know the starting point and the mechanism to stride through the desired sequence.

struct StridingSequence<Element> : LazySequenceType {
    let initial: Element
    let stride: Element throws -> Element

    init(initial: Element, stride: Element throws -> Element) {
        self.initial = initial
        self.stride = stride
    }

    func generate() -> StridingSequenceGenerator<Element> {
        return StridingSequenceGenerator(initial: initial, stride: stride)
    }
}

Of course, now the StridingSequence is going to need the underlying GeneratorType implementation: StridingSequenceGenerator (the GeneratorType protocol is renamed to IteratorProtocol in Swift 3).

struct StridingSequenceGenerator<Element> : GeneratorType, SequenceType {
    let initial: Element
    let stride: Element throws -> Element
    var current: Element?

    init(initial: Element, stride: Element throws -> Element) {
        self.initial = initial
        self.stride = stride
        self.current = initial
    }

    mutating func next() -> Element? {
        defer {
            if let c = current {
                current = try? stride(c)
            }
            else {
                current = nil
            }
        }
        return current
    }
}
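For comparison, here is a rough sketch of the same idea under the Swift 3 names (Sequence and IteratorProtocol), written for a current compiler. The IterateSequence name is hypothetical and this is not the code from the gist; one type plays both roles by conforming to both protocols.

```swift
// One value type that is both the sequence and its own iterator.
struct IterateSequence<Element>: Sequence, IteratorProtocol {
    private var current: Element?
    private let stride: (Element) throws -> Element

    init(initial: Element, stride: @escaping (Element) throws -> Element) {
        self.current = initial
        self.stride = stride
    }

    mutating func next() -> Element? {
        guard let value = current else { return nil }
        // A throwing stride ends the sequence, as in the version above.
        current = try? stride(value)
        return value
    }
}

// Still infinite, so take only what is needed.
let firstFour = Array(IterateSequence(initial: 1, stride: { $0 * 2 }).prefix(4))
```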

OK… this is getting to be a lot of code. But there’s going to be a big payoff, right?

What we have now is an infinite sequence. We can test it out like so:

for n in iterate(initial: Int(1), stride: { $0 * 2 }) {
    if n >= 10 { break }
    print("\(n)", terminator: ", ")
}

At this point, we are pretty close to getting what we want. The last question is how to move the condition out of the body of the loop and into the for-loop construct?

We have two basic options:

  1. Add a while: parameter to the iterate() function, or
  2. Add a takeWhile() function that can be chained.

The proposal that I linked to earlier proposes to add a takeWhile() function. This is probably the “better” way to go given that we are creating a sequence and it’s feasible that we may want to do other operations, like filtering.

Unfortunately, this means a bit more code.

Let’s start with the extension to LazySequenceType:

extension LazySequenceType {
    typealias ElementType = Self.Elements.Generator.Element
    func takeWhile(predicate: ElementType -> Bool)
        -> LazyTakeWhileSequence<Self.Elements>
    {
        return LazyTakeWhileSequence(base: self.elements, takeWhile: predicate)
    }
}

This requires us to create another sequence type that knows how to walk our original sequence type but stop when the given condition is met.

struct LazyTakeWhileSequence<Base: SequenceType> : LazySequenceType {
    let base: Base
    let predicate: Base.Generator.Element -> Bool

    init(base: Base, takeWhile predicate: Base.Generator.Element -> Bool) {
        self.base = base
        self.predicate = predicate
    }

    func generate() -> LazyTakeWhileGenerator<Base.Generator> {
        return LazyTakeWhileGenerator(base: base.generate(), takeWhile: predicate)
    }
}

And then this is going to require another generator type that gives us the next item in the sequence, and nil once the condition is no longer met.

struct LazyTakeWhileGenerator<Base: GeneratorType> : GeneratorType, SequenceType {
    var base: Base
    var predicate: Base.Element -> Bool

    init(base: Base, takeWhile predicate: Base.Element -> Bool) {
        self.base = base
        self.predicate = predicate
    }

    mutating func next() -> Base.Element? {
        if let n = base.next() where predicate(n) {
            return n
        }
        return nil
    }
}

Whew! Now we can write this:

for n in iterate(initial: Int(1), stride: { $0 * 2 }).takeWhile({ $0 < 10 }) {
    print("\(n)", terminator: ", ")
}

Of course, we could have just written this and been done with it:

for var n = 1; n < 10; n = n * 2 {
    print("\(n)", terminator: ", ")
}

Summary

It’s honestly really difficult for me to take this approach to be objectively better, especially when I have to write the supporting library code ;). Yes, there are clearly benefits to an iterate() function that you can then perform different operations on, and maybe if I needed to perform some type of filtering with the above loop like so:

let items = iterate(initial: Int(1), stride: { $0 * 2 })
    .filter({ $0 != 4})
    .takeWhile({ $0 < 10 })

for n in items {
    print("\(n)", terminator: ", ")
}

I could see the benefit of this approach for some use cases. However, there are also objectively bad things about the approach above. For one, there is a crap ton of code that needs to be written just to get this to work, and I’m not done. I need to do similar stuff for collection types and the non-lazy versions as well.

The other thing: I don’t find it any less cryptic. Sure, things are labeled a bit better, but there’s a lot more syntax in the way now (using an @autoclosure would be nice, but you cannot use anonymous variables like $0). In fact, it’s only after moving the iterate() code onto its own line that things start to become a bit clearer.

Anyhow, if you’re interested in how to implement this, it’s all here. And if there is actually an easier way, PLEASE let me know.

Full gist is here: iterate.swift.
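For readers on a later toolchain: the proposal was accepted, and current Swift ships sequence(first:next:) in the role of iterate() and prefix(while:) in the role of takeWhile(). A minimal sketch of the same loop, assuming a Swift 5 compiler:

```swift
// sequence(first:next:) produces 1, 2, 4, 8, 16, ... on demand.
// .lazy keeps prefix(while:) from building an intermediate array.
let powersOfTwo = sequence(first: 1, next: { $0 * 2 })
    .lazy
    .prefix(while: { $0 < 10 })

for n in powersOfTwo {
    print("\(n)", terminator: ", ")
}
```

The call site is about as long as the takeWhile() version above, but none of the supporting types have to be written by hand.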

APIs Matter

I ran a poll on Twitter today about API preference between two options (three if you count the updated version):

// the very verbose range-based loop
for n in 0.stride(through: 10, by: 2).reverse() {
    print(n)
}

// the more concise range-based loop
for n in 10.stride(through: 0, by: -2) {
    print(n)
}

// c-style loop
for var n = 10; n >= 0; n -= 2 {
    print(n)
}

And even earlier I wrote this blog article: For Loops and Forced Abstractions.

The primary point of the entry was about being forced into abstractions when they are not necessary.

One of the things that really bothered me were the examples in the Swift blog:

for i in (1...10).reverse() {
    print(i)
}

for i in 0.stride(to: 10, by: 2) {
    print(i)
}

In my opinion, those are really terrible APIs. In addition to being arguably just as bad to visually parse as the c-style for-loop, they still do not convey the intent behind what is being done: they are supposed to be creating a range, and only the first usage even comes close to looking like that. Not only that, there is no symmetry involved in incrementing and decrementing ranges.

For example, this is invalid in Swift: 10...0. So we have, what I would call, a broken and partial abstraction over the concept of “ranges” or “intervals”. Ironically, that’s exactly the API we need, especially when we are removing the c-style for-loop.

Let’s take a look at the Strideable protocol:

/// Conforming types are notionally continuous, one-dimensional
/// values that can be offset and measured.
public protocol Strideable : Comparable {
    /// A type that can represent the distance between two values of `Self`.
    associatedtype Stride : SignedNumberType
    /// Returns a stride `x` such that `self.advancedBy(x)` approximates
    /// `other`.
    ///
    /// - Complexity: O(1).
    ///
    /// - SeeAlso: `RandomAccessIndexType`'s `distanceTo`, which provides a
    ///   stronger semantic guarantee.
    @warn_unused_result
    public func distanceTo(other: Self) -> Self.Stride
    /// Returns a `Self` `x` such that `self.distanceTo(x)` approximates
    /// `n`.
    ///
    /// - Complexity: O(1).
    ///
    /// - SeeAlso: `RandomAccessIndexType`'s `advancedBy`, which
    ///   provides a stronger semantic guarantee.
    @warn_unused_result
    public func advancedBy(n: Self.Stride) -> Self
}

This seems fairly clear: it’s an abstraction over an item that can be incremented or decremented by some Self.Stride value. In addition, we can also determine the distance between two Strideable instances, so long as they share the same Stride associated type.

This is one layer of the abstraction onion, but OK. When applied to numeric types, this gives us the nice ability to add and subtract in a generic and type-safe manner.
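A small illustration of those two requirements, using the names they carry in current Swift (advanced(by:) and distance(to:) rather than advancedBy and distanceTo); the constants are only for illustration:

```swift
let start = 10

// Offset a value by a signed stride.
let next = start.advanced(by: -2)     // 8

// Measure the signed distance between two values.
let gap = start.distance(to: 0)       // -10

// The same calls work for any Strideable type, such as Double.
let half = 1.5.advanced(by: 0.25)     // 1.75
```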

The problem, in my opinion, is the extension:

extension Strideable {
    /// Returns the sequence of values (`self`, `self + stride`, `self +
    /// stride + stride`, ... *last*) where *last* is the last value in
    /// the progression that is less than `end`.
    @warn_unused_result
    public func stride(to end: Self, by stride: Self.Stride) -> StrideTo<Self>
}

WHAT!?

This makes absolutely no sense to me. I actually find this API really bad on multiple counts:

  1. Why does a type that is responsible for incrementing itself now have the ability to create a sequence of values?
  2. What definition of “stride” ever means “create a sequence”?
  3. The API has a variable named stride that has a different conceptual meaning altogether than the function with the same name.

In my opinion, this is just a bad API. Further, this goes on to confuse matters at the call sites.

If we must get rid of the c-style for loops, then we need to look at what the alternative is: for-in.

So what is a for-in loop construct?

You use the for-in loop to iterate over a sequence, such as ranges of numbers, items in an array, or characters in a string.

Source: Swift Programming Language: Control Flow.

Great! So what we really want is the ability to create such a range with as little abstraction as possible. The stride API is attempting to do that, but it fails to do so in an appropriate manner.

Instead, we want an API that can be called like this:

for n in range(from: 10, to: 0, by: 2) {
}

And here’s what the signature looks like:

func range<T: Strideable>(
    from start: T,
    to end: T,
    by step: T.Stride = 1) -> Interval<T>

NOTE: Sure, there need to be other variants to support open, closed, left-open, and right-open intervals, but that’s irrelevant for this purpose.
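One way this could be built on top of Strideable, sketched in current Swift. Several things here are assumptions rather than part of the design above: the sketch returns the standard library's StrideTo instead of a dedicated Interval type, it excludes the end value, and it treats the step as a magnitude, inferring the direction from the bounds so that the call site shown earlier counts down.

```swift
// Hypothetical free function composing with Strideable via stride(from:to:by:).
func range<T: Strideable>(from start: T, to end: T, by step: T.Stride = 1) -> StrideTo<T> {
    // Ignore the sign of the step; the bounds decide the direction.
    // (A step of zero still traps inside stride(from:to:by:).)
    let magnitude = step < 0 ? -step : step
    return stride(from: start, to: end, by: start <= end ? magnitude : -magnitude)
}

let down = Array(range(from: 10, to: 0, by: 2))   // [10, 8, 6, 4, 2]
let up = Array(range(from: 0, to: 3))             // [0, 1, 2]
```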

Wait a minute… isn’t that the same as what stride() is today? Sure, except:

  1. range() is vastly more explicit in what is actually going on.
  2. Instead of tacking on to the Strideable protocol like a poor man’s sidecar, it composes with it, creating an API that is much more natural and expressive.
  3. Creates a much more natural call site.

I still don’t like the removal of the c-style for-loop, but thankfully, Swift v3 will be moving stride to be a free function again. It’s nice having a more “proper” API to work with out of the box.

Now to get it renamed to range

For Loops and Forced Abstractions

In case you haven’t heard, the traditional c-style for loop has been deprecated and is slated for removal in Swift 3.0. More info about that can be found here: New Features in Swift 2.2.

I’m not a fan, at all.

The fundamental reason I’m not a fan is quite simple: the only way to write a for loop now is by leveraging abstractions. Personally, I really dislike being required to use abstractions when they are not necessary.

The defense I hear all the time is this:

Well, the compiler will close that gap or remove the abstraction cost altogether.

That’s nice in theory, but it’s patently false in practice. The optimizer can remove some of the abstractions, but it cannot guarantee to remove all of the cost of the abstraction every time.

Here’s the real-world cost of abstractions (not necessarily specific to just this for-loop construct):

Language: C, Optimization: -Os                                          Avg (ms) 
---------------------------------------------------------------------------------
RenderGradient (Pointer Math)                                              9.582 
RenderGradient (SIMD)                                                      4.608 

Language: Swift, Optimization: -O                                       Avg (ms) 
---------------------------------------------------------------------------------
RenderGradient ([Pixel])                                                22.51406 
RenderGradient ([UInt32])                                               18.39304 
RenderGradient (UnsafeMutablePointer<Pixel>)                            20.67769 
RenderGradient (UnsafeMutablePointer<UInt32>)                           15.29333 
RenderGradient ([Pixel].withUnsafeMutablePointer)                       22.51703 
RenderGradient ([UInt32].withUnsafeMutablePointer)                      19.27868 
RenderGradient ([UInt32].withUnsafeMutablePointer (SIMD))               15.63351 
RenderGradient ([Pixel].withUnsafeMutablePointer (SIMD))                24.48129 

Source: https://github.com/owensd/swift-perf/blob/swift-v3/reports/swift_3_0-march.txt

At best, under an optimized build, we’re looking at a 4x cost in performance. With unchecked builds, it’s possible to get the performance down to equivalent timings. With non-optimized builds, we are talking anywhere from 3 to 88 (!!) times slower than the equivalent C code.

It’s not that I don’t think the for-in style loop is useful. I do. I also completely agree that it should be the one used the majority of the time. However, please don’t force me to use abstractions when I don’t want to or when they are not appropriate.

Here’s the before and after with the upcoming changes of some real code:

for var y = 0, height = buffer.height; y < height; ++y {
    let green = min(int4(Int32(y)) &+ yoffset, 255)

    for var x: Int32 = 0, width = buffer.width; x < width; x += 4 {
        let blue = min(int4(x, x + 1, x + 2, x + 3) &+ xoffset, 255)

        p[offset++] = Pixel(red: 0, green: green.x, blue: blue.x, alpha: 255)
        p[offset++] = Pixel(red: 0, green: green.y, blue: blue.y, alpha: 255)
        p[offset++] = Pixel(red: 0, green: green.z, blue: blue.z, alpha: 255)
        p[offset++] = Pixel(red: 0, green: green.w, blue: blue.w, alpha: 255)
    }
}

for y in 0..<buffer.height {
    let green = min(int4(Int32(y)) &+ yoffset, 255)

    for x in 0.stride(to: buffer.width, by: 4) {
        let x32 = Int32(x)
        let blue = min(int4(x32, x32 + 1, x32 + 2, x32 + 3) &+ xoffset, 255)

        p[offset] = Pixel(red: 0, green: green.x, blue: blue.x, alpha: 255)
        offset += 1

        p[offset] = Pixel(red: 0, green: green.y, blue: blue.y, alpha: 255)
        offset += 1

        p[offset] = Pixel(red: 0, green: green.z, blue: blue.z, alpha: 255)
        offset += 1

        p[offset] = Pixel(red: 0, green: green.w, blue: blue.w, alpha: 255)
        offset += 1
    }
}

I personally don’t consider that a readability win.

  1. It’s more code.
  2. It requires type coercion for x as the Int32 type isn’t stridable.
  3. stride(to:by:) is ambiguous compared to the < operator.

And finally, this is not an acceptable alternative in my opinion:

var y = 0
var height = buffer.height
while y < height {
    let green = min(int4(Int32(y)) &+ yoffset, 255)

    var x: Int32 = 0
    var width = buffer.width
    while x < Int32(width) {
        let x32 = Int32(x)
        let blue = min(int4(x32, x32 + 1, x32 + 2, x32 + 3) &+ xoffset, 255)

        p[offset] = Pixel(red: 0, green: green.x, blue: blue.x, alpha: 255)
        offset += 1

        p[offset] = Pixel(red: 0, green: green.y, blue: blue.y, alpha: 255)
        offset += 1

        p[offset] = Pixel(red: 0, green: green.z, blue: blue.z, alpha: 255)
        offset += 1

        p[offset] = Pixel(red: 0, green: green.w, blue: blue.w, alpha: 255)
        offset += 1

        x += 4
    }

    y += 1
}

Why you might ask?

  1. It’s extremely easy to forget the increment (I actually did a few moments ago).
  2. The iterator variables are being leaked out of scope.
  3. All of the loop parts (initialization, condition, and increment) are scattered throughout the construct.
  4. The suggested pattern of using defer for incrementing is fundamentally flawed:

var i = 0
while i < 10 {
    defer { i += 1 }
    if i == 5 { break }
}
print(i)

What do you think i is here? What should it be? I’ll give you a hint, they aren’t the same answer.
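To spell the mismatch out, here is the defer version next to a hand-incremented one, in current Swift. The second loop and the variable j are only there for comparison; it leaves its counter where the c-style loop would.

```swift
var i = 0
while i < 10 {
    defer { i += 1 }      // runs on every exit from the body, including break
    if i == 5 { break }
}
// i is 6 here: the deferred increment still ran after the break.

var j = 0
while j < 10 {
    if j == 5 { break }
    j += 1                // skipped by the break, as in a c-style for loop
}
// j is 5 here.
```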

Yes, the above examples are narrow and specific. But that’s exactly the point. When we need to write for narrow and specific cases, that’s exactly when we need to get outside of the abstraction box that makes for simpler code.

It’s strange watching Swift evolve. Maybe I’m just dense or stuck in my old ways, but I can’t see how this change is aligned with one of Swift’s aspirations of being a systems-level language.

Access Control Modifiers Proposal Thoughts

The big thread these days on swift-evolution is regarding access control modifiers. Swift supports a fairly limited set today, namely:

  • public: visible outside of the module
  • internal: visible within the module
  • private: visible within the file

There is a proposal, SE-0025: Scoped Access Level, that wants to add another layer to the mix: lexical scoping (I’ll use local in the example for it).

struct Outer {
    local let scopeVisible: Int
    private let fileVisible: Int
    func f() {
        /* scopeVisible is accessible here */
        /* fileVisible is accessible here */
    }
}

let o = Outer()
/* o.scopeVisible is not accessible here */
/* o.fileVisible is accessible here */

Today, in Swift, there is no way to provide this type of scoping mechanism within the same file.

My argument is that this proposal should be rejected; the Core Team thinks otherwise:

To summarize the place we’d like to end up:

  • “public” -> symbol visible outside the current module.
  • “internal” -> symbol visible within the current module.
  • unknown -> symbol visible within the current file.
  • “private” -> symbol visible within the current declaration (class, extension, etc).
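For reference, the “unknown” slot ended up being spelled fileprivate when this shipped in Swift 3. Here is the earlier Outer example rewritten with those keywords, as a sketch for a current compiler; the default values and the sum() method are added only so that the snippet compiles on its own.

```swift
struct Outer {
    private let scopeVisible = 1       // visible within the enclosing declaration
    fileprivate let fileVisible = 2    // visible anywhere in this file

    func sum() -> Int {
        // Both are accessible inside the declaration.
        return scopeVisible + fileVisible
    }
}

let o = Outer()
// o.scopeVisible does not compile here.
let visible = o.fileVisible            // fine: same file
```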

The problem, as I see it, is that this is simply a one-off fix for one of the limitations for access modifiers today. There are other, arguably reasonable, asks for access control modifiers as well:

  • visibility to only extensions declared in the same file
  • visibility to only extensions
  • visibility to subclasses
  • visibility to specific functions or types (e.g. C++’s friend)

We could get even more fine grained as well:

  • visibility to only a specific set of modules
  • visibility within a specific submodule

I don’t think that these are all necessarily bad; in fact, some can be quite helpful. However, instead of just accepting this proposal and adding this specific change, I’d rather see the entire access modifier system revisited, because this doesn’t really fix much, it just moves the problem.

The example used is this (where local means lexical scope):

class A {
   local var counter = 0

   // API that hides the internal state
   func incrementCount() { ++counter }

   // hidden API, not visible outside of this lexical scope
   local func advanceCount(dx: Int) { counter += dx }

   // incrementTwice() is not visible here
}

extension A {
   // counter is not visible here
   // advanceCount() is not visible here

   // may be useful only to implement some other methods of the extension
   // hidden from anywhere else, so incrementTwice() doesn’t show up in 
   // code completion outside of this extension
   local func incrementTwice() {
      incrementCount()
      incrementCount()
   }
}

Ok, so this addresses the problem that counter is not meant to be visible outside of type A. Maybe it was unintentionally being leaked. However, if counter is required to be used within one of the extensions, say a reset() function, then counter needs to be promoted to the file-based one. However, by doing so, we are again leaking counter to be more visible than it is supposed to be. So what was really the point?

In the end, if the current access modifiers are not sufficient because they are “too leaky”, then this proposal doesn’t fix the root problem. If the root problem is really significant enough, then I would think all of the modifiers should be revisited to provide the fine-grained access control system that is really being asked for.

Of course, I’m also just fine with the three we have and not trying to add all of the complexity required.

If you have thoughts on it, be sure to contribute here: http://thread.gmane.org/gmane.comp.lang.swift.evolution/12183/focus=12219


Tooling Around – Testing in Swift

For those that don’t know, a few of us are working on a set of tools for building for Swift. As part of that work, I’ve been thinking about how some unit tests could be done in a much simpler way. D does something interesting for unit tests; it allows you to define them inline and have them runnable at build time. Pretty cool, though D’s implementation is a bit limited.

class Sum
{
    int add(int x, int y) { return x + y; }

    unittest
    {
        Sum sum = new Sum;
        assert(sum.add(3,4) == 7);
        assert(sum.add(-2,0) == -2);
    }
}

If we had the ability to create custom attributes in Swift (ok… this feature really requires custom attributes and compiler plug-ins), I was thinking that I could build something like this:

class Sum {
    func add(x: Int, _ y: Int) -> Int { return x + y }
}

@test("Sum", "add(_:_:)", "checkin") {
    let sum = Sum()
    assert(sum.add(4, 5) == 9, "Math is hard!")
    assert(sum.add(-3, 3) == 0)
}

The intent is that this provides us with more functionality than what D offers, namely the ability to filter test cases by a number of factors including type, function names, test type (e.g. checkin), or any other text-based qualifier you want. Also, since it was an attribute, we could easily strip out these code paths if a flag, say -enable-testing, wasn’t used.

So, to run all of the checkin tests, you’d do something like this (assume we had some tool run-tests that is magical for now):

$ run-tests -match "checkin"

This would let us find all of the @test items with checkin as part of the metadata and run them.
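To make the filtering idea concrete, here is a minimal sketch in current Swift of what the tag matching might boil down to. Everything in it is an assumption for illustration: `TestRegistry`, `expect`, and `ExpectationFailed` are hypothetical names, and a closure registered at runtime stands in for the `@test` attribute that Swift doesn’t let us write.

```swift
// Hypothetical failure type; throwing lets the runner keep going after a failure.
struct ExpectationFailed: Error { let message: String }

func expect(_ condition: Bool, _ message: String = "expectation failed") throws {
    if !condition { throw ExpectationFailed(message: message) }
}

// Hypothetical registry: each test is a closure plus free-form text tags.
final class TestRegistry {
    private var cases: [(tags: [String], body: () throws -> Void)] = []

    func test(_ tags: String..., body: @escaping () throws -> Void) {
        cases.append((tags: tags, body: body))
    }

    // Runs only the tests whose tags contain `tag`, e.g. "checkin".
    func run(matching tag: String) -> (passed: Int, failed: Int) {
        var passed = 0, failed = 0
        for testCase in cases where testCase.tags.contains(tag) {
            do { try testCase.body(); passed += 1 }
            catch { failed += 1 }
        }
        return (passed, failed)
    }
}

class Sum {
    func add(_ x: Int, _ y: Int) -> Int { return x + y }
}

let registry = TestRegistry()
registry.test("Sum", "add(_:_:)", "checkin") {
    let sum = Sum()
    try expect(sum.add(4, 5) == 9, "Math is hard!")
    try expect(sum.add(-3, 3) == 0)
}
registry.test("Sum", "add(_:_:)", "nightly") {
    // Deliberately wrong, to show that non-matching tests are skipped.
    try expect(Sum().add(1, 1) == 3, "deliberately failing")
}

let result = registry.run(matching: "checkin")
```

Only the first closure runs here, because the second one isn’t tagged `checkin`. A real attribute would also let the compiler strip the tests out entirely, which this runtime sketch can’t do.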

Ok… that’s great, but Swift doesn’t allow us to create these attributes… so all hope is lost, right?

Nope, we can hack around to get what we want. =)

Instead, let’s do this:

class Sum {
    func add(x: Int, _ y: Int) -> Int { return x + y }
}

func __test_sum_add_checkin() throws {
    let sum = Sum()
    assert(sum.add(4, 5) == 10, "Math is hard!")
    assert(sum.add(-3, 3) == 0)
}

The idea is fairly simple:

  1. Build a static library of your module that you wish to test; make sure the -enable-testing flag is set.
  2. For each Swift file with methods following our convention (top-level functions that start with __test_), create an executable that calls that function.
  3. Run the executable.

Boom! Integrated unit tests.

Digging In

I’m using our build tool, but you can probably do something similar with Swift’s Package Manager.

Here’s the contents of my build file:

(package
  :name "IntegratedUnitTests"

  :tasks {
    :build {
      :tool "atllbuild"
      :sources ["Sum.swift"]
      :name "math"
      :output-type "static-library"
      :publish-product true
      :compile-options ["-enable-testing"]
    }

    :test {
      :dependencies ["generate-test-file"]
      :tool "atllbuild"
      :sources ["sum_test.swift"]
      :name "sum_test"
      :output-type "executable"
      :publish-product true
      :link-with ["math.a"]
    }

    :generate-test-file {
      :dependencies ["build"]
      :tool "shell"
      :script "echo '@testable import math' > sum_test.swift && xcrun -sdk macosx swiftc -print-ast Sum.swift | grep __test | sed 's/internal func/try/g' | sed 's/throws//g' >> sum_test.swift"
    }

    :run {
      :dependencies ["test"]
      :tool "shell"
      :script "./bin/sum_test"
    }
  }
)

The build task is responsible for creating the math.a static library. The test task is responsible for creating the test executable. The generate-test-file task actually creates the source code for the test executable. It does the following:

  1. Creates a new file named sum_test.swift
  2. Appends @testable import math to it
  3. Examines the AST for Sum.swift and adds the calls for our test methods.
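For anyone who finds the grep/sed pipeline hard to read, the same steps could be sketched in current Swift roughly as below. This is a simplification, not what the build file does: `generateTestDriver` is a hypothetical helper, and it scans the source text line by line instead of asking swiftc for the AST, so it only catches top-level functions that start a line with `func __test_`.

```swift
import Foundation

// Hypothetical helper: builds the text of the test driver file.
func generateTestDriver(moduleName: String, source: String) -> String {
    // Steps 1 and 2: start the new file with the @testable import.
    var lines = ["@testable import \(moduleName)"]

    // Step 3 (approximated): find each `func __test_...(` and emit a call to it.
    for line in source.components(separatedBy: "\n") {
        let trimmed = line.trimmingCharacters(in: .whitespaces)
        guard trimmed.hasPrefix("func __test_"),
              let paren = trimmed.firstIndex(of: "(") else { continue }
        // Skip the 5 characters of "func " to get at the function name.
        let nameStart = trimmed.index(trimmed.startIndex, offsetBy: 5)
        let name = trimmed[nameStart..<paren]
        lines.append("try \(name)()")
    }
    return lines.joined(separator: "\n") + "\n"
}

let sampleSource = """
class Sum {
    func add(_ x: Int, _ y: Int) -> Int { return x + y }
}

func __test_sum_add_checkin() throws {
}
"""

let driver = generateTestDriver(moduleName: "math", source: sampleSource)
```

Writing `driver` out to sum_test.swift would then be the only step left before the test task builds it.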

The final file looks like this:

@testable import math
try __test_sum_add_checkin() 

And when you run it:

assertion failed: Math is hard!: file Sum.swift, line 7

Yay! Inlined test code.

This is just a preview. I plan on fleshing this out some more, but I thought it was interesting enough to post about. =)


Sad State of Enums

Enums… those lovely little beasts of many uses. I really do like associated enums. Well, at least, I really like the idea of associated enums.

The problem: they really suck to work with.

Take for example you simply want to validate that an enum you got back is a specific enum case.

enum Simple {
    case SoEasy
    case Peasy
}

func simple() -> Simple { return .SoEasy }

func testSimple() {
    assertme(simple() == .SoEasy)
}

This is a cake walk with normal enums. But…

enum DoYou {
    case ReallyWantToHurtMe(Bool)
    case ReallyWantToMakeMeCry(Bool)
}

func doyou() -> DoYou { return .ReallyWantToHurtMe(true) }

func testChump() {
    assertme(doyou() == .ReallyWantToHurtMe)
}

GAH! Ok…

func testChump() {
    assertme(case .ReallyWantToHurtMe = doyou())
}

Oh… the case syntax isn’t a real expression…

func testChump() {
    if case .ReallyWantToHurtMe = doyou() {} else { assertme(false) }
}

Yeah… that’s really less than ideal.

This is where I just get really freaking tired of working with associated enums and I do one of two things:

  1. Convert the associated enum into a struct that holds the values and an enum that is just the simple types.
  2. Add getters for every case that returns an optional.

The first option has the severe limitation of only really working when the cases hold the same data types or nearly the same. It’s also a bit more annoying.
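For reference, a minimal sketch of that first option in current Swift might look like this. `DoYouKind` is a name I’m making up for illustration, and it only works this cleanly because both cases carry a single Bool.

```swift
// The "simple" enum: just the cases, no associated values,
// so the compiler gives us == for free.
enum DoYouKind {
    case ReallyWantToHurtMe
    case ReallyWantToMakeMeCry
}

// The struct carries the payload that used to be the associated value.
struct DoYou {
    let kind: DoYouKind
    let value: Bool
}

func doyou() -> DoYou {
    return DoYou(kind: .ReallyWantToHurtMe, value: true)
}

let answer = doyou()
let isHurtMe = (answer.kind == .ReallyWantToHurtMe)
```

The comparison is now a plain expression, but nothing stops you from building a `DoYou` whose `value` makes no sense for its `kind`, which is part of why this only fits when the payloads line up.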

The second option is just super annoying to have to do. It signals a significant design issue with them. It’s also just a waste of time as well.

So this is what I do:

enum DoYou {
    case ReallyWantToHurtMe(Bool)
    case ReallyWantToMakeMeCry(Bool)

    // Each getter returns the associated value when self is that case, nil otherwise.
    var reallyWantToHurtMe: Bool? {
        if case let .ReallyWantToHurtMe(value) = self { return value }
        return nil
    }

    var reallyWantToMakeMeCry: Bool? {
        if case let .ReallyWantToMakeMeCry(value) = self { return value }
        return nil
    }
}

func testChump() {
    assertme(doyou().reallyWantToHurtMe != nil)
}

/cry


Named Parameters

There’s a pretty interesting proposal discussion on swift-evolution right now: Naming Functions with Argument Labels.

I bring it up because it hits a bit close to my heart with regard to the naming of functions and named arguments. It’s my opinion that Swift should have diverged from ObjC here and treated named arguments properly. What I mean by that is to move the argument name within the function parameters completely.

So a function name like this:

func insertSubview(view, aboveSubview) {}

Would have become:

func insert(subview, aboveSubview)

The difference is the lack of the implicit _ on the first argument name. I bring this up (again) because the given proposal shows well why I think the current convention stinks.

let fn = someView.insertSubview(_:aboveSubview:)

Ick! Why is that _ necessary? Oh… right, because we’ve shoved the actual parameter name for that first item into the name of the function. ObjC did this for a compelling reason. Swift seems to simply follow that convention for what I can only presume to be convenience in the Swift to ObjC interop.

Too bad, this seems so much nicer to me:

let fn = someView.insert(subview:aboveSubview:)
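For a function you declare yourself, you can already label the first argument explicitly and get this shape. Here is a small sketch in current Swift; `ViewStack` is a hypothetical stand-in type that tracks views by name, not anything from UIKit.

```swift
// Hypothetical stand-in for a view hierarchy: just an ordered list of names.
final class ViewStack {
    private(set) var views: [String] = []

    // The first argument carries its own label instead of hiding in the base name.
    func insert(subview: String, aboveSubview sibling: String) {
        if let index = views.firstIndex(of: sibling) {
            views.insert(subview, at: index + 1)
        } else {
            // Sibling not found: fall back to putting the new view on top.
            views.append(subview)
        }
    }
}

let stack = ViewStack()
stack.insert(subview: "background", aboveSubview: "none")

// Referencing the function by its full name needs no `_` placeholder.
let fn = stack.insert(subview:aboveSubview:)
fn("label", "background")
```

The reference on the last lines names every argument, so there is nothing to paper over with an underscore.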

Maybe someday…


RE: Why Swift guard Should be Avoided

I saw this blog article, Why Swift guard Should Be Avoided, and it got me thinking about things I believe are fallacies but that continue to be talked about as the “right way to program”. I will preface this by saying that I don’t think there is a “right way” to program, but rather, there are trade-offs for particular paths that we choose to go down. Some of these paths provide better fruit than others. However, just as fruits have seasonality to them, some paths might not always produce the best fruit in all circumstances. Context matters. All that said, I still believe that there are paths that never produce good fruit, and I think two of those paths are demonstrated in the linked blog article.

So, what are these bad paths (or, as I call them, programming fallacies)?

  1. Functions should be between 6 and 10 lines
  2. Single Responsibility means doing only one thing

The first is an arbitrary way to determine quality and complexity of code that has no bearing on the actual domain of the problem. It also asserts that shorter code is better than longer code without any regard for the complexity of the shorter code. I believe that code clarity is far more important than code length.

The quote presented to defend this was from Robert C. Martin:

The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that.

Here’s what I believe is a better philosophical position:

Functions should be as small as possible to do their job, but no smaller than that.

The other fallacy is that “single responsibility” means a single action. This leads you down the path of premature refactoring, which I’ll talk about a bit with examples from the post. A function should do a single task, but tasks are generally multi-step transactions. This should be completely obvious because we still need to have the vend() function; without it, we have to duplicate the logic everywhere in which a vend() must take place.

Premature Refactoring

Ok, let’s get into the post a bit. Here’s the Swift code from Apple’s example:

struct Item {
    var price: Int
    var count: Int
}

enum VendingMachineError: ErrorType {
    case InvalidSelection
    case InsufficientFunds(coinsNeeded: Int)
    case OutOfStock
}

class VendingMachine {
    var inventory = [
        "Candy Bar": Item(price: 12, count: 7),
        "Chips": Item(price: 10, count: 4),
        "Pretzels": Item(price: 7, count: 11)
    ]

    var coinsDeposited = 0

    func dispense(snack: String) {
        print("Dispensing \(snack)")
    }

    func vend(itemNamed name: String) throws {
        guard var item = inventory[name] else {
            throw VendingMachineError.InvalidSelection
        }

        guard item.count > 0 else {
            throw VendingMachineError.OutOfStock
        }

        guard item.price <= coinsDeposited else {
            throw VendingMachineError.InsufficientFunds(coinsNeeded: item.price - coinsDeposited)
        }

        coinsDeposited -= item.price
        --item.count
        inventory[name] = item
        dispense(name)
    }
}

And here’s the refactored version from the post:

func vend(itemNamed name: String) throws {
    let item = try validatedItemNamed(name)
    reduceDepositedCoinsBy(item.price)
    removeFromInventory(item, name: name)
    dispense(name)
}

private func validatedItemNamed(name: String) throws -> Item {
    let item = try itemNamed(name)
    try validate(item)
    return item
}

private func reduceDepositedCoinsBy(price: Int) {
    coinsDeposited -= price
}

private func removeFromInventory(var item: Item, name: String) {
    --item.count
    inventory[name] = item
}

private func itemNamed(name: String) throws -> Item {
    if let item = inventory[name] {
        return item
    } else {
        throw VendingMachineError.InvalidSelection
    }
}

private func validate(item: Item) throws {
    try validateCount(item.count)
    try validatePrice(item.price)
}

private func validateCount(count: Int) throws {
    if count == 0 {
        throw VendingMachineError.OutOfStock
    }
}

private func validatePrice(price: Int) throws {
    if coinsDeposited < price {
        throw VendingMachineError.InsufficientFunds(coinsNeeded: price - coinsDeposited)
    }
}

Let’s break it down:

vend(itemNamed name: String) throws

The author suggests that the refactored version is better. But here’s the first thing to note: the responsibility of the function has not changed; it is still responsible for doing the same thing it did before. So in the first regard, nothing should have changed from an API usage standpoint. This is vital because when we refactor functionality, the actual goal is to break apart coupled functionality that didn’t belong together.

private func validatedItemNamed(name: String) throws -> Item

I actually don’t know what this is supposed to do. In order to follow what is going on, I need to actually read through all of the code that it calls. Doing that, I can see that it does the following:

  1. Ensures the item is in the dictionary
  2. Ensures that the count of items is not zero
  3. Ensures that the number of coins deposited is greater than or equal to the price of the item

However, this took four functions and three levels of function calls to achieve. The oversight in this approach is that it is inherently fragile because it makes use of four functions to achieve its goal. A change to any single function can have a ripple effect of unintended side effects.

Example: The vending machine needs a new function, addItem(). Its purpose is to allow additional items to be added to the vending machine. However, there are some constraints we want to add for new items:

  1. The name must not be empty
  2. The name must be word-capitalized (e.g. Big Candy Bar)
  3. The price must be less than 100

I can pretty much guarantee you that the validate() function is going to be updated here to add these requirements. So not only are we going to be validating things that we simply don’t care about on every call to vend(), but if there is data already in the vending machine that doesn’t meet these requirements, vend() is going to fail.

This may seem like a contrived example, but it absolutely is not. I’ve had to fix issues like this in real code because of this exact type of refactoring.
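Here’s a stripped-down sketch of that failure mode in current Swift. It’s an assumption-heavy illustration: the `invalidPrice` case and the price rule inside `validate` are hypothetical additions, and `vend` ignores coins and inventory so the ripple effect is easy to see.

```swift
struct Item {
    var price: Int
    var count: Int
}

enum VendingMachineError: Error {
    case outOfStock
    case invalidPrice   // hypothetical case introduced for addItem()
}

// The shared validator after the addItem() rules were bolted onto it.
func validate(_ item: Item) throws {
    guard item.count > 0 else { throw VendingMachineError.outOfStock }
    // New rule meant only for newly added items: price must be less than 100.
    guard item.price < 100 else { throw VendingMachineError.invalidPrice }
}

// Simplified vend(): it still calls the shared validator, just like before.
func vend(_ item: Item) throws -> String {
    try validate(item)
    return "Dispensing"
}

// Turns the outcome into text so it is easy to inspect.
func vendOutcome(for item: Item) -> String {
    do { return try vend(item) }
    catch VendingMachineError.invalidPrice { return "rejected: invalid price" }
    catch { return "rejected: \(error)" }
}

// Stocked before the new rule existed: in stock, but priced at 120.
let legacy = Item(price: 120, count: 3)
let legacyOutcome = vendOutcome(for: legacy)
```

Nobody touched `vend`, yet the item that sold fine yesterday is now rejected.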

reduceDepositedCoinsBy(price: Int)

This function will lead to data corruption. It makes the assumption that validate() has been called. This function, if we’re going to actually have it, should actually do the verification that this is a legal operation. Otherwise, there is absolutely no purpose for it.
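If the function is going to exist, a version that checks its own precondition might look something like this sketch in current Swift. `CoinBox` and `CoinError` are hypothetical names I’m using to keep the example self-contained; they aren’t from the original post.

```swift
enum CoinError: Error {
    case insufficientFunds(coinsNeeded: Int)
}

// Hypothetical holder for the deposited coins.
struct CoinBox {
    private(set) var coinsDeposited: Int

    init(coinsDeposited: Int) {
        self.coinsDeposited = coinsDeposited
    }

    // Refuses to go negative instead of trusting that a caller validated first.
    mutating func reduceDepositedCoins(by price: Int) throws {
        guard price <= coinsDeposited else {
            throw CoinError.insufficientFunds(coinsNeeded: price - coinsDeposited)
        }
        coinsDeposited -= price
    }
}

var box = CoinBox(coinsDeposited: 5)

// Too expensive: the call throws and the balance is left untouched.
let overdraw: Void? = try? box.reduceDepositedCoins(by: 10)
let balanceAfterOverdraw = box.coinsDeposited

// Affordable: the balance drops by the price.
let purchase: Void? = try? box.reduceDepositedCoins(by: 3)
let balanceAfterPurchase = box.coinsDeposited
```

With the guard in place, calling it out of order produces an error instead of a negative balance.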

removeFromInventory(var item: Item, name: String)

Same comments apply here as well: data corruption!

itemNamed(name: String) throws -> Item

This one is an interesting one. If Swift had throwable subscripts, then this wouldn’t be necessary. However, I think in principle, this one is good, but the implementation is prone to bugs. This is the poster child for the guard statement.

private func itemNamed(name: String) throws -> Item {
    guard let item = inventory[name] else {
        throw VendingMachineError.InvalidSelection
    }
    return item
}

This is objectively better as it ensures that the only code-path that can exist after the guard is one where the dictionary actually has the item. It also ensures that if the dictionary doesn’t have the item, we error out early.

Summary

Be wary of refactoring to meet arbitrary goals. When that is the purpose of the refactoring, it is very easy to get into a spot where you’ve actually created more complexity and introduced more error paths in your code.

My guiding principle: a function should handle its responsibility and only its responsibility. The number of steps for that responsibility is mostly immaterial.
