
Task-based concurrency in Swift

Published on 03 Feb 2019

Just like sequential code, concurrent code can come in many different shapes and forms. Depending on what we’re trying to achieve — whether that’s asynchronously fetching a piece of data, loading heavy files from disk, or performing a group of related operations — the abstraction that’ll prove to be the best fit might vary quite a lot from use case to use case.

One such concurrent programming abstraction is Tasks. While tasks may at first seem very similar to both Futures & Promises and Foundation’s Operation API, there are some distinct differences in how they behave and what level of control they give to the API user.

This week, let’s take a look at some of those differences, and some scenarios in which tasks can become really useful.

Please note that this article was written long before Swift’s concurrency system was introduced in Swift 5.5, which ships with a built-in implementation of task-based concurrency. Check out this Discover page for more information about that new concurrency system.

The task at hand

Let’s say that we’re building a social networking app, and that we offer our users the option to attach both photos and videos when publishing a post. Currently, whenever the user hits “Publish”, we call the following function, which uploads all attached media as well as the data for the post itself:

func publish(_ post: Post) {
    for photo in post.photos {
        upload(photo)
    }

    for video in post.videos {
        upload(video)
    }

    upload(post)
}
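
The exact shape of Post, Photo, Video and the upload functions isn’t shown here, but as a rough sketch (purely an assumption, to make the snippets concrete) they could look something like this:

import Foundation

// Hypothetical models. Only Photo’s data and url properties are
// hinted at later in the article; everything else is assumed.
struct Photo { let data: Data; let url: URL }
struct Video { let data: Data; let url: URL }

struct Post {
    let photos: [Photo]
    let videos: [Video]
}

// Fire-and-forget uploads, with no way to observe completion or failure:
func upload(_ photo: Photo) { /* networking code */ }
func upload(_ video: Video) { /* networking code */ }
func upload(_ post: Post) { /* networking code */ }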

While the above code is very simple, there are a few problems with it. First, there’s no way for us to get notified when the publishing of a post has finished. We also don’t do any kind of error handling, which means that if a photo or video fails to upload — we’ll just continue uploading the post data anyway, which isn’t ideal.

There are quite a number of ways that we could go about solving the above problem. One idea might be to use the built-in Operation and OperationQueue types to sequentially execute all operations — but that’d require us to either make our actual networking code synchronous, or to subclass Operation to create a very custom solution. Both of those alternatives are valid, but can feel a bit “heavy-handed”, since they’d require quite major changes to our original code.

Luckily, it turns out that all we have to do to gain the control we need is to step one level deeper into the stack, and use the framework that Operation and OperationQueue are based on — Grand Central Dispatch.

Like we took a look at in “A deep dive into Grand Central Dispatch in Swift”, GCD allows us to quite easily group together a bunch of operations, and to get notified when all of them have completed. To make that happen, we’ll make a slight tweak to our upload functions from before — so that they now offer us a way to observe their completion using a closure, and make balanced calls to enter and leave on a DispatchGroup in order to get notified when all media uploads were finished:

func publish(_ post: Post) {
    let group = DispatchGroup()

    for photo in post.photos {
        group.enter()

        upload(photo) {
            group.leave()
        }
    }

    for video in post.videos {
        group.enter()

        upload(video) {
            group.leave()
        }
    }

    // Calling ‘notify’ lets us use a closure to observe when
    // the whole group has finished.
    group.notify(queue: .main) {
        upload(post)
    }
}
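
For this to work, the upload functions would need to accept a completion closure. The article doesn’t show that tweak, but the signatures could plausibly look like this:

// Hypothetical completion-based signatures assumed by the code above:
func upload(_ photo: Photo, completionHandler: @escaping () -> Void) { /* ... */ }
func upload(_ video: Video, completionHandler: @escaping () -> Void) { /* ... */ }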

The above works really well, and leaves us with pretty clean-looking code. But we still haven’t addressed the lack of error handling — as we’ll still blindly upload the post, whether or not the media uploads were successfully completed.

To fix that, let’s add an optional Error variable that we’ll use to keep track of any error that occurred. We’ll make another tweak to our upload functions, to have them pass an optional Error argument to their completion handlers, and use that to capture the first error that was encountered — like this:

func publish(_ post: Post) {
    let group = DispatchGroup()
    var anyError: Error?

    for photo in post.photos {
        group.enter()

        upload(photo) { error in
            anyError = anyError ?? error
            group.leave()
        }
    }

    for video in post.videos {
        group.enter()

        upload(video) { error in
            anyError = anyError ?? error
            group.leave()
        }
    }

    group.notify(queue: .main) {
        // If an error was encountered while uploading the
        // post’s media, we’ll call an error handling function
        // instead of proceeding with the post data upload.
        if let error = anyError {
            return handle(error)
        }

        upload(post)
    }
}
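
Again, the corresponding change to the upload functions isn’t shown. Presumably they now pass an optional Error to their completion handlers, along these lines (an assumption):

// Hypothetical error-reporting signatures assumed by the code above:
func upload(_ photo: Photo, completionHandler: @escaping (Error?) -> Void) { /* ... */ }
func upload(_ video: Video, completionHandler: @escaping (Error?) -> Void) { /* ... */ }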

We’ve now fixed all correctness issues with our original piece of code, but in the process we’ve also made it much more complicated and harder to read. Our new solution is also quite boilerplate-heavy, with a local error variable and dispatch group calls that need to be kept track of (strictly speaking, the shared anyError variable should also be synchronized, since the upload completion handlers might run on different queues, which we’ll address shortly). That might not be a problem here, but as soon as we start moving more of our asynchronous code to use this pattern — things can get much harder to maintain quite quickly.

It’s abstraction time!

Let’s see if we can make the above kind of operations easier to work with by introducing a thin, task-based layer of abstraction on top of Grand Central Dispatch.

We’ll start by creating a type called Task, which will essentially just be a wrapper around a closure that gets access to a Controller for controlling the flow of the task:

struct Task {
    typealias Closure = (Controller) -> Void

    private let closure: Closure

    init(closure: @escaping Closure) {
        self.closure = closure
    }
}

The Controller type, in turn, provides methods for either finishing or failing the task that it’s associated with, and does so by calling a handler to report the Outcome of the task:

extension Task {
    struct Controller {
        fileprivate let queue: DispatchQueue
        fileprivate let handler: (Outcome) -> Void

        func finish() {
            handler(.success)
        }

        func fail(with error: Error) {
            handler(.failure(error))
        }
    }
}

Just like the code above shows, Outcome has two cases — .success and .failure — making it very similar to a Result type specialized with Void. In fact, we could choose either to implement Outcome as its own enum type, or simply use one of the techniques from “The power of type aliases in Swift” and make it a generic shorthand for Result<Void> (the rest of the code in this article assumes the dedicated enum version):

extension Task {
    enum Outcome {
        case success
        case failure(Error)
    }
}

extension Task {
    // Note: this shorthand assumes a Result type with a single generic
    // parameter, like the one described in the referenced article. With
    // the standard library’s Result (introduced in Swift 5), this would
    // instead be written as Result<Void, Error>.
    typealias Outcome = Result<Void>
}

Finally, we need a way to actually perform the tasks that we’ll define. For that, let’s add a perform method on Task that either takes an explicit DispatchQueue to perform the task on, or defaults to one of the global concurrent queues — as well as a handler to be called once the task has finished executing. Internally, we’ll then use the given DispatchQueue to execute the task asynchronously, by creating a controller and passing it into the task’s closure — like this:

extension Task {
    func perform(on queue: DispatchQueue = .global(),
                 then handler: @escaping (Outcome) -> Void) {
        queue.async {
            let controller = Controller(
                queue: queue,
                handler: handler
            )

            self.closure(controller)
        }
    }
}
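
As a quick sanity check of the API (a made-up example, not part of the original article), here’s a trivial task that simply finishes after a one-second delay, and what performing it looks like:

let delayTask = Task { controller in
    DispatchQueue.global().asyncAfter(deadline: .now() + 1) {
        controller.finish()
    }
}

delayTask.perform { outcome in
    // Called after roughly one second, with .success as the outcome
    print(outcome)
}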

With the above in place, the initial version of our Task API is finished, and we’re ready to take it for a spin. Let’s start by defining a task that’ll replace our photo uploading function from before — which calls into an underlying Uploader class, and then uses the task’s Controller to notify us of the outcome:

extension Task {
    static func uploading(_ photo: Photo,
                          using uploader: Uploader) -> Task {
        return Task { controller in
            uploader.upload(photo.data, to: photo.url) { error in
                if let error = error {
                    controller.fail(with: error)
                } else {
                    controller.finish()
                }
            }
        }
    }
}

With our first task now available — let’s see what using it looks like at the call site:

for photo in photos {
    let task = Task.uploading(photo, using: uploader)

    task.perform { outcome in
        // Handle outcome
    }
}

Pretty cool! 😎 However, where the power of tasks really starts to shine is when we start grouping and sequencing them, which’ll let us solve our original problem in a very elegant way. So let’s keep going!
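
One thing to note before moving on: the snippets below also call Task.uploading(_:using:) with videos and with the post itself. Those overloads aren’t shown in the article, but they’d presumably mirror the photo-based one almost exactly. For example, a hypothetical video version could look like this (and a similar factory method would be needed for Post, depending on how its data gets uploaded):

extension Task {
    // Hypothetical overload, assumed rather than taken from the article:
    static func uploading(_ video: Video,
                          using uploader: Uploader) -> Task {
        return Task { controller in
            uploader.upload(video.data, to: video.url) { error in
                if let error = error {
                    controller.fail(with: error)
                } else {
                    controller.finish()
                }
            }
        }
    }
}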

Grouping

Earlier, we used a DispatchGroup that was entered and left in order to keep track of when a group of operations had finished, so let’s port that logic into our new task system. To do that, we’ll add a static method on Task that’ll take an array of tasks and group them together. Under the hood, we’ll still use the exact same dispatch group logic as before — but now wrapped in a much nicer API:

extension Task {
    static func group(_ tasks: [Task]) -> Task {
        return Task { controller in
            let group = DispatchGroup()

            // To avoid race conditions with errors, we set up a private
            // queue to sync all assignments to our error variable
            let errorSyncQueue = DispatchQueue(label: "Task.ErrorSync")
            var anyError: Error?

            for task in tasks {
                group.enter()

                // It’s important to make the sub-tasks execute
                // on the same DispatchQueue as the group, since
                // we might cause unexpected threading issues otherwise.
                task.perform(on: controller.queue) { outcome in
                    switch outcome {
                    case .success:
                        break
                    case .failure(let error):
                        errorSyncQueue.sync {
                            anyError = anyError ?? error
                        }
                    }

                    group.leave()
                }
            }

            group.notify(queue: controller.queue) {
                if let error = anyError {
                    controller.fail(with: error)
                } else {
                    controller.finish()
                }
            }
        }
    }
}

With the above in place, we can heavily simplify our media uploading code from before. All we now have to do is to map each piece of media into a Task, and then pass a combined array of those tasks into our new group API, to group them all into a single task — like this:

let photoTasks = post.photos.map { photo in
    return Task.uploading(photo, using: uploader)
}

let videoTasks = post.videos.map { video in
    return Task.uploading(video, using: uploader)
}

let mediaGroup = Task.group(photoTasks + videoTasks)
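
If all we wanted to do was to upload the post’s media, we could now perform that group just like any other task:

mediaGroup.perform { outcome in
    // Called once every photo and video upload has completed.
    // The outcome will be .failure if any of the uploads failed.
}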

Groups are great for when we don’t rely on the order of completion of our grouped tasks — but that’s not always the case. Going back to our original problem of not uploading the data for a post until we’re sure that all media was successfully uploaded — that’s one such case. What we’d ideally like is to be able to chain, or sequence, our media uploading operations with the one that finishes the post upload.

Let’s take a look at how we can extend Task to support that.

Sequencing

Rather than using DispatchGroup (which doesn’t have any opinion about the order of our operations), let’s implement sequencing by keeping track of the current task’s index and then continuously executing the next task once the previous one has finished. Once we’ve reached the end of our list of tasks, we’ll consider the sequence completed:

extension Task {
    static func sequence(_ tasks: [Task]) -> Task {
        var index = 0

        func performNext(using controller: Controller) {
            guard index < tasks.count else {
                // We’ve reached the end of our array of tasks,
                // time to finish the sequence.
                controller.finish()
                return
            }

            let task = tasks[index]
            index += 1

            task.perform(on: controller.queue) { outcome in
                switch outcome {
                case .success:
                    performNext(using: controller)
                case .failure(let error):
                    // As soon as an error occurs, we’ll
                    // fail the entire sequence.
                    controller.fail(with: error)
                }
            }
        }

        return Task(closure: performNext)
    }
}

The reason we don’t simply use a serial DispatchQueue to implement sequencing is that we can’t assume that our sequence will always be dispatched on a serial queue — the API user can choose to perform it on any kind of queue.

Above we make use of the fact that Swift supports both first-class functions and nested function definitions — since we’re passing our performNext function as a closure to create the Task for our sequence.

And that’s it — believe it or not, we’ve actually just built a complete Task-based concurrency system — from scratch! 😀

Putting all the pieces together

With all the pieces in place, let’s finally update our original post publishing code to make full use of everything our new system has to offer. Instead of having to keep track of errors, or risking bugs due to the unpredictable nature of network calls, we can now simply form a sequence by combining a group of our media upload operations with a post uploading task — like this:

func publish(_ post: Post,
             then handler: @escaping (Task.Outcome) -> Void) {
    let photoTasks = post.photos.map { photo in
        return Task.uploading(photo, using: uploader)
    }

    let videoTasks = post.videos.map { video in
        return Task.uploading(video, using: uploader)
    }

    let sequence = Task.sequence([
        .group(photoTasks + videoTasks),
        .uploading(post, using: uploader)
    ])

    sequence.perform(then: handler)
}
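
At the call site, publishing a post could then look something like this (the reactions in each case are just placeholders):

publish(post) { outcome in
    switch outcome {
    case .success:
        // For example: dismiss the compose screen.
        break
    case .failure(let error):
        // For example: present the error to the user.
        handle(error)
    }
}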

The beauty of the above solution is that everything is executing fully concurrently — using dispatch queues — under the hood, but as an API user, all we have to do is create a few tasks and tell the system how we want to combine them. And since our abstraction is so thin, if we ever encounter a problem or unexpected behavior, we just need to step one level down in order to debug things.

Conclusion

Tasks can be a great way to abstract heavily concurrent code that should either execute with as much parallelism as possible, or await the completion of previous tasks before moving ahead. They essentially provide a simple, thin layer on top of Grand Central Dispatch — which lets us leverage all of its power in a really nice way.

As mentioned earlier in this article, there are of course many other ways that concurrency can be implemented or used in Swift. Futures & promises make it easy to hide much of the underlying concurrency, at the cost of slightly less control — while frameworks like RxSwift make it possible to build much more complex chains of execution, albeit using a much heavier abstraction.

My advice is to try many different kinds of concurrent programming in Swift — to see which one (or which combination) fits you, your team, and your project the best. Hopefully this article has given you a bit of insight into one way of implementing tasks in Swift, and how tasks might compare to other solutions commonly used in the community. And if that’s the case — then I consider my mission accomplished 😀.

Feel free to find me on Twitter, where you can ask me questions, or let me know what you thought about this article, and tasks in general.

Thanks for reading! 🚀