Weekly Swift articles, podcasts and tips by John Sundell.

Time Complexity

Published on 09 Aug 2019

Measuring the time complexity of a piece of code is a common technique used to optimize algorithms and other kinds of functions, by estimating their cost in terms of execution time.

However, rather than being an exact measurement of how many seconds a block of code takes to run, time complexity aims to answer the question of how the number of inputs affects the amount of work that a function has to perform — and is most often described using “Big O notation”, which lets us express the relationship between an algorithm and its number of inputs as a mathematical function.

As an example, let’s say that we’re working on a calendar app, and that our core Calendar model contains an array of all of the events that appear within a given user’s calendar. We’ve also extended that model with a convenience API for checking whether its first event is currently scheduled after a given date:

struct Calendar {
    var events: [Event]
}

extension Calendar {
    func isFirstEvent(scheduledAfter date: Date) -> Bool {
        guard let firstEvent = events.first else {
            return false
        }

        return firstEvent.date > date
    }
}

The above function has constant time complexity — O(1) — since its execution time isn’t affected by the number of events within the calendar its run on. Since it just looks at the first event, it doesn’t matter if the calendar contains a total of 10, 1000 or 10,000 events — the amount of work that our above function has to perform remains constant.

Now let’s say that rather than just looking at the first event, we instead wish to return all events that are scheduled after a given date, which can be done by filtering our events array — like this:

extension Calendar {
    func events(scheduledAfter date: Date) -> [Event] {
        return events.filter { event in
            event.date > date
        }
    }
}

Since the above filter operation requires us to iterate through each element within the events array, it has a time complexity of O(N) — where N refers to the number of events that our function will be operating on. Since the function has a direct one-to-one relationship with its number of inputs (2 events require 2 iterations, 10 events 10 iterations, and so on), its time complexity is linear.

A function is considered to have linear, or O(N), time complexity as long as the worst case scenario involves iterating through all of its inputs — even if that’s not always required. For example, the following function stops iterating as soon as it finds a match — but is still considered to have O(N) complexity, since the matched element might be the last one:

extension Calendar {
    func firstEvent(scheduledAfter date: Date) -> Event? {
        return events.first(where: { event in
            event.date > date
        })
    }
}

The time complexity of a function can also depend on multiple variables. For example, here we’re again iterating through all events (O(N)), but as part of each iteration we’re also iterating over an array of users (also O(N)):

extension Calendar {
    func events(createdByAnyOf users: [User]) -> [Event] {
        return events.filter { event in
            users.contains(event.creator)
        }
    }
}

The total time complexity of the above function will therefore be number of events * number of users, which can be described using big o notation as O(N * M) — where N is the number of events, and M is the number of users.

While having to iterate through all elements within a collection is sometimes unavoidable (like in this case), whenever we’re dealing with nested iterations, there’s often some form of optimization to be found.

In this case, we can actually give the above function pure O(N) complexity, by passing our users as a Set, rather than as an Array. Since sets can look up whether they contain a given value in constant (O(1)) time — we’d end up with O(N * 1), or simply O(N), in total complexity with this implementation:

extension Calendar {
    // The only change compared to the previous code sample is that
    // users is now passed as Set<User>, rather than [User].
    func events(createdByAnyOf users: Set<User>) -> [Event] {
        return events.filter { event in
            users.contains(event.creator)
        }
    }
}

When it comes to nested iterations, one thing to look out for in particular is when the same set of inputs is iterated over twice — since that’ll give us quadratic time complexity, or O(N²). Here’s an example of such a function, which finds all events that are scheduled on the exact same date as another event — by filtering through the events array (O(N)), and then calling contains on that same array within each iteration (also O(N)):

extension Calendar {
    func conflictingEvents() -> [Event] {
        return events.filter { event in
            events.contains(where: { otherEvent in
                guard event.id != otherEvent.id else {
                    return false
                }
            
                return event.date == otherEvent.date
            })
        }
    }
}

Once a given function reaches quadratic time complexity, chances are quite high that it’ll end up becoming some form of performance bottleneck, especially if there’s any chance that it’ll have to operate on a significant number of inputs.

Thankfully, in this case, there’s once again an optimization that can be made. Rather than iterating through our entire events array again as part of each iteration, let’s start by making a single pass through it to identify the count of events for each date — and then use that collection of counts when filtering — like this:

extension Calendar {
    func conflictingEvents() -> [Event] {
        var eventCountsForDates = [Date : Int]()

        for event in events {
            eventCountsForDates[event.date, default: 0] += 1
        }

        return events.filter { event in
            eventCountsForDates[event.date, default: 0] > 1
        }
    }
}

While we’re still performing two iterations within our new implementation, those iterations are completely decoupled — giving us a complexity of O(2N), rather than O(N²). For a user that has 1000 events in their calendar, that means that we’re now only making a maximum of 2,000 iterations, instead of 1,000,000 — quite an improvement!

Time complexity doesn’t tell the complete story of how long a given function will take to run, or how costly it’ll be in terms of CPU or GPU resources — but it does tell us how well it scales according to the number of inputs given to it. By measuring the time complexity of our algorithms and iterations, we can both identify and fix many common performance problems more easily, and also more clearly communicate the cost of calling a given function to our API users or fellow team members.

Thanks for reading! 🚀