Managing self and cancellable references when using Combine
Memory management is often especially tricky when dealing with asynchronous operations, as those tend to require us to keep certain objects in memory beyond the scope in which they were defined, while still making sure that those objects eventually get deallocated in a predictable way.
Although Apple’s Combine framework can make it somewhat simpler to manage such long-living references — as it encourages us to model our asynchronous code as pipelines, rather than a series of nested closures — there are still a number of potential memory management pitfalls that we have to constantly look out for.
In this article, let’s take a look at how some of those pitfalls can be avoided, specifically when it comes to self and Cancellable references.
A cancellable manages the lifetime of a subscription
Combine’s Cancellable protocol (which we typically interact with through its type-erased AnyCancellable wrapper) lets us control how long a given subscription should stay alive and active. As soon as an AnyCancellable is deallocated (or manually cancelled), the subscription that it’s tied to is automatically invalidated, which is why almost all of Combine’s subscription APIs (like sink) return an AnyCancellable when called.
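To see that behavior in isolation, here’s a minimal sketch that uses a PassthroughSubject rather than a real asynchronous source. The subject and received names are just placeholders made up for this illustration:

```swift
import Combine

let subject = PassthroughSubject<Int, Never>()
var received = [Int]()

// Keep the token in an optional variable, so that we can release it later.
var cancellable: AnyCancellable? = subject.sink { value in
    received.append(value)
}

subject.send(1)      // Delivered, since the subscription is active.

cancellable = nil    // Releasing the token cancels the subscription.
subject.send(2)      // Not delivered, as there's no longer a subscriber.
```

After running the above, received only contains the first value, since the second one was sent after the token was released.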
For example, the following Clock type holds a strong reference to the AnyCancellable instance that it gets back from calling sink on a Timer publisher, which keeps that subscription active for as long as its Clock instance remains in memory, unless the cancellable is manually removed by setting its property to nil:
import Combine
import Foundation

class Clock: ObservableObject {
    @Published private(set) var time = Date().timeIntervalSince1970
    private var cancellable: AnyCancellable?

    func start() {
        cancellable = Timer.publish(
            every: 1,
            on: .main,
            in: .default
        )
        .autoconnect()
        .sink { date in
            // Note: self is captured strongly by this closure.
            self.time = date.timeIntervalSince1970
        }
    }

    func stop() {
        // Releasing the token cancels the timer subscription.
        cancellable = nil
    }
}
However, while the above implementation perfectly manages its AnyCancellable instance and the Timer subscription that it represents, it does have quite a major flaw in terms of memory management. Since we’re capturing self strongly within our sink closure, and since our cancellable (which is, in turn, owned by self) will keep that subscription alive for as long as it remains in memory, we’ll end up with a retain cycle, or in other words, a memory leak.
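One way to observe such a cycle is to keep a weak reference to an instance and check whether it becomes nil once the last strong reference is removed. The following sketch does that using a hypothetical LeakyTicker class (not part of the article’s code), driven by a PassthroughSubject instead of a Timer to keep things synchronous:

```swift
import Combine

final class LeakyTicker {
    private(set) var count = 0
    private var cancellable: AnyCancellable?

    init(ticks: PassthroughSubject<Void, Never>) {
        // Cycle: self -> cancellable -> closure -> self
        cancellable = ticks.sink { _ in
            self.count += 1
        }
    }
}

let ticks = PassthroughSubject<Void, Never>()
var ticker: LeakyTicker? = LeakyTicker(ticks: ticks)
weak var weakTicker = ticker

// Remove the only strong reference that we hold ourselves.
ticker = nil

// The instance is still alive and still receives values,
// since the closure keeps it in memory.
ticks.send(())
let leakedCount = weakTicker?.count
```

Here weakTicker remains non-nil after ticker has been set to nil, which is a simple way to confirm that the instance was leaked.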
Avoiding self-related memory leaks
An initial idea on how to fix that problem might be to instead use Combine’s assign operator (along with a quick Date-to-TimeInterval transformation using map) to be able to assign the result of our pipeline directly to our clock’s time property, like this:
class Clock: ObservableObject {
    ...

    func start() {
        cancellable = Timer.publish(
            every: 1,
            on: .main,
            in: .default
        )
        .autoconnect()
        .map(\.timeIntervalSince1970)
        .assign(to: \.time, on: self)
    }

    ...
}
However, the above approach will still cause self to be retained, as the assign operator keeps a strong reference to each object that’s passed to it. Instead, with our current setup, we’ll have to resort to a good old-fashioned “weak self dance” in order to capture a weak reference to our enclosing Clock instance, which will break our retain cycle:
class Clock: ObservableObject {
    ...

    func start() {
        cancellable = Timer.publish(
            every: 1,
            on: .main,
            in: .default
        )
        .autoconnect()
        .map(\.timeIntervalSince1970)
        .sink { [weak self] time in
            self?.time = time
        }
    }

    ...
}
With the above in place, each Clock instance can now be deallocated once it’s no longer referenced by any other object, which in turn will cause our AnyCancellable to be deallocated as well, and our Combine pipeline will be properly dissolved. Great!
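The effect of the weak capture can be verified with a small, synchronous sketch. The Ticker class below is a made-up stand-in for Clock that’s driven by a PassthroughSubject, so it shouldn’t be read as the article’s actual implementation:

```swift
import Combine

final class Ticker {
    private(set) var count = 0
    private var cancellable: AnyCancellable?

    init(ticks: PassthroughSubject<Void, Never>) {
        // The closure only holds a weak reference, so no cycle is formed.
        cancellable = ticks.sink { [weak self] _ in
            self?.count += 1
        }
    }
}

let ticks = PassthroughSubject<Void, Never>()
var ticker: Ticker? = Ticker(ticks: ticks)
weak var weakTicker = ticker

ticks.send(())
let countWhileAlive = ticker?.count

// Removing the last strong reference deallocates the instance,
// which also releases its AnyCancellable and cancels the subscription.
ticker = nil
ticks.send(())   // Safe: there's no longer any subscriber.
```

This time weakTicker becomes nil as soon as ticker is set to nil, and values sent afterwards are simply dropped.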
Assigning output values directly to a Published property
Another option that can be great to keep in mind is that (as of iOS 14 and macOS Big Sur) we can also connect a Combine pipeline directly to a published property. However, while doing so can be incredibly convenient in a number of different situations, that approach doesn’t give us an AnyCancellable back, meaning that we won’t have any means to cancel such a subscription.
In the case of our Clock type, we might still be able to use that approach if we’re fine with removing our start and stop methods, and instead automatically start each clock upon initialization, since otherwise we might end up with duplicate subscriptions. If those are tradeoffs that we’re willing to accept, then we could change our implementation into this:
class Clock: ObservableObject {
    @Published private(set) var time = Date().timeIntervalSince1970

    init() {
        Timer.publish(
            every: 1,
            on: .main,
            in: .default
        )
        .autoconnect()
        .map(\.timeIntervalSince1970)
        .assign(to: &$time)
    }
}
When calling the above flavor of assign, we’re passing a direct reference to our Published property’s projected value, prefixed with an ampersand since that parameter is declared using the inout keyword. To learn more about that pattern, check out the Basics article about value and reference types.
The beauty of the above approach is that Combine will now automatically manage our subscription based on the lifetime of our time property, meaning that we’re still avoiding any reference cycles while also significantly reducing the amount of bookkeeping code that we have to write ourselves. So for pipelines that are only configured once, and are directly tied to a Published property, using the above overload of the assign operator can often be a great choice.
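As a rough illustration of that lifetime behavior, the sketch below connects a PassthroughSubject to a published property and then checks that the owning object can still be deallocated. The Score class and its updates parameter are hypothetical, and the code assumes a deployment target of iOS 14 or macOS Big Sur (or later):

```swift
import Combine

final class Score: ObservableObject {
    @Published private(set) var value = 0

    init(updates: PassthroughSubject<Int, Never>) {
        // No AnyCancellable is returned here. The subscription is
        // instead tied to the lifetime of the published property.
        updates.assign(to: &$value)
    }
}

let updates = PassthroughSubject<Int, Never>()
var score: Score? = Score(updates: updates)
weak var weakScore = score

updates.send(5)
let valueWhileAlive = score?.value

// The pipeline doesn't retain the Score instance, so removing
// our own reference is enough for it to be deallocated.
score = nil
updates.send(7)   // Safe: the subscription was cancelled along with the property.
```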
Weak property assignments
Next, let’s take a look at a slightly more complex example, in which we’ve implemented a ModelLoader that lets us load and decode a Decodable model from a given URL. By using a single cancellable property, our loader can automatically cancel any previous data loading pipeline when a new one is triggered, as any previously assigned AnyCancellable instance will be deallocated when that property’s value is replaced.
Here’s what that ModelLoader type currently looks like:
class ModelLoader<Model: Decodable>: ObservableObject {
    enum State {
        case idle
        case loading
        case loaded(Model)
        case failed(Error)
    }

    @Published private(set) var state = State.idle

    private let url: URL
    private let session: URLSession
    private let decoder: JSONDecoder
    private var cancellable: AnyCancellable?

    ...

    func load() {
        state = .loading

        cancellable = session
            .dataTaskPublisher(for: url)
            .map(\.data)
            .decode(type: Model.self, decoder: decoder)
            .map(State.loaded)
            .catch { error in
                Just(.failed(error))
            }
            .receive(on: DispatchQueue.main)
            .sink { [weak self] state in
                self?.state = state
            }
    }
}
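The way that replacing a single cancellable property cancels the previous pipeline can be seen in isolation using the following sketch. It uses a hypothetical Searcher class and two PassthroughSubject instances as stand-ins for separate network requests, and counts cancellations through the handleEvents operator:

```swift
import Combine

final class Searcher {
    private(set) var cancelCount = 0
    private(set) var results = [String]()
    private var cancellable: AnyCancellable?

    func search(using source: PassthroughSubject<String, Never>) {
        // Assigning a new token releases the old one, which in
        // turn cancels the pipeline that it was tied to.
        cancellable = source
            .handleEvents(receiveCancel: { [weak self] in
                self?.cancelCount += 1
            })
            .sink { [weak self] result in
                self?.results.append(result)
            }
    }
}

let first = PassthroughSubject<String, Never>()
let second = PassthroughSubject<String, Never>()
let searcher = Searcher()

searcher.search(using: first)
searcher.search(using: second)   // Cancels the pipeline attached to `first`.

first.send("stale")              // Ignored, as that pipeline was cancelled.
second.send("fresh")             // Delivered to the current pipeline.
```

After the second call to search, exactly one cancellation has been recorded, and only the value sent through the second subject ends up in results.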
While that automatic cancellation of old requests prevents us from simply connecting the output of our data loading pipeline to our Published property, if we wanted to avoid having to manually capture a weak reference to self every time that we use the above pattern (that is, loading a value and assigning it to a property), we could introduce the following Publisher extension, which adds a weak-capturing version of the standard assign operator that we took a look at earlier:
extension Publisher where Failure == Never {
    func weakAssign<T: AnyObject>(
        to keyPath: ReferenceWritableKeyPath<T, Output>,
        on object: T
    ) -> AnyCancellable {
        sink { [weak object] value in
            object?[keyPath: keyPath] = value
        }
    }
}
With the above in place, we can now simply call weakAssign whenever we want to assign the output of a given publisher to a property of an object that’s captured using a weak reference, like this:
class ModelLoader<Model: Decodable>: ObservableObject {
    ...

    func load() {
        state = .loading

        cancellable = session
            .dataTaskPublisher(for: url)
            .map(\.data)
            .decode(type: Model.self, decoder: decoder)
            .map(State.loaded)
            .catch { error in
                Just(.failed(error))
            }
            .receive(on: DispatchQueue.main)
            .weakAssign(to: \.state, on: self)
    }
}
Is that new weakAssign method purely syntactic sugar? Yes. But is it nicer than what we were using before? Also yes 🙂
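To check that weakAssign really avoids retaining its target, here’s a small self-contained sketch. It repeats the extension so that it can be run on its own, and uses a hypothetical Label class with a text property as the assignment target:

```swift
import Combine

extension Publisher where Failure == Never {
    func weakAssign<T: AnyObject>(
        to keyPath: ReferenceWritableKeyPath<T, Output>,
        on object: T
    ) -> AnyCancellable {
        sink { [weak object] value in
            object?[keyPath: keyPath] = value
        }
    }
}

final class Label {
    var text = ""
    private var cancellable: AnyCancellable?

    func bind(to titles: PassthroughSubject<String, Never>) {
        // self owns the token, but the pipeline only references self weakly.
        cancellable = titles.weakAssign(to: \.text, on: self)
    }
}

let titles = PassthroughSubject<String, Never>()
var label: Label? = Label()
weak var weakLabel = label

label?.bind(to: titles)
titles.send("Hello")
let textWhileAlive = label?.text

// Since no cycle was formed, the label is deallocated right away.
label = nil
```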
Capturing stored objects, rather than self
Another type of situation that’s quite commonly encountered when working with Combine is when we need to access a specific property within one of our operators, for example in order to perform a nested asynchronous call.
To illustrate, let’s say that we wanted to extend our ModelLoader by using a Database to automatically store each model that was loaded, an operation that also wraps those model instances using a generic Stored type (for example in order to add local metadata, such as an ID or model version). To be able to access that database instance within an operator like flatMap, we could once again capture a weak reference to self, like this:
class ModelLoader<Model: Decodable>: ObservableObject {
    enum State {
        case idle
        case loading
        case loaded(Stored<Model>)
        case failed(Error)
    }

    ...
    private let database: Database
    ...

    func load() {
        state = .loading

        cancellable = session
            .dataTaskPublisher(for: url)
            .map(\.data)
            .decode(type: Model.self, decoder: decoder)
            .flatMap {
                [weak self] model -> AnyPublisher<Stored<Model>, Error> in

                guard let database = self?.database else {
                    return Empty(completeImmediately: true)
                        .eraseToAnyPublisher()
                }

                return database.store(model)
            }
            .map(State.loaded)
            .catch { error in
                Just(.failed(error))
            }
            .receive(on: DispatchQueue.main)
            .weakAssign(to: \.state, on: self)
    }
}
The reason we use the flatMap operator above is that our database is also asynchronous, and returns another publisher that represents the current saving operation.
However, as the above example shows, it can sometimes be tricky to come up with a reasonable default value to return from an unwrapping guard statement placed within an operator like map or flatMap. Above we use Empty, which works, but it does add a substantial amount of extra verbosity to our otherwise quite elegant pipeline.
Thankfully, that problem is quite easy to fix (at least in this case). All that we have to do is to capture our database property directly, rather than capturing self. That way, we don’t have to deal with any optionals, and can now simply call our database’s store method within our flatMap closure, like this:
class ModelLoader<Model: Decodable>: ObservableObject {
    ...

    func load() {
        state = .loading

        cancellable = session
            .dataTaskPublisher(for: url)
            .map(\.data)
            .decode(type: Model.self, decoder: decoder)
            .flatMap { [database] model in
                database.store(model)
            }
            .map(State.loaded)
            .catch { error in
                Just(.failed(error))
            }
            .receive(on: DispatchQueue.main)
            .weakAssign(to: \.state, on: self)
    }
}
As an added bonus, we could even pass the Database method that we’re looking to call directly into flatMap in this case, since its signature perfectly matches the closure that flatMap expects within this context (and thanks to the fact that Swift supports first class functions):
class ModelLoader<Model: Decodable>: ObservableObject {
    ...

    func load() {
        state = .loading

        cancellable = session
            .dataTaskPublisher(for: url)
            .map(\.data)
            .decode(type: Model.self, decoder: decoder)
            .flatMap(database.store)
            .map(State.loaded)
            .catch { error in
                Just(.failed(error))
            }
            .receive(on: DispatchQueue.main)
            .weakAssign(to: \.state, on: self)
    }
}
So, when possible, it’s typically a good idea to avoid capturing self within our Combine operators, and to instead call other objects that can be stored and directly passed into our various operators as properties.
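The following sketch shows that capturing a stored object this way keeps the enclosing instance free to be deallocated. Note that Cache and Importer are simplified, hypothetical stand-ins for Database and ModelLoader, and that the store method here completes synchronously using Just, which a real database most likely wouldn’t:

```swift
import Combine

final class Cache {
    private(set) var stored = [String]()

    func store(_ value: String) -> AnyPublisher<String, Never> {
        stored.append(value)
        return Just("stored:\(value)").eraseToAnyPublisher()
    }
}

final class Importer {
    private(set) var lastResult = ""
    private let cache: Cache
    private var cancellable: AnyCancellable?

    init(cache: Cache) {
        self.cache = cache
    }

    func start(with input: PassthroughSubject<String, Never>) {
        cancellable = input
            // Only the cache is captured (strongly) here, not self.
            .flatMap { [cache] value in
                cache.store(value)
            }
            .sink { [weak self] result in
                self?.lastResult = result
            }
    }
}

let input = PassthroughSubject<String, Never>()
let cache = Cache()
var importer: Importer? = Importer(cache: cache)
weak var weakImporter = importer

importer?.start(with: input)
input.send("a")
let resultWhileAlive = importer?.lastResult

// The flatMap closure keeps the cache alive, but not the importer.
importer = nil
```

One thing to keep in mind with this design is that the captured object stays in memory for as long as the pipeline does, which is typically fine when that pipeline is owned by the same instance that owns the captured object.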
Conclusion
While Combine offers many APIs and features that can help us make our asynchronous code easier to write and maintain, it still requires us to be careful when it comes to how we manage our references and their underlying memory. Capturing a strong reference to self in the wrong place can still often lead to a retain cycle, and if a Cancellable is not properly deallocated, then a subscription might stay active for longer than expected.
Hopefully this article has given you a few new tips and techniques that you can use to prevent memory-related issues when working with self and Cancellable references within your Combine-based projects, and if you have any questions, comments, or feedback, then feel free to reach out via either Twitter or email.
Thanks for reading!