Building a Granular Synth in Swift, Part 3: Making it Grain

In the previous two posts (1, 2) we’ve essentially created a very barebones AVAudioSourceNode audio playback example. Apple has a good example too, but it’s a command-line app that doesn’t load audio. In any case, it’s time for grains. As most readers probably know, the seminal reference for grain concepts is Curtis Roads’ Microsound.

We’ll want a data structure for each audio grain, and we’ll want some encapsulation that manages all of them. Leaving out most details for now, our most basic grain player might look like GrainSwift/GrainEngine.swift at 784979720b1a1c1551191ce60701538b53ae9a9a · thestrangeagency/GrainSwift · GitHub.
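To make the idea concrete without reproducing the linked file, here is a minimal sketch of a grain struct plus an engine that owns all of them. The names SketchGrain and SketchGrainEngine, their fields, and the even spacing of grain start points are assumptions for illustration, not the code in the commit above.

```swift
// Hypothetical sketch: names and fields are assumptions, not the linked commit's code.
struct SketchGrain {
    var index: Int      // current read position in the source buffer
    let start: Int      // first sample of the grain
    let length: Int     // grain length in samples, assumed > 0

    // Return the current stereo frame and advance, looping within the grain.
    mutating func sample(from buffer: [SIMD2<Float>]) -> SIMD2<Float> {
        let value = buffer[index]
        index += 1
        if index >= start + length { index = start }
        return value
    }
}

struct SketchGrainEngine {
    let buffer: [SIMD2<Float>]
    var grains: [SketchGrain]

    init(buffer: [SIMD2<Float>], grainCount: Int, grainLength: Int) {
        precondition(!buffer.isEmpty && grainCount > 0)
        let length = max(1, min(grainLength, buffer.count))
        self.buffer = buffer
        // Spread grain start points evenly over the range that keeps every grain in bounds.
        let range = buffer.count - length + 1
        self.grains = (0..<grainCount).map { i in
            let start = (i * range) / grainCount
            return SketchGrain(index: start, start: start, length: length)
        }
    }

    // Mix one output frame by averaging the current frame of every grain.
    mutating func sample() -> SIMD2<Float> {
        var sum = SIMD2<Float>(repeating: 0)
        for i in grains.indices {
            sum += grains[i].sample(from: buffer)
        }
        return sum / Float(grains.count)
    }
}
```

Everything is allocated up front in init, so the per-frame sample() call only reads the buffer and bumps integer indices.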

Interestingly, we already run into Swift speed limitations here: in a Debug build, the simulator can play only about 2 grains. Increasing the grain count results in stuttering and then the audio dropping out completely. Building for Release instead of Debug lets us crank the grain count up to 2000 and beyond, so there’s a staggering difference in the optimized code.

I think the code above isn’t violating any realtime principles, like doing memory allocation or locking, but the function calls and array iteration may be doing some nastiness under the hood. Many have argued that Swift has no place on the audio thread, but I’m curious if things are any better now. There is some room for Swift optimization, and it would be interesting to benchmark some variations on this implementation along with perhaps ye olde C++ approach.

Instead of trying to time the audio calls, maybe trying to max out grain count would give a good real-world measure. With the commit “adds benchmarking control” (thestrangeagency/GrainSwift@0afb18f on GitHub) we make a new Release scheme and some testing UI, and we can happily get up to 12,000 grains on an iPhone Pro Max. Not too shabby.

Note that to add the control, grainEngine became an instance variable of the Audio class. I tried a guard around grainEngine inside the AVAudioSourceNode callback but this completely broke audio rendering.

// not so great
guard var grainEngine = self.grainEngine else {
    return noErr
}

Using a forced unwrap, as in the commit above, or even a default value for the optional works fine, however.

// chill
let sample = self.grainEngine!.sample()
// also chill
let sample = self.grainEngine?.sample() ?? SIMD2<Float>(0.0, 0.0)
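
One possible explanation, and it is only a guess that assumes grainEngine is a struct: guard var binds a local copy, so any state advanced by sample() lands in the copy and is thrown away when the callback returns, while the forced unwrap and optional chaining mutate the stored value in place. The sketch below shows that difference with a hypothetical Counter and Holder standing in for the engine and the Audio class.

```swift
// Hypothetical stand-in for a value-type engine; not the post's GrainEngine.
struct Counter {
    var position = 0
    mutating func next() -> Int {
        position += 1
        return position
    }
}

// Hypothetical stand-in for the class that owns the optional engine.
final class Holder {
    var counter: Counter? = Counter()

    // guard var binds a copy, so the stored counter never advances.
    func nextViaGuardVar() -> Int {
        guard var counter = self.counter else { return 0 }
        return counter.next()
    }

    // Optional chaining mutates the stored value in place.
    func nextInPlace() -> Int {
        return self.counter?.next() ?? 0
    }
}
```

Calling nextViaGuardVar() repeatedly keeps returning the same value, which for an audio engine would mean every render call restarts from the same state.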

As for the suggestions in the Swift optimization reference, making things more private, fileprivate, or final had no effect. Using ContiguousArray instead of Array for the grains resulted in crapping out after 6,000 grains. ContiguousArray is lauded as a speed boost, but here it for some reason cuts performance in half. There’s some discussion about it here: Execution time ContiguousArray vs Array - Using Swift - Swift Forums, but I think I’ll stick with Array for now.

Inlining all of the grain code and using an UnsafeMutablePointer to store grain indices gets us to just over 20,000 grains. These are impressive gains but they come at the cost of some ugliness, essentially doing away with any structure: inline all the things · GitHub.
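
For a rough idea of what “doing away with structure” can look like, here is a mono sketch where the only per-grain state is a raw block of read indices. The class name InlineGrains, the evenly spaced start points, and the averaging mix are assumptions for illustration, not the code in the linked gist.

```swift
// Hypothetical sketch: per-grain state is just a manually managed block of indices.
final class InlineGrains {
    private let source: [Float]
    private let grainCount: Int
    private let grainLength: Int
    private let spacing: Int                      // distance between grain start points
    private let indices: UnsafeMutablePointer<Int>

    init(source: [Float], grainCount: Int, grainLength: Int) {
        precondition(grainCount > 0 && grainLength > 0 && grainLength <= source.count)
        self.source = source
        self.grainCount = grainCount
        self.grainLength = grainLength
        self.spacing = (source.count - grainLength) / grainCount
        // Allocate once, outside the audio callback.
        self.indices = UnsafeMutablePointer<Int>.allocate(capacity: grainCount)
        for i in 0..<grainCount {
            (indices + i).initialize(to: i * spacing)
        }
    }

    deinit {
        indices.deinitialize(count: grainCount)
        indices.deallocate()
    }

    // Average the current sample of every grain and advance each read index.
    func sample() -> Float {
        var sum: Float = 0
        source.withUnsafeBufferPointer { samples in
            for i in 0..<grainCount {
                let index = indices[i]
                sum += samples[index]
                let start = i * spacing
                // Loop back to the grain's start once its end is reached.
                indices[i] = index + 1 < start + grainLength ? index + 1 : start
            }
        }
        return sum / Float(grainCount)
    }
}
```

The price is manual memory management: the pointer has to be deinitialized and deallocated by hand, and nothing stops an out-of-range index.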

One further option is keeping the structs but using withUnsafeMutableBufferPointer for their iteration. This gets us to 14,000 grains. We will want to do a lot more math per grain, adding windowing, offsets, LFOs, and so forth, but we also need maybe a tenth as many grains at the very most, so this implementation is likely to be good enough, even on a slower device.
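
A minimal sketch of that middle ground, assuming a hypothetical LoopGrain struct and a free function mix (neither is from the repository): the grains stay in an ordinary Array, and only the hot loop goes through the buffer pointer.

```swift
// Hypothetical grain type for illustration.
struct LoopGrain {
    var index: Int   // current read position
    let start: Int   // first sample of the grain
    let end: Int     // one past the last sample of the grain
}

// Sum one mono output sample across all grains, advancing each grain in place.
func mix(_ grains: inout [LoopGrain], from source: [Float]) -> Float {
    var sum: Float = 0
    grains.withUnsafeMutableBufferPointer { pointer in
        for i in 0..<pointer.count {
            sum += source[pointer[i].index]
            pointer[i].index += 1
            if pointer[i].index >= pointer[i].end {
                pointer[i].index = pointer[i].start
            }
        }
    }
    return sum
}
```

The struct and the array survive, so adding windowing or LFO fields later only touches LoopGrain and the loop body.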

Curiously, a fairly raw Objective-C implementation only gets us to about 10,000 grains. It does, however, give much better performance in Debug mode, which in a complex app is certainly worth something.

Finally, it’s time for a little refactoring. Let’s move the source node initialization code to the grain engine, which we’ll rename to GrainSource. And instead of piping grain controls through methods in the Audio object, we’ll keep these grain-specific options in the GrainSource: Release 0.0.1 · thestrangeagency/GrainSwift · GitHub.