A day of coding…
Following on from my experiment a few days ago to see if I could use a min-max filter for envelope extraction, I set about fiddling further with this.
I oh-so-nearly had some nice results to post, but not quite, so bear with me. For the time being, some words will have to do.
First off, I did some quick and dirty experiments just removing the ‘envelope’ from one sound and applying one from another. I did this, at first, by half-wave rectifying the original signal into upper and lower portions, and dividing by the max/min filter outputs respectively. The results weren’t all that exciting, partly because to deal with the inevitable overflow of the demodulated signal beyond -1:1, I just clipped (on the basis of this being the simplest approach to start with).
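For the curious, the quick and dirty version described above can be sketched in a few lines of Python. This is a minimal sketch and not the code actually used: the function names, the naive causal window and the eps guard against division by zero are all assumptions made for illustration.

```python
import math

def running_extreme(x, size, pick):
    """Naive causal min-max filter: pick (max or min) over the last `size` samples."""
    return [pick(x[max(0, i - size + 1): i + 1]) for i in range(len(x))]

def demodulate_halfwave(x, size, eps=1e-9):
    """Divide the positive half by the running max and the negative half
    by the magnitude of the running min, then clip to -1:1."""
    upper = running_extreme(x, size, max)
    lower = running_extreme(x, size, min)
    out = []
    for s, hi, lo in zip(x, upper, lower):
        env = hi if s >= 0.0 else -lo      # envelope for this half of the wave
        y = s / max(env, eps)              # eps (assumed) avoids dividing by zero
        out.append(max(-1.0, min(1.0, y))) # the crude clipping step
    return out

def apply_envelope(carrier, donor, size):
    """Re-modulate a flattened signal with the min-max envelopes of another sound."""
    upper = running_extreme(donor, size, max)
    lower = running_extreme(donor, size, min)
    return [c * (max(hi, 0.0) if c >= 0.0 else max(-lo, 0.0))
            for c, hi, lo in zip(carrier, upper, lower)]

# Usage: strip the swell from one test tone and impose the decay of another.
sr = 1000
a = [(0.2 + 0.8 * i / sr) * math.sin(2 * math.pi * 50 * i / sr) for i in range(sr)]
b = [(1.0 - i / sr) * math.sin(2 * math.pi * 80 * i / sr) for i in range(sr)]
flat = demodulate_halfwave(a, 40)
swapped = apply_envelope(flat, b, 40)
```

In this naive form the window always contains the current sample, so the clip only acts as a safety net. A real-time version would want a proper O(1)-per-sample min-max filter rather than re-scanning the window each time.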
Whilst pondering how better to approach the demodulation, it struck me that the kind of output I was getting from the min-max filter (basically a step function between local extrema, but not quite) was similar to what is used in something called Local Mean Decomposition, developed a few years ago by a chap called Jonathan Smith.
Digression…
LMD is one of a range of ways that engineers have been looking at recently to do better time-frequency analysis of non-stationary signals (which is most of them). It is similar in thrust, though not in detail, to another scheme called Empirical Mode Decomposition, first devised by Norden Huang. Both processes are data-driven, meaning that they make no assumptions about what signals are composed of (contra short-time Fourier techniques, which assume that everything is a collection of slowly varying sinusoids). However, neither is designed for real-time use. Although Doug van Nort has investigated the potential of EMD for live electronics and sound analysis (and released an emd~ Max object with Kyle McDonald), the object works on successive signal vectors in isolation. This is fine for generating interesting control data and doing feature extraction, but not so fine to listen to: there are pretty harsh discontinuities at vector boundaries and a low frequency limit imposed by the vector size, not to mention that the audible artefacts of EMD are pretty gnarly frequency modulation in sub-bands.
If these schemes could be got to work satisfactorily for audio processing, it would be, like, awesome, however. They offer extraordinary time resolution and, in principle, extraordinary frequency resolution as well. They both work on the principle of decomposing a signal into components for which one has both instantaneous amplitude and instantaneous frequency. This would lend itself to, among other things, adaptive, multi-band granular decompositions; filtering with highly desirable transient response; spectral monkeying about (retuning, time stretching, etc.); using infra-audio components for control signals, and so on, and so on.
Back to business…
So, I was tantalised by the vague possibility that using this min-max scheme, I might be able to make an online thing that was at least a bit like LMD, though I wasn’t holding out much hope (having wasted far too much time tinkering with these things already, to little avail). Initially, I just set about following LMD’s lead in approaching demodulation. Principally, this involves rolling both upper and lower envelopes into one, but also generating a ‘mean’ between them. The mean is then subtracted from the original before demodulating, and this helps avoid blow-up. In LMD both the envelope and mean are smoothed as well, then the whole process is iterated until you end up with a pure FM signal between -1:1.
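A single pass of that idea might look something like the sketch below. It is an illustration under stated assumptions rather than the actual patch: the running min-max filter stands in for LMD’s extrema-based step function, the smoother is a plain moving average (LMD proper smooths its own way), and every name here, plus the eps guard, is hypothetical.

```python
import math

def running_extreme(x, size, pick):
    """Naive causal min-max filter over the last `size` samples."""
    return [pick(x[max(0, i - size + 1): i + 1]) for i in range(len(x))]

def moving_average(x, size):
    """Causal moving average; the window grows until `size` samples are available."""
    out, acc = [], 0.0
    for i, v in enumerate(x):
        acc += v
        if i >= size:
            acc -= x[i - size]
        out.append(acc / min(i + 1, size))
    return out

def lmd_like_step(x, size, eps=1e-9):
    """One pass: local mean and single envelope from the max/min pair,
    both smoothed, then subtract the mean and divide by the envelope."""
    hi = running_extreme(x, size, max)
    lo = running_extreme(x, size, min)
    mean = moving_average([(h + l) / 2.0 for h, l in zip(hi, lo)], size)
    env = moving_average([(h - l) / 2.0 for h, l in zip(hi, lo)], size)
    fm = [(s - m) / max(e, eps) for s, m, e in zip(x, mean, env)]
    return mean, env, fm

# Usage: a sine of period 20 samples, amplitude 0.25, riding on an offset of 0.5.
x = [0.5 + 0.25 * math.sin(2 * math.pi * n / 20) for n in range(200)]
mean, env, fm = lmd_like_step(x, 20)
```

Once the filters have warmed up, mean should settle on the offset, env on the amplitude, and fm on something close to a unit sine. On less tidy input fm will stray outside -1:1, which is why LMD repeats the step on fm until its envelope is near enough to 1. Note that this sketch makes no attempt to compensate for the latency of the filters.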
Now, quite a lot of time was spent fixing dumbass bugs, but I eventually got a decent smoothing scheme going (like this) which didn’t add too much extra latency, and started tinkering. Much to my surprise, once I’d corrected for the latency between the min-max filter and the input, I was able to get an FM that was almost -1:1. Almost, for these purposes, is way good enough, so just for kicks, I coded in the other steps of a layer of LMD to see how it sounded.
To my dumbfoundment, not only does it seem to be filtering (I can extract a higher frequency signal, and leave a lower frequency residue), but it doesn’t sound shit! This never normally happens. I need to do more fiddling with the relative filter sizes (at the moment I’ve got the min-max and smoothing filters set to the same size, which needs decoupling), and various other things, but at the very least there looks to be scope here for a nice little adaptive filter bank.
The really important thing for tomorrow is not to vanish up my algorithmic arse in pursuit of perfection, but to stay focused on the musical task at hand, ahead of Wednesday’s session. To follow John Bowers, ‘crude but usable’ is the order du jour. Once I’ve done some verifying / fiddling (which I shall time-limit), I shall a) experiment with using the various outputs to feed a signal-driven granulator, b) look at multi-band amplitude re-modulations / frequency re-mods (harder, but potentially v. cool), and c) see what else I can do with the frequency data.
I’ve got a structure in mind for the final piece, beyond its being a solo-for-two, but Wed. should be more about workshopping and seeing what emerges, I think.