instanZ: fast analytic reassigned spectrograms in PyTorch

instanZ consists of a torch-based library for computing analytic reassignment spectrograms, together with a handy lightweight sonogram viewer, both running on PyTorch. It can be found in the Lab’s GitHub repository.

The app features both STFT sonograms and analytic reassignment.

The analytic reassignment library has both frequency-stride and time-stride algorithms. It is based on the analytic reassignment framework (instantaneous time, instantaneous frequency) derived in Timothy Gardner and Marcelo Magnasco, Sparse time-frequency representations, PNAS 2006.

The analytic reassignment function is implemented entirely in torch parallel tensor calls. There are no loops in the code.

Method

Analytic reassignment works by noting that, if we define a complex variable z with time as the real part and frequency as the imaginary part, correctly scaled by a bandwidth \sigma as

    \[z \equiv \frac{t}{\sigma} - i \sigma \omega\]

then the integral

    \[G(z) = \int e^{-(z - \frac{t'}{\sigma})^2/2} x(t') dt'\]

is an analytic function of z. Furthermore, defining the STFT over a Gaussian window and over a derivative of a Gaussian,

    \[\chi(t,\omega) = \int e^{-(t-t')^2/(2\sigma^2)} e^{i \omega (t-t')} x(t') dt'\]


    \[\eta(t,\omega) = \frac{1}{\sigma} \int (t'-t) e^{-(t-t')^2/(2\sigma^2)} e^{i \omega (t-t')} x(t') dt'\]


then

    \[G(z) = \chi e^{(\sigma \omega)^2/2}\]
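
(To see this, expand the exponent of G using the definition of z:

    \[-\frac{1}{2}\left(z - \frac{t'}{\sigma}\right)^2 = -\frac{(t-t')^2}{2\sigma^2} + i\omega(t-t') + \frac{(\sigma\omega)^2}{2}\]

The first two terms are the exponent of the kernel of \chi, and the last does not depend on t', so it factors out of the integral.)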


so up to a factor the STFT \chi is also analytic, and defining

    \[z_{ins} = \frac{1}{\sigma} t_{ins} - i\sigma \omega_{ins}\]


where t_{ins} is the instantaneous time and \omega_{ins} the instantaneous frequency, we get

    \[z_{ins} = z + \left( \frac{\eta}{\chi} \right)^\dagger\]

where \dagger is complex conjugation.
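
As a sanity check on this expression, here is a minimal sketch in plain Python. It is not part of instanZ: the function name `reassign_point` and the two test signals are assumptions made for illustration. It evaluates \chi and \eta by direct Riemann sums at a single (t, \omega) point and reads t_{ins} and \omega_{ins} off z_{ins}. The explicit loop is for readability only.

```python
import cmath
import math


def reassign_point(x, t, omega, sigma, dt):
    """Return (t_ins, omega_ins) at one (t, omega), using Riemann sums.

    x holds samples x(n * dt); sigma is the Gaussian bandwidth.
    """
    chi = 0j
    eta = 0j
    for n, xn in enumerate(x):
        tp = n * dt                      # integration variable t'
        u = t - tp
        kernel = math.exp(-u * u / (2 * sigma**2)) * cmath.exp(1j * omega * u)
        chi += kernel * xn * dt
        eta += (tp - t) / sigma * kernel * xn * dt
    z = t / sigma - 1j * sigma * omega
    z_ins = z + (eta / chi).conjugate()
    # z_ins = t_ins / sigma - i * sigma * omega_ins
    return sigma * z_ins.real, -z_ins.imag / sigma


dt, sigma = 0.01, 1.0

# Pure tone at omega_0 = 5, probed off-frequency at omega = 5.7.
tone = [cmath.exp(1j * 5.0 * n * dt) for n in range(2000)]
t_tone, w_tone = reassign_point(tone, t=10.0, omega=5.7, sigma=sigma, dt=dt)

# Single click at t_0 = 12, probed off-time at t = 10.7.
click = [0.0] * 2000
click[1200] = 1.0
t_click, w_click = reassign_point(click, t=10.7, omega=3.0, sigma=sigma, dt=dt)
```

For the tone, the reassigned frequency comes back to approximately 5 while the time stays at the probe time. For the click, the reassigned time comes back to approximately 12 while the frequency stays at the probe frequency.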

The instanZ code is based on computing z_{ins} (hence the name! haha) using this expression directly. Both STFTs are computed in torch by unfolding the input signal into overlapping snippets with torch.unfold, windowing them and applying the FFT to all snippets in parallel, and then creating the 2D histogram of the reassigned time-frequency points.
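
The steps above can be sketched roughly as follows. This is an illustrative reconstruction, not the instanZ source: the function name `reassigned_histogram`, its parameters, the energy threshold, and the use of `torch.bincount` for the histogram are all assumptions. It uses `Tensor.unfold` for the snippets, one real FFT per window, and no Python loops.

```python
import math
import torch


def reassigned_histogram(x, sigma, n_fft, hop, n_tbins, n_fbins):
    """Loop-free reassigned spectrogram of a 1-D real signal (illustrative).

    sigma, n_fft and hop are in samples. The result has shape
    (n_fbins, n_tbins), covering 0..pi rad/sample and the whole signal.
    """
    # Overlapping snippets, shape (n_frames, n_fft).
    frames = x.unfold(0, n_fft, hop)
    n_frames = frames.shape[0]

    # Offsets u = t' - t of each sample from its frame centre.
    u = torch.arange(n_fft, dtype=x.dtype) - n_fft // 2
    gauss = torch.exp(-u**2 / (2 * sigma**2))

    # chi: Gaussian window; eta: (t' - t) / sigma times the same window.
    # The FFT phase reference (frame start rather than centre) multiplies
    # both by the same factor, so it cancels in eta / chi.
    chi = torch.fft.rfft(frames * gauss, dim=-1)
    eta = torch.fft.rfft(frames * (gauss * u / sigma), dim=-1)

    # Grid coordinates: frame centres (samples), bin frequencies (rad/sample).
    t = (torch.arange(n_frames, dtype=x.dtype) * hop + n_fft // 2).unsqueeze(1)
    omega = torch.arange(chi.shape[1], dtype=x.dtype) * (2 * math.pi / n_fft)

    # Skip cells with negligible energy, where eta / chi is unstable.
    power = chi.abs() ** 2
    keep = power > 1e-8 * power.max()
    ratio = eta / torch.where(keep, chi, torch.ones_like(chi))

    # z_ins = z + conj(eta / chi), split into real and imaginary parts.
    t_ins = t + sigma * ratio.real
    w_ins = omega + ratio.imag / sigma

    # Energy-weighted 2D histogram via a flattened bincount.
    t_idx = (t_ins / x.shape[0] * n_tbins).floor().long()
    f_idx = (w_ins / math.pi * n_fbins).floor().long()
    keep = keep & (t_idx >= 0) & (t_idx < n_tbins) & (f_idx >= 0) & (f_idx < n_fbins)
    flat = (f_idx * n_tbins + t_idx)[keep]
    hist = torch.bincount(flat, weights=power[keep], minlength=n_fbins * n_tbins)
    return hist.view(n_fbins, n_tbins)


# A pure tone at 0.8 rad/sample should collapse onto one frequency row.
n = torch.arange(4096, dtype=torch.float64)
hist = reassigned_histogram(torch.cos(0.8 * n), sigma=16.0, n_fft=256, hop=32,
                            n_tbins=32, n_fbins=64)
row = int(hist.sum(dim=1).argmax())  # row index of the tone's energy
```

In this sketch the window length is 16 \sigma, so the truncated Gaussian tails are negligible. Energy is weighted by |\chi|^2, which is one common choice rather than something the description above specifies.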

Library

The instanZ library has the following interface:

The app

The app is a self-contained Python program with a user interface in Tkinter. It supports opening a variety of sound files supported in TorchAudio, plus movies (from which only the audio track is examined), through (a) a command line argument, (b) the file menu “open”, or (c) drag-and-drop anywhere.