I’m messing around with your RackAFX FFT object and am looking at: AN-2: Threading an FFT for Parallel Processing in a Plug-In. First of all, thank you for putting this on your website. I have been trying to make sense of the whole FFT scheme for a while – and I have a pretty good understanding of it now – but you seem to be the only one who has put a real-world example online that is easy to follow:)
I rebuilt the AN-2 app note project. It’s working fine, but I have a question about the results you store in the m_dFFT_Magnitude[FFT_LEN/2] array. In the doFFT() function you assign values to the array in the following manner:
m_dFFT_Magnitude[i] = sqrt((re*re)+(im*im));
Why do you include (add) both the real and the imaginary parts in this calculation? It seems to me that you only need the real part for the frequency response. The imaginary part contains the phase information, doesn’t it?
All the best,
PS: Cool tip about the RackAFX status window, btw
The FFT is a mathematical transformation in a special space. The complex values you see in the FFT output are coefficients for the different axes of that space (if you do a 1024-point FFT, you are working in a space of 1024 dimensions, where each axis represents a complex exponential function). If you use only the real part of the FFT results, then from a purely mathematical view you project the result down to one dimension, and "projecting down" always means a loss of information. The "magnitude" is actually the absolute value of the complex number, i.e. its l2-norm or "geometric norm", not just the real part.
Here's a very nice resource for DFT:
I'm not sure how to interpret the real part alone, so I modified the project I'm currently working on to show the difference. I was a little surprised that the visible difference is actually very small. It looks to me like the lower the frequency, the higher the error gets.
This is an 8192-point FFT with a Hann window and 80% overlap. The white curve is the magnitude on a dB scale, yellow is the "real part only" curve on the same scale, and red is the absolute error, scaled linearly and "empirically" so that it fits into the graph window:
Looks like the error occurs where the slope changes sign... or maybe I have a bug in my error calculation, but I have already double-checked it.
Will, can you explain what we see here or if I did something wrong?
I checked out your links and delved a little deeper into Parseval’s theorem. I think I understand! I want to test the FFT by making some relatively simple effects like ‘Robotization’ (set the phase in every bin to zero) or ‘Whisperization’ (set the phase in every bin to a random value). I got a little confused by the fact that the term ‘Magnitude’ is sometimes used for the real part of the bins, as well as for the result of sqrt((re*re)+(im*im)), which is used for building a spectrum analyzer…
I'm also very curious about graph issue. Thanks for your help
I read through the Jack Schaedler tutorial and learned some new stuff. A very good source indeed. I realise now that the results of an FFT are not represented in polar notation but in Cartesian notation… If I transform to polar notation I get the ‘Magnitude’ and ‘Phase’ (in degrees). This takes away my confusion completely. Thanks for your help, Tom!
Hi Tom and JD
First off, the notion that the real part of the FFT (or ordinary Fourier Transform, FT) is the magnitude and the imaginary part is the phase is just wrong, though I have also seen a few websites using terminology like that. But it looks like Tom pointed you to a good source for understanding the two-dimensionality of the Fourier Transform. You do need both parts to get all the information.
Secondly, one of the things I do in my classes to build a more intuitive understanding of the FT is a thought experiment about why multiplying a function x(t) by a sinusoid (sin(wt), cos(wt) or e^jwt) and then integrating the result produces a value that tells you how much of that sinusoid is inside x(t). Using only cos(wt) on signals x(t), you arrive at the Cosine Transform (and Discrete Cosine Transform). When x(t) is cos(wt), cos(2wt) etc. with no phase shift, you get usable answers about the relative amount of cos(wt) inside x(t). But as soon as you shift the phase of x(t), it falls apart - you need another degree of freedom to handle that. Like I said, this is more of a thought experiment, but it gives some more insight into the workings of the FT.
Tom, I'm not sure how you are doing your FFT plots but I am guessing that for the real part, you are squaring the Re component then taking the root - i.e. setting the Im part to 0; the squaring forces the values to all be positive before you take the log for the plot. Otherwise, you would have a giant mess there.
But the FT output really depends on the input signal. When the input function is even, there is no imaginary component, so sqrt(re^2) and the magnitude are going to be the same. Odd functions have no real part at all, so for them the magnitude and the real part will be very different. We assume audio signals are blends of odd and even functions, so we expect a combination of both.
My favorite source on the topic is "The FFT: Fundamentals and Concepts" by Robert Ramirez. The figure here is from that book and shows a pulse with real, imaginary, magnitude and phase all plotted. You can see that the real and magnitude plots look very similar, though the magnitude plot is unipolar because of the squaring. The imaginary part is not trivial, yet the magnitude does indeed resemble the real part. Taking sqrt(re^2) and then the log would make them look even closer - small changes in dB correspond to larger changes in linear values. So, although they may look similar, they are not identical.
A purely odd function is here, with no real component:
In this case, the magnitude is just a rectified version of the imaginary component. Likewise in a purely even function you would have a magnitude that is a rectified version of the real part.
So, the similarities you are seeing are not something to get worried about, and I think taking the log of the signals makes them look even more similar to the eye. But you should forget the idea that the real component is always going to be similar to the magnitude, as the FFT-1.jpg illustration shows.
Hope that helps - great discussion you guys have going on.
Thanks for your message! I learned a lot about ‘correlation’ from Tom’s source, which makes a lot of sense. I also read the book by Robert Ramirez you recommended. I got about 75% of it. I’m going to do some tests with windowing and overlap-add, and then I want to explore a few of the available libraries. I heard about KissFFT, FFTW and Octave and am very curious about the differences. (The thing that confuses me most at the moment is multithreading in C++.)
I do have one more question after reading the book though. When you take a real world signal of arbitrary length, like 512 samples, the chances are that the average value of the signal will be non-zero. The author says that a DC component like that will introduce inaccuracies in the frequency domain. This can be prevented by compensating for the DC component – making the average signal zero – before doing the FFT.
I imagine that this might be a good strategy if you’re making a spectrum analyzer, but is the same true for like an auto-tune/pitch-correction like effect? An example:
- Get input array
- Compensate for the DC component
- Store the difference
- Do the FFT
- Do some kind of processing for pitch-correction
- Do the IFFT
- Add the stored difference to the output of the IFFT
- Create output array
Could this make the processing between the FFT and IFFT more accurate, or would it distort the waveform in a bad way? Just thinking out loud here.
All the best,
In our work, we've not had issues with DC offsets when using windowing and especially overlap. You can try an example in RackAFX - just open the analyzer and the oscillator and turn windowing off in the FFT. With a 1kHz signal and a 1024-point FFT you will see a rocking back-and-forth motion with a huge DC offset. The fact that 1kHz is not an exact bin frequency causes the captured buffers to contain a DC offset. Then, change the oscillator frequency to 215.332 Hz, an exact multiple of the bin frequency (44100/1024 ≈ 43.066 Hz) - the DC offset disappears because the captured buffers contain complete cycles of data.
Now go back to 1kHz and then add the windowing back in. You will see a massive reduction in the DC offset as well as a tightening of the lobe. With Blackman-Harris windowing, even a 1024 point FFT will give decent results without DC offset issues. BTW, the RAFX analyzer does not currently use overlap at all for the FFT.
Recently, several of my students have been using the HiFiLoFi FFT convolver for convolving long IRs for reverb. It contains both fast FFT and fast convolution objects. I have not used it personally, but last semester one of my students used it for convolving stupidly long IRs (the fast convolution requires a fast FFT as well). I am told it is fairly easy to integrate into RAFX plugins whereas FFTW is not as simple.
I also have a few students who are using the RAFX FFT object for simple frequency domain processing using FFT and IFFT in realtime without issues, but yes you still need to deal with multithreading.
PS: finishing up the newest RackAFX now - it includes the log/volt-octave GUI controls as well as a bunch of other fixes and enhancements including full preset-support for VST and a new simple one-click-compile for VST2/3 in MacOS. Hope to have it ready by end of this week.
That is good to know! I tried your settings in RackAFX, and it’s cool to see that the windowing has such an impact, even without overlap… Nice to see the different windows in action like this. I downloaded the FFTConvolver from GitHub: great tip! But I’ll explore the RackAFX object first, then.
Very curious about the new RackAFX
Hope I don't bother you too much today, but I’ve got one last question about your FFT object. I have successfully made a few tests with the fft_double() function from your class – which is pretty cool! However, I can’t get the inverse FFT to work. It just doesn’t do anything. I thought the way to use it was just to set the boolean flag argument ‘p_bInverseTransform’ to ‘true’. It might be something silly I overlooked, but I tried this to make sure:
1. Input -> Do FFT
2. Get magnitude and make spectrum with RAFX meters
3. Set real and imaginary parts of the complex number to zero
4. Do IFFT -> Output
The spectrum analysis is working fine, but since I still get an unchanged output signal, I concluded the IFFT is not doing anything. I thought it might be necessary to change the arguments of the function too, but that doesn’t seem to make a difference. I have a feeling I’m missing something obvious here… Any idea what I’m doing wrong?
Here’s a code snippet that might clear some things up: https://www.dropbox.com/s/iq3jecjo8wdco7u/IFFT.png?dl=0
Your arguments are wrong for the IFFT.
The input to the IFFT are the real and imaginary buffers that were the output of the first FFT, not m_dInputArray and NULL.
The output of the IFFT needs to go to another set of buffers.
I will try to get one of my student's projects for you to look at.
Hi Will and Tom,
I did some interesting experiments with the FFT object. First I successfully made a robotization and a whisperization plug-in, and after that I tried some different windowing and overlap-add settings. After some more tests I found out that the atan() function doesn’t give you the right values for all four quadrants of the complex plane. I solved this with two simple if-statements, adjusting the phase accordingly. Now everything is set up for doing more complex frequency-domain processing. But I still run into something I can’t explain.
I want to make a mono->stereo plug-in by shifting the phase of the left channel, and leaving the right channel alone (except for compensating for the FFT’s delay). This works without problems when I shift the phase of the left channel by pi radians, but if I shift by pi/2.0 radians the left channel goes to zero. I tried everything I could think of – doing some math on paper to check - but haven’t been able to solve this. Is a shift by pi/2.0 radians possible? Any tips and tricks are welcome!
All the best,
Wow, that's interesting - I tried this, too and I see the same effect.
Did you take a look at the analyzer? The oscillator and output meter show only some low noise left the closer you come to +/-90° BUT the analyzer still shows the same as without the phase shift!?
I'm lost, too...
Hi Tom! Glad to see you have the same results, I thought I lost it for a minute haha... Could it have something to do with the ‘mirrored’ results of an FFT? That the two sides cancel each other out at 90° or something, due to the 90° difference between the sin and cos basis functions in the IFFT?
Hi Tom and Will,
I created the same setup with the HiFi-LoFi AudioFFT class. It gives similar results. At 180° it works fine. At 90° the channel does not disappear, but you hear the phase-shifted signal with a strange buzzing sound in it… interesting stuff
EDIT: the buzzing sound was the rectangular window on a short FFT length. With 50% overlap-add, a Hann window and a minimum FFT length of 512 samples, everything works fine with the HiFi-LoFi AudioFFT class... The RAFX FFT object also sounds smooth, but makes the channel disappear at 90°.
I now did some tests with the AudioFFT wrapper and the Ooura "real" and "complex" DFT code. Indeed there are some lines buried in the "documentation" of the original Ooura code, where he states very subtly (hint: "less-or-equal" for the real part, but strictly "less" for the imaginary part):
Forward transform:
a[2*k]   = R[k], 0 <= k < n/2
a[2*k+1] = I[k], 0 <  k < n/2
a[1]     = R[n/2]
Inverse transform:
a[2*j]   = R[j], 0 <= j < n/2
a[2*j+1] = I[j], 0 <  j < n/2
a[1]     = R[n/2]
This means that the real part of the Nyquist bin is stored in the imaginary part of the DC bin, and that is also where the inverse transform expects to find it. This seems to be a quite common "encoding" for FFT libraries - but/thus it is not mentioned very prominently in the API docs... thanks for pointing me to AudioFFT, where I stumbled across pre/post-processing code compensating for that!
Now I think I understand what happens with the "naive" phase-rotation code (assuming you implemented it like me).
I simply stepped through all bins and applied a plain 2D rotation matrix, not treating DC and Nyquist separately. But rotating the DC and Nyquist bins (i.e. giving them a non-zero imaginary part) essentially asks the inverse DFT to generate a complex signal.
When using the IFFT from the AudioFFT wrapper, it simply ignores the imaginary parts of the DC and Nyquist bins while composing the interleaved complex array from the separate real and imaginary arrays before passing it to the Ooura code.
JD - thank you so much for your "problem". I only now understand the whole idea behind the real valued DFT!
That makes total sense (although I had to read your message several times haha)! I will run some tests on the RAFX FFT object to see if I can get it to work correctly now… But I guess the HiFi-LoFi one is a bit more efficient to use anyway, because you only have to go through half the bins…
Thanks for your detailed answer Tom, and I’m glad you got something out of it too!