Discrete Fourier Transform

The last part of the previous post mentions the method to find frequencies in a signal: the Discrete Fourier Transform (DFT). This post dives deeper into the DFT.

The main idea of the DFT is to find out which frequency component correlates with the given input signal.

This mathematical formula looks scary. \[ X[k] = \sum_{n=0}^{N-1} x[n] \, e^{-j2\pi k n / N} \]

k = current frequency to check correlation
n = current sample
N = number of samples
x[n] = the value of the current sample of the given signal

My goal in this post is to demonstrate what the DFT performs is simple.

Input signal and correlation signal

Let's say the input signal is a sine wave of 2 Hz. There are 30 samples to represent this signal.

The input signal of 2 Hz should show the highest correlation at 2 Hz. Let's also have 5 different correlation signals varying from 0 Hz to 5 Hz.

The first correlation signal has its frequency = 0. The figure below illustrates the input signal of 2 Hz with blue open circles, the imaginary part of the correlation signal with red circles and the real part with green circles:

The green line (Real part) has all its samples at value 1 and the red line (Im part) has all its samples at 0.

The correlation signal consists of a real part and an imaginary part, derived from the formula. \[ X[k] = \sum_{n=0}^{N-1} x[n] \, e^{-j2\pi k n / N} = \sum_{n=0}^{N-1} x[n] \, \{cos(2\pi k n / N) -j sin (2\pi k n / N)\} \] This derivation uses the Euler's formula.

What happens in this formula is that "multiply the current sample of the input signal x[n] with real and imaginary values at the same time step, and sum up values through time steps" in order to analyse the correlation level of the input signal at frequency k (currently k=0 in the figure).

As we can see from the figure, the correlation signal of 0 Hz has a very little overlap with the input signal of 2 Hz. A more interesting thing happens when the frequency k of the correlation signal is set to 2.

In this figure, the input signal of 2 Hz (open blue circles) looks very similar to the correlation signal of real and imaginary parts.

Point-by-point correlation product

Point-by-point products are the best example to show how correlation of the input signal (2 Hz) looks at each frequency. Point-by-point products are the multiplication of the input signal per time step with the corresponding values of the correlation signal. The first figure shows point-by-point products when the correlation signal is at k=0.

The purple line is multiplication of the input signal with the real part and the orange line is multiplication with the imaginary part. It is clear that the orange samples are all value 0. The purple line has equally positive and negative values. Summing up all of the values of the purple line cancels all the values, and this ends up with 0.

The next figure illustrates point-by-point products when the correlation signal is at k=2.

In this figure, the orange line has all of its samples below 0. Summing up all of these negative values becomes a single large negative value.

DFT magnitude

The two figures (k=0, 2) of point-by-point correlation products show that k = 2 has a stronger correlation with the input signal of 2 Hz. However, the summed up value of point-by-point products at k=2 ends up with a negative value.

This is why the DFT expresses the strength of each frequency component with the magnitude.

The magnitude (Euclidian distance) is computed by: \[ \sqrt \{ Real(X[k])^2 + Imaginary(X[k])^2 \} \]

The summed point-by-point product X[k] consists of real and imaginary parts. The most important of this formula is output being a positive value.

The current example uses the sine wave of 2 Hz as input and correlation sweep runs from 0 Hz to 5 Hz. The figure below illustrates output of the DFT over this example signal:

The highest magnitude is clearly at 2 Hz, and this is what the DFT does.

The whole picture

I wrote a Python notebook to play with this whole process of generating the input and correlation signals, point-by-point products and the DFT magnitude: https://github.com/yasumori/blog/blob/main/2025/2025_12_25_dft.ipynb.

Output of this notebook has a slide bar to change the frequency of the correlation signal, "k". This gives an insight into how the DFT analyses correlation of the input signal with each frequency within a range e.g., 0-5 Hz in the current example.

Sampling rate and Nyquist frequency

The previous post discusses the Nyquist frequency that is N samples are required to represent the digitised signal of N / 2 Hz. This also applies to the DFT.

Let's keep the input signal still at 2 Hz, but change the number of samples to represent the signal to 10 (previously 30) and the range of frequencies to check its correlation between 0 and 9 Hz (previously between 0 and 5 Hz).

The Nyquist frequency says that 10 samples can represent a signal only up to 5 Hz. This is why the DFT in the above figure shows its magnitude peak both at 2 Hz and 8 Hz, despite the input signal being 2 Hz.

At least 18 samples must be present in a signal, in order for the DFT to correctly analyse if the given signal has a frequency component at 9 Hz,

Input signal is 1 Hz + 2 Hz

What if the input signal is addition of 1 Hz signal and 2 Hz signal?

The figure below illustrates such a signal composed of the two frequency components and the correlation signal at k=1. These three lines appear to be similar but not fully.

Fastforward, the DFT deals with such a signal. As shown in the figure below, the magnitude of DFT is high at both frequency k=1 and k=2.

Additional topics (not covered in this post)

The Fourier transform in this post is the Discrete Fourier Transform. The Fourier transform changes its name depending on (1) continuous values or discrete values and (2) finite time or infinite time.
The audio and speech signal can contain multiple frequency components through time. For practical applications, the technique called "windowing" is often used to capture frequency components of a short time frame of an input signal.
The practical application also uses the highly optimised version of the DFT called the Fast Fourier Transform.

Summary

The Discrete Fourier Transform is a fundamental tool for analysing the frequency components of a signal. The math looks intimidating, but what happens is simply to check which frequency k of the correlation signal leaves high values (magnitudes) against the input signal.

References

1. This YouTube video is an excellent demonstration of how the DFT works. This video inspired me to create the demo figure using Python instead of Matlab (which not everyone has access to).

2. A good book chapter of the DFT available online.

3. If you have a question: why does the DFT commonly use complex numbers?

Speech & Language Blog