6
$\begingroup$

I know the DFT and I agree with the formula and everything, but I don't get the intuition behind the link between frequency resolution and the number of samples.

Like, why would I get a higher frequency resolution by taking more samples in my DFT?

The formula is: $$X_k = \sum_{n=0}^{N-1} x_n\, e^{-2\pi i k n / N}$$

Here, we are not summing over a finite number of discrete frequencies $k$; we are just summing over $N$ samples.

Here is a great video from 3Blue1Brown, which helped me get a visual intuition on Fourier Transform: https://www.youtube.com/watch?v=spUNpyF58BY&t=915s

In this video, we can see how we evaluate our signal against the exp function for different rotational speeds. So I don't get why we couldn't evaluate our signal at as many frequency points as we want with the DFT.

It's weird, because I understand why we go from discrete time to discrete frequencies, but I don't have the intuition or a visual representation of it in my head.

Can somebody please help me?

Many thanks in advance

Antoine

$\endgroup$
  • $\begingroup$ Imagine you have 10 samples at 1Hz. Can you reliably tell the difference between a 0.1Hz wave (one cycle in your samples) and a 0.101Hz wave? Now imagine you have 1000 samples of that wave: you can just count the cycles and see whether you have 100 or 101 cycles $\endgroup$
    – user253751
    Jan 29 at 9:44
  • $\begingroup$ Hi user253751, Thanks for your reply! It's the kind of concise and logical explanation I was looking for. Thanks a lot. $\endgroup$ Jan 30 at 21:36

7 Answers

5
$\begingroup$

It has to do with the fact that two sinusoids with different, but exact integer, numbers of periods (within some window or vector length) are orthogonal; that is, as sampled data, they sum to zero under the vector dot product.

So take two sinusoids that are very slightly different in frequency (so slight that you couldn't tell at a quick glance that they were any different). Extend both in length, and at some length eventually that slight difference in frequency will cause one sinusoid to go through one more period than the other. Then the difference will be quite noticeable, because, due to the difference in the number of full periods between them, at some point they will be 180 degrees out of phase, or exact opposites in amplitude. Obviously not the same.

This shows that a longer stretch of data (more samples at the same sample rate) provides a greater ability to tell two frequencies apart. The slighter the difference, the more data is required to get a full period of difference, and thus dot-product cancellation.
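This cancellation is easy to check numerically. A minimal sketch (my own illustration, with example frequencies in cycles per sample): two tones 0.001 cycles/sample apart look nearly identical over 50 samples, but their dot product cancels once the length is enough for a full period of difference (here 1000 samples).

```python
import numpy as np

def dot_mag(f1, f2, n):
    """Normalized magnitude of the dot product of two length-n complex sinusoids."""
    t = np.arange(n)
    a = np.exp(2j * np.pi * f1 * t)
    b = np.exp(2j * np.pi * f2 * t)
    return abs(np.sum(a * np.conj(b))) / n

# Over 50 samples the 0.001-cycle/sample difference barely accumulates,
# so the two tones are nearly parallel (dot product close to 1).
print(dot_mag(0.100, 0.101, 50))    # ~0.996
# Over 1000 samples the difference accumulates a full extra period,
# and the dot product cancels to essentially zero.
print(dot_mag(0.100, 0.101, 1000))  # ~0
```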

Go in the opposite direction, and pretty soon two moderately different frequency sinusoids will overlap over some short enough segment with less difference than your pencil mark or chalk line widths, thus looking the same (or differing by less than your sampler's finite quantization noise), causing you to mistake the two for just one.

If you window, you will need around two integer periods of difference over your number of samples, due to artifacts the windowing introduces to basis vector orthogonality.

$\endgroup$
  • $\begingroup$ Hi hotpaw2, thanks for the brilliant explanation. I think it's the one that best answers my need for an intuition. $\endgroup$ Jan 30 at 22:03
6
$\begingroup$

This is a common cause of confusion. Presumably you're thinking of the problem like

Given a time series $(s_n)_{n\in\{0..N-1\}}$ which we know is of the form $$ s_n = A\cdot e^{i\omega \frac{n}N},$$ can we find out what values $A\in\mathbb{C}$ and $\omega\in\mathbb{R}$ have?

Now, that is a problem one can tackle with Fourier analysis, but it's not really the most suitable tool. Better would be to perform a nonlinear least-squares fit with e.g. Levenberg-Marquardt, which would indeed be able to obtain good approximations of $A$ and $\omega$ even from a short time series.

If Fourier analysis is used, one could be forgiven for expecting that it works by just applying the DFT to the time series, and that the result would be a decomposition $(S_k)_{k\in\{0..N-1\}}$ where all the $S_k$ are zero except for the one at $k = \frac{\omega}{2\cdot\pi}$. But that only works out if $\omega$ actually happened to be an exact multiple of $2\cdot\pi$ in the first place.

In general, you get a lot of spectral leakage instead. This can be mitigated by applying suitable window functions before the DFT, but ultimately you still never get a result of only a single frequency but instead multiple bins showing some amplitude. And the shorter your time series, the wider these bins are (because there are fewer of them).
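A minimal sketch of this leakage (my own illustration, with an assumed 64-sample window and example frequencies): a tone with an exact integer number of cycles lands in a single bin, while one that falls between bins leaks into all of them.

```python
import numpy as np

N = 64
n = np.arange(N)

# A tone with exactly 8 cycles in the window: a single nonzero bin.
on_bin = np.abs(np.fft.fft(np.exp(2j * np.pi * 8 * n / N)))
# A tone with 8.3 cycles: energy leaks into every bin.
off_bin = np.abs(np.fft.fft(np.exp(2j * np.pi * 8.3 * n / N)))

print(np.sum(on_bin > 1e-9))   # 1 bin above threshold
print(np.sum(off_bin > 1e-9))  # all 64 bins above threshold
```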

Why then do we put up with Fourier analysis, if it has these problems? Well, because it's solving a different, much more involved problem: one where you start out with a signal with many different components, like $$ s_n = A_0\cdot e^{i\omega_0 \frac{n}N} + A_1\cdot e^{i\omega_1 \frac{n}N} + A_2\cdot e^{i\omega_2 \frac{n}N} + \text{weird aperiodic signal} + \mathrm{noise}. $$ In that case, fitting approaches like Levenberg-Marquardt get more and more problematic, but the Fourier transform just marches on and still extracts the amplitudes of any sinusoidal components quite reliably.

$\endgroup$
  • $\begingroup$ Hi leftaroundabout, Thanks for taking the time to reply. I was more looking for a high-level abstraction/intuition. Your explanation is a bit difficult to understand for me unfortunately... My signal processing courses from uni start dating. $\endgroup$ Jan 28 at 16:13
6
$\begingroup$

When you set your $k=1$, you get the lowest frequency (that's not 0); all other frequency bins are integer multiples of that frequency, so that's actually your frequency "raster" (could call it resolution, if you want).

Now, with a low $N$, $e^{-2i\pi \cdot n/N}$ rotates quickly, because you divide by a small number, so the lowest frequency is already pretty high. With a high $N$, you get a much slower rotation.

Other visual thing: The DFT only represents oscillations that fit a whole number of times into its observation of length $N$ (in addition to the constant component). So, the lowest frequency that's representable has a period that fits exactly 1 time into $N$. Obviously, for small $N$, that means a short period, and thus a high frequency, and therefore a "worse" resolution.
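Concretely, the bin spacing is $f_s/N$. A quick check (my own sketch, with an assumed sample rate of 1 kHz):

```python
import numpy as np

fs = 1000.0  # an assumed sample rate in Hz

for N in (10, 1000):
    freqs = np.fft.fftfreq(N, d=1/fs)
    # The k = 1 bin sits at fs/N; every other bin is an integer multiple of it.
    print(N, freqs[1])  # fs/N: 100 Hz for N = 10, 1 Hz for N = 1000
```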

$\endgroup$
  • 2
    $\begingroup$ I think these points are spot on, but would add that one can artificially increase $N$ to get more bins, or an increase in resolution depending on how it's defined. This is equivalent to zero-padding the FFT. Though the addition of zeros does not necessarily add any new information, the extra bins can be useful for better resolving the frequency and amplitude of sinusoids occurring between frequency bins using interpolation. This paper combines zero-padding and quadratic interpolation to do just that. $\endgroup$
    – Ash
    Jan 27 at 16:52
  • 1
    $\begingroup$ @Ash totally correct, though I'd like to add that if you know you're looking for only a single sinusoid in otherwise white noise, there's lower-effort, arbitrary-precision (as in: resolution is limited by SNR, not by the method) parametric spectral estimators. The DFT/FFT isn't the right tool, necessarily! $\endgroup$ Jan 27 at 16:56
  • $\begingroup$ Hi Marcus Müller, Thank you for taking the time to reply. What you say here makes sense to me. But, why are we saying that k is in N (1, 2, 3, ...) for the DFT? Couldn't evaluate our digital signal on any value of k we want? $\endgroup$ Jan 28 at 16:27
  • $\begingroup$ we can't; look at your formula! $\endgroup$ Jan 28 at 16:29
  • 1
    $\begingroup$ @AgileProgrammer you could evaluate it with any non-integer $k$, but you wouldn't get any more useful information. For example, if you have a peak at $k=17$ then evaluating with $k=16.5$ would always just give you bleed from that same peak. The only reason you can distinguish different frequencies by DFT is that the sinusoidals are orthogonal, but on a limited interval only e.g. $k=16$ and $k=17$ are orthogonal, not the frequencies in between. $\endgroup$ Jan 28 at 16:37
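The zero-padding idea mentioned in the comments above can be sketched like this (my own illustration; the off-bin tone frequency of 3.4 bins is an arbitrary example): padding a 16-sample window to 128 points samples the same underlying spectrum more densely, which lets you locate an off-bin peak more precisely without adding any real resolution.

```python
import numpy as np

N = 16
n = np.arange(N)
x = np.exp(2j * np.pi * 3.4 * n / N)  # a tone that falls between bins 3 and 4

plain  = np.abs(np.fft.fft(x))         # 16 bins
padded = np.abs(np.fft.fft(x, 8 * N))  # 128 bins: a denser sampling of the
                                       # same spectrum; no new information

print(np.argmax(plain))       # 3: the nearest coarse bin
print(np.argmax(padded) / 8)  # 3.375: a finer estimate of the true 3.4
```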
2
$\begingroup$

Think of the DFT as a (complex) FIR filter of kernel length N that is convolved with the input signal, where you discard N-1 outputs and inspect only the final output: the one generated from the full length-N overlap between the input signal and the filter coefficients.

Does it make intuitive sense that a two-tap FIR filter cannot distinguish between many frequency bands, while a 2048-tap filter can distinguish between more frequencies?

edit:

Using the following MATLAB script for generating a frequency sweep, doing overlapped-window FFTs of length N subportions, and plotting the magnitude as a function of time and frequency:

fs = 10*100;
x = chirp(0:(1/fs):(1-1/fs), 0, 1, fs/2);
figidx = 1;
for N = [8 64]
    x_b = buffer(x, N, N-1, "nodelay");
    x_b = x_b.*hann(N);
    W = fft(eye(N));
    X = W*x_b;
    figure(figidx), 
    subplot(2,2,2)
    imagesc(real(W))
    set(gca, 'ydir', 'normal')
    title('Real(W)')
    colormap gray
    xlabel('time [sample]')
    ylabel('frequency [DFT bin]')

    subplot(2,2,3)
    plot(x)
    title('Input chirp')
    subplot(2,2,4)
    imagesc((abs(X)))
    set(gca, 'ydir', 'normal')
    title('Response')
    axis tight
    xlabel('time [frame]')
    ylabel('frequency [DFT bin]')
    figidx = figidx + 1;
end

For N = 8 and N = 64 I get these 2x2 subplots, where the input signal is in the lower left, the real value of the DFT matrix is in the upper right, and the lower right of each shows the 8/64 "lanes", where the top ~half is a mirror image of the bottom half. Clearly, there is more frequency resolution to be had from the 64-point DFT than from the 8-point DFT, because the 64-point bins are less smeared and there are more of them. Possibly, for this particular case, one could interpolate the 8-point DFT on the assumption that the input is single-frequency, but in general one cannot assume that to be true.

[figure: the 2x2 subplots described above, for N = 8 and N = 64]

The point here is that a 64-sample DFT row or column contains many cycles of a given center frequency near fs/2, so one would expect it to couple well (and uniquely) with input of approximately that frequency. Thinking of matched filters or correlators: a long unique pattern is usually a stronger "key" than a short one.

We get a uniform partition of frequency from 0 (DC) to fs/2. If N is larger, each partition is narrower, and, as we have seen, it is also "sharper", which means we can resolve more frequency detail.

Choice of window shape also matters a bit. Try commenting out the line with the hann window and observe what happens.

$\endgroup$
  • $\begingroup$ can you explain this without reference to topics which might not be known to everyone or give at least a short intuitive understanding for the connection? $\endgroup$ Jan 27 at 11:28
  • $\begingroup$ Hi Knut Inge, Thanks for taking the time to answer. As I replied to OuttaSpaceTime below, I was looking for a more high-level intuition/explanation/abstraction (if that exists). Even though I had signal processing courses at university, they're starting to date, and I have difficulties understanding your reply. $\endgroup$ Jan 28 at 16:11
1
$\begingroup$

I think it can be expressed visually quite well:

Look at this code and the plots:

import numpy as np
import matplotlib.pyplot as plt

k = 1
N = 50
time  = np.arange(0.,N,1)/N # time in seconds
signal = np.exp(-1j*2*np.pi*k*time)

fig, axs = plt.subplots(2)

centroid = np.sum(signal)/N
axs[0].plot(signal.real, signal.imag, 'o')
axs[0].plot(centroid.real, centroid.imag, 'o')
axs[0].set_ylabel('Imaginary')
axs[0].set_xlabel('Real')

axs[1].plot(time, signal.real)
axs[1].plot(time, signal.imag, 'orange')
axs[1].set_ylabel('Amplitude')
axs[1].set_xlabel('Time [s]')

plt.show()

which looks like

[figure: the samples on the unit circle with their centroid (top), and the real and imaginary parts over time (bottom)]

When you set higher values of $k$ you can see the following effect, for example for $k = 20$:

[figure: the same plots for k = 20]

We can see that, for higher $k$, it rotates around the circle more often, so the wave is getting faster. We can also see that we have fewer points per cycle in our plot, which comes from the sampling rate.

Feel free to play around with the parameters a little bit.

So with all this in mind, let's have a look at the DFT. The DFT can be interpreted as the inner product of two functions (it can literally be compared to the dot product of two vectors, since we are comparing all pairs $(x_i, y_i)$ and checking for their correlation).

Now what the DFT basically does is lay complex waves of different frequencies over the graph of the original wave and compute its correlation with each of them. A nice property of the complex inner product is that it is invariant to the phase of the complex wave.

So to visualize this I came up with this code:

import numpy as np
import matplotlib.pyplot as plt

# create the signal
srate  = 4000 # Hz
# points from 0 to 5 s in 1/srate steps
time   = np.arange(0., 5, 1/srate) # time vector in seconds
pnts   = len(time) # number of time points (20000 here)
signal = np.cos(2*np.pi*1000*1/4000*time - np.pi/2) + 2*np.cos(2*np.pi*2000*1/4000*time + np.pi/2)

# prepare the Fourier transform
# fourTime runs from 0 to N-1, normalized by N
fourTime = np.array(range(pnts))/pnts
fCoefs   = np.zeros(len(signal), dtype=complex)

# the frequency bin to test against the signal
k = 3

# create complex sine wave for bin k
csw = np.exp(-1j*2*np.pi*k*fourTime)

# compute dot product between sine wave and signal
fCoefs[k] = np.sum(np.multiply(signal, csw)) / pnts

plt.plot(time, signal)
plt.plot(time, csw.real)
print(np.abs(fCoefs[k]))

[figure: the signal plotted together with the real part of the k = 3 complex sine wave]

You can plug in different values of $k$ and check their correlation with the original wave.

So the interesting point here is that I used time = np.arange(0., 5, 1/srate) for the original signal but fourTime = np.array(range(pnts))/pnts for computing the values of the complex sine wave, and used time for plotting both.

The important point here is that in the original formula we just put in the value $n$, which is the same for both waves, and since we are working with discrete signals there is no time involved anymore; it is just sampling rate and number of points now. You can check out this answer to see how it can be related back to the original sampling rate.

Hope that helps a little bit building some intuition!

$\endgroup$
  • 1
    $\begingroup$ Hi OuttaSpaceTime, Thank you for taking the time to build such a detailed answer. I don't understand everything unfortunately. It will take me a bit of time to digest... I was maybe looking for a more high-level intuition/abstraction, I think. It's just that I can visually understand other theorems, like Shannon's one. But this, I just can't grasp it. $\endgroup$ Jan 28 at 16:09
  • $\begingroup$ What exactly is confusing to you? I was struggling with this at the same time and the plotting helped me a lot, but maybe some underlying concepts are not completely clear. If you have any question let me know. $\endgroup$ Jan 28 at 16:19
  • 1
    $\begingroup$ Thanks OuttaSpaceTime I need to find the time to go through your code. hotpaw2's answer below is pretty intuitive as well. It's the type of explanation I was looking for. But I need to manipulate to fully grasp the thing, so your code will surely help me. $\endgroup$ Jan 30 at 21:45
0
$\begingroup$

This might help: remember that units of frequency are the reciprocal of units of time. As such, effects in the frequency domain track time-domain effects inversely. Make your samples take up more time, and the frequencies get pushed together.

You can add more frequencies to your result by adding more samples. You have two options for this...one is to increase your sampling rate while keeping the same time window. This adds additional frequencies above those represented at the lower rate, but doesn't change how close the frequency domain samples are to each other.

The other option is to retain the same rate and sample for more time. Now you have the same Nyquist rate, but more frequency domain samples...meaning the frequency domain samples are forced more closely together.

Remember too that the content represented by a single frequency-domain sample is in reality a sinc() function with its peak at the frequency represented by that sample. The sinc() response is the transform of the rectangular function in time represented by the sampling window. Longer sampling windows mean the sinc() drops off faster and frequencies that don't fit a neat integer number of cycles in your window are less spread out in the frequency domain.
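A minimal numerical check of that sinc() picture (my own sketch, with example window lengths): the magnitude response of a length-N rectangular window has its first zero at 1/N cycles per sample, so a longer window has a proportionally narrower mainlobe.

```python
import numpy as np

def rect_window_mag(N, f):
    """|DTFT| of a length-N rectangular window at frequency f (cycles/sample)."""
    n = np.arange(N)
    return abs(np.sum(np.exp(-2j * np.pi * f * n)))

# The first zero of the sinc-like response falls at f = 1/N, so a window
# four times as long has a mainlobe one quarter as wide.
print(rect_window_mag(16, 1/16))   # ~0
print(rect_window_mag(64, 1/64))   # ~0
```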

$\endgroup$
  • $\begingroup$ Thanks for taking the time to reply Cristobol! It helped me a lot and is what I was looking for $\endgroup$ Jan 30 at 22:00
0
$\begingroup$

An intuitive way for me to look at this topic is to imagine wanting to measure the length of a single link in a chain. Let us say that you have a ruler that can only measure with 1 mm accuracy. This corresponds to your sampling time.

If you just took a single link and tried to measure its length, you could get a very inaccurate measurement. It might read as 11 mm, but you don't really know if it was 11.3 mm or 10.6 mm, since the sampling does not allow for greater accuracy. Now, in order to increase the accuracy without changing the sampling time/frequency, you can measure across multiple links. This increases N. Measuring across 10 links, for example, means that you can now determine the length of a single link with 0.1 mm accuracy. You have narrowed the bin width by 1/10.

As the chain length increases towards infinity, the DFT becomes a DTFT which produces a continuous frequency signal. Now you have infinite resolution but since infinitely long chains are quite unwieldy in the real world, we need to accept the limits produced by DFTs.

$\endgroup$
  • $\begingroup$ Hi Martin, thanks for taking the time to reply. It's a great explanation, like hotpaw2's one below. It's the type of intuition I was looking for. $\endgroup$ Jan 30 at 22:02
