The Fourier Transform, explained in one sentence (revolutionanalytics.com)
364 points by ggonweb on Oct 24, 2014 | 82 comments



Fourier Transform became trivial to me when I noticed that it's just a basis transform, as you would do with ANY other basis. Except this basis happens to be sine and cos waves of different frequencies.

i.e. you view the entire signal in the original domain as a single point in a space of dimensionality equal to the number of measurements, and then you project it onto the new (Fourier) basis. For example, one of the new basis vectors could be a sine signal of some frequency k. You find the component of the original signal along this new basis vector just as you would for any other basis vector: with a dot product. The only slight complication is that to get the phase information you have to also dot with a cosine at that frequency. The confusing e^ix complex-number multiply "hides" the fact that we are actually doing two simultaneous dot products (since e^(ix) = cos(x) + i*sin(x)): one with the cosine and the other with the sine, to get both the magnitude and the phase at that frequency.
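To make the "two simultaneous dot products" point concrete, here's a minimal numpy sketch (the signal, size, and test bin k are all made up):

    import numpy as np

    # One DFT coefficient as two dot products (a sketch; values are arbitrary).
    N = 64
    k = 5                                        # frequency bin to probe
    n = np.arange(N)
    signal = 2.0 * np.cos(2 * np.pi * k * n / N + 0.7)

    # Dot the signal with a cosine and a sine at frequency k.
    cos_part = signal @ np.cos(2 * np.pi * k * n / N)
    sin_part = signal @ np.sin(2 * np.pi * k * n / N)

    # Packing both into one complex number is exactly the DFT coefficient:
    # sum(x[n] * exp(-2j*pi*k*n/N)) = cos_part - 1j * sin_part
    assert np.isclose(cos_part - 1j * sin_part, np.fft.fft(signal)[k])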

This is the intuition for the DFT, but for the FT you just have an infinite-dimensional vector space, done. This is much more intuitive to me than "spinning a signal in a circle", which I personally find to be a very confusing statement.

EDIT: granted, basic linear algebra concepts (vectors, basis transform) needed.


The isolated phrase "spinning a signal in a circle" is confusing, but the accompanying visualization of tracking a point on a spinning speaker cone is easy to picture.

More generally, your explanation may make sense to you, but it makes no sense to a layperson (i.e. me). But the linked short description, along with the speaker cone visualization, does make a reasonable amount of sense.

I've also seen a similar description in the past with accompanying animated GIFs illustrating the point, instead of the prose about a spinning speaker cone. I thought the animated GIFs made it even easier to understand. Sadly, I don't remember where I saw that.



I've seen that before, and it's neat, but I don't think that's what I'm remembering.


Agreed, a layperson could benefit, but karpathy's insight is also very nice; it lets you leapfrog from elementary linear methods to a good intuition of the real thing.


Oh sure, I bet karpathy's explanation is great for the people it makes sense to. It just assumes a lot more specialized knowledge on the part of the reader, knowledge that I personally don't have.


Let me try a (hopefully unspecialized) basis-view explanation. I think this is more an explanation of Fourier series, but hopefully it gets the idea across:

Consider the space around you. Pick some frame of reference with a fixed origin point in space. To specify an object's location, you give the distance along each independent direction (x, y, and z in 3d space) needed to arrive at the object from the fixed origin. These independent directions are referred to as a basis for the space. Often the basis directions are perpendicular (orthogonal) to each other.

For higher-dimensional spaces it's the same idea. It turns out we can even have infinite-dimensional spaces whose basis elements are functions, which can be perpendicular in a certain sense. It also turns out that complex exponentials (waves) are an orthogonal basis for a certain space of functions. This is really cool! It means we can refer to any function in this space using the complex exponential basis functions, just like we referred to objects in 3d space using three independent directions.

Why is this useful? The magnitude in a particular independent direction in this space tells us about the frequency content of the function. So instead of the first coordinate telling you how much x distance there is, it tells you how much of a certain frequency there is.
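The "perpendicular in a certain sense" part is easy to check numerically for sampled waves, e.g. (sizes arbitrary):

    import numpy as np

    # Sampled complex exponentials at different frequencies are orthogonal.
    N = 128
    n = np.arange(N)
    e = lambda k: np.exp(2j * np.pi * k * n / N)  # "direction" for frequency k

    print(abs(np.vdot(e(3), e(7))))  # ~0: different frequencies, perpendicular
    print(abs(np.vdot(e(3), e(3))))  # N: a vector dotted with itself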


That helps, thanks. But while that's an explanation that may make a certain amount of intellectual sense, it still doesn't make any intuitive sense. Perhaps if I'd worked with basis views a lot more, then it would become intuitive, but I haven't. The nice thing about the spinning signal idea is I can visualize the speaker, and it makes some intuitive sense that if I spin it at a given frequency, and the speaker cone's average movement is zero, then the signal has no content at that frequency.


OK, I'm curious about how well this explanation works for you. I came up with it, and I want to know if someone like you thinks it's any good:

The Fourier transform is usually used to flip from the time domain representation to the frequency domain representation.

"Time domain representation" means when you plot that information, the x-axis is in units of time; seconds or fractions of seconds, usually, for something humans can perceive as sound. The line tells us how loud the signal is at a given instant of time.

"Frequency domain representation" means when you plot that information, the x-axis is in units of frequency; hertz or kilohertz, usually, for something humans can perceive as sound. The line tells us how loud the signal is at a given frequency.

Here's the bit about basis vectors: The frequency domain representation is possible because we can make any graph by adding enough sine functions, assuming we're allowed to do things like increase or decrease their magnitude, increase or decrease their frequency, and shift them left or right relative to each other (adjust their phase).

Therefore, we can describe the graph we're interested in using only sine waves, which are pure frequencies, and when we've done that, we'll know which specific frequencies contributed most to the information represented in the original graph.

There is, of course, an inverse Fourier transform, which takes a frequency domain representation to the time domain, so we can do things like record a sound (time domain), take it into the frequency domain, remove certain frequencies (typically all frequencies above and/or below some cutoff) by zeroing parts of the graph, and take the result back into the time domain, so it can be played back as a recording.
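The whole round trip is only a few lines of numpy, if that helps. A sketch (the sample rate, tones, and cutoff are invented for illustration):

    import numpy as np

    rate = 1000                                    # samples per second (assumed)
    t = np.arange(rate) / rate                     # one second of "recording"
    sound = np.sin(2*np.pi*50*t) + 0.5*np.sin(2*np.pi*300*t)

    spectrum = np.fft.rfft(sound)                  # time -> frequency
    freqs = np.fft.rfftfreq(len(sound), d=1/rate)
    spectrum[freqs > 100] = 0                      # zero everything above 100 Hz
    filtered = np.fft.irfft(spectrum)              # frequency -> time

    # 'filtered' is the recording with the 300 Hz tone removed.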


Well, this helped me! I'm trying to wrap my head around the FFT (or DFT) for my course in speech recognition, and while I still wouldn't be able to derive the FFT or follow the maths, I now have an idea what it does ;) Thanks!


Sure, I agree. It's a mathematical sort of intuition, one about abstract objects (in particular Hilbert spaces). This abstraction provides something fundamental and poetic. Of course, I think this is better motivated by something concrete or physical before stepping in this direction.


One small note is that independent vectors are ALWAYS orthogonal in their space. That's by definition of linear independence. If they were not mutually orthogonal, then at minimum you could remove 1 vector from the set and still have the same space representation.


[1, 0] and [1, 1] are linearly independent but not mutually orthogonal. Unless I am understanding what you mean by "in their space".


I don't see how it is so specialized. It is an inner product, which can be seen as analogous to a dot product that pretty much everybody sees at some point in high school.


I think it's this one, which was also posted on HN:

http://nautil.us/blog/the-math-trick-behind-mp3s-jpegs-and-h...


> Fourier Transform became trivial to me when I noticed that it's just a basis transform

I always thought this was the definition of Fourier transform, and all the stuff with integrals was just to express the change of basis in an explicit form. What else would the word "transform" mean?

How is it even possible for someone to learn about the Fourier transform without it being very explicitly said that it is a change of basis to a different set of orthogonal functions? Did they just introduce the integral expression for it and hope people would memorize it? I don't understand.


You could learn about it from looking at and understanding a harmonic analyser: http://www.sciencemuseum.org.uk/objects/mathematics/1946-343... Or more likely the reverse mechanism, Kelvin's tide predictor being a good example: http://www.sciencemuseum.org.uk/objects/oceanography/1876-11...

Honestly that is more important to my mental model of the 1D Fourier transform than the algebraic description. It may be slightly inaccurate (are we talking projection or decomposition when calculating the settings on that tide predictor? for example).

The 2D Fourier transform is more interesting to me as it has applications in graphics. For example, you can synthesise ocean-wave-like patterns in a height field by generating a noise image, shaping the noise into a slightly modified disc around the origin, and doing an IFFT. Intuitively, you get random wavelength components within a given range, with random orientations. I couldn't write you a formula for that, but you can imagine that a 12 year old could understand it. Old paper is here: http://www.computer.org/csdl/mags/cg/1987/03/mcg1987030016.p... Another intuitively comprehensible 2D trick is to use an FFT to implement a simulation of fluid dynamics. That's a more complex one but again, it can be understood without algebra. http://www.dgp.toronto.edu/people/stam/reality/Research/pdf/...
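If anyone wants to play with the wave-synthesis trick, here's a rough numpy sketch of my reading of it (the annulus radii are made-up numbers, not the paper's):

    import numpy as np

    # Noise image -> keep a ring of frequencies around the origin -> IFFT.
    N = 256
    noise = np.fft.fft2(np.random.randn(N, N))

    ky, kx = np.meshgrid(np.fft.fftfreq(N), np.fft.fftfreq(N), indexing="ij")
    r = np.hypot(kx, ky)
    ring = (r > 0.02) & (r < 0.06)                # random wavelengths in a band
    height = np.real(np.fft.ifft2(noise * ring))  # wave-like height field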


I think he meant the 'decomposition' definition. You're decomposing your signal into other signals; that's the fundamental aspect. This decomposition is linear, so you can also arrange for your new components to have certain nice properties -- they can be orthogonal, unitary, etc. Then you naturally get waves (complex exponentials) as the components.

And then you realize -- hey, those are exponentials -- so the fact that integrating/differentiating each component is trivial allows you to solve linear ODEs really quickly.

I don't think algebra gets in the way at all -- it just makes this decomposition crystal clear. It also makes generalizing easy -- there are other nice bases, like the cosine basis, and so on. You can choose any N functions to decompose your N-point function; as long as they are linearly independent, you're guaranteed perfect reconstruction.
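That last claim is easy to demo with a random, non-orthogonal basis (sizes arbitrary):

    import numpy as np

    N = 16
    B = np.random.randn(N, N)          # N random vectors: independent w.h.p.,
                                       # and definitely not orthogonal
    x = np.random.randn(N)             # an N-point "function"
    coeffs = np.linalg.solve(B, x)     # decompose x in that basis
    assert np.allclose(B @ coeffs, x)  # perfect reconstruction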


I don't know who you mean by "he" as I was talking about the machine built by Kelvin. As far as the difference goes, my uncertainty comes from knowing a tiny bit about wavelets. A decomposition means that you can reconstruct the signal by adding all the (independent) components, whereas a projection gives you more of a spectrum picture involving non-orthogonal bases? I guess this doesn't apply to the Fourier basis.

Algebra is definitely necessary and useful here, but the concepts can be learnt without knowing algebra, which means they are open to a lot more people.


> Did they just introduce the integral expression for it and hope people would memorize it?

Unfortunately, yes. The linked article even avoids this language at all costs. And that's just how people work in physics and engineering. They want to talk about spinning functions and all this nonsense instead of just doing some basic linear algebra.


That's because it's important that people understand the meaning of the Fourier Transform and what it actually calculates. It's not just some random basis you throw in an equation without understanding what you're doing.


You can't "throw a basis in an equation", that's not a known mathematical procedure.

Once you learn enough about mathematics, and here linear algebra, you will start to reduce more complicated concepts to more simple concepts to build the complex ideas up from simple ones. In this case, the notion of a change of basis is definitely much simpler than the notion of an integral transform, so it makes sense to express it that way.

"What it actually calculates" can be a tricky topic in mathematics. It calculates what we want it to calculate, that's how people came up with the definition. So the way to understand "what it calculates" is to understand "what do I want it to calculate". Somewhat like programming, really. If we want to calculate a function's representation in a convenient basis, then that's what we'll make Fourier transform do.


Thanks for the response! I've always wanted to have math explained to me by a 21-24 year old that spent too much time with Rudin and Axler in undergrad.


Were you planning on bringing anything of substance to this discussion?


Can't we just appreciate that both points of view are helpful, instead of thinking that one is better and that therefore the other sucks?

Both of them explain different things. The "it's just a basis transformation" view makes invertibility, additivity, etc. obvious. The spinning-function view makes it clear why we want to do it and what it means to, say, approximate using only the low frequencies. (The basis view tells you nothing about why looking at just the low coefficients is sometimes helpful. A basis doesn't care about ordering.)

Except for orientability, I guess.


I agree - the Discrete Fourier Transform can be easily explained in terms of basic linear algebra, and that explanation is usually included in an introductory DSP book. Both Oppenheim and Proakis have a section devoted to this view - understanding the DFT as matrix multiplication starts you on the road to the FFT algorithm.
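For anyone who hasn't seen that presentation, the matrix view fits in a few lines of numpy (a sketch, and deliberately the slow O(N^2) version):

    import numpy as np

    N = 8
    n = np.arange(N)
    W = np.exp(-2j * np.pi * np.outer(n, n) / N)  # the N x N DFT matrix

    x = np.random.randn(N)
    assert np.allclose(W @ x, np.fft.fft(x))      # DFT == matrix multiply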

But most ECE undergrads will first see the Fourier transform in a continuous setting. Then it isn't so much an issue of "basic linear algebra"- students have almost certainly never talked about the space of square integrable functions, the inner product on this space, a basis for continuous functions, what it means for an infinite collection of objects to be orthogonal, etc... That is, if they have even taken any linear algebra yet!

Students learn to associate "Fourier Transform" with "complicated math" and then are discouraged when they learn about the DFT. At least, that has been my experience when TAing signal processing classes.


I have no idea about engineering, but I don't see how a physicist would not know this. Basis transformations in infinite-dimensional Hilbert spaces are an integral part of quantum mechanics. The Fourier transform in particular is used to transform between position |r> and momentum |p> representations, but it's also common to use energy representations |E_n> where the orthonormal basis is typically somewhat more exotic (e.g. given by a Gaussian multiplied by Hermite polynomials, as is the case for the QHO).


That's sad. It reminds me of "maths phobia", which is apparently a thing people can have. Also "maths anxiety".


I think part of it is also that physicists and engineers are the ones primarily teaching the subject. So it might not be that they can't handle the mathematics, but that they simply prefer to think entirely in intuition land where they can spin functions around without anyone protesting.


If anyone wants to visualize how a variety of basis transforms (including Fourier/DFT, eigenvectors/PCA, Haar wavelets, a random basis, and the identity basis) can be applied to images, I created this a few years ago, and it seems to have become the #2 Google result for "basis transform":

http://www.cc.gatech.edu/~phlosoft/transforms/

It's a bit dated by being a Java applet, but source code is all there, written in Processing. (I should probably update it now that processing.js is a thing. Years pass quickly.)


I think you mean e^(ix) = cos(x) + sin(x) i

What I find useful about the Laplace and Fourier transforms is that, applied to differential equations, they produce algebraic equations, so they are very useful tools when you are working on problems that are modelled via differential equations.

My introduction for someone totally new to this field: consider y = cos(x) and y = sin(x) as functions whose second derivative satisfies y'' = -y, and the exponential y = e^(ax) as the solution of y' = ay. Then you can think of solving this kind of equation by trying a solution of the form y = e^(kx), and you realize that the constant k is the solution of a first- or second-order algebraic equation that you know how to solve.

If your equation has imaginary solutions, you need to know what the exponential of an imaginary number is: exp(ik) = cos(k) + i sin(k). Then you say, OK, but that is not a real function; the answer is that the real part and the imaginary part of it are each also solutions.

Going this way you find that a complex number z = a+bi corresponds to a differential equation that often constitutes a model of a physical system: z = a+bi is a solution of the second-order equation (z-a)^2 + b^2 = 0, and this algebraic equation goes back to a differential equation by replacing z^2 with y'' (second derivative), z with y' (first derivative), and the constant term with that multiple of y itself. Finally you have the solutions y = e^(ax)cos(bx) and y = e^(ax)sin(bx), in which the amplitude of the signal is controlled by the exponential, so that if a is negative you have a signal that decays to zero, and otherwise a signal that explodes.
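Here's a small numpy sketch of that recipe, with made-up coefficients (np.roots does the algebraic step):

    import numpy as np

    # For y'' + p*y' + q*y = 0, try y = e^(kx); k solves k^2 + p*k + q = 0.
    p, q = 2.0, 5.0                     # arbitrary example coefficients
    k1, _ = np.roots([1, p, q])         # complex pair a +/- bi (here -1 +/- 2i)
    a, b = k1.real, abs(k1.imag)

    x = np.linspace(0, 5, 50)
    y = np.exp(a * x) * np.cos(b * x)   # one real solution: e^(ax)cos(bx)

    # Check it satisfies the ODE, using the exact derivatives of y:
    yp = np.exp(a*x) * (a*np.cos(b*x) - b*np.sin(b*x))
    ypp = np.exp(a*x) * ((a*a - b*b)*np.cos(b*x) - 2*a*b*np.sin(b*x))
    assert np.allclose(ypp + p*yp + q*y, 0)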


This is the kind of explanation I was looking to post on here, but couldn't quite formulate it right in my head. Yours is very clear.

Interestingly, it was actually learning quantum mechanics that made me understand Fourier transforms much more easily. You're always projecting the wavefunction onto some basis, so the cosines and sines are just one particular basis.


Sure, but the hard thing for me to accept intuitively is the existence of an alternative representation of a function, particularly for functions which require intricate, data-heavy descriptions to express. It's kind of crazy that you can combine sinusoids such that they look like an arbitrary function; it's only slightly more crazy that there is a way to calculate what those sinusoids are. My intuition is also violated by the fact that you can go back and forth between these representations losslessly. It would seem to me that some functions, even simple ones like a line, would require a great explosion of information to represent as sinusoids. A function requiring two numbers to describe now requires infinitely many numbers to describe, in the linear case!


My preferred way of looking at the Fourier Transform is that it is evaluating a polynomial at the roots of unity. This is why it is useful for large-integer arithmetic.
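Concretely, as a numpy sketch (the coefficients are arbitrary):

    import numpy as np

    coeffs = np.array([3.0, 1.0, 4.0, 1.0])     # p(z) = 3 + z + 4z^2 + z^3
    N = len(coeffs)
    w = np.exp(-2j * np.pi * np.arange(N) / N)  # N-th roots of unity
    evals = np.array([np.polyval(coeffs[::-1], wk) for wk in w])

    # Evaluating p at the roots of unity gives exactly the DFT of its
    # coefficient vector:
    assert np.allclose(evals, np.fft.fft(coeffs))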


This approach is much clearer when you use inner product notation.

Another benefit is you can see exactly how it is similar to other bases, like (orthonormal) wavelets, and dissimilar to so-called "over-complete" bases (complex wavelets, various frames). It also becomes easier to see what the projections onto basis vectors are doing, how/why energy is "concentrated" under a wavelet transform of natural signals, etc.


While the "spin it around" explanation is useful at a certain level, this "basis understanding" is useful in a different domain - quantum physics. Position and momentum are two such dual bases, as are energy and time. In other words, time and energy (for example) are perspectives. If you try to narrow a signal from one perspective, it will look spread out from the other perspective.


Agreed. I've taken classes in linear algebra, analysis, and harmonic analysis... and I can't grok what "spinning a signal around a circle" is or why we would do that, but the change-of-basis realization was also my lightbulb moment.


I completely agree with you. Especially after my work with numerical techniques for partial differential equations, I honestly cannot see why Fourier transforms keep getting people confused. Hell, even Laplace transforms trip people up...


I agree, the "magic" part is the proof that the sine and cos signal vectors in the Fourier basis are orthogonal. But it's actually a very simple proof.



I like to think of it like a vertical line of individual LEDs that sequentially shine to show current amplitude. (Like a Cylon visor or KITT the car.)

Sweep it sideways across a dark area, and you will see the waveform as an afterimage.

But start to spin it around the center, and you get a blob of light, one that becomes tighter and more symmetric whenever the spin-rate approaches the rate of a repeating subpattern.


THIS makes the most sense to me of any of the explanations on this page.


i.e. color coding is necessary because math notation is typically expressed with ambiguous assumptions in service of absolute terseness.

As far as I can tell, the only place anyone attempts to combat this is in proofs.

Note that in a real paper or textbook, such a detailed description explaining each term would most definitely be absent, even without the color coding.


> Note that in a real paper or textbook, such a detailed description explaining each term would most definitely be absent, even without the color coding.

I disagree. In a textbook, it is very unlikely (and would be a sign of a badly written book) that there are terms in a displayed equation that are genuinely undefined. The worst case scenario is that they are not defined immediately next to the equation and presume that you have read the preceding material.


> The worst case scenario is that they are not defined immediately next to the equation and presume that you have read the preceding material.

The textbooks I have used that were written by Russian mathematicians are almost entirely equation after equation of "explanation", derived from previous equations with the exact transformations in between left as an exercise for the reader.

There's a very distinct cultural difference for sure, but that doesn't make these textbooks less real.

Edit, to clarify: A lot of terms are not explained in natural language either as their very definition is an equation of other terms (feel free to apply recursively).


Yes, that's fair - I guess I was thinking more of homework materials distributed as part of educational content, but all good teaching should try to address this.


One thing I can't fathom is why the FT is always explained in terms of circles. For me that was always confusing; the concept was much more graspable when visualized in terms of a superposition of sinusoidal waves.


Because those sinusoidal waves are merely projections (onto the real and imaginary axes) of the true elemental component of the FT, the complex exponential, which is really a helix: http://m.eet.com/media/1068017/lyons_pt2_3.gif

A while ago I made some drafts for a series of diagrams to help visualize this (http://imgur.com/vEcnVdn) but unfortunately never got around to finishing it (https://en.wikipedia.org/wiki/Wikipedia:Graphics_Lab/Illustr...).

Anyway, for those who find it easier to think of sinusoidal curves, the animation in the Wikipedia article (https://commons.wikimedia.org/wiki/File:Fourier_transform_ti...) is a very good visualization (also, BetterExplained's rant on sines being explained as circles may resonate: http://betterexplained.com/articles/intuitive-understanding-...)


To be fair, the expression does contain Euler's formula, which is basically drawing circles in the complex plane, so people are going to draw circles.

I started thinking about sinusoids this way after seeing stuff like this: https://3.bp.blogspot.com/_6t_ZmJSkbL4/TJLIDoBQMzI/AAAAAAAAC... — if you look down at the complex plane, you will see a circle.


Sine and cosine are functions which tell you what your coordinates are when you travel around a circle (of radius 1), so anything containing them (here hidden in Euler's formula) will provoke thinking in terms of circles. I find it very useful, btw. Once you realize sin(x) represents how high you are on the circle and cos(x) how far to the right (if you start walking on it from [1, 0] counter-clockwise), a lot of stuff starts making intuitive sense.
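E.g., a trivial check:

    import numpy as np

    # Walking the unit circle counter-clockwise from [1, 0]:
    for x in np.linspace(0, np.pi / 2, 4):
        print(np.cos(x), np.sin(x))   # (how far right, how high up)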


Think of the FFT like music notation - a separation into notes, dynamics, and tempo.

Alexander Graham Bell came up with the idea of increasing the capacity of telegraph lines by combining several 'dit-da-dit' streams at different pitches. He called it a 'harmonic telegraph'. The idea was that a mechanical arrangement of reeds tuned to different pitches would pluck out each stream of data. This insight eventually led to the invention of the telephone.

Finally, if 'orthogonal basis set' is difficult to understand, think of giving someone 'left/right' driving directions. An FFT would then (roughly) be the mathematical equivalent of 'driving directions' for a clip of music or speech.



This is an extremely confusing sentence. My sentence would be:

> Every "nice" function can be uniquely decomposed into complex exponentials, which in some contexts represent physical frequencies.


Your sentence describes what the Fourier transform is, whereas the sentence in the article tries to describe how the Fourier transform does it.

What the article does is translate the mathematical notation to a sentence in English, attempting to retain all the information in the mathematical notation. It can be helpful if you are trying to visualize what each component of the mathematical notation is doing, but if you separate the sentence from the equation, it is not very helpful. For example, how would you interpret "spin your signal around a circle"?


It's confusing because it explains "how." Your sentence explains "what," which misses the original point.


If you understand the "what" deeply, then the "how" is trivial. The decomposition is in the sense of an orthogonal basis, and so to compute the transform you just do a dot product in the relevant vector space.


The key of course is that it matches the colors of the words with the notation. It is a helpful tool for people who cannot read the notation.


Neat idea but this is a discrete Fourier transform (the original is continuous, with an integral), and it's only "explained" in the context of signal processing. And even then, the explanation is imperative.

I agree with another commenter that the more useful explanation is in terms of a change of basis.


And it also relies on the rather fuzzily defined concept of "a bunch of points." Which bunch? Some particular bunch, or will any old bunch do?


Yeah, I don't like the idea of 'spin your signal around a circle', and prefer 'compare your signal to a perfect wave' (multiplication).


The Fourier transform is the map between sound waves and music as written on a staff.

That's an informal statement with some hand-waving since, for example, music notation is discrete and sound waves aren't, but that's the main idea.


Saw this after posting my comment. I agree this is the simplest way to think of an FFT.


I remember fondly a college lab experiment where we had to design a circuit that modulated a signal on a carrier (basic FM modulator).

The prep for the lab included a paper running through all the math predicting the outcome, which included Fourier transforms.

The moment we hooked the circuit's output to a spectrum analyzer was the first time I saw years of theoretical math resulting in something "tangible": the carrier signal's peak and the +/- modulated peaks around it.


I think all the efforts to make FT more intuitive are necessary because representation of signals in terms of complex numbers is a tricky abstraction. It's definitely worth taking the time to get comfortable with the idea of representing a signal in the complex plane without trying to understand the FT at the same time. Then all the spinning-around-a-pole verbiage makes a lot more sense.


I'm glad to hear that, as I didn't understand the pole analogy one bit.


Understanding the fourier transform (from archive) https://archive.today/ulPFk

(original link(dead) http://www.altdev.co/2011/05/17/understanding-the-fourier-tr...)


>then there is frequency corresponding the pole's rotational frequency is represented in the sound.

Parse error.


I can't actually work out what this was meant to say. Anyone?




Felt inspired after reading this this morning and spent my Saturday hacking up a little demo in d3:

http://bbrennan.info/fourier-work.html


Would it work to think of the Fourier transform as breaking a signal down into basic elements (like atoms) and figuring out how much of each atom is in the signal? Each different value of "k" is then like using a pH strip to measure approximate energy, because that particular strip reacts most strongly to a certain k.


What about phase?


The complex number contains both cos and sin components. Sine is the y axis, cos is the x. From that you know the angle, i.e. phase, right?
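For example (a sketch; the test signal and its phase are invented):

    import numpy as np

    N, k = 64, 4
    n = np.arange(N)
    x = np.cos(2 * np.pi * k * n / N + 0.7)   # known phase: 0.7 rad
    c = np.fft.fft(x)[k]                      # complex bin: cos part + sin part

    print(2 * abs(c) / N)    # amplitude, ~1.0
    print(np.angle(c))       # phase, ~0.7 (the angle from the x and y parts)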


Isn't this transform used to transmit signals across DSL connections? Isn't it also used for wireless internet?


As I switch off the moment I see "signal" or "wave", let alone spinning one around a circle, here is my take without any engineering intuitions:

-you can represent any (nice enough) function as a sum of sines and cosines of given frequencies and amplitudes (frequency is how often the function completes a whole period, amplitude is how much you multiply its value by). The functions look this way: Amplitude*sine(x*freq)

-So you have your sines and cosines at various frequencies, and you wonder what the amplitude should be for every one of them so that if we add them all up we end up with the original function

-The idea for answering this question is to see how the values of your sine or cosine correlate with the values of the original function. If it turns out that the original function is high in value where your sine/cosine is high in value, then the sine/cosine at this frequency needs a higher weight than a sine/cosine at another frequency which doesn't correlate well with the original function. Intuitively, if you weight the functions which correlate well with the original one more heavily, and those which don't less heavily, then you will have a good approximation of the original function.

-Euler's formula tells you what sine and cosine at a given point are, representing them as one complex number. To get the value of sine/cosine at a frequency different than 1 you need to multiply the argument by k*2PI (frequency is given in beats per period and the period is 2PI, so it's quite obvious why). So e^(xi) packs the values of cos(x) and sin(x), and e^(x*i*k*2PI) packs the values of those functions at frequency k.

So now let's see what this formula is:

-it's an average of things (the 1/N out front, and a sum of N elements)

-those things are multiplications of the value of the sine/cosine at a given frequency at a given point with the value of the original function at that point

-this average is going to be high if the places where the sine/cosine at a given frequency is high overlap with the places where the original function is high (you multiply a big number by a big number), and low if the described correlation is low; that's the concept of correlation

-the points are evenly spaced from 0 to N-1 (that's why n/N appears in the formula), and the sum goes over them.

That's it. No circles and spinning. The only trick is to realize that e^(ix) represents the values of sine and cosine at x, and e^(i*x*k*2PI) represents the values of sine/cosine at frequency k (if k is 1, everything simplifies to the standard sine/cosine).

A one-sentence explanation could be: the more the sine/cosine at a given frequency correlates with the original function, the more that sine/cosine contributes to the original function.
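And in case code helps, here is that reading of the formula in a few lines of numpy (the test function and frequencies are arbitrary):

    import numpy as np

    N = 100
    n = np.arange(N)
    f = 3 * np.sin(2 * np.pi * 5 * n / N) + np.random.randn(N)  # noisy function

    def avg_correlation(k):
        # average over all points of (wave at frequency k) * (original function)
        return np.mean(f * np.exp(-2j * np.pi * k * n / N))

    print(abs(avg_correlation(5)))   # large (~1.5): f correlates at frequency 5
    print(abs(avg_correlation(20)))  # small: just the noise level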


You know, it should be possible to generate some of these sentences from the mathematical notation directly, to make concepts more understandable to the lay person (and people in general).


It just correlates a signal against harmonically related complex sinusoids.


Yeah, and don't forget to sum it all up and divide it by the number of samples.


Maybe an animation would be a better vehicle for understanding.



All of these explanations remind me of this: http://www.joe-ks.com/archives_may2011/HowToDrawAnOwl.jpg




