This was really great. It's how I was taught entropy at college (biophysics and molecular biology) though without the sheep. "Statistical Mechanics" was the name our professor used.
The only tiny change I'd like to make is to add a line or two near the end, something along the following lines:
There's a lot fewer ways to arrange water molecules so that they form an ice cube than there are to arrange them as a liquid. Most arrangements of water molecules look like a liquid, and so that's the likely endpoint even if they start arranged as an ice cube.
The same is true of more or less any macroscopic object: the thing that we recognize and name ("chair", "table", "pen", "apple") requires the atoms to remain in one of a fairly small set of particular arrangements. Compared to the vast number of other arrangements of the same atoms ("dust"), the ones where the atoms form the "thing" are quite unlikely. Hence over time it's more likely that we'll find the atoms in one of the other ("random", or "dust-like") arrangements than the one we have a name for. The reason things "fall apart" isn't that there's some sort of preference for it - it's that there are vastly more ways for atoms to be in a "fallen apart" state than arranged as a "thing".
"Students who believe that spontaneous processes always yield greater disorder could be somewhat surprised when shown a demonstration of supercooled liquid water at many degrees below 00 C. The students have been taught that liquid water is disorderly compared to solid ice. When a seed of ice or a speck of dust is added, crystallization of some of the liquid is immediate. Orderly solid ice has spontaneously formed from the disorderly liquid.
"Of course, thermal energy is evolved in the process of this thermodynamically metastable state changing to one that is stable. Energy is dispersed from the crystals, as they form, to the solution and thus the final temperature of the crystals of ice and liquid water are higher than originally. This, the instructor ordinarily would point out as a system-surroundings energy transfer. However, the dramatic visible result of this spontaneous process is in conflict with what the student has learned about the trend toward disorder as a test of spontaneity.
"Such a picture might not take a thousand words of interpretation from an instructor to be correctly understood by a student, but they would not be needed at all if the misleading relation of disorder with entropy had not been mentioned."
> There's a lot fewer ways to arrange water molecules so that they form an ice cube than there are to arrange them as a liquid. Most arrangements of water molecules look like a liquid, and so that's the likely endpoint even if they start arranged as an ice cube.
Except this is, as an insight, obviously wrong. The arrangement you get is determined by temperature: cold water will spontaneously freeze, and hot ice will spontaneously melt. The model you state predicts that
In keeping with the level of explanation in the article, I was omitting the role of energy in describing the exploration of possible microstates by the system.
A system with zero energy is highly constrained in its exploration of possible microstates, and thus is unlikely to undergo a change in its macrostate.
A system with a lot of energy is much less constrained, and is more likely to undergo macrostate changes.
This wasn't really covered in the article, so I didn't want to put it into my (perhaps foolish) add on sentences.
This followup still predicts that cooling water cannot cause it to freeze. (And relatedly, it predicts that cold ice will take a long time to melt, but not that it won't melt. In fact, it won't melt.)
I don't see why either of those things follows, particularly the second.
Cold ice, in an environment that doesn't supply energy to the ice, will not explore microstates at any notable pace, and thus will not melt.
I don't think it makes any prediction about what will happen when water is cooled, because the freezing reaction is related to the specific chemistry of water molecules. The only prediction is that any newly entered macrostate is less likely to change, because of the relatively low energy of the system (compared to the energy required to break the newly formed bonds of the crystalline form).
1. Things tend to turn to dust rather than stay together, because the number of states where things are "together" is small, and the number of states where things are dust is large.
2. Ice has a few arrangements, and water has many. So water is "dust ice".
3. H2O will naturally tend towards "dust" form over time, so ice will eventually become water.
So one thing I've never understood is how you can "count" microstates, or bits required to describe them, when the relevant physical parameters all seem to be real numbers. For instance, a gas of N atoms is described by 6N real numbers (3d position and velocity) regardless of how hot it is. The article talks about quanta of energy, but that seems like a simplification at best: a given interaction might be quantized, but quanta in general (e.g. photons) exist on a real-number spectrum, so it's possible to have an uncountable infinity of energy packet sizes right? (That's the point of a black body, right?)
What am I missing? Is this a weird measure theory thing, where hot objects have even bigger uncountable infinities of states and we get rid of them all with something like a change of variables? If you told me spacetime was secretly a cellular automaton I could deal with it, but real numbers ruin everything.
I asked my professor the same question when I took thermodynamics as an undergrad. In that class we were told that a particle in a box of size 2 can be in twice as many places as a particle in a box of size 1. But in real analysis we learn there are just as many numbers between 0 and 1 as there are between 0 and 2. The answer I was given, in true physicist form, was to hand-wave it. There's an intuitive notion that twice as big means twice as many places to be, therefore just accept it and let the mathematicians cry over our abuse of the reals.
The true answer is that "quanta of energy" is not a simplification. The idea that physical variables like energy and position come in discrete units is the quant in quantum physics. If you imagine the position of a particle in a box of size 1 to be discretized into n states, then a box of size 2 really would have 2n states. So all of your concerns are moot because quantum mechanics replaces all the uncountable sets with countable ones.
But this still leaves us with the issue that Boltzmann did all this work before quantum mechanics existed so there must be some useful notion of "bigger uncountable infinities". The answer, as far as I know, is that you can always approximate classical physics with finite precision variables so long as the precision is high enough (replace the reals with floats). The idea of counting states works for any arbitrarily precise (but still finite) discretized variables, and as far as physicists care an arbitrarily precise approximation is the same as the real thing.
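To make the discretisation argument concrete, here is a small Python sketch. It assumes a 1D box chopped into equal cells, with `cells_per_unit_length` as a hypothetical stand-in for the chosen precision, and Boltzmann's S = ln(number of states) in units of k_B. The entropy difference between the two boxes comes out as ln 2 however fine the grid is, which is one way to see why the exact precision doesn't matter.

```python
import math


def entropy_change_on_doubling(cells_per_unit_length: int) -> float:
    """Entropy change (in units of k_B) when a 1D box grows from length 1 to 2.

    The box is discretised into equal cells; each cell is one possible
    position state for a single particle.
    """
    states_small = 1 * cells_per_unit_length  # box of size 1
    states_large = 2 * cells_per_unit_length  # box of size 2
    return math.log(states_large) - math.log(states_small)


# The grid resolution cancels out: every line prints (approximately) ln 2.
for n in (10, 10**6, 10**12):
    print(n, entropy_change_on_doubling(n))
```

Only differences in entropy are physically meaningful in this picture, so the arbitrary cell size drops out of anything you would actually measure.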
This is not true; position/time are not quantized in the standard model. String theory is not canonical. I think a better way to think about it is not in terms of size, but in terms of time. A particle in a bigger box, on average, can go on a random walk for longer without hitting a wall. It will take longer for a particle to sufficiently (arbitrarily close) exhaust the phase space of a bigger box.
This is an excellent, and puzzling, question! Let me try to provide some insight into how physicists think about such paradoxes, by addressing the specific example you mention, of the presumed uncountable infinity of different possible photon energies in a finite range of frequencies. In the case of blackbody radiation, when physicists analyze the set of possible photon energies more carefully, they find that there is really only an infinite number of different possible photon energies (for a finite range of frequencies) if the volume of space containing the photons is infinite. In any finite volume of space, if we allow ourselves to place boundary conditions on the electromagnetic fields at the edges of that volume (for example, suppose we think of our volume as a cube with mirrored walls), we find that there are only a finite number of oscillating modes of the electromagnetic field in any finite range of frequency. In the case of a cube, there is a very lowest frequency of radiation whose wavelength will allow it to form a standing wave in the box, and all the other allowed modes are multiples of that lowest frequency. So the density of possible photon states per unit of frequency is actually proportional to the volume of space we allow to hold the photons. (By the way, this is also precisely related to the quantum complementarity of uncertainty between momentum and position. To confine a photon to a volume of space, the uncertainty in its momentum must be the same order as that carried by the lowest frequency standing waves which would be compatible with the container.)
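A rough Python sketch of the mode counting, under simplifying assumptions that are mine rather than the comment's: scalar standing waves (polarization ignored) in a cube of side L with walls where the wave vanishes, so the allowed frequencies are f = (c / 2L) * sqrt(nx^2 + ny^2 + nz^2) with positive integers nx, ny, nz. The function name `count_modes` is just illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def count_modes(box_length_m: float, f_max_hz: float) -> int:
    """Count scalar standing-wave modes with frequency at or below f_max_hz.

    Allowed modes satisfy nx^2 + ny^2 + nz^2 <= (2 * L * f_max / C)^2
    with nx, ny, nz all positive integers.
    """
    r = 2.0 * box_length_m * f_max_hz / C
    r2 = r * r
    n_max = int(math.floor(r))
    count = 0
    for nx in range(1, n_max + 1):
        for ny in range(1, n_max + 1):
            remaining = r2 - nx * nx - ny * ny
            if remaining < 1:
                break  # no positive nz fits, and larger ny only makes it worse
            # number of positive integers nz with nz^2 <= remaining
            count += math.isqrt(int(remaining))
    return count


f_max = 20 * C  # an arbitrary cutoff frequency in Hz
small = count_modes(1.0, f_max)
large = count_modes(2.0, f_max)
print(small, large, large / small)
```

Doubling the side of the cube multiplies the volume by 8, and the number of modes below the same cutoff goes up by roughly the same factor (not exactly 8, because of modes near the walls of the counting region). Below a certain frequency the count is zero, which is the "very lowest frequency" mentioned above.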
I think part of my mistake was not thinking of the thermal energy packets (phonons?) as waves in boxes, with the box being the boundary of whatever thing has the energy. Which is still weird for a gas expanding into a vacuum I guess, but works for a hot solid object or a confined gas.
This is a very good question, which goes to the heart of statistical Physics. We use phase spaces for this (typically a 6N-dimensional vector space in which each microstate is represented by a point). The system has a probability of being in (or rather very close to) each microstate, which depends on several factors, like the conditions (isolated system, constant pressure, temperature, number of particles, etc). Counting microstates is “just” calculating integrals of that probability weight in the phase space. Of course, most of the time it is impossible, so we have tools to approximate these integrals. There are a lot of subtleties, but that’s the general idea.
The phase space does not change depending on temperature, so there’s nothing weird like the space getting bigger. But the probability of each microstate might, as high-energy states become more accessible.
So would it make sense to think of a microstate as a region of phase space, a point and those points "very close to" it? And "increasing number of microstates" just means a larger number of these regions have non-negligible probabilities? In continuous terms you would see this as the distribution flattening out. I might be having trouble visualising what we're integrating, since if it's a probability the integral over the whole phase space can only be 1, right?
Yes, that is the principle. The probability of a single point is zero because an integral over a point is zero, hence “very close to it” (in an infinitesimal volume around the point).
The integral of the probability over the phase space is indeed 1. This is the purpose of the partition function, which is the normalisation factor. The weight function is not normalised a priori.
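Here is a minimal numerical sketch of that normalisation, assuming a toy system that is not in the comments above: a single 1D harmonic oscillator with H = (x^2 + p^2) / 2, in units where k_B = 1. The phase space is cut into small cells, each cell gets a Boltzmann weight exp(-H/T), and the partition function Z is the sum that turns the weights into probabilities.

```python
import math


def boltzmann_grid(temperature: float, n: int = 201, extent: float = 10.0):
    """Return (Z, probabilities) for H = (x^2 + p^2) / 2 on an n-by-n grid.

    The grid covers -extent..extent in both position x and momentum p.
    """
    step = 2.0 * extent / (n - 1)
    cell = step * step  # area of one phase-space cell
    coords = [-extent + i * step for i in range(n)]
    # Unnormalised Boltzmann weight of every cell.
    weights = [math.exp(-(x * x + p * p) / (2.0 * temperature))
               for x in coords for p in coords]
    z = sum(weights) * cell  # partition function: the normalisation factor
    probs = [w * cell / z for w in weights]  # probability of each cell
    return z, probs


def entropy(probs) -> float:
    """Gibbs/Shannon entropy -sum(p ln p) over the cells."""
    return -sum(q * math.log(q) for q in probs if q > 0.0)


for t in (1.0, 4.0):
    z, probs = boltzmann_grid(t)
    print(t, z, sum(probs), entropy(probs))
```

The grid (the phase space) is identical at both temperatures; only the weights change. At higher temperature the distribution flattens out, more cells carry non-negligible probability, and the entropy goes up, while the probabilities still sum to 1.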
Good point. I think what's missing is how electron energy states work in quantum mechanics. The Wikipedia page (linked below) has a pretty good explanation:
"A quantum mechanical system or particle that is bound—that is, confined spatially—can only take on certain discrete values of energy, called energy levels. This contrasts with classical particles, which can have any amount of energy. The term is commonly used for the energy levels of electrons in atoms, ions, or molecules, which are bound by the electric field of the nucleus, but can also refer to energy levels of nuclei or vibrational or rotational energy levels in molecules. The energy spectrum of a system with such discrete energy levels is said to be quantized". (https://en.wikipedia.org/wiki/Energy_level)
To put things another way: while there could theoretically be infinite sizes of energy quanta, the permutations of energy states for matter are in fact discrete.
What you say is true, but also incomplete. We are perfectly able to quantify the accessible states in purely classical systems, such as ideal gases, without requiring discrete energy levels. The trick is to think of a continuous probability density instead of discrete probabilities. This framework is very general and does not depend on the quantum-ness of what you look at.
Even in some systems that actually follow quantum mechanics (such as phonons or electrons in a material, or photons in a black body), we often use continuous probabilities (densities of states) because it’s much more convenient when you have lots of particles.
IANAP but there's two different things that come to mind.
Even if the number of states were infinite, as long as there's a reasonable probability distribution you can integrate over the probabilities of the states that have one property vs. another (say: solid vs. not).
Secondly, energy is quantized (on a very very very small level) in reality. Ymmv though, I tried googling this and read some stuff about waves being quantized and particles not and about how it depends on what system you're looking at and I had to drop out of quantum physics when I took it :-)
For quantum objects, the distinction between a wave and a particle is not very meaningful. The energy levels of an electron around a nucleus are discrete, regardless of whether the electron behaves more like a wave or more like a particle in the specific experiment you’re doing.
Otherwise, you’re right: we count states by integrating (at times discrete, continuous, and often very complex) probability distributions.
As a physicist, this is such a great explanation that's actually correct for a change. Entropy must be one of the most misunderstood physical concepts out there, if not the most misunderstood (along with Planck metrics like Planck energy or Planck length). Entropy is used and written about by so many people who clearly lack an understanding of it that this blog post is a refreshing change.
I recommend the story "The Last Question", by Isaac Asimov[0], one of my favourite stories of all time. It's about how the question of reversing the direction of entropy keeps coming up for people, throughout the life of the universe.
That still doesn’t answer the question of how, if the laws of physics are time-symmetric, the universe as a whole can have a time-asymmetric evolution of entropy. I.e., if something forces entropy to increase in the long run, then that should hold in both directions of time. So what is it that causes entropy to only increase in the direction of the future, but not in the direction of the past, given that the laws of physics do not distinguish between the two directions?
The answer I've provided elsewhere in this thread is along the lines of: Other than melting icecubes and scrambling eggs, the only other difference you notice between the past and the future is that you can remember the past, but you cannot remember the future. If you could remember the future just as well as you remember the past, you probably wouldn't have strong opinions about which way time goes (or which direction is "clockwise"). If you, Merlin-like, could only remember the future then you'd probably be here asking why you always observe entropy decreasing in closed systems.
But memory operates on systems of increasing entropy, so you'll always only remember the past having less entropy. [1]
We don’t have an answer to this question. I don’t want to discuss metaphysics here, but there is a very interesting discussion on that subject here: https://youtube.com/watch?v=-6rWqJhDv7M
Let me rephrase maybe: Given the state of affairs I described above, I don’t understand what is the convincing argument that entropy does indeed increase in the long run. Any argument given should also work in the reverse direction, given the symmetry of time, shouldn’t it? (And thereby create a kind of reductio ad absurdum.) If not, why not?
I think it collapses down to an even simpler, perhaps even tautological point (though I think that it really isn't).
In a system with energy present, the system is continually and randomly shifting between microstates.
We identify macrostates, and analytically can identify certain macrostates as having more or less possible microstates.
Given the constant random movement between microstates (thanks, energy!), the system's macrostate is most likely to be one represented by large numbers of microstates.
The system isn't really "increasing its entropy" - it's simply randomly exploring all accessible microstates. If we could observe the microstates directly, we'd probably never think of entropy at all. But since we observe the macrostates, we also end up noticing that over time, systems with energy tend toward macrostates representing large numbers of microstates.
If the opposite were true, you'd basically be saying "more probable things are actually less probable", which is contradictory. Increasing entropy is really just a different way of saying that "more probable things tend to happen".
Because of the relationship between microstates and macrostates, and our cognitive biases towards certain macrostates, we tend to notice things moving towards what we consider disorder. All that is really happening is that things tend toward "more probable" macrostates, because they represent larger numbers of possible microstates.
Again, if you were unaware of the macrostates, you'd see no asymmetry.
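A toy simulation of this argument, in Python. It is a sketch under assumptions of my own choosing: N particles that can each sit on the left or right side of a box (much like the sheep in two pens), and a random shuffle that moves one randomly chosen particle per step. The microstate is the full list of sides; the macrostate is just the count on the left. The names `ehrenfest` and `multiplicity` are illustrative.

```python
import math
import random


def multiplicity(n_particles: int, n_left: int) -> int:
    """Number of microstates that share the macrostate 'n_left on the left'."""
    return math.comb(n_particles, n_left)


def ehrenfest(n_particles: int = 100, n_steps: int = 2000, seed: int = 0):
    """Randomly shuffle particles between two sides; return the macrostate history."""
    rng = random.Random(seed)
    on_left = [True] * n_particles  # microstate: start with everything on the left
    left_count = n_particles        # macrostate: how many are on the left
    history = [left_count]
    for _ in range(n_steps):
        i = rng.randrange(n_particles)  # pick a particle at random...
        on_left[i] = not on_left[i]     # ...and move it to the other side
        left_count += 1 if on_left[i] else -1
        history.append(left_count)
    return history


history = ehrenfest()
print(history[0], history[-1])
print(multiplicity(100, 100), multiplicity(100, 50))
```

Each individual move is just as likely as its reverse, so nothing in the rule prefers a direction. The count still drifts from 100 towards roughly 50 and stays near there, simply because the "about half and half" macrostates contain overwhelmingly more microstates than "all on the left" (which contains exactly one).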
It’s fundamentally an empirical law, even if one of the best-verified ones. I don’t think there really is a convincing argument, other than “we haven’t seen a single instance of the opposite being true”.
It’s a missing piece in quantum mechanics (linked to decoherence) and loop quantum gravity (in which time is... interesting).
The issue you mention (how can the 2nd law and the fact that the laws of Physics don’t care about the direction of time be true at the same time) is a big problem.
I really recommend watching the video if you haven’t, it is fascinating and well worth the time.
One possible model of the universe is a low-entropy "bounce" in the middle (the big bang), with entropy increasing in both time directions away from it (and therefore observers on either side of it experiencing the big bang as being in the "past"). Then the only part that remains to be explained is why there's this single point of low entropy, and that's kind of a "why is there something rather than nothing?" question.
"Since the second law of thermodynamics states that entropy increases as time flows toward the future, in general, the macroscopic universe does not show symmetry under time reversal." - https://en.wikipedia.org/wiki/T-symmetry
There are two significant repositories of high entropy in the known universe: empty space and black holes.
We can recast the Boltzmann measure of entropy by swapping small sub-volumes of a volume with one another and see if the containing volume changes significantly.
Swapping 1 cm^3 taken from the top of your brain with 1 cm^3 taken from the bottom of your brain will probably cause serious injury or death. Likewise, swapping 1 cm^3 of the valves on the left of your heart with 1 cm^3 of the muscles on the right of your heart will probably kill you. But swapping 1 cm^3 of blood taken from your left leg with 1 cm^3 of blood from your right arm will probably lead to no medical difference at all. So, qualitatively, the heart and brain have lower entropy than blood.
If you take 1 cm^3 of the stuffing of a cushion and swap it with 1 cm^3 of the stuffing elsewhere in the same cushion, you'd struggle to measure a difference by sitting on the cushion.
If you take 1 cm^3 of the air in a room and swap it with 1 cm^3 of the air elsewhere in the room, you'd struggle to notice a difference from outside the room.
And so forth.
We can play with larger volumes: if you are sitting on a cushion in a room, then swapping 1 cm^3 of your blood with 1 cm^3 of air will probably kill you. Swapping 1 cm^3 of a muscle in your leg with 1 cm^3 of the stuffing in the cushion will be unpleasant but not fatal.
In the room, most cm^3 are air rather than brain tissue. If we make the room bigger without adding more people or cushions, we get more air, so a bigger room+cushion+human has higher entropy than a smaller room+cushion+human. Expanding a room in this manner increases its total entropy. Our expanding universe works comparably.
A cm^3 of empty space is to all practical purposes completely substitutable with any other cm^3 of empty space. There's nothing in it. Real space outside galaxy clusters is practically empty, just some photons and neutrinos passing through.
When we consider the metric expansion of space we get two effects: there's more cm^3 of space, but not more photons and neutrinos. Those photons and neutrinos are also cooling, because the expansion is adiabatic.
The expansion of space is generating enormous entropy between clusters of galaxies, and that alone accounts for a large fraction of the increase in entropy in the universe.
Black holes hide what's in them. In a theoretical black hole (and the theory closely matches observation of things that are virtually certainly astrophysical black holes), the no-hair conjecture says that apart from position in and motion through spacetime, which we can remove by using a set of coordinates in which the (theoretical) black hole's mass M is always at the spatial origin, the only measurable values in electrovacuum are electric charge, angular momentum, and mass. What makes up M or contributes to the angular momentum or charge is unknown just by looking at a black hole at any given moment.
We can probe this a bit. Let's say our "snapshot" black hole's charge is completely neutral. Is that because only neutral charges have ever fallen into it? Or because a mix of opposite charges have fallen in over time, neutralizing small deviations in charge away from 0? We don't know. Same with angular momentum. We also don't know what things combined to give us the mass M.
In fact, consider a chargeless nonspinning spherically symmetrical black hole of mass M. We can raise M a small amount i by throwing in a thin collapsing shell of gas of mass ~ i. Or we can throw in two thin collapsing spherically-symmetrical shells of gas where each has mass ~ i/2. Or we can throw in three thin collapsing spherically-symmetrical shells of gas each of mass ~ i/3. And so on. A future observer would not know -- thanks to no hair -- the count of shells we threw in: it would only see a new black hole with one observable: M' where M' > M.
Because we can substitute the "insides" of a black hole arbitrarily as long as we preserve its observable quantities, a no-hair black hole can have absolutely enormous entropy. Additionally, a black hole with a larger mass has more entropy than a black hole with a smaller mass.
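For reference, the standard quantitative version of the last statement is the Bekenstein-Hawking formula, which is not spelled out in the comment above. Assuming a non-rotating, uncharged (Schwarzschild) black hole with horizon area A and Schwarzschild radius r_s:

```latex
\[
S_{\mathrm{BH}} = \frac{k_B c^3 A}{4 G \hbar},
\qquad
A = 4\pi r_s^2 = \frac{16\pi G^2 M^2}{c^4}
\quad\Longrightarrow\quad
S_{\mathrm{BH}} = \frac{4\pi G k_B}{\hbar c}\, M^2 .
\]
```

So in this idealised case the entropy grows with the square of the mass: doubling M quadruples the entropy, and merging two black holes into one of their combined mass gives more entropy than the two had separately.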
Astrophysical black holes probably only grow (and if they ever evaporate by the Hawking process then they mostly emit greybody radiation, which is extremely high entropy). Moreover, they grow by intercepting lower-entropy material, whether that's stars, molecular clouds, distant starshine, or the cosmic microwave background, all of which is of lower entropy.
Assuming the black holes that one finds in galaxy clusters tend to merge (which raises their entropy), and the metric expansion of space continues indefinitely, the far future of the universe is very large black holes surrounded by enormous amounts of space containing an increasingly sparse, increasingly cold gas of material that managed to avoid being stuck in galaxy clusters in the earlier universe which still had lower-entropy structures like stars and planets.
We can look backwards too: when we reverse the metric expansion of space, galaxy clusters are closer together, and the cosmic microwave background and the like gets denser and hotter. Stellar black holes become less numerous; central black holes in clusters and galaxies are smaller. Thanks to light being pretty slow (and also gravitational lensing) we can test this by probing "first light", "the dark ages", and other epochs that are accessible to conceivable observatories. It would be a super-interesting discovery to see more black holes in more distant (and thus older) galaxies than in closer ones, for instance, because it would directly implicate questions about the total entropy of the universe. Likewise, when we study the metric expansion of space there perhaps we could (shockingly) discover that remote "empty space" has much lower entropy than expected. These aren't very likely, though: it's safer to bet on the total entropy of the universe growing for a very very very long time thanks to the metric expansion of space outside galaxy clusters, and the gravitational collapse of matter inside galaxy clusters.
Thermodynamic entropy is not really a physical property of a system, it is a property of our description of the system subject to some macroscopic constraints.
Loved this article. There are so many applications of entropy and statistical physics in computer science, and I find it fascinating that the same general properties are useful in such different contexts.
For example, there's a well-known phenomenon in probability called concentration of measure. One of the most important examples in computer science is if you flip n coins independently, then the number of heads concentrates very tightly. In particular, the probability that you are more than an epsilon-fraction away from 0.5n heads is at most around e^{-epsilon^2 n}. This is exactly the setting described in the article with the sheep in two pens, and this inequality is used in the design of many important randomized algorithms! A classic example is in load balancing, where random assignment sometimes produces 'nice' configurations that look very much like the high-entropy states described in this article (but unfortunately, many times random assignments don't behave very well, see e.g. the birthday paradox).
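A small Python check of the coin-flip claim. As the concrete bound it uses Hoeffding's inequality, P(|heads - n/2| > eps * n) <= 2 * exp(-2 * eps^2 * n), which is my choice of inequality (the constants differ slightly from the "around e^{-epsilon^2 n}" above), and compares it with the exact binomial tail. The function names are illustrative.

```python
import math


def tail_probability(n: int, eps: float) -> float:
    """Exact P(|heads - n/2| > eps * n) for n independent fair coin flips."""
    total = 0
    for k in range(n + 1):
        if abs(k - n / 2) > eps * n:
            total += math.comb(n, k)  # number of sequences with k heads
    return total / 2 ** n


def hoeffding_bound(n: int, eps: float) -> float:
    """Hoeffding's upper bound on the same two-sided tail probability."""
    return 2.0 * math.exp(-2.0 * eps * eps * n)


for n in (100, 1000):
    print(n, tail_probability(n, 0.1), hoeffding_bound(n, 0.1))
```

Both the exact tail and the bound shrink exponentially in n, which is the sense in which the number of heads (or sheep per pen) "concentrates" around one half.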
The sum of independent random variables is well known to have concentration properties. An interesting question to me is what sorts of other statistics will exhibit these concentration phenomena. An important finding in this area is the bounded differences inequality (https://web.eecs.umich.edu/~cscott/past_courses/eecs598w14/n...), which generally states that any function that doesn't "depend too much" on each individual random argument (and the sum of bounded variables satisfies this assumption) exhibits the same concentration phenomenon. There are some applications in statistical learning, where we can bound the estimation error using certain learning complexity measures that rely on the bounded differences inequality. In the context of this article, that means there's a whole class of statistics that will concentrate similarly, and perhaps exhibit irreversibility at a macroscopic level.
There's a related anecdote about John von Neumann: he used to joke that he had superpowers and could easily tell truly random and pseudo-random sequences apart. He asked people to sit down in another room, generate a 0/1 sequence via coin flips, and record it. Then they were to generate another sequence by heart, trying to mimic randomness as much as possible. When people finally showed him the two sequences, von Neumann could instantly declare which one was which.
People were amazed.
The trick he used was based on the "burstiness" rule you describe: a long enough random sequence will likely contain a long homogeneous block, while humans tend to avoid long streaks of the same digit because they do not feel random enough.
So all he did was check at a glance which of the two sequences contained the longest homogeneous block, and pick that one as the sequence generated by coin flips.
That's a cool anecdote :-) I wouldn't say it uses concentration of measure exactly, but I see how it is related. The anecdote is about asymptotic properties of random sequences, and concentration of measure is about the same too. In this case, I think you can show that homogeneous blocks of length log(n) - log log(n) occur at least with constant probability as n gets large. In other words, the length of homogeneous blocks is basically guaranteed to grow with n. I suppose a human trying to generate a random sequence will prevent homogeneous blocks above a certain constant length from appearing regardless of the length of the sequence, which would make distinguishing the sequences for large n quite easy!
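A quick simulation of that growth, as a Python sketch. The helper names and the choice of 200 trials are mine; the code simply measures the longest homogeneous block in sequences of fair coin flips of different lengths.

```python
import random


def longest_run(bits) -> int:
    """Length of the longest block of identical consecutive symbols."""
    best = current = 0
    previous = None
    for b in bits:
        current = current + 1 if b == previous else 1
        previous = b
        best = max(best, current)
    return best


def average_longest_run(n: int, trials: int = 200, seed: int = 0) -> float:
    """Average longest run over several random 0/1 sequences of length n."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += longest_run([rng.getrandbits(1) for _ in range(n)])
    return total / trials


for n in (100, 1000, 10000):
    print(n, average_longest_run(n))
```

The average longest block keeps growing as the sequence gets longer (roughly logarithmically in n), whereas a person who refuses to write more than, say, four identical digits in a row caps it at a constant. For long sequences that gap is easy to spot at a glance.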
I think there is also a quite strong connection in this anecdote to the information-theoretic notion of entropy, which takes us all the way back to the idea of entropy as in the article :-) Information-theoretically, the entropy of a long random sequence concentrates as well (it concentrates around the entropy of the underlying random variable). The implication is that with high probability, a sampled long random sequence will have an entropy close to a specific value.
Human intuition actually is somewhat correct in the anecdote, though! The longer the homogeneous substring, the less entropy the sequence has, and the less likely it is to appear (as a limiting example, the sequence of all 0s or all 1s is extremely ordered, but extremely unlikely to appear). I think where it breaks down is that there are sequences with relatively long homogeneous substrings with entropy close to the specific values (in the sense that the length is e.g. log(n) - log log(n) as in the calculation before), where the human intuition of the entropy of the sequence is based on local factors (have I generated 'too many' 0s in a row?) and leads us astray.
In the video above you will witness a metal wire in a disorderly shape spontaneously form into an organized spring shape when entropy is increased (the wire is heated). There are no special effects or video rewinding trickery here, the phenomenon is very real.
It is as if you were looking at water spontaneously form into ice cubes but entropy is NOT reversing! It is moving forward.
The video is a really good example of entropy. It's good in the sense that if you understand why entropy is increasing when the metal is heated, then you truly understand what entropy is... as the video gets rid of the notion of order and disorder altogether.
That's right. There are many cases of increasing entropy where the system spontaneously progresses from a greater disorder to more organization.
Many people say that when some system becomes more organized, that means entropy is leaving the local system and increasing overall in the global system. This is not what's happening here. The wire is being heated. Atoms are becoming more organized and less disorderly by virtue of MORE entropy entering the system. If you understand this concept then you truly understand entropy. If not, you still don't get it.
If the entropy of the universe were to suddenly go in reverse would we be able to detect it or would our memory formation and perceptions being reversed make it indistinguishable from what we experience now?
Yes, time reversal would also reverse the process of our perception and memory, so we would experience time "moving forward" even if time were "moving" backward. We would experience entropy increasing even if entropy were "decreasing with time". (And it's perfectly valid to call the past "+t" and the future "-t"; the laws of physics don't care; if you do that, you'll see that entropy decreases as "t" increases, but despite changing definitions you'll still only have memories of a universe with lower entropy than the present).
This is one reason why it might be more correct to say that time doesn't move at all. We just perceive a direction of time wherever the universe has an entropy gradient along "the time axis".
Entropy decreasing is not equivalent to time reversal.
[Edit: to expand a bit: time reversal requires that some previous macrostate is achieved again. Entropy decreasing merely requires that the system enters any macrostate represented by fewer microstates than the current one.
Put more simply, the ice cube could re-form but in a different shape. Entropy would have decreased, but it would not "look like" time reversal - it would just look like something very strange had happened.]
This is true! I oversimplified a bit, because it's much easier to concretely discuss that scenario and its consequences, and it's more important from a philosophical perspective. Just like it's much easier to catch ice cubes forming on film by rewinding a tape than to run the camera waiting for spontaneous entropy decrease.
Nonetheless, if the entropy of the whole universe were to consistently decrease for a sufficient time, even along a different route than time reversal, you should gradually start to notice that it's easier to form predictions or "memories" of the future than to remember the past, due to Landauer's principle. Before long, you should go back to considering the future to be the state of higher entropy.
I think you're privileging microstates. There are plenty of entropy decreases that are possible without the ice cube reforming. They're (extremely) unlikely of course, but somewhat less unlikely than the new, singular microstate you're asserting is somehow "less of a coincidence". It's vastly more likely (for example) that 1% of the particle velocities reverse than all of them doing so, and that could (depending ...) still lead to a decrease in the entropy of the system (but not a reformed ice cube).
I don't see how a long term entropy decrease that does not follow the time reversal route could be easier to form predictions in. There are many possible higher entropy states for the water than the water sitting in the glass, and if the universe enjoyed random entropy decreases, I see no reason why it would tend to pick certain macrostates over others.
Ah, I think we're just miscommunicating. I'm not saying that local time-reversal is more likely to happen than other kinds of entropy decrease; I'm not saying that a system is more likely to retrace its past than to enter a different state of low-entropy. Spontaneous local entropy decreases happen all the time, of course (but are overpowered by entropy increases), and the majority of those won't be exact reversals.
And I'm not saying that locking yourself in a refrigerator will allow you to predict the future!
I'm mainly picking on time reversal because it's easier to concretely communicate and reason about as an example of how a system behaves under entropy decrease, and is philosophically important because depending on one's definition of the sign of 't', you could actually view our universe as undergoing entropy decrease right now. But redefining the arrow of time won't let you form memories of the future -- you'll have memories of the lower entropy universe, regardless of which way you call "the future".
The deeper point is, that's not a coincidence and doesn't depend on entropy decreases specifically being tied to time reversal. If we lived in a universe where the second law were somehow different and entropy "statistically always" decreased with time the way it "statistically always" increases now, then... well, it's hard to reason about such a universe because either it has very different laws than our own, or it's just our own universe with exactly that arbitrary t <--> -t transformation. But in most logically-consistent interpretations of that scenario, that universe's version of Landauer's principle would also flip the arrow of time perceived by its local inhabitants, and they'd end up with only memories of their "future".
If I'm not quite making my point clear, try asking yourself why, if the laws of the universe are time reversible, why can't you remember the future the way you can remember the past? This gets into the mechanics of how memory works (as a general concept, not human memory specifically; it's easier to think about computer memory).
(Edit): Why isn't this a trivial point? Well, if you imagine a universe composed of a chain of "linked states" that goes from (low entropy) - (high entropy) - (low entropy), you'd find that any inhabitants of that universe would perceive a universe with directional time that progresses along the entropy gradient, even though that universe has no consistent directional time.
> why can't you remember the future the way you remember the past
Memory of a finite brain isn't -- and cannot be -- a perfect record, so to some extent human memory retrodicts personally experienced past events extrapolatively. That human memory is better than human prediction of future events might not be closely coupled to cosmic thermodynamics.
For example, our ancestors with good memories may have produced more viable offspring than their contemporaries with poor memories but better predictive skills. You might not want to play certain competitive sports against the better predictors but worse rememberers, since they are likely to know where to be to catch the ball or whatnot; but you also might not want to eat the food they prepare because they don't reliably remember crucial food-safety practice.
The relative entropy inside the braincase of a living Australopithecus or H. erectus versus inside a living modern human's braincase has little to do with the change in total entropy of the universe over the past couple million years. It is perfectly plausible under modern physics that humans a couple million years in the future may end up with simpler and smaller brains, rather than larger or more complex ones. And the improving scientific knowledge of the evolutionary changes in human skull dimensions was not helpful in resolving the 20th century question of whether the universe was collapsing, expanding, or static.
Finally we aren't very good at communicating with the other intelligent species on our planet. Maybe orcas or octopuses have terrific memory and don't feel a significant difference between remembering a recent previous hunt and predicting the one they are just about to embark upon. It'd be fun to find out that our distinction between memory and prediction is just another part of human vestigiality, like our inability to produce L-ascorbic acid internally. Colloquially, maybe "future memory" was one of the things that were too metabolically expensive for our starving distant ancestors to live with, and so it fell off like tails.
>why can't you remember the future the way you can remember the past
this is getting a bit meta, maybe even off-topic. But I think this is fairly simple to explain: you can't remember states you haven't been in. Generalized memory implies some record of a state that has occurred - states that have not occurred cannot be remembered. The problem is that memory of types that we are familiar with (human, written, computer) all involve macrostates, not microstates. And increasing entropy (aka "more probable things are more likely to happen") means that there is asymmetry at the macrostate level, and thus in memory.
A memory system that only recorded microstates would, I suggest, have no concept of time, and a much reduced notion of causality. There may be some statistical patterns that could be observed and could perhaps be "strong" enough to infer that "after state N we frequently end up in state P", but the sheer number of states would likely interfere with this.
There's also the timescale problem. A memory system that operates on the timescales of typical human experience will notice relatively constant change in much of the world, as macrostates come and go. But a memory system that operates at, say, geological timescales won't record many macrostate changes at all, and will tend to indicate that almost nothing happens in the world. All those macrostates ("tables", "chairs", "houses", "books") that came and went without being noticed form no part of this system's memory of the world. Of course, there are processes still taking place (new macrostates made up of even more vast numbers of microstates), but these are going to be even more directional/asymmetric.
The one part of this view that leaves me a little confused is that at very short timescales, the unchanging nature of many macrostates is echoed in relatively unchanging microstates for most solids. The piece of metal that makes your <whatever> isn't changing macrostates at any appreciable pace (which is why its eventual wearing out forms an asymmetric experience of time for us), but it also isn't changing microstates in any notable way either. I find this confusing.
False memories are commonplace. Where exactly did you put your keys? Even if you give the right location, is that a true memory of the state you and the keys were in when you separated, or is it a retrodiction ("I probably put them in their usual storage place")?
> A memory system that operates on the timescales of ...
You mean one that measures parts of the world periodically?
ISTR we discussed this some exp(10^120) years ago[1] but I forget whether we reached any conclusions.
Sorry for the late reply. Don't know if you'll still be reading this, but...
> This is getting a bit meta, maybe even off-topic. But I think this is fairly simple to explain: you can't remember states you haven't been in. Generalized memory implies some record of a state that has occurred - states that have not occurred cannot be remembered.
In a discussion about the arrow of time, this is somewhat begging the question! The difference between a state that you've "been in" and a state that you "will be in" is exactly the subject that we're discussing. How do you precisely define "been in" or "occurred" in a way that doesn't reverse under a transformation from t <---> -t?
The answer I've provided elsewhere in this thread is along the lines of: Other than melting icecubes and scrambling eggs, the only other difference you can notice between the past and the future is that you can remember the past, but you cannot remember the future. If you could remember the future just as well as you remember the past, you probably wouldn't have strong opinions about which way time goes (or which direction is "clockwise"). If you, Merlin-like, could only remember the future then you'd probably be here asking why you always observe entropy decreasing in closed systems, and complaining that they got the second law of thermodynamics backwards.
On a fine-grained scale causality works just as well in a rewound video, although it is full of spontaneous-seeming coincidences with surprising macroscopic effects. There are only two things that establish an arrow of time:
* The universe has an entropy gradient along the "time axis", with one direction (which we can call 'P') having lower entropy and the other ('F') having higher entropy.
* We (and other physical systems) are able to form memories in one direction, but not the other. Because of this, we perceive a sense that time progresses "from" the direction that we can remember. This happens to be the direction of increasing entropy. Because of this ability to remember only along one direction of the entropy gradient, we call 'P' the past and 'F' the future.
This is not a coincidence. Memory operates on systems of increasing entropy, so you'll always only remember the past having less entropy than the present. [1]
>In a discussion about the arrow of time, this is somewhat begging the question!
Not really. "Been in" isn't meant to imply a temporal relationship. If a system has been in microstates A, B and C, then it can potentially remember states A, B and C. It cannot remember state D. That doesn't say anything about whether or not A preceded B, or C preceded A.
I agree with your last two paragraphs, but not the formulation in the middle para. "you cannot remember the future" ... this just seems like a non-useful observation to me. I think the problem comes from this line:
>We (and other physical systems) are able to form memories in one direction, but not the other.
I think this is wrong. The issue is that our memories are of macrostates, and macrostates are subject to the entropy gradient. If we could form memories of microstates, we'd effectively be able to remember events that had no arrow of time associated with them. But then you say this yourself in your final line. And Maccone's concept seems fine to me (as if that matters :)
I love how the notion of entropy permeates into so many other things. It's fundamental, universal, and at the heart of nearly every aspect of our existence.
Take philosophy. If the ultimate state of everything culminates in chaos (according to the theory of entropy), then human existence constitutes the exact opposite: controlling the chaos that surrounds us, and shaping it into something useful and, in entropy-speak, progressively improbable. Making ice cubes out of water.
It follows that our existence can at least be described as a function of entropy.
This describes life. Survival is the battle against entropy. Procreation is the chosen weapon against chaos, a force of order in a universe that can't help itself but fall into chaos -- and, undoubtedly, will ultimately prevail in that fight. In the long run, life will lose.
Alternatively, all that attempts to control chaos and decrease entropy actually results in faster entropy increase overall on a systemic level. I remember reading about a (Russian?) physicist that believed that life simply happens as a result of the universe's attempt to increase entropy faster on sufficiently complicated systems. If someone remembers his name I'd be obliged.
Dorion Sagan (Yes, Carl Sagan's son) also covers this in his book Into The Cool. That life is only a force multiplier in increasing it universally, basically.
See my username ;) But my inspiration was from trying to make a difference within corporate culture, not raw survival.
> controlling the chaos that surrounds us, and shaping it into something useful
i.e., work
> In the long run, life will lose.
Unless there is a transcendent being beyond our universe, an explanation believed by many people for why, a long time ago, entropy was microscopic. The article admits it can't come up with a materialistic explanation and ends by calling it a "mystery".
It's true. Have you read the book "What is Life?" by Schrödinger? He discusses the very concept you're talking about as he speculates on the physical substrate of genetic information (which was discovered to be DNA not long after the book was published).
Great explanation. What are the top theories for why the universe began in a low entropy state? The mere fact of that seems to contradict our current understanding of entropy. That implies that there is something very fundamental about the universe which we don't understand.
> The mere fact of that seems to contradict our current understanding of entropy
What part of our understanding does it contradict? The second law of thermodynamics says that entropy increases with time; this seems entirely consistent with a low-entropy past.
One explanation for why the universe began in a low entropy state is that that state has a very low description length. (This is a bit of a truism, since description length is a measure of entropy). But basically, let's just imagine that the universe is a simulation, with an initial state described by an initialization routine that sets up the simulation to run. If the initial state has high entropy, that initialization routine would need to be very long and detailed to describe exactly the location of every electron, neutrino, etc. If the initial state has very low entropy, that initialization routine is very short. If there's a reason to think that a short program is "more probable" than any particular very long program, then that would explain a low-entropy initial condition.
Another explanation is that, if the universe random-walks through all possible configurations, the "past" will still always look lower-entropy than the "future", for any little life-form that occupies that universe, because that life-form's memories will be much more likely to be correlated with the lower-entropy state. (It would have been nice for the article to go into this detail, but it's rarely discussed).
Still another explanation is provided by Many-Worlds interpretation of QM. Again the "big bang" is akin to initializing the wavefunction of the universe to something very simple and compact like a constant function, which as a whole evolves unitarily; the complexity and increasing entropy arises within particular branches of that wavefunction, where an observer requires an ever-longer description length to identify their particular branch.
A higher entropy state has a longer description length.
For example, let's say I have a magic electron microscope that can scan and record the exact position and velocity of each particle in some 1-cubic-micron volume, to within Heisenberg uncertainty limits and some finite digitization precision.
If my sample is a 1-cubic-micron volume of flawless monocrystalline silicon at 0 Kelvin, I can 'zip' my recording and transmit that description in a much shorter sentence (in fact, I just sent it to you!) than if my sample is a cubic micron of room-temperature saltwater (whose macrostate I just described, but whose microstate I did not).
If you care about describing the details, you can compress your description better if it's a low-entropy state.
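A rough way to see the compression claim, as a sketch only: the byte strings below are toy stand-ins I made up (a repeated pattern for the perfect crystal, random bytes for the thermal jumble), not real microscope recordings, and zlib is just a convenient general-purpose compressor.

```python
import os
import zlib

N = 100_000  # size of each toy "recording" in bytes

# Toy low-entropy sample: the same unit cell repeated over and over.
ordered = b"Si" * (N // 2)

# Toy high-entropy sample: bytes with no pattern to exploit.
disordered = os.urandom(N)

ordered_size = len(zlib.compress(ordered, 9))
disordered_size = len(zlib.compress(disordered, 9))

print(f"ordered:    {N} -> {ordered_size} bytes")
print(f"disordered: {N} -> {disordered_size} bytes")
```

The repetitive sample shrinks to a tiny fraction of its size, while the random one doesn't shrink at all (the compressor's framing typically makes it a few bytes bigger).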
But of course, cosmology is full of more mundane explanations about how the limit of the possible entropy of the universe can grow with time, so a high-entropy state suddenly has a lot of room to increase even further.
There is a fantastic book by Leonard Susskind [0] that tackles this question. His answer - rooted in string theory - is based on a "landscape of possibilities" and we happen to be in the place where all elementary particles (more specifically the Higgs) had the right energy configuration.
Where did those particles come from? Is there any scientific theory for the beginning of the beginning? A friend who studied physics told me that there's no reason that matter/energy couldn't have just existed forever. I don't find that explanation satisfying. If "entropy always increases" is a universal rule, then there must have been a time where it went from zero to non-zero. If "matter cannot be created or destroyed" is supposed to be a universal rule, then the mere existence of matter seems to contradict the rule, rendering it a non-universal tautology with vast but nonetheless bounded application.
No. In a forever-existing universe, as you go back in time, entropy could asymptotically approach zero (or some other limit) without ever actually reaching it. It doesn't have to be zero at some point in the past.
In fact, if it is zero at some point in the past, then what happened before that? Did it not increase before that point, or was it less than zero before that point?
I get your point about the asymptote, but I should have clarified: I was synthesizing two rules, "entropy always increases" and "matter cannot be created or destroyed"
Beyond that, metaphysically, how does something increase without having an origin? "Everything just always was" seems to conveniently handwave away a very important line of inquiry. This line of inquiry might be uncomfortably close to religious thought, but that shouldn't be a reason to terminate it.
As time goes on, we are able to explain more and more things scientifically, and yet, arbitrarily, we are supposed to be satisfied with "energy and matter were just always there, any other explanation is arbitrarily religious" without a sense of irony. That belief is held onto with religious conviction, sometimes used to dismiss alternate ideas with religious fervor, and backed with a religious rather than scientific standard of evidence.
Here's a premise for a (maybe horrible) book idea: God manipulates the minds of top scientists so that they do not definitively prove his existence, which would ruin the point of his simulation. Each time they get close, he finds an idiosyncratic way to make them forget about it or dismiss the idea. The people who discover God's tricks get brownie points in the afterlife. The people who discover God's tricks but try to publicize them, hence ruining the illusion, get...taken care of. Perhaps you can say the entropy of their body would increase.
Or it wasn't and came into existence with a 'big bang'. So how did it come into existence? There's no scientific or religious answer for it. You'll still be skirting the question with the old 'who made the gods' problem. I know what you're getting at: you're trying to conflate science with religion. Spoilers here: Scientists don't know these answers, and just because we don't know doesn't mean the thousands of denominations of thousands of gods of thousands of religions of thousands of years have a true answer either.
>we are supposed to be satisfied
No one said you should be satisfied by it. That's what's good about science: you're expected to not be satisfied and to survey and question. It isn't infallible, nor does it claim omniscience.
>God manipulates the minds of top scientists so that they do not definitively prove his existence, which would ruin the point of his simulation. Each time they get close, he finds an idiosyncratic way to make them forget about it or dismiss the idea.
And of course it's the gods, or all created thereof, who are behind helping or hurting belief in gods. Ask a god believer and it's always god's plan.
>then there must have been a time where it went from zero to non-zero.
Take a step towards the door. On the next step, take half that step. Keep taking half the step as before and you'll never get to the door. Perhaps the multi/universe is so infinite that there's no end or beginning as we know it. You can fill in the blank with deities, anthropomorphic or what have you, but that's where science ends and religion begins.
Cosmology works a lot around that question. There are a few theories that claim that the maximum entropy of the universe is growing with time, so what was high-entropy on the past gets room to keep growing.
That's remarkably well done, but the underlying principle can be summarised as "regression towards the mean" which is not so difficult to understand.
Now explain this: when the sheep are cooped up in one box you can extract energy from the system. This becomes apparent if you imagine the boundary between the boxes is a fan. As the sheep move from what is a high pressure on one side to 0 pressure on the other they will move the fan. Attach a generator to the fan, and you get energy out.
This configuration, with high pressure on the side low on the other, will happen every so often through random chance. Admittedly not very often, in fact so rarely it's useless to us. But in principle we have a perpetual motion machine.
We don't, of course. But why not?
And more to the point, is the 2nd law nothing more than a statement of averages? By which I mean "order always tends towards disorder" is in fact false. We will every so often see order arise from disorder. So the 2nd law is not really a hard and fast law, any more than "you always lose when gambling at the casino" is a hard and fast law.
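To put a number on "not very often": a minimal sketch, assuming each sheep (or gas particle) is independently and equally likely to be in either box at any instant. The function name and the particle counts are only illustrative choices, and this doesn't settle the perpetual-motion question by itself; it just shows how quickly the chance of the "all cooped up on one side" configuration collapses as the count grows.

```python
from fractions import Fraction
from math import log10

def prob_all_in_chosen_box(n_particles):
    """Chance that n independent particles, each equally likely to be in
    either of two boxes, are all in one chosen box at a given instant."""
    return Fraction(1, 2) ** n_particles

# Small systems fluctuate into the "ordered" state fairly often.
six_sheep = prob_all_in_chosen_box(6)
print(six_sheep)  # exact fraction

# A hundred particles is already hopeless in practice.
hundred = prob_all_in_chosen_box(100)
print(float(hundred))

# For a macroscopic amount of gas only the exponent is worth computing:
# log10(2**-n) = -n * log10(2).
macroscopic_exponent = -6e23 * log10(2)
print(f"about 10^({macroscopic_exponent:.2e})")
```

So the second law reads as a statement about overwhelming odds: for a handful of sheep the exceptions are easy to spot, while for anything with a macroscopic particle count they are so rare that treating the law as absolute costs nothing in practice.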
Great article. Can energy be explained in a similarly simple fashion? I'm pretty comfortable with probability theory so this entropy explanation makes sense, but I still don't understand what energy is.
Also, who determines what a 'macroscopic variable' is? Why do there only seem to be 3 for gases (V, T, P)?
>locally anti-entropic processes. We do see liquids turn into regular solids after all:
Crystallization is an exothermic process. It releases heat into the environment: forming bonds converts chemical potential energy into heat. Thus the total entropy of the system "crystal-forming liquid + environment" increases. Life is another famous locally anti-entropic process that is driven by an increase in total entropy.
great explanation and visuals.
but I do not quite get the way the arrangements of sheep are treated. They seem to be counted in the standard "Unordered Sampling with Replacement" fashion.
>Just as the sheep wander about the plots of land in the farm, these packets of energy randomly shuffle among the atoms in the solid.
this would mean that any "packet" of energy is equally likely to be in any of the buckets and the "packets" are independent of each other. so we have an equal distribution on the product space with 6^6 elements.
but then later
> Now, let’s assume the sheep are equally likely to be in any of these 462 arrangements. (Since they move randomly, there's no reason to prefer one arrangement over another.)
under the prior assumptions these arrangements would not be equally likely. e.g. "all sheep in plot 1" would be far less likely than "each sheep in a different plot"
am I missing something here?
in any case the same conclusions can be drawn in both cases, only that the concentration around 3 is already more pronounced in the "6 sheep, 6 plots" case using the product space model.
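Here's a quick sketch of the two counting conventions side by side, for the 6 sheep / 6 plots case. The variable and function names are made up for illustration. Convention 1 treats the sheep as indistinguishable and counts occupancy patterns (that's where 462 comes from); convention 2 is the product space above, where each labelled sheep picks a plot independently.

```python
from math import comb, factorial

SHEEP = 6
PLOTS = 6

# Convention 1: only the number of sheep per plot matters (stars and bars).
occupancy_patterns = comb(SHEEP + PLOTS - 1, PLOTS - 1)

# Convention 2: every labelled sheep independently picks one of the plots.
labelled_outcomes = PLOTS ** SHEEP

def labelled_ways(counts):
    """Number of labelled outcomes that produce the occupancy tuple `counts`
    (a multinomial coefficient)."""
    ways = factorial(sum(counts))
    for c in counts:
        ways //= factorial(c)
    return ways

all_in_plot_one = labelled_ways((6, 0, 0, 0, 0, 0))
one_per_plot = labelled_ways((1, 1, 1, 1, 1, 1))

print(occupancy_patterns, labelled_outcomes)
print(all_in_plot_one, one_per_plot)
print(one_per_plot / all_in_plot_one)  # how much likelier "spread out" is under convention 2
```

Under the product-space model the two occupancy patterns really do differ in probability by a factor of 720, so the objection holds if the sheep are treated as labelled and independent. As far as I know, the article's "462 equally likely arrangements" is the usual textbook assumption for indistinguishable energy quanta in a solid, where each occupancy pattern counts as a single microstate.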
I think the confusion is in the way that sheep as a word can be both plural and singular. Specifically, one sheep is as likely to be in any single spot compared to any other single spot. It’s when you get to more than one sheep that you see the distributions
Great article! I'm almost done with a PhD in physics and statistical physics is still the hardest thing for me to wrap my mind around. Even more so than quantum and relativity. There's something about how unintuitive math becomes in ultra high dimension and the old timey feeling way that it was taught to me (steam engines and stuff) that makes it difficult to learn.
Thermodynamics as a subject is taught in two entirely different traditions, the engineering tradition and the physics tradition. I've taken courses in both traditions and I personally find that physics tradition explains a lot more.
When I took engineering thermodynamics, the focus was on solving "practical" problems, and I never got a good understanding of what entropy was. The point of the class was not to teach physics, though. The point was to get engineers to solve thermodynamics problems (like heat engines and cycles). The primary thing I remember from that course was looking up things in the tables in the back of the book and converting between Btus and other units.
I only understood entropy after taking a statistical physics course. The course's explanation of entropy is basically the same explanation as the article. The point of that class was entirely different than the engineering course, so the professor spent several lectures going into detail about what entropy is and how it relates to everything else. I never had to do any table lookups in that class either.
Side note: you can easily tell what tradition you were taught in based on the sign convention in the first law of thermodynamics you learned:
I don't really understand this "Entropy Is All About Arrangements" takeaway. A fair coin has higher entropy than a biased one. What are the "arrangements" in this case?
The coins don't have entropy, the sequences they produce do (in information theory sense, this is not about physics).
For a long sequence (say 1000 flips), the former will produce close to 500 heads and 500 tails. The latter, assuming heads is three times as probable as tails, will produce around 750 heads and 250 tails. There are many more different sequences of the first kind.
I'm referring to the entropy of the Bernoulli distribution. If the coin is fair, the entropy is 1 bit... if the coin isn't fair, then the entropy of the distribution is less than 1 bit. I'm having trouble reconciling the information theory way of thinking about entropy as a function of a distribution, with how physicists tend to think of entropy of arrangements.
There is a relationship between the distribution of x_i and the sequences generated from that distribution x_1, x_2, ..., x_n. If the coin isn't fair there are fewer arrangements possible. If the coin has two heads there is only one possible sequence.
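A small sketch of how the two pictures line up, using the 1000-flip example from upthread. The function names are my own. One function is the information-theory entropy of the Bernoulli distribution; the other counts the "arrangements" (sequences with the typical number of heads) and converts that count to bits per flip.

```python
from math import comb, log2

def bernoulli_entropy_bits(p):
    """Shannon entropy, in bits, of a coin that lands heads with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bits_per_flip_from_counting(n, heads):
    """log2 of the number of length-n sequences with `heads` heads, per flip."""
    return log2(comb(n, heads)) / n

N = 1000
fair_distribution = bernoulli_entropy_bits(0.5)
fair_counting = bits_per_flip_from_counting(N, 500)

biased_distribution = bernoulli_entropy_bits(0.75)
biased_counting = bits_per_flip_from_counting(N, 750)

two_headed_counting = bits_per_flip_from_counting(N, N)

print(fair_distribution, fair_counting)
print(biased_distribution, biased_counting)
print(two_headed_counting)
```

The counting figure sits slightly below the distribution figure for finite N and closes in on it as N grows, so "entropy of the distribution" and "log of the number of typical arrangements, per flip" are two views of the same quantity. The two-headed coin gives exactly one sequence, hence zero bits.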
excuse my naivete, but do black holes help reduce entropy by capturing/engulfing things around them? Is that the cycle how universe keeps creating and recreating itself?
Nope, black holes have entropy proportional to the surface area of their event horizon. So the more stuff they engulf, the more their entropy increases, and thus they satisfy the 2nd law of thermodynamics just like everything else.
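For a sense of scale, here's a sketch of the area law using the standard Bekenstein-Hawking formula S = k c^3 A / (4 G hbar), for the simplest case of a non-rotating, uncharged black hole. The solar mass value is approximate and the function names are just for illustration.

```python
from math import pi

G = 6.67430e-11          # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0        # speed of light, m/s
HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
SOLAR_MASS = 1.989e30    # kg, approximate

def horizon_area(mass_kg):
    """Event-horizon area of a non-rotating, uncharged black hole."""
    schwarzschild_radius = 2 * G * mass_kg / C**2
    return 4 * pi * schwarzschild_radius**2

def black_hole_entropy(mass_kg):
    """Bekenstein-Hawking entropy in J/K: proportional to horizon area."""
    return K_B * C**3 * horizon_area(mass_kg) / (4 * G * HBAR)

s_one = black_hole_entropy(SOLAR_MASS)
s_two = black_hole_entropy(2 * SOLAR_MASS)

print(f"S / k for one solar mass: {s_one / K_B:.2e}")
print(f"entropy ratio after doubling the mass: {s_two / s_one:.3f}")
```

Since the horizon radius scales with mass, the area and hence the entropy scale with mass squared: swallowing more stuff always pushes the entropy up, never down.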
Has this been observed, or is it more: here's some math that makes entropy even possible because the alternatives sound unlikely? I'm betting on the weird, things like the universe being generative on a macro scale and entropic in local timespace, wherein dark matter is merely newly created matter not in another universe, but in this universe, maybe some unknown interactions between the unstoppable force of expansion and the immovable object of a black hole's gravity. I'm probably blathering, I'm not a physicist.
Order as humans understand it is different from physical order. We see order as an array of ascending numbers or a house of cards. The universe sees order more like a state of 'useful' energy, where it's possible to extract it, and seemingly became ordered by putting energy in. This is why I think the information theory definition of entropy is a bit misleading.
I like to see it as a radioactive atom, which starts as a useful structure with potential energy that has an inherent timer until this energy is lost due to the universe wanting to return to equilibrium. So it's statistically extremely unlikely to get the atom back to its original energetic state by nature, but not because it's literally impossible, it's just impossible to do this without putting energy back in and that's not something that happens naturally.
A reversal of entropy is entirely possible, it's just called doing work. Of course, we don't have the knowledge necessary to reverse each individual atom in the melting process of an ice cube, but one day we might. It's theoretically possible with enough work. Of course, there will always be a loss of energy, but I'm pretty sure that's an entirely different thermodynamic law.