The key clarification is in one of the comments: if you want to treat partial derivatives like fractions, you need to carry the "constant with respect to foo" modifier along with both the numerator and the denominator.
Once you do that, it's clear that you can't cancel "dx at constant z" with "dx at constant y" etc. And then the remaining logic works out nicely (see thermodynamics for a perfect application of this).
I’ve never liked the conflation with fractions. Abuse of notation. And it causes so much confusion.
Also integrals with “integrate f(x) dx”, where people treat “dx” as some number that can be manipulated, when it’s really just part of the notation “integrate_over_x f(x)”.
Sigh. These are sadly some kind of rite of passage, or mathematical hazing.
Ordinary derivatives work fine as fractions: they are rigorously the limit of a fraction. Same deal with dx inside the integral: it is rigorously the limit of a small \Delta x in a summation.
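To spell that out, the two limits in question are:

```latex
\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x},
\qquad
\int_a^b f(x)\,dx = \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i)\,\Delta x .
```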
Baez is mixing partial derivatives with different variables treated as constants. Whole different ball game.
I consider it not an abuse of notation but a helpful notation. It correctly suggests >90% of the calculus rules, which tend to be hard to remember otherwise.
I still don’t understand what "at constant something" means. I mean formally, mathematically, in a way where I don’t have to kind of guess what the result may be, rely on my poor intuitions, and continually shoot myself in the foot in the process.
I assume you're talking about thermodynamics - this comes down to a slight abuse of notation. For an ideal gas, say, you can express various state functions like the internal energy in various different ways. You can do it in terms of pressure P and volume V to get U ~ PV, for instance.
Or you could do it in terms of temperature T and pressure, for instance, to obtain U ~ T (in this case there's no dependence on pressure).
The ideal gas laws let you transform between these choices. But the point is that the same physical quantity, U, has multiple mathematical functions underlying it - depending on which pair you choose to describe it with!
To disambiguate this physicists write stuff like (dU/dP)_T, which means "partial derivative of U wrt P, where we use the expression for U in terms of P and T". Note that this is not the same as (dU/dP)_V, despite the fact that it superficially looks like the same derivative! The former is 0 and the latter is ~V, which you can compute from the expressions I gave above.
The mistake is thinking that U is a single function of many independent variables P, T, S, V, etc. Actually these variables all depend on each other! So there are many possible functions corresponding to U in a formal sense, which is something people gloss over because U is a single physical quantity and it's convenient to use a single letter to denote it.
Maybe it would make more sense to use notation like U(T, P) and U(P, V) to make it clear that these are different functions, if you wanted to be super explicit.
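Here's a quick check of those numbers with sympy (a minimal sketch; the monatomic ideal gas relations U = (3/2)NkT and PV = NkT are assumed just for illustration):

```python
import sympy as sp

P, V, T, N, k = sp.symbols('P V T N k', positive=True)

# Same physical quantity U, two different underlying functions:
U_of_TP = sp.Rational(3, 2) * N * k * T   # U in terms of (T, P): no P-dependence
U_of_PV = sp.Rational(3, 2) * P * V       # U in terms of (P, V), via PV = NkT

print(sp.diff(U_of_TP, P))   # (dU/dP)_T -> 0
print(sp.diff(U_of_PV, P))   # (dU/dP)_V -> 3*V/2, i.e. ~V
```

Same physical U, two different functions, two different values of "dU/dP".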
> The mistake is thinking that U is a single function of many independent variables P, T, S, V, etc. Actually these variables all depend on each other!
So, in vector space terms, we have different bases to describe U in, but not that many independent variables.
If U is a function of x and y, but x and y are not orthogonal, then I can't treat dU/dx and dU/dy as independent, even for partial derivatives, because x and y aren't really independent.
You're not, in general, just working in a vector space but on a manifold whose coordinates are your extensive variables. It's only linear locally, in the (co-)tangent space where you're doing calculus.
Yeah, I think this is along the right lines - in the vector space analogy it's like we have a bunch of vectors we can measure (P, T, S, V, etc) but due to the constraints we're actually working in a 2 dimensional space. So we could form a basis from many different choices of vectors, and our coefficients would change accordingly.
As the other commenter said, you can make this analogy rigorous by looking at manifolds (differential geometry). They're a little bit like the non-linear version of a vector space. In this case the set of physically valid values for P, T, S, V forms a two-dimensional surface due to the ideal gas laws, and you can derive local coordinate charts for the surface using any (non-degenerate) pair of these variables.
Imagine a function z=f(x,y) in 3D space. Now picture a plane at, say, x=3, that is parallel to the plane passing through the Y and Z axes. This x=3 plane cuts through our function, and its intersection with the z=f(x,y) surface forms a sort of 2D function z=g(y)=f(3,y).
(The Wikipedia page[1] has nice images of this [2])
The slope of this new 2D function on the x=3 plane at some point y is then the partial derivative ∂z/∂y at constant x at the point (3,y), since we are "fixing" the value of x to a constant by only considering the intersection of our original function with the plane x=x_0.
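A quick sanity check of that picture (the particular f is hypothetical, chosen just for illustration):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 * y + sp.sin(y)          # a hypothetical surface z = f(x, y)

g = f.subs(x, 3)                  # the 2D slice living on the plane x = 3
print(sp.diff(g, y))              # slope of the slice: 9 + cos(y)
print(sp.diff(f, y).subs(x, 3))   # partial dz/dy evaluated at x = 3: same answer
```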
That’s just the standard partial derivative in multivariable calculus. This one I have no trouble understanding. My question is about "at constant something" as used in thermodynamics, where "at constant something" is clearly doing more work than just "partial derivative". What work? How? Damned if I know.
Consider f(x,y,z), let’s say f(x, y, z) = x^2 + 3y^3 - e^(-z). What’s the difference between "the partial derivative of f with respect to x" and "the partial derivative of f with respect to x at constant y"? The first one is already at constant y!
In standard multivariable calculus, the partial derivative of f with respect to x, as you explained, is always "at constant y and z".
In thermodynamics, you can say things like "partial derivative of pressure with respect to volume" and add "at constant temperature" or "at constant entropy" and get different results. What? Why? How?
> things like "partial derivative of pressure with respect to volume" and add "at constant temperature"
They're the same thing, aren't they? Except that with the "at constant temperature" addendum, you're just making explicit the other variable(s) that could potentially be varied. Without it, it just means all other variables, whatever they may be, are held constant.
But if something depended on both temperature and some other quantity X, and you said "partial derivative of pressure with respect to volume at constant temperature," that would be sort of misleading because you're only explicitly mentioning one of the other two variables; rather, you should say "at constant temperature and X" or not mention either of them.
They aren't the same thing since the first is strictly speaking not well defined - see my answer to the OP. I think the problem is that physicists use the same letter, say U, to denote multiple different mathematical functions depending on the context. The "holding XXX constant" thing serves to tell you which function you're dealing with formally.
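A concrete instance of that, which also answers the "at constant entropy" question above: for an ideal gas, an isotherm satisfies PV = const, while an adiabat satisfies PV^γ = const (γ is the heat capacity ratio). So P is a genuinely different function of V along each curve, and:

```latex
\left(\frac{\partial P}{\partial V}\right)_T = -\frac{P}{V},
\qquad
\left(\frac{\partial P}{\partial V}\right)_S = -\gamma\,\frac{P}{V}
```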
For problems in the plane, it's natural to pick two coordinate functions and treat other quantities as functions of these. For example, you might pick x and y, or r and θ, or the distances from two different points, or...
In thermodynamics, there often isn't really one "best" choice of two coordinate functions among the many possibilities (pressure, temperature, volume, energy, entropy... these are the most common but you could use arbitrarily many others in principle), and it's natural to switch between these coordinates even within a single problem.
Coming back to the more familiar x, y, r, and θ, you can visualize these 4 coordinate functions by plotting iso-contours for each of them in the plane. Holding one of these coordinate functions constant picks out a curve (its iso-contour) through a given point. Derivatives involving the other coordinates holding that coordinate constant are ratios of changes in the other coordinates along this iso-contour.
For example, you can think of evaluating dr/dx along a curve of constant y or along a curve of constant θ, and these are different.
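To make that concrete with sympy (a small sketch; r = sqrt(x^2 + y^2) and x = r cos θ are the usual polar relations):

```python
import sympy as sp

x, y, theta = sp.symbols('x y theta', positive=True)

# Along a curve of constant y: r as a function of x with y fixed
r_const_y = sp.sqrt(x**2 + y**2)
print(sp.simplify(sp.diff(r_const_y, x)))   # x / sqrt(x**2 + y**2), i.e. cos(theta)

# Along a curve of constant theta: a ray, where x = r*cos(theta), so r = x/cos(theta)
r_const_theta = x / sp.cos(theta)
print(sp.diff(r_const_theta, x))            # 1 / cos(theta)
```

So the "same" dr/dx is cos θ along one curve and 1/cos θ along the other.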
I first really understood this way of thinking from an unpublished book chapter of Jaynes [1]. Gibbs's "Graphical Methods In The Thermodynamics of Fluids" [2] is also a very interesting discussion of different ways of representing thermodynamic processes by diagrams in the plane. His companion paper, "A method of geometrical representation of the thermodynamic properties of substances by means of surfaces", describes an alternative representation as a surface embedded in a larger space, and these two different pictures are complementary and both very useful.
Here's a geometric way of looking at it. I'll start with a summary, and then give a formal-ish description if that's more your jam.
---
The fundamental issue is that physicists use the same symbol for the physical, measurable quantity and for the function relating it to other quantities. To be clear, that isn't a criticism: it's a notational necessity (there are too many quantities to assign a distinct symbol to each function). But it does muddle the semantics.
However, there is also a lack of clarity about the semantics of "quantities". I think it is best to think of quantities as functions over an underlying state space. Functional relationships _between_ the quantities can then be reconstructed from those quantities, subject to uniqueness conditions.
This gives a more natural interpretation for the derivatives. It highlights that an expression like S(U, N, V) doesn't imply S _is_ the function, just that it's associated to it, and that S as a quantity could be associated with other functions.
---
The state space S has the structure of a differential manifold, diffeomorphic to R^n [0].
A quantity -- what in thermodynamics we might call a "state variable" -- is a smooth real-valued function on S.
A diffeomorphism between S and R^n is a co-ordinate system. Its components form the co-ordinates. Intuitively, any collection of quantities X = (X_1, ..., X_n) which uniquely labels all points in S is a co-ordinate system, which is the same thing as saying that it's invertible. [1]
Given such a co-ordinate system, any quantity Y can naturally be associated with a function f_Y : R^n -> R, defined by f_Y(x_1, ..., x_n) := Y(X^-1(x_1, ..., x_n)). In other words, this is the co-ordinate representation of Y. In physics, we would usually write that, as an abuse of notation: Y = Y(X_1, ..., X_n).
This leads to the definition of the partial derivative holding some quantities constant: you map the "held constant" quantities and the quantity in the denominator to the appropriate co-ordinate system, then take the derivative of f_Y, giving you a function which can then be mapped back to a quantity.
In that process, you have to make sure that the held-constant quantities together with the denominator quantity form a co-ordinate system. A lot of thermodynamic functions are posited to obey monotonicity/convexity properties, and this is why. It might also be possible to find a more permissive definition that uses multi-valued functions, similar to how Riemann surfaces are used in complex analysis.
To do that we'd probably want to be a bit more general and allow for "partial co-ordinate systems", which might also be useful for cases involving composite systems. Any collection of quantities (Y, X_1, ..., X_n) can be naturally associated with a relation [2], where (y, x_1, ..., x_n) is in the relation if there exists a point s in S such that (Y(s), X_1(s), ..., X_n(s)) = (y, x_1, ..., x_n). You can promote that to a function if it satisfies a uniqueness condition.
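Here's a minimal sketch of that construction in sympy, with the monatomic ideal gas assumed as the example: take (T, V) as the underlying chart, define the quantities P and U on it, then build f_U in the (P, V) chart by inverting, and differentiate.

```python
import sympy as sp

# Underlying chart: (T, V); N and k are fixed parameters.
T, V, P, N, k = sp.symbols('T V P N k', positive=True)

# Quantities as functions on the state space (monatomic ideal gas assumed)
P_q = N * k * T / V                      # pressure, P = NkT/V
U_q = sp.Rational(3, 2) * N * k * T      # internal energy, U = (3/2)NkT

# To form (dU/dP)_V we need (P, V) to be a valid chart: solve for T
T_of_PV = sp.solve(sp.Eq(P, P_q), T)[0]  # chart inverse: T = P*V/(N*k)

# f_U in the (P, V) chart: U composed with the chart inverse
U_of_PV = sp.simplify(U_q.subs(T, T_of_PV))   # 3*P*V/2
print(sp.diff(U_of_PV, P))                    # (dU/dP)_V = 3*V/2
```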
I think it is also possible to give the manifold a metric (Riemannian) structure in a way compatible with the Second Law. I remember skimming through some papers on the topic, but I didn't look at them in enough detail.
---
[0] Or half of R^n, or a quadrant maybe.
[1] The "diffeomorphism" definition also adds the condition that the inverse be smooth.
[2] Incidentally, same sense of "relation" that leads to the "relational data model"!
I honestly don't know why infinitesimals aren't widespread. They can basically have the same basis/justification, can't they? But with the bonus of being more intuitive.
You don't even need to use "infinity", it starts out as just a variable representing some unknown quantity, then you "round to zero" on output.
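For example, treating dx as such a variable, the derivative of x^2 falls out by algebra and a final "round to zero":

```latex
\frac{(x+dx)^2 - x^2}{dx} = \frac{2x\,dx + dx^2}{dx} = 2x + dx \approx 2x
```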
I actually collected a bunch of old infinitesimal calculus math books.
> I honestly don't know why infinitesimals aren't widespread. They can basically have the same basis/justification, can't they? But with the bonus of being more intuitive.
Indeed they are more intuitive; people like Newton and Leibniz invented/discovered calculus by thinking in terms of infinitesimals, but it took until the 20th century for them to be made rigorous. By then network effects had us stuck with epsilons and deltas, since that was the approach made rigorous earlier and broadly adopted, despite being more cumbersome.
They are in the attic at the moment, but they are all fairly old books (terse and dry, with basic formatting/illustration), seemingly from a period when infinitesimals were apparently more popular.
Newtonian notation certainly feels more elegant to me, but it's kind of painful to work with in LaTeX. Lagrangian notation is almost the same, and much easier to type too.
Newtonian notation is just writing time derivatives with a dot above them, so in LaTeX that is just \dot{x} = v, which means dx/dt = v, or \ddot{x} = a.
Did you mean "Leibniz's" notation[1]? If so, if you use the esdiff package[2] it's just \diffp{y}{x} for partials or \diff{y}{x} for ordinary derivatives.
Lagrange's notation is when people write x' = v or x'' = a, and like Newton's notation you kind of have to know from context that you are differentiating with respect to time, unless they write it properly as a function with arguments, which people often tend not to (at least I often tend not to, I guess).
Sometimes people also call the subscript notation for partial derivatives "Lagrange's notation"[3]. So f_x(x,y) = blah is the partial derivative of f with respect to x.
[1] Actually invented by Euler, or maybe some other guy called Arbogast or something[?sp]
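For reference, the variants side by side (the \diffp macro assumes the esdiff package mentioned above):

```latex
\dot{x}, \ddot{x}            % Newton: derivatives with respect to time
x', x''                      % Lagrange
f_x(x,y)                     % subscript notation for partials
\frac{dy}{dx}, \diffp{f}{x}  % Leibniz; \diffp requires \usepackage{esdiff}
```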
It has been argued before [0] that Leibniz's notation being embraced in mainland Europe but not adopted in England/UK was the reason England fell about a century behind. I first heard of this in an MIT undergrad calculus course on YouTube, but it would be too tedious to find which video, hence the web search.
I’d be thrilled if mathematicians would just use multicharacter variable names instead of getting overly fancy with diacritics and italic/bold/capital/Greek variations.
Am I missing something? I don't see how the examples are more "intuitive"; they just provide an applied example of using this.
My pain was always Hamiltonians and Legendre equations for systems, because the lecturer believed in learning by rote rather than explaining something that I'm sure was simply intuitive for him.
Why would you even say in the first place that derivatives are simply fractions? They're not, except in some very specific physical approximations, and in that case don't try to do anything funky; stick with the basic stuff.
My understanding is they actually are fractions of things called differential one-forms[1], but even most people who can do calculus don't get to differential geometry, so the sense in which they are fractions is not commonly understood. Michael Penn explains it here https://youtu.be/oaAnkzOaNwM?si=nwNNg4pl7WW4KvIO
A 1-form is a section[1] of the cotangent bundle[2] of a manifold. In other words, a rank 1 covariant tensor field.
At any given point p on an n-dimensional manifold, a 1-form defines an n-dimensional cotangent vector (in the language of bundles[3], a point in the fiber over p).
So how do we define fractions of sections or vectors?
In the article, Baez defines fractions of 2-forms on the plane as the pointwise ratio of their coefficients with respect to a basis vector, which he can do because, as he points out, the space of 2-forms at a point on a 2-dimensional manifold is a 1-dimensional vector space (more generally, for k-forms on an n-dimensional manifold, this dimension is n choose k, so only 1 for 0-forms [functions] and n-forms).
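Spelled out: on the plane every 2-form is a function times dx ∧ dy, so the ratio of two of them is an honest function wherever the denominator is nonzero:

```latex
\alpha = f\,dx \wedge dy, \quad \beta = g\,dx \wedge dy
\quad\Longrightarrow\quad
\frac{\alpha}{\beta} := \frac{f}{g}
```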
If you are careful to represent them in terms of the right set of variables, and to apply them at the right points (which the example in the article pointedly doesn't do), they pretty much behave exactly like fractions.
There are many areas of mathematics that spun off from this.
Being careful is very costly; if you're not careful, you're just approximating or being wrong. At this point we have to weigh whether the tradeoff of writing them as fractions was really worth it.
they refer in the beginning to physics classes, and I had the exact same experience in university. diffeq was not a prereq, and yet instead of explaining the derivation of these equations, our physics professor just handwaved and said "they're basically just fractions, don't think about it too much"