## Fibonacci and gold

Most people have heard either about the golden ratio

$\varphi := \frac{1}{2}(1 + \sqrt{5}) \approx 1.618\ldots$

or about the Fibonacci numbers

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ….

They are intimately related, and I could write several enormous posts enumerating all of their amazing properties.

The golden ratio $\varphi$ is an irrational number which satisfies the polynomial equation $x^2 - x - 1 = 0$, that is, we have $\varphi^2 - \varphi - 1 = 0$. In fact, since this is the lowest-degree (hence “simplest”) polynomial annihilating $\varphi$, we refer to it as the minimal polynomial of $\varphi$. Many surprising facts can be derived from this innocuous-looking relation. For example, $\varphi^2 = 1 +\varphi$ immediately yields that $\varphi = \sqrt{1 + \varphi}$, from which we get the “continued surd” expression

$\varphi = \sqrt{1 + \sqrt{1 + \sqrt{\ldots}}}$

Stated differently, $\varphi$ is a fixed point of the mapping $x \mapsto \sqrt{1 + x}$. When you were a kid, if you were bored with a calculator, maybe you had the idea of starting with some number, adding 1 to it and taking its square root, and then repeating this process ad nauseam. If you did that, you would have found that eventually the numbers on your calculator stop changing at exactly the value $\varphi \approx 1.618 \ldots$.
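If you don't have a calculator handy, here is a minimal sketch of that experiment in Python. The function name `iterate_sqrt_map` and the choice of 60 steps are mine, not anything canonical; the point is only that very different starting values all drift to the same number.

```python
import math

def iterate_sqrt_map(x0: float, steps: int = 60) -> float:
    """Repeatedly apply x -> sqrt(1 + x), starting from x0 >= -1."""
    x = x0
    for _ in range(steps):
        x = math.sqrt(1.0 + x)
    return x

phi = (1.0 + math.sqrt(5.0)) / 2.0

# Very different starting points all end up next to phi.
for start in (0.0, 1.0, 10.0, 1000.0):
    print(start, iterate_sqrt_map(start), phi)
```

The iteration settles quickly because the derivative of $x \mapsto \sqrt{1+x}$ at $\varphi$ is $1/(2\varphi) \approx 0.31$, so each step shrinks the error by roughly that factor.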

For something else, take our original equation $\varphi^2 = \varphi + 1$, and divide through by $\varphi$ to obtain $\varphi = 1 + \frac{1}{\varphi}$. This tells us $\varphi$ is also fixed by the map $x \mapsto 1 + \frac{1}{x}$. It follows immediately that the so-called “continued fraction expansion” of $\varphi$, which (in a precise sense) provides the data of the “best rational approximants” to $\varphi$, must look like:

$\displaystyle \varphi = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}}$

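With a lot of irrational numbers, we get much less pretty continued fraction expansions. For instance:

$\displaystyle \pi = 3 + \cfrac{1}{7 + \cfrac{1}{15 + \cfrac{1}{1 + \cfrac{1}{292 + \cdots}}}}$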

One thing to note is that the appearance of a large number in the continued fraction expansion, like 292 above, is telling us something about Diophantine approximation: that is, how well we’re able to approximate our number by rationals of a given denominator. The rationals you obtain by truncating a number’s continued fraction expansion are provably always the “best” in this sense. Thus, if we look at $\varphi$, whose continued fraction is just all 1’s, we can say that in this precise sense, $\varphi$ is the number for which this “approximability” is the worst. It is as hostile as can be towards rational numbers.
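To see the contrast concretely, here is a rough sketch that reads off continued fraction terms and builds the corresponding convergents. The helper names `partial_quotients` and `convergents` are hypothetical, and I am using $\pi$ as the comparison number because its expansion contains a 292. The terms are computed from the exact rational value of a floating-point number, so only the first handful can be trusted.

```python
import math
from fractions import Fraction

def partial_quotients(x: float, count: int) -> list:
    """First `count` continued fraction terms of the exact rational value of the float x."""
    r = Fraction(x)  # exact value of the double, so no further rounding happens
    terms = []
    for _ in range(count):
        a = math.floor(r)
        terms.append(a)
        frac = r - a
        if frac == 0:
            break
        r = 1 / frac
    return terms

def convergents(terms: list) -> list:
    """Convergents p/q of [a0; a1, a2, ...] via the usual two-term recurrence."""
    p_prev, q_prev = 1, 0
    p, q = terms[0], 1
    out = [Fraction(p, q)]
    for a in terms[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append(Fraction(p, q))
    return out

phi = (1 + math.sqrt(5)) / 2
print(partial_quotients(phi, 10))
print(partial_quotients(math.pi, 5))
print(convergents(partial_quotients(math.pi, 4)))
```

Truncating just before the large term gives $355/113$, which is why that fraction is such an unreasonably good approximation to $\pi$ for its size. No such shortcut exists for $\varphi$, whose terms never get bigger than 1.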

Anyway, this is just a small taste of $\varphi$. Now let’s do something seemingly random: take the polynomial $x^2 - x - 1$ and “flip” the sequence of coefficients around, to obtain $1 - x - x^2$ (alternatively, replace each exponent $k$ in the expression with the new exponent $2-k$). Now we’ll take a reciprocal:

$\displaystyle \frac{1}{1-x-x^2} = \frac{1}{1-(x+x^2)}.$

We’re going to look at the series expansion of this thing, which we get from the geometric series formula. For what it’s worth, I don’t care at all about convergence (it’s the summer break; my operator theory course doesn’t start until 2 weeks from now!), so just work completely formally.

$\displaystyle \frac{1}{1-x-x^2} = \sum_{n=0}^\infty (x+x^2)^n = \sum_{n=0}^\infty \sum_{k=0}^n \binom{n}{k} x^k (x^2)^{n-k} = \sum_{n=0}^\infty \sum_{k=0}^n \binom{n}{k} x^{2n-k}.$

Stare at this thing for a while and you notice that the coefficient $F_k$ of $x^k$ in the above is given by

$\displaystyle F_k = \sum_{n=0}^\infty \binom{n}{2n-k}$

which is (of course) actually a finite sum since we nonchalantly ditch any terms with $2n-k < 0$ or $2n - k > n$. The sequence $F_k$ is the Fibonacci sequence (okay, except for being offset by one from the list above: here $F_0 = F_1 = 1$). It is not hard to show that as $k \to \infty$, we have $F_{k+1}/F_k \to \varphi$. In fact these successive ratios are nothing more than the convergents of the continued fraction expansion of $\varphi$.
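If staring doesn't do it for you, the binomial sum is easy to check by machine. This is a small sketch with a hypothetical helper `series_coefficient`. It only sums over the $n$ for which the binomial coefficient is non-zero, namely $\lceil k/2 \rceil \leq n \leq k$.

```python
from math import comb

def series_coefficient(k: int) -> int:
    """Coefficient of x^k in 1/(1 - x - x^2), computed as sum_n C(n, 2n - k)."""
    # Only n with 0 <= 2n - k <= n contribute, i.e. ceil(k/2) <= n <= k.
    return sum(comb(n, 2 * n - k) for n in range((k + 1) // 2, k + 1))

coefficients = [series_coefficient(k) for k in range(12)]
print(coefficients)

# Ratios of successive coefficients approach the golden ratio.
phi = (1 + 5 ** 0.5) / 2
print(series_coefficient(41) / series_coefficient(40), phi)
```

The list starts $1, 1, 2, 3, 5, \ldots$, so each coefficient is the sum of the previous two, as it should be.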

One cool thing Wikipedia mentions is that you can tile the plane with a “spiral” of squares whose side lengths are given by the Fibonacci sequence:

In the next post I will discuss why all this number-theoretic information related to $\varphi$ (root of the polynomial $x^2-x-1$) shows up in the power series expansion of the reciprocal of the “reversed” polynomial $1-x-x^2$. I’ll also apply the same general procedure to $x^3 - x - 1$, which has the so-called “plastic constant” $\rho$ as its root. The sequence we’ll obtain is called the Padovan sequence, and you can perform a similar tiling of the plane using triangles of those side lengths. Once you look at $x^4 - x - 1$, the sequence you get from the series expansion is no longer nice and monotonic. This is odd, but in hindsight unsurprising since by analogy we would expect it to correspond to some kind of “tiling of the plane by a spiral of 2-sided polygons”, which is absurd.
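As a preview, here is a sketch of the "flip the coefficients and take the reciprocal" procedure for all three polynomials. The function `reciprocal_series` is my own name for it. It solves $q(x) s(x) = 1$ one coefficient at a time, assuming the constant term of $q$ is 1. The flipped polynomials are $1 - x - x^2$, $1 - x^2 - x^3$ and $1 - x^3 - x^4$.

```python
def reciprocal_series(denominator: list, terms: int) -> list:
    """Coefficients of 1/q(x), where q(x) = sum(denominator[i] * x**i) and denominator[0] == 1."""
    assert denominator[0] == 1
    coeffs = []
    for n in range(terms):
        # q(x) * s(x) = 1 forces s_n = [n == 0] - sum_{i >= 1} q_i * s_{n - i}.
        total = 1 if n == 0 else 0
        for i in range(1, min(n, len(denominator) - 1) + 1):
            total -= denominator[i] * coeffs[n - i]
        coeffs.append(total)
    return coeffs

fibonacci_like = reciprocal_series([1, -1, -1], 12)       # 1 - x - x^2
padovan_like = reciprocal_series([1, 0, -1, -1], 13)      # 1 - x^2 - x^3
quartic = reciprocal_series([1, 0, 0, -1, -1], 16)        # 1 - x^3 - x^4

print(fibonacci_like)
print(padovan_like)
print(quartic)
```

In the quartic case the early coefficients visibly go up and down (for example, a 6 followed by a 5), which is the non-monotonic behaviour mentioned above.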


## Serre’s affineness criterion

Since we’ve been discussing sheaf cohomology for the last few weeks of the algebraic geometry seminar, and I’m leaving Waterloo soon, I was thinking about possible topics for what will probably be the last seminar talk. I figured that having drudged through all this machinery, it would be nice to look at a cohomological characterisation of affine schemes: namely, the fact that a scheme $(X, \mathcal{O}_X)$ is affine if and only if all quasi-coherent sheaves $\mathscr{F}$ of $\mathcal{O}_X$-modules are acyclic, i.e. $H^i(X, \mathscr{F}) = 0$ for $i>0$. In this post I’ll go over the treatment in [Hartshorne III.3, “Cohomology of a Noetherian Affine Scheme”]. I will probably explain all this stuff more coherently in a video sometime down the road.

This is called Serre’s affineness criterion, and the key to the proof (or at least one direction of it) lies in the fact that if you start with an injective $A$-module $I$, and consider its associated sheaf of $\mathcal{O}_{\text{Spec } A}$-modules $\widetilde{I}$ (just defined by $X_f \mapsto I_f$), then in fact this is flasque. We saw before that flasque sheaves are acyclic for the global sections functor $\Gamma(X,-)$, so in particular we can use flasque resolutions to compute cohomology (this will be important later).

We also saw that injective sheaves are flasque, so one might be tempted to claim that the “key” we mentioned above is a mere triviality: indeed, why not just observe that (in view of the equivalence of categories) any injective $A$-module will give rise to an injective sheaf, and then finish? The problem with this argument is that the category of $A$-modules is equivalent to the category of quasicoherent sheaves, and not the full category of $\mathcal{O}_X$-modules. So yes, we will always have an injective of the former category, but we would need an injective of the latter category to conclude flasqueness — and in general this does not happen.

The starting point is a theorem of Krull from commutative algebra. The full statement concerns the $\mathfrak{a}$-adic topology on an $A$-module, and I don’t really know much (nor do I currently have time to read Atiyah-Macdonald) about completions. However, we only really need one containment:

Krull’s Theorem. Let $A$ be a Noetherian ring and $\mathfrak{a} \subseteq A$ be an ideal. If $M \subseteq N$ are finitely generated $A$-modules, then for any $n > 0$ there is $n' \geq n$ such that $\mathfrak{a}^n M \supseteq M \cap \mathfrak{a}^{n'} N$.

Now, define the following submodule of $M$:

$\Gamma_{\mathfrak{a}}(M) = \{ m \in M \mid \mathfrak{a}^n m = 0 \text{ for some } n > 0 \}.$

Before proceeding, let us mention a remark about injectives. We said an object $I$ of an abelian category $\mathfrak{A}$ was injective if the functor $\mathrm{Hom}(-,I)$ is exact. This (contravariant) functor is always left exact, so the important thing to take away is the following: “$I$ injective” means that if $M' \subseteq M$ is a submodule and $\varphi : M' \to I$ is a morphism, then $\varphi$ extends to a morphism $\widetilde{\varphi}: M \to I$.

Surprisingly, the above turns out to be equivalent to the following seemingly weaker condition (Baer’s criterion), namely: if $\mathfrak{b}$ is an ideal of $A$ and $\varphi : \mathfrak{b} \to I$ is a morphism, then $\varphi$ extends to a morphism $\widetilde{\varphi} : A \to I$. This equivalence is a basic result from commutative algebra.

This reminds me of a similar thing that came up when trying to formulate the universal property of the Stone-Čech compactification: in some sense the closed unit interval $[0,1]$ is a “good enough” representative of the class of all compact Hausdorff spaces (this is formalised in the fascinating notion of an injective cogenerator).

Lemma 1. Let $A$ be a Noetherian ring and let $\mathfrak{a} \subseteq A$ be an ideal. If $I$ is an injective $A$-module, then $J = \Gamma_{\mathfrak{a}}(I)$ is also an injective $A$-module.

To prove this, we only need to establish Baer’s criterion for $J$, and this is done by observing one can apply Krull’s theorem to the inclusion $\mathfrak{b} \subseteq A$, pulling back from $\mathfrak{b}/(\mathfrak{b} \cap \mathfrak{a}^{n'})$ to $A/\mathfrak{a}^{n'}$, and finally using the natural map $A \to A/\mathfrak{a}^{n'}$ to pull back to $A$ as required.

Lemma 2. Let $A$ be a Noetherian ring, and $I$ an injective $A$-module. Then for any $f \in A$, the natural map $I \to I_f$ to the localisation is surjective.

This lemma isn’t very difficult either. If $\mathfrak{b}_i$ is defined as the annihilator of $f^i$, then you get some ascending chain of ideals in $A$, but $A$ is Noetherian, so $\mathfrak{b}_r = \mathfrak{b}_{r+1} = \ldots$, yada yada. Then, letting $\theta : I \to I_f$ be the natural map, you take some $x \in I_f$, write $x = \theta(y)/f^n$ for some $y \in I$ and $n \geq 0$ (you can do this by definition of localisation), and define a map $\varphi : (f^{n+r}) \to I$ by sending $f^{n+r} \mapsto f^r y$ (this turns out to be fine since $(f^{n+r}) \cong A/{\text{Ann } (f^{n+r})}$ as $A$-modules, and $\text{Ann } f^{n+r} = \mathfrak{b}_{n+r} = \mathfrak{b}_r$). Lift $\varphi$ to a map $\psi : A \to I$ by injectivity of $I$, and then let $z = \psi(1)$. Then $\theta(z) = x$. Magic.

Proposition. If $I$ is an injective $A$-module, then $\widetilde{I}$ is a flasque sheaf of $\mathcal{O}_X$-modules, where $X = \text{Spec } A$.

To establish this, we use Noetherian induction on the support of the sheaf $\widetilde{I}$ (call it $Y$). The basic idea is, for some open $U \subseteq X$, to choose some $f$ and consider some open of the form $X_f \subseteq U$. Noting that $\Gamma(X_f,\widetilde{I}) = I_f$, we can invoke Lemma 2, and then the problem reduces to showing $\Gamma_Z(X,\widetilde{I}) \to \Gamma_Z(U,\widetilde{I})$ is surjective, where $Z = X \setminus X_f$. But this follows by induction (put $J = \Gamma_{(f)}(I) = \Gamma_Z(X,\widetilde{I})$ and note this is an injective $A$-module by Lemma 1, hence $\widetilde{J}$, whose support is strictly contained in $Y$, is flasque; at this point we win since $\Gamma(U,\widetilde{J}) = \Gamma_Z(U,\widetilde{I})$ for all opens $U$).

Theorem. Let $X = \text{Spec } A$ for some Noetherian ring $A$. Then if $i>0$, $H^i(X,\mathscr{F}) = 0$ for all quasicoherent sheaves $\mathscr{F}$ on $X$.

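To see this, let $M = \Gamma(X,\mathscr{F})$, so that $\mathscr{F} \cong \widetilde{M}$ because $\mathscr{F}$ is quasicoherent. Take an injective resolution of $M$ in the category of $A$-modules, and apply the exact functor $M \mapsto \widetilde{M}$ to get a flasque resolution of $\mathscr{F}$ (flasque by the proposition above). Applying the global sections functor, we just get back the original resolution, which is exact in positive degrees, so we’re done.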

Theorem (Serre). Let $X$ be a Noetherian scheme. Then TFAE:

1. $X$ is affine.
2. $H^i(X,\mathscr{F}) = 0$ for any quasicoherent sheaf $\mathscr{F}$ on $X$ and $i>0$.
3. $H^1(X,\mathscr{I}) = 0$ for any coherent sheaf of ideals $\mathscr{I}$ on $X$.

We’ve already shown that (1) => (2), and (2) => (3) is easy. (3) => (1) can be proved using the following characterisation of affineness: $X$ is affine if and only if there are $f_1, \ldots, f_r \in A := \Gamma(X,\mathcal{O}_X)$ such that $\langle f_1, \ldots, f_r \rangle = A$, each set $X_{f_i}$ is affine, and $X$ is covered by $X_{f_1}, \ldots, X_{f_r}$.

## Chapter 2: Quantum Monomials

by Wei Xi Fan

R: What is $\frac{d}{dx} x^n?$
L: $nx^{n-1}.$
R: What is the quantum analogue of $x^n?$
L: I have no clue.
R: Let’s try $i^n.$ This is the sequence $1^n, 2^n, 3^n, \ldots$
L: OK. $(\Delta i^n)_j = (j+1)^n - j^n$. I can’t get anywhere from here that would make it look remotely close to $nj^{n-1}$.
R: What do we do?
L: We call upon the fire of inspiration itself, Ignis.
IGNIS: The truth ye seek is but a cascading staircase, but one that ends.
L: What do you think that means?
R: Instead of trying $i^n = i* i * i * \ldots * i$, let’s try

$i^{[n]} := i * (i-1) * \ldots * (i-n+1)$

there are $n$ terms in total just like $i^n$, but here, each term is one less than the previous.
L: Like a factorial that suddenly ends. All right, so

\begin{aligned} (\Delta i^{[n]})_j &= (j+1)^{[n]} - j^{[n]} \\ &= (j+1) * j * (j-1) * \ldots * (j-n+2) - j * (j-1) * \ldots * (j-n+2) * (j-n+1) \end{aligned}

Oh! The factors line up. The subtraction yields

$j * (j-1) * \ldots * (j-n+2) * [j+1-(j-n+1)] = n * j^{[n-1]}.$

This is exactly what we are looking for.
R: Great! So we have the identity

$(\Delta i^{[n]})_j = n * j^{[n-1]} \qquad (*)$

corresponding to

$\displaystyle \frac{d}{dx} x^n = nx^{n-1}$

in the real calculus.
L: OK. What about when $n=0$ or when $n < 0$?
R: Well… what do we do to get from $x^{[n+1]}$ to $x^{[n]}?$
L: Well, we divide $x^{[n+1]}$ by its last term, which is $(x-n)$, to get to $x^{[n]}$, because the only difference between them is that $x^{[n]}$ is missing the $(x-n)$ term.
R: What do we get when we go from $x^{[1]}$ to $x^{[0]}?$
L: Well, $x^{[1]} = x$, and so we divide this by $(x-0)$ to get to $1$, so I guess we should define $x^{[0]} := 1.$
R: What about $x^{[-1]}?$
L: Well, we would divide this by $(x+1),$ so I guess it is $1/(x+1).$
R: So what’s the general formula?
L:

$x^{[n]} = \begin{cases} x(x-1)(x-2) \ldots (x-n+1) & \text{if } n > 0, \\ 1 & \text{if } n = 0, \\ \dfrac{1}{(x+1)(x+2)\ldots(x+|n|)} & \text{if } n < 0. \end{cases}$

Are you sure the identity (*) still holds for these new definitions?
R: For these situations, we should call upon the minion of non-illuminating bashing, Grunt.
GRUNT: ’Tis done.
R: There, this fact has been verified.
L: I shall call these Pochhammer symbols.
R: WTF?
L: Or we can call them falling factorials. Or falling powers.
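
For readers who would rather not take Grunt's word for it, here is a minimal sketch of the bashing in Python. The names `falling_power` and `delta` are hypothetical. Exact rational arithmetic is used so that the negative exponents can be compared without rounding, and the check is restricted to $j \geq 0$ so that no denominator vanishes.

```python
from fractions import Fraction

def falling_power(x: int, n: int) -> Fraction:
    """x^[n] as defined in the dialogue, for integer n of either sign."""
    result = Fraction(1)
    if n >= 0:
        for i in range(n):
            result *= (x - i)          # x (x-1) ... (x-n+1)
    else:
        for i in range(1, -n + 1):
            result /= (x + i)          # 1 / ((x+1) ... (x+|n|)), needs x + i != 0
    return result

def delta(f, j: int) -> Fraction:
    """Forward difference (Delta f)_j = f(j+1) - f(j)."""
    return f(j + 1) - f(j)

# Check (Delta i^[n])_j = n * j^[n-1] for positive, zero and negative n.
for n in range(-4, 6):
    for j in range(0, 10):
        lhs = delta(lambda i: falling_power(i, n), j)
        rhs = n * falling_power(j, n - 1)
        assert lhs == rhs, (n, j)
print("identity (*) holds on the tested range")
```

This is of course only a finite check, not a proof, but it does cover the cases $n = 0$ and $n < 0$ that the new definitions were made for.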


## Mathematics as a social science

Another “soft post”. It has been a pretty long time

I’m not really sure how to get started here, so let me just start with an observation. As I’ve mentioned many times in my posts, I tend to be very nocturnal; although I sometimes manage to “normalize” my sleep schedule, it never lasts for more than a week. For example, it is currently 6:00 in the morning. When I have classes to attend, or other daytime responsibilities, I usually compensate by sleeping twice a day for half as long.

Fortunately, my plight is shared by a fair number of mathematics students (although admittedly it seems to be shared by even more computer science students). The result is that I spend a lot of time talking to those people (mostly undergrads in lower years) about mathematics. You all know who you are.

What I find ironic is that lower-year students seem to assume they’re wasting my time, or that I will find their thoughts and questions boring. This couldn’t be further from the truth; in fact I’ve found all such conversations quite intriguing and useful. I feel like I probably learn as much from these people as they do from me. For example, I kind of understand the implicit function theorem now! And it’s been almost 3 years since I took Calculus 3. Anyone who knows me will agree that unless I’m honestly rushing to finish some work, I’m always happy to think about a problem, or explain some concept.

The other thing I want to mention is the burden of correcting others when they make a false statement. I choose the word “burden” carefully, because it is usually a feat (especially for people like me, whose body language tends to be pretty arbitrary) to execute this favour without rendering the atmosphere uneasy, or provoking a defensive reaction. The smallest difference in the tone of your voice could make the difference between the person laughing and thanking you, or becoming irritated as if you had stood up and straight out called them an imbecile.

Alas, such is the untrained human ego. Over the years, I’ve grown completely accustomed to being flat out wrong. It’s unavoidable. Mathematics is done with such acute precision, and such calculated deliberation, that recalling everything correctly all the time is practically impossible, and trust me, there are diminishing returns on trying to achieve that kind of perfection. Often I’ll state a flawed version of something I remember reading in a book, or miss an edge case in a definition. The whole point is that there are usually enough people in the room to tell me I’m wrong.

If you were the only person left in the world, doing mathematics would be utterly pointless. As they say: “a proof in a vacuum is no proof at all.” Others are rendering you an enormous service by correcting your mistakes. Whenever the primal ego-goose inside you starts ruffling its feathers, just give it a swift slap of rationality, and remind yourself of this fact. The alternative is that people remain silent and thereby leave you to spoil in the cesspool of your own ignorance for that much longer. Is that really desirable?

One thing you need to understand about doing mathematics is that you are working in a veritable exosphere of abstraction. You are standing on a magnificent, skillfully crafted mountain, and should you ever choose to trek down to its base, you will discover that this mountain was merely rooted on yet another mountain, double its size, and so on and so forth. It’s way too difficult to remember how to prove every single fact on the “dependency chain” to where you are now. That’s like remembering each and every stone you saw on your way up the mountain. Try to remember as much as you can, but don’t forget there are plenty of pretty yellow photo albums of all those stones anyway, should you ever forget one of them.

Okay, I think that’s all I wanted to say. I’ll try to type Chapter 2 of Wei Xi’s quantum calculus saga soon. Until next time, cheers.

Also, it looks like representation theory got slightly nerfed due to the fact that we now only have one course for both group and ring theory: the material on the Sylow theorems and semidirect products has been moved to rep theory instead of in a group theory course where it actually belongs (and where I saw it). Sigh… less time for actual rep theory. The old system was better; upper-year courses are being watered down more and more.


## Chapter 1: Fundamental Theorem of Quantum Calculus

by Wei Xi Fan

Let us take a detour into the world of quantum calculus. <insert weird chromatic 8-bit beeps>

L: It is rather embarrassing that sums are not additive, isn’t it?
R: Hmm…you mean for $a < b < c,$

$\displaystyle \sum_{i=a}^c x_i \neq \sum_{i=a}^b x_i + \sum_{i=b}^c x_i? \qquad \text{(*)}$

L: Yes, precisely. Let’s fix it.
R: OK. How about for $a < b,$ let’s use

$\displaystyle \sum_{i=a}^b x_i := x_a + x_{a+1} + \ldots + x_{b-1}?$

L: Yes, that works! After this change, we will now have a sum that is additive. To wit, in (*) we now have in the left-hand side terms $x_a + \ldots + x_{c-1},$ and on the right-hand side the last term of the first sum is $x_{b-1},$ while the first term of the second sum is $x_b,$ thus matching up perfectly.
R: What about when $a=b$ or $a > b?$
L: Well, to make additivity work, we will have to define

$\displaystyle \sum_{i=a}^a x_i = 0$

when $a=b,$ and

$\displaystyle \sum_{i=a}^b x_i = -\sum_{i=b}^a x_i$

when $a > b.$
R: Does additivity still hold with these definitions?
L: Why, yes, this is why we made such definitions in the first place.
R: The fundamental theorem of calculus is nice.
L: Why did you bring that up?
R: Well, it’s nice. It says integration and differentiation are inverse operations.
L: Hmm. Wouldn’t it be nice if there is somehow an inverse operation to summation?
R: Suppose we had a sequence of numbers $x_1, x_2, \ldots$ and we transformed it into a new sequence $y_1, y_2, \ldots,$ where $y_k = \sum_{i=1}^k x_i.$ (Using our new notation, of course.) What can we do to recover the original sequence $x_1, x_2, \ldots?$
L: Well, we can take the successive differences:

$y_2 - y_1, y_3 - y_2, y_4 - y_3, \ldots$

R: Which is?
L: Let’s see…

\begin{aligned} y_2 - y_1 & = x_1 - 0 = x_1; \\ y_3 - y_2 & = (x_1 + x_2) - (x_1) = x_2; \\ y_4 - y_3 & = (x_1 + x_2 + x_3) - (x_1 + x_2) = x_3. \end{aligned}

Oh! This reminds me of the fundamental theorem of calculus. The analogy is that our sequence $x_1, x_2, \ldots$ is like a function $f(x)$ in calculus. When we transform it to a new, accumulative sequence $x_1, x_1 + x_2, x_1 + x_2 + x_3, \ldots,$ this corresponds to us transforming $f(x)$ into another, accumulative function $F(x) = \int_1^x f(t) \; dt.$
R: Exactly. The accumulation occurs discretely for our sequence, while the accumulation is infinitesimal for our function (a.k.a. integration).
L: But what does taking successive difference correspond to?
R: Hint: discrete versus infinitesimal.
L: I understand now. Taking successive differences corresponds to differentiation: given a sequence $(y_i),$ we can make a new sequence $y_2 - y_1, y_3 - y_2, \ldots,$ corresponding to making a new function $g'(x)$ by differentiating some function $g(x)$ at each point.
R: So does summing and then differencing cancel each other out?
L: Well let’s see. Let’s start with

$x_1, x_2, \ldots$

Let’s sum it.

$0, x_1, x_1 + x_2, x_1 + x_2 + x_3, \ldots$

Let’s difference it.

$x_1, x_2, \ldots$

We got back our original sequence. This tells us that differencing is the opposite of summation. Wait a minute… that’s what we did a minute ago.
R: If we really waited a minute, this is a tautology. What happens if we difference first, and then sum?
L: Well, let’s start with $x_1, x_2, \ldots$ again and first difference it:

$x_2 - x_1, x_3 - x_2, x_4 - x_3, x_5 - x_4, \ldots$

Let’s sum it. Watch this telescoping:

$0, x_2 - x_1, x_3 - x_1, x_4 - x_1, x_5 - x_1, \ldots$

Hmm…we got our original sequence, except each term had $x_1$ subtracted away from it.
R: What does this remind you of?
L: $\int_a^b f'(t) \; dt = f(b) - f(a).$
R: That’s right: the second fundamental theorem of calculus.
L: Great! We should give the operation of finite difference a symbol so we can write down our fundamental theorems in a succinct form.
R: Given a sequence $x = (x_i)_{i \in \mathbb{Z}},$ let us define a new sequence $\Delta x$ by $(\Delta x)_i = x_{i+1} - x_i.$
L: $\mathbb{Z}?$ Our sequences are double-ended now?
R: Why not? Such sequences are just functions from $\mathbb{Z}$ into $\mathbb{R}$ anyway.
L: Oh. I guess all of the above remains unchanged anyway, so this is fine. Using $i=1$ as a starting point is arbitrary anyway.
R: What are our fundamental theorems?
L:

1. $\displaystyle \left(\Delta \sum_{i=a}^n x_i \right)_j = x_j,$
2. $\displaystyle \sum_{i=a}^b (\Delta x)_i = x_b - x_a,$

corresponding to

1. $\displaystyle \frac{d}{dx} \int_a^x f(t) \; dt = f(x),$
2. $\displaystyle \int_a^b f'(x) \; dx = f(b) - f(a).$
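
The two discrete fundamental theorems can be spot-checked with a short sketch. The names `half_open_sum` and `delta` and the test sequence `x` are my own choices for illustration. The sum uses the half-open convention from the dialogue and is extended to $a \geq b$ exactly as L proposes.

```python
def half_open_sum(x, a: int, b: int) -> int:
    """Sum of x(i) for a <= i < b, extended to a >= b so that additivity holds."""
    if a <= b:
        return sum(x(i) for i in range(a, b))
    return -half_open_sum(x, b, a)

def delta(x):
    """The sequence Delta x, with (Delta x)_i = x_{i+1} - x_i."""
    return lambda i: x(i + 1) - x(i)

def x(i: int) -> int:
    """An arbitrary double-ended test sequence."""
    return i * i - 3 * i + 2

points = range(-4, 5)

# Additivity for every ordering of a, b, c.
assert all(half_open_sum(x, a, c) == half_open_sum(x, a, b) + half_open_sum(x, b, c)
           for a in points for b in points for c in points)

# Theorem 1: differencing the running sum recovers the sequence.
assert all(delta(lambda n: half_open_sum(x, a, n))(j) == x(j)
           for a in points for j in points)

# Theorem 2: summing the differences telescopes to x_b - x_a.
assert all(half_open_sum(delta(x), a, b) == x(b) - x(a)
           for a in points for b in points)

print("both fundamental theorems hold on the tested range")
```

Note that additivity is checked for all orderings of $a, b, c$, not just $a < b < c$, which is the whole reason for the sign convention when $a > b$.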

R: Let us call this shadow of the real calculus some kind of umbral calculus.
L: Why not quantum calculus? It is discrete, after all.
R: Why not finite calculus then?

Posted in articles | 1 Comment

On multiple occasions, people I know have given me their own lucid expositions of some topic within mathematics, but lack the motivation to write their ideas up properly anywhere. As an expression of my gratitude, and simply because I want to share their explanations with others, I’ve decided to start writing up these “lost stories”. Stay tuned for the first of these, which was motivated by the following problem, posed to me yesterday by Kamal Rai: prove that

$\displaystyle \sum_{k=1}^\infty \frac{1}{k(k+1) \cdots (k+n)} = \frac{1}{n \cdot (n!)}.$
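
Before attempting a proof, it is reassuring to check the claim with exact arithmetic. The sketch below uses hypothetical helpers `term`, `partial_sum` and `telescoped`. The closed form for the partial sum, $\frac{1}{n}\left(\frac{1}{n!} - \frac{1}{(K+1)\cdots(K+n)}\right)$, is not stated in the problem; it is what one gets by telescoping, and the code simply confirms it on a small range.

```python
from fractions import Fraction
from math import factorial, prod

def term(k: int, n: int) -> Fraction:
    """1 / (k (k+1) ... (k+n))."""
    return Fraction(1, prod(range(k, k + n + 1)))

def partial_sum(n: int, K: int) -> Fraction:
    """Sum of term(k, n) for k = 1, ..., K."""
    return sum((term(k, n) for k in range(1, K + 1)), Fraction(0))

def telescoped(n: int, K: int) -> Fraction:
    """Conjectured closed form of the partial sum (an assumption checked below)."""
    tail = Fraction(1, prod(range(K + 1, K + n + 1)))
    return (Fraction(1, factorial(n)) - tail) / n

for n in range(1, 5):
    limit = Fraction(1, n * factorial(n))
    for K in range(1, 31):
        assert partial_sum(n, K) == telescoped(n, K)
    print(n, float(partial_sum(n, 30)), float(limit))
```

Since the tail term goes to zero as $K \to \infty$, the partial sums approach $\frac{1}{n \cdot n!}$, in agreement with the stated identity.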


## Random thoughts on pushforwards/pullbacks

I’ve been trying to deepen my knowledge about adjunctions for the past few hours, and somehow got thrown off on a tangent thinking about notation. This post is the result; hopefully it’s somewhat coherent. I was very tempted to post it on triple involution, but for some reason I ended up posting it here anyways.

One thing we tend to do a lot in mathematics is use mappings between objects to “transport” structures from one to the other. I’m sure there are plenty of really elementary examples, but the first interesting one that occurs to me is in differential geometry. Another example is given by direct image (and inverse image) sheaves.

Given a smooth map $f : M \to N$ between manifolds, and a point $p \in M$, one can “push forward” vectors tangent to $M$ at $p$, and obtain vectors tangent to $N$ at $f(p)$. To be more precise, if $T_p M$ denotes the tangent space of the manifold $M$ at $p$, then we are saying our smooth map $f : M \to N$ induces a (linear) map $T_p M \to T_{f(p)} N$, known as the pushforward of $f$ at $p$. I usually saw this map denoted $(f_*)_p$, read “$f$-lower star-$p$”.

A decent definition of the tangent space $T_p M$ is that it’s the vector space of all linear derivations at $p$, that is, linear maps $X_p : \mathcal{C}^\infty(M) \to \mathbb{R}$ such that

$X_p(gh) = g(p) X_p(h) + X_p(g) h(p), \qquad \forall g, h \in \mathcal{C}^\infty(M).$

When we define the tangent space this way, the pushforward map is given by

$((f_*)_p(X_p))(g) = X_p(g \circ f), \qquad \forall g \in \mathcal{C}^\infty(N).$

That is, we have produced a derivation of $\mathcal{C}^\infty(N)$ at $f(p)$, call it $Y_{f(p)}$, that acts on functions $g \in \mathcal{C}^\infty(N)$ as $Y_{f(p)}(g) = X_p(g \circ f)$. The way it behaves is pretty simple; it takes $g$ (which, being in $\mathcal{C}^\infty(N)$, is nothing but a smooth function $N \to \mathbb{R}$), precomposes it with $f : M \to N$ thereby giving a smooth function $g \circ f : M \to \mathbb{R}$, and then feeds this thing into the derivation $X_p$ we started with. The proof that $(f_*)_p(X_p) =: Y_{f(p)}$ actually is an element of $T_{f(p)} N$ (that is, actually is a linear derivation at $f(p)$) is trivial.

With this example in mind, suppose now that we’re in some category $\mathcal{C}$. Let’s think about the thingie that takes a pair of objects $c, d$ in $\mathcal{C}$, and gives us the set of all morphisms $c \to d$. We usually denote this set, say, by $\mathrm{Hom}(c,d)$ or maybe $\mathcal{C}(c,d)$ to emphasize which category we’re working in.

Now suppose we fix $d$ and we consider $\mathrm{Hom}$ just in its remaining one variable: that is, look at $\mathrm{Hom}(-, d)$. If we have two objects $c, c'$ of $\mathcal{C}$, and a morphism, say $f : c' \to c$, then we actually get a function $\mathrm{Hom}(c,d) \to \mathrm{Hom}(c',d)$ given by $g \mapsto g \circ f$, that is, by precomposing with $f$ to “pull back to $c'$”. The obvious notation for this map, since $\mathrm{Hom}(-,d)$ is a (contravariant) functor, is $\mathrm{Hom}(f,d)$. Mac Lane (in Categories for the Working Mathematician) however, often opts to call this map $f^*$ instead (which is consistent with the notation from differential geometry, where $f^*$ is the pullback of differential forms).

If we fix the first component instead, that is, fix $c$ and consider $\mathrm{Hom}(c,-)$, it seems natural to use a “lower star”. That is, given a map $f : d \to d'$, we get a map $f_* : \mathrm{Hom}(c,d) \to \mathrm{Hom}(c,d')$ by “post-composition”, which (I guess you could say) pushes us forward to $d'$. That is, $g \mapsto f \circ g$.

Let’s go back to our example. What is $(f_*)_p$ actually doing? It’s sending $X_p$ to $X_p \circ f^*$, where $f^*$ is the thing that sends $g \in \mathcal{C}^\infty(N)$ to $g \circ f \in \mathcal{C}^\infty(M)$. So, $(f_*)_p$ is just “precomposition by $f^*$”. That being said it almost seems like we should denote the pushforward by $(f^*)^*$ or something… -_- maybe I just screwed something up.

In this sense, I really can’t seem to justify writing $(f_*)_p$. I mean, yes, you’re pushing tangent vectors forward, but set-theoretically, you’re pulling back along a pullback map. I don’t know. Notation sucks.
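
For what it's worth, the observation that $(f_*)_p$ is "precomposition by $f^*$" can be played with in a toy setting. In the sketch below everything is an assumption made for illustration: $M = N = \mathbb{R}$, the map is $f(x) = x^2$, and a tangent vector at $p$ is approximated by a central difference quotient (the names `pullback`, `pushforward` and `derivation_at` are hypothetical). The pushforward is then literally `pullback(pullback(f))`, and the chain rule predicts what it should do.

```python
import math

def pullback(f):
    """f^*: send g to g . f (precomposition with f)."""
    return lambda g: (lambda x: g(f(x)))

def pushforward(f):
    """f_*: send g to f . g (postcomposition with f)."""
    return lambda g: (lambda x: f(g(x)))

def derivation_at(p: float, h: float = 1e-5):
    """Toy tangent vector on the real line: X_p(g) approximates g'(p)."""
    return lambda g: (g(p + h) - g(p - h)) / (2 * h)

def f(x: float) -> float:
    """A smooth map f : R -> R, chosen arbitrarily."""
    return x * x

p = 3.0
X_p = derivation_at(p)

# (f_*)_p X_p = X_p . f^*, i.e. X_p precomposed with the pullback f^*.
Y_fp = pullback(pullback(f))(X_p)

value = Y_fp(math.sin)                   # X_p(sin . f)
expected = math.cos(f(p)) * 2 * p        # chain rule: cos(f(p)) * f'(p)
print(value, expected)
```

So in code, at least, the tangent-vector pushforward really is a pullback of a pullback, while `pushforward` in the $\mathrm{Hom}(c,-)$ sense is plain postcomposition.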