Given a function $f: \mathbb{R}^+ \rightarrow \mathbb{C}$ satisfying suitable conditions (exponential decay at infinity, continuity, and bounded variation are good enough), its Mellin transform is defined by the function

$$F(s) = \int_0^{\infty} f(y) y^s \frac{dy}{y},$$

and $f(y)$ can be recovered by the Mellin inversion formula:

$$f(y) = \frac{1}{2 \pi i} \int_{\sigma - i \infty}^{\sigma + i \infty} y^{-s} F(s) \, ds.$$

This is a change of variable from the Fourier inversion formula, or the Laplace inversion formula, and can be proved in the same way. This is used all the time in analytic number theory (as well as many other subjects, I understand) -- for example, if $f(y)$ is the characteristic function of $[0, 1]$ then its Mellin transform is $1/s$, and one recovers the fact (Perron's formula) that

$$\frac{1}{2 \pi i} \int_{\sigma - i \infty}^{\sigma + i \infty} n^{-s} \frac{ds}{s}$$

is equal to 1 if $0 < n < 1$, and is 0 if $n > 1$. (Note that there are technical issues which I am glossing over; one integrates over any vertical line with $\sigma > 0$, and the integral is equal to $1/2$ if $n = 1$.)

I use these formulas frequently, but... I find myself having to look them up repeatedly, and I'd like to understand them more intuitively. Perron's formula can be proved using Cauchy's residue formula (shift the contour to $- \infty$ or $+ \infty$ depending on whether $n > 1$), but this proof doesn't prove the general Mellin inversion formula.
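
As a numerical sanity check on Perron's formula, here is a minimal sketch that truncates the vertical-line integral $\frac{1}{2\pi i}\int_{\sigma - i\infty}^{\sigma + i\infty} n^{-s} \frac{ds}{s}$ at height $T$ and applies the trapezoid rule. The function name `perron_integral`, the choice $\sigma = 1$, and the truncation parameters are assumptions made for illustration. The integral converges only conditionally, so the truncated values are approximate.

```python
import cmath
import math

def perron_integral(n, sigma=1.0, T=500.0, steps=100_000):
    """Approximate (1/(2*pi*i)) * integral of n**(-s)/s ds over Re(s)=sigma, |Im(s)|<=T."""
    # With s = sigma + i*t we have ds = i dt, so the prefactor becomes 1/(2*pi).
    # The integrand at -t is the complex conjugate of the integrand at t, so the
    # symmetric integral equals (1/pi) * integral over [0, T] of the real part.
    h = T / steps
    log_n = math.log(n)
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        s = complex(sigma, t)
        value = (cmath.exp(-s * log_n) / s).real
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid rule end weights
        total += weight * value
    return total * h / math.pi

for n in (0.5, 1.0, 2.0):
    print(n, perron_integral(n))
```

The three printed values should come out close to 1, 1/2, and 0 respectively, with a truncation error that shrinks roughly like $1/T$.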

My question is:

What do the Mellin transform and the inversion formula mean? Morally, why are they true?

For example, why is the Mellin transform an integral over the positive reals, while the inverse transform is an integral over the complex plane?

I found some resources -- Wikipedia; this MO question is closely related, and the first video in particular is nice; and a proof is outlined in Iwaniec and Kowalski -- but I feel that there should be a more intuitive explanation than any I have come up with so far.

You could also take a look at section 5.1 of Montgomery and Vaughan's Multiplicative Number Theory. It has several specific examples of Mellin transform pairs, together with brief remarks on how the formulas are proved. I believe that the basic method really is the same as Perron's formula: if $y$ is small then you drag the contour integral to the right and get a contribution of $0$, while if $y$ is large then you drag the contour to the left and pick up the residue of the integrand at $s=0$.
– Greg Martin, Nov 2 '11 at 23:15

Shouldn't this be something like Pontryagin duality for the multiplicative group of positive reals (maybe with some analytic continuation thrown in?).
– Aaron Bergman, Nov 3 '11 at 0:06

I agree with Greg here that Montgomery and Vaughan is a good pointer on why Mellin inversion works. But personally, the only justification I need is changing variables so that it becomes the same as Laplace inversion. This change of variables shows why one integral is over the positive reals while the other is a line integral in the complex plane. Of course, this all depends on how willing you are to believe why Laplace inversion is morally true, or Fourier inversion, for that matter.
– Peter Humphries, Nov 3 '11 at 0:50

I think of the Mellin transform as the equivalent gadget to the Fourier transform, but for the group of positive reals instead of the real line. The change of variables going from one transform to the other is either an exponential or log (depending on which way it's going) which corresponds to the usual isomorphism between the two groups. The function $t \rightarrow t^s$ is a character on the positive reals which is analogous to the function $x \rightarrow e^{2 \pi i x y}$ which is a character on the real line. The Mellin transform is natural for studying multiplicative functions.
– Matt Young, Nov 3 '11 at 1:33

6 Answers

[Some next-day edits in response to comments] As counterpoint to other viewpoints, one can say that Mellin inversion is "simply" Fourier inversion in other coordinates. Depending on one's temperament, this "other coordinates" thing ranges from irrelevancy to substance... The question about moral imperatives for Fourier inversion is addressed a bit below.

[Added: the exponential map $x\rightarrow e^x$ gives an isomorphism of the topological group of additive reals to multiplicative. Thus, the harmonic analysis on the two is necessarily "the same", even if the formulaic aspects look different. The occasional treatment of values (and derivatives) at $0$ for functions on the positive reals, as in "Laplace transforms", is a relative detail, which certainly has a corresponding discussion for Fourier transforms.]

The specific riff in Perron's identity in analytic number theory amounts to (if one tolerates a change-of-coordinates) guessing/discerning an $L^1$ function on the line whose Fourier transform is (in what function space?!) the characteristic function of a half-line.

Since the characteristic function of a half-line is not in $L^2$, and does not go to 0 at infinity, there are bound to be analytical issues... but these are technical, not conceptual.

[Added: the Fourier transform families $x^{\alpha-1}e^{-x}\cdot \chi_{x>0}$ and $(1+ix)^{-\alpha}$, (up to constants) where $\chi$ is the characteristic function, when translated to multiplicative coordinates, give one family approaching the desired "cut-off" effect of the Perron integral. There are other useful families, as well.]

To my taste, the delicacies/failures/technicalities of not-quite-easily-legal aspects of Fourier transforms are mostly crushed by simple ideas about Sobolev spaces and Schwartz' distributions... tho' these do not change the underlying realities. They only relieve us of some of the burden of misguided fussiness of some self-appointed guardians of a misunderstanding of the Cauchy-Weierstrass tradition.

[Added: surely such remarks will strike some readers as inappropriate poesy... but it is easy to be more blunt, if desired. Namely, in various common contexts there is a pointless, disproportionate emphasis on "rigor". Often, elementary analysis is the whipping-boy for this impulse, but also one can see elementary number theory made senselessly difficult in a similar fashion. Supposedly, the audience is being made aware of a "need/imperative" for care about delicate details. However, in practice, one can find oneself in the role of the Dilbertian "Mordac the Preventer (of information services)" [see wiki] proving things like the intermediate value theorem to calculus students: it is obviously true, first, or else one's meaning of "continuous" or "real numbers" needs adjustment; nevertheless, the traditional story is that this intuition must be delegitimized, and then a highly stylized substitute put in its place. What was the gain? Yes, something foundational, but time has passed, and we have only barely recovered, at some expense, what was obviously true at the outset.

On another hand, Bochner's irritation with "distributions theory" was that it was already clear to him that things worked like this, and he could already answer all the questions about generalized functions... so why be impressed with Schwartz' "mechanizing" it? For me, the answer is that Schwartz arranged a situation so that "any idiot" could use generalized functions, whereas previously it was an "art". Yes, sorta took the fun out of it... but maybe practical needs over-rule preservation of secret-society clubbiness?]

Why should there be Fourier inversion? (for example...) Well, we can say we want such a thing, because it diagonalizes the operator $d/dx$ on the line (and more complicated things can be said in more complicated situations).

Among other things, this renders "engineering math" possible... That is, one can understand and justify the almost-too-good-to-be-true ideas that seem "necessary" in applied situations... where I can't help but add "like modern number theory". :)

[Added: being somewhat an auto-didact, I was not aware until relatively late that "proof" was absolutely sacrosanct. To the point of fetishism? In fact, we do seem to collectively value insightful conjecture and not-quite-justifiable heuristics, and interesting unresolved ideas offer more chances for engagement than do settled, ironclad, finished discussions. For that matter, the moments that one intuits "the truth", and then begins looking for reasons, are arguably more memorable, more fun, than the moments at which one has dotted i's and crossed t's in the proof of a not-particularly-interesting lemma whose truth was fairly obvious all along. More ominous is the point that sometimes we can see that something is true and works despite being unable to "justify" it. Heaviside's work is an instance. Transatlantic telegraph worked fine despite...]

In other words: spectral decomposition and synthesis. Who couldn't love it?!

[Added: and what recourse do we have than to hope that reasonable operators are diagonalizable, etc? Serre and Grothendieck (and Weil) knew for years that the Lefschetz fixed-point theorem should have an incarnation that would express zeta functions of varieties in terms of cohomology, before being able to make sense of this. Ngo (Loeser, Cluckers, et alia)'s proof of the fundamental lemma in the number field case via model theoretic transfer from the function field case is not something I'd want to have to "justify" to negativists!]

@Mariano S-A... Hahaha! Your "criticism" does gratify me, of course, ... tho' with a small worry in the back of my mind. :) As you/one may imagine, I was sincere. At the same time, yes, the school-math mind-set does prohibit having opinions/tastes. My real worry is that beginners accidentally alienate potential employers by being "too honest". That is, it is often the case that conformity is the implicitly-valued trait, despite other things being the advertised desiderata. I am glad (!?) I did not understand the state of things when I was younger... it would have been ... impossible. (Thx)
– paul garrett, Nov 4 '11 at 1:23

I can't even tell what most of your answer and essentially all of your comment have to do with the question, really. «They only relieve us of some of the burden of misguided fussiness of some self-appointed guardians of a misunderstanding of the Cauchy-Weierstrass tradition.» is a circumlocution for a demeaning comment at who/what, exactly? I have absolutely no problem with opinions or tastes -- despite my school-math mind-set and my characteristic conformity, which you very perspicaciously managed to put to the fore, even I have been known to have a couple of both -- but...
– Mariano Suárez-Alvarez, Nov 4 '11 at 1:45

@paul garrett: I would be interested to read a write-up of your views on putting the "Fourier" back in "Fourier analysis"; but I don't think the cramped comment boxes of MathOverflow are an optimal place :) Certainly these boxes aren't a good place for discussion.
– Yemon Choi, Nov 4 '11 at 3:32

@daniel litt: I am not opposed to rigor itself, but to (what seems to me) a disproportionate interest in prohibition, rather than facilitation. A caricature of this is a scenario in which one proves that there is no Dirac delta "function", rather than the more useful building-up of a situation in which it is perfectly legitimate. Related to this (example and notion) is/are provocative examples (e.g., the first volume of Gelfand-et-alia on Generalized Functions, which I accidentally encountered before having any general understanding of distributions) compelling work to legitimize it.
– paul garrett, Nov 14 '11 at 14:25

As others have pointed out, the Mellin inversion theorem is just the Fourier inversion theorem in disguise for the particular group ${\mathbb R_+}$ with invariant measure $\frac{dx}{x}$. The goal of the Fourier transform is to express a general function as a linear combination (i.e. integral) of the characters of the group, so that in this basis the operations of translation and all commuting operations will be diagonalized. For ${\mathbb R_+}$, these characters look like $x \mapsto x^{-s}$ (the minus sign because of the normalization you chose in the question), and they are unitary (take values in the circle) for imaginary $s$ -- the operation of multiplying characters is just addition in the $s$ variable, so in the inversion formula you have the measure $ds$. There's also this funny thing about how there are $s$ with positive real part -- this is because in the "physical space" ${\mathbb R_+}$ you're always talking about distributions which are compactly supported away from $0$ when you use this transform. Let's ignore that.

Since Mellin inversion is a disguised Fourier inversion, the real question is: why is the Fourier inversion formula on ${\mathbb R}$ true? To me the most convincing answer is the following: we can decompose a general function $f(x) = \int f(y) \delta(x-y) dy$ (this is the definition of $\delta$ but you have to take approximate delta-functions to make this rigorously work like a decomposition), so if we want to express a general function as a combination of the characters $x \mapsto e^{2 \pi i \xi x}$, it suffices to consider the $\delta$ function

$\delta(x) = \int u(\xi) e^{2 \pi i \xi x} d\xi $

One interpretation of this formal idea is that the distributions $\delta(x-y)$ are just like your usual standard basis functions.

Now, observe that because $\delta(x)$ is invariant under multiplication by $e^{2 \pi i \eta x}$ for any $\eta$, the distribution $u(\xi)$ is translation invariant, and therefore must be constant. After you find the constant, plugging in $\delta(x) = C \int e^{2 \pi i \xi x} d\xi$ into $f(x) = \int f(y) \delta(x-y) dy$ gives the Fourier inversion formula. Complete, rigorous proofs all follow more or less these lines, but there are many flavors of how you like to phrase it. Of course, we can write the whole argument with multiplicative characters as well.

Edit: The above argument assumes uniqueness of the representation, but one can also remark that if there is even a single function $f(x)$ for which $\int f(x) dx \neq 0$ and which can be realized as a linear combination $\int \hat{f}(\xi) e^{2 \pi i \xi \cdot x} d\xi$, then by rescaling, renormalizing and taking a limit, we obtain $\delta(x) = C \lim_{\epsilon \to 0} \epsilon^{-1} f(x/\epsilon)$, leading formally to the formula $\delta(x) = C \int e^{2 \pi i \xi \cdot x} d\xi$. One common rigorous execution of this philosophy is performed by taking $f$ to be a Gaussian.
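
To see the approximate-delta idea concretely, here is a small sketch, assuming the Gaussian $f(x) = e^{-\pi x^2}$ (which has $\int f = 1$, so $C = 1$ in this normalization). It smears a test function against $\epsilon^{-1} f((x-y)/\epsilon)$ and checks that the result approaches the value of the test function at $x$ as $\epsilon \to 0$. The name `gaussian_smear` and the quadrature parameters are illustrative assumptions, not part of the argument above.

```python
import math

def gaussian_smear(phi, x, eps, half_width=8.0, steps=4000):
    """Approximate the integral of phi(y) * (1/eps) * exp(-pi*((x - y)/eps)**2) dy."""
    # Substituting y = x + eps*u turns the kernel into exp(-pi*u**2) du,
    # which is negligible for |u| > half_width.
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        u = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid rule end weights
        total += weight * phi(x + eps * u) * math.exp(-math.pi * u * u)
    return total * h

x = 0.3
for eps in (1.0, 0.1, 0.01):
    print(eps, gaussian_smear(math.cos, x, eps), math.cos(x))
```

For $\phi = \cos$ the smeared value can be computed in closed form as $\cos(x)\, e^{-\epsilon^2/(4\pi)}$, so the convergence to $\cos(x)$ is visible directly.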

Thanks to everyone who answered! A (CW'ed) summary of some of what I learned:

In the first place, I now cheerfully second Greg Martin's recommendation of Chapter 5.1 of Montgomery and Vaughan. It is a rather "lowbrow", very readable treatment. (It doesn't prove Mellin inversion in complete generality, though.)

Also, as Matt Young pointed out, for any complex $s$, the function $t \rightarrow t^s$ is a character on $\mathbb{R}^{\times}$. This is a triviality, but the importance of this fact escaped me the first time. The invariant measure on $\mathbb{R}^{\times}$ is $\frac{dx}{x}$, and so the Fourier transform of a function $f$ defined on this group is exactly

$$\int_{x \in \mathbb{R}^{\times}} f(x) x^s \frac{dx}{x},$$

the Mellin transform. Once this is written down, the rest follows mechanically (from change of variables and Fourier inversion).
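
The change of variables can be checked numerically. The sketch below substitutes $x = e^u$, so that $\frac{dx}{x}$ becomes $du$ and $x^s$ becomes $e^{su}$, and then integrates over the real line with the trapezoid rule. The function name `mellin_via_log_coordinates` and the integration window are assumptions for illustration. The window is chosen for $f(x) = e^{-x}$, whose Mellin transform is $\Gamma(s)$ for $\mathrm{Re}(s) > 0$.

```python
import cmath
import math

def mellin_via_log_coordinates(f, s, u_min=-40.0, u_max=6.0, steps=20_000):
    """Approximate the integral of f(x) * x**s dx/x over x > 0 using x = exp(u)."""
    # In the u variable this is the integral of f(exp(u)) * exp(s*u) du, i.e. a
    # two-sided Laplace (or Fourier, for imaginary s) transform of u -> f(exp(u)).
    h = (u_max - u_min) / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        u = u_min + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid rule end weights
        total += weight * f(math.exp(u)) * cmath.exp(s * u)
    return total * h

def decaying_exponential(x):
    return math.exp(-x)

print(mellin_via_log_coordinates(decaying_exponential, 2.5), math.gamma(2.5))
```

The same routine accepts complex $s$, where one can test the functional equation $\Gamma(s+1) = s\,\Gamma(s)$ instead of comparing against `math.gamma`, which only handles real arguments.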

Usually, it is considered helpful and nice to update the original question with what you learned from the various answers. Doing so helps others get a digested summary of the highlights of the various answers.
– Suvrit, Nov 15 '11 at 17:52

The transformations from one equation to the other are obvious. The delta function results are intuitive and an extrapolation of the discrete case for the orthogonality relationships of the characters of character groups. The transform pairs, Plancherel and convolution theorems, and other relations are easy to derive from these two.

(Note that whereas $e^{sz}$ is an eigenfunction for $d/dz$ and so the Laplace/Fourier transforms are appropriate for devising an operator calculus for $f(d/dz)$, $z^s$ is an eigenfunction of $zd/dz$ and so the Mellin transform is more appropriate for $f(zd/dz)$.)

Ramanujan's Master Formula/Theorem (see Wikipedia, particularly the Hardy reference) gives a somewhat intuitive perspective on the Mellin transform as providing an "interpolation" of the coefficients of the Taylor series of certain classes of functions, as discussed in the intro of "Ramanujan's Master Theorem ..." by Olafsson and Pasquale. E.g.,

Edwards, in Riemann's Zeta Function (Ch. 10, "Fourier Analysis", Sec. 10.1, "Invariant Operators on R+ and Their Transforms"), gives a nice, more group-theoretic intro to the Mellin transform in line with other comments in this stream.

It provides a nice description of the Mellin transform when $f(x)$ is sufficiently smooth at $x=0$, and of rapid decay at infinity.

For example, assume that $f(x) = \sum_{n=0}^\infty a_n x^n$ in some neighbourhood of the origin and that $f$ decays rapidly as $x \to \infty$. Then its Mellin transform has a meromorphic continuation to all of $\mathbb{C}$, with simple poles of residue $a_n$ at $s=-n$, $n=0,1,2,3,\ldots$. This is nicely explained in Zagier's appendix.

So, rate of decay issues aside, shifting the inverse Mellin transform to the left, i.e. letting $\sigma \to -\infty$, picks up the residues of the integrand $x^{-s} F(s)$ at $s=-n$, i.e. $a_n x^n$, and so recovers the Taylor expansion of $f(x)$ about $x=0$.

Of course, it only applies to a limited class of functions $f$, but, in many practical examples, this reasoning gives one explanation of why the Mellin inversion formula is true, without resorting to Fourier inversion.
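
A small numerical illustration of this residue picture, as a sketch under stated assumptions: take $f(x) = \frac{1}{1+x} = \sum_n (-1)^n x^n$. This $f$ does not decay rapidly, so its Mellin transform $\int_0^\infty f(x) x^{s} \frac{dx}{x} = \frac{\pi}{\sin(\pi s)}$ converges only in the strip $0 < \mathrm{Re}(s) < 1$, but the continuation still has simple poles at $s = -n$ with residue $(-1)^n = a_n$. The function names and the choice $\sigma = 1/2$ are illustrative assumptions.

```python
import cmath
import math

def inverse_mellin(F, y, sigma, T=20.0, steps=20_000):
    """Approximate (1/(2*pi*i)) * integral of y**(-s) * F(s) ds over Re(s)=sigma, |Im(s)|<=T."""
    h = 2 * T / steps
    log_y = math.log(y)
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        t = -T + k * h
        s = complex(sigma, t)
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid rule end weights
        total += weight * cmath.exp(-s * log_y) * F(s)
    # ds = i dt, so the factor 1/(2*pi*i) becomes 1/(2*pi).
    return total * h / (2 * math.pi)

def mellin_of_one_over_one_plus_x(s):
    # Closed form pi/sin(pi*s), valid as an integral for 0 < Re(s) < 1.
    return math.pi / cmath.sin(math.pi * s)

def taylor_from_residues(y, terms=40):
    # The residue of pi/sin(pi*s) at s = -n is (-1)**n, contributing (-1)**n * y**n.
    return sum((-1) ** n * y ** n for n in range(terms))

y = 0.5
print(inverse_mellin(mellin_of_one_over_one_plus_x, y, 0.5), taylor_from_residues(y), 1 / (1 + y))
```

For $0 < y < 1$ the line integral, the sum of residues, and $1/(1+y)$ should all agree to high accuracy. For $y > 1$ the line integral still returns $1/(1+y)$, but the Taylor series no longer converges, which is one way to see the "limited class of functions" caveat.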