I have some questions about renormalization. To my understanding, in order to deal with the infinities that appear in loop integrals, one introduces some kind of regulator (e.g., a high-momentum cutoff, taking $d\to d+\epsilon$, etc.) so that we get a finite answer, which blows up as we remove the regulator. Then we renormalize various coefficients in the Lagrangian in a regulator-dependent way so that, in scattering amplitudes, a finite piece remains as we remove the cutoff. The regulator seems to always require introducing some arbitrary scale (e.g., the momentum cutoff, or $\mu^\epsilon$ in dim reg), although I'm not sure why this must be the case.

Now, this finite piece is completely arbitrary, and depends on what scheme we use (e.g., on-shell, minimal subtraction, etc.). The beta function is then, roughly speaking, the rate of change of this finite piece under changes in the scale in the regulator. My question is: what exactly is the invariant information in the beta function? Under a change of renormalization scheme it obviously changes, but this seems to roughly correspond to a diffeomorphism on the space of couplings (is this always true? For example, in on-shell renormalization, if we set the mass to the physical mass and the coupling to the corresponding exact vertex function at zero momentum, these seem to be independent of scale - do the beta functions vanish here?). It also seems to depend on exactly what regulator we are using, and on how the dimensionful scale enters into it. There are certain quantities, such as the anomalous dimension of a field at a fixed point, which must be independent of all these choices, but I don't understand why this is the case.
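For concreteness, here is the one-coupling version of the change of variables I have in mind (just the chain rule, with $f$ an arbitrary invertible coupling redefinition):

$$\tilde g = f(g), \qquad \tilde\beta(\tilde g) \equiv \mu\frac{d\tilde g}{d\mu} = f'(g)\,\beta(g),$$

so the beta function transforms like a vector field on coupling space, and zeros of $\beta$ map to zeros of $\tilde\beta$. At a fixed point $g_*$ (where $\beta(g_*)=0$) the slope is unchanged:

$$\frac{d\tilde\beta}{d\tilde g}\bigg|_{\tilde g_*} = \frac{f''(g_*)\,\beta(g_*) + f'(g_*)\,\beta'(g_*)}{f'(g_*)} = \beta'(g_*).$$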

Finally, and this is more of a philosophical question, what exactly do the infinities in the loop integrals mean in the first place? Why is it that we can remove them by any of these various regulators and expect to get the same answer? I've read something about Haag's theorem, that interacting theories live in a different Hilbert space than the free-field Fock space, and that this is somehow related, but I'm not sure why. Thank you.

1 Answer

Renormalization approximates (in a first step) an original, bare (= ill-defined) theory by a family of theories depending on the regularization prescription (indicated by a variable $R$) and an energy scale (indicated by a variable $\Lambda$). These approximate theories are matched by a renormalization prescription, which fixes their free constants in a way (depending on $R$ and $\Lambda$) that makes sure that a small number of key parameters (predictions of the theories) agree with their measured values.
This is the renormalization step; these parameters are renormalized (= physical, measurable).

After this matching, we have a well-defined family of approximate theories $T(R,\Lambda)$ producing observables depending on the regularization and the energy scale but agreeing in the renormalized parameters.
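As a toy illustration of these two steps (everything here is invented for illustration - the log-divergent "loop integral", the matching condition, and all numbers stand in for no real theory):

```python
import math

def bare_loop(p, cutoff):
    """Stand-in for a log-divergent loop integral:
    int_p^cutoff dk/k = ln(cutoff/p), which blows up as cutoff -> infinity."""
    return math.log(cutoff / p)

def renormalized_amplitude(p, cutoff, p0=1.0, a0=0.5):
    """Matching step: fix the counterterm c(cutoff) so that the amplitude at a
    reference momentum p0 equals the 'measured' value a0."""
    counterterm = a0 - bare_loop(p0, cutoff)
    return bare_loop(p, cutoff) + counterterm  # = a0 + ln(p0/p), cutoff-independent

# Removing the regulator: the renormalized amplitude converges (here it is
# cutoff-independent up to rounding, because the divergence cancels analytically).
for cutoff in (1e3, 1e6, 1e12):
    print(cutoff, renormalized_amplitude(2.0, cutoff))
```

The divergent $\ln(\mathrm{cutoff})$ pieces cancel between the bare loop and the counterterm, leaving the finite combination $a_0 + \ln(p_0/p)$; the arbitrary reference scale $p_0$ plays the role of $\Lambda$ above.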

In a second step, the regularization is undone by taking the limit of no regularization at all. This results in a family of renormalized theories $T(\Lambda)$ which - as no regularization is present anymore - all describe the same physical theory. Thus (if the renormalization procedure is ''correct''), all measurable predictions are identical and independent of $\Lambda$.

However, the renormalized theories $T(\Lambda)$ themselves may contain parameters that still depend on $\Lambda$. Of course, these parameters cannot affect the measurable, physical results at all. Thus the dependence of these parameters on $\Lambda$ must be systematically related in such a way that no measurable prediction changes when $\Lambda$ is changed. This results in the differential equations known as the renormalization group equations.
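A one-loop sketch of how this works for a single coupling (the beta coefficient and all numbers are invented for illustration):

```python
import math

B = 0.7             # one-loop beta coefficient (illustrative value)
MU0, G0 = 1.0, 0.2  # renormalization condition: g(MU0) = G0

def g_running(mu):
    """Solution of the RG equation mu dg/dmu = -B*g**2 with g(MU0) = G0:
    1/g(mu) = 1/G0 + B*ln(mu/MU0)."""
    return 1.0 / (1.0 / G0 + B * math.log(mu / MU0))

def invariant_scale(mu):
    """mu * exp(-1/(B*g(mu))): the explicit mu-dependence cancels against the
    running of g(mu), so this combination is a mu-independent physical scale."""
    return mu * math.exp(-1.0 / (B * g_running(mu)))

# The parameter g(mu) changes with mu, but the invariant combination does not:
for mu in (1.0, 10.0, 1000.0):
    print(mu, g_running(mu), invariant_scale(mu))
```

Here $g(\mu)$ is one of the $\Lambda$-dependent parameters of the text, and the invariant scale is an example of a measurable prediction that stays fixed as $\Lambda$ (here $\mu$) is varied.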

Of course, the above holds under the assumption that the predictions are those of the exact theory. In practice, one must make additional approximations to get computable predictions, and these still have a residual dependence on $\Lambda$ that should vanish in the limit of better and better approximations.
This is the reason why - although in theory each value of $\Lambda$ is as good as any other - one must choose $\Lambda$ for a particular application to be close to the energy scale relevant for the prediction in question.
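A sketch of that last point, using an invented one-loop toy running as above: an observable computed to fixed order in $g(\Lambda)$ retains a residual $\ln(\Lambda/E)$ dependence, which is smallest when $\Lambda$ is chosen near the physical energy $E$.

```python
import math

B, MU0, G0 = 0.7, 1.0, 0.2  # invented one-loop coefficient and initial condition

def g_running(mu):
    """One-loop running coupling: 1/g(mu) = 1/G0 + B*ln(mu/MU0)."""
    return 1.0 / (1.0 / G0 + B * math.log(mu / MU0))

def prediction(E, mu):
    """Toy observable at energy E, truncated at next-to-leading order in g(mu).
    The explicit log compensates the running of g only up to higher orders,
    so a residual mu-dependence of order g**3 * log**2 remains."""
    g = g_running(mu)
    return g + B * g * g * math.log(mu / E)

E = 5.0
for mu in (E, 10 * E, 1000 * E):  # the truncation error grows with |ln(mu/E)|
    print(mu, prediction(E, mu))
```

With $\mu = E$ the truncated prediction agrees with the RG-improved answer $g(E)$; the further $\mu$ drifts from $E$, the larger the uncompensated higher-order logarithms become.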