June 9, 2012

Polymath7 discussion thread

The “Hot spots conjecture” proposal has taken off, with 42 comments as of this writing. As such, it is time to take the proposal to the next level, by starting a discussion thread (this one) to hold all the meta-mathematical discussion about the proposal (e.g. organisational issues, feedback, etc.), and also starting a wiki page to hold the various facts, strategies, and bibliography around the polymath project (which is now “officially” the Polymath7 project).

I’ve seeded the wiki with the links and references culled from the original discussion, but it was a bit of a rush job and any editing would be greatly appreciated. From past polymath experience, these projects can get difficult to follow from the research threads alone once the discussion takes off, so the wiki becomes a crucial component of the project as it can be used to collate all the progress made so far and make it easier for people to catch up. (If the wiki page gets more complicated, we can start shunting off some stuff into sub-pages, but I think it is at a reasonable size for now.)

One thing I see is that not everybody who has participated knows how to make LaTeX formatting appear in their comments. The instructions for that (as well as a “sandbox” to try out the code) are at this link.
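For example, the WordPress convention (described at that link) is to put the word “latex” right after the opening dollar sign; something like

```
$latex \int_\Omega |\nabla u|^2 \, dx$
```

should then render as the corresponding formula in a comment.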

Once the research thread gets long enough, we usually start off a new thread (with some summaries of the preceding discussion) to make it easier to keep the discussion at a manageable level of complexity; traditionally we do this at about the 100-comment mark, but of course we can alter this depending on how people are able to keep up with the thread.

My ignorance is embarrassing, but I’ll ask anyway. Could someone point me to directions on how to edit a comment I make on the research thread after I’ve posted it? I’d like to save moderators and readers the irritation of fixing my typographical/LaTeX errors.

Unfortunately, the hosting company for this blog doesn’t allow editing of comments by users :-(. So I’ll be editing comments manually, which has worked well enough in the past.

For really lengthy computations, though, it may be a good idea to put the details on the wiki (maybe creating a subpage if necessary) and just put a link and a summary on the blog, since the wiki is easier to edit and format.

As there has been a lot of talk about approaching the conjecture using Bessel functions I decided to finally learn about them. I wrote up a summary (basically for my own benefit to understand the material better) which can be seen at

The research thread is getting rather lengthy, so I will probably roll it over tomorrow by starting a new research thread that tries to summarise the progress so far, and then direct all discussion to the new thread. This is just a heads-up, though; in the meantime, keep using the current research thread. :-)

Hmm, it’s only been three days and I think I may have to roll over the thread again, maybe sometime tomorrow. (From past experience with polymath projects, the first week or two are quite hectic and chaotic, with lots of people pursuing lots of possible angles of attack but after that things settle down, focusing on a core group of people pursuing a core set of strategies; the progress becomes less exciting, but more steady.)

Continuing the comments of Hung (and moving the discussion to the discussion thread):

I met with Hung today to discuss analytic proof of Isosceles Triangle – Special Case – Corollary 4, so maybe I can clarify some of our confusion.

Hung suggested that an argument could be made via the scalar maximum principle by considering, instead of the vector-valued gradient of $u$, the directional derivative in a direction which “points within the cone” (i.e., one determined by the angle ABD).

This seems reasonable: the directional derivative itself solves the heat equation. By considering, say , we have that at the bottom of the parabolic boundary. Along the Dirichlet boundary for all time (because is non-negative in the interior). But we couldn’t figure out why for all time on the Neumann part of the parabolic boundary. If we only had that, we would be done by the scalar maximum principle… Maybe there is a simple reason we missed.

My understanding of the vector-valued (weak) maximum principle is that if you have a vector-valued function which solves the heat equation, and its parabolic boundary data lies in some convex set , then so too will the function on the entire parabolic domain. Switching between this and the scalar-valued maximum principle is only a matter of projecting onto an axis, or equivalently considering your convex set to be the space between two hyperplanes. (Things get messier for, say, reaction-diffusion equations, but for the heat equation I believe considering the gradient and directional derivatives are equivalent?)
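For reference, here is my paraphrase of the statement being invoked (with $K$ any closed convex set; this is my own phrasing, not a quote from a particular textbook):

```latex
% Weak maximum principle, vector-valued form: if
% v : \Omega \times [0,T] \to \mathbb{R}^n solves the heat equation
% \partial_t v = \Delta v, and v lies in a closed convex K on the
% parabolic boundary, then v lies in K everywhere:
\[
  v\big(\partial_p(\Omega \times [0,T])\big) \subset K
  \quad \Longrightarrow \quad
  v\big(\Omega \times [0,T]\big) \subset K .
\]
% The scalar case is recovered by taking K to be a slab
%   K = \{ x \in \mathbb{R}^n : a \le x \cdot e \le b \}
% between two hyperplanes with normal e; conversely, a closed convex K
% is the intersection of its supporting half-spaces, which is why, for
% the heat equation, the scalar and vector statements are equivalent.
```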

So in arguing via the vector-valued maximum principle we are taking as our convex set the infinite sector, it seems. But then I wasn’t sure what the importance of is. Is an argument with necessary if we take for granted the weak maximum principle in the previous paragraph? Or is it that the argument is going beyond the basic weak maximum principle (in either the scalar or vector-valued case)? Part of the confusion for me, I think, is in parsing things, as both the domain and range of were in the sector .

(Thanks, by the way, for moving these sorts of discussions to the discussion thread rather than the research thread – I should have mentioned earlier that this thread is intended in part to help explain and clarify all the hectic stuff that goes on on the research thread.)

It is a bit odd that one can’t use an “off the shelf” maximum principle, either in the scalar or vector setting, to establish this result, but instead has to “roll one’s own” principle by adapting the proof (and this is why the epsilons have to come in; usually they are hidden in the proof of the textbook maximum principle). I’m not sure exactly why this is the case, but I think it is because the Neumann boundary condition does not directly place the gradient of u inside S on the boundary, but instead offers the option of either lying in the interior of S or in the exterior, and one then has to rule out the second possibility by an additional argument (taking advantage of either the Neumann boundary conditions or a reflection argument).
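For readers following along, here is a generic sketch of where the epsilons typically enter the textbook proof (this is the standard argument, not the specific one being adapted above):

```latex
% To show a subsolution u of the heat equation attains its maximum on
% the parabolic boundary, one perturbs to
\[
  u_\varepsilon(x,t) = u(x,t) - \varepsilon t,
  \qquad
  \partial_t u_\varepsilon - \Delta u_\varepsilon \le -\varepsilon < 0 .
\]
% At an interior (or final-time) maximum one would have
% \partial_t u_\varepsilon \ge 0 and \Delta u_\varepsilon \le 0, a
% contradiction; letting \varepsilon \to 0 recovers the unperturbed
% statement.
```

In the present argument the analogous move is to enlarge the convex set slightly in time, so that a first touching point sees the boundary of the set strictly receding rather than stationary.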

I think considering just a single derivative doesn’t work because when one bounces off of a boundary, this derivative somehow gets mixed up with other derivatives, so one has to control the whole gradient at once or else one can’t predict what happens on a boundary.

Ok, I see your point about losing information when only considering one directional derivative; it seems that part of the argument is to take advantage of the fact that points parallel to the boundary for points on the Neumann boundary (something which cannot be reflected in a property of a single derivative ).

It seems you take advantage of this by expanding the sector to the set , which then allows you to identify a unique point that is equal to for BD. But then I don’t see where you take advantage of knowing the particular location of (besides knowing that it is on the boundary).

Is it correct that the idea of reflecting across BD is just to justify that the equality holds on the boundary BD as well?

Another point I don’t get is why you consider the set and not just . Is the idea that “As the set is expanding in time, the only way for to catch it is if it is actively moving towards the (receding) boundary”… which prevents it from touching the boundary with “zero derivative”?

P.S. Looking at the argument I don’t see any place that the acuteness of the triangle is used. Is it correct to say the same argument would work for an obtuse mixed Dirichlet-Neumann triangle?

The location of v is important (as pointed out by Hung) because it is both on the boundary of and on its reflection, which allows one to keep the direction of pointing inwards or tangentially. As you say, the receding nature of the boundary is to make sure that the time derivative points strictly outwards, rather than tangentially, since otherwise one doesn’t quite get a contradiction. (One could of course use some other increasing function of t than , e.g. , if desired.)

I think the argument works fine for obtuse triangles, though one has to be a little careful with regularity. Once one has an obtuse Neumann angle, one no longer has C^2 regularity at the vertices, but only Hölder regularity on the gradient (basically, one has \pi/\omega degrees of regularity at a vertex of angle \omega, except when \omega divides evenly into \pi, in which case one has smoothness instead). But this may still be enough regularity to run the argument.
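The count of degrees of regularity at a corner comes from the standard separation-of-variables expansion, which may be worth recording here (a standard fact, sketched in my own notation):

```latex
% Near a corner of angle \omega with Neumann conditions on both edges,
% an eigenfunction with eigenvalue \lambda has a Fourier--Bessel
% expansion in polar coordinates (r, \theta) centred at the corner:
\[
  u(r,\theta) = \sum_{k=0}^{\infty} a_k \,
    J_{k\pi/\omega}\!\big(\sqrt{\lambda}\, r\big)
    \cos\!\Big(\frac{k\pi\theta}{\omega}\Big).
\]
% Since J_\nu(r) \sim r^\nu as r \to 0, the first non-constant term
% behaves like r^{\pi/\omega}; for an obtuse angle \omega > \pi/2 this
% is C^1 but not C^2 at the corner, unless \pi/\omega is an integer,
% in which case every order appearing is an integer and the expansion
% is smooth.
```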

Ok, I think I follow the argument now: The idea is that if is the first point with on and BD, then (considering reflections) points of the form in the reflected domain have lying in the union of and its reflection, which is a *convex* set. Thus the “pull” from these nearby , i.e. , is into/tangent to this convex set and in particular doesn’t point in the direction of .

I don’t think this argument would work for the obtuse case then as the union of and its reflection wouldn’t be convex (and so while might be horizontal, it might still point in the same direction as ).

Also, whether my understanding above is correct or not, because the level curves of are roughly concentric circles at the corner, in the obtuse case even if we could show that stayed within certain angle bounds, it would not be the case that the *directional derivative* in the expected directions would have constant sign. So, it definitely seems considering has its advantages!

This may be somewhat tangential to the original aim of this project, but it may be of interest to try to explore the vector maximum principle/coupled Brownian motion connection further. Two obvious directions to pursue are (a) finding a maximum principle version of mirror coupling arguments and (b) finding analogues of both coupled Brownian motion and maximum principle arguments in the discrete graph setting. (There must presumably be some literature on this sort of thing already – I would find it hard to believe that the two most fundamental tools in parabolic PDE have not been previously linked together!)

Well, actually my introduction to the hot spots problem came from David Jerison, who suggested that his graduate student, Nikola Kamburov, and I try to come up with a purely analytic replacement for the coupling arguments used in hot spots problems (specifically those in the paper on Lip domains by Atar and Burdzy). We approached it by trying to consider maximum principles etc. on the product domain, but we never formalized anything concrete… it seems that we should have been looking at the gradient vector (although, as your argument shows, some finesse is required even there).

Consider the unit disk . Then if is (free) Brownian motion started at it turns out that , where for , is (after a time-scaling) a reflected Brownian motion started at . In fact, this still holds if itself was a reflected Brownian motion to begin with!

Suppose now we consider the upper half of the unit disk, and assign Dirichlet boundary conditions to its flat bottom and Neumann boundary conditions to its arc-boundary. Considering the paths of and , we see that the *paths* will meet up when first hits the arc-boundary of , or otherwise both *paths* will terminate at the same time when they hit the Dirichlet boundary. But now recall that for to be a true Brownian motion, we need to scale its speed; to be a Brownian motion, must travel slower than until they meet. But this implies that will always hit the Dirichlet boundary before (since is further behind on its path). By the usual adjoint/duality between Brownian motion and the heat equation it therefore follows that the first eigenfunction for is monotone radially out from the origin!
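If it helps, here is the duality step spelled out a little more (my paraphrase of the standard probabilistic representation, specialised to initial data identically 1):

```latex
% Let X be reflected Brownian motion in the half-disk D_+ (reflecting
% on the arc), \tau its hitting time of the Dirichlet (flat) part, and
\[
  u(t,x) = \mathbb{P}_x(t < \tau).
\]
% Then u solves the heat equation with the mixed boundary conditions,
% and by eigenfunction expansion
\[
  e^{\lambda_1 t}\, u(t,x) \longrightarrow c \, \varphi_1(x)
  \qquad (t \to \infty),
\]
% with \varphi_1 the first mixed eigenfunction and c > 0. The coupling
% orders the hitting times \tau for two starting points on the same
% ray, hence orders the survival probabilities u(t,\cdot) radially for
% every t, and this monotonicity passes to \varphi_1 in the limit.
```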

Then, by conformal mapping, Pascu extends this scaling coupling to more general domains. The basic idea is that, as conformal mappings preserve angles, reflected Brownian motion gets mapped to reflected Brownian motion up to a time-scaling (the idea is that the “reflection angles” at the boundary are preserved, and in the interior the “angles of direction of motion” are not “squished”, so the Brownian motion is still equally likely to head in any direction). Therefore, if we consider a simply connected (I think this matters for conformal mappings?) domain with an axis of symmetry (whence we can consider the half of the domain with mixed boundary), we can identify it with the half-disk via conformal mapping, and define a scaling coupling on it via the scaling coupling on the half-disk.

The crucial issue, however, is that while the *paths* of and in are nicely coupled, we also need control on the time-scaling of and to ensure that moves slower along its path than does along its path until the paths meet. This can be ensured provided that it is a *convex* domain (this is the content of Proposition 2.13). Under this additional assumption, we can argue as before to show that the extremum of the first eigenfunction of lies on its Neumann boundary.

Points of Interest:

1) It may be useful to keep in mind the scaling coupling as a tool (at least at the heuristic level… i.e. we can define a scaling coupling and move it to other domains via conformal maps provided we are wary of time-scaling effects) especially as this is a somewhat exotic coupling which I don’t believe has a direct analytic counterpart (though I would of course be happy to see one).

2) I don’t really see where in the argument we need to consider all of and . It seems that we only need the convexity of (for time-scaling purposes) so we could consider just it provided it is conformally equivalent to . This then would give information on the location of the extremum of the first eigenfunction of such a convex mixed-boundary domain.

Hmm, your comments got caught by the spam filter (presumably because of the link). Sometimes it has false positives. Nilam had a similar issue; I will try to upgrade both of your user statuses to try to get past the filter.

I am a bit behind in all that’s been done with regards to the rigorous numerics argument, but I am confused on one point:

As I understand it, the goal is to get C^0 continuity of the eigenfunction with respect to the domain in order to get control over where the extrema of the eigenfunction go as we perturb the domain. And for the moment, all we have is continuity with respect to the L^2 and H^1 norms.

But the paper of Banuelos and Pang gives a C^0 continuity result. So why can’t we apply their result as it stands?

The Banuelos-Pang paper does in principle give explicit C^0 bounds, but they depend on bounds on the heat kernel on the triangles, which are presumably in the literature but it would take a fair amount of effort to make all the constants explicit (and I would imagine that the final constants would be terrible, making it much harder to use them for numerics as one would have to use an enormous number of reference triangles). My belief is that by working with the explicit nature of the triangular domains one can get better constants.

Also, there is a possibility that we may also get explicit C^1 or even C^2 bounds as well, which would also be helpful in locating extrema, though it is probably going to be simplest to try for C^0 bounds first.

Terry, I have looked over your recent notes “Stability Theory for Neumann Eigenfunctions” and had a few comments/questions:

1) (Typo) In Lemma 1.2 you talk about P and Q but then call them X and Y.
2) (Typo?) On page 6 in the equation after the line “From the orthonormality (2.5) and the Bessel inequality, we conclude that” shouldn’t it be in the term on the left?
3) (Question) At the bottom of page 6, why is it that when we differentiate we don’t have an extra term of the form ?
4) (General Question) It seems that getting a bound on will control how much the values of change… but I don’t see how that would immediately give control of the *location* of the extrema. Is the argument to appeal to the fact that “if the extremum is near the corner, it must be at the corner”?

(1) thanks for the correction, it will appear in the next revision of the notes.

(2) I think the factor is , the extra factor coming from the weight in (2.5).

(3) depends linearly on time (assuming vary linearly in time) and so the second derivative is zero.

(4) Yes, one also needs to separately exclude extrema occurring near the corner, in addition to L^infty variation bounds, to completely control all extrema; this is the rationale behind my previous comment at Comment 11. Unfortunately I am beginning to be a bit worried that the bounds there are a bit weak and will lead to requiring the mesh density to be huge…

Is your concern about the mesh spacing required in parameter space, or that required for the computation on any given triangle? In other words, are you concerned about the bounds on the variation, or on the location of the extrema near the corners?

I should be ready to post some numerical results by Wednesday on computing the bounds you calculated on the variation, as well as the (numerically computed) variation. Based on the preliminary results, no surprises so far.

I guess both, because the net error in our L^infty control on an eigenfunction on a triangle will depend on both (a) the distance in parameter space to the nearest reference triangle, and (b) the accuracy of our eigenfunction approximation in the reference triangle, as well as (c) the spectral gap bounds. This then has to be compared against (d) our numerical bounds on how far the extrema of the numerical eigenfunctions are from the extremal vertices, and (e) the neighbourhoods around the extremal vertices from which we may rigorously exclude extrema. The hope is that (c), (d), (e) are strong enough that we can use numerically feasible mesh sizes for (a) and (b).
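As a bookkeeping sketch of the feasibility question (all numbers below are entirely hypothetical placeholders, not computed values):

```python
# A toy sketch of the error budget described above: the L^infty error on a
# perturbed triangle is bounded by (a) a parameter-space variation term plus
# (b) the reference-triangle approximation error, and the total must beat
# the margins certified near the extremal vertices by (d) and (e).

def linf_error_bound(param_dist, variation_rate, reference_error):
    """Total L^infty eigenfunction error: variation over a parameter-space
    step of size param_dist (at rate variation_rate, which here is taken to
    absorb the spectral-gap bounds (c)) plus the numerical error on the
    nearest reference triangle."""
    return variation_rate * param_dist + reference_error

def mesh_is_feasible(param_dist, variation_rate, reference_error, margin):
    """One reference triangle covers its parameter-space neighbourhood iff
    the total error stays below the certified margin (the smaller of the
    margins coming from (d) and (e))."""
    return linf_error_bound(param_dist, variation_rate, reference_error) < margin

# Hypothetical numbers: variation rate 10 per unit of parameter distance,
# reference error 1e-6, certified margin 0.001 near the extremal vertices.
assert mesh_is_feasible(5e-5, 10.0, 1e-6, 1e-3)      # spacing 5e-5 suffices
assert not mesh_is_feasible(5e-3, 10.0, 1e-6, 1e-3)  # spacing 5e-3 does not
```

The worry about huge mesh densities is then the statement that the admissible `param_dist` forced by the certified margin may be impractically small.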

Agreed.
A while ago I was trying to get a handle on (d) numerically. My approach was to use an overlapping Schwarz iteration. The idea is to iterate on eigenvalue problems on subdomains of the triangle – I partitioned the triangle into regions which are wedges and a full circle. My hope was that the tools of Siudeja and Banuelos–Pang would help in establishing that the method converged. Unfortunately, I was unable to prove this rigorously.

Well, perhaps we don’t need a rigorous guarantee that the numerical algorithm converges, but can instead go with a numerical recipe that in practice gives a numerical eigenfunction and numerical eigenvalue with very good residual, and then do some a posteriori analysis to rigorously conclude that the error is small. Indeed, if one has a demonstrable gap between the numerical eigenvalue and the true third eigenvalue, then some simple playing around with the eigenvalue decomposition (computing the inner product of the numerical eigenfunction against other true eigenfunctions via integration by parts) shows that the residual controls the error in H^2 norm (and hence in L^infty norm, by the Sobolev inequality in my notes), at least if one can ensure that the numerical eigenfunction obeys the Neumann condition exactly.
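A toy finite-dimensional illustration of this a posteriori principle (a 2×2 matrix stand-in of my own devising, not the actual triangle computation): the residual of an approximate eigenpair bounds the eigenvalue error, and residual divided by spectral gap bounds the eigenvector error.

```python
import math

def eig_sym_2x2(a, b, c):
    """Exact eigenpairs of the symmetric matrix [[a, b], [b, c]], b != 0;
    returns (lo, hi, v_lo) with v_lo a unit eigenvector for lo."""
    disc = math.sqrt((a - c) ** 2 + 4 * b * b)
    lo, hi = (a + c - disc) / 2, (a + c + disc) / 2
    # (a - lo) x + b y = 0 gives an (unnormalized) eigenvector for lo
    x, y = b, lo - a
    n = math.hypot(x, y)
    return lo, hi, (x / n, y / n)

a, b, c = 2.0, 0.1, 5.0
lam, v = 2.0, (1.0, 0.0)                # crude approximate eigenpair
rx = a * v[0] + b * v[1] - lam * v[0]   # residual vector A v - lam v
ry = b * v[0] + c * v[1] - lam * v[1]
residual = math.hypot(rx, ry)           # = 0.1

lo, hi, v_lo = eig_sym_2x2(a, b, c)
assert abs(lo - lam) <= residual        # eigenvalue error <= residual
gap = abs(hi - lam)                     # distance to the rest of the spectrum
sin_angle = abs(v[0] * v_lo[1] - v[1] * v_lo[0])
assert sin_angle <= residual / gap      # eigenvector error <= residual / gap
```

In the PDE setting the same bookkeeping goes through with the residual measured in L^2 and the error then upgraded to H^2 and L^infty as described above.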

The challenge with this is in how I compute the residual. Numerically, my strategy was to approximate $u_i$ (in the notes) by finite linear combinations of Fourier-Bessel functions. The trace of the approximations on the arcs can be written down readily; the application of the Laplacian on the sub-domains is also OK. However, to compute the L2 inner products, I used a quadrature. This is how I assemble the matrices to get the approximate eigenfunctions. Also, the conditioning of the eigenvalue problems wasn’t great. Since one is looking at minimizing the residual in $L^2$, the all-critical traces of $u_i^n \frac{\partial u_0^n}{\partial \nu}$ on the common interfaces play a role, but not as important as one may want. While I want to believe this method gives a good approximation by looking at the numerical residual, I am not 100% convinced.

Once I got the approximate eigenfunction by this method, I still have to locate the extrema. I do this by interpolating the function by piecewise linears onto a mesh of the triangle, and then doing a search. This can be improved.

Let me add in some of the details of the implementation in the notes. Perhaps some collective trouble-shooting will help.

Using the finite element method, the quadratures are exact (since I use piecewise polynomials). The search proceeds as above. Since I’m using a quasi-regular discretization, both the Galerkin errors and the Lanczos errors are well-understood and the methods are provably convergent. This is a reliable, if not super-fast, work-horse.

Here’s one possibility. You’re dividing the triangle into three sectors and a disk, and on each of these regions one can create an exact eigenfunction with Neumann conditions on the original boundary (and some garbage on the new boundaries). Now with some explicit C^2 partition of unity, one can splice together these exact eigenfunctions on the subregions into an approximate eigenfunction on the whole triangle, and the residual will be controlled by the H^1 error between the exact eigenfunctions on the intersection between the subregions.

To illustrate what I mean by this, let us for simplicity assume that the triangle is covered by just two subregions $\Omega_1, \Omega_2$ instead of four. Let $u_1, u_2$ be exact eigenfunctions on $\Omega_1, \Omega_2$ respectively with the same eigenvalue $\lambda$, obeying the Neumann condition exactly on the original boundary portions of $\Omega_1$ and $\Omega_2$ respectively. We then glue these together to create a function $u = \varphi u_1 + (1-\varphi) u_2$ on the entire triangle, where $\varphi$ is a C^2 bump function that equals 1 outside of $\Omega_2$ and equals 0 outside of $\Omega_1$. Then we may compute

$(\Delta + \lambda) u = (\Delta \varphi)(u_1 - u_2) + 2 \nabla \varphi \cdot \nabla (u_1 - u_2)$

(the terms in which $\varphi$ is undifferentiated cancel, since $u_1$ and $u_2$ are exact eigenfunctions).

Also, u obeys the Neumann conditions exactly. Thus if u_1 and u_2 are close in H^1 norm on the common domain $\Omega_1 \cap \Omega_2$, the global residual will be small.

One advantage of this approach is that we don’t need to care too much about the boundary traces of u_1, u_2. But one does need a certain margin of overlap between the subregions so that the cutoffs lie in C^2 with reasonable bounds; it’s not enough for them to be adjacent.
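For concreteness, here is one possible C^2 cutoff (a standard quintic “smoothstep” – my choice for illustration, any C^2 transition from 0 to 1 across the overlap would do), together with a numerical check of the endpoint matching:

```python
# phi == 0 for t <= 0 and phi == 1 for t >= 1, and both phi' and phi''
# vanish at t = 0 and t = 1, so the piecewise extension by constants is
# globally C^2 -- exactly what the splicing argument needs, with t a
# normalized coordinate across the overlap region.

def phi(t):
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return t ** 3 * (10.0 - 15.0 * t + 6.0 * t * t)  # 10t^3 - 15t^4 + 6t^5

# Check C^2 matching at the endpoints by centered finite differences.
h = 1e-4
for t0 in (0.0, 1.0):
    d1 = (phi(t0 + h) - phi(t0 - h)) / (2 * h)          # ~ phi'(t0)
    d2 = (phi(t0 + h) - 2 * phi(t0) + phi(t0 - h)) / (h * h)  # ~ phi''(t0)
    assert abs(d1) < 1e-6 and abs(d2) < 1e-2

assert phi(0.0) == 0.0 and phi(1.0) == 1.0 and 0.0 < phi(0.5) < 1.0
```

The explicit form also makes it easy to write down the “reasonable bounds” on $\Delta \varphi$ and $\nabla \varphi$ that enter the residual estimate: they scale like inverse powers of the overlap width.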

Yes, this is certainly one way to analyze the overlapping strategy: the partitions of unity will assure convergence of the Schwarz iteration in one step.

In the set-up I tried using numerically, the domains have non-trivial overlap. Solving boundary value problems this way would ensure nice convergence of the iteration. My misgiving came from the conditioning of the eigenvalue problems on the sub-domains; since the computations were in floating-point arithmetic, poor conditioning is worrying.

My thinking was that since the actual eigenfunction is C^2 in the interior, the non-standard eigenvalue problem for the disk will have smooth coefficients. My rationale for not using the partition of unity was that the approximation functions I used in each region satisfy $-\Delta u = \Lambda u$ exactly (but potentially not the boundary data). However, for the purpose of an analytical treatment, the partition of unity strategy may be easier to work with.

Just a short note to say that I’m still interested in this problem, but am preparing for a two-week vacation starting on Saturday and so unfortunately have had to prioritise my time. But I will definitely return to this project afterwards…

Apologies about the delay from my end – I’ve been writing up some notes to summarize the numerical strategy, include some validation experiments, and discuss the results so far.

The conjecture has been (numerically) examined and (numerically) verified on a fine, non-uniform grid in parameter space away from the equilateral triangle. The grid spacing is chosen so that the variation of the eigenfunctions is controlled to 0.001. At each of these points, we have numerical upper and lower bounds on the second eigenvalue; these bounds provide an interval of width 1e-7 around the true eigenvalue. The eigenfunctions are computed so that the Ritz residual is under 1e-11.

I have *something* coded up which uses the bounds near the equilateral triangle, but am not confident enough about these yet to present them.

I just wanted to say that Bartlomiej and I are at a stochastic analysis conference at the moment and we are discussing ideas for the problem (the discussion has been restricted to analytic approaches though) along with some other interested people (Mihai Pascu, Rodrigo Banuelos, Chris Burdzy, etc) at the conference.

My internet access is limited (I am on a public computer at the moment) but I/we will try to write a summary of our discussion after the conference!