1 Introduction

In recent years, entanglement entropy (EE) has become an important and intensively studied quantity for states of many-particle quantum systems. For an introduction to this topic, we refer to [3, 6, 15]. In this paper, we study the EE of ground states of the ideal Fermi gas in a magnetic field in three-dimensional Euclidean space, \({{\mathbb {R}}}^3\), see [18]. The two-dimensional Fermi gas in a constant magnetic field was recently analyzed in [5, 21], starting from the earlier work in [31]. There, a strict area-law holds, while for the free Fermi gas in any dimension d a logarithmically enhanced area-law is valid, see [13, 20]. Stability of these area-laws, in the sense that adding a “small” electric or magnetic potential to the Hamiltonian does not change the leading asymptotics of the entropy, has been proved in [25, 26] for \(d\ge 2\) and in [29]. The one-dimensional case still seems to be open (for the \(\gamma \)-Rényi entropy with \(\gamma \le 1\)).

There is an extensive literature on EE by now, with many fascinating connections to and implications for related fields. Here, we only mention and refer to a small fraction of mathematical results. In [30], an enhanced area-law was proved for the one-dimensional free Fermi gas in a periodic potential; the higher-dimensional case remains an open problem. The works [8, 24, 28] provide an understanding of the EE in Anderson-type models on the lattice. An extension to the EE of positive-temperature equilibrium states (of the ideal Fermi gas) was presented in [22, 23, 35]. Finally, we mention results on the XY and XXZ quantum spin chains [1, 4, 7, 11, 12, 16].

By a (strict) area-law for a ground state of the infinitely extended Fermi gas, say in \({{\mathbb {R}}}^d\) with spatial dimension \(d\in {\mathbb {N}}\), we mean that the entanglement (or local) entropy of this state reduced to the scaled (bounded) region \(L\Lambda \) grows to leading order like \(L^{d-1}{\mathcal {H}}^{d-1}(\partial \Lambda )\) as the dimensionless real parameter L tends to infinity. Here, \({\mathcal {H}}^{d-1}(\partial \Lambda )\) is the (Hausdorff) surface area of the boundary \(\partial \Lambda \). If there is an extra \(\ln (L)\) factor in this leading asymptotics, then we call it a logarithmically enhanced area-law.

Whether one should expect a strict area-law or an enhanced area-law is related to the spectral properties of the one-particle Hamiltonian of the non-interacting many-particle Fermi gas. If the off-diagonal part of the integral kernel of the corresponding spectral (Fermi) projection decays fast (e.g., exponentially), then we expect a strict area-law to hold. It is not difficult to argue for this (see [28]), but computing and ultimately proving the precise leading coefficient has only been accomplished in special cases. On the other hand, if the decay of the off-diagonal part of the integral kernel is weak (e.g., inverse linear), then we can expect an enhanced area-law. In the present model, we have a mixture: an exponential decay in the planar coordinate (orthogonal to the magnetic field) and a \(1/|\cdot |\) decay in the longitudinal coordinate along the magnetic field. The latter prevails and leads to a logarithmically enhanced area-law. Our main result is formulated in Theorem 4.1.

As in previous proofs there are two parts to proving such a result. Firstly, we prove a two-term asymptotic expansion for polynomials (see Theorem 2.3). Due to the product structure of the ground state, see (2.7), we can dimensionally reduce the asymptotics of the three-dimensional problem to an asymptotic expansion of a one-dimensional problem with localizing sets \(L\Lambda _{x^\perp }\subset {{\mathbb {R}}}\) and with the spectral projection of the one-dimensional Laplacian, see Lemma 3.2. The corresponding asymptotic expansion was already proved by Landau and Widom [19] and then improved by Widom [36]. But here we need to take care of the error term, which depends on the planar coordinate \(x^\perp \in {{\mathbb {R}}}^2\), and integrate over \(x^\perp \). To this end, we show that the error term is of order one and is integrable as a function of \(x^\perp \) under some assumptions on \(\Lambda \). We believe that the precise description of the error term for the one-dimensional free case in terms of the finite collection of intervals \(\Lambda _{x^\perp }\) is of independent interest, and we provide a proof in Appendix C. This dimensional reduction is also the strategy of Widom [37] and of Sobolev in the proof of the Widom conjecture in [32]. In fact, due to the fast (exponential) decay in the planar direction, error estimates are simpler to obtain than in the case with no magnetic field. This and the improved Landau–Widom (or Widom) asymptotics allow us to prove, for \(\textsf{C}^{1,\alpha }\) (smooth) regions \(\Lambda \), an error term (for polynomials as in Theorem 2.3) of the order \(L^2\) rather than merely of lower order than \(L^2\ln (L)\) as in [32, Theorem 2.9].

Secondly, in Sect. 4 we make the transition in the asymptotic expansion from polynomials to the entropy function. This requires certain Schatten–von Neumann quasi-norm bounds presented in Sect. 5, which in turn are based on bounds obtained in previous papers [20, 21] and notably by Sobolev [34].

The smoothness conditions on the region \(\Lambda \) to prove our two-term asymptotic result with error term \(o(L^2\ln (L))\) are rather weak; namely, we require \(\Lambda \) to be only piecewise Lipschitz smooth. For a smooth region \(\Lambda \), one would expect the next lower order term to be of the order \(L^2\). This is indeed true if the boundary \(\partial \Lambda \) is piecewise \({\textsf{C}}^{1,\alpha }\) smooth. We also present regions with weaker regularity on the boundary for which the error term (for a quadratic polynomial) can be arbitrarily close to the leading \(L^2\ln (L)\)-term. This may also be of independent interest and is the content of Sect. 6.

A note on our notation: As L, \(L\ge 1\), is our scaling parameter that tends to infinity, we use the “big-O” and “small-o” notation in the sense that for two functions f and g on \({{\mathbb {R}}}^+\), \(f = O(g)\) if \(\limsup _{L\rightarrow \infty } f(L)/g(L) <\infty \) and \(f = o(g)\) if \(\limsup _{L\rightarrow \infty } f(L)/g(L) =0\). By C with or without indices, we denote various positive, finite constants, whose precise values are of no importance and may even change from line to line.

2 Setup

We consider a nonzero constant magnetic field in \({{\mathbb {R}}}^3\) of strength B which is perpendicular to a plane. We assume without loss of generality that this constant magnetic field points in the positive z-direction with \(B>0\).

We denote the Euclidean norm in \({{\mathbb {R}}}^d\), \(d\in {\mathbb {N}}\), or the norm in the Hilbert space \(\textsf{L}^2({{\mathbb {R}}}^d)\) of complex-valued, square-integrable functions on \({{\mathbb {R}}}^d\) by the same symbol \(\Vert \cdot \Vert \). For \(x \in {{\mathbb {R}}}\), let \(\langle x \rangle :=\sqrt{ 1+ x^2}\) denote the Japanese bracket. For a Borel set \(\Omega \subset {{\mathbb {R}}}^d\) and \(k <d\), let \({\mathcal {H}}^k(\Omega )\) be the k-dimensional Hausdorff measure of \(\Omega \), \(\#\Omega ={\mathcal {H}}^0(\Omega )\) its counting measure, and let \(|\Omega |\) be its d-dimensional Lebesgue measure/volume. By \(\mathbb {1}_{\Omega }\) we denote the multiplication operator on \({\textsf{L}}^2({{\mathbb {R}}}^d)\) by the indicator function \(1_\Omega \) of the set \(\Omega \). As usual, we write \(\Omega ^{\textrm{c}} :={{\mathbb {R}}}^d{\setminus }\Omega \) for the complement of \(\Omega \).

For \(r>0\), \(x\in {{\mathbb {R}}}^d\), and a set \(X\subset {{\mathbb {R}}}^d\) we denote by

$$\begin{aligned} B_r(x) :=\big \{y\in {{\mathbb {R}}}^d : \Vert y-x\Vert<r\big \} ,\quad B_r(X) :=X+B_r(0):=\big \{x+y: x\in X, \Vert y\Vert <r\big \} \end{aligned}$$
(2.1)

the open ball of radius r with center x and the (open) r-neighborhood of the set \(X\subset {{\mathbb {R}}}^{d}\) of width r, respectively. In most cases, the dimension, d, is clear from the context and we omit it in the definition; if not, we write \(B^{(d)}_r(x)\). We denote the closed ball of radius r with center x by \(\overline{B}_r^{(d)}(x)\).

For a point \(x\in {{\mathbb {R}}}^3\), we write \(x = (x^\perp ,x^\parallel )\) with (planar coordinate) \(x^\perp \in {{\mathbb {R}}}^2\) and (longitudinal coordinate) \(x^\parallel \in {{\mathbb {R}}}\), and \(\nabla = (\nabla ^\perp ,\nabla ^\parallel )\), where \(\nabla ^\perp \) and \(\nabla ^\parallel \) are the gradients in the respective Cartesian coordinates.

By our assumption, the magnetic field is equal to \(B\cdot e_3\) with \(e_3 :=(0,0,1)\). We use the symmetric gauge \(a:{{\mathbb {R}}}^2\rightarrow {{\mathbb {R}}}^2\) defined as \(a(x^\perp ) :=B/2\,(-x^\perp _2,x^\perp _1)\) so that the curl

$$\begin{aligned} \nabla \times (a,0) = B\cdot e_3 . \end{aligned}$$
(2.2)

The one-particle Hamiltonian of the ideal Fermi gas in three-dimensional Euclidean space \({{\mathbb {R}}}^3\) subject to the magnetic field \(B\cdot e_3\) is informally given by

$$\begin{aligned} \textrm{H}_B :=(-\textrm{i}\nabla ^\perp + a)^2 + (-\textrm{i}\nabla ^\parallel )^2. \end{aligned}$$
(2.3)

We use physical units such that Planck’s constant \(\hbar = 1\), the mass is equal to 1/2 and the charge of the particles is equal to one. \(\textrm{H}_B\) is well defined as a self-adjoint operator on a suitable domain in the one-particle Hilbert space \(\textsf{L}^2({{\mathbb {R}}}^3)\).

The ground state of free fermions with one-particle Hamiltonian \(\textrm{H}_B\) is described by the spectral projection (or Fermi projection) \(\textrm{D}_\mu :=\mathbb {1}(\textrm{H}_B\le \mu ) :=1_{(-\infty ,\mu ]}(\textrm{H}_B)\) of \(\textrm{H}_B\) below some so-called Fermi energy (or chemical potential) \(\mu \in {{\mathbb {R}}}\). As is well-known, we have [10, 18]

$$\begin{aligned} (-\textrm{i}\nabla ^\perp + a)^2 = B\sum _{\ell =0}^\infty (2\ell +1) \textrm{P}_\ell \end{aligned}$$
(2.4)

with explicitly known (infinite-dimensional) eigenprojections \(\textrm{P}_\ell \) on \(\textsf{L}^2({{\mathbb {R}}}^2)\). In order to write down these projections, let us introduce the Laguerre polynomials, \(\mathcal {L}_\ell (t) :=\sum _{j=0}^\ell \frac{(-1)^j}{j!}\, \left( {\begin{array}{c}\ell \\ \ell - j\end{array}}\right) \, t^j\), \(t\ge 0\), of degree \(\ell \in {\mathbb {N}}_0\). Then, the integral kernel of \(\textrm{P}_\ell \) is given by the function

$$\begin{aligned} p_\ell (x^\perp ,y^\perp )&:=\frac{B}{2\pi } \,\mathcal {L}_\ell \big (B\Vert x^\perp -y^\perp \Vert ^2/2\big )\,\exp \big (-B\Vert x^\perp -y^\perp \Vert ^2/4 + \textrm{i}{\textstyle \frac{B}{2}} x^\perp \wedge y^\perp \big ) ,\nonumber \\&\quad x^\perp ,y^\perp \in {\mathbb {R}}^2. \end{aligned}$$
(2.5)

Here, \(\wedge \) refers to the exterior or wedge product on \({{\mathbb {R}}}^2\). The explicit description of this kernel is not relevant for this paper. We only use the exponential decay in \(\Vert x^\perp -y^\perp \Vert ^2\) and \(p_\ell (x^\perp ,x^\perp )=B/(2 \pi )\). In the z-direction, we encounter the spectral projection \(\mathbb {1}((-\textrm{i}\nabla ^\parallel )^2\le \mu )\) with (sine) integral kernel, \(\mathbb {1}((-\textrm{i}\nabla ^\parallel )^2\le \mu )(z,z')=k_\mu (z-z')\),

$$\begin{aligned} k_\mu (z):={\left\{ \begin{array}{ll} \frac{\sin (\sqrt{\mu }z)}{\pi z} &{}{} \text { for } z\in {{\mathbb {R}}}{\setminus } \{0\} \\ \lim _{z \rightarrow 0} k_{\mu }(z) = \frac{\sqrt{\mu }}{\pi }&{}{} \text { for } z=0 \end{array}\right. } \quad \mu >0. \end{aligned}$$
(2.6)
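A short way to see (2.6): in Fourier representation, \(\mathbb {1}((-\textrm{i}\nabla ^\parallel )^2\le \mu )\) is multiplication by the indicator of the interval \([-\sqrt{\mu },\sqrt{\mu }]\), so its position-space kernel is

$$\begin{aligned} k_\mu (z) = \frac{1}{2\pi }\int _{-\sqrt{\mu }}^{\sqrt{\mu }} \textrm{d}p\, \textrm{e}^{\textrm{i}p z} = \frac{\sin (\sqrt{\mu }\,z)}{\pi z} ,\quad z\in {{\mathbb {R}}}{\setminus }\{0\} . \end{aligned}$$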

The following factorization of spectral projections, which stems from the fact that the magnetic field points in the z-direction, is crucial. We work with the identification \({\textsf{L}}^2({{\mathbb {R}}}^2) \otimes {\textsf{L}}^2({{\mathbb {R}}})={\textsf{L}}^2({{\mathbb {R}}}^3)\). Since the spectrum of \(\textrm{H}_{B}\) is the set \([B,\infty )\), we may always consider \(\mu > B\), since for smaller values of \(\mu \) the ground-state projection is zero. If \(B<\mu \le 3B\), then \(\textrm{D}_\mu = \textrm{P}_0 \otimes \mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu - B]\). For higher values of \(\mu \), let \(\nu :=\lceil \frac{1}{2}(\mu /B -1)\rceil \in {\mathbb {N}}\) be the smallest integer larger than or equal to \(\frac{1}{2}(\mu /B -1)\), and let us set \(\mu (\ell ) :=\mu - B(2\ell +1)\). Then,

$$\begin{aligned} \textrm{D}_\mu = \mathbb {1}(\textrm{H}_{B}\le \mu ) = \sum _{\ell =0}^{\nu -1} \textrm{P}_\ell \otimes \mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu (\ell )] \end{aligned}$$
(2.7)

with integral kernel (\(x=(x^\perp ,x^\parallel ), y=(y^\perp ,y^\parallel )\))

$$\begin{aligned} \textrm{D}_\mu (x,y) = \sum _{\ell =0}^{\nu -1} p_\ell (x^\perp ,y^\perp ) k_{\mu (\ell )}(x^\parallel -y^\parallel ). \end{aligned}$$
(2.8)
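To illustrate the bookkeeping, consider for instance \(\mu = 4B\): then only the two lowest Landau levels contribute, with \(\mu (0)=3B\) and \(\mu (1)=B\), and

$$\begin{aligned} \nu = \big \lceil \tfrac{1}{2}(4 -1)\big \rceil = 2 ,\qquad \textrm{D}_\mu = \textrm{P}_0 \otimes \mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le 3B] + \textrm{P}_1 \otimes \mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le B] . \end{aligned}$$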

For any Borel subset \(\Lambda \subset {{\mathbb {R}}}^3\) we define the spatial reduction (or truncation) of \(\textrm{D}_\mu \) to \(\Lambda \) by

$$\begin{aligned} \textrm{D}_\mu (\Lambda ) :=\mathbb {1}_\Lambda \textrm{D}_\mu \mathbb {1}_\Lambda . \end{aligned}$$
(2.9)

Before we define the main object in this paper, we introduce for any \(\gamma >0\) the \(\gamma \)-Rényi entropy function, \(h_\gamma :[0,1]\rightarrow [0,\ln (2)]\),

$$\begin{aligned} h_\gamma (t) :=&\ \frac{1}{1-\gamma }\ln \big (t^\gamma + (1-t)^\gamma \big ) ,\ \gamma \not = 1 , \end{aligned}$$
(2.10)
$$\begin{aligned} h_1(t) :=&-t\ln (t) - (1-t)\ln (1-t) \text{ if } t\not \in \{0,1\} \text{ and } h_1(0):=h_1(1):=0 . \end{aligned}$$
(2.11)
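The function \(h_1\) is the limiting case of \(h_\gamma \) as \(\gamma \rightarrow 1\); indeed, by l'Hôpital's rule,

$$\begin{aligned} \lim _{\gamma \rightarrow 1} h_\gamma (t) = \lim _{\gamma \rightarrow 1} \frac{t^\gamma \ln (t) + (1-t)^\gamma \ln (1-t)}{-\big (t^\gamma + (1-t)^\gamma \big )} = -t\ln (t) - (1-t)\ln (1-t) = h_1(t) ,\quad t\in (0,1) . \end{aligned}$$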

Now, for a ground state described by the projection \(\textrm{D}_\mu =\mathbb {1}(\textrm{H}_B\le \mu )\) as above, a Borel subset \(\Lambda \subset {{\mathbb {R}}}^3\), and localized ground-state projection, \(\textrm{D}_\mu (\Lambda ) \), we define the \(\gamma \)-Rényi entanglement entropy of the ground state at Fermi energy \(\mu \) localized (in space) to \(\Lambda \) by

$$\begin{aligned} \mathrm S_\gamma (\Lambda ) :=\textrm{tr}\,h_\gamma (\textrm{D}_\mu (\Lambda )) . \end{aligned}$$
(2.12)

Here, \(\textrm{tr}\,\) refers to the (usual Hilbert space) trace on \(\textsf{L}^2({{\mathbb {R}}}^3)\). For bounded \(\Lambda \), \(h_\gamma (\textrm{D}_\mu (\Lambda ))\) is trace-class by the same arguments as in the proof of Lemma 7 in [21]; thus, the entanglement entropy \(\mathrm S_\gamma (\Lambda )\) is a finite, non-negative number. This entropy is a rather complicated function of \(\Lambda \), but its leading behavior can be described for large regions. To this end, we scale a fixed set \(\Lambda \) by \(L, L \ge 1\), and we determine the leading growth (scaling) of the entropy \(\mathrm S_\gamma (L\Lambda )\) as \(L\rightarrow \infty \).

As there does not seem to be a common definition for regions with piecewise differentiable boundary, we will now provide the one used in this paper.

Definition 2.1

Let \(0< \alpha <1, d\in {\mathbb {N}}\). A region \(\Lambda \subset {{\mathbb {R}}}^{d+1}\) is a finite union of bounded, open, connected sets in \({{\mathbb {R}}}^{d+1}\) such that their closures (denoted by \(\bar{\cdot }\)) are disjoint. The boundary \(\partial \Lambda \) is the set \(\bar{\Lambda }{\setminus }\Lambda \). We assume that the closures \(\overline{\Lambda }\) and \(\overline{\Lambda ^{\textrm{c}}}\) are topological manifolds with boundary \(\partial \Lambda \).

We call a bi-Lipschitz map \(\Psi :[0,1]^d \rightarrow \partial \Lambda \) a Lipschitz chart of \(\partial \Lambda \) if \(\Psi ((0,1)^d) \subset \partial \Lambda \) is relatively open. If in addition \(\Psi \in \textsf{C}^1((0,1)^d)\) and its differential \(D\Psi \) satisfies the Hölder condition

$$\begin{aligned} \Vert D\Psi (x) - D\Psi (y) \Vert \le C \Vert x-y \Vert ^\alpha , \quad x,y \in (0,1)^d , \end{aligned}$$
(2.13)

for some constant C, we say that \(\Psi \) is a \(\textsf{C}^{1,\alpha }\) chart. A finite set of charts \((\Psi _i)_{i \in I}\) is called a piecewise atlas of \(\partial \Lambda \) if \( \partial \Lambda =\bigcup _{i\in I} \Psi _i([0,1]^d)\), and a global atlas of \(\partial \Lambda \) if \( \partial \Lambda =\bigcup _{i\in I} \Psi _i((0,1)^d)\). We say an atlas is a Lipschitz atlas (resp. \(\textsf{C}^{1,\alpha }\)) if it consists of Lipschitz (resp. \(\textsf{C}^{1,\alpha }\)) charts.

We say that \(\Lambda \) is a piecewise Lipschitz region (resp. global Lipschitz region) if \(\partial \Lambda \) admits a piecewise Lipschitz atlas \((\Psi _{\text {pL},i})_{i \in I}\) (resp. global Lipschitz atlas \((\Psi _{\text {gL},i})_{i \in I}\)). We call \(\Lambda \) a piecewise \(\textsf{C}^{1,\alpha }\) region if it admits both a global Lipschitz atlas \((\Psi _{\text {gL},j})_{j \in J}\) and a piecewise \(\textsf{C}^{1,\alpha }\) atlas \((\Psi _{\text {pC},i})_{i \in I}\).

For a piecewise \(\textsf{C}^{1,\alpha }\) region \(\Lambda \), we fix a piecewise \(\textsf{C}^{1,\alpha }\) atlas \((\Psi _{\text {pC},i})_{i \in I}\) and define the set of all edges, \(\Gamma \) by

$$\begin{aligned} \Gamma :=\bigcup _{i \in I}\Psi _{\text {pC},i}(\partial ( [0,1]^d)) . \end{aligned}$$
(2.14)

Remarks 2.2

  1. (i)

    Any global Lipschitz region is obviously a piecewise Lipschitz region.

  2. (ii)

Our definition of a global Lipschitz region is a bit more general than the usual notion of a strong Lipschitz region (see [2, Pages 66–67]), where every \(v \in \partial \Lambda \) has a neighborhood \(U_v \subset {{\mathbb {R}}}^{d+1}\) such that, after an affine-linear transformation, the set \(\Lambda \cap U_v\) is the region below the graph of a Lipschitz function \(\Psi _v :(0,1)^d \rightarrow {{\mathbb {R}}}\). To get from this to our definition, one can choose the graph map \(x \mapsto (x, \Psi _v(x))\) on \((0,1)^d\) as the bi-Lipschitz chart needed in our definition. (As a Lipschitz function, it naturally extends to all of \([0,1]^d\).)

  3. (iii)

    For a piecewise Lipschitz region \(\Lambda \subset {{\mathbb {R}}}^{d+1}\) and for \(v \in \partial \Lambda \), let n(v) be the unit outward normal vector at v. This is only well defined up to null sets with respect to the d-dimensional Hausdorff (surface) measure \({\mathcal {H}}^{d}\) on \(\partial \Lambda \), see Lemma A.6.

  4. (iv)

As the set of edges, \(\Gamma \), depends on the chosen piecewise \(\textsf{C}^{1,\alpha }\) atlas \((\Psi _{\textrm{pC},i})_{i\in I}\), different atlases may yield different sets \(\Gamma \).

For a continuous function \(f:[0,1]\rightarrow {\mathbb {C}}\) with \(f(0)=0\) which is Hölder continuous at the two endpoints 0 and 1, we introduce the linear functional

$$\begin{aligned} f\mapsto {\textsf{I}}(f) :=\frac{1}{4\pi ^2}\int _0^1\mathrm dt\, \frac{f(t)-tf(1)}{t(1-t)} . \end{aligned}$$
(2.15)

By our assumption, \(|{\textsf{I}}(f)|<\infty \). We note for later use two special cases. Namely, \(\textsf{I}(m):=\mathsf I((\cdot )^m) = -1/(4\pi ^2)\,\sum _{r=1}^{m-1}r^{-1}\); as usual we interpret the sum on the right-hand side as zero if \(m=1\), which coincides with the vanishing of \(\textsf{I}\) on affine linear functions. The second example concerns the \(\gamma \)-Rényi entropy function \(h_\gamma \) defined in (2.10). Here, \(\mathsf I(h_\gamma ) = (1+\gamma )/(24\gamma )\), see [20].
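As a simple illustration of (2.15), for the quadratic monomial \(f(t)=t^2\) the integrand simplifies to a constant, and

$$\begin{aligned} \textsf{I}(2) = \frac{1}{4\pi ^2}\int _0^1\textrm{d}t\, \frac{t^2-t}{t(1-t)} = -\frac{1}{4\pi ^2}\int _0^1\textrm{d}t = -\frac{1}{4\pi ^2} , \end{aligned}$$

in accordance with the formula \(\textsf{I}(m) = -1/(4\pi ^2)\,\sum _{r=1}^{m-1}r^{-1}\) at \(m=2\).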

Our first main result is the following theorem, which we prove in the next section.

Theorem 2.3

Let \(f:[0,1]\rightarrow {\mathbb {C}}\) be a polynomial with \(f(0)=0\), let \(\Lambda \subset {{\mathbb {R}}}^3 , \mu>B>0, \nu :=\lceil \frac{1}{2}(\mu /B -1)\rceil \in {\mathbb {N}}\), the smallest integer larger or equal to \(\frac{1}{2}(\mu /B -1)\), and \(\mu (\ell ) :=\mu - (2\ell +1)B\). Let \( \textrm{D}_\mu (L\Lambda ) \) be the operator defined in (2.9).

  1. (i)

    If \(\Lambda \) is a piecewise Lipschitz region (see Definition 2.1), then we have the asymptotic expansion of the trace on \(\textsf{L}^2({{\mathbb {R}}}^3)\),

    $$\begin{aligned} \textrm{tr}\,f(\textrm{D}_\mu ({L\Lambda }))&= L^3 \frac{B}{2\pi ^2} \,\sum _{\ell =0}^{\nu -1} \sqrt{\mu (\ell )} f(1) |\Lambda | \nonumber \\&+ L^2\ln (L) \nu B \,\textsf{I}(f) \,\frac{1}{\pi } \int _{\partial \Lambda } \textrm{d} {\mathcal {H}}^2(v) \,|n(v) \cdot e_3 |+ o(L^2\ln (L)) , \end{aligned}$$
    (2.16)

    as \(L\rightarrow \infty \). Here, n(v) is the unit normal outward vector at \(v\in \partial \Lambda \), which is well defined for almost every \(v \in \partial \Lambda \), and \({\mathcal {H}}^2\) is the two-dimensional (surface) Hausdorff measure on \(\partial \Lambda \).

  2. (ii)

    If \(\Lambda \) is a piecewise \(\textsf{C}^{1,\alpha }\) region (see Definition 2.1), then the error term is \(O(L^2)\) instead of \(o(L^2\ln (L))\).

Remarks 2.4

  1. (i)

    The condition \(f(0)=0\) is no restriction in the sense that in general the operator on the left-hand side has to be replaced by \(f(\textrm{D}_\mu ({L\Lambda })) - f(0)\textrm{D}_\mu ({L\Lambda })\) and \(\textsf{I}(f)\) on the right-hand side by \(\textsf{I}(\tilde{f})\) with \(\tilde{f}(t):= f(t) - (1-t)f(0)\).

  2. (ii)

    For the ideal Fermi gas with one-particle Hamiltonian \(\textrm{H}_0 = -\Delta \) on \(\textsf{L}^2({{\mathbb {R}}}^3)\), Fermi energy \(\mu >0\), ground state Fermi projection \(\textrm{D}_\mu = \mathbb {1}(-\Delta \le \mu )\) and Fermi sea \(\Gamma :=\{p\in {{\mathbb {R}}}^3 : p^2\le \mu \}\) it was proved in [20] that

    $$\begin{aligned} \textrm{tr}\,f(\textrm{D}_\mu (L\Lambda )) = L^3 f(1) |\Gamma /(2\pi ) ||\Lambda |+ L^2\ln (L) \,\frac{\mu }{2\pi }\,\textsf{I}(f) \, {\mathcal {H}}^2(\partial \Lambda ) + o(L^2\ln (L))\nonumber \\ \end{aligned}$$
    (2.17)

    as \(L\rightarrow \infty \). To this end, note that \(|\Gamma | = \frac{4\pi }{3} \mu ^{3/2}\) and that our functional \(\textsf{I}\) here is the same as the functional I in [20]. The double-surface integral \(J(\partial \Gamma ,\partial \Lambda )\) [20, (2)] equals \(\frac{\mu }{2\pi }\mathcal H^{2}(\partial \Lambda )\). Letting B tend to zero in (2.16) but keeping the Fermi energy \(\mu \) fixed, the prefactor \(\nu B\) tends to \(\mu /2\). The remaining integral over \(\partial \Lambda \) is independent of the strength B and remains fixed. For the volume term, we have in this limit

    $$\begin{aligned} \frac{B}{2\pi ^2}\sum _{\ell =0}^{\nu -1} \sqrt{\mu - (2\ell +1)B} \sim \frac{\mu ^{3/2}}{4\pi ^2 \nu } \sum _{\ell =0}^\nu \sqrt{1-\ell /\nu } \sim \frac{\mu ^{3/2}}{4\pi ^2} \int _0^1\textrm{d}x\,\sqrt{x} = \frac{\mu ^{3/2}}{6\pi ^2} . \end{aligned}$$

    In this limit the volume term equals the above volume term at \(B=0\) as in (2.17). To summarize, we obtain

    $$\begin{aligned} \lim _{B\downarrow 0} \text{ rhs } \text{ of } (2.16)&= L^3 f(1) \frac{\mu ^{3/2}}{6\pi ^2}|\Lambda | + L^2\ln (L) \frac{\mu }{2\pi } \,\mathsf I(f) \int _{\partial \Lambda } \textrm{d}{\mathcal {H}}^2(v) \,|n(v)\cdot e_3| \\&\quad + o(L^2\ln (L)) , \end{aligned}$$

    which is identical to the right-hand side (rhs) of (2.17) except for the prefactor depending on \(\partial \Lambda \).

  3. (iii)

    There is no ’level mixing’ at the order \(L^2\ln (L)\) in the sense that each Landau level enters individually in the numerical coefficient. In [21], we proved that level mixing occurs in the two-dimensional setting at the next-to-leading order, namely at the order L. We expect level mixing to occur in the present case at the order \(L^2\). This is certainly possible to prove, say for a cylindrical region, but it requires a three-term expansion in the \(x^\parallel \)-coordinate and the by now proved two-term expansion in the \(x^\perp \)-coordinate [21]. What prevents us from pursuing this question is that the mentioned three-term expansion has not been proved so far for the entropy function. This is an interesting open problem.

  4. (iv)

    For (2.16) to hold we require only weak regularity of the boundary \(\partial \Lambda \) like in the proof in [20] for the ideal Fermi gas. In contrast, the proof of the corresponding two-term asymptotics for the two-dimensional model in [21] required \(\textsf{C}^3\) smooth regions. This smoothness was a technical condition and may not be necessary. On the other hand and more importantly, only the leading contribution of the two-dimensional Hamiltonian enters and the extra logarithm stems from an expansion in the longitudinal direction, where weaker conditions suffice.

3 Proof of Theorem 2.3

We split the proof into two steps. The first one is the lemma below, which reduces the computation of the trace to an integral of the trace of the projection operator \(\mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu ]\) localized to the sets \(L\Lambda _{x^\perp }\subset {{\mathbb {R}}}\) with respect to \(x^\perp \in {{\mathbb {R}}}^2\). The second step starts from there, proves an asymptotic expansion of this trace, and finishes the proof of Theorem 2.3.

Definition 3.1

For any Borel set \(E\subset {{\mathbb {R}}}^3\) and any \(x^\perp \in {{\mathbb {R}}}^2\) we define \(E_{x^\perp } :=\{x^\parallel \in {{\mathbb {R}}}: (x^\perp ,x^\parallel )\in E\}\) to collect the third components of the intersection \(E\cap (\{x^\perp \}\times {{\mathbb {R}}})\).

Lemma 3.2

Let \(m \in {\mathbb {N}}\) with \(m \ge 2\). Then, under the same conditions as in Theorem 2.3(i), there is a constant C depending only on B, m and \(\mu \) such that

$$\begin{aligned}&\Big |\text {tr}\,(\text {D}_\mu ({L\Lambda }))^m - L^2\frac{B}{2 \pi }\sum _{\ell =0}^{\nu -1} \int _{{{\mathbb {R}}}^2} \text {d} x^\perp \, {\text{ tr }} \left( \mathbb {1}_{L \Lambda _{x^\perp }}\mathbb {1}[(-\text {i}\nabla ^\parallel )^2 \le \mu (\ell )] \mathbb {1}_{L \Lambda _{x^\perp }} \right) ^m\Big |\nonumber \\&\quad \le C \mathcal K(\Lambda ) L^2 , \end{aligned}$$
(3.1)

where the \(\Lambda \) dependent constant \({\mathcal {K}}(\Lambda )\) is defined in Lemma A.3; it is positive and finite for any piecewise Lipschitz region \(\Lambda \). Note that \(L \Lambda _{x^\perp } :=L \left( \Lambda _{x^\perp } \right) \) is (in general) different from \((L \Lambda )_{x ^\perp }\).

Proof

We utilize the same changes of coordinates in the first two components (that is, for the planar \(x_0^\perp \)-coordinates) as in [21]. For the convenience of the reader we repeat all steps.

As \(\Lambda \) is bounded, the operator \(\mathbb {1}_{L\Lambda } \textrm{D}_\mu \) is Hilbert–Schmidt and therefore \(\textrm{D}_\mu (L \Lambda )\) is trace-class. We may write

$$\begin{aligned} \textrm{tr}\,\textrm{D}_\mu (L \Lambda )^m = \int _{{{\mathbb {R}}}^3} \text{ d } x_0\, \textrm{D}_\mu (L \Lambda )^m(x_0,x_0) \end{aligned}$$
(3.2)

with integral kernel

$$\begin{aligned} \textrm{D}_\mu (L \Lambda )(x,y) = \sum _{\ell =0}^{\nu -1} p_\ell (x^\perp ,y^\perp ) k_{\mu (\ell )}(x^\parallel ,y^\parallel ) ,\quad x=(x^\perp ,x^\parallel ) ,y=(y^\perp ,y^\parallel ) , \end{aligned}$$
(3.3)

as in (2.8). Therefore, the trace is of the form

$$\begin{aligned} \text {tr}\,\text {D}_\mu (L \Lambda )^m&= \int _{L\Lambda } \mathrm dx_0\sum _{\ell _1,\ldots ,\ell _m=0}^{\nu -1} \int _{{{\mathbb {R}}}^{2(m-1)}} \text { d } x^\perp _1\cdots \text { d } x^\perp _{m-1}\, p_{\ell _1}(x^\perp _0,x^\perp _1)\\ {}&\quad p_{\ell _2}(x^\perp _1,x^\perp _2)\cdots p_{\ell _m}(x^\perp _{m-1},x^\perp _0)\\ {}&\quad \times \,\int _{{{\mathbb {R}}}^{m-1}} \text { d } x_1^\parallel \cdots \text { d } x_{m-1}^\parallel \, k_{\mu (\ell _1)}(x^\parallel _0-x_1^\parallel ) \cdots k_{\mu (\ell _m)}(x_{m-1}^\parallel -x^\parallel _0)\,\\ {}&\quad \times 1_{L\Lambda }(x_1)\cdots 1_{L\Lambda }(x_{m-1}) . \end{aligned}$$

We begin by approximating \(1_{L\Lambda }(x_j)\) by \(1_{L\Lambda }(x_0^\perp ,x_j^\parallel )\). We call the resulting approximate term \(T(L\Lambda )\). This means

$$\begin{aligned} T(L \Lambda )&:=\int _{L \Lambda } \mathrm d x_0\sum _{\ell _1,\ldots ,\ell _m=0}^{\nu -1} \int _{{{\mathbb {R}}}^{2(m-1)}} \text{ d } x^\perp _1\cdots \text{ d } x^\perp _{m-1}\, p_{\ell _1}(x^\perp _0,x^\perp _1) p_{\ell _2}(x^\perp _1,x^\perp _2)\\&\quad \cdots p_{\ell _m}(x^\perp _{m-1},x^\perp _0)\\&\quad \times \,\int _{{{\mathbb {R}}}^{m-1}} \text{ d } x_1^\parallel \cdots \text{ d } x_{m-1}^\parallel \, k_{\mu (\ell _1)}(x_0^\parallel -x_1^\parallel ) \cdots k_{\mu (\ell _m)}(x_{m-1}^\parallel -x_0^\parallel )\, \\&\quad \times 1_{L\Lambda }(x_0^\perp ,x_1^\parallel )\cdots 1_{L\Lambda }(x_0^\perp ,x_{m-1}^\parallel )\, . \end{aligned}$$

As the second line is independent of \(x_j^\perp \), the integrals over \(x_1^\perp , \ldots , x_{m-1}^\perp \) can be easily resolved and yield the diagonal of the integral kernel of the operator \(\textrm{P}_{\ell _1} \cdots \textrm{P}_{\ell _{m}}\) at \(x_0^\perp \), which is \(B/(2\pi )\), if \(\ell _1= \dots =\ell _m\) and 0 otherwise. Thus, we have

$$\begin{aligned} T(L \Lambda )&= \int _{L \Lambda } \mathrm dx_0\sum _{\ell =0}^{\nu -1} \frac{B}{2\pi } \quad \int _{{{\mathbb {R}}}^{m-1}} \text{ d } x_1^\parallel \cdots \text{ d } x_{m-1}^\parallel \, k_{\mu (\ell )}(x_0^\parallel -x_1^\parallel )\cdots k_{\mu (\ell )}(x_{m-1}^\parallel -x_0^\parallel )\, \\ {}&\quad \times 1_{L\Lambda }(x_0^\perp ,x_1^\parallel )\cdots 1_{L\Lambda }(x_0^\perp ,x_{m-1}^\parallel ) . \end{aligned}$$

Now, we set \(x^\perp :=x_0^\perp /L \) and observe \(1_{L \Lambda }(x^\perp _0, x_j^\parallel )=1_{L \Lambda _{x ^\perp }}(x_j^\parallel )\). Therefore, we have

$$\begin{aligned} T(L \Lambda )&= L^2 \int _{{{\mathbb {R}}}^2} \mathrm d x ^\perp \sum _{\ell =0}^{\nu -1} \frac{B}{2\pi } \int _{L \Lambda _{x^\perp } } \mathrm dx_0^\parallel \int _{{{\mathbb {R}}}^{m-1}} \text{ d } x_1^\parallel \cdots \text{ d } x_{m-1}^\parallel \, \\&\quad \times k_{\mu (\ell )}(x_0^\parallel -x_1^\parallel ) \cdots k_{\mu (\ell )}(x_{m-1}^\parallel -x_0^\parallel )\, 1_{L\Lambda }(Lx^\perp ,x_1^\parallel )\cdots 1_{L\Lambda }(Lx^\perp ,x_{m-1}^\parallel )\, \\&= L^2 \frac{B}{2\pi }\int _{{{\mathbb {R}}}^2} \mathrm d x ^\perp \sum _{\ell =0}^{\nu -1} \textrm{tr}\,\left( \mathbb {1}_{L \Lambda _{x ^\perp } } \mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu (\ell )] \mathbb {1}_{L \Lambda _{x^\perp } } \right) ^m , \end{aligned}$$

which is the expression in the claim. Thus, we are left to bound the error term of our approximation. Let us denote by \(U \subset {{\mathbb {R}}}^{3m}\) the set of all tuples \((x_0,x_1, \dots ,x_{m-1})\) where \(1_{L\Lambda } (x_0) 1_{L \Lambda } (x_1) \cdots 1_{L \Lambda } (x_{m-1}) \) is not equal to \(1_{L\Lambda } (x_0) 1_{L \Lambda } (x_0^\perp ,x_1^\parallel ) \cdots 1_{L \Lambda } (x_0^\perp ,x_{m-1}^\parallel )\). Then, using the notation \(x_m :=x_ 0\) we trivially have

$$\begin{aligned} \big |T(L \Lambda ) - \textrm{tr}\,\textrm{D}_\mu (L \Lambda )^m \big |\le \int _U \mathrm dx_0 \mathrm dx_1 \cdots \mathrm dx_{m-1} \prod _{j=0}^{m-1} |\mathrm D_\mu (x_j, x_{j+1} ) |. \end{aligned}$$
(3.4)

We will now enlarge U until we get a set where the integral can easily be calculated. Let \((x_0,x_1,\dots , x_{m-1}) \in U\). Then, there is a \(j \in \{ 1, \dots , m-1\}\) such that \(1_{L \Lambda } (x_j) \ne 1_{L \Lambda } (x_0^\perp , x_j^\parallel )\). Thus, the line between \(x_j\) and \((x_0^\perp ,x_j^\parallel )\) has to intersect the boundary \(L \partial \Lambda \), which implies \({\text {dist}}(x_j, L \partial \Lambda ) \le \Vert x_j^\perp - x_0^\perp \Vert \). By the triangle and mean inequalities, we observe that

$$\begin{aligned} {\text {dist}}(x_j, L \partial \Lambda ) \le \Vert x_j^\perp - x_0^\perp \Vert \le \sum _{k=1} ^{m} \Vert x_k^\perp - x_{k-1}^\perp \Vert \le \sqrt{m} \sqrt{ \sum _{k=1} ^{m} \Vert x_k^\perp - x_{k-1}^\perp \Vert ^2 } . \end{aligned}$$
(3.5)

For \(j \in \{0, \dots , m-1\}\), let \(U_j \subset {{\mathbb {R}}}^{3m}\) be the set of all \((x_0,x_1, \dots , x_{m-1})\in {{\mathbb {R}}}^{3m}\) satisfying

$$\begin{aligned} {\text {dist}}(x_j, L \partial \Lambda ) \le {\sqrt{m} } \sqrt{ \sum _{k=1} ^{m} \Vert x_k^\perp - x_{k-1}^\perp \Vert ^2 } . \end{aligned}$$
(3.6)

As \(U \subset \bigcup _{j=1}^{m-1} U_j\), we see that

$$\begin{aligned} \int _{U} \mathrm dx_0 \mathrm dx_1 \cdots \mathrm dx_{m-1} \prod _{j=0}^{m-1} |\mathrm D_\mu (x_j, x_{j+1} ) |&\le \sum _{k=1} ^{m-1} \int _{U_k} \mathrm dx_0 \mathrm dx_1 \cdots \mathrm dx_{m-1} \prod _{j=0}^{m-1} |\mathrm D_\mu (x_j, x_{j+1} )|\end{aligned}$$
(3.7)
$$\begin{aligned}&= (m-1) \int _{U_0} \mathrm dx_0 \mathrm dx_1 \cdots \mathrm dx_{m-1} \prod _{j=0}^{m-1} |\mathrm D_\mu (x_j, x_{j+1} )|. \end{aligned}$$
(3.8)

The cyclic parameter shift \((x_0,x_1, \dots , x_{m-1}) \mapsto (x_1, x_2, \dots , x_0)\) sends \(U_j\) to \(U_{j+1}\) and does not change the integrand. For \(1 \le j \le m\), let \(y_j :=x_j- x_{j-1}\). We will change variables from \((x_0,x_1, \dots , x_{m-1})\) to \((x_0,y_1, \dots , y_{m-1}) =:(x_0, {\textbf{y}})\). Using \(y_m^\perp = - \sum _{j=1}^{m-1} y_j^\perp \), similar to (3.5), we observe that

$$\begin{aligned} m \sum _{k=1} ^{m} \Vert x_k^\perp - x_{k-1}^\perp \Vert ^2 = m( \Vert {\textbf{y}} ^\perp \Vert ^2 + \Vert y_m^\perp \Vert ^2) \le m \Vert {\textbf{y}} ^\perp \Vert ^2 + m(m-1) \Vert {\textbf{y}} ^\perp \Vert ^2= m^2 \Vert {\textbf{y}}^\perp \Vert ^2 . \end{aligned}$$
(3.9)
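The middle inequality in (3.9) is the Cauchy–Schwarz bound applied to \(y_m^\perp = -\sum _{j=1}^{m-1}y_j^\perp \), namely

$$\begin{aligned} \Vert y_m^\perp \Vert ^2 = \Big \Vert \sum _{j=1}^{m-1} y_j^\perp \Big \Vert ^2 \le (m-1) \sum _{j=1}^{m-1} \Vert y_j^\perp \Vert ^2 = (m-1)\Vert {\textbf{y}}^\perp \Vert ^2 . \end{aligned}$$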

Thus, under this change of variables the set \(U_0\) is mapped into the set

$$\begin{aligned} V :=\left\{ (x_0,y_1, \dots , y_{m-1} ) \in {{\mathbb {R}}}^{3m} :{\text {dist}}(x_0, L \partial \Lambda ) \le m \Vert \textbf{y}^\perp \Vert \right\} . \end{aligned}$$
(3.10)

Let us first estimate the integrand in terms of the \(y_j\)’s. With (2.8), (2.5) and (2.6), we get

$$\begin{aligned} |\mathrm D_\mu (x_j, x_{j+1} ) |\le C_{\mu ,B,1} \frac{\exp (-B \Vert y_{j+1}^\perp \Vert ^2/8)}{\langle y_{j+1}^\parallel \rangle } . \end{aligned}$$
(3.11)

We recall that \(\langle x \rangle =\sqrt{1+ x^2}\) is the Japanese bracket.
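For the reader's convenience, we sketch how (3.11) follows from (2.8), (2.5) and (2.6): the Laguerre polynomial in (2.5) grows at most polynomially and is therefore dominated by a constant times \(\exp (B\Vert x^\perp -y^\perp \Vert ^2/8)\), while \(|\sin |\le 1\) in (2.6), so that

$$\begin{aligned} |p_\ell (x^\perp ,y^\perp )|\le C_{\ell ,B}\, \textrm{e}^{-B\Vert x^\perp -y^\perp \Vert ^2/8} ,\qquad |k_{\mu (\ell )}(z)|\le \min \Big (\frac{\sqrt{\mu (\ell )}}{\pi },\frac{1}{\pi |z|}\Big ) \le \frac{C_{\mu (\ell )}}{\langle z \rangle } , \end{aligned}$$

and summing over the finitely many Landau levels \(\ell \le \nu -1\) in (2.8) yields (3.11).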

For \(x_0 \in {{\mathbb {R}}}^3\), let \(\Omega _{x_0} :=\{ {\textbf{y}} ^\perp \in {{\mathbb {R}}}^{2(m-1)} :{\text {dist}}(x_0, L \partial \Lambda ) \le m\Vert {\textbf{y}}^\perp \Vert \}\), and thus \(V=\{(x_0,{\textbf{y}}) \in {{\mathbb {R}}}^{3m} :{\textbf{y}}^\perp \in \Omega _{x_0}\}\). We have

$$\begin{aligned} \int _{U_0} \mathrm dx_0 \mathrm dx_1&\cdots \mathrm dx_{m-1} \prod _{j=0}^{m-1} |\mathrm D_\mu (x_j, x_{j+1} )|\end{aligned}$$
(3.12)
$$\begin{aligned}&\le C_{\mu ,B,1}^m \int _{V} \mathrm dx_0 \mathrm d \textbf{y}\prod _{j=1}^m \frac{\exp (-B \Vert y_{j}^\perp \Vert ^2/8)}{\langle y_{j}^\parallel \rangle } \end{aligned}$$
(3.13)
$$\begin{aligned}&= C_{\mu ,B,1}^m \left( \int _{{{\mathbb {R}}}^{m-1}} \mathrm d{\textbf{y}}^\parallel \prod _{j=1}^m \frac{1}{\langle y_{j}^\parallel \rangle } \right) \int _{{{\mathbb {R}}}^3} \mathrm dx_0 \int _{\Omega _{x_0}} \mathrm d{\textbf{y}}^\perp \exp (- B \Vert {\textbf{y}}^\perp \Vert ^2/8 ) . \end{aligned}$$
(3.14)

We need the estimate

$$\begin{aligned} \int _{{{\mathbb {R}}}^{m-1}} \mathrm d{\textbf{y}}^\parallel \prod _{j=1}^m \frac{1}{\langle y_{j}^\parallel \rangle } \le 2^m m! , \end{aligned}$$
(3.15)

which is proved in Appendix B. We also have the bound

$$\begin{aligned} \int _{\Omega _{x_0}} \mathrm d{\textbf{y}}^\perp \exp (- B \Vert {\textbf{y}}^\perp \Vert ^2 /8)&\le \sup _{{\textbf{y}}^\perp \in \Omega _{x_0}} \left( \exp (- B \Vert {\textbf{y}}^\perp \Vert ^2 /9)\right) \nonumber \\&\quad \int _{{{\mathbb {R}}}^{2(m-1)}} \mathrm d{\textbf{y}}^\perp \exp (- B \Vert {\textbf{y}}^\perp \Vert ^2 /72) \end{aligned}$$
(3.16)
$$\begin{aligned}&= \exp \left( \frac{ - B{\text {dist}}(x_0, L \partial \Lambda )^2 }{9m^2 } \right) \sqrt{72 \pi /B }^{2(m-1)} . \end{aligned}$$
(3.17)
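Here we split \(\exp (-B\Vert {\textbf{y}}^\perp \Vert ^2/8) = \exp (-B\Vert {\textbf{y}}^\perp \Vert ^2/9)\exp (-B\Vert {\textbf{y}}^\perp \Vert ^2/72)\) (note \(1/9+1/72=1/8\)), bounded the first factor on \(\Omega _{x_0}\) by its supremum, and evaluated the standard Gaussian integral

$$\begin{aligned} \int _{{{\mathbb {R}}}^{2(m-1)}} \textrm{d}{\textbf{y}}^\perp \exp (- B \Vert {\textbf{y}}^\perp \Vert ^2 /72) = \Big (\int _{{{\mathbb {R}}}} \textrm{d}s\, \textrm{e}^{-Bs^2/72}\Big )^{2(m-1)} = \sqrt{72 \pi /B }^{\,2(m-1)} . \end{aligned}$$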

Thus, we arrive at

$$\begin{aligned}&\int _{U_0} \mathrm dx_0 \mathrm dx_1 \cdots \mathrm dx_{m-1} \prod _{j=0}^{m-1} |\mathrm D_\mu (x_j, x_{j+1} )|\nonumber \\&\quad \le C_{\mu ,B,1}^m C_{B,2}^m m! \int _{{{\mathbb {R}}}^3} \mathrm dx_0 \exp \left( \frac{ - B{\text {dist}}(x_0, L \partial \Lambda )^2 }{9m^2 } \right) \end{aligned}$$
(3.18)
$$\begin{aligned}&\quad \le C_{\mu ,B,1}^m C_{B,2}^m m! \sum _{k=0}^\infty \left|B_{k+1} (L \partial \Lambda ) \right|\exp \left( -\frac{B}{9 m^2} k^2 \right) . \end{aligned}$$
(3.19)

Here, we used a \((1,\infty )\) Hölder estimate on the sets \(k \le {\text {dist}} (x_0,L\partial \Lambda )\le k+1\) for the integral over \({{\mathbb {R}}}^3\). We then enlarged these sets to the \((k+1)\)-neighborhood \(B_{k+1}(L \partial \Lambda )\), as their measures can be estimated more easily. Thus, using Lemma A.3 with \(d=2\) and \(r=k+1\), we arrive at

$$\begin{aligned}&|T(L \Lambda ) - \textrm{tr}\,\textrm{D}_\mu (L \Lambda )^m |\end{aligned}$$
(3.20)
$$\begin{aligned}&\le (m-1)(C_{\mu ,B,1} C_{B,2} )^m m! \sum _{k=0} ^\infty \left|B_{k+1} (L \partial \Lambda ) \right|\exp \left( -\frac{B}{9m^2} k^2 \right) \end{aligned}$$
(3.21)
$$\begin{aligned}&\le (m-1)(C_{\mu ,B,1} C_{B,2} )^m m! \sum _{k=0} ^\infty L^3 \left|B_{\frac{k+1}{L}} ( \partial \Lambda ) \right|\exp \left( -\frac{B}{9 m^2} k^2 \right) \end{aligned}$$
(3.22)
$$\begin{aligned}&\le (m-1)(C_{\mu ,B,1} C_{B,2} )^m m! \sum _{k=0} ^\infty L^3 {\mathcal {K}} (\Lambda ) \left( \frac{k+1}{L} + \frac{(k+1)^3}{L^3} \right) \exp \left( -\frac{B}{9 m^2} k^2 \right) \end{aligned}$$
(3.23)
$$\begin{aligned}&\le (m-1)(C_{\mu ,B,1} C_{B,2} )^m m! {\mathcal {K}} (\Lambda ) L^2 \sup _{t>0}\left( (t+1)^3(t+2)^2 \exp \left( -\frac{B}{9m^2} t^2 \right) \right) \nonumber \\&\quad \sum _{k=0} ^\infty \frac{1}{(k+1)(k+2)} \end{aligned}$$
(3.24)
$$\begin{aligned}&\le (m-1)(C_{\mu ,B,1} C_{B,2} )^m m! {\mathcal {K}} (\Lambda ) L^2 C_{B,3} m^{5} \le {\mathcal {K}} (\Lambda ) L^2 C_{\mu ,B}^m m! , \end{aligned}$$
(3.25)

which was our claim. \(\square \)

In the next step, we accomplish the

Proof of Theorem 2.3

As the expression is linear in f, it suffices to consider monomials \(f(t) = t^m\) with integer \(m\ge 1\). In the special case \(m=1\), we just use (2.8), (2.5), and (2.6) to see

$$\begin{aligned} \textrm{tr}\,\mathrm D_{\mu }(L\Lambda )&= \int _{L\Lambda } \mathrm d x_0 \, \mathrm D_{\mu }(x_0,x_0) = \int _{L\Lambda } \mathrm d x_0 \sum _{\ell =0}^{\nu -1} k_{\mu (\ell )}(0) p_\ell (x_0^\perp ,x_0^\perp ) \end{aligned}$$
(3.26)
$$\begin{aligned}&= \int _{L\Lambda } \mathrm d x_0 \sum _{\ell =0}^{\nu -1} \frac{\sqrt{\mu (\ell )} }{\pi }\frac{B}{2 \pi } = L^3 \frac{B}{2 \pi ^2} \sum _{\ell =0}^{\nu -1} \sqrt{\mu (\ell )} 1^1 |\Lambda |. \end{aligned}$$
(3.27)

As \({\textsf{I}}(1)={\textsf{I}}(id)=0\), this covers the case \(m=1\) and we may from now on assume \(m\ge 2\).

Our first aim is to understand the open sets \(\Lambda _{x^\perp }\); this is essentially a question about the geometry of the set \(\Lambda \). Due to Lemmas A.6 and A.8, for Lebesgue almost every \(x^\perp \in {{\mathbb {R}}}^2\), the set \(\Lambda _{x^\perp }\) is a finite union of disjoint intervals, \(\partial \left( \Lambda _{x^\perp } \right) =(\partial \Lambda )_{x^\perp }\), and \(\#(\partial (\Lambda _{x^\perp }))\) is twice the number of these intervals. Henceforth, we set \(\partial \Lambda _{x^\perp } :=\partial \left( \Lambda _{x^\perp }\right) \). The (improved) asymptotic expansion of the one-dimensional trace goes back to Landau and Widom [19] and is presented in Appendix C, see Corollary C.3. The coefficient \(\textsf{I}(m) = -1/(4\pi ^2)\,\sum _{r=1}^{m-1}r^{-1}\) is mentioned below (2.15).

For fixed \(\Lambda _{x^\perp }\), the error term \(\varepsilon ( \Lambda _{x^\perp }, L) \) remains bounded as \(L \rightarrow \infty \). However, we need to know whether this error term is integrable over \(x^\perp \); thus, its dependence on \(\Lambda _{x^\perp }\) is relevant.

To derive the \(o(L^2 \ln (L))\) error term, we subtract the volume term, divide by \(L^2 \ln (L)\) and use dominated convergence in order to exchange the limit \(L \rightarrow \infty \) with the integral over \(x^\perp \). Thus, instead of an estimate for the error term that is of a lower order in L than \(\ln (L)\), we only need an upper bound for the difference to the volume term, which is of order \(\ln (L)\). This upper bound is provided by Lemma 6.1. As any interval in \(L \Lambda _{x^\perp }\) has length at most CL, we arrive at

$$\begin{aligned} \Big |{\text {tr}}&\left( \mathbb {1}_{L \Lambda _{x^\perp } }\mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu (\ell )] \mathbb {1}_{L \Lambda _{x^\perp } } \right) ^m - \frac{\sqrt{ \mu (\ell )}}{\pi }L |\Lambda _{x^\perp } |\Big | \end{aligned}$$
(3.28)
$$\begin{aligned}&= \Big |{\text {tr}}\Big [ \left( \mathbb {1}_{L \Lambda _{x^\perp } }\mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu (\ell )] \mathbb {1}_{L \Lambda _{x^\perp } } \right) ^m - \mathbb {1}_{L \Lambda _{x^\perp } }\mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu (\ell )] \mathbb {1}_{L \Lambda _{x^\perp } } \Big ]\Big |\end{aligned}$$
(3.29)
$$\begin{aligned}&\le \left\Vert \left( \mathbb {1}_{L \Lambda _{x^\perp } }\mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu (\ell )] \mathbb {1}_{L \Lambda _{x^\perp } } \right) ^m - \mathbb {1}_{L \Lambda _{x^\perp } }\mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu (\ell )] \mathbb {1}_{L \Lambda _{x^\perp } } \right\Vert _1 \end{aligned}$$
(3.30)
$$\begin{aligned}&\le C \#( \partial \Lambda _{x^\perp }) \ln (L) , \end{aligned}$$
(3.31)

where the constant C depends on m, \(\mu (\ell )\) and \(\Lambda \), but not on \(x^\perp \). With this estimate, we apply dominated convergence to get

$$\begin{aligned}&\lim _{L \rightarrow \infty } \frac{1}{L^2 \ln (L)} \left( \textrm{tr}\,\textrm{D}_\mu (L\Lambda )^m - BL^3 |\Lambda |\sum _{\ell =0}^{\nu -1} \frac{\sqrt{\mu (\ell )}}{2 \pi ^2} \right) \end{aligned}$$
(3.32)
$$\begin{aligned}&= \sum _{\ell =0}^{\nu -1} \lim _{L \rightarrow \infty } \frac{1}{L^2 \ln (L)} \frac{B}{2\pi } L^2 \Big (\int _{{{\mathbb {R}}}^2} \textrm{d}x^\perp \,{\text {tr}} \left( \mathbb {1}_{L \Lambda _{x^\perp } }\mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu (\ell )] \mathbb {1}_{L \Lambda _{x^\perp } } \right) ^m \nonumber \\&\quad - \frac{\sqrt{\mu (\ell )}}{\pi }|L \Lambda _{x^\perp } |\Big ) \end{aligned}$$
(3.33)
$$\begin{aligned}&= \sum _{\ell =0}^{\nu -1} \frac{B}{2\pi } \int _{{\mathbb {R}}^2} \textrm{d}x^\perp \lim _{L \rightarrow \infty } \frac{1}{\ln (L)} \Big ( {\text {tr}} \big (\mathbb {1}_{L \Lambda _{x^\perp } }\mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu (\ell )] \mathbb {1}_{L \Lambda _{x^\perp } } \big )^m - \frac{\sqrt{\mu (\ell )}}{\pi }|L \Lambda _{x^\perp } |\Big ) \end{aligned}$$
(3.34)
$$\begin{aligned}&= \nu \frac{B}{2\pi } 2\,\textsf{I}(m) \int _{{\mathbb {R}}^2} \textrm{d}x^\perp \, \#( \partial \Lambda _{x^\perp } ) = \nu B \,\textsf{I}(m) \frac{1}{\pi }\int _{\partial \Lambda } \textrm{d}{\mathcal {H}}^2(v) \, |n(v)\cdot e_3 |. \end{aligned}$$
(3.35)

We moved the sum over \(\ell \) to the front, as every summand converges as \(L \rightarrow \infty \). In the second line we used that \(\int _{{{\mathbb {R}}}^2} \text{ d } x^\perp \,|\Lambda _{x^\perp }| = |\Lambda |\). Finally, we inserted (A.44) to obtain the expansion with error term \(o(L^2\ln (L))\) as claimed in the theorem.
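As a consistency check of the identity \(\int _{{{\mathbb {R}}}^2}\textrm{d}x^\perp \,\#( \partial \Lambda _{x^\perp } ) = \int _{\partial \Lambda }\textrm{d}{\mathcal {H}}^2(v)\,|n(v)\cdot e_3|\) used in the last step, consider the unit ball \(\Lambda = B_1(0)\subset {{\mathbb {R}}}^3\): each section \(\Lambda _{x^\perp }\) with \(\Vert x^\perp \Vert <1\) is a single interval, so the left-hand side equals \(2\pi \), while in spherical coordinates

$$\begin{aligned} \int _{\partial B_1(0)} \textrm{d}{\mathcal {H}}^2(v)\,|n(v)\cdot e_3| = 2\pi \int _0^{\pi }\textrm{d}\theta \,\sin (\theta )\,|\cos (\theta )| = 2\pi . \end{aligned}$$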

For the second part, we need to show that the error term for polynomials can be bounded by \(CL^2\), if \(\Lambda \) is a piecewise \(\textsf{C}^{1,\alpha }\) region for some \(0< \alpha <1\), as defined in Definition 2.1. This time, we use Corollary C.3 to deal with the trace of the one-dimensional operator. For that, we arrange the points of each \(\partial \Lambda _{x^\perp }:=(\partial \Lambda )_{x^\perp }=\{w_{x^\perp 1}, \dots , w_{x^\perp \#( \partial \Lambda _{x^\perp })}\}\subset {{\mathbb {R}}}\) in increasing order and write

$$\begin{aligned} \Big |{\text {tr}}&\left( \mathbb {1}_{L \Lambda _{x^\perp } }\mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu ] \mathbb {1}_{L \Lambda _{x^\perp } } \right) ^m - \frac{\sqrt{\mu }}{\pi } L |\Lambda _{x^\perp } |- 2 \,\textsf{I}(m) \#( \partial \Lambda _{x^\perp } ) \ln (1+ L ) \Big | \end{aligned}$$
(3.36)
$$\begin{aligned}&\le C \sum _{i=1}^{\#( \partial \Lambda _{x^\perp }) -1}\big (1+|\ln (|w_{x^\perp i}-w_{x^\perp i+1} |) |\big ) \end{aligned}$$
(3.37)
$$\begin{aligned}&\le C \sum _{i=1}^{\#( \partial \Lambda _{x^\perp }) } \Big (1+ |\ln \big ( \inf _{v \in \partial \Lambda _{x^\perp }{\setminus } w_{x^\perp i}} |w_{x^\perp i} -v|\big )|\Big ). \end{aligned}$$
(3.38)

In the last step, we used that \(\Lambda \) is bounded, so the distance between any two points of \(\partial \Lambda \) is bounded from above; hence only short distances \(|w_{x^\perp i} -v|\) can lead to an error term larger than the \(O(\#( \partial \Lambda _{x^\perp }))\)-term we have in front. A lower bound for the infimum is provided by Lemma A.1. This lower bound may vanish, in which case the logarithm is infinite. This just means that the integrand in the integral over \(x^\perp \) attains the value infinity at such points; the integral can still be finite, and we will show that it is.

As the terms of order \(L^3\) and \(L^2\ln (L)\) are handled exactly as in the previous case, we will only consider the error term. Hence, we need to estimate

$$\begin{aligned} \int _{{{\mathbb {R}}}^2} \textrm{d}x^\perp&\sum _{i=1}^{ \#( \partial \Lambda _{x^\perp })} \Big (1+ |\ln ( \inf _{v \in \partial \Lambda _{x^\perp }{\setminus } w_{x^\perp i}} |w_{x^\perp i} -v|)|\Big ) \end{aligned}$$
(3.39)
$$\begin{aligned}&\le C \int _{{{\mathbb {R}}}^2} \textrm{d}x^\perp \sum _{i=1}^{\#(\partial \Lambda _{x^\perp })} \Big (1+ \big |\ln \big ( \min \big \{ {\text {dist}}(w_{x^\perp i},\Gamma ),\, |n((x^\perp , w_{x^\perp i})) \cdot e_3 |^{\frac{1}{\alpha }}\big \}\big )\big |\Big ) \end{aligned}$$
(3.40)
$$\begin{aligned}&\le C \int _{{{\mathbb {R}}}^2} \textrm{d}x^\perp \sum _{w \in \{ x^\perp \} \times \partial \Lambda _{x^\perp }} \Big (1+ |\ln ({\text {dist}}(w,\Gamma ))|+ |\ln (|n(w) \cdot e_3 |) |\Big ) . \end{aligned}$$
(3.41)

In the first step, we applied Lemma A.1 with the vectors \(v_1 :=(x^\perp ,w_{x^\perp i})\) and \(v_2 :=(x^\perp , v)\) noting that \(\frac{v_1-v_2}{\Vert v_1 -v_2\Vert } = \pm e_3\). We now want to rewrite this integral as an integral over the boundary \(\partial \Lambda \). This is possible by Lemma A.7. Hence, we have (recall that \({\mathcal {H}}^2\) is the canonical surface measure on \(\partial \Lambda \)),

$$\begin{aligned} \int _{{{\mathbb {R}}}^2} \textrm{d}x^\perp&\sum _{i=1}^{\#( \partial \Lambda _{x^\perp })} \Big (1+ |\ln ( \inf _{v \in \partial \Lambda _{x^\perp }{\setminus } w_{x^\perp i}} |w_{x^\perp i} -v|)|\Big ) \end{aligned}$$
(3.42)
$$\begin{aligned}&\le C \int _{\partial \Lambda } \textrm{d}{\mathcal {H}}^2(w)\, \big [1+ |\ln ({\text {dist}}(w,\Gamma ))|+ |\ln (|n(w) \cdot e_3 |) |\big ]\, |n(w) \cdot e_3|\end{aligned}$$
(3.43)
$$\begin{aligned}&\le C +C \int _{\partial \Lambda } \textrm{d}{\mathcal {H}}^2(w)\,|\ln ({\text {dist}}(w,\Gamma ))|\le C . \end{aligned}$$
(3.44)

In the second step, we used that \(0 \le |n(w) \cdot e_3 |\le 1\) and that for \(0 \le t \le 1\), we have \( 0\le |t \ln (t) |\le 1/\textrm{e}\). The last step is a rather lengthy, not particularly insightful calculation, which can be found in Lemma A.9.

Once we put the factor \(L^2\) back in front of this, we arrive at the error term \(O(L^2)\) which completes the proof of the second part of this theorem. \(\square \)

4 Entanglement Entropy

Here is the main result of this paper.

Theorem 4.1

Suppose that \(\Lambda \subset {{\mathbb {R}}}^3\) is a piecewise Lipschitz region and let \(\mu > B\). Let \(\nu :=\lceil \frac{1}{2}(\mu /B -1)\rceil \) and let \(h:[0,1] \rightarrow {{\mathbb {R}}}\) be a continuous function, which is \(\beta \)-Hölder continuous at 0 and 1 for some \(1 \ge \beta >0\), and assume that \(h(0)=h(1)=0\). Then, we have the asymptotic expansion

$$\begin{aligned} \textrm{tr}\,h(\mathrm D_\mu (L\Lambda ) )= L^2 \ln (L) \nu B \frac{1}{\pi }\, \textsf{I}(h) \int _{\partial \Lambda }\textrm{d}{\mathcal {H}}^2(v)\,|n(v)\cdot e_3| + o(L^2\ln (L)) .\nonumber \\ \end{aligned}$$
(4.1)

In particular, as the \(\gamma \)-Rényi entropy function \(h_\gamma \) is \(\beta \)-Hölder continuous for any \(\beta < \min (\gamma ,1)\), the \(\gamma \)-Rényi entanglement entropy, \(\mathrm S_\gamma (L\Lambda )\), of the ground state at Fermi energy \(\mu \) localized to \(L\Lambda \), satisfies the asymptotic expansion

$$\begin{aligned} \mathrm S_\gamma (L\Lambda ) = L^2\ln (L)\nu B\,\frac{1+\gamma }{24\gamma \pi }\int _{\partial \Lambda }\textrm{d}\mathcal H^2(v)\,|n(v)\cdot e_3| + o(L^2\ln (L)) \end{aligned}$$
(4.2)

as \(L\rightarrow \infty \).

Remarks 4.2

  1. (1)

    Unlike in Theorem 2.3, we cannot improve the \(o(L^2\ln (L))\) error term in (4.2) if we assume stronger regularity conditions on the boundary of \(\Lambda \). This is a limitation of our method of proof, which relies on the Stone–Weierstrass approximation. Here, we lose control of the error term.

  2. (2)

    Let us compare (4.2) to the asymptotic expansion of the entanglement entropy of the ground state at Fermi energy \(\mu >0\) in the ideal Fermi gas, as introduced in Remark 2.4(ii). Here, the \(\gamma \)-Rényi entanglement entropy satisfies

    $$\begin{aligned} \mathrm S_\gamma (L\Lambda ) = L^2\ln (L) \frac{\mu (1+\gamma )}{48\gamma \pi }\mathcal H^{2}(\partial \Lambda ) + o(L^2\ln (L)) . \end{aligned}$$
  3. (3)

    To the best of our knowledge, the result (4.2) is new, even in the physics literature. The factor \(\nu B\) satisfies \(\nu B = \mu /2 +(\delta -\frac{1}{2})B\) for some \(\delta \in [0,1)\). We can bound the surface integral in (4.2) by \({\mathcal {H}}^{2}(\partial \Lambda )\) due to \(|n(v)\cdot e_3|\le 1\). Let us set \(\mu _{\text {ref}} :=2\nu B \). This corresponds to the same number \(\nu \) of Landau levels as the original \(\mu \), and seems to be a suitable reference value for comparing the Landau Hamiltonian with the free Hamiltonian. Thus the entanglement entropy associated to the Landau Hamiltonian is always smaller than the one associated to the free Hamiltonian at the reference value \(\mu _{\text {ref}}\).

We use certain estimates on traces. To this end, let us denote by \(s_n({T}), n\in {\mathbb {N}}\), the singular values of the compact operator T on a (separable) Hilbert space, arranged in decreasing order. The standard notation \({\mathfrak {S}}_p, 0<p<\infty \) is used for the class of operators with a finite Schatten–von Neumann quasi-norm:

$$\begin{aligned} \Vert {T}\Vert _p :=\bigg [\sum _{n=1}^\infty s_n({T})^p\bigg ]^{\frac{1}{p}}<\infty . \end{aligned}$$

If \(p\ge 1\), then \(\Vert \cdot \Vert _p\) defines a norm. For \(0< p < 1\) it is a quasi-norm that satisfies the p-triangle inequality

$$\begin{aligned} \Vert {T}_1+{T}_2\Vert _p^p\le \Vert {T}_1 \Vert _p^p + \Vert {T}_2\Vert _p^p . \end{aligned}$$
(4.3)

The class \({\mathfrak {S}}_1\) is the standard trace class. The class \({\mathfrak {S}}_2\) is the ideal of Hilbert–Schmidt operators. The p-Schatten quasi-norm estimate required for the proof of Theorem 4.1 is shown in Theorem 5.5.

Proof of Theorem 4.1

The proof follows the same line of argument as in [20, 21]. We recall that \(\textsf{I}(h_\gamma ) = (1+\gamma )/(24\gamma )\), and thus we are left to show the claim for the function h. Let \(r = \beta /2\) and \(\varepsilon >0\). We choose a smooth cutoff function \(\zeta _\varepsilon \) such that \(0\le \zeta _\varepsilon \le 1\), \(\zeta _\varepsilon \) vanishes on \([\varepsilon ,1-\varepsilon ]\), and \(\zeta _\varepsilon \) equals 1 on \([0,\varepsilon /2] \cup [1-\varepsilon /2,1]\). As h is continuous and \(\beta \)-Hölder continuous at 0 and 1, there is a constant C such that

$$\begin{aligned} |h(t)| \le Ct^\beta (1-t)^\beta , \quad t \in [0,1] . \end{aligned}$$
(4.4)

This implies

$$\begin{aligned} |(\zeta _\varepsilon h)(t)| \le C \varepsilon ^r t^r(1-t)^r ,\quad t \in [0,1] . \end{aligned}$$
(4.5)
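To spell out this step: on the support of \(\zeta _\varepsilon \) we have \(t\le \varepsilon \) or \(t \ge 1-\varepsilon \), and with \(r=\beta /2\), using \(t^r\le \varepsilon ^r\) and \((1-t)^\beta \le (1-t)^r\) (as \(0\le 1-t\le 1\) and \(\beta \ge r\)),

$$\begin{aligned} t^\beta (1-t)^\beta = t^{r}\, t^{r}\, (1-t)^\beta \le \varepsilon ^r\, t^r (1-t)^r ,\quad 0\le t\le \varepsilon ; \end{aligned}$$

the case \(t\ge 1-\varepsilon \) is symmetric, and together with (4.4) and \(0\le \zeta _\varepsilon \le 1\) this gives (4.5).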

As the function \(t \mapsto \frac{(1-\zeta _\varepsilon (t) )h(t) }{t(1-t)}\) is continuous on [0,1] (the numerator vanishes identically near 0 and 1 by the choice of \(\zeta _\varepsilon \)), we can infer from the Stone–Weierstrass approximation theorem that there is a polynomial p and a function \(\delta _\varepsilon :[0,1] \rightarrow {{\mathbb {R}}}\) with \(\Vert \delta _\varepsilon \Vert _{{\textsf{L}}^\infty ([0,1])} \le \varepsilon ^r \) and

$$\begin{aligned} \frac{(1-\zeta _\varepsilon (t)) h(t) }{t(1-t)} = p(t) + \delta _\varepsilon (t) , \quad t \in [0,1] . \end{aligned}$$
(4.6)

Thus, we have

$$\begin{aligned} h(t) = p(t) t(1-t) + \delta _\varepsilon (t) t(1-t) + \zeta _\varepsilon (t) h(t) =:p(t)t(1-t) + \phi _\varepsilon (t)\, . \end{aligned}$$
(4.7)

As \(t(1-t) \le t^r (1-t)^r\), we observe

$$\begin{aligned} |\phi _\varepsilon (t) |\le C\varepsilon ^r t^r(1-t)^r , \quad t \in [0,1] . \end{aligned}$$
(4.8)

Thus, using Theorem 5.5, (2.7) and (4.3), we arrive at

(4.9)
(4.10)
(4.11)
(4.12)
(4.13)

In (4.10), we used that \(\mathrm D_\mu \) is a projection. Let \(q(t) :=p(t)t(1-t)\). Now, by linearity of \(\textsf{I}\) and the estimate (4.8), we have

$$\begin{aligned} |\textsf{I}(h) - \textsf{I}(q) |= |\textsf{I} (\phi _\varepsilon ) |\le C \varepsilon ^r \,\textsf{I} (t \mapsto t^r(1-t)^r) \le C \varepsilon ^r . \end{aligned}$$
(4.14)

Theorem 2.3(i) applied for the polynomial q with \(q(0)=q(1)=0\) yields

$$\begin{aligned} \textrm{tr}\,q(\mathrm D_\mu (L\Lambda )) = L^2 \ln (L) B \nu \,\textsf{I} (q) \frac{1}{\pi }\int _{\partial \Lambda }\textrm{d}\mathcal H^2(v)\,|n(v)\cdot e_3| + o(L^2\ln (L)) . \end{aligned}$$
(4.15)

Now, combining (4.13), (4.14) and (4.15), we arrive at

$$\begin{aligned} \limsup _{L\rightarrow \infty } \left| \frac{\textrm{tr}\,h(\textrm{D}_\mu (L\Lambda ))}{L^2\ln (L)} - \nu B \,\textsf{I}(h)\,\frac{1}{\pi } \int _{\partial \Lambda } \textrm{d} {\mathcal {H}}^2(v) \,|n(v)\cdot e_3| \right| \le C \varepsilon ^r . \end{aligned}$$
(4.16)

As \(\varepsilon >0\) is arbitrary, we have proved the claim. \(\square \)

5 Schatten–von Neumann Quasi-Norm Estimates

By a box in \({\mathbb {R}}^d\), we mean a Cartesian product of d intervals. These intervals do not have to be bounded. We will denote subsets of \({\mathbb {R}}\) by I, of \({\mathbb {R}}^2\) by \(\Upsilon \), and of \({\mathbb {R}}^3\) by \(\Lambda \). We will combine known estimates for the two-dimensional magnetic Hamiltonian from [21] and for the one-dimensional Hamiltonian [20, 34] without a magnetic field.

Let \(\Upsilon ,\Upsilon '\subset {\mathbb {R}}^2\) be Lipschitz regions and let \(I,I' \subset {\mathbb {R}}\) be finite unions of closed intervals. Then, we have

$$\begin{aligned} \mathbb {1}_{\Upsilon \times I}\big ( \textrm{P}_{ \ell } \otimes \mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu ]\big ) \mathbb {1}_{\Upsilon '\times I'} = \big (\mathbb {1}_{\Upsilon } \textrm{P}_{ \ell } \mathbb {1}_{\Upsilon '} \big ) \otimes \big (\mathbb {1}_{I} \mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu ] \mathbb {1}_{I'} \big ) . \end{aligned}$$
(5.1)

As the singular values of the tensor product of two operators are given by all possible products of pairs of the individual singular values, we have for any \(0<p \le \infty :\)

$$\begin{aligned} \left\Vert \mathbb {1}_{\Upsilon \times I}\left( \textrm{P}_{ \ell } \otimes \mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu ]\right) \mathbb {1}_{\Upsilon '\times I'}\right\Vert _p = \left\Vert \mathbb {1}_{\Upsilon } \textrm{P}_{ \ell } \mathbb {1}_{\Upsilon '} \right\Vert _p \left\Vert \mathbb {1}_{I} \mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu ] \mathbb {1}_{I'} \right\Vert _p . \end{aligned}$$
(5.2)
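Explicitly, if \((s_j(S))_j\) and \((s_k(T))_k\) denote the singular values of two compact operators S and T, then the singular values of \(S\otimes T\) are the products \((s_j(S)s_k(T))_{j,k}\), so that for finite p

$$\begin{aligned} \Vert S\otimes T \Vert _p^p = \sum _{j,k} s_j(S)^p\, s_k(T)^p = \Vert S \Vert _p^p\, \Vert T \Vert _p^p , \end{aligned}$$

and for \(p=\infty \) the same factorization holds for the operator norms.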

The following general properties will be useful:

Lemma 5.1

For any self-adjoint bounded operators \(S,T:\textsf{L}^2({\mathbb {R}}^d) \rightarrow \textsf{L}^2({\mathbb {R}}^d)\), any measurable sets \(\Omega _1,\Omega _2\), \(\Omega _1',\Omega _2' \subset {\mathbb {R}}^d\) and any \(0<p \le 1\), we have

  1. Symmetry

    \( \left\Vert \mathbb {1}_{\Omega _1} T \mathbb {1}_{\Omega _2} \right\Vert _p = \left\Vert \mathbb {1}_{\Omega _2} T \mathbb {1}_{\Omega _1} \right\Vert _p\),

  2. Monotonicity I

    \( \left\Vert \mathbb {1}_{\Omega _1} T \mathbb {1}_{\Omega _2} \right\Vert _p \le \left\Vert \mathbb {1}_{\Omega _1 \cup \Omega _1'} T \mathbb {1}_{\Omega _2\cup \Omega _2'} \right\Vert _p\),

  3. Monotonicity II

    If \(0 \le S \le T\), then \(\Vert S \Vert _p \le \Vert T \Vert _p\),

  4. Subadditivity

    \(\left\Vert \mathbb {1}_{\Omega _1 \cup \Omega _1'} T \mathbb {1}_{\Omega _2} \right\Vert _p^p \le \left\Vert \mathbb {1}_{\Omega _1} T \mathbb {1}_{\Omega _2} \right\Vert _p^p + \left\Vert \mathbb {1}_{\Omega _1'} T \mathbb {1}_{\Omega _2} \right\Vert _p^p\).

A proof of these properties can be found, for example, in [29].

We assume now that the magnetic-field strength has been “scaled out” so that \(B=1\) for the remainder of this section. The effective scale in the planar coordinates is then \(L\sqrt{B}\), and in the longitudinal coordinate it is \(L\sqrt{\mu }\).

Next, we collect some more specific (quasi-)norm estimates for both the one dimensional free Hamiltonian and the constant magnetic field Hamiltonian in two dimensions.

Proposition 5.2

Let \(0< p \le 1, \ell \in {\mathbb {N}}_0\) and let \(\mu > 0\). Then, there is a constant C such that for any \(x \in {\mathbb {R}}^2, t \in {\mathbb {R}}, h\ge 2, \delta \ge 1 \), any measurable set \(\Upsilon \subset {{\mathbb {R}}}^2 \) such that \([-\delta ,1+\delta ]^2+x \subset \Upsilon \) and any measurable set \(I\subset {{\mathbb {R}}}\) such that \([t,t+h] \subset I\), we have the estimates

(5.3)
(5.4)
(5.5)
(5.6)

Proof

The first two inequalities follow by [21, Lemma 12], monotonicity I in Lemma 5.1, and the unitary translation invariance of \(\textrm{P}_\ell \). The 8 in the denominator was increased to 18 in (5.4) as we switched from circles to squares (see footnote 2). To prove the last inequality, we first use monotonicity I and the translation invariance, then the standard unitary equivalence, see, for example, [19, (7–10)], and finally [34, Corollary 4.7]. Thus,

(5.7)
(5.8)
(5.9)

For the third inequality, we will reduce to the case \(h=2, t=0\) by subadditivity, monotonicity I and translation invariance. Let \(m :=\lceil h/2 \rceil \in {\mathbb {N}}\) be the smallest integer greater than or equal to h/2. Thus, as \(h\ge 2\), we have \(m \le h\). We observe that

$$\begin{aligned} \left\Vert \mathbb {1}_{[t,t+h]} \mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu ] \right\Vert _p^p&\le \sum _{k=0}^{m-1} \left\Vert \mathbb {1}_{[t+2k,t+2k+2]} \mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu ] \right\Vert _p^p \end{aligned}$$
(5.10)
$$\begin{aligned}&= m \left\Vert \mathbb {1}_{[0,2]} \mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu ] \right\Vert _p^p \end{aligned}$$
(5.11)
$$\begin{aligned}&\le 2h \left\Vert \mathbb {1}_{[0,2]} \mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu ] \right\Vert _p^p . \end{aligned}$$
(5.12)

Using subadditivity once more we now estimate

(5.13)
(5.14)
(5.15)
(5.16)

The second summand was bounded by (5.6), and the last quasi-norm identity follows from the singular-value identity \(s_n(A)^2=s_n(A^*A)\). Define \(Q :=\mathbb {1}_{[0,2]} \mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu ]\). Our claim is that \(Q \in {\mathfrak {S}}_p\) for all \(0<p \le 1\). The last estimate shows that \(Q \in {\mathfrak {S}}_p\) whenever \(Q \in {\mathfrak {S}}_{2p}\) with \(p \le 1\). We now observe

$$\begin{aligned} \Vert Q \Vert _2^2= \int _0^2 \textrm{d}s \int _{{{\mathbb {R}}}} \textrm{d} t \, k_{\mu }^2(s-t) = \frac{2\sqrt{\mu }}{\pi } . \end{aligned}$$
(5.17)

Thus, we have \(Q \in {\mathfrak {S}}_2\) and hence \(Q \in \mathfrak S_{2^{1-n}}\) for any \(n \in {\mathbb {N}}\). Lastly, as \({\mathfrak {S}}_p \subset {\mathfrak {S}}_q\) for \(p<q\), we arrive at \(Q \in {\mathfrak {S}}_p\) for any \(0<p \le \infty \), which finishes the proof. \(\square \)
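The value \(2\sqrt{\mu }/\pi \) in (5.17) can also be checked numerically. The following sketch assumes that \(k_\mu \) is the standard sine kernel \(k_\mu (x)=\sin (\sqrt{\mu }x)/(\pi x)\) of the one-dimensional spectral projection (we do not reproduce (2.6) here, so this identification is an assumption of the illustration) and approximates the integral by a Riemann sum on a large window.

```python
import numpy as np

mu = 2.0
sqmu = np.sqrt(mu)

# standard sine kernel of the 1D projection 1[(-i d/dx)^2 <= mu];
# assumed to coincide with k_mu from (2.6)
x = np.linspace(-500, 500, 2_000_001)
k = (sqmu / np.pi) * np.sinc(sqmu * x / np.pi)   # = sin(sqrt(mu) x) / (pi x)

# the double integral in (5.17) depends only on s - t, so it collapses to
# 2 * int_R k(x)^2 dx, which we approximate on [-500, 500]
approx = 2 * np.sum(k**2) * (x[1] - x[0])
print(approx, 2 * sqmu / np.pi)                  # both are close to 0.900 for mu = 2
```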

After all these preparations we finally state the crucial local estimates that are needed in the proof of Theorem 5.5.

Lemma 5.3

Let \(0< p \le 1\). Then, there is a constant C such that for any \(x \in {\mathbb {R}}^2, t \in {{\mathbb {R}}}, h\ge 2\), \(\delta \ge 1\), any measurable \(\Upsilon \subset {{\mathbb {R}}}^2\) such that \([-\delta ,1+\delta ]^2+x \subset \Upsilon \) and any interval \(I\subset {{\mathbb {R}}}\) such that \([t,t+h] \subset I \), we have the estimates

(5.18)
(5.19)

Proof

For the first inequality, we use (5.2) and Proposition 5.2. For the second inequality, we first observe

(5.20)

and then we use the p-triangle inequality, (5.2) and Proposition 5.2. \(\square \)

We now fix a region \(\Lambda \subset {\mathbb {R}}^3\) and define the signed distance function

$$\begin{aligned} {\text {d}}_\Lambda (x) :={\left\{ \begin{array}{ll} +{\text {dist}} (x, \partial \Lambda ) &{}\text { for } x \not \in \Lambda \\ -{\text {dist}} (x, \partial \Lambda ) &{}\text { for } x \in \Lambda \end{array}\right. } , \end{aligned}$$
(5.21)

where \({\text {dist}}\) is the Euclidean distance. The signed distance function is Lipschitz-continuous with Lipschitz constant 1.
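For concreteness, here is a minimal sketch of the signed distance function (5.21) in the simple case where \(\Lambda \) is a Euclidean ball; the ball is chosen only for this illustration, since its distance to the boundary has a closed form.

```python
import numpy as np

def signed_distance_ball(x, center, radius):
    """Signed distance (5.21) to the boundary of a ball: negative inside, positive outside."""
    x = np.asarray(x, dtype=float)
    c = np.asarray(center, dtype=float)
    dist_to_boundary = abs(np.linalg.norm(x - c) - radius)
    return -dist_to_boundary if np.linalg.norm(x - c) < radius else dist_to_boundary

c, r = (0.0, 0.0, 0.0), 1.0
print(signed_distance_ball((0.5, 0.0, 0.0), c, r))   # -0.5 (inside)
print(signed_distance_ball((2.0, 0.0, 0.0), c, r))   #  1.0 (outside)

# Lipschitz constant 1: |d(x) - d(y)| <= |x - y| for random pairs of points
rng = np.random.default_rng(2)
for _ in range(5):
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    assert abs(signed_distance_ball(x, c, r) - signed_distance_ball(y, c, r)) \
        <= np.linalg.norm(x - y) + 1e-12
```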

In order to utilize Lemma 5.3, we need to cover most of \(L \Lambda \) with many very long boxes (of dimensions \(1 \times 1 \times O(L)\)). This boils down to choosing appropriate intervals that cover most of \(\Lambda _x\) (as defined in Definition 3.1), for any \(x \in {{\mathbb {R}}}^2\). Let \(G(x,\varepsilon )\) be the number of these intervals. The following lemma explicitly constructs such intervals and lists the properties of \(G(x,\varepsilon )\) and of the intervals that we need for our estimates. The basic idea is to collect the connected components of \(\Lambda _x\) that reach sufficiently deep into \(\Lambda \).

Lemma 5.4

For any \(x \in {{\mathbb {R}}}^2\) and \(\varepsilon >0\), there is a finite (possibly empty) set of intervals \(A(x,\varepsilon )=\{ I_{1,x,\varepsilon }, \dots , I_{G(x, \varepsilon ),x,\varepsilon } \}\), satisfying the following conditions:

  1. (1)

    We have \({\text {d}}_\Lambda (\{x\}\times I_{k,x,\varepsilon }) \subset (-\infty , -\varepsilon )\) and \({\text {dist}}(\{x\}\times I_{k,x,\varepsilon }, \partial \Lambda )= \varepsilon \).

  2. (2)

    For any \(\lambda \in \Lambda _x\), there exists a j with \(1 \le j \le G(x,\varepsilon )\) such that \(\lambda \in I_{j,x,\varepsilon }\), or else \({\text {d}}_\Lambda ((x,\lambda ))>-2\varepsilon \).

  3. (3)

    We have \(G(x, \varepsilon )= \# A(x,\varepsilon ) \le {\mathcal {H}}^1\left( {\text {d}}_\Lambda ^{-1}((-2 \varepsilon , -\varepsilon )) \cap (\{x \} \times {{\mathbb {R}}})\right) /\varepsilon \).

The signed distance function \({\text {d}}_\Lambda \), dependent on the piecewise Lipschitz region \(\Lambda \), is defined in (5.21).

We regard the lemma and its proof as the definitions of \(A(x,\varepsilon ),I_{j,x,\varepsilon }\) and \(G(x,\varepsilon )\).

Proof

We consider the set \(A_0(x,\varepsilon )\) of all connected components of \(\left( {\text {d}}_\Lambda ^{-1}((-\infty ,-\varepsilon )) \right) _x \subset {{\mathbb {R}}}\) (with the convention that the empty set has no connected components). The set \(A(x,\varepsilon )\) is defined as the set of all \(I \in A_0(x,\varepsilon )\), such that there is a \(\lambda \in I\) with \({\text {d}}_\Lambda ((x,\lambda )) \le -2 \varepsilon \). The first point is already satisfied for all \( I \in A_0(x,\varepsilon )\) and thus holds for all I in the smaller set \(A(x,\varepsilon )\). For the second claim, we observe that if \(\lambda \in \Lambda _x\) with \({\text {d}}_\Lambda ((x,\lambda )) \le -2 \varepsilon \), then \(\lambda \in \left( {\text {d}}_\Lambda ^{-1}((-\infty ,-\varepsilon )) \right) _x \) and thus there is an \(I \in A_0(x,\varepsilon )\) with \(\lambda \in I\). By definition of \(A(x,\varepsilon )\), this ensures \(I \in A(x,\varepsilon )\).

For the third claim, if \(A(x, \varepsilon ) \ne \emptyset \), let \(I =(\lambda _1, \lambda _4) \in A(x, \varepsilon )\) and define \(\lambda _2 :=\inf \{ \lambda \in I :{\text {d}}_\Lambda ( (x, \lambda )) \le -2 \varepsilon \}, \lambda _3 :=\sup \{ \lambda \in I :{\text {d}}_\Lambda ( (x, \lambda )) \le -2 \varepsilon \}\). Thus, \(\lambda _1< \lambda _2\le \lambda _3<\lambda _4\),

$$\begin{aligned} {\text {d}}_\Lambda (\{x \} \times (\lambda _1,\lambda _2)) = {\text {d}}_\Lambda (\{x \} \times (\lambda _3,\lambda _4)) =(-2\varepsilon , -\varepsilon ) , \end{aligned}$$
(5.22)

and, as \({\text {d}}_\Lambda \) has Lipschitz constant 1, this means that

$$\begin{aligned} {\mathcal {H}}^1( ( \{x \} \times I) \cap {\text {d}}_\Lambda ^{-1}((-2 \varepsilon ,-\varepsilon )) ) \ge {\mathcal {H}}^1( \{x \} \times ( (\lambda _1,\lambda _2) \cup (\lambda _3,\lambda _4) ) ) \ge 2 \varepsilon . \end{aligned}$$
(5.23)

As \((\lambda _1, \lambda _2) \subset I\), \((\lambda _3, \lambda _4) \subset I\) and different elements of \(A(x,\varepsilon )\) are disjoint (as they are connected components), we can sum the inequality over all elements of \(A(x, \varepsilon )\) and arrive at

$$\begin{aligned} {\mathcal {H}}^1( (\{x \} \times {{\mathbb {R}}})\cap {\text {d}}_\Lambda ^{-1} ((-2 \varepsilon , -\varepsilon ))) \ge \sum _{I \in A(x, \varepsilon )} {\mathcal {H}}^1( (\{x \} \times I) \cap {\text {d}}_\Lambda ^{-1}((-2 \varepsilon ,-\varepsilon )) ) \ge 2G(x, \varepsilon ) \varepsilon , \end{aligned}$$
(5.24)

which implies the last claim (with an additional factor 1/2). \(\square \)
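The construction in the proof translates directly into a simple routine: given the values of \({\text {d}}_\Lambda \) along the line \(\{x\}\times {{\mathbb {R}}}\) on a fine grid, collect the maximal subintervals on which \({\text {d}}_\Lambda <-\varepsilon \) and keep only those that reach depth \(-2\varepsilon \). The following sketch is purely illustrative (it works on a discretized line with a toy distance profile) and is not used anywhere in the proofs.

```python
import numpy as np

def intervals_A(lam, d_vals, eps):
    """Discrete analogue of A(x, eps): maximal runs with d < -eps that reach depth <= -2*eps.

    lam    -- grid points on the line {x} x R
    d_vals -- values of the signed distance d_Lambda((x, lam)) on the grid
    eps    -- the parameter epsilon > 0
    """
    below = d_vals < -eps
    intervals, start = [], None
    for i, b in enumerate(below):
        if b and start is None:
            start = i
        if (not b or i == len(below) - 1) and start is not None:
            end = i if not b else i + 1
            if np.min(d_vals[start:end]) <= -2 * eps:     # keep only the deep components
                intervals.append((lam[start], lam[end - 1]))
            start = None
    return intervals

# toy profile with two wells; for eps = 0.3 only the deeper one survives
lam = np.linspace(-2, 2, 4001)
d = np.minimum(0.2 * np.abs(lam + 1) - 0.4, np.abs(lam - 1) - 1.0)
print(intervals_A(lam, d, 0.3))   # roughly [(0.3, 1.7)]
```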

Theorem 5.5

Let \(\Lambda \) be a piecewise Lipschitz region (see Definition 2.1) and let \(0<p \le 1, \ell \in {\mathbb {N}}, \mu \in {{\mathbb {R}}}^+\). Then, there are constants \(L_0=L_0(\Lambda ,p,\ell , \mu ) > 3\) and \(C=C(\Lambda ,p,\ell , \mu )\) such that for all \(L>L_0\)

(5.25)

Proof

We want to cover most of \(L\Lambda \) with translates of boxes \([0,1]^2\times [0,h]\), where h grows like L, and we will use Lemma 5.3 on these (see footnote 3). We set \(\delta :=6 p^{-1/2} \sqrt{ \ln (L)}\). Let \(L_0\) be large enough to ensure that \(\delta \ge 1 \) for \(L>L_0\). These boxes need to keep a distance of at least \(\delta \) from the boundary. Set \(\varepsilon :=\frac{ 2(\delta +1)}{L}\). We also define the shorthand

$$\begin{aligned} \textrm{P} :=\textrm{P}_{ \ell } \otimes \mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu ] . \end{aligned}$$
(5.26)

Let \(h_0\) be the length of the longest line segment contained in \(\Lambda \).

Consider any \(x \in {\mathbb {R}}^2\) with \(G(x, \varepsilon )\ge 1\), as defined in Lemma 5.4. For \(k \in \{1, \dots , G(x,\varepsilon )\}\), we define the boxes

$$\begin{aligned} Q_{x,k}&:=([0,1]^2+Lx) \times (LI_{k,x,\varepsilon })&\subset L\Lambda , \end{aligned}$$
(5.27)
$$\begin{aligned} Q'_{x,k}&:=([-\delta ,1+\delta ]^2+Lx ) \times (LI_{k,x,\varepsilon } )&\subset L\Lambda . \end{aligned}$$
(5.28)

These inclusions hold because \(\sqrt{2}< \sqrt{2} (\delta +1) <L \varepsilon =L {\text {dist}} ( \{ x \} \times I_{k,x,\varepsilon } , \partial \Lambda ) = {\text {dist}} ( \{ L x \} \times (L I_{k,x,\varepsilon }) , L \partial \Lambda )\).

We assume \(L>h_0\), \(Lh_0>1\) and \(L>2\). Now we have by monotonicity I in Lemma 5.1 and by Lemma 5.3

(5.29)
(5.30)
(5.31)

The constant C depends only on p and \(h_0\).

Now we consider some offset parameter \(s \in [0,1)^2\), and we define

$$\begin{aligned} \Lambda _{\varepsilon ,s} :=\Lambda {\setminus } \bigcup _{z \in {\mathbb {Z}}^2} \bigcup _{k =1}^{G\left( \frac{ z+s}{L} ,\varepsilon \right) } \frac{1}{L}Q_{\frac{z+s}{L} ,k} \subset {\text {d}}_\Lambda ^{-1}\left( \left( - 3\varepsilon ,0 \right) \right) . \end{aligned}$$
(5.32)

The inclusion is based on the fact that for each \(y \in {{\mathbb {R}}}^3\), there is a \(\frac{z+s}{L} \in {\mathbb {R}}^2\) with \(z\in {\mathbb {Z}}^2\) such that \(y \in \left( \frac{z+s}{L} + \frac{1}{L}[0,1]^2 \right) \times {\mathbb {R}}\). If \({\text {d}}_\Lambda (y) \le -3 \varepsilon \), then the point \(\left( \frac{z+s}{L} , y_3 \right) \) is at most \(\frac{\sqrt{2}}{L} < \varepsilon \) away from y and hence at least \(2\varepsilon \) away from the boundary. Therefore, there is a k such that \(y \in \frac{1}{L}Q_{\frac{z+s}{L},k}\).

We further define

$$\begin{aligned} Z_{\varepsilon } :=\left\{ u \in {\mathbb {Z}}^3 :\left( u+[0,1]^3\right) \cap L{\text {d}}_\Lambda ^{-1}\left( \left( - 3\varepsilon ,0 \right) \right) \not = \emptyset \right\} , \end{aligned}$$
(5.33)

so that \(L{\text {d}}_\Lambda ^{-1}\left( \left( - 3\varepsilon ,0 \right) \right) \subset \bigcup _{{u} \in Z_\varepsilon } \left( {u}+ [0,1]^3\right) \). As \(\varepsilon >\frac{\sqrt{3}}{L}\), the length of the diagonal in a cube \(\frac{1}{L} [0,1]^3\), we have (second inclusion)

$$\begin{aligned} L{\text {d}}_\Lambda ^{-1}\left( \left( - 3\varepsilon ,0 \right) \right) \subset \bigcup _{u \in Z_\varepsilon } \left( {u}+ [0,1]^3\right) \subset L{\text {d}}_\Lambda ^{-1}\left( \left( - 4\varepsilon , \varepsilon \right) \right) . \end{aligned}$$
(5.34)

Hence, the volume of the middle term, which is the cardinality, \(\# Z_\varepsilon \), of \(Z_\varepsilon \), can be bounded by the volume of the right-hand side. For \(\varepsilon <1\), using Lemma A.3, this is bounded by \(L^3C(\Lambda ) \varepsilon \). Since \(\varepsilon = 2(\delta +1)/L\) with \(\delta = 6p^{-1/2}\sqrt{\ln (L)}\), we have \(L^3\varepsilon = 2L^2(\delta +1) \le C(p) L^2\sqrt{\ln (L)}\). Hence for \(L > L_0\):

$$\begin{aligned} \# Z_\varepsilon \le L^3 C(\Lambda ) \varepsilon \le C(\Lambda ,p) L^2 \sqrt{ \ln (L)} . \end{aligned}$$
(5.35)

Using the monotonicity I and subadditivity properties in Lemma 5.1 and the covering (5.32), we can finally estimate,

(5.36)

The summands in the first sum can be bounded by \(C\ln (L)\) using (5.31). The second term will be bounded using monotonicity I, (5.32) and (5.34) in the first step and using monotonicity II, subadditivity, (5.18) and (5.35) in the second step. Hence, we have

(5.37)
(5.38)

For any fixed \(L>L_0\) and \(s \in [0,1)^2\), this bound is finite. Hence, we can integrate it over \(s \in [0,1)^2\) and obtain another upper bound. As the volume of \([0,1)^2\) is 1, the left-hand side and the last term do not change, since in both cases we integrate a constant over a set of measure one.

(5.39)

Now we can use Fubini's theorem, identifying \({\mathbb {Z}}^2 \times [0,1)^2\) with \({\mathbb {R}}^2\) via \((z,s)\mapsto z+s\). Hence we have

(5.40)
$$\begin{aligned}&= C \ln (L) L^2 \int _{{\mathbb {R}}^2} G(x,\varepsilon ) \,\textrm{d}x +C(\Lambda ,p) L^2 \sqrt{ \ln (L)} \end{aligned}$$
(5.41)
$$\begin{aligned}&\le C \ln (L) L^2 \int _{{{\mathbb {R}}}^2} \left|\left( {\text {d}}_\Lambda ^{-1}((-2\varepsilon , -\varepsilon ) ) \right) _x \right|/ \varepsilon \,\textrm{d}x+ C(\Lambda ,p) L^2 \sqrt{ \ln (L)} \end{aligned}$$
(5.42)
$$\begin{aligned}&= C \ln (L) L^2 |{\text {d}}_\Lambda ^{-1}((-2\varepsilon , -\varepsilon ) ) |/\varepsilon + C(\Lambda ,p) L^2 \sqrt{ \ln (L)} \le C( \Lambda , p) L^2 \ln (L) . \end{aligned}$$
(5.43)

In the first step, we performed a change of variables, in the third step we used Lemma 5.4, in the penultimate step Fubini's theorem, and in the final step we applied (A.9). \(\square \)

6 The Error Term can be Large and not Better than \(o(L^2\ln (L))\)

Without loss of generality, we assume throughout this section that \(\nu =1\) and \(B=1\) because the precise values are not relevant now. The non-asymptotic bound in the following lemma is simple and useful in the proof of the main theorem in this section.

Lemma 6.1

Let \(\Omega \subset {\mathbb {R}}\) be a finite union of intervals of finite lengths \(\ell _1, \dots , \ell _n\) with disjoint closures. Let \(m \in {\mathbb {N}}\) with \(m \ge 2\), \(\mu >0\), and let \(\Delta = \textrm{d}^2/\textrm{d}x^2\) be the one-dimensional Laplacian. Then, we have the estimate

$$\begin{aligned} \Vert (\mathbb {1}_\Omega \mathbb {1}(- \Delta \le \mu )\mathbb {1}_\Omega )^m - \mathbb {1}_\Omega \mathbb {1}(- \Delta \le \mu )\mathbb {1}_\Omega \Vert _1 \le \frac{m-1}{\pi ^2} \sum _{j=1} ^n \ln ( 1+ \sqrt{\mu }\ell _j) + Cm n , \end{aligned}$$
(6.1)

where C is a constant independent of all other parameters.

For \(m=2\), this estimate is sharp in the sense that the prefactor \(1/\pi ^2\) equals the coefficient of the leading asymptotic behavior of \(\textrm{tr}\,\big ((\mathbb {1}_{L\Omega } \mathbb {1}(- \Delta \le \mu )\mathbb {1}_{L\Omega })^2-\mathbb {1}_{L\Omega } \mathbb {1}(- \Delta \le \mu )\mathbb {1}_{L\Omega }\big )\) for large L.

Proof

By scaling we can assume \(\mu =1\) since \(\mathbb {1}_{\Omega } \mathbb {1}(- \Delta \le \mu )\) is unitarily equivalent to \(\mathbb {1}_{\sqrt{\mu }\Omega } \mathbb {1}(- \Delta \le 1)\). In other words, we may set \(\mu =1\) and, at the end, replace the lengths \(\ell _j\) by \(\sqrt{\mu }\ell _j\).

Then, we use the geometric-sum identity \(a^m-a=a(a-1) (a^{m-2} + \cdots + a+1)\) with \(a :=\mathbb {1}_\Omega \mathbb {1}(- \Delta \le 1)\mathbb {1}_\Omega \). As a has operator norm at most 1, we can estimate

(6.2)

with the function \(k=k_1\) defined in (2.6).

For a fixed \(x \in \Omega \), we now enlarge the domain of integration in y by allowing \(y \in \Omega \), as long as x and y are in different intervals in \(\Omega \). In a formula, with \(\pi _0(\Omega )\) denoting the connected components (subintervals) of \(\Omega \), the new domain of integration in (6.2) is

$$\begin{aligned} \bigcup _{ I \in \pi _0(\Omega )} \big \{(x,y) :x \in I, y \not \in I \big \} . \end{aligned}$$
(6.3)

As the integrand only depends on \(x-y\), we may translate I to be of the form \((0,\ell _j)\). Hence, with \(n:=\#\pi _0(\Omega )\) the number of connected components of \(\Omega \), we have

(6.4)
(6.5)
(6.6)
(6.7)
(6.8)

The last step relies on an improved result of Landau and Widom with \(L=1\), see Corollary C.3. \(\square \)

In Theorem 2.3, we obtained for a general Lipschitz region \(\Lambda \subset {{\mathbb {R}}}^3\) only an error term of order \(o(L^2 \ln (L))\), and not of the order \(L^2\). Specifically, using \(\textsf{I}(2) = -1/(4\pi ^2)\), we have the asymptotic expansion

$$\begin{aligned} \textrm{tr}\,\big (\textrm{D}_\mu (L\Lambda )-\textrm{D}_\mu (L\Lambda )^2\big )&= \frac{L^2 \ln (L)}{4 \pi ^3} \int _{\partial \Lambda } \textrm{d} \mathcal H^2( v)\, |n(v) \cdot e_3 |+o(L^2\ln (L)) . \end{aligned}$$
(6.9)

This allows us to define the error term \(\varepsilon (L,\Lambda )\) by the identity

$$\begin{aligned} \textrm{tr}\,\big (\textrm{D}_\mu (L\Lambda )-\textrm{D}_\mu (L\Lambda )^2\big )&= \frac{ L^2 \ln (L)}{4 \pi ^3} \int _{\partial \Lambda } \textrm{d}{ {\mathcal {H}}^2}( v)\, |n(v) \cdot e_3 |-L^2\ln (L) \,\varepsilon (L, \Lambda ) . \end{aligned}$$
(6.10)

In this notation, Theorem 2.3 states that \(\lim _{L \rightarrow \infty }\varepsilon (L,\Lambda )=0\) for a piecewise Lipschitz region \(\Lambda \), and that \(\sup _{L \ge 2} |\varepsilon (L, \Lambda )|\ln (L) < \infty \) if \(\Lambda \) is a piecewise \(\textsf{C}^{1,\alpha }\) region. The main result of this section, the next theorem, shows that the estimate for Lipschitz regions is sharp and that the error term can be large, namely just \(o(L^2\ln (L))\). The negative sign in front of the error term does not necessarily mean that it has a definite sign, although in our example it does. Although our result only deals with the error term for the simplest non-trivial polynomial, namely \(t\mapsto t(1-t)\), we believe that also for the entropy the error term can be as large and only \(o(L^2\ln (L))\) for a Lipschitz region.

Theorem 6.2

Let \(\varphi :{{\mathbb {R}}}^+ \rightarrow {{\mathbb {R}}}^+\) be a bounded function with \(\lim _{L \rightarrow \infty } \varphi (L)=0\). Then, there is a piecewise Lipschitz region \(\Lambda \) and an \(L_0\) such that for any \(L \ge L_0\), the error term defined in (6.10) satisfies

$$\begin{aligned} \varepsilon (L, \Lambda ) \ge \varphi (L) . \end{aligned}$$
(6.11)

Remark 6.3

Let \({\mathcal {A}}\) be the subset of the space of all polynomials vanishing at 0 and 1 such that the error term in Theorem 2.3 for \(f \in {\mathcal {A}}\) is of order \(O(L^2)\) for any Lipschitz domain \(\Lambda \). This is clearly a linear subspace and the theorem tells us that \(t \mapsto t(1-t) \not \in {\mathcal {A}}\). Thus, the subspace has codimension at least one, which means that it satisfies (at least) one linear constraint. We conjecture that this constraint might be \(f \in {\mathcal {A}} \implies \textsf{I}(f)=0\). That is, the error term can only achieve the order \(O(L^2)\) if the leading term of order \(L^2 \ln (L)\) vanishes.

Proof of Theorem 6.2

We begin with a non-negative, summable sequence \((a_i)_{i \in {\mathbb {N}}}\) with \(\sum _{i \in {\mathbb {N}}} a_i=1\), which we will choose later. Let \(g_0 : [0,1] \rightarrow {{\mathbb {R}}}^+\) be the zigzag function defined by \(g_0(0)=1\) and for \(t>0\),

$$\begin{aligned} g_0'(t)= {\left\{ \begin{array}{ll} +1 &{} \text {if } \exists j \in {\mathbb {N}}:0< t -\sum _{i<j} a_i \le \frac{1}{2} a_j \\ -1 &{} \text {if } \exists j \in {\mathbb {N}}:\frac{1}{2} a_j< t -\sum _{i<j} a_i \le a_j \end{array}\right. } . \end{aligned}$$
(6.12)

If \(j=1\), then we use the convention that \(\sum _{i<1}a_i :=0\). Clearly, \(g_0\) is Lipschitz continuous with Lipschitz bound 1. We extend \(g_0\) to \([-1,2]\) by setting \(g_0(t)=t+1\) for \(t < 0\) and \(g_0(t)=2-t\) for \(t>1\). This extension is still Lipschitz continuous and satisfies \(g_0(-1)=g_0(2)=0\). Now, we can define the region \(\Lambda \),

$$\begin{aligned} \Lambda :=\big \{ (x_1,x_2,x_3) \in {{\mathbb {R}}}^3 :x_1 \in (0,1), x_3 \in (-1,2), -g_0(x_3)< x_2 < g_0(x_3)\big \} . \end{aligned}$$
(6.13)

This clearly defines a piecewise Lipschitz region. We will now sketch why this is even a strong Lipschitz domain (see [2, pp. 66–67] for the definition).

Fig. 1: Example of an \(x_2\)-\(x_3\) plot of the domain \(\Lambda \) for any \(x_1\in (0,1)\) and some sequence \((a_i)_{i \in {\mathbb {N}}}\). The upper half is the graph of \(g_0\). In green, one can see two sets \(\Lambda _{x^\perp }\). In the middle, one can see the ball of all points with respect to which \(\Lambda \) is star-shaped.

For any \(x_0 \in B_{1/ (2\sqrt{2})} (1/2,0,1/2)\), the region \(\Lambda \) is star-shaped with respect to \(x_0\). For the definition of a strong Lipschitz domain, we need to choose an open cover of \(\partial \Lambda \) and a projection with a certain direction on every set of the cover. For any orthogonal (rank 2) projection \(\pi :{{\mathbb {R}}}^3 \rightarrow {{\mathbb {R}}}^2\), on the two connected components of the set \(\pi ^{-1} ( \pi (B_{1/ (4\sqrt{2} )} (1/2,0,1/2))) \cap \partial \Lambda \), one can define the chart as the inverse of \(\pi \), which has a Lipschitz constant less than 10. This leads to an open cover of \(\partial \Lambda \) and one can then choose a finite subcover.

The boundary \(\partial \Lambda \) can be covered by the sets \(\partial _1 \Lambda :=\big \{x \in \partial \Lambda :x_1 \in \{0,1\}\big \}\) and \(\partial _2 \Lambda :=\big \{ x \in \partial \Lambda :x_1 \in [0,1]\big \}\). These two boundary sets have a non-empty intersection, but \(\partial _1 \Lambda \cap \partial _2 \Lambda \) is a “one-dimensional” set with two-dimensional Hausdorff measure zero, that is, \(\mathcal H^2(\partial _1 \Lambda \cap \partial _2 \Lambda )=0\).

For almost every \(x \in \partial _1 \Lambda \), the outward normal vector n(x) is given by \(\pm e_1\), while for almost every \(x \in \partial _2 \Lambda \), the outward normal vector is given by \(\frac{1}{\sqrt{2}} (\pm e_2 \pm e_3)\); the vectors \({e_1,e_2,e_3}\) are the usual unit vectors in the positive \(x_1,x_2,x_3\) directions. Hence, we observe

$$\begin{aligned} {{\mathcal {H}}^2}( \partial _1 \Lambda )&= 4 \int _{-1}^2 g_0(t) \,\textrm{d}t \le 9 , \end{aligned}$$
(6.14)
$$\begin{aligned} {{\mathcal {H}}^2}( \partial _2 \Lambda )&= 2 \int _{-1}^2 \sqrt{1+ (g_0')^2(t)} \,\textrm{d}t = 6 \sqrt{2} , \end{aligned}$$
(6.15)
$$\begin{aligned} \int _{\partial \Lambda } |n(x) \cdot e_3 |\,\textrm{d}\mathcal H^2(x)&= \frac{1}{\sqrt{2}} {{\mathcal {H}}^2}( \partial _2 \Lambda ) = 6 . \end{aligned}$$
(6.16)

It is important that \({{\mathcal {H}}^2}( \partial \Lambda ) = {\mathcal H^2}( \partial _1\Lambda ) + {{\mathcal {H}}^2}( \partial _2\Lambda )\) is bounded independently of the sequence \((a_i)_{i \in {\mathbb {N}}}\) and that the surface integral in (6.16) is completely independent of the sequence.
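To make the construction concrete, the following sketch builds \(g_0\) for one admissible (but otherwise arbitrary) choice of the sequence, \(a_i=2^{-i}\), and evaluates (6.14)–(6.16) numerically; this specific sequence is an assumption of the illustration and is not the one constructed later in the proof.

```python
import numpy as np

# one admissible choice of the sequence: a_i = 2^{-i}, summing to 1 up to a tail of 2^{-39}
a = np.array([2.0**(-i) for i in range(1, 40)])

def g0(t):
    """The zigzag profile (6.12) on [0,1], extended linearly to [-1,2] as in the text."""
    t = np.asarray(t, dtype=float)
    out = np.where(t < 0, t + 1, np.where(t > 1, 2 - t, 1.0))
    starts = np.concatenate(([0.0], np.cumsum(a)[:-1]))
    for s, aj in zip(starts, a):
        # over each block of length a_j the profile equals 1 plus a triangular tooth of height a_j/2
        tooth = aj / 2 - np.abs((t - s) - aj / 2)
        local = (t - s >= 0) & (t - s <= aj) & (t >= 0) & (t <= 1)
        out = np.where(local, 1.0 + tooth, out)
    return out

t = np.linspace(-1, 2, 300001)
g = g0(t)
dt = t[1] - t[0]
gp = np.gradient(g, dt)                       # |g0'| = 1 almost everywhere

area1 = 4 * np.sum(g) * dt                    # H^2(partial_1 Lambda), cf. (6.14); <= 9
area2 = 2 * np.sum(np.sqrt(1 + gp**2)) * dt   # H^2(partial_2 Lambda), cf. (6.15); ~ 6*sqrt(2)
flux  = area2 / np.sqrt(2)                    # the surface integral (6.16); ~ 6
print(area1, area2, flux)
```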

The leading asymptotic term for the trace on the left-hand side of (6.10) is provided by Theorem 2.3. Here, \(\textsf{I}(2) = -1/(4\pi ^2)\) and hence

$$\begin{aligned} \textrm{tr}\,\big (\textrm{D}_\mu (L\Lambda )-\textrm{D}_\mu (L\Lambda )^2\big )&= \frac{L^2 \ln (L)}{4 \pi ^3} \int _{\partial \Lambda } \textrm{d}{ \mathcal H^2}( v)\, |n(v) \cdot e_3 |+o(L^2\ln (L)) \end{aligned}$$
(6.17)
$$\begin{aligned}&= \frac{L^2 \ln (L)}{\pi ^3} \,\frac{3}{2} + o(L^2\ln (L)) , \end{aligned}$$
(6.18)

where we used (6.16).

We need an upper bound for the constant \({\mathcal {K}}( \Lambda )\) defined in Lemma A.3, which is independent of the function \(g_0\). We observe that \(\partial _1 \Lambda \) is the image of the Lipschitz functions \(f_j :[0,1]^2 \rightarrow \partial _1\Lambda ; (x_1,x_2) \mapsto (j,3x_1-1,g_0(3x_1-1)x_2)\) for \(j=0,1\) and \(\partial _2 \Lambda \) is contained in the image of the Lipschitz functions \(\tilde{f}_\pm :[0,1]^2 \rightarrow \partial _2 \Lambda ; (x_1,x_2) \mapsto (x_1,3x_2-1,\pm g_0(3x_2-1))\). Thus, the set \(\{f_0,f_1,\tilde{f}_+, \tilde{f}_-\}\) defines a piecewise Lipschitz atlas of \(\partial \Lambda \). Hence, as \(C_{\text {lip}}(f_j)=3 \sqrt{2}\) and \(C_{\text {lip}}(\tilde{f}_\pm )=3\), we observe

$$\begin{aligned} {\mathcal {K}}(\Lambda ) \le (16 \sqrt{2})^2 \times 2 \times (1+(3\sqrt{2})^2 +1+3^2)< \infty . \end{aligned}$$
(6.19)

Hence, by Lemma 3.2, we have

$$\begin{aligned} \textrm{tr}\,\textrm{D}_\mu (L\Lambda )^m =\frac{L^2}{2 \pi } \int _{{{\mathbb {R}}}^2} \textrm{d}x^\perp \,{\text {tr}} \left( \mathbb {1}_{L \Lambda _{x^\perp } }\mathbb {1}[(-\textrm{i}\nabla ^\parallel )^2 \le \mu ] \,\mathbb {1}_{L \Lambda _{x^\perp } } \right) ^m + O( L^2 ) \, \end{aligned}$$
(6.20)

with \(x^\perp =(x_1,x_2)\). To get to the polynomial \(f(t)=t(1-t)\) we have to subtract this term with \(m=2\) from the term with \(m=1\). Now, we intend to use Lemma 6.1. To do so, we need to describe the lengths of the (sub)intervals of \(\Lambda _{x^\perp }\) depending on \(x^\perp =(x_1,x_2)\).

We can ignore the case \(x_1 \in \{0,1\}\), as this is a null set with respect to the Lebesgue measure on \({{\mathbb {R}}}^2\). If \(x_1 \in (0,1)\) and \(|x_2 |\le 1\), then the set \(\Lambda _{x^\perp }\) is a single interval of length \(\ell _1(x_2)=3-2 |x_2 |\). The interesting case is \(x_1 \in (0,1)\) and \(1< |x_2 |<2\). Here, for any \(i \in {\mathbb {N}}\) with \(a_i>2(|x_2 |-1)\), there is an interval of size \(\ell _i(x_2)= a_i - 2 (|x_2 |-1)\), as illustrated in Fig. 1. For any \(|x_2 |>1\), this will only lead to finitely many intervals, as the sequence \((a_i)_{i \in {\mathbb {N}}}\) is a null sequence. Now, we apply (6.20) for \(m=1\) and \(m=2\), and then Lemma 6.1 and see that

$$\begin{aligned} \frac{ 2 \pi }{L^2}&\Big | \textrm{tr}\,\big ( \textrm{D}_\mu (L\Lambda )-\textrm{D}_\mu (L\Lambda )^2\big ) \Big | \end{aligned}$$
(6.21)
$$\begin{aligned}&\le {C+} \int _0^1 \textrm{d}x_1 \int _{{\mathbb {R}}}\textrm{d}x_2 \, \Vert (\mathbb {1}_{L\Lambda _{x^\perp }} \mathbb {1}(-\Delta \le \mu )\mathbb {1}_{L\Lambda _{x^\perp }})^2 - \mathbb {1}_{L\Lambda _{x^\perp }}\mathbb {1}(- \Delta \le \mu ) \mathbb {1}_{L\Lambda _{x^\perp }} \Vert _1 \end{aligned}$$
(6.22)
$$\begin{aligned}&\le \frac{2}{\pi ^2} \int _0^1 \textrm{d} x_2\, \big (\ln (1+L\sqrt{\mu }(3-2x_2)) + C\big ) \end{aligned}$$
(6.23)
$$\begin{aligned}&+ \frac{2}{\pi ^2}\int _1^2 \textrm{d}x_2\, \sum _{i \in {\mathbb {N}}:a_i > 2 (x_2-1)} \big (\ln (1+ L \sqrt{\mu }(a_i-2(x_2-1)))+C\big ) \end{aligned}$$
(6.24)
$$\begin{aligned}&=\frac{2}{\pi ^2} \int _0^1 \textrm{d} x_2 \,\Big (\ln (1+L\sqrt{\mu }(3-2x_2)) +C\Big )\nonumber \\ {}&\quad + \frac{2}{\pi ^2}\sum _{i \in {\mathbb {N}}} \int _{0}^{\frac{1}{2} a_i} \textrm{d}t\, \Big (\ln (1+ L \sqrt{\mu }2t) +C\Big ) . \end{aligned}$$
(6.25)

In the second step, we also used that the set \(\Lambda _{x^\perp }\) is independent of \(x_1 \in (0,1)\). The third step uses Fubini's theorem to exchange the sum and the integral and then transforms the integration variable to \(t :=\frac{1}{2} a_i+1- x_2\). The lower bound 0 in the last integral stems from the condition \(a_i>2(x_2-1)\), that is, from \(t>0\).

We intend to show that this upper bound is significantly smaller than the known asymptotics; the difference between the asymptotics and this upper bound then serves as a lower bound for the error term. For this to work, it is crucial that the coefficient in front of the upper bound equals the asymptotic coefficient, which is the reason why we can carry this argument out only for the polynomial \(f(t)=t(1-t)\).

We now allow our constants to depend on \(\mu \) (for general \(\nu \ge 1\), they depend on all values of \(\mu (\ell )\)) and use the trivial inequality \(\ln (1+ab) \le \ln (1+a) + \ln (1+b)\) for \(a,b\ge 0\) to arrive at

$$\begin{aligned} \frac{\pi ^3}{L^2}&\left| \textrm{tr}\,\big (\textrm{D}_\mu (L\Lambda )-\textrm{D}_\mu (L\Lambda )^2\big ) \right| \end{aligned}$$
(6.26)
$$\begin{aligned}&\le \int _0^1 (\ln (1+L) + C )\,{\textrm{d} x_2 \, } + \sum _{i \in {\mathbb {N}}} \Big [\int _0^{\frac{1}{2} a_i} \ln (1+ L t)\, \textrm{d}t + C a_i \Big ] {+ C} \end{aligned}$$
(6.27)
$$\begin{aligned}&= \ln (1+L) + \frac{1}{L} \sum _{i \in {\mathbb {N}}} {\Big (}\Big (1+ \frac{1}{2} a_iL\Big ) \Big ( \ln \Big (1+\frac{1}{2} a_i L\Big ) -1\Big ) -(-1) \Big ) + C \end{aligned}$$
(6.28)
$$\begin{aligned}&= \ln (1+L) + \sum _{i \in {\mathbb {N}}} \Big [\frac{1}{2} a_i \ln \Big (1+\frac{1}{2} a_i L\Big )+\frac{\ln (1+\frac{1}{2} a_i L)}{L} -\frac{1}{2} a_i \Big ] +C \end{aligned}$$
(6.29)
$$\begin{aligned}&\le \ln (L) + \sum _{i \in {\mathbb {N}}} \frac{1}{2} a_i \ln \Big (1+\frac{1}{2} a_i L\Big ) +C , \end{aligned}$$
(6.30)

or equivalently,

$$\begin{aligned}&\big (0\le \big )\,\textrm{tr}\,\big (\textrm{D}_\mu (L\Lambda )-\textrm{D}_\mu (L\Lambda )^2\big )\nonumber \\ {}&\quad \le \frac{L^2\ln (L)}{\pi ^3} \left[ 1+ \frac{1}{\ln (L)}\sum _{i \in {\mathbb {N}}} \frac{1}{2} a_i \ln \Big (1+\frac{1}{2} a_i L\Big ) + \frac{C}{\ln (L)}\right] . \end{aligned}$$
(6.31)

In the first step, we used \(1 \le 3-2x_2 \le 3\), and in the fourth step, we used \(\ln (1+L)\le \ln (L)+1\) for \(L \ge 1\) and \(\ln (1+\frac{1}{2} a_i L) \le \frac{1}{2} a_i L\).
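The elementary integral behind the step from (6.27) to (6.28) can be confirmed symbolically; the following short check (an aside, using sympy) verifies the closed form of \(\int _0^{a_i/2}\ln (1+Lt)\,\textrm{d}t\) used there.

```python
import sympy as sp

a, L, t = sp.symbols('a L t', positive=True)

# the integral appearing in (6.27)
integral = sp.integrate(sp.log(1 + L * t), (t, 0, a / 2))

# the closed form used in (6.28)
closed_form = ((1 + a * L / 2) * (sp.log(1 + a * L / 2) - 1) + 1) / L

print(sp.simplify(integral - closed_form))   # expected output: 0
```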

Now we rewrite (6.10) and use (6.16) and (6.31) to obtain

$$\begin{aligned} 2 \pi ^3 \varepsilon (L, \Lambda )&= \frac{1}{2} \int _{\partial \Lambda } \textrm{d}{\mathcal {H}}^2(v) \, |n(v)\cdot e_3| - \frac{2\pi ^3}{L^2\ln (L)} \textrm{tr}\,\big (\textrm{D}_\mu (L\Lambda ) - \textrm{D}_\mu (L\Lambda )^2\big ) \end{aligned}$$
(6.32)
$$\begin{aligned}&\ge 3 - {\left( 2+ \frac{1}{\ln (L)} \sum _{i \in {\mathbb {N}}} a_i \ln \Big (1+\frac{1}{2} a_i L\Big ) + \frac{2C}{\ln (L)} \right) } \end{aligned}$$
(6.33)
$$\begin{aligned}&=1 - \frac{1}{\ln (L)} \sum _{i \in {\mathbb {N}}} a_i \ln \Big (1+\frac{1}{2} a_i L\Big ) - \frac{C}{ \ln (L)} \end{aligned}$$
(6.34)
$$\begin{aligned}&= -\frac{1}{\ln (L)} \sum _{i \in {\mathbb {N}}} a_i \ln \left( \frac{1}{L}+\frac{1}{2} a_i \right) - \frac{C}{ \ln (L)} \end{aligned}$$
(6.35)
$$\begin{aligned}&\ge -\frac{1}{\ln (L)} \sum _{i \in {\mathbb {N}}, a_i < \frac{1}{L} } a_i \ln \left( \frac{3}{2L} \right) - \frac{C}{ \ln (L)} \end{aligned}$$
(6.36)
$$\begin{aligned}&\ge \sum _{i \in {\mathbb {N}}, a_i < \frac{1}{L} } a_i - \frac{C}{ \ln (L)} =:\varepsilon _0(L) . \end{aligned}$$
(6.37)

The fourth step uses \(\sum _i a_i=1\) and the fifth step relies on \(L \ge 2, a_i \le 1\) to get \(\ln ( \frac{1}{L}+\frac{1}{2} a_i ) \le 0 \). In the last step, the constant C changed (it additionally absorbs \(\ln (3/2)\)). Now, we just need to find a good sequence \((a_i)_{i \in {\mathbb {N}}}\). To show our claim, it suffices to find a sequence \((a_i)_{i \in {\mathbb {N}}}\) such that

$$\begin{aligned} \lim _{L \rightarrow \infty } \varphi (L) / \varepsilon _0(L) =0 , \end{aligned}$$
(6.38)

since then the quotient \(\varphi (L)/\varepsilon (L,\Lambda ) \le 2 \pi ^3\varphi (L)/\varepsilon _0(L)\) tends to 0 and is, in particular, less than 1 for large L, that is, \(\varepsilon (L,\Lambda )\ge \varphi (L)\ge 0\) for \(L\ge L_0\), where \(L_0\) is chosen below.

The construction of the sequence \(a_i\) relies on Lemma D.1, which we apply to \(f:=\varphi \). With the resulting function \(\textrm{Env}(\varphi )\) we define the sequence of real numbers \(a_i :=\textrm{Env}(\varphi )(i-1)-\textrm{Env}(\varphi )(i)\) for \(i \in {\mathbb {N}}\). As \(\textrm{Env}(\varphi )\) is non-increasing and convex, the \(a_i\) are non-negative and non-increasing. As \(\lim _{L \rightarrow \infty } \textrm{Env}(\varphi )(L)=0\), the telescoping sum gives \(\sum _{i \in {\mathbb {N}}}a_i=\textrm{Env}(\varphi )(0)=1\) and, since \(\textrm{Env}(\varphi )\) is non-increasing, \(\sum _{i \ge L} a_i\ge \textrm{Env}(\varphi )(L)\). As the sequence is non-increasing and \(\sum _{i\in {\mathbb {N}}} a_i =1\), we have \(a_i \le \frac{1}{i} \sum _{j \le i}a_j \le \frac{1}{i}\). Hence, we know that \(i \ge L\) implies \(a_i \le \frac{1}{L}\). Thus, we have the estimate

$$\begin{aligned} \sum _{i \in {\mathbb {N}}, a_i \le \frac{1}{L} } a_i\ge \sum _{i \ge L} a_i\ge \textrm{Env}(\varphi )(L) . \end{aligned}$$
(6.39)
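The telescoping construction of the sequence \((a_i)_{i \in {\mathbb {N}}}\) and the tail bound (6.39) are easy to illustrate numerically. Since Lemma D.1 and \(\textrm{Env}(\varphi )\) are not reproduced here, the sketch below substitutes a hypothetical envelope \(\textrm{Env}(L)=\sqrt{\ln (2)/\ln (2+L)}\), which is convex, non-increasing, equal to 1 at \(L=0\) and decays like \(1/\sqrt{\ln (2+L)}\); it serves only as a stand-in.

```python
import numpy as np

def Env(L):
    # hypothetical stand-in for Env(phi) from Lemma D.1: convex, non-increasing, Env(0) = 1
    return np.sqrt(np.log(2.0) / np.log(2.0 + L))

N = 200000
i = np.arange(1, N + 1)
a = Env(i - 1) - Env(i)          # a_i := Env(i-1) - Env(i), non-negative and non-increasing

print(bool(np.all(np.diff(a) <= 1e-15)))     # the a_i are non-increasing
print(a.sum() + Env(N))                      # telescopes to Env(0) = 1

L = 50
tail = a[L - 1:].sum() + Env(N)              # sum over i >= L, up to the cutoff at N
print(tail, Env(L))                          # tail sum >= Env(L), as used in (6.39)
```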

Furthermore, as \(\textrm{Env}(\varphi )(L) \ge C/\sqrt{\ln (2+L)}\), for L large enough, we have

$$\begin{aligned} \textrm{Env}(\varphi )(L) - \frac{C}{\ln (L)} \ge \frac{1}{2} \textrm{Env}(\varphi )(L) . \end{aligned}$$
(6.40)

Hence, we conclude that

$$\begin{aligned} 0\le \lim _{L \rightarrow \infty } \frac{\varphi (L)}{\varepsilon _0(L)} = \lim _{L \rightarrow \infty } \frac{\varphi (L)}{\sum _{i \in {\mathbb {N}}, a_i \le \frac{1}{L} } a_i - \frac{C}{ \ln (L)}} \le 2\lim _{L\rightarrow \infty }\frac{\varphi (L)}{\textrm{Env}(\varphi )(L)}=0 . \end{aligned}$$
(6.41)

One possible choice of \(L_0\) is any value such that \(4\pi ^3\varphi (L) \le \textrm{Env}(\varphi )(L)\) holds for all \(L>L_0\); such a value exists by (6.41). Thus, by this, (6.37), (6.39) and (6.40), there is an \(L_0 >0\) such that for any \(L>L_0\), we have

$$\begin{aligned} \varepsilon (L,\Lambda ) \ge \varepsilon _0(L) /(2\pi ^3) \ge \varphi (L) . \end{aligned}$$
(6.42)

This finishes the proof. \(\square \)