On zero-density estimates and the PNT in short intervals for Beurling generalized numbers

We study the distribution of zeros of zeta functions associated to Beurling generalized prime number systems whose integers are distributed as $N(x) = Ax + O(x^{\theta})$. We obtain in particular \[ N(\alpha, T) \ll T^{\frac{c(1-\alpha)}{1-\theta}}\log^{9} T, \] for a constant $c$ arbitrarily close to $4$, improving significantly the current state of the art. We also investigate the consequences of the obtained zero-density estimates on the PNT in short intervals. Our proofs crucially rely on an extension of the classical mean-value theorem for Dirichlet polynomials to generalized Dirichlet polynomials.


Introduction
The zeros of the Riemann zeta function occupy a central role in analytic number theory, as they are intimately connected to the distribution of the prime numbers. The most famous conjecture in this regard is the Riemann hypothesis, stating that $\zeta(s)$ does not possess any zeros in the half-plane $\operatorname{Re} s > 1/2$. On the other hand, in many arithmetic applications appealing to the full strength of the Riemann hypothesis is not necessary. For the PNT in short intervals, it suffices that the half-plane $\operatorname{Re} s > 1/2$ is a zero-sparse region, in the sense that the density hypothesis holds, that is, \[ N(\alpha, T) \ll T^{2(1-\alpha)} (\log T)^{O(1)}, \quad \alpha > 1/2, \ T \ge 2, \] where $N(\alpha, T)$ counts the number of zeros of $\zeta(s)$ to the right of $\alpha$ and up to height $T$. In that case the PNT in short intervals holds in the form $\psi(x + h) - \psi(x) \sim h$ for all $h \gg x^{\lambda}$ as $x \to \infty$ whenever $\lambda > 1/2$. Assuming the Riemann hypothesis would not improve this range. Unfortunately, the density hypothesis also remains only a conjecture to this day; the best known range [1] is currently $\alpha > 25/32 \approx 0.781$. If one weakens the density hypothesis further to \[ N(\alpha, T) \ll T^{c(1-\alpha)} (\log T)^{O(1)}, \tag{1.1} \] where $c > 2$, then the PNT in short intervals holds in the range $\lambda > 1 - 1/c$. The current record for $c$ is $12/5$ [6,7].
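The passage from an exponent $c$ in a zero-density estimate to the admissible interval length can be made concrete with a short computation. The following is only an illustrative sketch: the function name `short_interval_exponent` is ours (hypothetical), and it does nothing more than evaluate the threshold $1 - 1/c$ stated above in exact rational arithmetic.

```python
from fractions import Fraction


def short_interval_exponent(c):
    """Return the threshold 1 - 1/c: the PNT in short intervals holds
    for h >> x**lam whenever lam exceeds this value."""
    c = Fraction(c)
    if c < 2:
        raise ValueError("the text only considers exponents c >= 2")
    return 1 - 1 / c


# Density hypothesis (c = 2) and the record value c = 12/5 quoted in the text.
density_hypothesis = short_interval_exponent(2)
current_record = short_interval_exponent(Fraction(12, 5))
print(density_hypothesis, current_record)
```

With $c = 2$ the threshold is $1/2$, and with $c = 12/5$ it is $7/12$, the classical range for the rational primes.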
In this work, we study (1.1) and the corresponding application to the PNT in short intervals in the Beurling context, where there is in general no additive structure or functional equation. A Beurling generalized prime number system $(\mathcal{P}, \mathcal{N})$ consists of a sequence of generalized primes $\mathcal{P} = (p_k)_k$ with $1 < p_1 \le p_2 \le \dots$ and $p_k \to \infty$, and the generalized integers $\mathcal{N}$, which are formed by taking the multiplicative semigroup generated by the generalized primes and 1. Arranging the generalized integers in non-decreasing order, one obtains the sequence $1 = n_0 < n_1 = p_1 \le n_2 \le \dots$. One may associate to this system many familiar number-theoretic functions, such as \[ \pi_{\mathcal{P}}(x) = \sum_{p_k \le x} 1, \qquad N_{\mathcal{P}}(x) = \sum_{n_k \le x} 1, \qquad \zeta_{\mathcal{P}}(s) = \sum_{k \ge 0} \frac{1}{n_k^{s}}. \] We often omit the subscripts $\mathcal{P}$ and $k$ when there is no risk of confusion. The relationship between these three functions has been the subject of extensive research over the last century. We refer to the monograph [4] for a detailed account of the theory of Beurling systems.
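The construction of the generalized integers as the multiplicative semigroup generated by the generalized primes can be illustrated with a small computation. The sketch below is not taken from the paper: the function names and the toy list of primes are hypothetical, and the list of primes passed in is assumed to contain every generalized prime not exceeding the bound $x$.

```python
from bisect import bisect_right


def generalized_integers(primes, x):
    """List, in non-decreasing order and with multiplicity, all products
    of the given generalized primes that do not exceed x.

    Assumption: `primes` contains every generalized prime <= x
    (repeated values are allowed and count as distinct generators).
    """
    if x < 1:
        return []
    integers = [1]
    for p in primes:
        if p <= 1:
            raise ValueError("generalized primes must exceed 1")
        extended = []
        for n in integers:
            m = n
            while m <= x:  # collect n, n*p, n*p**2, ... while <= x
                extended.append(m)
                m *= p
        integers = extended
    return sorted(integers)


def counting_function(integers, x):
    """N(x): the number of generalized integers n_k <= x."""
    return bisect_right(integers, x)


# Toy system, chosen only for illustration: the prime 2 occurs twice.
toy_primes = [2, 2, 3]
toy_integers = generalized_integers(toy_primes, 12)
print(toy_integers)
print(counting_function(toy_integers, 6))
```

Because the two copies of 2 act as different generators, the value 4 appears three times in the list, which shows how repeated values among the $n_k$ arise.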
In order to guarantee that $\zeta(s)$ has an analytic extension to the left of $\sigma = 1$, we will throughout this work assume that there exist some $\theta \in [0, 1)$ and some $A > 0$ such that \[ N(x) = Ax + O(x^{\theta}). \tag{1.2} \]
We will refer to this condition by saying that the integers are well-behaved, or $\theta$-well-behaved if we want to specify the exponent $\theta \in [0, 1)$. The condition (1.2) implies that $\zeta(s) - A/(s-1)$ has analytic continuation to the half-plane $\sigma > \theta$. Even though this condition is quite restrictive, it is satisfied for many interesting multiplicative structures [10].
It has been known since the work of Landau on the prime ideal theorem that the zeros of the zeta function associated to a system having well-behaved integers must all lie outside a de la Vallée Poussin-type region: \[ \sigma > 1 - \frac{d}{\log |t|}, \tag{1.3} \] for some suitable positive constant $d$, depending only on $\theta$. In fact, for $\theta \in (1/2, 1)$ this zero-free region was shown to be optimal, apart from the precise value of $d$, in the seminal work of Diamond, Montgomery, and Vorhauer [3], who constructed a Beurling system whose zeta function has infinitely many zeros on a curve of the form $\sigma = 1 - d/\log |t|$.
A finer study of the distribution of zeros of Beurling zeta functions was recently initiated by Sz. Gy. Révész in [12]. Therein a quantitative version of the following lemma was established (a similar estimate was already obtained by Diamond, Montgomery, and Vorhauer as part of their "clustering of zeros" result, [3, Theorem 2]).
Lemma 1.1. Suppose that $N(x) = Ax + O(x^{\theta})$ for some $A > 0$ and $\theta \in [0, 1)$. For each $T > 2$ and $\alpha > \theta$, the number of zeros $\rho = \beta + i\gamma$ of the associated zeta function
Révész's result implies in particular that for each $\alpha > \theta$ we have $N(\alpha, T) \ll_{\alpha} T \log T$. The next natural question in the study of $N(\alpha, T)$ is whether it is possible to obtain a zero-density estimate. This was answered affirmatively by Révész in [13] under two conditions: the first (rather restrictive) condition that every Beurling integer must be a classical integer, that is $n_k \in \mathbb{N}$ for each $k$, and the second, rather mild condition that the number of repeated values in the generalized integers ($n_k = n_{k+1} = \dots = n_{k+l}$) is not too large on average (in [13], this is referred to as an "average Ramanujan condition"). The first condition is especially unsatisfactory, as it leaves open the possibility that the validity of zero-density estimates still relies on the (additively) very well-structured integers.
In this paper, we show that the two extra assumptions besides having well-behaved integers are superfluous for obtaining zero-density estimates. We also improve upon the constant $c = (6 - 2\theta)/(1 - \theta)$ that Révész achieved for (1.1), even though we do not assume his additional hypotheses. In our result, $\alpha$ appears only in the form $\frac{1-\alpha}{1-\theta}$, so that it is more natural to express $\alpha$ as a convex combination of $\theta$ and $1$: $\alpha = (1-\mu)\theta + \mu$. Then $\frac{1-\alpha}{1-\theta} = 1 - \mu$. In the zero-density estimate the function $c(\mu)$ occurs, which is given by
It is increasing on $[2/3, 1]$, with $c(2/3) = 3$ and $c(1) = 4$. Our main result is the following theorem.
It would be of interest to know whether the range where $N(\alpha, T) = o(T)$ holds can be improved, potentially to $\alpha > (\theta + 1)/2$. More or less simultaneously with the writing of this paper and independently of our work, Révész also managed to obtain a zero-density estimate assuming only (1.2) (see [14]). His method differs from ours, and the estimate he obtains is
Note that (as remarked in [14]) the factor $(1 - \alpha)^{-4}$ may be replaced by $O((\log T)^{4})$ by using the de la Vallée Poussin-Landau zero-free region (1.3).
Our zero-density estimate (1.5) is a direct improvement of (1.6), improving the constant in the exponent from 12 to $c(\mu) + \varepsilon$. Our zero-density estimate also has a larger (effective) range, namely $\alpha > (\theta + 2)/3$ instead of $\alpha > (\theta + 11)/12$. Furthermore, the value of $T_0$ in our estimate is independent of $\alpha$, whereas the value of $T_0$ could potentially go to $\infty$ as $\alpha \to 1$ in Révész's estimate. However, his estimate has the advantage that it is completely explicit in terms of all the involved parameters. It should be possible to compute the implicit constant in (1.5) in terms of the parameters $\theta$, $A$, and $K$, but we decided not to pursue this.
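The ranges quoted above become easier to compare once they are expressed through the convex-combination parameter $\mu$ with $\alpha = (1-\mu)\theta + \mu$. The sketch below is only an illustration: the helper name `mu_from_alpha` and the sample value of $\theta$ are hypothetical, and the code merely solves the linear relation for $\mu$ and checks the identity $(1-\alpha)/(1-\theta) = 1-\mu$ in exact arithmetic.

```python
from fractions import Fraction


def mu_from_alpha(alpha, theta):
    """Solve alpha = (1 - mu) * theta + mu for mu, with 0 <= theta < 1."""
    alpha, theta = Fraction(alpha), Fraction(theta)
    if not 0 <= theta < 1:
        raise ValueError("theta must lie in [0, 1)")
    return (alpha - theta) / (1 - theta)


theta = Fraction(3, 5)  # sample value, chosen only for illustration
thresholds = {
    "alpha > (theta + 2)/3": (theta + 2) / 3,
    "alpha > (theta + 11)/12": (theta + 11) / 12,
    "alpha > (theta + 1)/2": (theta + 1) / 2,
}
for label, alpha in thresholds.items():
    mu = mu_from_alpha(alpha, theta)
    # Identity from the text: (1 - alpha)/(1 - theta) = 1 - mu.
    assert (1 - alpha) / (1 - theta) == 1 - mu
    print(label, "corresponds to mu >", mu)
```

In terms of $\mu$ the three thresholds are $2/3$, $11/12$, and $1/2$, independently of $\theta$.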
The proof of Theorem 1.2 is based on two central results. The first one, established in Section 2, is an adaptation of the mean value theorem for Dirichlet polynomials, from which we deduce a large values estimate for generalized Dirichlet polynomials, Corollary 2.3. The results in this section are of significant intrinsic interest and we expect they may well have applications beyond the theory of Beurling numbers. The second main technical tool is a suitable zero-detection method. In Section 3 we explore Carlson's original zero-detection method. However, due to the fact that a Beurling zeta function cannot in general be approximated as well by its partial sums as the Riemann zeta function, a delicate optimization argument is required to reach $c(\mu)$ instead of a weaker constant in the exponent.
In Sections 4 and 5 we investigate the consequences of our zero-density estimates for the PNT in short intervals. Interestingly and somewhat surprisingly, a zero-density estimate such as (1.1) alone does not suffice to obtain the PNT in short intervals for any $\lambda < 1$. We prove this in Section 5 by a careful analysis of the example of Diamond, Montgomery, and Vorhauer. On the other hand, if one has a zero-free region of Littlewood type at one's disposal, then the PNT in short intervals does follow from (1.1). Finally, in Appendix A, we provide examples of generalized prime number systems demonstrating that it is impossible to improve $c(\mu)$ in Theorem 1.2 to a value smaller than 1.
We normalize the Fourier transform as $\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x) e^{-i\xi x}\,dx$. The implicit constants in the Vinogradov notation $\ll$ or the $O$-symbols are allowed to depend on $\theta$, the density $A$, and the implicit constant in (1.2), but are, unless explicitly mentioned, otherwise absolute.

A mean value theorem
The main ingredient in the proof of our zero-density result is the following generalization of the classical mean value theorem for Dirichlet polynomials (see e.g. [11, Theorem 6.1]).
Theorem 2.1. Let $N > 1$ and suppose that $1 \le n_0 \le n_1 \le \dots \le n_K \le N$ are real numbers. For $\lambda > 0$ denote by $\chi(x, \lambda)$ the number of $n_k$'s within distance $\lambda$ of $x$:
Suppose $a_k$ ($k = 0, \dots, K$) are complex numbers, and set
Then for $T_0 \in \mathbb{R}$, $T > 0$, and $\eta > 0$ we have (2.1)
Note that for classical Dirichlet series, this result reduces to the upper bound provided by the classical mean value theorem: if $n_k = k + 1$, then with $\eta = 1/(2N)$ the right-hand side of (2.1) becomes $(T + 2N)$
Proof. The proof is classical: one multiplies the integrand by a majorant of the characteristic function of $[T_0, T_0 + T]$ having well-localized Fourier transform, one squares out the sum, and interchanges summation and integration. We will utilize the Beurling-Selberg function $B(x)$ (see e.g. [16]) which satisfies
Then
We find
Now $\left|\log \frac{n_k}{n_l}\right| \ge 2\pi\eta$ as soon as $|n_k - n_l| \ge 4\pi\eta N$, say. Furthermore $F(\xi, \eta) \ll T + 1/\eta$ and we obtain
The result now follows upon replacing $\eta$ by $\eta' = 4\pi\eta$.
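The local counting function $\chi(x, \lambda)$ from Theorem 2.1 is easy to compute for a sorted finite sequence. The sketch below is an illustration under stated assumptions: the name `chi`, the use of a closed interval $|n_k - x| \le \lambda$, and the clustered sample sequence are ours and are not taken from the paper.

```python
from bisect import bisect_left, bisect_right


def chi(sorted_ns, x, lam):
    """Count the n_k with |n_k - x| <= lam.

    Assumption: the interval is taken closed; `sorted_ns` must be sorted.
    """
    return bisect_right(sorted_ns, x + lam) - bisect_left(sorted_ns, x - lam)


# Classical integers n_k = k + 1 up to N.
N = 50
classical = list(range(1, N + 1))
# Within distance 1/2 each classical integer only sees itself.
print(max(chi(classical, n, 0.5) for n in classical))

# A hypothetical sequence with a repeated value and a tight cluster.
clustered = [1, 2, 2, 2.1, 3, 4.5, 4.6, 4.7, 9]
print([chi(clustered, n, 0.5) for n in clustered])
```

For the classical integers every local count at distance $1/2$ equals 1, which is consistent with the remark that the theorem reduces to the classical bound in that case; clustered or repeated values produce larger counts.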
By an elementary lemma of Gallagher, one may deduce the corresponding discrete mean value theorem.
Theorem 2.2. With the same notation as in Theorem 2.1, let $\delta > 0$ and let $\mathcal{T} \subseteq [T_0 + \delta/2, T_0 + T - \delta/2]$ be a set of $\delta$-well-spaced points, in the sense that $|t - t'| \ge \delta$ for $t, t' \in \mathcal{T}$, $t \ne t'$. Then, for any $\eta > 0$,
Proof. By Gallagher's lemma [11, Lemma 1.2] one has
The result now follows by applying the Cauchy-Schwarz inequality, Theorem 2.1, and realizing that the coefficients $b_k$ of the generalized Dirichlet polynomial $S'(t)$ satisfy
Corollary 2.3. Assume the same hypotheses as in Theorem 2.2. Let $V$ be such that $|S(t)| \ge V$ for every $t \in \mathcal{T}$. Then

The zero-density estimate
Armed with the mean value theorem, we are now in the position to prove Theorem 1.2. The proof goes along classical lines: one introduces zero-detecting polynomials taking large values at zeros of $\zeta$, and one verifies with the help of Corollary 2.3 that this cannot happen too often. Our exposition is inspired by [9, Section 10.2].
First we require an approximation of $\zeta(s)$ by the partial sums of its defining Dirichlet series in the critical strip. Two aspects are important: the length of the sum, and the error in the approximation. For $s = \sigma + it$ with $t \asymp T$, we use sums of length $T^{\nu/(1-\theta)}$ for some $\nu \in (1, 2]$. The shorter the sum is, the bigger the error will be. We recall that we always assume that the generalized integers are $\theta$-well-behaved for some $\theta \in [0, 1)$.
We note that basic estimates of this kind for Beurling zeta functions are worked out with explicit constants in [12] and [14].
The right-hand side has analytic continuation (except for a simple pole at $s = 1$) to the half-plane $\sigma > \theta$. Integrating by parts gives
We take $Y = T^{\nu/(1-\theta)}$. If $\nu \le 2$, then the last error term dominates and the result follows.
We now fix a number $\nu \in (1, 2]$ and approximate $\zeta(s)$ by a sum of length $T^{\nu/(1-\theta)}$. Consider, for a parameter $X \ge 1$ to be determined later, the Dirichlet polynomial
where $\mu$ is the Möbius function of the number system. Uniformly for $0 \le \sigma \le 1$ we trivially have
where
$d$ being the divisor function of the generalized number system.
We now assume that $\sigma > (\theta + 1)/2$ and
The error term in (3.1) does not exceed $1/2$ provided that $T$ is sufficiently large. We split the sum $\sum_{X < n_k \le X T^{\nu/(1-\theta)}}$ in dyadic subsums:
Under the assumptions (3.2) and (3.4) we get, for sufficiently large $T$,
Let now $\alpha > (\theta + 1)/2$, and consider the set of zeros $\rho = \beta + i\gamma$ of $\zeta$ satisfying $\beta \ge \alpha$ and $T \le |\gamma| \le 2T$. From this set we select a subset $\mathcal{R}$ of 1-separated zeros in the following way. Take the first zero $\rho_1 = \beta_1 + i\gamma_1$ with $\gamma_1$ minimal in $[T, 2T]$. Inductively, if $\rho_j = \beta_j + i\gamma_j$ has been chosen, we choose $\rho_{j+1} = \beta_{j+1} + i\gamma_{j+1}$ with $\gamma_{j+1}$ minimal in $[\gamma_j + 1, 2T]$. By applying Lemma 1.1 with $\sigma = (\theta + 1)/2$, it is clear that the total number of zeros is bounded as
for at least one value of $l$. Setting $R_l$ to be the number of zeros $\rho$ in $\mathcal{R}$ for which $|D_l(\rho)| \ge 1/(2(L + 1))$ delivers
We now obtain a zero-density estimate by bounding $R_l$ via Corollary 2.3. First we mention a simple but useful lemma concerning mean values of powers of the generalized divisor function [10, Proposition 4.4.1]. For the proof of Theorem 1.2 we only need the case $j = 2$.
Lemma 3.2 (Knopfmacher). For each $j$ there exists a constant $c_j > 0$ such that, with $d$ being the generalized divisor function of a number system having well-behaved integers,
The assumption (1.2) implies in particular that for $\lambda$
The sums $D_l(\rho)$ can now be estimated via Corollary 2.3 by setting $\eta = (2N)^{\theta - 1}$, $\delta = 1$, and $V = 1/(2(L + 1))$:
and therefore
Finally it remains to check the assumptions on $X$. The first one, (3.2), is immediately clear, while (3.3)-(3.4) shall determine the range of validity for $\alpha$ of the zero-density estimate. Condition (3.3)
and $T$ is sufficiently large. This inequality is equivalent to
The quadratic polynomial $F_{\nu}(\alpha)$ has discriminant
If $\Delta_{\nu} < 0$, that is when $\nu > (5 + \sqrt{32})/7 \approx 1.52$, then $F_{\nu}(\alpha) > 0$ for any $\alpha$ and the zero-density estimate holds uniformly for $\alpha \ge \frac{\theta + (2\nu - 1)}{2\nu}$
In the first estimate, the implicit constant depends on $\nu$ if $\nu \to 3/2^{+}$, whereas the second estimate is uniform in $\nu$.
By choosing, if necessary, a smaller value of $\epsilon$ (depending on $\delta$) we can force $\nu < 3/2$. With this choice of $\nu$, the condition (3.8) holds, so
for $\mu \ge 2/3 + \delta$ and $T \ge T_0(\epsilon)$, where the choice of $\epsilon$ depends only on $\varepsilon$ and $\delta$. This concludes the proof of Theorem 1.2.
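The selection of 1-separated zeros used in the proof above is a simple greedy procedure on the ordinates $\gamma$. The sketch below illustrates it on made-up data: the function name `select_separated` and the sample ordinates are hypothetical and are not zeros of any actual zeta function.

```python
def select_separated(ordinates, T):
    """Greedy selection as described in the proof: take the smallest
    ordinate in [T, 2T], then repeatedly the smallest ordinate in
    [previous + 1, 2T]."""
    selected = []
    lower = T
    for gamma in sorted(ordinates):
        if gamma > 2 * T:
            break
        if gamma >= lower:
            selected.append(gamma)
            lower = gamma + 1  # next admissible ordinate is >= gamma + 1
    return selected


# Made-up ordinates, for illustration only.
sample = [10.0, 10.2, 10.9, 11.3, 12.5, 13.4, 19.9, 20.0, 25.0]
chosen = select_separated(sample, 10)
print(chosen)
```

Every discarded ordinate lies within distance 1 above a selected one, so a bound for the number of zeros in a window of bounded height (as supplied by Lemma 1.1) converts a bound for the selected set into a bound for all zeros in the range.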
Remark 3.4. The proof of Révész's zero-density estimate (1.6) avoids mean value theorems and works instead with (a variant of) the Halász-Montgomery inequality to bound a sum over zeros $\rho$ of certain zero-detecting polynomials. One can also follow this approach here, although the resulting zero-density estimate is weaker than (1.5). Let us sketch the line of reasoning.
Enumerate the zeros counted by $R_l$ as $\rho_r = \beta_r + i\gamma_r$, $r = 1, \dots, R_l$, according to increasing imaginary part, so with $\gamma_{r+1} - \gamma_r \ge 1$. Then
Applying the Halász-Montgomery inequality (see e.g. [11, Lemma 1.7]) gives
In the second line we used $|a_k| \le d(n_k)$, Lemma 3.2, the hypothesis (1.2) and integration by parts to estimate $n_k^{-\sigma - it}$ for $t \ne 0$. In the third line we exploited the separation of the $\gamma_r$:
If now $T \ll N^{2\alpha - 1 - \theta - \varepsilon}$, then we get the bound $R_l \ll N^{2 - 2\alpha} T^{\varepsilon}$, say. Now we apply Huxley's trick: we divide the interval $[T, 2T]$ into subintervals of length $T_1$ with $T_1 \ll N^{2\alpha - 1 - \theta - \varepsilon}$ and apply the above argument to each subinterval. We obtain
In order to obtain a non-trivial result, we now have to assume that $\alpha \ge (\theta + 3)/4$. Using
Selecting the optimal $X = T$
The final condition (3.4) implies in a similar way as before the positivity of a certain quadratic polynomial, this time given by
This polynomial is positive if $\nu > (5 + \sqrt{32})/7$, and otherwise its largest zero is
One may verify that this zero is at least $\alpha(\nu)$, with equality if and only if $\nu = 1$. Hence, for each fixed $\nu \in [1, 2]$, the zero-density estimate (3.9) obtained via the Halász-Montgomery estimate is inferior to the one obtained by the mean value theorem (Theorem 3.3), both in terms of the value of the exponent and its range of validity. Furthermore, the Halász-Montgomery approach is only valid for $\alpha > (\theta + 3)/4$, whereas the mean value theorem approach is able to reach $\alpha > (\theta + 2)/3$.
Remark 3.5. When $\theta = 0$, we can compare our zero-density estimate with Carlson's estimate $N(\alpha, T) \ll T^{4\alpha(1-\alpha)} (\log T)^{O(1)}$ for the Riemann zeta function. For each fixed $\alpha$, Carlson's estimate is superior to our estimate (although when $\alpha \to 1^{-}$, they are of similar quality, both exponents being $\sim 4(1 - \alpha)$). Furthermore, Carlson's estimate has effective range $\alpha > 1/2$, while our estimate only reaches $\alpha > 2/3$. The reason we are not quite able to reach Carlson's result is that the Riemann zeta function can be much better approximated by its partial sums. Indeed, it is well known that by a basic van der Corput lemma (which relies on the equal spacing of the integers!) one has for example (see e.g. [15, Theorem 4.11])
uniformly for $\sigma \ge \sigma_0 > 0$ and $1 \le |t| \le T$. Stronger versions of Lemma 3.1 would deliver better zero-density estimates, but it seems unlikely the lemma can be improved for general Beurling number systems admitting $\theta$-well-behaved integers.
Subsequent improvements of Carlson's estimate rely on deeper properties of the Riemann zeta function (such as the fourth power moment estimate, or subconvexity bounds), unavailable for general Beurling number systems.
One might expect that a better knowledge of the local distribution of the generalized integers yields a better zero-density estimate. This is indeed the case; as an example we show an improvement assuming the average Ramanujan condition (3.10). In conjunction with the assumption that $n_k \in \mathbb{N}$, this hypothesis allowed Révész to obtain his first zero-density theorem [13, Theorem 4]. We remark that the exponent in (3.11) still improves the exponent $(6 - 2\theta)(1 - \alpha)/(1 - \theta)$ obtained by Révész, even though we do not assume the restrictive $n_k \in \mathbb{N}$.
Theorem 3.6. Let $(\mathcal{P}, \mathcal{N})$ be a Beurling generalized number system satisfying (1.2) for some $A > 0$ and $\theta \in [0, 1)$, and: $\exists p > 1 : \forall \varepsilon > 0 :$
Then for every $\varepsilon > 0$ and $\delta > 0$ there exists $T_0 = T_0(\varepsilon, \delta)$ such that uniformly for $T \ge T_0$, $\alpha \ge (\theta + 2)/3 + \delta$:
The exponent provided by Theorem 1.2 if $\theta > 0$. For $\theta = 0$, (3.10) holds a fortiori and no improvement is achieved then.
Proof. We apply Corollary 2.3 with $\eta = 1/N$. This gives
From Hölder's inequality, (3.10) and Lemma 3.2 we infer that the sum is $\ll_{\varepsilon, p} N^{1 + \varepsilon}$. Inserting the range
The optimal choice is now
and (3.4) is fulfilled if
When this polynomial has a non-negative discriminant, in particular when $\nu \le 3/2$, its largest root is
We have $\alpha(\nu) \ge \frac{\theta + (2\nu - 1)}{2\nu}$ if and only if $\nu \le 3/2$. For a given $\alpha > (\theta + 2)/3$, we have
Inserting this value for $\nu$ in (3.12) yields the exponent
Selecting $\nu$ slightly to the right of $\alpha(\nu)$ in a similar fashion as in the proof of Theorem 1.2 then gives the result.
Remark 3.7. An artefact of the assumption (3.10) is the presence of the factor $T^{\varepsilon}$. For $\alpha \to 1^{-}$, it would then still be advantageous to employ the estimate of Theorem 1.2. However, replacing (3.10) by an "$\varepsilon$-free" assumption (e.g. with some power of $\log x$) naturally leads to an estimate with $T$ to the power $(c(\alpha) + \varepsilon)(1 - \alpha)$ instead of $c(\alpha)(1 - \alpha) + \varepsilon$, times a certain power of $\log T$, $c(\alpha)$ being the function appearing in the exponent in (3.11). One actually needs such a zero-density result for the application to the PNT in short intervals.

The PNT in short intervals
One of the interesting applications of zero-density estimates for the Riemann zeta function is the PNT in short intervals. Denoting the Chebyshev function by $\psi$, and given $0 < \lambda < 1$, we say that the PNT holds in intervals of length at least
For the rational primes, (4.1) was first shown to hold for some $\lambda < 1$ by Hoheisel [5], while the currently best known range for $\lambda$ is $\lambda > 7/12$, a consequence of the Ingham-Huxley zero-density estimate $N(\alpha, T) \ll T^{\frac{12}{5}(1-\alpha)} (\log T)^{O(1)}$ for the Riemann zeta function [6,7,8]. The starting point for proving (4.1) is an explicit formula $\psi(x) \approx x - \sum_{\rho} \frac{x^{\rho}}{\rho}$, where the sum is taken over the zeta zeros. The density estimate is then able to deliver bounds on the contribution from these zeros. In the Beurling setting, we also have such an explicit Riemann-von Mangoldt-type formula [12, Theorem 5.1].
Theorem 4.1 (Révész). Let $(\mathcal{P}, \mathcal{N})$ be a Beurling generalized number system with $\theta$-well-behaved integers, and let $b \in (\theta, 1)$ and $4 \le T \le x$. Then
where the sum is over the zeros $\rho = \beta + i\gamma$ of the associated Beurling zeta function.
Suppose we apply the above theorem with certain $b$ and $T = x^{a}$ for certain $a < 1$ to estimate the difference $\psi(x + h) - \psi(x)$. The contribution from a single zero is
Suppose now that $\zeta$ has infinitely many zeros $\beta + i\gamma$ satisfying $\beta \ge 1 - d/\log \gamma$ for some $d > 0$. When $\gamma \asymp T = x^{a}$, the contribution from such zeros is potentially as large as $\gg h$ and this might be problematic for proving (4.1). A careful analysis of the example of Diamond, Montgomery, and Vorhauer shows that this obstruction is indeed insurmountable.
Proposition 4.2. For all $\theta \in (1/2, 1)$, there exists a Beurling generalized number system having $\theta$-well-behaved integers with the following property. For every $\lambda \in [4/5, 1)$ there exist two sequences of numbers
The restriction $\lambda \ge 4/5$ is inconsequential, as the validity of Proposition 4.2 for $4/5 \le \lambda < 1$ implies the one for the range $0 \le \lambda < 1$. We postpone the proof of this proposition to Section 5 as it requires quite a bit of extra notation.
In view of this proposition, the PNT in short intervals is out of reach for general systems with well-behaved integers, although a Chebyshev bound in short intervals might still be attainable, that is, $\psi(x + h) - \psi(x) \asymp h$. On the other hand, if the zeta function admits a zero-free region of Littlewood type, then it is possible to obtain the PNT in short intervals.
Theorem 4.3. Suppose $(\mathcal{P}, \mathcal{N})$ is a Beurling generalized number system with $\theta$-well-behaved integers.
and if for some $b > \theta$ we have the zero-density estimate
and if for some $b > \theta$ we have the zero-density estimate
The classical proof (see e.g. [7, Theorem 1] or [9, Theorem 10.5]) goes through without difficulty. We include it here for the convenience of the reader.
Proof. Given $b$ and $\lambda$ as in the statement of the theorem, suppose that $h \gg x^{\lambda}$. We apply Theorem 4.1 with the given $b$ and with $T = x^{a}$ for some $a$ with $1 - a < \lambda$ to get
In the first case, the integral is, for $ac < 1$,
If we now take $a$ such that $1 - \lambda < a < d_1 (c d_1 + L)^{-1}$, then this is $o(1)$. Since the first term in (4.2) is also $o(1)$, it follows that $\psi(x + h) - \psi(x) \sim h$.
In the second case, the integral is
Selecting $a$ such that $1 - \lambda < a < d_2 (c d_2 + \max\{\log 2K, c d_2\})^{-1}$, we see that the integral is $< 1$. Since the other terms are $o(1)$, we obtain that $\psi(x + h) - \psi(x) \asymp h$ for $x$ sufficiently large.
Theorem 1.2 yields a zero-density estimate with $c = \frac{4 + \varepsilon}{1 - \theta}$ and $L = 9$, so we have the following corollary.
If $d$ can be taken to be arbitrarily large (by modifying $t_0$ accordingly), then it holds in the range $\lambda > (\theta + 3)/4$.
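To see how the constant $d$ of the zero-free region influences the admissible range, one can evaluate the lower bound for $\lambda$ implied by the condition $1 - \lambda < a < d_1 (c d_1 + L)^{-1}$ from the proof of Theorem 4.3. The following is a numerical illustration only, under stated assumptions: the function name `lambda_threshold` and the sample value of $\theta$ are hypothetical, $L = 9$ is the value quoted above, and $\varepsilon$ is set to 0 to display the limiting case of $c = (4 + \varepsilon)/(1 - \theta)$.

```python
from fractions import Fraction


def lambda_threshold(theta, d, eps=0, L=9):
    """Lower bound 1 - d/(c*d + L) for lambda, with c = (4 + eps)/(1 - theta).

    Assumption: eps = 0 is the limiting case of the exponent constant.
    """
    theta, d, eps = Fraction(theta), Fraction(d), Fraction(eps)
    c = (4 + eps) / (1 - theta)
    return 1 - d / (c * d + L)


theta = Fraction(1, 2)  # sample value, chosen only for illustration
limit = (theta + 3) / 4
for d in (1, 10, 1000, 10**6):
    bound = lambda_threshold(theta, d)
    # The bound decreases towards (theta + 3)/4 as d grows.
    print(d, float(bound), float(bound - limit))
```

The gap between the bound and $(\theta + 3)/4$ equals $L / (c(cd + L))$, which is positive and tends to 0 as $d \to \infty$, in line with the statement above.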
In the proof of the second point of Theorem 4.3, it is crucial that the estimate $N(\alpha, T) \le K T^{c(1-\alpha)}$ holds all the way up to $\alpha = 1 - d_2/\log T$. If one starts from an estimate $N(\alpha, T) \ll T^{c(1-\alpha)} (\log T)^{L}$, then this estimate can be made "log-free" by enlarging the value of $c$, but only in the range $1 - \alpha \gg \frac{\log\log T}{\log T}$, which is insufficient for proving the second part of the theorem.
One may wonder whether number systems with well-behaved integers always satisfy
for some $K > 0$ and $c > 0$. When for example $1 - \alpha = d/\log T$, this yields an upper bound independent of $T$ and this appears to be quite difficult to achieve in general. However, the example of Diamond, Montgomery, and Vorhauer satisfies (4.3). Also, we were unable to find a simple modification of their example breaking (4.3). Since their construction is in some sense extremal with respect to the distribution of zeros, one might be tempted to conjecture that (4.3) holds for any number system with well-behaved integers.
On a related note one can also immediately ask the question whether any number system having well-behaved integers admits Chebyshev bounds in short intervals: does ψ(x + h) − ψ(x) ≍ h when h ≫ x λ always hold for some λ < 1?

Proof of Proposition 4.2
The construction of Diamond, Montgomery, and Vorhauer is based on a continuous number system $(\Pi_C(x), N_C(x))$: a pair of (absolutely) continuous functions supported on $[1, \infty)$, which are non-decreasing, satisfy $\Pi_C(1) = 0$, $N_C(1) = 1$, and are linked via the identity
where the exponential is taken with respect to the multiplicative convolution of measures (see [4]). The function $N_C(x)$ plays the role of an integer-counting function, and $\Pi_C(x)$ that of a Riemann-type prime-counting function.
The actual discrete example $(\mathcal{P}, \mathcal{N})$ is a suitable "approximation" of this continuous system, whose existence is guaranteed by an ingenious probabilistic method. Since in particular
it suffices to show that $\psi_C$ does not admit the PNT in short intervals. Let us first briefly describe the relevant notions of this continuous system. We adopt the notations from the book of Diamond and Zhang [4], where a slightly more streamlined version of the construction from [3] is presented. For $k = 1, 2, \dots$, they set
The zeta function is defined as
This defines a function holomorphic for $\sigma > 1/2$ except for a simple pole at $s = 1$, and admitting zeros at $s = \rho_k$ and $s = \overline{\rho_k}$, situated on the curve $\sigma = 1 - \frac{1}{\log |t|}$. Actually $\zeta_C(s)$ has many more zeros, but the others lie to the left of this curve.
The Chebyshev function $\psi_C(x)$ is given by
Here $g(u)$ is a non-negative function supported on $[e, \infty)$ which is a polynomial in $\log u$ of degree at most $m - 1$ for $e^{m} < u < e^{m+1}$. It satisfies [4, Lemma 17.18]
\[ g(u) \log u < 7/3 \quad \text{for } u \ge e^{4}, \tag{5.2} \]
\[ |g(u) \log u - 1| < 2.7\, u^{-0.22} \quad \text{for } u \ge e^{5}, \tag{5.3} \]
\[ (g(u) \log u)' \ll u^{-1.22} \quad \text{for } u \ge e^{5}. \tag{5.4} \]
The exact definition of $g$ is not important for our purposes, but it is worth noting that
(The interested reader may consult [4] for more background information.) If we let $K$ be such that $\gamma_K \le x < \gamma_{K+1}$, then the sum in (5.1) goes only up to $K$, since the terms $I_k$ with $k > K$ are 0 in view of the support of $g$.
Let now $\lambda \in [4/5, 1)$, and let $j$ be such that $1 - \lambda \in (4^{-j-1}, 4^{-j}]$. We will set $x = x_K = B \exp(4^{\kappa})$, where $1 \le B = B_K \ll 1$ will be chosen later and (5.5)
Once $B$ and hence $x$ are determined, we choose
By definition of $\kappa$, $\gamma_K \le x < x + h < \gamma_{K+1}$ for sufficiently large $K$ when $h \asymp x^{\lambda}$. There holds (5.6)
We show that the term with $k = K - j$ is responsible for the largest contribution. The contribution from the terms with $k < K - j$ is, in view of (5.2), bounded as
Here, the $J_k$ are the integrals appearing in the right-hand side of (5.6). Using the fact that $g(u)$ is a piecewise polynomial in $\log u$ and integrating by parts, one may show (as in [4, p. 224]) that
Finally we consider the term $J_{K-j}$. We change variables and integrate by parts to find
Note that the choice (5.5) of $\kappa$ implies $x/\gamma_{K-j} = B^{1-\lambda} x^{\lambda}$ and $x^{4^{-(K-j)}} = B^{4^{-(K-j)}} \exp\left(\frac{1}{1-\lambda}\right) \ge e^{5}$. Let us first estimate the integral in (5.7). We have in view of (5.4) that
For the integral of $(4^{K-j} - 1) u^{4^{K-j} - 2} g(u) \log u \, \sin(4^{K-j} \gamma_{K-j} \log u)$ up to $x^{4^{-(K-j)}}$, we write $g(u) \log u = 1 + (g(u) \log u - 1)$. Integrating by parts again gives
Also, by (5.3)
We now return to the main term $[\dots]$ in (5.7). Replacing $g(u) \log u$ by 1, this main term becomes
Consider the term $\sin(\gamma_{K-j} \log x) = \sin(\gamma_{K-j} 4^{\kappa} + (\log B)\gamma_{K-j})$. For sufficiently large $K$ we choose $1 \le B = B_K \le 1.09$ such that this is $1$ or $-1$, if $K$ is even or odd, respectively. We approximate $\sin(\gamma_{K-j} \log(x + h))$ as
say, such that this sine is $-1$ or $1$, if $K$ is even or odd, respectively. In any case we have
if $K$ is sufficiently large. By (5.3), the error introduced by replacing the main term $[\dots]$ in (5.7) by $M$ is at most
if $K$, $h$, $x$ are sufficiently large and where we used $5.4 B^{1-\lambda} < 5.5$. If $\lambda \ge 4/5$, then $5.5 \exp\left(-\frac{0.22}{1-\lambda}\right) \le 1.9$, so that the above error is $\le 0.95|M|$. Collecting all estimates, we conclude
The sum is
since $\lambda \ge 4/5$. Finally, in view of (5.5), the $O_j$-term is
This concludes the proof of Proposition 4.2. Since the signs of the main terms alternate we have shown in particular

Appendix A. Sharpness of the zero-density theorem
In [14] it was announced that during a visit in Budapest in July 2022, we had come up with a family of Beurling systems displaying, in suitable ranges for $\sigma$, $N(\sigma, T) \gg T^{c(1-\sigma)}$, for a suitable $c > 0$. This demonstrates that in the general setting of $\theta$-well-behaved systems, one cannot (apart from the precise value of the constant $c$) obtain better than Carlson-type zero-density estimates. Afterwards we realized that this was already observed in the pioneering paper [3]; their example $\mathcal{P}_{\theta, \varepsilon}$, $1/2 < \theta < 1$, $\varepsilon > 0$, has $\theta$-well-behaved integers while its zeta function admits
on a sequence $T_n \to \infty$ and for a suitable $d$. Our example has only the mild advantage that the estimate on the zeta zeros holds for all sufficiently large $T$ and not only on a subsequence.
It also provides a lower bound for the number of zeros a Beurling zeta function can have on a line $\sigma = \beta$, where $\theta < \beta < 1$. For the purpose of completeness we include a brief sketch of our example in this appendix. We shall provide a Beurling system $\mathcal{P}_{\beta, \theta, \varepsilon}$, $1/2 < \theta < \beta < 1$, $\varepsilon > 0$, having the following properties:
(1) there exists $T_0 = T_0(\beta, \varepsilon)$ such that $N(\beta, T) \gg T$,
for $\theta < \sigma < \beta$ and $T \ge T_0$,
(3) there exists $A = A(\beta, \varepsilon) > 0$ such that $N(x) = Ax + O_{\varepsilon, \beta}\bigl(x^{\theta} \exp(C\sqrt{\log x})\bigr)$, for a suitable constant $C > 0$.
Our example draws inspiration from the Diamond-Montgomery-Vorhauer template. We first define a continuous system through its zeta function and then extract a discrete system via a probabilistic approximation procedure. As in Section 5 we define the zeta function as
for a sufficiently large $k_0$, but now the parameters $\ell_k$ and $\rho_k$ are chosen as
The function $G$ admits a zero at 0 and at [3, Lemma 2]
On the interval $I_k$ one may perform the same analysis to see that the product of all the factors in (A.1) except $G(\ell_k(s - \rho_k))$ is bounded. This final factor can be bounded by $O(|t|)$. Therefore we are justified to switch the contour in (A.2) to the line $\operatorname{Re} s = \theta$, picking up the residue $Ax + O_{\beta, \varepsilon}(1)$ at the point $s = 1$ in the process. Observing that $((x + 1)^{s+1} - x^{s+1})/(s + 1) \ll \min\{x^{\sigma}, x^{\sigma + 1}/t\}$, we achieve the bound $O_{\varepsilon}(x^{\theta} \log x)$ for the displaced contour integral (A.2) on the segments of $\operatorname{Re} s = \theta$ where $\zeta(s)$ is bounded. On the intervals $\theta + iI_k$ we find
In conclusion we obtain $N(x) \le Ax + O_{\varepsilon, \beta}(x^{\theta} \log x)$. The analysis of the lower bound for $N(x)$ is similar and omitted. This concludes the verification of property (3) for the continuous system.
Finally, following a suitable probabilistic discretization procedure, e.g. [2, Theorem 1.2] or [4, Lemma 17.5], one obtains a discrete Beurling system whose zeta function $\zeta_D(s)$ satisfies $\zeta_D(s) = \zeta(s) F(s)$, where $F(s)$ is an analytic function on the half-plane $\operatorname{Re} s > 1/2 + \varepsilon$ and bounded by $F(s) \ll \exp(C\sqrt{\log t})$ for a suitable absolute constant $C > 0$. So, as long as $\theta > 1/2$, the discrete system inherits all the properties analyzed above from the continuous system, possibly with a different value for $A$ and a slightly worse error term in the asymptotics for $N$.

Corollary 4.4.
If $(\mathcal{P}, \mathcal{N})$ is a Beurling generalized number system with $\theta$-well-behaved integers for which $\zeta(s)$ has no zeros for