We know that the number of prime numbers less than \(x\) tends to \(\frac { x }{ \ln { \left( x \right) } } \)

Using this I have thought of a way of **estimation of sum of prime numbers less than "n"**.

Using the formula above, we get that the number of prime numbers less than \(x\) is \(\frac { x }{ \ln { \left( x \right) } } \), so the number of prime numbers less than \(x+1\) is \(\frac { x+1 }{ \ln { \left( x+1 \right) } }\).

So we can say that the probability of \(x+1\) being prime is \(\frac { \frac { x+1 }{ \ln { \left( x+1 \right) } } -\frac { x }{ \ln { \left( x \right) } } }{ 1 } \)

As \(x\) will grow bigger, \(1\) will become a very small value for it, so we can replace it with \(dx\). Here \(dx\) is the small quantity we generally use in calculus.

Probability of \(x+dx\) being prime is \[\frac { \frac { x+dx }{ \ln { \left( x+dx \right) } } -\frac { x }{ \ln { \left( x \right) } } }{ dx } \]

Solving the derivative, we get that the probability of \(x\) being prime is:- \[\frac { 1 }{ \ln { \left( x \right) } } -\frac { 1 }{ { \left( \ln { \left( x \right) } \right) }^{ 2 } } \]

We can generally say that the sum of prime numbers less than \(n\) is \(\sum _{ i=2 }^{ n }{ iP\left( i \right) } \). Here \( iP\left( i \right)\) is the probability of \(i\) being prime.

So we can rewrite this summation as an integration:- \[\int _{ 2 }^{ n }{ x\left( \frac { 1 }{ \ln { \left( x \right) } } -\frac { 1 }{ { \left( \ln { \left( x \right) } \right) }^{ 2 } } \right) } dx\]

Solving the integration, we get (Here \(\beta \left( n \right) \) represents **sum of prime numbers less than "n"**):- \[\beta \left( n \right) =\frac { { n }^{ 2 } }{ \ln { \left( n \right) } } -li({ n }^{ 2 })-\left[ \frac { 4 }{ \ln { \left( 2 \right) } } -li\left( 4 \right) \right] \]

Ignoring the constant value on the RHS, which would be too small as \(n\) grows bigger, we have:- \[\boxed { \beta \left( n \right) =\frac { { n }^{ 2 } }{ \ln { \left( n \right) } } -li({ n }^{ 2 }) } \]

**Please read this and comment if there are any mistakes or about anything else regarding this derivation.
If you are good at coding, please check if this formula works for large numbers.**

Thanks.

## Comments

Sort by:

TopNewestYour formula seems to approximate quite well even for huge values of n, see for yourself.

We can however simplify it by approximating pi(n) using the Prime Number Theorem :

\[\pi(n) \simeq \frac{n}{ln(n)}\]

instead of \[li(n)\]

so that we have : \[\beta(n) \simeq \frac{n^2}{ln(n)} - \frac{n^2}{ln(n^2)}\] \[\beta(n) \simeq \frac{n^2}{ln(n^2)}-\frac{n^2}{2 \cdot ln(n)}\] Finally, \[\beta(n) \simeq \frac{n^2}{2 \cdot ln(n)} \simeq \pi(n^2)\]

This gives better results, while being simpler. The accuracy for relatively big values of n (n > 10^3) can be enhanced by replacing back pi(n) with li(n) :

\[\boxed{\beta(n) \simeq li(n^2)}\]

Below is the graph of the difference between your formulae and the actual values, up to n = 10^5.

Plot[ Sum[ nBoole[PrimeQ[n]], {n, 2, i}] - ((i^2)/Log[i] - LogIntegral[i^2]), {i, 2, 10^5}]

And here is the graph of the difference between li(n^2) and the actual values, up to n = 10^5.

Plot[LogIntegral[i^2] - Sum[nBoole[PrimeQ[n]], {n, 2, i}], {i, 2, 10^5}]

As we can see, your formulae tends to "go away" from the actual values but in a rather nice/smooth way while li(n^2) is generally closer to the actual values but the variations of the difference are looking more random.

This last approximation gets very close to the actual value as n gets big :

\[\frac{li((10^{12})^2)}{\beta(10^{12})} = \frac{li(10^{24})}{\beta(10^{12})} = 1.0000006083242533806771...\]

\[\frac{\frac{(10^{12})^2}{2 \cdot ln((10^{12})^2)}}{\beta(10^{12})} = \frac{\pi(10^{24})}{\beta(10^{12})} = 0.9815582161130313615399...\]

\[\frac{\frac{10^{24}}{ln(10^{12})} - li(10^{24})}{\beta(10^{12})} = 0.9631158239018093424027...\] – Nathan Soufflet · 1 year, 7 months ago

Log in to reply

– Archit Boobna · 1 year, 7 months ago

Thanks a lot for the improvements!! :) Found it really helpful.Log in to reply

I thought of a simpler way of getting the same result, although your reasoning is nice :

Directly from the Prime Number Theorem, we get :

\[\pi(n) \simeq \frac{n}{ln(n)}\]

So, we have : \[P(n) \simeq \frac{\pi(n)}{n}\ \simeq \frac{n}{\frac{ln(n)}{n}} \]

\[P(n) \simeq \frac{1}{ln(n)}\]

hence :

\[\beta(n) = \sum_{i=2}^{n}{i\cdot P(i)} \simeq \int{\frac{n}{ln(n)}}dn\]

and \[\int{\frac{n}{ln(n)}}dn \simeq Ei(ln(n^2)) \simeq Ei(2 \cdot ln(n)) \]

where Ei(n) is the exponential integral, however we know that : \[li(n) = Ei(ln(n))\]

So, we have : \[Ei(2 \cdot ln(n)) = li(n^2)\]

We end up with our nice formula : \[\boxed{\beta(n) \simeq li(n^2)}\]

More generally, \[\beta(n, p) = \sum_{i=2}^{n}{i^{ p} \cdot P(i)} \simeq li(n^{p+1}), \forall (n, p) \in (\mathbb{R^+})^2, n \neq 1 \] – Nathan Soufflet · 1 year, 7 months ago

Log in to reply

I have tested this too, when I tried it for n=100, it gave 925 which is quite close to 1060, the actual answer. – Archit Boobna · 1 year, 7 months ago

Log in to reply