# Sandbox 4

\[\Large \mathbf{\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \frac{1}{5^2} + \cdots = \ ?}\]

\[\large A+B\]

\[\large B+C+D\]

\[\large A+D\]

\[\large N\]

\[\large 2016\cdot N\]

\[\large 2016 N\]

\[\large \color{red}{N}\]

\[\large \color{red}{2016 \cdot N}\]

\[\large 2016 \cdot \color{red}{N}\]

\[\large \color{red}{2016N}\]

\[\large \color{red}{2016\cdot N}\]

What is the minimum number of people such that the probability that at least two of them share the same birthday is at least 50%?

If you've seen this problem before, you would know that the answer is 23 if we are on Earth with 365 days. But what happens if we are on Mars which has 20 days in a year, or if we are on another planet?

#### Contents

## Setup of the Birthday Problem

Let’s understand how to obtain the answer of 23 for Earth with a probability of 50%. Let’s assume that it is equally likely to be born on any of the 365 days. Given \(n\) people, the probability that no two people will share the same birthday, meaning that they have distinct birthdays, is given by

\[ \frac{ 365 \times 364 \times \ldots \times (365 - n + 1 ) } { 365 ^ n} .\]

Thus, the probability that at least two people share the same birthday, which is the complement of the above, is

\[ 1 - \frac{ 365 \times 364 \times \ldots \times (365 - n + 1 ) } { 365 ^ n} = 1 - \frac{ 365! } { (365-n)! 365^n } . \]

We then want the smallest \(n\) such that

\[ 1 - \frac{ 365! } { (365-n)! 365^n } > 0.5. \]

This is not an easy inequality to solve directly. Let's graph this inequality to get a better sense of it

GRAPH OF \(y = 1 - \frac{ 365! } { (365-n)! 365^n } , y = 0.5 \). LOOKS LIKE

THEN ANSWER OF EARTH

(GO INTO MORE DETAIL of the generalization) More generally, if there are \(D\) days in the year, and we want a probability of at least \(p \), then we find the value of \(n\) which satisfies

\[ 1 - \frac{ D! } { ( D - n)! D^n } = p, \]

and round it up to the nearest integer.

## Obtaining an approximation answer

We see that the main difficulty is in solving this equation, which requires testing numerous values in order to get an exact answer. Let's see how we can apply approximations to simplify the calculations. We use the maclaurin series approximation \( e^x \approx 1 + x \) for small \( x\), to obtain

\[ \begin{align} \frac{ D \times (D-1) \times \ldots \times (D - n + 1 ) } { D ^ n} & = 1 \times ( 1 - \frac{1}{D} ) \times ( 1 - \frac{2}{D } ) \times \ldots \times( 1 - \frac{ (D - n + 1) } { D ^n } \\ -& \approx e^{0} \times e ^ { -1 /D } \times e ^ { -2 / D} \times \ldots \times e ^ { -(n-1)/D } \\ -& = e^{ -n(n-1) / (2 D ) } \end{align} \]

As such, we have the exponential equation

\[ \begin{align} & 1 - e^{-n(n-1)/(2D)} = p \\ \Rightarrow & 1 - p = e ^ { -n(n-1)/ (2D) } \\ \Rightarrow & - 2D \ln (1-p) = n^2 - n \\ \Rightarrow & \frac{1}{4} - 2D \ln (1-p) = ( n - \frac{1}{2})^ 2 \\ \Rightarrow & n = 1/2 + \sqrt{ 1/4 - 2D \ln (1-p) } \\ \Rightarrow & n \approx 1/2 + \sqrt{ - 2D \ln (1-p) } \end{align} \]

## Application to \( p = 0.5 \)

(INCOMPLETE)

As it turns out, the approximation of \(n = \lceil 1/2 + \sqrt{ 2D \ln 2 } \rceil \) is pretty accurate. In fact, \( n = \lceil \sqrt{ 2D \ln 2 } \rceil \) or \(\lceil \sqrt{ 2D \ln 2 } \rceil + 1\). With \(D = 20 \), we get \(1/2 + \sqrt{ 2D \ln 2 } = 5.76 \), and rounding up gives 6!

Given that we know that if \( D = 365 \), then \( n = 23 \), we could use the approximation that \( n \propto \sqrt{D} \). A quick estimate that we could do in our head is \( 23 \times \sqrt{ 20 / 365} \approx 23 \times \sqrt{ 20 / 360 } = 23 \times \sqrt{1/18} \approx 23 \times \frac{1}{4.25} \approx 5.4 \) people. You can use this (or the more accurate version \( n - \frac{1}{2} \propto \sqrt{D} \)) to quickly estimate the value and wow people at your accurate guesses!

## Analysis of the accuracy

(INCOMPLETE)

ANALYSIS AS TO HOW ACCURATE THIS APPROXIMATION IS! Actual graph - How far away are we from the answer? When does the answer disagree? Improving the approximation. Are we under-approximating or over-approximating?

## Points to ponder

(INCOMPLETE)

- Venus has BLAH days. What is the corresponding value of \(n\)?
- What happens if we apply this to another \(p\) value like \( p = 0.99 \)?
- This approximation procedure is very similar to the concept of poisson approximation of the binomial distribution. Do you see the similarity?

## divergent series

The binary operation of addition can be understood simply as a inherent part of the field structure of the reals. It also makes sense to naturally extend this operation to arbitrarily large finite sums by induction.

But the extension to infinite sums is non-trivial and requires a rigorous definition. So, here is the conventional meaning that is assigned to such objects:

Let \(a_1, a_2 \cdots\) be a (infinite) sequence of real numbers. We assign a value to the symbol \(a_1 + a_2 + \cdots\) iff it is convergent. And we define convergence as follows: A series is convergent iff the limit of the partial sums, i.e, the sequence \(a_1, (a_1 + a_2), (a_1 + a_2 + a_3), \cdots\) exist.

Under this definition, it is easy to see that the series \(1 -1 + 1 - 1 \cdots\) does not converge and hence this series does not have a value.

What you are really doing here is applying a summability method to get a value for the divergent series that fits. This is actually often useful. However, this is not the conventional meaning of the value of a series. When you are applying such methods you should state clearly that you are doing so.