It’s Sunday morning and I get some time to myself, so I’m listening to some blues and catching up on my newsfeeds when I come across this interesting article on calculating what size group of people would be necessary to have a 50/50 chance of two of them sharing the same birthday.
The primary focus of the article is on this equation to calculate the probability of uniqueness given a sample size of r from a group of N things to choose.
$$p = \frac {N!} {N^r (N-r)!}$$
In the article, he is concerned with the possibility of overflow given the size of the factorials involved, and since scipy doesn’t have a log-factorial function, he implemented his solution with scipy’s log-gamma (see the article above for his code). So I say to myself, “Wonder if I can do this with just straight Python?”
A quick google on log factorial found this approximation of log factorial by Srinivasa Ramanujan on math.stackexchange. If you have not heard of Ramanujan before, stop and google him immediately. Wow!
$$\log n! \approx n\log n-n+\frac{\log(n(1+4n(1+2n)))}{6}+\frac{\log(\pi)}{2}$$
This works out to the following Python code:

```python
from math import exp, log, pi

def logn_factorial(n):
    """return an approximation of log n! using Ramanujan's equation."""
    return n * log(n) - n + (log(n * (1 + 4*n * (1 + 2*n))) / 6) + (log(pi) / 2)

p = exp(logn_factorial(N) - logn_factorial(N - r) - r * log(N))
```
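To sanity-check the approximation, here is a quick worked example of my own, plugging in the classic birthday numbers (N = 365 days, r = 23 people; these illustrative values are mine, not from the article):

```python
from math import exp, log, pi

def logn_factorial(n):
    # Ramanujan's approximation of log n!
    return n * log(n) - n + (log(n * (1 + 4*n * (1 + 2*n))) / 6) + (log(pi) / 2)

N, r = 365, 23  # days in a year, people in the room
p = exp(logn_factorial(N) - logn_factorial(N - r) - r * log(N))
print(p)  # roughly 0.4927 -- just under the 50/50 threshold for uniqueness
```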
Ok, so no scipy needed to follow along with the article, plus I get to use some very cool math. I like Sunday morning fun time. However, I then start thinking, “overflow”, hmmmm. What is the upper bound of math.factorial anyway? Since it’s the birthday problem, let’s see what happens with 365:
```
>>> import math
>>> math.factorial(365)
25104128675558732292929443748812027705165520269876079766872595193901106138220937419666018009000254169376172314360982328660708071123369979853445367910653872383599704355532740937678091491429440864316046925074510134847025546014098005907965541041195496105311886173373435145517193282760847755882291690213539123479186274701519396808504940722607033001246328398800550487427999876690416973437861078185344667966871511049653888130136836199010529180056125844549488648617682915826347564148990984138067809999604687488146734837340699359838791124995957584538873616661533093253551256845056046388738129702951381151861413688922986510005440943943014699244112555755279140760492764253740250410391056421979003289600000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
```
Ok, let’s dial it up by a factor of 1000.
```
>>> math.factorial(365000)
```
Well, it took a bit, but it ran, printing over 40 screens of digits. Python is still going strong, so what is the upper bound of math.factorial? A search brought me to http://bugs.python.org/issue8692, and specifically this message, which says that the result cannot exceed sys.maxsize - 1 digits, or 2**63 - 1 digits on a 64-bit platform. Thanks to some dedicated individuals, who seemed to be having as much fun as I was, math.factorial is up to the task.
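To get a feel for how far 365000! is from that ceiling, here is a sketch of my own (not from the bug report) that counts its decimal digits using log-gamma arithmetic instead of computing the whole factorial again:

```python
import sys
from math import lgamma, log

n = 365000
# lgamma(n + 1) is log(n!); dividing by log(10) gives its length in decimal digits.
digits = int(lgamma(n + 1) / log(10)) + 1
print(digits)  # about 1.87 million digits
print(digits < sys.maxsize)  # nowhere near the 2**63 - 1 digit ceiling
```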
The take-away from this article is not the cool math, or the approximations of log n! — it is…
Don’t underestimate the power of Python.
Try straight Python before you move on to something more complex. The approximations of $\log n!$ were unnecessary. All that was needed was plain Python, and we get the following implementation of the probability of uniqueness:
```python
from math import factorial

p = factorial(N) / ((N**r) * factorial(N - r))
```
A simple and straightforward implementation of the equation.
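Plugging in the same birthday numbers (again, N = 365 and r = 23 are example values of my own) shows Python's arbitrary-precision integers handling the huge intermediate factorials without any fuss:

```python
from math import factorial

N, r = 365, 23  # days in a year, people in the group
# Exact big-integer arithmetic throughout; only the final division produces a float.
p = factorial(N) / ((N**r) * factorial(N - r))
print(p)  # roughly 0.4927, so the chance of a shared birthday is 1 - p, just over 0.5
```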
* A quick note: I’m usually a stickler for good variable names; however, when implementing math equations I stick as close as possible to the notation of the equation I am implementing.
* Also, if you are working on big enough numbers, and I mean huge numbers, then approximations might be needed, but by then you are going to be working at the limits of what a 64-bit platform can do.
Premature optimization and all that: when was the last time you fell into the trap of implementing something to handle a misconceived belief about a shortcoming in Python or one of the standard libs?