An old interview challenge is to generate prime numbers or to check whether a number is prime. No advanced mathematics is needed, just variants on the Sieve of Eratosthenes. Start with a basic prime checker.
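```python
def isprime(n):
    divs = range(2, int(n ** 0.5) + 1)
    # 0 and 1 are not prime
    return n > 1 and all(n % d for d in divs)

# %time is an IPython/Jupyter magic
%time isprime(1_000_003)
```

A common optimization is to skip even numbers.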
```python
def isprime(n):
    divs = range(3, int(n ** 0.5) + 1, 2)
    # even numbers other than 2 are no longer covered by divs, so reject them explicitly
    return n == 2 or (n > 1 and n % 2 != 0 and all(n % d for d in divs))

%time isprime(1_000_003)
```

A brief digression on that optimization. There's nothing special about removing multiples of 2; removing multiples is the whole point. The scalar step could instead be thought of as a cycle of offsets: `itertools.accumulate(itertools.repeat(2))`. Removing multiples of 3 as well then drops every third step: `itertools.accumulate(itertools.cycle([2, 4]))`, applied from 5 onward.
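The cycle idea can be sketched directly. This is only an illustration, not the approach used in the rest of this post, and the name `wheel` is hypothetical. It seeds `itertools.accumulate` with 5 and then alternates steps of 2 and 4, which produces the integers above 3 that are divisible by neither 2 nor 3.

```python
import itertools

def wheel():
    """Candidates 5, 7, 11, 13, ...: integers above 3 not divisible by 2 or 3."""
    steps = itertools.chain([5], itertools.cycle([2, 4]))
    return itertools.accumulate(steps)

candidates = list(itertools.islice(wheel(), 8))
print(candidates)  # [5, 7, 11, 13, 17, 19, 23, 25]
```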
Or the equivalent could be done with slicing.
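```python
import itertools

def isprime(n):
    divs = range(5, int(n ** 0.5) + 1, 2)
    # divs[2::3] holds the odd multiples of 3 (9, 15, 21, ...), so that slice is skipped
    candidates = itertools.chain(divs[::3], divs[1::3])
    # multiples of 2 and 3 are no longer covered by divs, so test them directly
    if n in (2, 3):
        return True
    return n > 1 and n % 2 != 0 and n % 3 != 0 and all(n % d for d in candidates)

%time isprime(1_000_003)
```

The catch is that the cycle length grows multiplicatively with each prime added (it is the product of the primes removed), with diminishing returns on each successive one.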
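To put rough numbers on that trade-off, the following sketch computes the cycle's period, the number of steps in one cycle, and the fraction of integers that remain as candidate divisors. The helper name `wheel_stats` is made up for illustration.

```python
import math

def wheel_stats(basis):
    """Return (period, steps per cycle, fraction of integers kept) for a wheel."""
    period = math.prod(basis)
    # A residue survives only if it shares no factor with the period.
    steps = sum(math.gcd(r, period) == 1 for r in range(period))
    return period, steps, steps / period

for basis in ([2], [2, 3], [2, 3, 5], [2, 3, 5, 7]):
    period, steps, kept = wheel_stats(basis)
    print(basis, period, steps, round(kept, 3))
```

The period goes 2, 6, 30, 210, while the share of candidates kept only falls from 1/2 to 1/3 to 4/15 to 8/35.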
On to prime generation, keeping the odds-only optimization. The usual request is to generate the first N primes, or all primes up to some value. Both are easily derived from an unbounded generator with `itertools.islice` and `itertools.takewhile`, so the more Pythonic approach is to write the unbounded generator.
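```python
def primes():
    yield 2
    ints = itertools.count(3, 2)
    while True:
        prime = next(ints)
        yield prime
        ints = (n for n in ints if n % prime)

list(itertools.islice(primes(), 10))  # [2, 3, 5, 7, 9, 11, 13, 15, 17, 19]: 9 and 15 slip through
```

Elegant, but doesn't work. The problem is the scoping of `prime`, which is used in the generator expression but also rebound in the loop: the expression looks the name up each time it advances, so every nested filter ends up testing against the most recent value. It could be replaced with a filter on a partially bound function, but unfortunately `functools.partial` only binds leading positional arguments, and here it is the right operand of `%` that needs binding (`__rmod__`). One alternative is to use bound methods as first-class functions, even dunder methods.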
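Both points can be seen in isolation. In this small sketch (the names `late` and `bound` are only for illustration), the generator expression looks up `prime` every time it advances, so rebinding the name changes the filter after the fact, whereas `prime.__rmod__` captures the value at the moment the bound method is created.

```python
prime = 3
late = (n for n in range(1, 10) if n % prime)  # `prime` is looked up lazily
bound = filter(prime.__rmod__, range(1, 10))   # the value 3 is captured here

prime = 5  # rebinding affects only the generator expression

late_result = list(late)
bound_result = list(bound)
print(late_result)   # [1, 2, 3, 4, 6, 7, 8, 9]
print(bound_result)  # [1, 2, 4, 5, 7, 8]
```

Here `prime.__rmod__(n)` evaluates `n % prime`, which is why the reflected method is the one needed.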
```python
def primes():
    yield 2
    ints = itertools.count(3, 2)
    while True:
        prime = next(ints)
        yield prime
        # prime.__rmod__(n) is n % prime, with prime bound at this point
        ints = filter(prime.__rmod__, ints)

%time next(itertools.islice(primes(), 1000, None))
```

Elegant, but slow, and the ever-deeper nesting of filters could overflow the stack. A more traditional approach would use the same checking logic as `isprime`, but also cache the primes so as not to test redundant divisors.
```python
def primes():
    yield 2
    primes = []  # odd primes found so far
    for n in itertools.count(3, 2):
        # only cached primes up to sqrt(n) are tried as divisors
        if all(n % p for p in itertools.takewhile(int(n ** 0.5).__ge__, primes)):
            primes.append(n)
            yield n

%time next(itertools.islice(primes(), 1000, None))
```

On to interface design. The primes are being stored anyway, so it would be nice if they were re-iterable. An iterator can be written as a class with `__iter__` and `__next__`, but an under-appreciated feature is that `__iter__` itself can be a generator. And now that it's a class, `isprime` can be expressed as `in`, while also benefiting from the cache.
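As a minimal illustration of that pattern on its own, consider a hypothetical `Squares` class, unrelated to primes. Its `__iter__` is written as a generator function, so every loop gets a fresh iterator that replays the shared cache before extending it, and `__contains__` is built on the same iteration.

```python
import itertools

class Squares:
    """Hypothetical example: lazily cached perfect squares 0, 1, 4, 9, ..."""

    def __init__(self):
        self.ints = itertools.count()
        self.cache = []

    def __iter__(self):
        yield from self.cache      # replay what is already known
        for n in self.ints:        # then extend the shared cache
            self.cache.append(n * n)
            yield n * n

    def __contains__(self, value):
        # Squares are increasing, so iterate only as far as needed.
        return value in itertools.takewhile(lambda s: s <= value, self)

squares = Squares()
first = list(itertools.islice(squares, 5))
again = list(itertools.islice(squares, 5))  # starts over from the cache
print(first, again, 49 in squares, 50 in squares)
```

The same shape carries over to primes: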
```python
class Primes:
    def __init__(self):
        self.ints = itertools.count(3, 2)
        self.cache = [2]

    def __iter__(self):
        yield from self.cache
        for n in self.ints:
            if n in self:
                self.cache.append(n)
                yield n

    def __contains__(self, n):
        # 0 and 1 are not prime; otherwise try cached primes up to sqrt(n)
        return n > 1 and all(n % p for p in itertools.takewhile(int(n ** 0.5).__ge__, self))

primes = Primes()
%time next(itertools.islice(primes, 1000, None))
%time 1_000_003 in primes
```

There's a hybrid approach, though, that's faster and nearly as simple as the versions above. Instead of doing repeated divisions, keep track of each found prime along with the next multiple that it would eliminate. The inner loop is then cheap because it only needs to account for collisions, where that next multiple has already been claimed by another prime.
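```python
def primes():
    multiples = {}  # maps an upcoming composite to the prime that eliminates it
    for n in itertools.count(2):
        prime = multiples.pop(n, 0)
        if not prime:
            prime = n
            yield n
        # advance to the next multiple not already claimed by another prime
        key = n + prime
        while key in multiples:
            key += prime
        multiples[key] = prime

%time next(itertools.islice(primes(), 1000, None))
```

To add back the odds-only optimization, the step needs to be double the prime, so that only odd multiples are visited. Another way to reduce collisions is to recognize that each new prime is irrelevant until its square is reached, since every smaller multiple has a smaller prime factor.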
```python
def primes():
    yield 2
    multiples = {}
    for n in itertools.count(3, 2):
        step = multiples.pop(n, 0)
        if step:  # composite
            key = n + step
            while key in multiples:
                key += step
            multiples[key] = step
        else:  # prime
            multiples[n ** 2] = n * 2
            yield n

%time next(itertools.islice(primes(), 1000, None))
```

Finally, add back the caching. The result is a clean interface, an efficient implementation for every use case above, and still relatively simple.
```python
class Primes:
    def __init__(self):
        self.ints = itertools.count(3, 2)
        self.cache = [2]
        self.multiples = {}

    def __iter__(self):
        yield from self.cache
        for n in self.ints:
            step = self.multiples.pop(n, 0)
            if step:  # composite
                key = n + step
                while key in self.multiples:
                    key += step
                self.multiples[key] = step
            else:  # prime
                self.multiples[n ** 2] = n * 2
                self.cache.append(n)
                yield n

    def __contains__(self, n):
        # 0 and 1 are not prime; otherwise try cached primes up to sqrt(n)
        return n > 1 and all(n % p for p in itertools.takewhile(int(n ** 0.5).__ge__, self))

primes = Primes()
%time 1_000_003 in primes
%time 1_000_003 in primes  # the second call reuses the cache
```
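The bounded variants mentioned earlier, the first N primes or all primes up to some value, then fall out of `itertools.islice` and `itertools.takewhile`. The sketch below repeats the odds-only generator so that it runs on its own; the helper names `first` and `upto` are hypothetical.

```python
import itertools

def primes():
    """Unbounded odds-only incremental sieve, as in the generator above."""
    yield 2
    multiples = {}
    for n in itertools.count(3, 2):
        step = multiples.pop(n, 0)
        if step:  # composite: move its marker to the next free odd multiple
            key = n + step
            while key in multiples:
                key += step
            multiples[key] = step
        else:  # prime: its first relevant multiple is its square
            multiples[n ** 2] = n * 2
            yield n

def first(count):
    """The first `count` primes."""
    return list(itertools.islice(primes(), count))

def upto(limit):
    """All primes less than or equal to `limit`."""
    return list(itertools.takewhile(lambda p: p <= limit, primes()))

print(first(10))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(upto(30))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```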