Channel: Planet Python
Viewing all articles
Browse latest Browse all 22911

Pythonicity: Random Selection

Random selection utilities used to be common interview questions. They come up less often in Python circles because the standard library's random module already covers the basics. Still, advanced examples may come up. The first is a generalization of shuffle and sample.

In [1]:
import itertools
import random


def shuffled(iterable):
    """Generate values in random order for any iterable.

    Faster than `random.shuffle` if not all values are required.
    More flexible than `random.sample` if the desired number is unknown a priori.
    """
    values = list(iterable)
    while values:
        index = random.randrange(0, len(values))
        values[index], values[-1] = values[-1], values[index]
        yield values.pop()


list(itertools.islice(shuffled(range(10)), 5))
Out[1]:
[9, 7, 5, 4, 6]
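
To illustrate the case where the desired number is unknown a priori, here is a minimal sketch that keeps drawing values until a running total reaches a threshold. The generator is repeated so the block runs on its own; `draw_until` and the threshold of 20 are hypothetical, chosen only for illustration.

```python
import random


def shuffled(iterable):
    """Same generator as in the cell above, repeated so this block is self-contained."""
    values = list(iterable)
    while values:
        index = random.randrange(len(values))
        # Move the chosen value to the end so it can be popped in O(1).
        values[index], values[-1] = values[-1], values[index]
        yield values.pop()


def draw_until(iterable, threshold):
    """Hypothetical helper: draw distinct random values until their sum reaches threshold."""
    drawn, total = [], 0
    for value in shuffled(iterable):
        drawn.append(value)
        total += value
        if total >= threshold:
            break
    return drawn


picked = draw_until(range(1, 11), 20)
```

Because the generator is lazy, only as many swaps are performed as values are consumed. The initial copy into a list is still linear in the size of the input.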

Next up is a random sample in a single pass, e.g., if the data is being read from a large file. Correctness follows by mathematical induction on the number of elements N read so far, for a sample of size k:

  • the Nth element is selected with probability k/N
  • each previously selected element is replaced with probability 1/N
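
The inductive step can be written out as follows. This is a sketch of the standard argument, assuming a sample of size k, N > k elements read so far, and a replacement slot drawn uniformly from N possibilities; the base case N = k holds because the first k elements are all kept, with probability k/k = 1.

```latex
\begin{align*}
P(\text{element } N \text{ enters the sample}) &= \frac{k}{N}, \\
P(\text{earlier element is in the sample after step } N)
  &= \underbrace{\frac{k}{N-1}}_{\text{inductive hypothesis}}
     \cdot \underbrace{\left(1 - \frac{1}{N}\right)}_{\text{its slot is not drawn}} \\
  &= \frac{k}{N-1} \cdot \frac{N-1}{N} = \frac{k}{N}.
\end{align*}
```

So after every step each element seen so far is in the sample with the same probability k/N.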
In [2]:
def sample(iterable, k):
    """Return a random sample from any iterable in a single pass.

    More memory efficient than `random.sample`.
    """
    it = iter(iterable)
    selection = list(itertools.islice(it, k))
    # error handling and shuffling are consistent with random.sample
    if not 0 <= k <= len(selection):
        raise ValueError("sample larger than population")
    random.shuffle(selection)
    for count, value in enumerate(it, k + 1):
        index = random.randrange(0, count)
        if index < len(selection):
            selection[index] = value
    return selection


sample(iter(range(10)), 5)
Out[2]:
[1, 2, 6, 3, 8]
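
One way to sanity-check the fairness claim is to measure how often each value ends up in the sample over many trials. The sketch below repeats the function so it runs on its own; `inclusion_frequencies`, the trial count, and the fixed seed are assumptions made for illustration. With 10 values and k = 5, each frequency should land near 0.5.

```python
import collections
import itertools
import random


def sample(iterable, k):
    """Single-pass sample, repeated from the cell above so this block is self-contained."""
    it = iter(iterable)
    selection = list(itertools.islice(it, k))
    if not 0 <= k <= len(selection):
        raise ValueError("sample larger than population")
    random.shuffle(selection)
    for count, value in enumerate(it, k + 1):
        # The new value replaces a selected one with probability k / count.
        index = random.randrange(count)
        if index < len(selection):
            selection[index] = value
    return selection


def inclusion_frequencies(n, k, trials):
    """Hypothetical helper: fraction of trials in which each value was sampled."""
    counts = collections.Counter()
    for _ in range(trials):
        counts.update(sample(iter(range(n)), k))
    return {value: counts[value] / trials for value in range(n)}


random.seed(0)  # fixed seed only so repeated runs give the same numbers
frequencies = inclusion_frequencies(n=10, k=5, trials=20_000)
```

This is an empirical check, not a proof: it can expose a biased implementation but cannot confirm uniformity on its own.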
