How can you read binary files in Python? And how can you read very large binary files in small chunks?

How to read a binary file in Python
If we try to read a zip file with Python's built-in open function in the default read mode, we'll get an error:
>>> with open("exercises.zip") as zip_file:
...     contents = zip_file.read()
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8e in position 11: invalid start byte
We get an error because zip files aren't text files, they're binary files.
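The same failure can be reproduced without a real zip file. The sketch below is only an illustration: the file name demo.bin and its contents are made up for this example, and encoding="utf-8" is passed explicitly because the default text encoding depends on the platform.

```python
import os
import tempfile

# Hypothetical sample data: a zip-like signature followed by a byte (0x8e)
# that cannot start a character in UTF-8 text.
data = b"PK\x03\x04\x8e\x00"

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "demo.bin")  # made-up file name
    with open(path, mode="wb") as binary_file:
        binary_file.write(data)

    try:
        # Text mode decodes the bytes into a string, which fails for this data.
        with open(path, encoding="utf-8") as text_file:
            text_file.read()
    except UnicodeDecodeError as error:
        decode_failed = True
        print(f"Could not decode as text: {error}")
    else:
        decode_failed = False
```

Text mode always tries to decode what it reads, so any file whose bytes aren't valid text in the chosen encoding can fail this way.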
To read from a binary file, we need to open it with the mode rb instead of the default mode of rt:
>>> with open("exercises.zip", mode="rb") as zip_file:
...     contents = zip_file.read()
...
When you read from a binary file, you won't get back strings.
You'll get back a bytes object, also known as a byte string:
>>> with open("exercises.zip", mode="rb") as zip_file:
...     contents = zip_file.read()
...
>>> type(contents)
<class 'bytes'>
>>> contents[:20]
b'PK\x03\x04\n\x00\x00\x00\x00\x00Y\x8e\x84T\x00\x00\x00\x00\x00\x00'
Byte strings don't have characters in them: they have bytes in them.
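To make that difference concrete, here is a small sketch using the first four bytes from the output above. The variable names are just for illustration.

```python
# The first four bytes of the zip file contents shown above.
contents = b"PK\x03\x04"

first_byte = contents[0]     # indexing gives an integer from 0 to 255
first_two = contents[:2]     # slicing gives another bytes object
as_numbers = list(contents)  # looping over bytes gives integers as well

print(first_byte)  # 80
print(first_two)   # b'PK'
print(as_numbers)  # [80, 75, 3, 4]
```

Indexing a string gives back a one-character string, but indexing a bytes object gives back a number.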
The bytes in a file won't help us very much unless we understand what they mean.
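As one illustration of giving bytes meaning, the sketch below unpacks the first 18 of the 20 bytes shown earlier with the standard library's struct module. The field layout is an assumption taken from the ZIP format's local file header (signature, version needed, flags, compression method, modification time, modification date, and CRC-32, all little-endian), and the variable names are made up for this example.

```python
import struct

# The 20 bytes shown in the REPL output above.
header = b"PK\x03\x04\n\x00\x00\x00\x00\x00Y\x8e\x84T\x00\x00\x00\x00\x00\x00"

# "<" means little-endian with no padding: a 4-byte signature,
# five 2-byte unsigned integers, and one 4-byte unsigned integer (18 bytes).
signature, version, flags, method, dos_time, dos_date, crc32 = struct.unpack(
    "<4sHHHHHI", header[:18]
)

# MS-DOS dates pack the year (offset from 1980), month, and day into 16 bits.
year = 1980 + (dos_date >> 9)
month = (dos_date >> 5) & 0x0F
day = dos_date & 0x1F

print(signature, version, (year, month, day))
```

Even this small slice only makes sense once we know the file format it came from.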
Use a library to read your binary file
You probably won't read a …