Custom zlib options
For the ZIP_32 and ZIP_64 methods, you can customise the compression object by overriding the default
get_compressobj parameter, which is shown below.
for zipped_chunk in stream_zip(unzipped_files(), get_compressobj=lambda: zlib.compressobj(wbits=-zlib.MAX_WBITS, level=9)):
    print(zipped_chunk)
To disable compression entirely for these methods, you can pass level=0 in the above. For the ZIP_AUTO method, the zlib object cannot be customised other than by passing level into it. See Methods for details and for other ways to leave member files uncompressed.
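As a rough illustration of what such a compression object produces, the sketch below uses only the standard zlib module and does not call stream_zip. The helper name raw_deflate and the sample data are assumptions made for this example. A negative wbits asks zlib for a raw deflate stream with no zlib header or trailer, and level sets the trade-off between speed and size.

```python
import zlib


def raw_deflate(data, level):
    # Hypothetical helper: builds the compression object the same way as the
    # get_compressobj lambda above, then compresses all the data in one go
    compress_obj = zlib.compressobj(wbits=-zlib.MAX_WBITS, level=level)
    return compress_obj.compress(data) + compress_obj.flush()


data = b'stream-zip ' * 10000

best = raw_deflate(data, level=9)    # Highest compression
stored = raw_deflate(data, level=0)  # No compression, only deflate framing

# Both are valid raw deflate streams that decompress to the original data
assert zlib.decompress(best, wbits=-zlib.MAX_WBITS) == data
assert zlib.decompress(stored, wbits=-zlib.MAX_WBITS) == data

print(len(data), len(best), len(stored))
```

With level=0 the output is slightly larger than the input, because the data is still wrapped in deflate blocks even though it is not compressed.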
Custom chunk size
The default bytes instance size is 65536 bytes. To customise this, you can override the chunk_size parameter, as shown below.
for zipped_chunk in stream_zip(unzipped_files(), chunk_size=65536):
    print(zipped_chunk)
This one size is used both for input, splitting or gathering any uncompressed data into chunks of chunk_size bytes before attempting to compress it, and for output, splitting or gathering any compressed data into chunks of chunk_size bytes before returning it to client code.
There may be performance differences between different chunk_size values. The default chunk_size may not be optimal for your use case.
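The "splitting or gathering" described above can be pictured with a small generator. This is a minimal sketch and not stream-zip's own implementation, which may well avoid the copying done here. The name rechunk is hypothetical.

```python
def rechunk(chunks, chunk_size=65536):
    # Hypothetical helper, not part of stream-zip: regroups an iterable of
    # bytes instances of any sizes into bytes instances of exactly chunk_size
    # bytes, apart from the last one, which may be shorter
    buffer = bytearray()
    for chunk in chunks:
        # Gather: collect incoming bytes until there are enough for a chunk
        buffer.extend(chunk)
        # Split: yield full chunks while enough bytes have been gathered
        while len(buffer) >= chunk_size:
            yield bytes(buffer[:chunk_size])
            del buffer[:chunk_size]
    # Yield whatever is left over at the end, if anything
    if buffer:
        yield bytes(buffer)


sizes = [len(chunk) for chunk in rechunk([b'a' * 5, b'b' * 12, b'c' * 3], chunk_size=8)]
print(sizes)  # [8, 8, 4]
```

Small inputs are gathered and large inputs are split, so every yielded chunk except possibly the last has the same size.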
Extended timestamps
By default, so-called extended timestamps are included in the ZIP. These store the modification time of member files more accurately than the original ZIP format allows. To omit the extended timestamps, you can pass extended_timestamps=False, as shown below.
for zipped_chunk in stream_zip(unzipped_files(), extended_timestamps=False):
    print(zipped_chunk)
This is useful to keep the total number of bytes as low as possible. It is also useful when creating Open Document files using stream_zip, because Open Document files cannot have extended timestamps in their member files if they are to pass validation.
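To see why the original format is less accurate, recall that it stores modification times in MS-DOS format, which keeps seconds in 5 bits and so only has 2-second resolution. The sketch below round-trips a time through that format. The function names are hypothetical and this is not stream-zip's own code.

```python
from datetime import datetime


def to_dos_date_time(dt):
    # MS-DOS format: 7 bits of years since 1980, 4 bits month, 5 bits day,
    # then 5 bits hour, 6 bits minute, and 5 bits for seconds divided by 2
    dos_date = ((dt.year - 1980) << 9) | (dt.month << 5) | dt.day
    dos_time = (dt.hour << 11) | (dt.minute << 5) | (dt.second // 2)
    return dos_date, dos_time


def from_dos_date_time(dos_date, dos_time):
    # Reverses to_dos_date_time, as far as the stored bits allow
    return datetime(
        (dos_date >> 9) + 1980, (dos_date >> 5) & 0b1111, dos_date & 0b11111,
        dos_time >> 11, (dos_time >> 5) & 0b111111, (dos_time & 0b11111) * 2,
    )


modified_at = datetime(2021, 1, 1, 21, 1, 13)
round_tripped = from_dos_date_time(*to_dos_date_time(modified_at))
print(round_tripped)  # 2021-01-01 21:01:12
```

The odd second is lost in the round trip. When extended timestamps are omitted, only this coarser value remains in the ZIP.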
Password protection / encryption
The data of ZIP files can be password protected / encrypted by passing a password as the password parameter to stream_zip. This encrypts the data with AES-256, adhering to the WinZip AE-2 specification.
import secrets

password = secrets.token_urlsafe(32)
encrypted_zipped_chunks = stream_zip(member_files(), password=password)
You should use a long and random password, for example one generated by the Python secrets module.
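For a sense of what such a password looks like, the sketch below generates one with the secrets module alone. The argument 32 follows the example above: it requests 32 random bytes, which is 256 bits, the same size as an AES-256 key.

```python
import secrets

# 32 random bytes, encoded as URL-safe base64 text without padding
password = secrets.token_urlsafe(32)

print(len(password))  # 43
```

The result is a plain str made of letters, digits, hyphens, and underscores, so it can be shared with recipients without any escaping.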
While AE-2 is seen as more secure than ZipCrypto, the original mechanism of password protecting ZIP files, fewer clients support AE-2 than ZipCrypto.
More importantly, AE-2 has flaws. These include:
Not encrypting metadata, for example member file names, modification times, permissions, and sizes.
Not including sufficient mechanisms to alert recipients if data or metadata has been intercepted and changed. This can itself lead to information leakage.
A higher risk of information leakage when a higher number of member files in the ZIP are encrypted with the same password, as is the case with stream-zip. However, AE-2 with AES-256 likely mitigates this enough for all but the extremely risk averse who also have an extremely high number of member files.