Welcome to python-isal’s documentation!
Introduction
Faster zlib and gzip compatible compression and decompression by providing Python bindings for the ISA-L library.
This package provides Python bindings for the ISA-L library. The Intel(R) Intelligent Storage Acceleration Library (ISA-L) implements several key algorithms in assembly language. This includes a variety of functions to provide zlib/gzip-compatible compression.
python-isal provides the bindings by offering four modules:
- isal_zlib: A drop-in replacement for the zlib module that uses ISA-L to accelerate its performance.
- igzip: A drop-in replacement for the gzip module that uses isal_zlib instead of zlib to perform its compression and checksum tasks, which improves performance.
- igzip_threaded: Offers an open function which returns buffered read or write streams that can be used to read and write large files while escaping the GIL using one or multiple threads. This functionality only works for streaming; seeking is not supported.
- igzip_lib: Provides compression functions which have full access to the API of ISA-L’s compression functions.
isal_zlib and igzip are almost fully compatible with zlib and gzip from the Python standard library. There are some minor differences; see the section "Differences with zlib and gzip modules".
Quickstart
The python-isal modules can be imported as follows:
from isal import isal_zlib
from isal import igzip
from isal import igzip_lib
isal_zlib and igzip are meant to be used as drop-in replacements, so their API and functions are the same as those of the stdlib modules, except where ISA-L does not support the same calls as zlib (see the differences below).
Full API documentation can be found on our readthedocs page.
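As a minimal sketch of the drop-in usage, the snippet below compresses with isal_zlib and decompresses with the standard library's zlib. The fallback import and the sample payload are assumptions made for illustration, so that the snippet also runs where isal is not installed.

```python
import zlib

try:
    from isal import isal_zlib
except ImportError:
    # Assumption for this sketch: use the stdlib when isal is not installed.
    import zlib as isal_zlib

payload = b"python-isal quickstart example\n" * 1000

# Level 1 is a valid level for both isal_zlib (0-3) and zlib (0-9).
compressed = isal_zlib.compress(payload, 1)

# The result is an ordinary zlib stream, so the stdlib can decompress it.
restored = zlib.decompress(compressed)
print(f"{len(payload)} -> {len(compressed)} bytes")
```

Because only the import line differs, existing zlib-based code can usually be switched over by changing that line, subject to the differences listed further down.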
python -m isal.igzip implements a simple gzip-like command line application (just like python -m gzip). Full usage documentation can be found on our readthedocs page.
Installation
Installation with pip
pip install isal
Installation is supported on Linux, MacOS and Windows. On most platforms wheels are provided. The installation will include a statically linked version of ISA-L. If a wheel is not provided for your system, the installation will first build ISA-L in a temporary directory. Please check the ISA-L homepage for the build requirements.
The latest development version of python-isal can be installed with:
pip install git+https://github.com/rhpvorderman/python-isal.git
This requires having the build requirements installed. If you wish to link dynamically against a version of libisal installed on your system use:
PYTHON_ISAL_LINK_DYNAMIC=true pip install isal --no-binary isal
ISA-L is available in numerous Linux distros as well as on conda via the conda-forge channel. Check out the ports documentation on the ISA-L project wiki to find out how to install it. It is important that the development headers are also installed.
On Debian and Ubuntu the ISA-L libraries (including the development headers) can be installed with:
sudo apt install libisal-dev
Installation via conda
Python-isal can be installed via conda, for example using the miniconda installer with a properly set up conda-forge channel. When used with bioinformatics tools, the bioconda setup documentation provides a clear set of installation instructions for conda.
python-isal is available on conda-forge and can be installed with:
conda install python-isal
This will automatically install the ISA-L library dependency as well, since it is available on conda-forge.
python-isal as a dependency in your project
Python-isal supports a limited number of platforms for which wheels have been made available. To prevent your users from running into issues when installing your project, please list a python-isal dependency as follows.
setup.cfg:
install_requires =
    isal; platform.machine == "x86_64" or platform.machine == "AMD64" or platform.machine == "aarch64"
setup.py:
extras_require={
    ":platform.machine == 'x86_64' or platform.machine == 'AMD64' or platform.machine == 'aarch64'": ['isal']
},
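Since such a marker installs isal only on selected machines, code in the depending project would typically guard the import. The following is a sketch under that assumption; the names gzip_backend, USING_ISAL, pack and unpack are hypothetical and not part of python-isal.

```python
try:
    from isal import igzip as gzip_backend
    USING_ISAL = True
except ImportError:
    # isal was not installed on this platform: use the stdlib instead.
    import gzip as gzip_backend
    USING_ISAL = False


def pack(data: bytes) -> bytes:
    """Gzip-compress data with whichever backend was imported."""
    # Level 1 exists in both backends; higher numbers mean different things.
    return gzip_backend.compress(data, compresslevel=1)


def unpack(blob: bytes) -> bytes:
    """Decompress a gzip blob with whichever backend was imported."""
    return gzip_backend.decompress(blob)


message = b"works on every platform" * 50
roundtrip = unpack(pack(message))
```

Keeping the compression level inside one helper avoids scattering level numbers that would be interpreted differently by the two backends.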
Differences with zlib and gzip modules
- Compression level 0 in zlib and gzip means no compression, while in isal_zlib and igzip this is the lowest compression level. This is a design choice that was inherited from the ISA-L library.
- Compression levels range from 0 to 3, not 1 to 9. isal_zlib.Z_DEFAULT_COMPRESSION has been aliased to isal_zlib.ISAL_DEFAULT_COMPRESSION (2).
- isal_zlib only supports NO_FLUSH, SYNC_FLUSH, FULL_FLUSH and FINISH_FLUSH. Other flush modes are not supported and will raise errors.
- zlib.Z_DEFAULT_STRATEGY, zlib.Z_RLE etc. are exposed as isal_zlib.Z_DEFAULT_STRATEGY, isal_zlib.Z_RLE etc. for compatibility reasons. However, isal_zlib only supports a default strategy and will give warnings when other strategies are used.
- zlib supports different memory levels from 1 to 9 (with 8 default). isal_zlib supports memory levels smallest, small, medium, large and largest. These have been mapped to levels 1, 2-3, 4-6, 7-8 and 9. So isal_zlib can be used with zlib compatible memory levels.
- igzip.open returns a class IGzipFile instead of GzipFile. Since the compression levels are not compatible, a difference in naming was chosen to reflect this. igzip.GzipFile does exist as an alias of igzip.IGzipFile for compatibility reasons.
- igzip._GzipReader has been rewritten in C. Since this is a private member it should not affect compatibility, but it may cause some issues for instances where this code is used directly. If such issues should occur, please report them so the compatibility issues can be fixed.
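The level and memory-level differences can be made concrete with a small sketch. The helper names and the zlib-to-ISA-L level translation are hypothetical choices for illustration and are not part of python-isal; only the memory-level grouping (1, 2-3, 4-6, 7-8, 9) is taken from the list above.

```python
def isal_mem_level_name(zlib_mem_level: int) -> str:
    """Name the ISA-L memory level that a zlib memLevel (1-9) maps to."""
    if not 1 <= zlib_mem_level <= 9:
        raise ValueError("memLevel must be between 1 and 9")
    if zlib_mem_level == 1:
        return "smallest"
    if zlib_mem_level <= 3:
        return "small"
    if zlib_mem_level <= 6:
        return "medium"
    if zlib_mem_level <= 8:
        return "large"
    return "largest"


def to_isal_level(zlib_level: int) -> int:
    """Hypothetical translation of a zlib level (0-9) to an ISA-L level (0-3).

    zlib level 0 (store only) has no ISA-L equivalent, so it maps to the
    fastest ISA-L level, which still compresses.
    """
    if not 0 <= zlib_level <= 9:
        raise ValueError("zlib level must be between 0 and 9")
    # 0-2 -> 0, 3-5 -> 1, 6-8 -> 2, 9 -> 3
    return min(zlib_level // 3, 3)
```

A translation like to_isal_level is only needed when existing code passes zlib levels around; new code can use the 0 to 3 range directly.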
API Documentation: isal_zlib
The functions in this module allow zlib-compatible compression and decompression using the ISA-L library.
adler32(string[, start]) – Compute an Adler-32 checksum.
compress(data[, level]) – Compress data, with compression level 0-3.
compressobj([level[, …]]) – Return a compressor object.
crc32(string[, start]) – Compute a CRC-32 checksum.
decompress(string[, wbits[, bufsize]]) – Decompress a compressed string.
decompressobj([wbits[, zdict]]) – Return a decompressor object.
‘wbits’ is window buffer size and container format.
Compressor objects support compress() and flush() methods; decompressor objects support decompress() and flush().
- class isal.isal_zlib.Compress
Object returned by isal_zlib.compressobj
- compress(data, /)
Returns a bytes object containing compressed data.
- data
Binary data to be compressed.
After calling this function, some of the input data may still be stored in internal buffers for later processing. Call the flush() method to clear these buffers.
- flush(mode=4, /)
Return a bytes object containing any remaining compressed data.
- mode
One of the constants Z_SYNC_FLUSH, Z_FULL_FLUSH, Z_FINISH. If mode == Z_FINISH, the compressor object can no longer be used after calling the flush() method. Otherwise, more data can still be compressed.
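A minimal sketch of incremental compression with a compressor object follows. The stdlib fallback import and the sample chunks are assumptions for illustration.

```python
import zlib

try:
    from isal import isal_zlib
except ImportError:
    # Assumption for this sketch: use the stdlib when isal is not installed.
    import zlib as isal_zlib

chunks = [b"first chunk ", b"second chunk ", b"third chunk"]

compressor = isal_zlib.compressobj(1)
pieces = []
for chunk in chunks:
    # compress() may return b"" while data still sits in internal buffers.
    pieces.append(compressor.compress(chunk))
# flush() with its default mode finishes the stream; the compressor
# object cannot be used afterwards.
pieces.append(compressor.flush())
stream = b"".join(pieces)

# The joined pieces form one zlib stream that the stdlib can read.
restored = zlib.decompress(stream)
```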
- class isal.isal_zlib.Decompress
Object returned by isal_zlib.decompressobj.
- decompress(data, /, max_length=0)
Return a bytes object containing the decompressed version of the data.
- data
The binary data to decompress.
- max_length
The maximum allowable length of the decompressed data. Unconsumed input data will be stored in the unconsumed_tail attribute.
After calling this function, some of the input data may still be stored in internal buffers for later processing. Call the flush() method to clear these buffers.
- eof
True if the end-of-stream marker has been reached.
- flush(length=16384, /)
Return a bytes object containing any remaining decompressed data.
- length
The initial size of the output buffer.
- unconsumed_tail
A bytes object that contains any data that was not consumed by the last decompress() call because it exceeded the limit for the uncompressed data buffer. This data has not yet been seen by the zlib machinery, so you must feed it (possibly with further data concatenated to it) back to a subsequent decompress() method call in order to get correct output.
- unused_data
Data found after the end of the compressed stream.
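The interplay of max_length and unconsumed_tail can be sketched as a loop that bounds the amount of output produced per call. The stdlib fallback import, the 4096-byte limit and the test data are assumptions for illustration.

```python
import zlib

try:
    from isal import isal_zlib
except ImportError:
    # Assumption for this sketch: use the stdlib when isal is not installed.
    import zlib as isal_zlib

original = bytes(range(256)) * 400
compressed = zlib.compress(original)

decompressor = isal_zlib.decompressobj()
output = []
pending = compressed
while pending:
    # Produce at most 4096 bytes per call to keep memory use bounded.
    block = decompressor.decompress(pending, 4096)
    output.append(block)
    # Input that was not consumed must be fed back in on the next call.
    pending = decompressor.unconsumed_tail
# Collect whatever is still held in internal buffers.
output.append(decompressor.flush())
restored = b"".join(output)
```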
- isal.isal_zlib.adler32(data, value=1, /)
Compute an Adler-32 checksum of data.
- value
Starting value of the checksum.
The returned checksum is an integer.
- isal.isal_zlib.compress(data, /, level=2, wbits=15)
Returns a bytes object containing compressed data.
- data
Binary data to be compressed.
- level
Compression level, in 0-3.
- wbits
The window buffer size and container format.
- isal.isal_zlib.compressobj(level=2, method=8, wbits=15, memLevel=8, strategy=0, zdict=None)
Return a compressor object.
- level
The compression level (an integer in the range 0-3; default is currently equivalent to 2). Higher compression levels are slower, but produce smaller results.
- method
The compression algorithm. If given, this must be DEFLATED.
- wbits
+9 to +15: The base-two logarithm of the window size. Include a zlib container.
-9 to -15: Generate a raw stream.
+25 to +31: Include a gzip container.
- memLevel
Controls the amount of memory used for internal compression state. Valid values range from 1 to 9. Higher values result in higher memory usage, faster compression, and smaller output.
- strategy
Used to tune the compression algorithm. Not supported by ISA-L. Only a default strategy is used.
- zdict
The predefined compression dictionary - a sequence of bytes containing subsequences that are likely to occur in the input data.
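The three wbits ranges select the container format. A small sketch, assuming a stdlib fallback import and a helper named deflate that is hypothetical, produces one stream per format and reads each back with the matching wbits value:

```python
import zlib

try:
    from isal import isal_zlib
except ImportError:
    # Assumption for this sketch: use the stdlib when isal is not installed.
    import zlib as isal_zlib

data = b"container formats " * 200


def deflate(wbits: int) -> bytes:
    """Compress data with the container selected by wbits."""
    compressor = isal_zlib.compressobj(level=1, wbits=wbits)
    return compressor.compress(data) + compressor.flush()


zlib_stream = deflate(15)   # zlib header and Adler-32 trailer
raw_stream = deflate(-15)   # raw deflate, no header or trailer
gzip_stream = deflate(31)   # gzip header and CRC-32 trailer

# Decompression needs the same wbits value to expect the same container.
results = [
    zlib.decompress(zlib_stream, 15),
    zlib.decompress(raw_stream, -15),
    zlib.decompress(gzip_stream, 31),
]
```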
- isal.isal_zlib.crc32(data, value=0, /)
Compute a CRC-32 checksum of data.
- value
Starting value of the checksum.
The returned checksum is an integer.
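The value parameter of adler32 and crc32 allows a checksum to be computed over data that arrives in pieces. A minimal sketch, assuming a stdlib fallback import and made-up chunks:

```python
import zlib

try:
    from isal import isal_zlib
except ImportError:
    # Assumption for this sketch: use the stdlib when isal is not installed.
    import zlib as isal_zlib

chunks = [b"alpha", b"beta", b"gamma"]

crc = 0    # documented starting value for crc32
adler = 1  # documented starting value for adler32
for chunk in chunks:
    # Feed the previous result back in as the starting value.
    crc = isal_zlib.crc32(chunk, crc)
    adler = isal_zlib.adler32(chunk, adler)

whole = b"".join(chunks)
```

The running values match the checksums of the concatenated data, so the data never has to be held in memory at once.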
- isal.isal_zlib.crc32_combine()
Combine crc1 and crc2 into a new crc that is accurate for the combined data blocks that crc1 and crc2 were calculated from.
- crc1
the first crc32 checksum
- crc2
the second crc32 checksum
- crc2_length
the length of the data block crc2 was calculated from
- isal.isal_zlib.decompress(data, /, wbits=15, bufsize=16384)
Returns a bytes object containing the uncompressed data.
- data
Compressed data.
- wbits
The window buffer size and container format.
- bufsize
The initial output buffer size.
- isal.isal_zlib.decompressobj(wbits=15, zdict=b'')
Return a decompressor object.
- wbits
The window buffer size and container format.
- zdict
The predefined compression dictionary. This must be the same dictionary as used by the compressor that produced the input data.
API Documentation: igzip
Similar to the stdlib gzip module, but using the Intel(R) Intelligent Storage Acceleration Library (ISA-L) to speed up its methods.
- class isal.igzip.IGzipFile(filename=None, mode=None, compresslevel=2, fileobj=None, mtime=None)
The IGzipFile class simulates most of the methods of a file object with the exception of the truncate() method.
This class only supports opening files in binary mode. If you need to open a compressed file in text mode, use the igzip.open() function.
- __init__(filename=None, mode=None, compresslevel=2, fileobj=None, mtime=None)
Constructor for the IGzipFile class.
At least one of fileobj and filename must be given a non-trivial value.
The new class instance is based on fileobj, which can be a regular file, an io.BytesIO object, or any other object which simulates a file. It defaults to None, in which case filename is opened to provide a file object.
When fileobj is not None, the filename argument is only used to be included in the gzip file header, which may include the original filename of the uncompressed file. It defaults to the filename of fileobj, if discernible; otherwise, it defaults to the empty string, and in this case the original filename is not included in the header.
The mode argument can be any of ‘r’, ‘rb’, ‘a’, ‘ab’, ‘w’, ‘wb’, ‘x’, or ‘xb’ depending on whether the file will be read or written. The default is the mode of fileobj if discernible; otherwise, the default is ‘rb’. A mode of ‘r’ is equivalent to one of ‘rb’, and similarly for ‘w’ and ‘wb’, ‘a’ and ‘ab’, and ‘x’ and ‘xb’.
The compresslevel argument is an integer from 0 to 3 controlling the level of compression; 0 is fastest and produces the least compression, and 3 is slowest and produces the most compression. Unlike gzip.GzipFile 0 is NOT no compression. The default is 2.
The mtime argument is an optional numeric timestamp to be written to the last modification time field in the stream when compressing. If omitted or None, the current time is used.
- write(data)
Write the buffer data to the IO stream.
Return the number of bytes written, which is always the length of data in bytes.
Raise BlockingIOError if the buffer is full and the underlying raw stream cannot accept more data at the moment.
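The fileobj and mtime arguments can be combined to write gzip data into memory with a fixed timestamp. This is a sketch under the assumption of a stdlib fallback import; the helper name gzip_bytes is hypothetical, and igzip.GzipFile is used because it is documented as an alias of igzip.IGzipFile.

```python
import gzip
import io

try:
    from isal import igzip
except ImportError:
    # Assumption for this sketch: use the stdlib when isal is not installed.
    import gzip as igzip


def gzip_bytes(data: bytes) -> bytes:
    """Compress data into an in-memory gzip stream with a fixed mtime."""
    buffer = io.BytesIO()
    # mtime=0 pins the header timestamp, so equal input gives equal output.
    with igzip.GzipFile(fileobj=buffer, mode="wb", compresslevel=1, mtime=0) as handle:
        handle.write(data)
    return buffer.getvalue()


first = gzip_bytes(b"reproducible output")
second = gzip_bytes(b"reproducible output")
```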
- exception isal.igzip.BadGzipFile
Exception raised in some cases for invalid gzip files.
- isal.igzip.compress(data, compresslevel: int = 3, *, mtime: SupportsInt | None = None) → bytes
Compress data in one shot and return the compressed string. Optional argument is the compression level, in range of 0-3.
- isal.igzip.decompress(data)
Decompress a gzip compressed string in one shot. Return the decompressed string.
This function checks for extra gzip members. Using isal_zlib.decompress(data, wbits=31) is faster in cases where only one gzip member is guaranteed to be present.
- isal.igzip.open(filename, mode='rb', compresslevel=2, encoding=None, errors=None, newline=None)
Open a gzip-compressed file in binary or text mode. This uses the isa-l library for optimized speed.
The filename argument can be an actual filename (a str or bytes object), or an existing file object to read from or write to.
The mode argument can be “r”, “rb”, “w”, “wb”, “x”, “xb”, “a” or “ab” for binary mode, or “rt”, “wt”, “xt” or “at” for text mode. The default mode is “rb”, and the default compresslevel is 2.
For binary mode, this function is equivalent to the GzipFile constructor: GzipFile(filename, mode, compresslevel). In this case, the encoding, errors and newline arguments must not be provided.
For text mode, a GzipFile object is created, and wrapped in an io.TextIOWrapper instance with the specified encoding, error handling behavior, and line ending(s).
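A short sketch of text mode follows: the file is written and read back as str, with the encoding passed through to the text wrapper. The temporary path, the sample text and the stdlib fallback import are assumptions for illustration.

```python
import os
import tempfile

try:
    from isal import igzip
except ImportError:
    # Assumption for this sketch: use the stdlib when isal is not installed.
    import gzip as igzip

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt.gz")
    # "wt" wraps the binary gzip stream in an io.TextIOWrapper.
    with igzip.open(path, "wt", encoding="utf-8") as handle:
        handle.write("first line\nsecond line\n")
    # "rt" decodes the decompressed bytes back into str lines.
    with igzip.open(path, "rt", encoding="utf-8") as handle:
        lines = [line.rstrip("\n") for line in handle]
```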
API Documentation: igzip_threaded
- isal.igzip_threaded.open(filename, mode='rb', compresslevel=2, encoding=None, errors=None, newline=None, *, threads=1, block_size=1048576)
Utilize threads to read and write gzip objects and escape the GIL. Comparable to gzip.open. This method is only usable for streamed reading and writing of objects. Seeking is not supported.
threads == 0 will defer to igzip.open. threads < 0 will attempt to use the number of threads in the system.
- Parameters:
filename – str, bytes or file-like object (supporting read or write method)
mode – the mode with which the file should be opened.
compresslevel – Compression level, only used for gzip writers.
encoding – Passed through to the io.TextIOWrapper, if applicable.
errors – Passed through to the io.TextIOWrapper, if applicable.
newline – Passed through to the io.TextIOWrapper, if applicable.
threads – If 0 will defer to igzip.open, if < 0 will use all threads available to the system. Reading gzip can only use one thread.
block_size – Determines how large the blocks in the read/write queues are for threaded reading and writing.
- Returns:
An io.BufferedReader, io.BufferedWriter, or io.TextIOWrapper, depending on the mode.
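Streaming use of the threaded opener might look like the sketch below. The wrapper open_streaming is hypothetical; it falls back to the standard library's gzip.open (without threads) when isal is not installed, which is an assumption made so the snippet runs anywhere.

```python
import gzip
import os
import tempfile

try:
    from isal import igzip_threaded
except ImportError:
    # Assumption for this sketch: no isal, so no threaded opener.
    igzip_threaded = None


def open_streaming(path, mode="rb", threads=1):
    """Open a gzip file for sequential reading or writing only."""
    if igzip_threaded is not None:
        return igzip_threaded.open(path, mode, threads=threads)
    return gzip.open(path, mode)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "records.gz")
    # Write sequentially; seeking is not supported on these streams.
    with open_streaming(path, "wb", threads=1) as writer:
        for number in range(1000):
            writer.write(b"record %d\n" % number)
    # Read back line by line, again strictly front to back.
    with open_streaming(path, "rb") as reader:
        record_count = sum(1 for _ in reader)
```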
API Documentation: igzip_lib
Pythonic interface to ISA-L’s igzip_lib.
This module comes with the following constants:
- The lowest compression level (0).
- The highest compression level (3).
- The compromise compression level (2).
- Default size for the starting buffer (16K).
- Maximum window size bits (15).
- Flag to compress to a raw deflate block.
- Flag to compress a gzip block, consisting of a gzip header, a raw deflate block and a gzip trailer.
- Flag to compress a gzip block without a header.
- Flag to compress a zlib block, consisting of a zlib header, a raw deflate block and a zlib trailer.
- Flag to compress a zlib block without a header.
- Flag to decompress a raw deflate block.
- Flag to decompress a gzip block including header and verify the checksums in the trailer.
- Decompresses a raw deflate block (no header, no trailer) and updates the crc member on the IgzipDecompressor object with a crc32 checksum.
- Like DECOMP_GZIP_NO_HDR but reads the trailer and verifies the crc32 checksum.
- Flag to decompress a zlib block including header and verify the checksums in the trailer.
- Decompresses a raw deflate block (no header, no trailer) and updates the crc member on the IgzipDecompressor object with an adler32 checksum.
- Like DECOMP_ZLIB_NO_HDR but reads the trailer and verifies the adler32 checksum.
- The default memory level for the internal level buffer. (Equivalent to MEM_LEVEL_LARGE.)
- The minimum memory level.
- The largest memory level.
- class isal.igzip_lib.IgzipDecompressor(flag=0, hist_bits=15, zdict=b'')
Create a decompressor object for decompressing data incrementally.
- flag
Flag signifying which headers and trailers the stream has.
- hist_bits
The lookback distance is 2 ^ hist_bits.
- zdict
Dictionary used for decompressing the data
For one-shot decompression, use the decompress() function instead.
- crc
The checksum that is saved if DECOMP_ZLIB* or DECOMP_GZIP* flags are used.
- decompress(data, /, max_length=-1)
Decompress data, returning uncompressed data as bytes.
If max_length is nonnegative, returns at most max_length bytes of decompressed data. If this limit is reached and further output can be produced, self.needs_input will be set to False. In this case, the next call to decompress() may provide data as b'' to obtain more of the output.
If all of the input data was decompressed and returned (either because this was less than max_length bytes, or because max_length was negative), self.needs_input will be set to True.
Attempting to decompress data after the end of stream is reached raises an EOFError. Any data found after the end of the stream is ignored and saved in the unused_data attribute.
- eof
True if the end-of-stream marker has been reached.
- needs_input
True if more input is needed before more decompressed data can be produced.
- unused_data
Data found after the end of the compressed stream.
- isal.igzip_lib.compress(data, /, level=2, flag=0, mem_level=0, hist_bits=15)
Returns a bytes object containing compressed data.
- data
Binary data to be compressed.
- level
Compression level, in 0-3.
- flag
Controls which header and trailer are used.
- mem_level
Sets the memory level for the memory buffer. Larger buffers improve performance.
- hist_bits
Sets the size of the view window. The size equals 2^hist_bits. Similar to zlib wbits value except that the header and trailer are controlled by the flag parameter.
- isal.igzip_lib.decompress(data, /, flag=0, hist_bits=15, bufsize=16384)
Returns a bytes object containing the uncompressed data.
- data
Compressed data.
- flag
The container format.
- hist_bits
The window buffer size.
- bufsize
The initial output buffer size.
python -m isal.igzip usage
A simple command line interface for the igzip module. Acts like igzip.
usage: python -m isal.igzip [-h] [-0 | -1 | -2 | -3 | -d] [-c | -o OUTPUT]
[-n] [-f]
[file]
Positional Arguments
- file
Named Arguments
- -0, --fast
use compression level 0 (fastest)
- -1
use compression level 1
- -2
use compression level 2 (default)
- -3, --best
use compression level 3 (best)
- -d, --decompress
Decompress the file instead of compressing.
Default:
True
- -c, --stdout
write on standard output
Default:
False
- -o, --output
Write to this output file
- -n, --no-name
do not save or restore the original name and timestamp
Default:
False
- -f, --force
Overwrite output without prompting
Default:
False
Contributing
Please make a PR or issue if you feel anything can be improved. Bug reports are also very welcome. Please report them on the GitHub issue tracker.
Development
The repository needs to be cloned recursively to make sure the ISA-L repository is checked out: git clone --recursive https://github.com/pycompression/python-isal.git. If the repository is already checked out you can use git submodule update --init.
Patches should be made on a feature branch. To run the tests, install tox with pip install tox and run the commands tox -e lint and tox. That will run most of the testing that is also performed by the CI. For changes to the documentation run tox -e docs. For changes to the C code please also run tox -e asan to check for memory leaks. This requires libasan to be installed.
Building requires the ISA-L build requirements as well.
Acknowledgements
This project builds upon the software and experience of many. Many thanks to:
- The ISA-L contributors for making ISA-L. Special thanks to @gbtucker for always being especially helpful and responsive.
- The Cython contributors for making it easy to create an extension and helping a novice get started with pointer addresses.
- The CPython contributors. Python-isal mimics zlibmodule.c and gzip.py from the standard library to make it easier for Python users to adopt it.
- @marcelm for taking a chance on this project and making it a dependency for his xopen and by extension cutadapt projects. This gave python-isal its first users who used python-isal in production.
- Mark Adler (@madler) for the excellent comments in his pigz code which made it very easy to replicate the behaviour for writing gzip with multiple threads using the threading and isal_zlib modules. Another thanks for his permissive license, which allowed the crc32_combine code to be included in the project. (ISA-L does not provide a crc32_combine function, unlike zlib.) And yet another thanks to Mark Adler and also to Jean-loup Gailly for creating the gzip format which is very heavily used in bioinformatics. Without that, I would have never written this library from which I have learned so much.
- The GitHub Actions team for creating the Actions CI service that enables building and testing on all three major operating systems.
- @animalize for explaining how to test and build python-isal for ARM 64-bit platforms.
- And last but not least: everyone who submitted a bug report or a feature request. These make the project better!
Python-isal would not have been possible without you!
Changelog
version 1.7.2
Use upstream ISA-L version 2.31.1 which includes patches to make installation on MacOS ARM64 possible.
Fix a bug where bytes were copied in the wrong order on big endian architectures. Fixes test failures on s390x.
Enable building on GNU/Hurd platforms.
version 1.7.1
Fix a bug where flushing files when writing in threaded mode did not work properly.
Prevent threaded opening from blocking python exit when an error is thrown in the calling thread.
version 1.7.0
Include a patched ISA-L version 2.31. The applied patches make compilation and wheelbuilding on MacOS ARM64 possible.
Fix a bug where READ and WRITE in isal.igzip were inconsistent with the values in gzip on Python 3.13.
Small simplifications to the igzip.compress function, which should lead to less overhead.
version 1.6.1
Fix a bug where streams that were passed to igzip_threaded.open were closed.
version 1.6.0
Fix a bug where compression levels for IGzipFile were checked in read mode.
Update statically linked ISA-L release to 2.31.0.
Fix an error that occurred in the __close__ function when a threaded writer was initialized with incorrect parameters.
version 1.5.3
Fix a bug where append mode would not work when using igzip_threaded.open.
version 1.5.2
Fix a bug where a filehandle remained opened when igzip_threaded.open was used for writing with a wrong compression level.
Fix a memory leak that occurred when an error was thrown for a gzip header with the wrong magic numbers.
Fix a memory leak that occurred when isal_zlib.decompressobj was given a wrong wbits value.
version 1.5.1
Fix a memory leak in the GzipReader.readall implementation.
version 1.5.0
Make a special case for threads==1 in igzip_threaded.open for writing files. This now combines the writing and compression thread for less overhead.
Maximize time spent outside the GIL for igzip_threaded.open writing. This has decreased wallclock time significantly.
version 1.4.1
Fix several errors related to unclosed files and buffers.
version 1.4.0
Drop support for python 3.7 and PyPy 3.8 as these are no longer supported. Add testing and support for python 3.12 and PyPy 3.10.
Added an experimental isal.igzip_threaded module which has an open function. This can be used to read and write large files in a streaming fashion while escaping the GIL.
The internal igzip._IGzipReader has been rewritten in C. As a result the overhead of decompressing files has significantly been reduced and python -m isal.igzip is now very close to the C igzip application.
The igzip._IGZipReader in C is now used in igzip.decompress. The _GzipReader also can read from objects that support the buffer protocol. This has reduced overhead significantly.
version 1.3.0
Gzip headers are now actively checked for a BGZF extra field. If found the block size is taken into account when decompressing. This has further improved bgzf decompression speed by 5% on some files compared to the more generic solution of 1.2.0.
Integrated CPython 3.11 code for reading gzip headers. This leads to more commonality between the python-isal code and the upstream gzip.py code. This has enabled the change above. It comes at the cost of a slight increase in overhead at the gzip.decompress function.
version 1.2.0
Bgzip files are now detected and a smaller reading buffer is used to accommodate the fact that bgzip blocks are typically less than 64K. (Unlike normal gzip files that consist of one block that spans the entire file.) This has reduced decompression time for bgzip files by roughly 12%.
Speed-up source build by using ISA-L Unix-specific makefile rather than the autotools build.
Simplify build setup. ISA-L release flags are now used and not overwritten with python release flags when building the included static library.
Fix bug where zdicts could not be set for isal_zlib.decompressobj and igzip_lib.IgzipDecompressor.
Escape GIL when calling inflate, deflate, crc32 and adler32 functions just like in CPython. This allows for utilising more CPU cores in combination with the threading module. This comes with a very slight cost in efficiency for strict single-threaded applications.
version 1.1.0
Added tests and support for Python 3.11.
version 1.0.1
Fixed failing tests and wheel builds for PyPy.
version 1.0.0
Python-isal has been rewritten as a C-extension (first implementation was in Cython). This has made the library faster in many key areas.
Since the module now mostly contains code copied from CPython and then modified to work with ISA-L the license has been changed to the Python Software Foundation License version 2.
Python versions lower than 3.7 are no longer supported. Python 3.6 is out of support since December 2021.
Stub files with type information have now been updated to correctly display positional-only arguments.
Expose READ and WRITE constants on the igzip module. These are also present in Python’s stdlib gzip module and exposing them allows for better drop-in capability of igzip. Thanks to @alexander-beedie in https://github.com/pycompression/python-isal/pull/115.
A --no-name flag has been added to python -m isal.igzip.
Reduced wheel size by not including debug symbols in the binary. Thanks to @marcelm in https://github.com/pycompression/python-isal/pull/108.
Cython is no longer required as a build dependency.
isal_zlib.compressobj and isal_zlib.decompressobj are now about six times faster.
igzip.decompress has 30% less overhead when called.
Error structure has been simplified. There is only IsalError which has Exception as baseclass instead of OSError. isal_zlib.IsalError, igzip_lib.IsalError, isal_zlib.error and igzip_lib.error are all aliases of the same error class.
GzipReader now uses larger input and output buffers (128k) by default and IgzipDecompressor.decompress has been updated to allocate maxsize buffers when these are of reasonable size, instead of growing the buffer to maxsize on every call. This has improved gzip decompression speeds by 7%.
Patch statically linked included library (ISA-L 2.30.0) to fix the following:
- ISA-L library version variables are now available on Windows as well, for the statically linked version available on PyPI.
- Wheels are now always built with nasm for the x86 architecture. Previously yasm was used for Linux and MacOS due to build issues.
- Fixed a bug upstream in ISA-L where zlib headers would be created with an incorrect wbits value.
Python-isal shows up in Python profiler reports.
Support and tests for Python 3.10 were added.
Due to a change in the deployment process wheels should work for older versions of pip.
Added a crc property to the IgzipDecompressor class. Depending on the decompression flag chosen, this will update with an adler32 or crc32 checksum.
All the decompression NO_HDR flags on igzip_lib were incorrectly documented. This is now fixed.
version 0.11.1
Fixed an issue which occurred rarely that caused IgzipDecompressor’s unused_data to report back incorrectly. This caused checksum errors when reading gzip files. The issue was more likely to trigger in multi-member gzip files.
version 0.11.0
In this release python -m isal.igzip, whose decompression was relatively slow, has been improved in both speed and usability. Previously it was 19% slower than igzip when used with the -d flag for decompressing; now it is just 8% slower. Some extra flags were also added to make it easier to select the output file.
Prompt when an output file is overwritten with the python -m isal.igzip command line utility and provide the -f or --force flags to force overwriting.
Added -o and --output flags to the python -m isal.igzip command line utility to allow the user to select the destination of the output file.
Fix a bug in the build system which caused some docstring and parameter information on igzip_lib and isal_zlib to disappear in the documentation and the REPL.
Increase the buffer size for python -m isal.igzip so it is now closer to speeds reached with igzip.
Add a READ_BUFFER_SIZE attribute to igzip which allows setting the amount of raw data that is read at once.
Add an igzip_lib.IgzipDecompressor object which can decompress without using an unconsumed_tail and is therefore more efficient.
version 0.10.0
Added an igzip_lib module which allows more direct access to ISA-L’s igzip_lib API. This allows features such as headerless compression and decompression, as well as setting the memory levels manually.
Added more extensive documentation.
version 0.9.0
Fix a bug where an AttributeError was triggered when zlib.Z_RLE or zlib.Z_FIXED were not present.
Add support for Linux aarch64 builds.
Add support for pypy by adding pypy tests to the CI and setting up wheel building support.
version 0.8.1
Fix a bug where multi-member gzip files were read incorrectly due to an offset error. This was caused by ISA-L’s decompressobj having a small bitbuffer which was not taken properly into account in some circumstances.
version 0.8.0
Speed up igzip.compress and igzip.decompress by improving the implementation.
Make sure compiler arguments are passed to ISA-L compilation step. Previously ISA-L was compiled without optimisation steps, causing the statically linked library to be significantly slower.
An unused constant from the isal_zlib library was removed: ISAL_DEFAULT_HIST_BITS.
Refactor isal_zlib.pyx to work almost the same as zlibmodule.c. This has made the code look cleaner and has reduced some overhead.
version 0.7.0
Remove workarounds in the igzip module for the unconsumed_tail and unused_data bugs. igzip._IGzipReader now functions the same as gzip._GzipReader with only a few calls replaced with isal_zlib calls for speed.
Correctly implement unused_data and unconsumed_tail on isal_zlib.Decompress objects. It works the same as in CPython’s zlib now.
Correctly implement flush on isal_zlib.Compress and isal_zlib.Decompress objects. It works the same as in CPython’s zlib now.
version 0.6.1
Fix a crash that occurs when opening a file that did not end in .gz while outputting to stdout using python -m isal.igzip.
version 0.6.0
python -m gzip’s behaviour has been changed since fixing bug: bpo-43316. This bug was not present in python -m isal.igzip but it handled the error differently than the solution in CPython. This is now corrected and python -m isal.igzip handles the error the same as the fixed python -m gzip.
Installation on Windows is now supported. Wheels are provided for Windows as well.
version 0.5.0
Fix a bug where negative integers were not allowed for the adler32 and crc32 functions in isal_zlib.
Provided stubs (type-hint files) for isal_zlib and _isal modules. Package is now tested with mypy to ensure correct type information.
The command-line interface now reads in blocks of 32K instead of 8K. This improves performance by about 6% when compressing and 11% when decompressing. A hidden -b flag was added to adjust the buffer size for benchmarks.
A -c or --stdout flag was added to the CLI interface of isal.igzip. This allows it to behave more like the gzip or pigz command line interfaces.
version 0.4.0
Move wheel building to cibuildwheel on GitHub Actions CI. Wheels are now provided for Mac OS as well.
Make a tiny change in setup.py so python-isal can be built on Mac OS X.
version 0.3.0
Set included ISA-L library at version 2.30.0.
Python-isal now comes with a source distribution of ISA-L in its source distribution against which python-isal is linked statically upon installation by default. Dynamic linking against system libraries is now optional. Wheels with the statically linked ISA-L are now provided on PyPI.
version 0.2.0
Fixed a bug where writing of the gzip header would crash if an older version of Python 3.7 was used such as on Debian or Ubuntu. This is due to differences between point releases because of a backported feature. The code now checks if the backported feature is present.
Added Python 3.9 to the testing.
Fixed setup.py to list setuptools as a requirement.
Changed homepage to reflect move to pycompression organization.
version 0.1.0
Publish API documentation on readthedocs.
Add API documentation.
Ensure the igzip module is fully compatible with the gzip stdlib module.
Add compliance tests from CPython to ensure isal_zlib and igzip are validated to the same standards as the zlib and gzip modules.
Added a working gzip app using python -m isal.igzip.
Add test suite that tests all possible settings for functions on the isal_zlib module.
Create igzip module which implements all gzip functions and methods.
Create isal_zlib module which implements all zlib functions and methods.