Welcome to python-isal’s documentation!

Introduction

Faster zlib and gzip compatible compression and decompression by providing Python bindings for the ISA-L library.

The Intel(R) Intelligent Storage Acceleration Library (ISA-L) implements several key algorithms in assembly language, including a variety of functions that provide zlib/gzip-compatible compression.

python-isal provides the bindings by offering four modules:

  • isal_zlib: A drop-in replacement for the zlib module that uses ISA-L to accelerate its performance.

  • igzip: A drop-in replacement for the gzip module that uses isal_zlib instead of zlib to perform its compression and checksum tasks, which improves performance.

  • igzip_threaded: Offers an open function that returns buffered read or write streams, which can be used to read and write large files while escaping the GIL using one or more threads. This functionality only works for streaming; seeking is not supported.

  • igzip_lib: Provides compression functions which have full access to the API of ISA-L’s compression functions.

isal_zlib and igzip are almost fully compatible with zlib and gzip from the Python standard library. There are some minor differences; see the section "Differences with zlib and gzip modules".

Quickstart

The python-isal modules can be imported as follows:

from isal import isal_zlib
from isal import igzip
from isal import igzip_lib

isal_zlib and igzip are meant to be used as drop-in replacements, so their API and functions are the same as those of the stdlib modules, except where ISA-L does not support the same calls as zlib (see the differences below).

Full API documentation can be found on our readthedocs page.

python -m isal.igzip implements a simple gzip-like command line application (just like python -m gzip). Full usage documentation can be found on our readthedocs page.

Installation

Installation with pip

pip install isal

Installation is supported on Linux, macOS and Windows. On most platforms wheels are provided. The installation will include a statically linked version of ISA-L. If a wheel is not provided for your system, the installation will first build ISA-L in a temporary directory. Please check the ISA-L homepage for the build requirements.

The latest development version of python-isal can be installed with:

pip install git+https://github.com/rhpvorderman/python-isal.git

This requires having the build requirements installed. If you wish to link dynamically against a version of libisal installed on your system use:

PYTHON_ISAL_LINK_DYNAMIC=true pip install isal --no-binary isal

ISA-L is available in numerous Linux distributions as well as on conda via the conda-forge channel. Check out the ports documentation on the ISA-L project wiki to find out how to install it. It is important that the development headers are also installed.

On Debian and Ubuntu the ISA-L libraries (including the development headers) can be installed with:

sudo apt install libisal-dev

Installation via conda

python-isal can be installed via conda, for example using the miniconda installer with a properly set up conda-forge channel. When used with bioinformatics tools, setting up bioconda provides a clear set of installation instructions for conda.

python-isal is available on conda-forge and can be installed with:

conda install python-isal

This will automatically install the ISA-L library dependency as well, since it is available on conda-forge.

python-isal as a dependency in your project

python-isal supports a limited number of platforms for which wheels have been made available. To prevent your users from running into issues when installing your project, please list a python-isal dependency as follows.

setup.cfg:

install_requires =
    isal; platform.machine == "x86_64" or platform.machine == "AMD64" or platform.machine == "aarch64"

setup.py:

extras_require={
    ":platform.machine == 'x86_64' or platform.machine == 'AMD64' or platform.machine == 'aarch64'": ['isal']
},
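Because the markers above install isal only on the listed architectures, code that depends on it may need an import guard. The following is a minimal sketch, assuming that falling back to the standard library is acceptable on other platforms. The names zlib_impl, HAVE_ISAL and pack are hypothetical and used only for illustration.

```python
try:
    # Present only where the environment marker matched at install time.
    from isal import isal_zlib as zlib_impl
    HAVE_ISAL = True
except ImportError:
    # Assumption: the slower stdlib module is an acceptable fallback.
    import zlib as zlib_impl
    HAVE_ISAL = False


def pack(data: bytes) -> bytes:
    """Compress data with whichever implementation was imported."""
    # Z_DEFAULT_COMPRESSION exists in both modules, so no translation
    # between the two compression level scales is needed here.
    return zlib_impl.compress(data, zlib_impl.Z_DEFAULT_COMPRESSION)


packed = pack(b"portable payload " * 200)
print("using isal:", HAVE_ISAL, "compressed size:", len(packed))
```

Passing an explicit compression level is where the two modules diverge, so a guard like this is best combined with the module's own default constant rather than a hard-coded number.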

Differences with zlib and gzip modules

  • Compression level 0 in zlib and gzip means no compression, while in isal_zlib and igzip this is the lowest compression level. This is a design choice that was inherited from the ISA-L library.

  • Compression levels range from 0 to 3, not 1 to 9. isal_zlib.Z_DEFAULT_COMPRESSION has been aliased to isal_zlib.ISAL_DEFAULT_COMPRESSION (2).

  • isal_zlib only supports NO_FLUSH, SYNC_FLUSH, FULL_FLUSH and FINISH_FLUSH. Other flush modes are not supported and will raise errors.

  • zlib.Z_DEFAULT_STRATEGY, zlib.Z_RLE etc. are exposed as isal_zlib.Z_DEFAULT_STRATEGY, isal_zlib.Z_RLE etc. for compatibility reasons. However, isal_zlib only supports a default strategy and will give warnings when other strategies are used.

  • zlib supports different memory levels from 1 to 9 (with 8 default). isal_zlib supports memory levels smallest, small, medium, large and largest. These have been mapped to levels 1, 2-3, 4-6, 7-8 and 9. So isal_zlib can be used with zlib compatible memory levels.

  • igzip.open returns a class IGzipFile instead of GzipFile. Since the compression levels are not compatible, a difference in naming was chosen to reflect this. igzip.GzipFile does exist as an alias of igzip.IGzipFile for compatibility reasons.

  • igzip._GzipReader has been rewritten in C. Since this is a private member it should not affect compatibility, but it may cause some issues for instances where this code is used directly. If such issues should occur, please report them so the compatibility issues can be fixed.
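The first point in the list can be observed directly. The sketch below compares level 0 in the standard library with level 0 in isal_zlib. The sample data is arbitrary, and the isal branch is skipped when the package is not installed (an assumption made so the snippet runs anywhere).

```python
import zlib

try:
    from isal import isal_zlib
except ImportError:  # keep the sketch runnable without isal
    isal_zlib = None

data = b"level zero behaves differently " * 500

# stdlib zlib: level 0 stores the data uncompressed, plus framing bytes.
stored = zlib.compress(data, 0)
print("zlib level 0:", len(data), "->", len(stored))

if isal_zlib is not None:
    # isal_zlib: level 0 is the fastest level, but it still compresses.
    fastest = isal_zlib.compress(data, 0)
    print("isal_zlib level 0:", len(data), "->", len(fastest))
    # Both outputs are zlib streams, so stdlib zlib can decode either.
    assert zlib.decompress(fastest) == data
```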

API Documentation: isal_zlib

The functions in this module allow compression and decompression using the zlib library, which is based on GNU zip.

  • adler32(string[, start]) – Compute an Adler-32 checksum.

  • compress(data[, level]) – Compress data, with compression level 0-3.

  • compressobj([level[, …]]) – Return a compressor object.

  • crc32(string[, start]) – Compute a CRC-32 checksum.

  • decompress(string,[wbits],[bufsize]) – Decompresses a compressed string.

  • decompressobj([wbits[, zdict]]) – Return a decompressor object.

‘wbits’ is window buffer size and container format.

Compressor objects support compress() and flush() methods; decompressor objects support decompress() and flush().

class isal.isal_zlib.Compress

Object returned by isal_zlib.compressobj

compress(data, /)

Returns a bytes object containing compressed data.

data

Binary data to be compressed.

After calling this function, some of the input data may still be stored in internal buffers for later processing. Call the flush() method to clear these buffers.

flush(mode=4, /)

Return a bytes object containing any remaining compressed data.

mode

One of the constants Z_SYNC_FLUSH, Z_FULL_FLUSH, Z_FINISH. If mode == Z_FINISH, the compressor object can no longer be used after calling the flush() method. Otherwise, more data can still be compressed.

class isal.isal_zlib.Decompress

Object returned by isal_zlib.decompressobj.

decompress(data, /, max_length=0)

Return a bytes object containing the decompressed version of the data.

data

The binary data to decompress.

max_length

The maximum allowable length of the decompressed data. Unconsumed input data will be stored in the unconsumed_tail attribute.

After calling this function, some of the input data may still be stored in internal buffers for later processing. Call the flush() method to clear these buffers.

eof

True if the end-of-stream marker has been reached.

flush(length=16384, /)

Return a bytes object containing any remaining decompressed data.

length

the initial size of the output buffer.

unconsumed_tail

A bytes object that contains any data that was not consumed by the last decompress() call because it exceeded the limit for the uncompressed data buffer. This data has not yet been seen by the zlib machinery, so you must feed it (possibly with further data concatenated to it) back to a subsequent decompress() method call in order to get correct output.

unused_data

Data found after the end of the compressed stream.
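As an illustration of how max_length, unconsumed_tail and flush() work together, the sketch below decompresses a stream in bounded pieces. The helper iter_decompressed is hypothetical and not part of python-isal. The import falls back to the stdlib zlib module, which has the same interface, so the snippet also runs where isal is not installed.

```python
try:
    from isal import isal_zlib as zlib_impl
except ImportError:  # assumption: stdlib fallback with the same interface
    import zlib as zlib_impl


def iter_decompressed(compressed: bytes, max_length: int = 1024):
    """Yield decompressed pieces; each decompress() call is capped at max_length."""
    decompressor = zlib_impl.decompressobj()
    pending = compressed
    while pending:
        piece = decompressor.decompress(pending, max_length)
        if piece:
            yield piece
        # Input that was held back because the output limit was reached.
        pending = decompressor.unconsumed_tail
    # Collect whatever is still buffered inside the decompressor.
    tail = decompressor.flush()
    if tail:
        yield tail


original = b"0123456789" * 5000
pieces = list(iter_decompressed(zlib_impl.compress(original), max_length=4096))
print(len(pieces), "pieces,", sum(map(len, pieces)), "bytes in total")
```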

isal.isal_zlib.adler32(data, value=1, /)

Compute an Adler-32 checksum of data.

value

Starting value of the checksum.

The returned checksum is an integer.

isal.isal_zlib.compress(data, /, level=2, wbits=15)

Returns a bytes object containing compressed data.

data

Binary data to be compressed.

level

Compression level, in 0-3.

wbits

The window buffer size and container format.

isal.isal_zlib.compressobj(level=2, method=8, wbits=15, memLevel=8, strategy=0, zdict=None)

Return a compressor object.

level

The compression level (an integer in the range 0-3; default is currently equivalent to 2). Higher compression levels are slower, but produce smaller results.

method

The compression algorithm. If given, this must be DEFLATED.

wbits
  • +9 to +15: The base-two logarithm of the window size. Include a zlib container.

  • -9 to -15: Generate a raw stream.

  • +25 to +31: Include a gzip container.

memLevel

Controls the amount of memory used for internal compression state. Valid values range from 1 to 9. Higher values result in higher memory usage, faster compression, and smaller output.

strategy

Used to tune the compression algorithm. Not supported by ISA-L. Only a default strategy is used.

zdict

The predefined compression dictionary - a sequence of bytes containing subsequences that are likely to occur in the input data.
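The wbits ranges above select the container format. As a minimal sketch, the hypothetical helper gzip_stream below compresses an iterable of chunks into a gzip container with wbits=31 and checks the result with the standard library's gzip module. The stdlib fallback import is an assumption so that the snippet runs without isal.

```python
import gzip

try:
    from isal import isal_zlib as zlib_impl
except ImportError:  # assumption: stdlib fallback with the same interface
    import zlib as zlib_impl

GZIP_WBITS = 31  # in the +25 to +31 range: gzip container, largest window


def gzip_stream(chunks):
    """Yield gzip-framed compressed data for an iterable of bytes chunks."""
    compressor = zlib_impl.compressobj(wbits=GZIP_WBITS)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # compress() may buffer input and return nothing yet
            yield out
    # flush() emits the remaining data and finishes the stream.
    yield compressor.flush()


parts = [b"alpha " * 1000, b"beta " * 1000, b"gamma " * 1000]
blob = b"".join(gzip_stream(parts))
assert gzip.decompress(blob) == b"".join(parts)
```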

isal.isal_zlib.crc32(data, value=0, /)

Compute a CRC-32 checksum of data.

value

Starting value of the checksum.

The returned checksum is an integer.

isal.isal_zlib.crc32_combine()

Combine crc1 and crc2 into a new crc that is accurate for the combined data blocks that crc1 and crc2 were calculated from.

crc1

the first crc32 checksum

crc2

the second crc32 checksum

crc2_length

the length of the data block crc2 was calculated from
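The sketch below shows two ways to obtain the checksum of concatenated data without hashing the concatenation itself: chaining crc32 through its value argument, and merging two independent checksums with crc32_combine. The arguments are passed positionally in the order they are documented above. The import guard is an assumption so the snippet also runs without isal, in which case the combine step is skipped.

```python
try:
    from isal.isal_zlib import crc32, crc32_combine
except ImportError:
    from zlib import crc32
    crc32_combine = None  # assumption: skip the combine step without isal

first, second = b"hello ", b"world"
whole = crc32(first + second)

# Running checksum: feed the previous result in as the starting value.
running = crc32(second, crc32(first))
assert running == whole

if crc32_combine is not None:
    # Two checksums computed independently (for example in separate threads)
    # can be merged when the length of the second block is known.
    combined = crc32_combine(crc32(first), crc32(second), len(second))
    assert combined == whole
```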

isal.isal_zlib.decompress(data, /, wbits=15, bufsize=16384)

Returns a bytes object containing the uncompressed data.

data

Compressed data.

wbits

The window buffer size and container format.

bufsize

The initial output buffer size.

isal.isal_zlib.decompressobj(wbits=15, zdict=b'')

Return a decompressor object.

wbits

The window buffer size and container format.

zdict

The predefined compression dictionary. This must be the same dictionary as used by the compressor that produced the input data.

API Documentation: igzip

Similar to the stdlib gzip module, but uses the Intel Intelligent Storage Acceleration Library (ISA-L) to speed up its methods.

class isal.igzip.IGzipFile(filename=None, mode=None, compresslevel=2, fileobj=None, mtime=None)

The IGzipFile class simulates most of the methods of a file object with the exception of the truncate() method.

This class only supports opening files in binary mode. If you need to open a compressed file in text mode, use the igzip.open() function.

__init__(filename=None, mode=None, compresslevel=2, fileobj=None, mtime=None)

Constructor for the IGzipFile class.

At least one of fileobj and filename must be given a non-trivial value.

The new class instance is based on fileobj, which can be a regular file, an io.BytesIO object, or any other object which simulates a file. It defaults to None, in which case filename is opened to provide a file object.

When fileobj is not None, the filename argument is only used to be included in the gzip file header, which may include the original filename of the uncompressed file. It defaults to the filename of fileobj, if discernible; otherwise, it defaults to the empty string, and in this case the original filename is not included in the header.

The mode argument can be any of ‘r’, ‘rb’, ‘a’, ‘ab’, ‘w’, ‘wb’, ‘x’, or ‘xb’ depending on whether the file will be read or written. The default is the mode of fileobj if discernible; otherwise, the default is ‘rb’. A mode of ‘r’ is equivalent to one of ‘rb’, and similarly for ‘w’ and ‘wb’, ‘a’ and ‘ab’, and ‘x’ and ‘xb’.

The compresslevel argument is an integer from 0 to 3 controlling the level of compression; 0 is fastest and produces the least compression, and 3 is slowest and produces the most compression. Unlike in gzip.GzipFile, level 0 does NOT mean no compression. The default is 2.

The mtime argument is an optional numeric timestamp to be written to the last modification time field in the stream when compressing. If omitted or None, the current time is used.

write(data)

Write the buffer data to the IO stream.

Return the number of bytes written, which is always the length of data in bytes.

Raise BlockingIOError if the buffer is full and the underlying raw stream cannot accept more data at the moment.

exception isal.igzip.BadGzipFile

Exception raised in some cases for invalid gzip files.

isal.igzip.GzipFile

alias of IGzipFile

isal.igzip.compress(data, compresslevel=3, *, mtime=None)

Compress data in one shot and return the compressed string. Optional argument is the compression level, in range of 0-3.

isal.igzip.decompress(data)

Decompress a gzip compressed string in one shot. Return the decompressed string.

This function checks for extra gzip members. Using isal_zlib.decompress(data, wbits=31) is faster in cases where only one gzip member is guaranteed to be present.
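To make the distinction concrete, the sketch below builds a two-member gzip stream and decompresses it, then uses the zlib-style call on a single member. The sample payloads are arbitrary, and the stdlib fallback imports are an assumption so the snippet runs without isal.

```python
try:
    from isal import igzip as gzip_impl
    from isal import isal_zlib as zlib_impl
except ImportError:  # assumption: stdlib fallback with the same interface
    import gzip as gzip_impl
    import zlib as zlib_impl

first = gzip_impl.compress(b"first member\n")
second = gzip_impl.compress(b"second member\n")

# Concatenated members form a valid gzip file; decompress() reads them all.
everything = gzip_impl.decompress(first + second)
assert everything == b"first member\nsecond member\n"

# When exactly one member is known to be present, the zlib-style call
# with a gzip container (wbits=31) skips the check for extra members.
single = zlib_impl.decompress(first, wbits=31)
assert single == b"first member\n"
```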

isal.igzip.open(filename, mode='rb', compresslevel=2, encoding=None, errors=None, newline=None)

Open a gzip-compressed file in binary or text mode. This uses the ISA-L library for optimized speed.

The filename argument can be an actual filename (a str or bytes object), or an existing file object to read from or write to.

The mode argument can be “r”, “rb”, “w”, “wb”, “x”, “xb”, “a” or “ab” for binary mode, or “rt”, “wt”, “xt” or “at” for text mode. The default mode is “rb”, and the default compresslevel is 2.

For binary mode, this function is equivalent to the GzipFile constructor: GzipFile(filename, mode, compresslevel). In this case, the encoding, errors and newline arguments must not be provided.

For text mode, a GzipFile object is created, and wrapped in an io.TextIOWrapper instance with the specified encoding, error handling behavior, and line ending(s).
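A short sketch of text-mode usage follows. The file name and contents are made up for illustration, compresslevel is left at the module default, and the stdlib fallback import is an assumption so the snippet runs without isal.

```python
import os
import tempfile

try:
    from isal import igzip as gzip_impl
except ImportError:  # assumption: stdlib fallback with the same interface
    import gzip as gzip_impl

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt.gz")

    # "wt" wraps the gzip file in an io.TextIOWrapper for writing str.
    with gzip_impl.open(path, "wt", encoding="utf-8") as handle:
        handle.write("first line\nsecond line\n")

    # "rt" decodes the decompressed bytes back to str.
    with gzip_impl.open(path, "rt", encoding="utf-8") as handle:
        lines = handle.read().splitlines()

print(lines)
```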

API Documentation: igzip_threaded

isal.igzip_threaded.open(filename, mode='rb', compresslevel=2, encoding=None, errors=None, newline=None, *, threads=1, block_size=1048576)

Utilize threads to read and write gzip objects and escape the GIL. Comparable to gzip.open. This method is only usable for streamed reading and writing of objects. Seeking is not supported.

Setting threads to 0 defers to igzip.open. A negative threads value attempts to use the number of threads available in the system.

Parameters:
  • filename – str, bytes or file-like object (supporting read or write method)

  • mode – the mode with which the file should be opened.

  • compresslevel – Compression level, only used for gzip writers.

  • encoding – Passed through to the io.TextIOWrapper, if applicable.

  • errors – Passed through to the io.TextIOWrapper, if applicable.

  • newline – Passed through to the io.TextIOWrapper, if applicable.

  • threads – If 0, defers to igzip.open; if < 0, uses all threads available to the system. Reading gzip can only use one thread.

  • block_size – Determines how large the blocks in the read/write queues are for threaded reading and writing.

Returns:

An io.BufferedReader, io.BufferedWriter, or io.TextIOWrapper, depending on the mode.
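The sketch below streams data into a gzip file and reads it back sequentially, without seeking. The file name, the payload and the choice of two threads are arbitrary. Falling back to gzip.open when isal is missing is an assumption made so the snippet runs anywhere; in that case no extra threads are used.

```python
import os
import tempfile

try:
    from isal import igzip_threaded
    open_gz = igzip_threaded.open
    write_options = {"threads": 2}  # keyword-only argument, used for writing
except ImportError:
    import gzip
    open_gz = gzip.open  # assumption: plain stdlib fallback
    write_options = {}

line = b"streamed line\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "stream.gz")

    # Streaming write: data is handed over in order, never seeking back.
    with open_gz(path, "wb", **write_options) as out:
        for _ in range(1000):
            out.write(line)

    # Streaming read in fixed-size blocks until the stream is exhausted.
    total = 0
    with open_gz(path, "rb") as src:
        for block in iter(lambda: src.read(64 * 1024), b""):
            total += len(block)

print("bytes read back:", total)
```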

API Documentation: igzip_lib

Pythonic interface to ISA-L’s igzip_lib.

This module comes with the following constants:

ISAL_BEST_SPEED

The lowest compression level (0)

ISAL_BEST_COMPRESSION

The highest compression level (3)

ISAL_DEFAULT_COMPRESSION

The compromise compression level (2)

DEF_BUF_SIZE

Default size for the starting buffer (16K)

MAX_HIST_BITS

Maximum window size bits (15).

COMP_DEFLATE

Flag to compress to a raw deflate block

COMP_GZIP

Flag to compress a gzip block, consisting of a gzip header, a raw deflate block and a gzip trailer.

COMP_GZIP_NO_HDR

Flag to compress a gzip block without a header.

COMP_ZLIB

Flag to compress a zlib block, consisting of a zlib header, a raw deflate block and a zlib trailer.

COMP_ZLIB_NO_HDR

Flag to compress a zlib block without a header.

DECOMP_DEFLATE

Flag to decompress a raw deflate block.

DECOMP_GZIP

Flag to decompress a gzip block including header and verify the checksums in the trailer.

DECOMP_GZIP_NO_HDR

Decompresses a raw deflate block (no header, no trailer) and updates the crc member on the IgzipDecompressor object with a crc32 checksum.

DECOMP_GZIP_NO_HDR_VER

Like DECOMP_GZIP_NO_HDR but reads the trailer and verifies the crc32 checksum.

DECOMP_ZLIB

Flag to decompress a zlib block including header and verify the checksums in the trailer.

DECOMP_ZLIB_NO_HDR

Decompresses a raw deflate block (no header, no trailer) and updates the crc member on the IgzipDecompressor object with an adler32 checksum.

DECOMP_ZLIB_NO_HDR_VER

Like DECOMP_ZLIB_NO_HDR but reads the trailer and verifies the adler32 checksum.

MEM_LEVEL_DEFAULT

The default memory level for the internal level buffer. (Equivalent to MEM_LEVEL_LARGE.)

MEM_LEVEL_MIN

The minimum memory level.

MEM_LEVEL_SMALL

MEM_LEVEL_MEDIUM

MEM_LEVEL_LARGE

MEM_LEVEL_EXTRA_LARGE

The largest memory level.

class isal.igzip_lib.IgzipDecompressor(flag=0, hist_bits=15, zdict=b'')

Create a decompressor object for decompressing data incrementally.

flag

Flag signifying which headers and trailers the stream has.

hist_bits

The lookback distance is 2 ^ hist_bits.

zdict

Dictionary used for decompressing the data

For one-shot decompression, use the decompress() function instead.

crc

The checksum that is saved if DECOMP_ZLIB* or DECOMP_GZIP* flags are used.

decompress(data, /, max_length=-1)

Decompress data, returning uncompressed data as bytes.

If max_length is nonnegative, returns at most max_length bytes of decompressed data. If this limit is reached and further output can be produced, self.needs_input will be set to False. In this case, the next call to decompress() may provide data as b'' to obtain more of the output.

If all of the input data was decompressed and returned (either because this was less than max_length bytes, or because max_length was negative), self.needs_input will be set to True.

Attempting to decompress data after the end of stream is reached raises an EOFError. Any data found after the end of the stream is ignored and saved in the unused_data attribute.

eof

True if the end-of-stream marker has been reached.

needs_input

True if more input is needed before more decompressed data can be produced.

unused_data

Data found after the end of the compressed stream.
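The interplay of max_length, needs_input and eof can be sketched as a loop that produces output in bounded pieces. The helper read_bounded is hypothetical and works with any decompressor object that exposes this interface. When isal is not installed, the snippet substitutes bz2.BZ2Decompressor from the standard library, which has the same three members; that substitution is purely an assumption to keep the sketch runnable and changes the compression format.

```python
def read_bounded(decompressor, data: bytes, max_length: int):
    """Yield decompressed pieces of at most max_length bytes each."""
    pending = data
    while not decompressor.eof:
        piece = decompressor.decompress(pending, max_length=max_length)
        pending = b""  # later calls pass b'' to drain buffered output
        if piece:
            yield piece
        elif decompressor.needs_input:
            # No output and no more input to give: the stream is truncated.
            raise EOFError("compressed data ended before the end of stream")


payload = b"bounded output " * 4096

try:
    from isal import igzip_lib
    blob = igzip_lib.compress(payload, flag=igzip_lib.COMP_GZIP)
    decompressor = igzip_lib.IgzipDecompressor(flag=igzip_lib.DECOMP_GZIP)
except ImportError:
    import bz2  # assumption: stand-in with the same decompressor interface
    blob = bz2.compress(payload)
    decompressor = bz2.BZ2Decompressor()

pieces = list(read_bounded(decompressor, blob, max_length=8192))
print(len(pieces), "pieces, largest:", max(map(len, pieces)))
```

Capping each call this way keeps memory use predictable when the decompressed size is not known in advance.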

isal.igzip_lib.compress(data, /, level=2, flag=0, mem_level=0, hist_bits=15)

Returns a bytes object containing compressed data.

data

Binary data to be compressed.

level

Compression level, in 0-3.

flag

Controls which header and trailer are used.

mem_level

Sets the memory level for the memory buffer. Larger buffers improve performance.

hist_bits

Sets the size of the view window. The size equals 2^hist_bits. Similar to zlib wbits value except that the header and trailer are controlled by the flag parameter.

isal.igzip_lib.decompress(data, /, flag=0, hist_bits=15, bufsize=16384)

Returns a bytes object containing the uncompressed data.

data

Compressed data.

flag

The container format.

hist_bits

The window buffer size.

bufsize

The initial output buffer size.

python -m isal.igzip usage

A simple command line interface for the igzip module. Acts like igzip.

usage: python -m isal.igzip [-h] [-0 | -1 | -2 | -3 | -d] [-c | -o OUTPUT]
                            [-n] [-f]
                            [file]

Positional Arguments

file

Named Arguments

-0, --fast

use compression level 0 (fastest)

-1

use compression level 1

-2

use compression level 2 (default)

-3, --best

use compression level 3 (best)

-d, --decompress

Decompress the file instead of compressing.

Default: False (the file is compressed unless -d is given)

-c, --stdout

write on standard output

Default: False

-o, --output

Write to this output file

-n, --no-name

do not save or restore the original name and timestamp

Default: False

-f, --force

Overwrite output without prompting

Default: False

Contributing

Please make a PR or issue if you feel anything can be improved. Bug reports are also very welcome. Please report them on the GitHub issue tracker.

Acknowledgements

This project builds upon the software and experience of many. Many thanks to:

  • The ISA-L contributors for making ISA-L. Special thanks to @gbtucker for always being especially helpful and responsive.

  • The Cython contributors for making it easy to create an extension and helping a novice get started with pointer addresses.

  • The CPython contributors. Python-isal mimics zlibmodule.c and gzip.py from the standard library to make it easier for Python users to adopt it.

  • @marcelm for taking a chance on this project and making it a dependency for his xopen and, by extension, cutadapt projects. This gave python-isal its first users who used it in production.

  • Mark Adler (@madler) for the excellent comments in his pigz code which made it very easy to replicate the behaviour for writing gzip with multiple threads using the threading and isal_zlib modules. Another thanks for his permissive license, which allowed the crc32_combine code to be included in the project. (ISA-L does not provide a crc32_combine function, unlike zlib.) And yet another thanks to Mark Adler and also to Jean-loup Gailly for creating the gzip format which is very heavily used in bioinformatics. Without that, I would have never written this library from which I have learned so much.

  • The GitHub Actions team for creating the Actions CI service that enables building and testing on all three major operating systems.

  • @animalize for explaining how to test and build python-isal for ARM 64-bit platforms.

  • And last but not least: everyone who submitted a bug report or a feature request. These make the project better!

Python-isal would not have been possible without you!

Changelog

version 1.6.1

  • Fix a bug where streams that were passed to igzip_threaded.open were closed.

version 1.6.0

  • Fix a bug where compression levels for IGzipFile were checked in read mode.

  • Update statically linked ISA-L release to 2.31.0

  • Fix an error that occurred in the __close__ function when a threaded writer was initialized with incorrect parameters.

version 1.5.3

  • Fix a bug where append mode would not work when using igzip_threaded.open.

version 1.5.2

  • Fix a bug where a filehandle remained opened when igzip_threaded.open was used for writing with a wrong compression level.

  • Fix a memory leak that occurred when an error was thrown for a gzip header with the wrong magic numbers.

  • Fix a memory leak that occurred when isal_zlib.decompressobj was given a wrong wbits value.

version 1.5.1

  • Fix a memory leak in the GzipReader.readall implementation.

version 1.5.0

  • Make a special case for threads==1 in igzip_threaded.open for writing files. This now combines the writing and compression thread for less overhead.

  • Maximize time spent outside the GIL for igzip_threaded.open writing. This has decreased wallclock time significantly.

version 1.4.1

  • Fix several errors related to unclosed files and buffers.

version 1.4.0

  • Drop support for Python 3.7 and PyPy 3.8 as these are no longer supported. Add testing and support for Python 3.12 and PyPy 3.10.

  • Added an experimental isal.igzip_threaded module which has an open function. This can be used to read and write large files in a streaming fashion while escaping the GIL.

  • The internal igzip._IGzipReader has been rewritten in C. As a result the overhead of decompressing files has significantly been reduced and python -m isal.igzip is now very close to the C igzip application.

  • The igzip._IGZipReader in C is now used in igzip.decompress. The _GzipReader also can read from objects that support the buffer protocol. This has reduced overhead significantly.

version 1.3.0

  • Gzip headers are now actively checked for a BGZF extra field. If found the block size is taken into account when decompressing. This has further improved bgzf decompression speed by 5% on some files compared to the more generic solution of 1.2.0.

  • Integrated CPython 3.11 code for reading gzip headers. This leads to more commonality between the python-isal code and the upstream gzip.py code. This has enabled the change above. It comes at the cost of a slight increase in overhead at the gzip.decompress function.

version 1.2.0

  • Bgzip files are now detected and a smaller reading buffer is used to accommodate the fact that bgzip blocks are typically less than 64K. (Unlike normal gzip files that consist of one block that spans the entire file.) This has reduced decompression time for bgzip files by roughly 12%.

  • Speed-up source build by using ISA-L Unix-specific makefile rather than the autotools build.

  • Simplify build setup. ISA-L release flags are now used and not overwritten with python release flags when building the included static library.

  • Fix bug where zdicts could not be set for isal_zlib.decompressobj and igzip_lib.IgzipDecompressor.

  • Escape GIL when calling inflate, deflate, crc32 and adler32 functions just like in CPython. This allows for utilising more CPU cores in combination with the threading module. This comes with a very slight cost in efficiency for strict single-threaded applications.

version 1.1.0

  • Added tests and support for Python 3.11.

version 1.0.1

  • Fixed failing tests and wheel builds for PyPy.

version 1.0.0

Python-isal has been rewritten as a C-extension (first implementation was in Cython). This has made the library faster in many key areas.

  • Since the module now mostly contains code copied from CPython and then modified to work with ISA-L the license has been changed to the Python Software Foundation License version 2.

  • Python versions lower than 3.7 are no longer supported. Python 3.6 has been out of support since December 2021.

  • Stub files with type information have now been updated to correctly display positional-only arguments.

  • Expose READ and WRITE constants on the igzip module. These are also present in Python’s stdlib gzip module and exposing them allows for better drop-in capability of igzip. Thanks to @alexander-beedie in https://github.com/pycompression/python-isal/pull/115.

  • A --no-name flag has been added to python -m isal.igzip.

  • Reduced wheel size by not including debug symbols in the binary. Thanks to @marcelm in https://github.com/pycompression/python-isal/pull/108.

  • Cython is no longer required as a build dependency.

  • isal_zlib.compressobj and isal_zlib.decompressobj are now about six times faster.

  • igzip.decompress has 30% less overhead when called.

  • Error structure has been simplified. There is only IsalError which has Exception as baseclass instead of OSError. isal_zlib.IsalError, igzip_lib.IsalError, isal_zlib.error and igzip_lib.error are all aliases of the same error class.

  • GzipReader now uses larger input and output buffers (128k) by default and IgzipDecompressor.decompress has been updated to allocate maxsize buffers when these are of reasonable size, instead of growing the buffer to maxsize on every call. This has improved gzip decompression speeds by 7%.

  • Patch statically linked included library (ISA-L 2.30.0) to fix the following:

    • ISA-L library version variables are now available on Windows as well, for the statically linked version available on PyPI.

    • Wheels are now always built with nasm for the x86 architecture. Previously yasm was used for Linux and macOS due to build issues.

    • Fixed a bug upstream in ISA-L where zlib headers would be created with an incorrect wbits value.

  • Python-isal shows up in Python profiler reports.

  • Support and tests for Python 3.10 were added.

  • Due to a change in the deployment process wheels should work for older versions of pip.

  • Added a crc property to the IgzipDecompressor class. Depending on the decompression flag chosen, this will update with an adler32 or crc32 checksum.

  • All the decompression NO_HDR flags on igzip_lib were incorrectly documented. This is now fixed.

version 0.11.1

  • Fixed an issue which occurred rarely that caused IgzipDecompressor’s unused_data to report back incorrectly. This caused checksum errors when reading gzip files. The issue was more likely to trigger in multi-member gzip files.

version 0.11.0

In this release, the relatively slow decompression of python -m isal.igzip has been improved, along with the usability of the command. Previously it was 19% slower than igzip when decompressing with the -d flag; now it is just 8% slower. Some extra flags were also added to make it easier to select the output file.

  • Prompt when an output file is overwritten with the python -m isal.igzip command line utility and provide the -f or --force flags to force overwriting.

  • Added -o and --output flags to the python -m isal.igzip command line utility to allow the user to select the destination of the output file.

  • Reverse a bug in the build system which caused some docstring and parameter information on igzip_lib and isal_zlib to disappear in the documentation and the REPL.

  • Increase the buffer size for python -m isal.igzip so it is now closer to speeds reached with igzip.

  • Add a READ_BUFFER_SIZE attribute to igzip which allows setting the amount of raw data that is read at once.

  • Add an igzip_lib.IgzipDecompressor object which can decompress without using an unconsumed_tail and is therefore more efficient.

version 0.10.0

  • Added an igzip_lib module which allows more direct access to ISA-L’s igzip_lib API. This allows features such as headerless compression and decompression, as well as setting the memory levels manually.

  • Added more extensive documentation.

version 0.9.0

  • Fix a bug where an AttributeError was triggered when zlib.Z_RLE or zlib.Z_FIXED were not present.

  • Add support for Linux aarch64 builds.

  • Add support for pypy by adding pypy tests to the CI and setting up wheel building support.

version 0.8.1

  • Fix a bug where multi-member gzip files were read incorrectly due to an offset error. This was caused by ISA-L’s decompressobj having a small bitbuffer which was not taken properly into account in some circumstances.

version 0.8.0

  • Speed up igzip.compress and igzip.decompress by improving the implementation.

  • Make sure compiler arguments are passed to ISA-L compilation step. Previously ISA-L was compiled without optimisation steps, causing the statically linked library to be significantly slower.

  • An unused constant from the isal_zlib library was removed: ISAL_DEFAULT_HIST_BITS.

  • Refactor isal_zlib.pyx to work almost the same as zlibmodule.c. This has made the code look cleaner and has reduced some overhead.

version 0.7.0

  • Remove workarounds in the igzip module for the unconsumed_tail and unused_data bugs. igzip._IGzipReader now functions the same as gzip._GzipReader with only a few calls replaced with isal_zlib calls for speed.

  • Correctly implement unused_data and unconsumed_tail on isal_zlib.Decompress objects. It works the same as in CPython’s zlib now.

  • Correctly implement flush implementation on isal_zlib.Compress and isal_zlib.Decompress objects. It works the same as in CPython’s zlib now.

version 0.6.1

  • Fix a crash that occurs when opening a file that did not end in .gz while outputting to stdout using python -m isal.igzip.

version 0.6.0

  • python -m gzip’s behaviour has been changed since fixing bug: bpo-43316. This bug was not present in python -m isal.igzip but it handled the error differently than the solution in CPython. This is now corrected and python -m isal.igzip handles the error the same as the fixed python -m gzip.

  • Installation on Windows is now supported. Wheels are provided for Windows as well.

version 0.5.0

  • Fix a bug where negative integers were not allowed for the adler32 and crc32 functions in isal_zlib.

  • Provided stubs (type-hint files) for isal_zlib and _isal modules. Package is now tested with mypy to ensure correct type information.

  • The command-line interface now reads in blocks of 32K instead of 8K. This improves performance by about 6% when compressing and 11% when decompressing. A hidden -b flag was added to adjust the buffer size for benchmarks.

  • A -c or --stdout flag was added to the CLI interface of isal.igzip. This allows it to behave more like the gzip or pigz command line interfaces.

version 0.4.0

  • Move wheel building to cibuildwheel on GitHub Actions CI. Wheels are now provided for macOS as well.

  • Make a tiny change in setup.py so python-isal can be built on Mac OS X.

version 0.3.0

  • Set included ISA-L library at version 2.30.0.

  • Python-isal now comes with a source distribution of ISA-L in its source distribution against which python-isal is linked statically upon installation by default. Dynamic linking against system libraries is now optional. Wheels with the statically linked ISA-L are now provided on PyPI.

version 0.2.0

  • Fixed a bug where writing of the gzip header would crash if an older version of Python 3.7 was used such as on Debian or Ubuntu. This is due to differences between point releases because of a backported feature. The code now checks if the backported feature is present.

  • Added Python 3.9 to the testing.

  • Fixed setup.py to list setuptools as a requirement.

  • Changed homepage to reflect move to pycompression organization.

version 0.1.0

  • Publish API documentation on readthedocs.

  • Add API documentation.

  • Ensure the igzip module is fully compatible with the gzip stdlib module.

  • Add compliance tests from CPython to ensure isal_zlib and igzip are validated to the same standards as the zlib and gzip modules.

  • Added a working gzip app using python -m isal.igzip

  • Add test suite that tests all possible settings for functions on the isal_zlib module.

  • Create igzip module which implements all gzip functions and methods.

  • Create isal_zlib module which implements all zlib functions and methods.