zip archive in python

I would like to create zip archives within a python batch script. I would like to compress individual files or entire directories of files.

You can use the built-in zipfile module, and create a ZipFile much as you would a normal file object, closing it when you are done so that the archive is finalized, e.g.,

>>> import zipfile
>>> foo = zipfile.ZipFile('foo.zip', mode='w')
>>> foo.write('foo.txt')
>>> foo.close()

Unfortunately, by default the files in the archive are stored uncompressed (zipfile.ZIP_STORED). You can add multiple files and directories to your zipfile, which can be useful for archival, but they will not be compressed. In order to compress the files, you’ll need the zlib module (it should already be available in newer versions of python, 2.5 and greater). Simply pass the ZIP_DEFLATED flag as follows,

>>> import zipfile
>>> foo = zipfile.ZipFile('foo.zip', mode='w')
>>> foo.write('foo.txt', compress_type=zipfile.ZIP_DEFLATED)
>>> foo.close()
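To see what the flag buys you, the two modes can be compared on the same input. The following is only a sketch: the helper name archive_size and the repetitive sample file are made up for illustration, and real-world ratios depend heavily on the data.

```python
import os
import tempfile
import zipfile


def archive_size(src, compress_type):
    """Zip `src` into a temporary archive and return the archive size in bytes.

    Hypothetical helper for illustration only.
    """
    fd, zip_path = tempfile.mkstemp(suffix='.zip')
    os.close(fd)
    try:
        with zipfile.ZipFile(zip_path, mode='w') as zf:
            zf.write(src, os.path.basename(src), compress_type=compress_type)
        return os.path.getsize(zip_path)
    finally:
        os.remove(zip_path)


# build a highly repetitive sample file so the difference is easy to see
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('spam and eggs\n' * 10000)
    sample = tmp.name

stored = archive_size(sample, zipfile.ZIP_STORED)
deflated = archive_size(sample, zipfile.ZIP_DEFLATED)
os.remove(sample)

print('stored:   %d bytes' % stored)
print('deflated: %d bytes' % deflated)
```

The stored archive comes out slightly larger than the input (the data plus zip headers), while the deflated one should be a small fraction of that for text this repetitive.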

In order to archive an entire directory (and all its contents) you can use the os.walk function. This function walks the directory tree and yields a triple (root, dirs, files) for every directory it visits, where dirs and files are lists of the names found directly inside root. You can iterate through the yielded files as follows,

>>> import os, zipfile
>>> foo = zipfile.ZipFile('foo.zip', mode='w')
>>> for root, dirs, files in os.walk('/path/to/foo'):
...     for name in files:
...         file_to_zip = os.path.join(root, name)
...         foo.write(file_to_zip, compress_type=zipfile.ZIP_DEFLATED)
...
>>> foo.close()
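One thing to be aware of with the loop above: when write() is given only a filename, the name stored in the archive is that full path with the leading slash removed, so the archive unpacks to path/to/foo/... rather than to the directory contents. Passing a second argument (arcname) controls the stored name. A minimal sketch, where the function name zip_tree and the throwaway directory layout are invented for the example:

```python
import os
import tempfile
import zipfile


def zip_tree(src_dir, zip_path):
    """Zip everything under src_dir, storing names relative to src_dir.

    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(zip_path, mode='w') as zf:
        for root, dirs, files in os.walk(src_dir):
            for name in files:
                file_to_zip = os.path.join(root, name)
                # name inside the archive, relative to the directory being zipped
                arcname = os.path.relpath(file_to_zip, src_dir)
                zf.write(file_to_zip, arcname, compress_type=zipfile.ZIP_DEFLATED)


# small throwaway tree: <tmp>/foo/a.txt and <tmp>/foo/sub/b.txt
base = tempfile.mkdtemp()
src = os.path.join(base, 'foo')
os.makedirs(os.path.join(src, 'sub'))
for rel in ('a.txt', os.path.join('sub', 'b.txt')):
    with open(os.path.join(src, rel), 'w') as fh:
        fh.write('hello\n')

zip_path = os.path.join(base, 'foo.zip')
zip_tree(src, zip_path)

with zipfile.ZipFile(zip_path) as zf:
    names = sorted(zf.namelist())
print(names)
```

Whether names should be relative to the directory itself (as here) or to its parent, so that the archive contains a top-level folder, is a matter of taste.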

We can put this all together into a handy utility function that creates a compressed zipfile for any file or directory. This is also available in the following GitHub repo.

import os
import zipfile

def ziparchive(filepath, zfile=None):
    ''' create/overwrite a zip archive

        can be a file or directory, and always overwrites the output zipfile if one already exists

        An optional second argument can be provided to specify a zipfile name, 
        by default the basename will be used with a .zip extension

        >>> ziparchive('foo/data/')
        >>> zf = zipfile.ZipFile('data.zip', 'r')

        >>> ziparchive('foo/data/', 'foo/eggs.zip')
        >>> zf = zipfile.ZipFile('foo/eggs.zip', 'r')
    '''
    if zfile is None:
        zfile = os.path.basename(filepath.strip('/')) + '.zip'
    filepath = filepath.rstrip('/')
    # archive names are relative to the parent directory of filepath
    prefix_len = len(os.path.dirname(filepath))
    # the with block closes the archive, which writes out its central directory
    with zipfile.ZipFile(zfile, mode='w') as zf:
        if os.path.isfile(filepath):
            zf.write(filepath, filepath[prefix_len:].strip('/'), compress_type=zipfile.ZIP_DEFLATED)
        else:
            for root, dirs, files in os.walk(filepath):
                for name in files:
                    file_to_zip = os.path.join(root, name)
                    arcname = file_to_zip[prefix_len:].strip('/')
                    zf.write(file_to_zip, arcname, compress_type=zipfile.ZIP_DEFLATED)
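
The docstring examples open the resulting archive for reading but stop there. As a rough sketch of what you might check once an archive is open, the snippet below builds a tiny archive of its own (the file names are placeholders) and then uses testzip() and infolist() to confirm that the contents are intact and were actually deflated.

```python
import os
import tempfile
import zipfile

# build a tiny archive to inspect; foo.txt / foo.zip are placeholder names
base = tempfile.mkdtemp()
src = os.path.join(base, 'foo.txt')
with open(src, 'w') as fh:
    fh.write('spam and eggs\n' * 1000)

zip_path = os.path.join(base, 'foo.zip')
with zipfile.ZipFile(zip_path, mode='w') as zf:
    zf.write(src, 'foo.txt', compress_type=zipfile.ZIP_DEFLATED)

with zipfile.ZipFile(zip_path, 'r') as zf:
    # testzip() returns the name of the first corrupt member, or None
    bad = zf.testzip()
    entries = [(info.filename, info.compress_type) for info in zf.infolist()]
    for info in zf.infolist():
        print(info.filename, info.file_size, info.compress_size)

print('corrupt member:', bad)
```

An entry whose compress_type is zipfile.ZIP_STORED rather than zipfile.ZIP_DEFLATED is a sign that the flag was forgotten somewhere.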