Python Zipfile – Benefits, Modules, Objects in Zipfiles in Python

1. Python Zipfile – Objective

In our previous tutorial, we studied Image Processing with Scipy and NumPy in Python and in this article, we will learn about Python Zipfile. Moreover, we will see how we can extract Zipfile in Python. In addition, we will cover how can we write Python Zipfile, and getting information about them using Python. At last, we will see some methods of the Python Zipfiles module provides and exceptions.

So, let’s begin Python Zipfile Tutorial.

Python Zipfile - Benefits, Modules, Objects in Zipfiles in Python

Python Zipfile – Benefits, Modules, Objects in Zipfiles in Python

2. What is Python Zipfile?

Python ZIPfile is an ideal way to group similar files and compress large files to reduce their size. The compression is lossless. This means that using the compression algorithm, we can operate on the compressed data to perfectly reconstruct the original data. So, in Python zipfile is an archive file format and a compression standard; it is a single file that holds compressed files.
Do you know about Python Forensics – Hash Function, Virtualization & much more

3. Advantages of Python Zipfiles

Bunching files into zips offer the following advantages:
1. It reduces storage requirements
Since ZIP files use compression, they can hold much more for the same amount of storage
2. It improves transfer speed over standard connections
Since it is just one file holding less storage, it transfers faster
Now, let’s learn about the module zipfile.

4. Python Zipfile Module

The Python zipfile module has tools to create, read, write, append, and list ZIP files. At the time of writing. It lets us handle ZIP files that use ZIP64 extensions and decrypt encrypted files in ZIP archives. It cannot handle multi-disk ZIP files or create encrypted files.

Python Zipfile Module has the following members:

a. exception zipfile.BadZipFile

For a bad ZIP file, it raises this exception.

b. exception zipfile.BadZipfile

This is an alias for the previous exception in the list. It is to make it compatible with older Python versions. This is deprecated since version 3.2.

Let’s Discuss Errors and Exceptions in Python Programming

c. exception zipfile.LargeZipFile

When a Python ZIPfile needs ZIP64 functionality, but it hasn’t been enabled, Python throws this exception.

d. class zipfile.ZipFile

This is the class for reading and writing ZIP files in Python.

e. class zipfile.PyZipFile

This class lets us create ZIP archives holding Python libraries.

f. class zipfile.ZipInfo(filename=’NoName’, date_time=(1980, 1, 1, 0, 0, 0))

With this class, we can represent information about an archive member. The getinfo() and infolist() methods of Python ZipFile objects return instances of this class.

g. zipfile.is_zipfile(filename)

This considers the magic number of a ZIP file. If it is a valid ZIP file, it returns True; otherwise, False. This works on files and file-like objects.

h. zipfile.ZIP_STORED

This is the numeric constant for uncompressed archive members.

i. zipfile.ZIP_DEFLATED

This is the numeric constant for the usual ZIP compression method. It needs the zlib module.

j. zipfile.ZIP_BZIP2

This is the numeric constant for the BZIP2 compression method. It needs the bz2 module.

k. zipfile.ZIP_LZMA

This is the numeric constant for the LZMA compression method. It needs the lzma module.
Read about Exception Handling in Python for Python Programming

5. Python ZipFile Objects

Python zipfile class this type:

a. class zipfile.ZipFile(file, mode=’r’, compression=ZIP_STORED, allowZip64=True)

This method opens a Python ZIPfile. Here, file may be a file-like object or a string path to a file. We have the following modes:

‘r’- To read an existing file

‘w’- To truncate and write a new file

‘a’- To append to an existing file

Using the compression argument, we can select the compression method to use when writing the archive. allowZip64 is True by default. This creates ZIP files that use ZIP64 extensions for zipfiles larger than GiB.

ZipFile has the following functions:

i. ZipFile.close()

This closes the archive file. If we do not call this before exiting the program, Python doesn’t write the records intended to.

b. ZipFile.getinfo(name)

This returns a ZipInfo object holding information about the archive member name.

c. ZipFile.infolist()

This returns a list holding a ZipInfo object for each archive member.

Let’s revise Python Classes and Object Oriented Programming

d. ZipFile.namelist()

This returns a list of archive members by name.

e. ZipFile.open(name, mode=’r’, pwd=None)

This function extracts a member from the archive as a file-like object (CipExtFile). The mode can be ‘r’, ‘U’, or ‘rU’. pwd is the password for an encrypted file. name is a filename in the archive or a ZipInfo object.

Since it is also a context manager, we can use it with the ‘with’ statement:

with ZipFile(‘spam.zip’) as myzip:

with myzip.open(‘eggs.txt’) as myfile:

print(myfile.read())

f. ZipFile.extract(member, path=None, pwd=None)

This extracts a member from the archive to the current working directory. member may be a filename or a ZipInfo object, path is a different directory to extract to, and pwd is the password for an encrypted file.

g. ZipFile.extractall(path=None, members=None, pwd=None)

This extracts all members from the archive to the current working directory. The arguments mean the same as above.

h. ZipFile.printdir()

This prints a table of contents for the archive to sys.stdout.

i. ZipFile.setpassword(pwd)

This sets pwd as the default password to extract encrypted files.

j. ZipFile.read(name, pwd=None)

This returns the bytes of name in the archive, where name is the name of a file in the archive, or of a ZipFile object.

Let’s Know about Python Function Arguments with Types, Syntax and Examples

k. ZipFile.testzip()

This checks the CRCs and file headers for all files in the archive, and returns the name of the first bad file. If there is none, it returns None.

l. ZipFile.write(filename, arcname=None, compress_type=None)

This writes the file filename to the archive, calling it arcname.

m. ZipFile.writestr(zinfo_or_arcname, bytes[, compress_type])

This writes the string bytes to the archive

We also have some data attirbutes:

  •  ZipFile.debug

This denotes the level of debug output to use. 0 means no output (default) and 3 means the most output.

  • ZipFile.comment

This is the comment text linked with a Python ZIPfile.

6. Extracting ZIP Files in Python

Now, let’s try it hands-on. Let’s try extracting a Python Zipfile.

>>> from zipfile import ZipFile
>>> import os
>>> os.chdir("C:\\Users\\lifei\\Desktop")
>>> file="Demo.zip"
>>> with ZipFile(file,'r') as zip:          #ZipFile constructor; READ mode; ZipFile object named as zip
        zip.printdir()                      #To print contents of the archive
print("Extracting files")
        zip.extractall()                    #Extract contents of the ZIP to the current working directory
print("Finished extracting")

File Name              Modified                           Size

Demo/1.txt            2018-06-15 17:40:06          0

Demo/2.txt            2018-06-15 17:40:12          0

Demo/3.txt            2018-06-15 17:40:16          0

Extracting files

Finished extracting

As you can see, this extracts all files in the ZIP Demo.zip. It creates a folder labeled ‘Demo’ on the Desktop. We explain the code through comments. You can also extract just a single file using the method extract():

>>> with ZipFile(file,'r') as zip:
zip.extract('Demo/2.txt')

‘C:\\Users\\lifei\\Desktop\\Demo\\2.txt’

This creates a folder on the Desktop labeled ‘Demo’. But this time, it only contains one file- ‘2.txt’.
Let’s Discuss Python Read And Write File – File Handling In Python

7. How to Write Python ZIP File?

We use the write() method to write to a ZIP. Here’s the code we use:

>>> from zipfile import ZipFile
>>> import os
>>> os.chdir('C:\\Users\\lifei\\Desktop')
>>>
>>> def get_paths(directory):
        paths=[]
        for root, directories, files in os.walk(directory):
           for filename in files:
                filepath=os.path.join(root,filename)
                paths.append(filepath)
     return paths
>>>
>>> directory='./Demo'
>>> paths=get_paths(directory)
>>> print("Zipping these files:")
Zipping these files:
>>> for file in paths:
print(file)
./Demo\1.txt
./Demo\2.txt
./Demo\3.txt
>>> with ZipFile('Demo.zip','w') as zip:
for file in paths:
zip.write(file)
>>> print("Zip successful")

Zip successful
Now let’s see how this works:
We create a function with uses the method os.walk(). In every iteration, it appends the files in that directory to the list paths. Then, we get a list of the file paths bypassing the Demo directory’s path to the function get_paths(). Then, we create a ZipFile object in WRITE mode. Finally, we use the write() method to write all these files to the ZIP.

8. Getting Information about ZIP Files in Python

To find out more about a Python zipfile, we use the method infolist(). Let’s see how:

>>> from zipfile import ZipFile
>>> import datetime
>>> file="Demo.zip"
>>> with ZipFile(file,'r') as zip:
       for info in zip.infolist():
         print(info.filename)          
         print('\tModified:\t'+str(datetime.datetime(*info.date_time)         
         print('\tSystem:\t\t'+str(info.create_system)+'(0=Windows,3=Unix)')
         print('\tZIP version:\t'+str(info.create_version))
         print('\tCompressed:\t'+str(info.compress_size)+' bytes')
         print('\tUncompressed:\t'+str(info.file_size)+' bytes')

Demo/1.txt
Modified:            2018-06-15 17:56:32
System:              0(0=Windows,3=Unix)
ZIP version:          20
                Compressed:        0 bytes
Uncompressed:    0 bytes
Demo/2.txt
Modified:                  2018-06-15 17:57:18
System:                    0(0=Windows,3=Unix)
ZIP version:              20
Compressed:            29 bytes
Uncompressed:       40 bytes
Demo/3.txt
Modified:           2018-06-15 17:56:42
System:             0(0=Windows,3=Unix)
ZIP version:        20
Compressed:      0 bytes
Uncompressed:   0 bytes

Read about Python File I/O – Python Write to File and Read File

Here, we use the method infolist() to create an instance of the ZipInfo class that holds all information about the Python zipfile. It lets us access information like file names, a system where the file was created, file modification data, ZIP version, size of files, and so.

So, this was all about Python Zipfile Tutorial. Hope you like our explanation.

9. Conclusion

Hence, like we’ve always said, there are so many things you can do with Python. Using the Python zipfile module, we can even handle ZIP files. Tell us what you think about Python Zipfile in the comments.
Related Topic- Python Modules vs Packages
For reference

Leave a Reply

Your email address will not be published. Required fields are marked *