Python Zipfile – Benefits, Modules, Objects in Zipfiles in Python
Keeping you updated with latest technology trends, Join DataFlair on Telegram
1. Python Zipfile – Objective
In our previous tutorial, we studied Image Processing with Scipy and NumPy in Python and in this article, we will learn about Python Zipfile. Moreover, we will see how we can extract Zipfile in Python. In addition, we will cover how can we write Python Zipfile, and getting information about them using Python. At last, we will see some methods of the Python Zipfiles module provides and exceptions.
So, let’s begin Python Zipfile Tutorial.
2. What is Python Zipfile?
Python ZIPfile is an ideal way to group similar files and compress large files to reduce their size. The compression is lossless. This means that using the compression algorithm, we can operate on the compressed data to perfectly reconstruct the original data. So, in Python zipfile is an archive file format and a compression standard; it is a single file that holds compressed files.
Do you know about Python Forensics – Hash Function, Virtualization & much more
3. Advantages of Python Zipfiles
Bunching files into zips offer the following advantages:
1. It reduces storage requirements
Since ZIP files use compression, they can hold much more for the same amount of storage
2. It improves transfer speed over standard connections
Since it is just one file holding less storage, it transfers faster
Now, let’s learn about the module zipfile.
4. Python Zipfile Module
The Python zipfile module has tools to create, read, write, append, and list ZIP files. At the time of writing. It lets us handle ZIP files that use ZIP64 extensions and decrypt encrypted files in ZIP archives. It cannot handle multi-disk ZIP files or create encrypted files.
Python Zipfile Module has the following members:
a. exception zipfile.BadZipFile
For a bad ZIP file, it raises this exception.
b. exception zipfile.BadZipfile
This is an alias for the previous exception in the list. It is to make it compatible with older Python versions. This is deprecated since version 3.2.
c. exception zipfile.LargeZipFile
When a Python ZIPfile needs ZIP64 functionality, but it hasn’t been enabled, Python throws this exception.
d. class zipfile.ZipFile
This is the class for reading and writing ZIP files in Python.
e. class zipfile.PyZipFile
This class lets us create ZIP archives holding Python libraries.
f. class zipfile.ZipInfo(filename=’NoName’, date_time=(1980, 1, 1, 0, 0, 0))
With this class, we can represent information about an archive member. The getinfo() and infolist() methods of Python ZipFile objects return instances of this class.
This considers the magic number of a ZIP file. If it is a valid ZIP file, it returns True; otherwise, False. This works on files and file-like objects.
This is the numeric constant for uncompressed archive members.
This is the numeric constant for the usual ZIP compression method. It needs the zlib module.
This is the numeric constant for the BZIP2 compression method. It needs the bz2 module.
This is the numeric constant for the LZMA compression method. It needs the lzma module.
Read about Exception Handling in Python for Python Programming
5. Python ZipFile Objects
Python zipfile class this type:
a. class zipfile.ZipFile(file, mode=’r’, compression=ZIP_STORED, allowZip64=True)
This method opens a Python ZIPfile. Here, file may be a file-like object or a string path to a file. We have the following modes:
‘r’- To read an existing file
‘w’- To truncate and write a new file
‘a’- To append to an existing file
Using the compression argument, we can select the compression method to use when writing the archive. allowZip64 is True by default. This creates ZIP files that use ZIP64 extensions for zipfiles larger than GiB.
ZipFile has the following functions:
This closes the archive file. If we do not call this before exiting the program, Python doesn’t write the records intended to.
This returns a ZipInfo object holding information about the archive member name.
This returns a list holding a ZipInfo object for each archive member.
This returns a list of archive members by name.
e. ZipFile.open(name, mode=’r’, pwd=None)
This function extracts a member from the archive as a file-like object (CipExtFile). The mode can be ‘r’, ‘U’, or ‘rU’. pwd is the password for an encrypted file. name is a filename in the archive or a ZipInfo object.
Since it is also a context manager, we can use it with the ‘with’ statement:
with ZipFile(‘spam.zip’) as myzip:
with myzip.open(‘eggs.txt’) as myfile:
f. ZipFile.extract(member, path=None, pwd=None)
This extracts a member from the archive to the current working directory. member may be a filename or a ZipInfo object, path is a different directory to extract to, and pwd is the password for an encrypted file.
g. ZipFile.extractall(path=None, members=None, pwd=None)
This extracts all members from the archive to the current working directory. The arguments mean the same as above.
This prints a table of contents for the archive to sys.stdout.
This sets pwd as the default password to extract encrypted files.
j. ZipFile.read(name, pwd=None)
This returns the bytes of name in the archive, where name is the name of a file in the archive, or of a ZipFile object.
This checks the CRCs and file headers for all files in the archive, and returns the name of the first bad file. If there is none, it returns None.
l. ZipFile.write(filename, arcname=None, compress_type=None)
This writes the file filename to the archive, calling it arcname.
m. ZipFile.writestr(zinfo_or_arcname, bytes[, compress_type])
This writes the string bytes to the archive
We also have some data attirbutes:
This denotes the level of debug output to use. 0 means no output (default) and 3 means the most output.
This is the comment text linked with a Python ZIPfile.
6. Extracting ZIP Files in Python
Now, let’s try it hands-on. Let’s try extracting a Python Zipfile.
>>> from zipfile import ZipFile >>> import os >>> os.chdir("C:\\Users\\lifei\\Desktop") >>> file="Demo.zip" >>> with ZipFile(file,'r') as zip: #ZipFile constructor; READ mode; ZipFile object named as zip zip.printdir() #To print contents of the archive print("Extracting files") zip.extractall() #Extract contents of the ZIP to the current working directory print("Finished extracting")
File Name Modified Size
Demo/1.txt 2018-06-15 17:40:06 0
Demo/2.txt 2018-06-15 17:40:12 0
Demo/3.txt 2018-06-15 17:40:16 0
As you can see, this extracts all files in the ZIP Demo.zip. It creates a folder labeled ‘Demo’ on the Desktop. We explain the code through comments. You can also extract just a single file using the method extract():
>>> with ZipFile(file,'r') as zip: zip.extract('Demo/2.txt')
This creates a folder on the Desktop labeled ‘Demo’. But this time, it only contains one file- ‘2.txt’.
Let’s Discuss Python Read And Write File – File Handling In Python
7. How to Write Python ZIP File?
We use the write() method to write to a ZIP. Here’s the code we use:
>>> from zipfile import ZipFile >>> import os >>> os.chdir('C:\\Users\\lifei\\Desktop') >>> >>> def get_paths(directory): paths= for root, directories, files in os.walk(directory): for filename in files: filepath=os.path.join(root,filename) paths.append(filepath) return paths >>> >>> directory='./Demo' >>> paths=get_paths(directory) >>> print("Zipping these files:") Zipping these files: >>> for file in paths: print(file) ./Demo\1.txt ./Demo\2.txt ./Demo\3.txt >>> with ZipFile('Demo.zip','w') as zip: for file in paths: zip.write(file) >>> print("Zip successful")
Now let’s see how this works:
We create a function with uses the method os.walk(). In every iteration, it appends the files in that directory to the list paths. Then, we get a list of the file paths bypassing the Demo directory’s path to the function get_paths(). Then, we create a ZipFile object in WRITE mode. Finally, we use the write() method to write all these files to the ZIP.
8. Getting Information about ZIP Files in Python
To find out more about a Python zipfile, we use the method infolist(). Let’s see how:
>>> from zipfile import ZipFile >>> import datetime >>> file="Demo.zip" >>> with ZipFile(file,'r') as zip: for info in zip.infolist(): print(info.filename) print('\tModified:\t'+str(datetime.datetime(*info.date_time) print('\tSystem:\t\t'+str(info.create_system)+'(0=Windows,3=Unix)') print('\tZIP version:\t'+str(info.create_version)) print('\tCompressed:\t'+str(info.compress_size)+' bytes') print('\tUncompressed:\t'+str(info.file_size)+' bytes')
Modified: 2018-06-15 17:56:32
ZIP version: 20
Compressed: 0 bytes
Uncompressed: 0 bytes
Modified: 2018-06-15 17:57:18
ZIP version: 20
Compressed: 29 bytes
Uncompressed: 40 bytes
Modified: 2018-06-15 17:56:42
ZIP version: 20
Compressed: 0 bytes
Uncompressed: 0 bytes
Here, we use the method infolist() to create an instance of the ZipInfo class that holds all information about the Python zipfile. It lets us access information like file names, a system where the file was created, file modification data, ZIP version, size of files, and so.
So, this was all about Python Zipfile Tutorial. Hope you like our explanation.
Hence, like we’ve always said, there are so many things you can do with Python. Using the Python zipfile module, we can even handle ZIP files. Tell us what you think about Python Zipfile in the comments.
Related Topic- Python Modules vs Packages