Partial ZIP File Downloads, (Mon, Jan 20th)

SANS Internet Storm Center, InfoCON: green 2025-01-20

Say you want a file that is inside a huge online ZIP file (several gigabytes large). Downloading the complete ZIP file would take too long.

If the HTTP server supports the Range header, you can download only the parts of the ZIP file that you need:

We will work with my DidierStevensSuite.zip file as an example (it's 13MB in size, not several GBs, but the principle remains the same).

First, with a HEAD HTTP request, we figure out the ZIP file size:

The size of the ZIP file is 13189336 bytes.
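The same size check can be sketched in Python with only the standard library. This is a minimal illustration, not the tooling used in this diary entry: the function names are hypothetical, and it assumes the server reports a Content-Length header in its HEAD response.

```python
import urllib.request


def build_head_request(url):
    """Return a request that asks for the headers only (HTTP HEAD)."""
    return urllib.request.Request(url, method="HEAD")


def size_from_headers(headers):
    """Return Content-Length as an int, or None when the header is absent."""
    length = headers.get("Content-Length")
    return int(length) if length is not None else None


def remote_size(url):
    """Perform the HEAD request and return the reported file size."""
    with urllib.request.urlopen(build_head_request(url)) as response:
        return size_from_headers(response.headers)


# Offline check with the size reported in this example.
print(size_from_headers({"Content-Length": "13189336"}))  # 13189336
```

Calling remote_size() with the URL of the ZIP file performs the actual network request.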

The end of a ZIP file contains a series of DIR records that compose the directory of files (and directories) contained inside the ZIP file. This directory is usually small compared to the file size, so we will do a partial download starting at position 13000000, which covers the last 189336 bytes of the file.

This can be done with curl's range option (-r), which adds a Range header that specifies the bytes we want to download:
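For readers who prefer a script, here is a rough Python equivalent of such a range request. The helper names are hypothetical. The sketch assumes that a server honoring the range answers with status 206 (Partial Content), and it treats any other status as a failure.

```python
import urllib.request


def range_header(start, end=None):
    """Build a Range header value; leaving out end means 'to the end of the file'."""
    if end is None:
        return f"bytes={start}-"
    return f"bytes={start}-{end}"


def build_range_request(url, start, end=None):
    """Return a GET request carrying the Range header."""
    return urllib.request.Request(url, headers={"Range": range_header(start, end)})


def fetch_range(url, start, end=None):
    """Download the requested bytes; fail if the server ignored the range."""
    with urllib.request.urlopen(build_range_request(url, start, end)) as response:
        if response.status != 206:
            raise RuntimeError("server did not return partial content")
        return response.read()


# Header value for the tail download described above.
print(range_header(13000000))  # bytes=13000000-
```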

Next we use my zipdump.py tool to parse the ZIP records (-f l) inside the partial ZIP download like this:

Let's say that the file we want to obtain is xor-kpa.py. Its ZIP DIR record starts at position 0x0002e05d of the partial download.

We can analyze that record like this:

Field headeroffset tells us where the corresponding ZIP FILE record is inside the ZIP file: at position 11892478. That ZIP FILE record contains the compressed data of the file (xor-kpa.py) we want. So that's the begin value of our range option: -r 11892478-
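To show where the header offset lives inside a DIR record, here is a small Python sketch that scans a downloaded tail for DIR records and unpacks the 46-byte fixed part of each one. It is not how zipdump.py works internally. It assumes no ZIP64 extensions and UTF-8 compatible filenames, and a naive signature search can produce false hits inside compressed data. The demo builds a small archive in memory instead of using the real download.

```python
import io
import struct
import zipfile

DIR_SIGNATURE = b"PK\x01\x02"
# Fixed 46-byte part of a DIR (central directory) record, little-endian.
DIR_STRUCT = struct.Struct("<4s6H3I5H2I")


def parse_dir_records(tail):
    """Yield (filename, header_offset, compressed_size) for each DIR record."""
    position = tail.find(DIR_SIGNATURE)
    while position != -1 and position + DIR_STRUCT.size <= len(tail):
        fields = DIR_STRUCT.unpack_from(tail, position)
        name_length, extra_length, comment_length = fields[10:13]
        name_start = position + DIR_STRUCT.size
        name = tail[name_start:name_start + name_length].decode("utf-8", "replace")
        # fields[16] is the offset of the matching FILE record in the full ZIP.
        yield name, fields[16], fields[8]
        next_start = name_start + name_length + extra_length + comment_length
        position = tail.find(DIR_SIGNATURE, next_start)


# Demo on an in-memory archive (illustrative file names and contents).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("a.txt", "hello " * 50)
    archive.writestr("b.txt", "world " * 50)
data = buffer.getvalue()
records = list(parse_dir_records(data))
for name, header_offset, compressed_size in records:
    print(name, header_offset, compressed_size)
```

The offsets are relative to the start of the complete ZIP file, not to the start of the partial download, so they can be used directly as range values.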

To determine the end value of our range option, we look at the next record in line (that's for file XORSearch.exe):

That ZIP FILE record starts at position 11899893. So 11899893 minus 1 is the end value of our range option: -r 11892478-11899892.
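The arithmetic can be written down as a tiny Python helper (a hypothetical function for illustration). It assumes FILE records are stored back to back, so a record ends right before the next higher header offset. For the last file in the archive there is no next record, so the caller has to supply the position where the directory starts.

```python
def record_range(header_offsets, wanted_offset, directory_start=None):
    """Return the inclusive (begin, end) byte range of one FILE record."""
    later = [offset for offset in header_offsets if offset > wanted_offset]
    if later:
        next_start = min(later)
    elif directory_start is not None:
        next_start = directory_start
    else:
        raise ValueError("last record: directory_start is required")
    return wanted_offset, next_start - 1


# Header offsets of xor-kpa.py and XORSearch.exe from this example.
begin, end = record_range([11892478, 11899893], 11892478)
print(f"-r {begin}-{end}")   # -r 11892478-11899892
print(end - begin + 1)       # 7415 bytes to download
```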

Here is the curl command to download the entire ZIP FILE record for file xor-kpa.py:

And we analyze that partial download with zipdump.py like this:

The zipdump.py command to decompress (-s decompress) the ZIP data for file xor-kpa.py and write it to disk (-d) is the following:
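As an alternative illustration in plain Python, a single downloaded FILE record can be unpacked with struct and zlib. This is a sketch under assumptions, not a replacement for zipdump.py: it handles only the stored (0) and deflate (8) methods, ignores encryption and ZIP64, and skips CRC verification. The demo cuts a record out of an in-memory archive rather than the real download.

```python
import io
import struct
import zipfile
import zlib

FILE_SIGNATURE = b"PK\x03\x04"
# Fixed 30-byte part of a FILE (local file header) record, little-endian.
FILE_STRUCT = struct.Struct("<4s5H3I2H")


def extract_file_record(record):
    """Return (filename, content) from the bytes of one ZIP FILE record."""
    fields = FILE_STRUCT.unpack_from(record, 0)
    if fields[0] != FILE_SIGNATURE:
        raise ValueError("not a ZIP FILE record")
    method, compressed_size = fields[3], fields[7]
    name_length, extra_length = fields[9], fields[10]
    name_start = FILE_STRUCT.size
    name = record[name_start:name_start + name_length].decode("utf-8", "replace")
    data = record[name_start + name_length + extra_length:]
    if method == 0:
        # Stored: the content is the raw bytes, with the length from the header.
        return name, data[:compressed_size]
    if method == 8:
        # Raw deflate stream (wbits=-15); bytes after the stream are ignored.
        return name, zlib.decompressobj(-15).decompress(data)
    raise ValueError(f"unsupported compression method {method}")


# Demo: cut the first FILE record out of an in-memory archive.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("first.txt", "partial download " * 40)
    archive.writestr("second.txt", "another file")
raw = buffer.getvalue()
first, second = zipfile.ZipFile(io.BytesIO(raw)).infolist()
record = raw[first.header_offset:second.header_offset]
name, content = extract_file_record(record)
print(name, len(content))  # first.txt 680
```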

And that gives us the desired file:

Didier Stevens Senior handler blog.DidierStevens.com

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.