Talk:ZIP (file format)

Latest comment: 4 months ago by AnonMoos in topic Windows version

Are unreferenced/zombie files allowed?


The article says:

Because ZIP files may be appended to, only files specified in the central directory at the end of the file are valid. Scanning a ZIP file for local file headers is invalid (except in the case of corrupted archives), as the central directory may declare that some files have been deleted and other files have been updated.

However, the specification (6.3.9) says:

4.3.2 Each file placed into a ZIP file MUST be preceded by a "local file header" record for that file. Each "local file header" MUST be accompanied by a corresponding "central directory header" record within the central directory section of the ZIP file.

Also, "4.3.6 Overall .ZIP file format" and "4.3.8 File Data" do not mention that arbitrary meaningless bytes may be stored between files.

I checked that 7-Zip does not support unreferenced files. If I take a ZIP file and rewrite the central directory, removing some files from it, 7-Zip still shows these files as available. If I additionally replace the signatures of their local file headers with garbage, then 7-Zip cannot open the resulting file. Discussion on 7-zip forum.

I think this claim should either be removed or supported by a citation. -- Preceding unsigned comment added by Stgatilov (talk • contribs) 04:36, 11 May 2021 (UTC)

If you scan from the beginning for local file headers, then there's also the problem of a ZIP file contained uncompressed within another ZIP file (which is legal). AnonMoos (talk) 11:15, 11 May 2021 (UTC)
Unfortunately, unreferenced files can occur, depending on how the ZIP application changes and updates files. To update a file, an application can write a new file signature (local file header) at the location where the central directory starts and then write the central directory back out after it. The only thing that needs to change in the central directory is the one entry for that file, which must now point to the new file signature. At the end of the central directory we write the end of directory signature, which must hold the updated position at which the central directory starts.
If we read the ZIP the proper way, bottom up, we find the end of directory signature first, which tells us where the central directory starts. Each central directory entry holds the file path and the location of the file signature, and each file signature holds the file path as well. If we instead read from the start of the ZIP down, we would see two file signatures with the same file name and path. The central directory points to only one of them: the one added at the old start of the central directory before the central directory was written back out with two values adjusted, namely the entry's file signature location and the start position of the central directory. Damian Recoskie (talk) 21:37, 1 March 2023 (UTC)
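To make the difference between the two reading directions concrete, here is a minimal Python sketch. It builds a ZIP that stores another ZIP uncompressed, then compares a top-down scan for local file header signatures with a bottom-up read of the central directory. The function names are made up for this example, and the end record is located with a plain `rfind`, which assumes the archive has no comment containing the signature; it is not how any particular tool does it.

```python
import io
import struct
import zipfile

LOCAL_SIG = b"PK\x03\x04"  # local file header signature
EOCD_SIG = b"PK\x05\x06"   # end of central directory signature
FIXED_TIME = (2020, 1, 1, 0, 0, 0)  # fixed timestamp so the bytes are reproducible


def build_nested_zip() -> bytes:
    """Return an outer ZIP holding an inner ZIP stored without compression."""
    inner = io.BytesIO()
    with zipfile.ZipFile(inner, "w") as zf:
        zf.writestr(zipfile.ZipInfo("inner.txt", FIXED_TIME), b"hello from the inner archive")
    outer = io.BytesIO()
    with zipfile.ZipFile(outer, "w") as zf:
        # A ZipInfo created this way defaults to ZIP_STORED (no compression).
        zf.writestr(zipfile.ZipInfo("inner.zip", FIXED_TIME), inner.getvalue())
    return outer.getvalue()


def scan_local_headers(data: bytes) -> list[int]:
    """Top-down: offsets of every byte sequence that looks like a local header."""
    offsets = []
    pos = data.find(LOCAL_SIG)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(LOCAL_SIG, pos + 1)
    return offsets


def central_directory_entries(data: bytes) -> list[tuple[str, int]]:
    """Bottom-up: (name, local header offset) for each central directory entry."""
    eocd = data.rfind(EOCD_SIG)
    if eocd == -1:
        raise ValueError("no end of central directory record found")
    # Total entry count, central directory size, central directory offset.
    total, _cd_size, cd_offset = struct.unpack_from("<HII", data, eocd + 10)
    entries = []
    pos = cd_offset
    for _ in range(total):
        name_len, extra_len, comment_len = struct.unpack_from("<HHH", data, pos + 28)
        (header_offset,) = struct.unpack_from("<I", data, pos + 42)
        name = data[pos + 46 : pos + 46 + name_len].decode("cp437")
        entries.append((name, header_offset))
        pos += 46 + name_len + extra_len + comment_len
    return entries


data = build_nested_zip()
print("top-down scan offsets:", scan_local_headers(data))
print("central directory:", central_directory_entries(data))
```

The top-down scan also reports the header that belongs to the inner archive, while the central directory of the outer archive lists only `inner.zip`.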

How unreferenced files happen.


First of all, the explanation should be simplified so that everyone can understand it, rather than using wording like "appending files to the end of the ZIP".

The fast way of reading a ZIP is bottom up. Once the end of directory signature is found, it tells you where in the file to start reading the central directory. The central directory is a list of central directory signatures, each giving the path of a file in the ZIP together with the location where the file signature for that file can be found.

As a note, each file signature contains the file path of the file as well. To update a file, we can write a new file signature for it at the location where the central directory starts. We can then write the central directory after this file signature, updating the file signature location of just that one entry.

This means you end up with two file signatures in the ZIP for the same file, of which the one we added contains the changes. So in reality we can add files at the start of the central directory and then write the central directory back out, with the entry for the changed file pointing to the added file signature. This also means we end up with a file signature that is not referenced from the central directory. 24.150.140.42 (talk) 21:04, 1 March 2023 (UTC)
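The update sequence described above can be sketched in Python as follows. This is only an illustration under simplifying assumptions: one stored (uncompressed) file, no ZIP64, no archive comment, and a fixed DOS timestamp. The helper names (`local_record`, `central_record`, `end_record`, `write_archive`) are invented for this example and do not come from any ZIP tool.

```python
import io
import struct
import zipfile
import zlib

LOCAL_SIG = b"PK\x03\x04"    # file signature (local file header)
CENTRAL_SIG = b"PK\x01\x02"  # central directory entry signature
EOCD_SIG = b"PK\x05\x06"     # end of directory signature
DOS_DATE = (1 << 5) | 1      # 1980-01-01 in DOS date encoding


def local_record(name: bytes, payload: bytes) -> bytes:
    """Local file header followed by the file name and the stored data."""
    header = struct.pack(
        "<4sHHHHHIIIHH", LOCAL_SIG, 20, 0, 0, 0, DOS_DATE,
        zlib.crc32(payload), len(payload), len(payload), len(name), 0)
    return header + name + payload


def central_record(name: bytes, payload: bytes, offset: int) -> bytes:
    """Central directory entry pointing at the local header at `offset`."""
    header = struct.pack(
        "<4sHHHHHHIIIHHHHHII", CENTRAL_SIG, 20, 20, 0, 0, 0, DOS_DATE,
        zlib.crc32(payload), len(payload), len(payload),
        len(name), 0, 0, 0, 0, 0, offset)
    return header + name


def end_record(count: int, cd_size: int, cd_offset: int) -> bytes:
    """End of directory record giving the start position of the central directory."""
    return struct.pack("<4sHHHHIIH", EOCD_SIG, 0, 0, count, count, cd_size, cd_offset, 0)


def write_archive(body: bytes, entries) -> bytes:
    """Write the central directory and end record after the local records in `body`."""
    directory = b"".join(central_record(n, p, off) for n, p, off in entries)
    return body + directory + end_record(len(entries), len(directory), len(body))


# Original archive: one file signature, one central directory entry.
original = write_archive(local_record(b"a.txt", b"old contents"),
                         [(b"a.txt", b"old contents", 0)])

# Update: find where the central directory starts, write the new file signature
# there, then write the directory back out pointing only at the new one.
(cd_offset,) = struct.unpack_from("<I", original, original.rfind(EOCD_SIG) + 16)
body = original[:cd_offset] + local_record(b"a.txt", b"new contents")
updated = write_archive(body, [(b"a.txt", b"new contents", cd_offset)])

print("file signatures in the bytes:", updated.count(LOCAL_SIG))
with zipfile.ZipFile(io.BytesIO(updated)) as zf:
    print("central directory lists:", zf.namelist())
    print("contents read bottom-up:", zf.read("a.txt"))
```

The old file signature for `a.txt` is still present in the bytes, but a reader that goes through the central directory, as Python's `zipfile` does here, sees only the updated file.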

ANSI


The term "ANSI" as used in the article is technically incorrect, and will convey very little meaning to those who don't already know the term. At a minimum Windows-1252 should be linked (see the explanations on that page). AnonMoos (talk) 08:52, 12 June 2024 (UTC)

Windows version


I saw that the article says "Windows up to 11". I don't know whether older versions support ZIP files. 47.234.198.142 (talk) 00:08, 16 June 2024 (UTC)

ZIP files themselves are not very operating-system specific. What matters is whether the software to compress and uncompress them is supported on an operating system. The first PKZIP version which supported "flate" compression was released before Windows 3.1, when most x86 computers were still running MS-DOS. Info-ZIP was ported to a wide range of operating systems. AnonMoos (talk) 18:33, 18 June 2024 (UTC)