Java_Java10commonscompress

The document details a code change in the ZipFile class of the commons-compress Java library, addressing an issue with Unicode extra fields that caused retrieval failures in zip files created with WinZip. A new LinkedHashMap is introduced to maintain the original entries while ensuring the correct association between ZIP entries and their offsets. The change aims to enhance the handling of Unicode extra fields in zip files, resolving the identified bug (COMPRESS-164).

Uploaded by

goutam_dutta123

====================Info Start============================

{
"id": 10,
"repo_name": "commons-compress",
"Commit URL":
"https://github.com/apache/commons-compress/commit/347660646cfad588481c13058842d49fa3779f84?diff=split",
"Issue URL": "https://issues.apache.org/jira/browse/COMPRESS-164",
"language": "Java"
}

====================Info End====================================

====================Additional Info Start====================================

For the Code Change area:

A line of code starting with "+" is a line that is REMOVED.
A line of code starting with "-" is a line that is ADDED.

When extracting the desired refinement code, be careful to choose the right
lines of code.

Error types = [ code logic , best practice , code quality , security ]

====================Additional Info End====================================

====================Commit Message Start====================================

ZipFile doesn't work properly for unicode extra fields. Based on patch by
Volker Leidl. COMPRESS-164

git-svn-id: https://svn.apache.org/repos/asf/commons/proper/compress/trunk@1210501 13f79535-47bb-0310-9956-ffa450edef68

====================Commit Message End====================================

====================Code Change Start====================================

diff --git a/src/main/java/org/apache/commons/compress/archivers/zip/ZipFile.java b/src/main/java/org/apache/commons/compress/archivers/zip/ZipFile.java
index b0588695..964c86db 100644
--- a/src/main/java/org/apache/commons/compress/archivers/zip/ZipFile.java
+++ b/src/main/java/org/apache/commons/compress/archivers/zip/ZipFile.java
@@ -805,11 +805,8 @@ public class ZipFile {
         // the hashcode - see COMPRESS-164
         // Map needs to be reconstructed in order to keep central
         // directory order
-        Map<ZipArchiveEntry, OffsetEntry> origMap =
-            new LinkedHashMap<ZipArchiveEntry, OffsetEntry>(entries);
-        entries.clear();
-        for (ZipArchiveEntry ze : origMap.keySet()) {
-            OffsetEntry offsetEntry = origMap.get(ze);
+        for (ZipArchiveEntry ze : entries.keySet()) {
+            OffsetEntry offsetEntry = entries.get(ze);
             long offset = offsetEntry.headerOffset;
             archive.seek(offset + LFH_OFFSET_FOR_FILENAME_LENGTH);
             byte[] b = new byte[SHORT];
@@ -842,7 +839,6 @@ public class ZipFile {
                     nameMap.put(ze.getName(), ze);
                 }
             }
-            entries.put(ze, offsetEntry);
         }
     }

====================Code Change End====================================

====================Additional Info Start====================================

{
"Do you want to reject this annotation": {
"options": [
"1. Yes",
"2. No"
],
"answer": "2"
},
"Does the code have a valid bug": {
"options": [
"1. Yes",
"2. No"
],
"answer": "1"
},
"Is the provided refinement correct": {
"options": [
"1. Correct",
"2. Not Correct",
"3. Partially Correct"
],
"answer": "1"
},

"Annotator Name": "joyce.jacob",


"Time taken to annotate (in mins)": "40"
}

====================Additional Info End====================================

====================Debug Prompt Start====================================

Fix the issue in the code.

====================Debug Prompt End=====================================

====================Error Type Start====================================


code logic

====================Error Type End=====================================

====================Error Explanation Start====================================

The `ZipFile` class reads entries from a zip file and stores them in a `HashMap`
keyed by `ZipArchiveEntry`. After the map is populated, the Unicode extra fields
are read, which changes the name of the affected `ZipArchiveEntry` and therefore
its hash code. The map still holds the entry under its old hash code, so later
lookups with the renamed entry fail to retrieve the stored value. As a result,
`ZipFile.getInputStream()` returns null for a zip file created with WinZip that
contains Unicode extra fields.
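
The failure mode described above can be reproduced without the library. The
following is a minimal sketch, not code from commons-compress: `Entry` is a
hypothetical stand-in for `ZipArchiveEntry`, and it assumes that equality and
hash code depend only on the name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class MutableKeyDemo {

    /** Hypothetical stand-in for ZipArchiveEntry: equals/hashCode depend on the name. */
    static final class Entry {
        private String name;

        Entry(String name) { this.name = name; }

        void setName(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Entry && Objects.equals(name, ((Entry) o).name);
        }

        @Override
        public int hashCode() { return Objects.hashCode(name); }
    }

    /** Returns true if the value can still be looked up after the key is renamed in place. */
    static boolean foundAfterRename() {
        Map<Entry, Long> offsets = new HashMap<>();
        Entry entry = new Entry("?.txt");
        offsets.put(entry, 42L);       // stored under the hash code of "?.txt"
        entry.setName("\u00e4.txt");   // name (and hash code) change while the key is in the map
        return offsets.get(entry) != null;
    }

    public static void main(String[] args) {
        System.out.println("found after rename: " + foundAfterRename());
    }
}
```

In this sketch the lookup misses even though the very same object is used as the
key, because the map compares the hash code recorded at insertion time with the
one computed after the rename.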

====================Error Explanation End====================================

===================Refinement Summary Start====================================

To enable reading zip files with Unicode extra fields, a new `LinkedHashMap` named
`origMap`, with keys of type `ZipArchiveEntry` and values of type `OffsetEntry`, is
created as a copy of the original `entries` map. The `entries` map is then cleared,
and a `for-each` loop iterates over the keys of `origMap`, retrieving the
`OffsetEntry` for each key before the entry's name can change. At the end of each
iteration, the `ZipArchiveEntry` and its `OffsetEntry` are put back into `entries`,
so the entry is stored under the hash code of its final name. Rebuilding the map
after the Unicode extra fields have been parsed preserves the association between
ZIP entries and their offsets within the archive and keeps the iteration order
consistent (central directory order). With these changes, Unicode extra fields in
zip files are handled properly.
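
The copy, clear, and re-insert pattern summarized above can be sketched in
isolation. This is an illustrative sketch under stated assumptions, not the
library's code: `NamedEntry` and `renameAll` are hypothetical names, a `Long`
stands in for `OffsetEntry`, and the rename is a simple prefix instead of the
name read from a Unicode extra field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class RebuildMapDemo {

    /** Hypothetical stand-in for ZipArchiveEntry: equals/hashCode depend on the name. */
    static final class NamedEntry {
        private String name;

        NamedEntry(String name) { this.name = name; }

        String getName() { return name; }

        void setName(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof NamedEntry && Objects.equals(name, ((NamedEntry) o).name);
        }

        @Override
        public int hashCode() { return Objects.hashCode(name); }
    }

    /** Renames every key and rebuilds 'entries' so lookups and iteration order stay intact. */
    static void renameAll(Map<NamedEntry, Long> entries, String prefix) {
        // Copy first: the copy preserves insertion order and the key/value association.
        Map<NamedEntry, Long> origMap = new LinkedHashMap<>(entries);
        entries.clear();
        for (Map.Entry<NamedEntry, Long> pair : origMap.entrySet()) {
            NamedEntry ze = pair.getKey();
            ze.setName(prefix + ze.getName()); // hash code changes, but ze is not in 'entries' now
            entries.put(ze, pair.getValue());  // re-inserted under the new hash code
        }
    }

    /** Returns true if values are still retrievable and the original order is kept. */
    static boolean demo() {
        Map<NamedEntry, Long> entries = new LinkedHashMap<>();
        NamedEntry first = new NamedEntry("b.txt");
        NamedEntry second = new NamedEntry("a.txt");
        entries.put(first, 0L);
        entries.put(second, 100L);

        renameAll(entries, "dir/");

        List<String> order = new ArrayList<>();
        for (NamedEntry e : entries.keySet()) {
            order.add(e.getName());
        }
        return Long.valueOf(0L).equals(entries.get(first))
            && Long.valueOf(100L).equals(entries.get(second))
            && order.equals(Arrays.asList("dir/b.txt", "dir/a.txt"));
    }

    public static void main(String[] args) {
        System.out.println("lookups and order preserved: " + demo());
    }
}
```

The sketch iterates over `entrySet()` so that no lookup is needed on the copy; the
refinement code instead calls `origMap.get(ze)` at the top of each iteration, which
also works because the lookup happens before the entry is renamed.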

===================Refinement Summary End====================================

===================Desired Refinement Code Start====================================

src/main/java/org/apache/commons/compress/archivers/zip/ZipFile.java
```
@@ -805,11 +805,8 @@ public class ZipFile {
        // the hashcode - see COMPRESS-164
        // Map needs to be reconstructed in order to keep central
        // directory order
        Map<ZipArchiveEntry, OffsetEntry> origMap =
            new LinkedHashMap<ZipArchiveEntry, OffsetEntry>(entries);
        entries.clear();
        for (ZipArchiveEntry ze : origMap.keySet()) {
            OffsetEntry offsetEntry = origMap.get(ze);
            long offset = offsetEntry.headerOffset;
            archive.seek(offset + LFH_OFFSET_FOR_FILENAME_LENGTH);
            byte[] b = new byte[SHORT];
@@ -842,7 +839,6 @@ public class ZipFile {
                    nameMap.put(ze.getName(), ze);
                }
            }
            entries.put(ze, offsetEntry);
        }
    }
```
===================Desired Refinement Code End ====================================

===================Alternative Refinement Summary Start=================================

===================Alternative Refinement Summary End====================================

===================Alternative Refinement Code Start====================================

===================Alternative Refinement Code End====================================
