Core: Fix RESTFileScanTaskParser to handle empty delete file references list #14568
Conversation
@ajreid21 the fix LGTM, can you please add a test to
singhpk234
left a comment
Fix, LGTM, agree with @nastra on adding the UT
  List<Integer> indices = JsonUtil.getIntegerList(DELETE_FILE_REFERENCES, jsonNode);
  Preconditions.checkArgument(
-     Collections.max(indices) < allDeleteFiles.size(),
+     indices.isEmpty() || Collections.max(indices) < allDeleteFiles.size(),
The fix LGTM, I had it sitting in the part 2 PR: https://github.com/apache/iceberg/pull/13400/files#diff-584f9d53626a76efc298a57ada578de6d07c6d0b00f767f85a05e40471426374R90 since Jun 26 :)
I have one more fix in the part 2 PR; let me pull that one out as well.
amogh-jahagirdar
left a comment
Thanks @ajreid21 @singhpk234 for this fix, agree with the fix but would be good to add the test like @nastra mentioned!
  generator.writeFieldName(DATA_FILE);
  ContentFileParser.toJson(fileScanTask.file(), partitionSpec, generator);
- if (deleteFileReferences != null) {
+ if (deleteFileReferences != null && !deleteFileReferences.isEmpty()) {
Thanks for fixing this, we shouldn't be producing the field if there's nothing there. But we can be more accepting on the read side as done below.
singhpk234
left a comment
LGTM too, thanks @ajreid21 !
thanks for the reviews @singhpk234 @amogh-jahagirdar
Currently, when you deserialize the FileScanTask JSON using RESTFileScanTaskParser.fromJson, the deserializer checks whether the delete_file_references node exists but not whether it is empty, so an empty list makes the deserializer fail at
iceberg/core/src/main/java/org/apache/iceberg/rest/RESTFileScanTaskParser.java
Line 90 in 7368e59
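For readers who want to see the failure in isolation, here is a minimal sketch using only the JDK. It assumes the failure comes from calling Collections.max on an empty list, which throws NoSuchElementException. The class and method names (DeleteRefCheck, uncheckedInRange, guardedInRange) are hypothetical and not part of Iceberg; only the boolean condition mirrors the one in the diff.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-alone illustration; not Iceberg code.
public class DeleteRefCheck {
  // Shape of the original condition: Collections.max throws on an empty list.
  static boolean uncheckedInRange(List<Integer> indices, int deleteFileCount) {
    return Collections.max(indices) < deleteFileCount;
  }

  // Shape of the fixed condition: an empty list is accepted without calling max.
  static boolean guardedInRange(List<Integer> indices, int deleteFileCount) {
    return indices.isEmpty() || Collections.max(indices) < deleteFileCount;
  }

  public static void main(String[] args) {
    List<Integer> empty = Collections.emptyList();
    try {
      uncheckedInRange(empty, 0);
    } catch (NoSuchElementException e) {
      System.out.println("unguarded check threw on an empty list");
    }
    System.out.println(guardedInRange(empty, 0));               // empty list passes
    System.out.println(guardedInRange(Arrays.asList(0, 2), 3)); // all indices in range
    System.out.println(guardedInRange(Arrays.asList(0, 3), 3)); // index 3 is out of range
  }
}
```

The guard relies on short-circuit evaluation of `||`, so Collections.max is never reached for an empty list, while out-of-range indices are still rejected.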
TableScanResponseParser.serializeScanTasks currently always writes an empty delete_file_references list, even when the task has no delete files; see
iceberg/core/src/main/java/org/apache/iceberg/rest/TableScanResponseParser.java
Line 121 in 7368e59
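The write-side behavior agreed on in review (do not produce the field when there is nothing in it) can be sketched without Jackson as follows. This is an illustration under assumptions, not the parser's actual code: the ScanTaskFields class, the toFields helper, the map-based output, and the literal key strings are all hypothetical stand-ins. Only the null-and-empty condition is taken from the diff.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone illustration; not Iceberg code.
public class ScanTaskFields {
  // Key strings are illustrative placeholders, not confirmed wire names.
  static final String DATA_FILE = "data_file";
  static final String DELETE_FILE_REFERENCES = "delete_file_references";

  // Builds the fields that would be written for one task.
  static Map<String, Object> toFields(String dataFilePath, List<Integer> deleteFileReferences) {
    Map<String, Object> fields = new LinkedHashMap<>();
    fields.put(DATA_FILE, dataFilePath);
    // Same condition as the fix: skip the field when the list is null or empty.
    if (deleteFileReferences != null && !deleteFileReferences.isEmpty()) {
      fields.put(DELETE_FILE_REFERENCES, deleteFileReferences);
    }
    return fields;
  }

  public static void main(String[] args) {
    System.out.println(toFields("a.parquet", Collections.<Integer>emptyList()));
    System.out.println(toFields("b.parquet", Arrays.asList(0, 1)));
  }
}
```

With this condition a task without delete files serializes with no references field at all, and the more lenient read-side check still accepts payloads from writers that emit an empty list.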