API: Fix default FileIO#newInputFile ManifestFile, DataFile and DeleteFile implementation to pass lengths #9953
ajantha-bhat
left a comment
Minor comment about the test validation.
Thanks for fixing it.
I think we need to decide whether a 1.5.1 release is needed because of this, since one extra I/O for each manifest and DataFile read seems like a problem to me.
ping @danielcweeks, @rdblue, @nastra
@ajantha-bhat Replied with my thoughts on the Trino PR. In short, I don't think a patch release is required. I'll summarize why I think that here: To be clear, there is no extra I/O being done in practice for What's remaining is the
…eFile implementations
Fokko
left a comment
This is a great catch @amogh-jahagirdar thanks for fixing this 👍
Thanks for the reviews @ajantha-bhat and @Fokko! Merging
…eFile implementations (apache#9953)
As part of adding encryption support, in #9592 we added some new FileIO APIs, namely
The overridden implementation in EncryptedFileIO is correct, but the default implementation in FileIO for these new APIs should pass in a length, since it is always known from the Iceberg metadata. Without this, FileIO implementations which end up calling these default implementations (specifically those of these new APIs) will make extra requests to the object store/file system to determine the length, which we can avoid.
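To illustrate the difference, here is a minimal, self-contained sketch. It does not use the real Iceberg classes: `TrackedFile`, `CountingFileIO`, `newInputFileWithoutLength` and the simulated 1024-byte lookup are all hypothetical stand-ins, and the nested `FileIO` and `InputFile` interfaces only mimic the shape of the real ones. It assumes a store where asking for a file's length costs one request unless the length was supplied up front.

```java
public class LengthAwareFileIOSketch {

  /** Hypothetical stand-in for a manifest, data file or delete file entry in the metadata. */
  interface TrackedFile {
    String path();
    long sizeInBytes();
  }

  /** Simplified stand-in for an input file handle. */
  interface InputFile {
    long getLength();
  }

  /** Simplified stand-in for a FileIO with default metadata-based overloads. */
  interface FileIO {
    InputFile newInputFile(String path);

    InputFile newInputFile(String path, long length);

    // Shape of the default before the fix: the known length is dropped.
    default InputFile newInputFileWithoutLength(TrackedFile file) {
      return newInputFile(file.path());
    }

    // Shape of the default after the fix: the length from metadata is passed along.
    default InputFile newInputFile(TrackedFile file) {
      return newInputFile(file.path(), file.sizeInBytes());
    }
  }

  /** Hypothetical FileIO that counts simulated length requests to the store. */
  static class CountingFileIO implements FileIO {
    int lengthRequests = 0;

    @Override
    public InputFile newInputFile(String path) {
      // Length unknown: the first getLength() call costs one simulated request.
      return () -> {
        lengthRequests++;
        return 1024L; // simulated answer from the store
      };
    }

    @Override
    public InputFile newInputFile(String path, long length) {
      // Length known from metadata: no request is needed.
      return () -> length;
    }
  }

  /** Returns how many length requests were made when reading one tracked file. */
  public static int lengthRequests(boolean passLength) {
    CountingFileIO io = new CountingFileIO();
    TrackedFile file = new TrackedFile() {
      @Override
      public String path() {
        return "s3://bucket/data/file.parquet"; // made-up location
      }

      @Override
      public long sizeInBytes() {
        return 1024L;
      }
    };
    InputFile input = passLength ? io.newInputFile(file) : io.newInputFileWithoutLength(file);
    input.getLength();
    return io.lengthRequests;
  }

  public static void main(String[] args) {
    System.out.println("without length: " + lengthRequests(false));
    System.out.println("with length: " + lengthRequests(true));
  }
}
```

In this sketch, the variant that drops the length triggers one simulated request per file, while the variant that forwards it triggers none.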