@Fokko
Contributor

@Fokko Fokko commented Dec 20, 2023

With retries with conflicting manifest merges. This makes the caching a bit more defensive, so the cache is emptied when cleaning up a commit.
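
For illustration, the defensive cleanup described above could look roughly like the sketch below. It is a simplified stand-in, not the actual Iceberg code: the class `CachedManifestCleanup`, its helper methods, and the use of plain `String` paths instead of `ManifestFile` objects are all assumptions made to keep the example self-contained. Only the field name `cachedNewDeleteManifests`, the `committed` check, and the `hasDeleteDeletes` flag come from the diff under review.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a snapshot producer that caches the delete
// manifests it has written. Paths are plain strings in this sketch.
class CachedManifestCleanup {
  private final List<String> cachedNewDeleteManifests = new ArrayList<>();
  // Records deletions instead of touching storage (stands in for deleteFile).
  private final List<String> deletedPaths = new ArrayList<>();

  void cacheManifest(String path) {
    cachedNewDeleteManifests.add(path);
  }

  // Called when cleaning up after a commit attempt. `committed` holds the
  // manifest paths that actually ended up in the committed snapshot.
  void cleanUncommitted(Set<String> committed) {
    boolean hasDeleteDeletes = false;
    for (String path : cachedNewDeleteManifests) {
      if (!committed.contains(path)) {
        deletedPaths.add(path);
        hasDeleteDeletes = true;
      }
    }
    if (hasDeleteDeletes) {
      // At least one cached manifest no longer exists, so drop the cache.
      // A retry then rewrites the manifests instead of reusing stale entries.
      cachedNewDeleteManifests.clear();
    }
  }

  List<String> deletedPaths() {
    return Collections.unmodifiableList(deletedPaths);
  }

  int cachedCount() {
    return cachedNewDeleteManifests.size();
  }

  public static void main(String[] args) {
    CachedManifestCleanup producer = new CachedManifestCleanup();
    producer.cacheManifest("delete-m1.avro");
    producer.cacheManifest("delete-m2.avro");
    producer.cleanUncommitted(Collections.singleton("delete-m1.avro"));
    System.out.println("deleted: " + producer.deletedPaths());
    System.out.println("still cached: " + producer.cachedCount());
  }
}
```

The point of the sketch is the last step: once any cached manifest has been deleted, the cache itself is emptied so that a retried commit cannot reference a file that is gone.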

@github-actions github-actions bot added the core label Dec 20, 2023
@Fokko Fokko marked this pull request as draft December 20, 2023 13:28
@Fokko Fokko force-pushed the fd-fix-deletes-as-well branch from 8f5da1e to 245ae82 Compare December 20, 2023 14:01
With retries with conflicting manifest merges.

Ryan pointed out that this might also occur with the deletes.
However, I was unable to replicate this with a test. I've added
the test that should uncover this issue when merging DELETE
manifests, and deleting the old one before the transaction
is successfully committed.
@Fokko Fokko force-pushed the fd-fix-deletes-as-well branch from 245ae82 to fa3a58e Compare December 20, 2023 14:09
@Fokko Fokko force-pushed the fd-fix-deletes-as-well branch from 0be28ac to d10d041 Compare December 20, 2023 19:22
@Fokko Fokko marked this pull request as ready for review December 20, 2023 19:26
for (ManifestFile cachedNewDeleteManifest : cachedNewDeleteManifests) {
if (!committed.contains(cachedNewDeleteManifest)) {
deleteFile(cachedNewDeleteManifest.path());
hasDeleteDeletes = true;
Contributor

@amogh-jahagirdar amogh-jahagirdar Dec 21, 2023

Nit: I think a better name would be clearCachedDeleteManifests (but I see this was just following the pattern for data file manifests)

Contributor

@amogh-jahagirdar amogh-jahagirdar Dec 21, 2023

Also another nit: I'd probably use the same pattern we did for the data manifest case where we just null it out and don't exercise the loop if it's null, but I see that the logic is the same with clearing the cached manifests and the loop.

Contributor

We can't null out cachedNewDeleteManifests because it's final, so the only thing the code does is clear that collection.

Contributor Author

I also looked at that. I think it is for performance reasons, since writer.toManifestFiles() returns a list. For the deletes, we do an addAll, which is O(n). I'm a bit torn: I like the performance optimization, but in practice I don't think we write that many manifests, so n is rather small. Therefore I prefer not nulling it out, to keep the code easier to read.
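
To make the trade-off in this thread concrete, here is a minimal sketch of the two reset styles side by side. The class `CacheResetStyles`, the field `cachedDataManifests`, and all method names are hypothetical and chosen only for illustration; `cachedNewDeleteManifests` is the one name taken from the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the two cache-reset styles discussed above.
class CacheResetStyles {
  // Style A (data manifest pattern): a non-final field that takes the
  // writer's list as-is and is set to null to invalidate the cache.
  private List<String> cachedDataManifests = null;

  // Style B (delete manifest pattern): a final collection that is filled
  // with addAll, an O(n) copy, and cleared to invalidate the cache.
  private final List<String> cachedNewDeleteManifests = new ArrayList<>();

  void cacheData(List<String> written) {
    this.cachedDataManifests = written; // no copy, just a reference swap
  }

  void cacheDeletes(List<String> written) {
    cachedNewDeleteManifests.addAll(written); // copies n elements
  }

  void invalidateData() {
    this.cachedDataManifests = null; // callers must null-check before looping
  }

  void invalidateDeletes() {
    cachedNewDeleteManifests.clear(); // the field is final, so it cannot be nulled
  }

  boolean hasCachedData() {
    return cachedDataManifests != null;
  }

  boolean hasCachedDeletes() {
    return !cachedNewDeleteManifests.isEmpty();
  }

  public static void main(String[] args) {
    CacheResetStyles styles = new CacheResetStyles();
    styles.cacheData(Arrays.asList("data-m1.avro"));
    styles.cacheDeletes(Arrays.asList("delete-m1.avro", "delete-m2.avro"));
    styles.invalidateData();
    styles.invalidateDeletes();
    System.out.println(styles.hasCachedData() + " " + styles.hasCachedDeletes());
  }
}
```

Style B never needs a null check around the cleanup loop, which is the readability argument made above; the cost is the element-wise copy in `addAll`, which only matters if the number of manifests is large.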

Contributor

@nastra nastra left a comment

LGTM. Also worth mentioning that @Fokko and I explored adding a test that passes with this fix and fails without it, but we were not able to come up with one. However, we still wanted to align the handling of delete manifests with how it was done for #9230.

@nastra nastra merged commit c340915 into apache:main Dec 21, 2023
nastra pushed a commit to nastra/iceberg that referenced this pull request Dec 21, 2023
nastra added a commit that referenced this pull request Dec 21, 2023
Co-authored-by: Fokko Driesprong <fokko@apache.org>
@rdblue
Contributor

rdblue commented Dec 21, 2023

Thanks for getting this in @nastra and @Fokko!

@Fokko Fokko deleted the fd-fix-deletes-as-well branch December 21, 2023 21:13
lisirrx pushed a commit to lisirrx/iceberg that referenced this pull request Jan 4, 2024
geruh pushed a commit to geruh/iceberg that referenced this pull request Jan 26, 2024
devangjhabakh pushed a commit to cdouglas/iceberg that referenced this pull request Apr 22, 2024
zhongyujiang pushed a commit to zhongyujiang/iceberg that referenced this pull request Apr 16, 2025