Conversation

@rdblue
Contributor

@rdblue rdblue commented Apr 29, 2022

Fixes #4666.

@github-actions github-actions bot added the core label Apr 29, 2022
@rdblue rdblue added this to the Iceberg 0.13.2 Release milestone Apr 29, 2022
@rdblue rdblue changed the title Core: Fix table corruption from OOM during commit cleanup Core: Fix table corruption from OOM during commit cleanup in Spark Apr 29, 2022
@dilipbiswal
Contributor

LGTM

Member

@RussellSpitzer RussellSpitzer left a comment


LGTM

@dilipbiswal
Contributor

@rdblue @RussellSpitzer
Should we move

LOG.info("Committed snapshot {} ({})", newSnapshotId.get(), getClass().getSimpleName());
into the following try block as a safety measure?

@rdblue
Contributor Author

rdblue commented Apr 29, 2022

@dilipbiswal, done.

@dilipbiswal
Contributor

Thanks a lot @rdblue


} catch (RuntimeException e) {
LOG.warn("Failed to load committed table metadata, skipping manifest clean-up", e);
} catch (Throwable e) {
Contributor


cleanAll is called on line 323, before this try-catch block. What is the further cleanup after this block? I only see notifyListeners.

Contributor


If we are talking about an OutOfMemoryError thrown during the first (commit) try-catch block (lines 286-324), then cleanAll won't be called either, since OutOfMemoryError is not a RuntimeException.
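The distinction matters because Error and RuntimeException sit on separate branches of the Throwable hierarchy, so a `catch (RuntimeException e)` clause lets an OutOfMemoryError propagate. A minimal, self-contained illustration (not Iceberg code; the `CatchDemo` class and `tryCatch` method are invented for this sketch):

```java
public class CatchDemo {
  // Classifies a Throwable the way the two catch clauses in the patch do.
  static String tryCatch(Throwable t) {
    try {
      throw t;
    } catch (RuntimeException e) {
      return "caught RuntimeException";
    } catch (Throwable e) {
      return "caught Throwable";
    }
  }

  public static void main(String[] args) {
    // IllegalStateException extends RuntimeException: the first clause fires
    System.out.println(tryCatch(new IllegalStateException()));
    // OutOfMemoryError extends Error, not RuntimeException: only the
    // catch (Throwable) clause can handle it
    System.out.println(tryCatch(new OutOfMemoryError()));
  }
}
```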

Contributor Author


cleanAll is called when the commit fails. If the commit fails, then there is no problem with Spark cleaning up the data files, or cleaning the metadata files that Iceberg creates.

Contributor


If an OutOfMemoryError is thrown from the try block (lines 326-346), this change will catch and swallow it, so it skips the further cleanup on lines 333-339. I am not following how this fixes issue #4666, where committed data files were deleted.

Member

@RussellSpitzer RussellSpitzer Apr 30, 2022


The issue was that a throwable not caught by the catch block here would be thrown to Spark, which would cause Spark to consider the commit failed and perform its own abort code, removing the committed files.

So the old behavior was

  1. Table Operations Commit (this can happen successfully)
  2. While cleaning up old files, or theoretically while calling notify listeners, a non-runtime exception is thrown
  3. The commit has been successful, but the exception is re-thrown to the Spark commit code
  4. The Spark commit code sees the exception and executes its abort method

So the fix is basically: never, ever throw an exception once the commit has occurred, regardless of what happens.
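The contract described above can be sketched as follows. This is illustrative only, not the actual SnapshotProducer code; `PostCommitDemo` and `commitThenClean` are invented names, with Runnables standing in for the commit and cleanup steps:

```java
public class PostCommitDemo {
  // Runs cleanup after a successful commit, suppressing anything it throws.
  static String commitThenClean(Runnable commit, Runnable cleanup) {
    commit.run(); // a failure here propagates, so the engine can abort safely
    try {
      cleanup.run();
    } catch (Throwable t) {
      // Nothing thrown after a successful commit may reach the engine;
      // otherwise Spark would call abort() and delete committed files.
      return "committed, cleanup failed: " + t.getClass().getSimpleName();
    }
    return "committed, cleanup ok";
  }

  public static void main(String[] args) {
    System.out.println(commitThenClean(() -> {}, () -> {}));
    System.out.println(
        commitThenClean(() -> {}, () -> { throw new OutOfMemoryError(); }));
  }
}
```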

Contributor

@stevenzwu stevenzwu Apr 30, 2022


@RussellSpitzer thanks a lot for the detailed explanation. I didn't realize it was Spark's abort flow that deleted the data files.

I am uneasy with swallowing fatal errors (like OutOfMemoryError) though. Should Spark only catch CommitFailedException and perform abort only for that specific exception? I assume Spark doesn't perform abort for CommitStateUnknownException.

Member

@RussellSpitzer RussellSpitzer Apr 30, 2022


@stevenzwu It's not in our code; it's in Spark's code, which doesn't have a defined exception for either of those states. Apache Spark basically has this code:

try {
  dataSource.commit()
} catch (anything) {
  dataSource.abort()
}

source

Spark doesn't know about any of our particular exceptions; it just wants to know whether its call to DataSource.commit failed. In this particular instance, the worry is that the Iceberg commit did succeed and no exceptions were thrown, but additional code ran after the Iceberg commit (like our cleanup code or notify listeners) and threw another exception. That exception surfaces as being thrown by DataSource.commit() and causes Spark to call DataSource.abort().

Spark itself cannot distinguish between an exception that happened before the actual commit and one that happened after it, within our DataSource commit method. If we wanted to let engines handle this, we could have our Iceberg SparkWrite try to do so, but I think that would probably be difficult to manage as well. I think it makes more sense to say that once committed, we suppress all exceptions. Our commit method in SnapshotProducer only throws exceptions when the commit fails, and in no other circumstances.

Contributor


Got it. I thought it was iceberg-spark.

When CommitStateUnknownException is thrown, the commit may have actually succeeded in the backend. If Spark performs an abort and deletes data files in this case, it could corrupt the table state. Could this happen, or is it taken care of by iceberg-spark?

Sorry for asking some newbie Spark questions.

Member


I think that may be possible... All of our tests at the moment just make sure that the metadata.json file is not incorrectly removed. We should probably add a test to make sure data files in Spark are not removed. I would think we would need a catch here:

    try {
      operation.commit(); // abort is automatically called if this fails
    } catch (CommitStateUnknownException commitStateUnknownException) {
      LOG.warn("Unknown Commit State", commitStateUnknownException);
    }

Contributor

@stevenzwu stevenzwu May 1, 2022


Catching and swallowing the CommitStateUnknownException is also not ideal. If the commit actually failed, that would lead Spark to falsely assume the commit was successful.

Ideally, Spark needs to distinguish these commit results, so that iceberg-spark can translate the Iceberg commit result into a Spark commit result:

  • Commit success: treat it as success.
  • Commit failure: treat it as failure and perform the abort.
  • Commit state unknown: treat it as failure, but don't perform the abort.

This discussion is probably outside the scope of this PR. This PR is fine: basically, it swallows any exceptions thrown from post-commit-success cleanup steps.
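That three-way mapping could be sketched as below. This is hypothetical: Spark's actual DataSourceV2 API has no such distinction today, and `CommitOutcomeDemo` and its names are invented for illustration:

```java
public class CommitOutcomeDemo {
  enum CommitResult { SUCCESS, FAILURE, UNKNOWN }

  // Hypothetical engine-side handling of the three outcomes listed above;
  // real Spark only distinguishes "commit() threw" from "commit() returned".
  static String handle(CommitResult result) {
    switch (result) {
      case SUCCESS:
        return "report success";
      case FAILURE:
        // nothing was committed, so aborting (deleting new files) is safe
        return "report failure and abort";
      case UNKNOWN:
        // the backend commit may have succeeded; aborting could delete
        // files that are already part of the table
        return "report failure without abort";
      default:
        throw new IllegalArgumentException("unexpected result: " + result);
    }
  }

  public static void main(String[] args) {
    System.out.println(handle(CommitResult.FAILURE));
    System.out.println(handle(CommitResult.UNKNOWN));
  }
}
```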

} catch (RuntimeException e) {
LOG.warn("Failed to load committed table metadata, skipping manifest clean-up", e);
} catch (Throwable e) {
LOG.warn("Failed to load committed table metadata or during cleanup, skipping further cleanup", e);
Member


Failed loading committed table metadata or during cleanup

or

Failed to load committed table metadata or failed during cleanup

@StefanXiepj
Contributor

This PR looks like it was copied from Core: Skipping manifest clean-up for all Error or Exception. #4507. @rdblue, why not continue on the original PR instead of copying it into a new one? I think we should encourage more contributors to join us.

@dilipbiswal @RussellSpitzer @stevenzwu @electrum cc, pls

Contributor

@kbendick kbendick left a comment


LGTM.

@rdblue rdblue merged commit 9ec7a8f into apache:master May 2, 2022
@rdblue
Contributor Author

rdblue commented May 2, 2022

Thanks for reviewing this, everyone! Good to have it fixed.

nastra pushed a commit to nastra/iceberg that referenced this pull request May 16, 2022
rdblue added a commit that referenced this pull request May 17, 2022

Successfully merging this pull request may close these issues.

Iceberg commit flow can abort and delete data files exposed to readers
