GlueTableOperations/DynamoDbTableOperations can delete current metadata file after incorrect exception handling #7151

@ChristinaTech

Description

Apache Iceberg version

1.1.0 (latest release)

Query engine

Spark

Please describe the bug 🐞

We recently encountered an issue whereby GlueTableOperations, while performing an Iceberg commit on behalf of GlueCatalog, can incorrectly interpret a successful commit as a failure and delete the now-current table metadata file as part of cleanup. This leaves the Iceberg table inaccessible, because the "current metadata pointer" now points to a deleted metadata file. We were able to recover by having an engineer manually call Glue APIs to repoint the table at the previous metadata file, but this represents an availability risk to our data lake service.
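
For context, the manual repair looked roughly like the following (a sketch with placeholder database/table names, not the exact commands we ran; metadata_location and previous_metadata_location are the table parameters Iceberg's GlueCatalog maintains on the Glue table):

```java
import java.util.HashMap;
import java.util.Map;
import software.amazon.awssdk.services.glue.GlueClient;
import software.amazon.awssdk.services.glue.model.Table;
import software.amazon.awssdk.services.glue.model.TableInput;

public class RepointGlueMetadata {
  public static void main(String[] args) {
    try (GlueClient glue = GlueClient.create()) {
      Table table =
          glue.getTable(r -> r.databaseName("database_name").name("table_name")).table();

      // Point metadata_location back at the previous, still-existing file.
      Map<String, String> params = new HashMap<>(table.parameters());
      params.put("metadata_location", params.get("previous_metadata_location"));

      TableInput input =
          TableInput.builder()
              .name(table.name())
              .tableType(table.tableType())
              .parameters(params)
              .storageDescriptor(table.storageDescriptor())
              .build();

      glue.updateTable(r -> r.databaseName("database_name").tableInput(input));
    }
  }
}
```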

The root cause appears to be the AWS SDK client's default of 3 attempts per API call, combined with the fact that Iceberg only inspects the exception thrown by the final attempt, as shown here:

org.apache.iceberg.exceptions.CommitFailedException: Cannot commit catalog_name.database_name.table_name because Glue detected concurrent update
Caused by: software.amazon.awssdk.services.glue.model.ConcurrentModificationException: Update table failed due to concurrent modifications. (Service: Glue, Status Code: 400, Request ID: <removed>)
Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 1 failure: Unable to execute HTTP request: Read timed out
Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 2 failure: Service returned error code ServiceUnavailableException (Service: Glue, Status Code: 500, Request ID: <removed>)

We were quickly able to determine that no other writers were running against this table during the incident, which means the ConcurrentModificationException had to come from one of the client's own prior attempts, which updated the catalog despite reporting a failure. Had Iceberg received the standard timeout exception, its exception handling would have correctly called checkCommitStatus and determined that the commit was actually successful. However, because it only saw the ConcurrentModificationException from the final attempt, it treated the commit as failed and performed cleanup it should not have. Notably, it would have exhibited the same incorrect behavior if the ServiceUnavailableException had come from the last attempt.
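
To illustrate the shape of the handling, here is a simplified sketch (placeholder names, not the exact Iceberg source): the ConcurrentModificationException path maps directly to a commit failure, while ambiguous exceptions go through the commit-status check.

```java
import org.apache.iceberg.exceptions.CommitFailedException;
import software.amazon.awssdk.core.exception.SdkException;
import software.amazon.awssdk.services.glue.GlueClient;
import software.amazon.awssdk.services.glue.model.ConcurrentModificationException;
import software.amazon.awssdk.services.glue.model.UpdateTableRequest;

class GlueCommitSketch {
  // Simplified shape of the commit path; not the exact Iceberg source.
  static void persistGlueTable(GlueClient glue, UpdateTableRequest request) {
    try {
      // The SDK retries internally (3 attempts by default); only the final
      // attempt's exception propagates out of this call.
      glue.updateTable(request);
    } catch (ConcurrentModificationException e) {
      // If an earlier attempt actually succeeded server-side, this "conflict"
      // is self-inflicted, yet the commit is treated as a clean failure and
      // the just-written metadata file is deleted during cleanup.
      throw new CommitFailedException(
          e, "Cannot commit %s because Glue detected concurrent update", "table_name");
    } catch (SdkException e) {
      // Ambiguous failures (timeouts, 5xx) take this path, where the commit
      // status is verified before any cleanup happens.
      checkCommitStatus();
    }
  }

  static void checkCommitStatus() {
    // Compares the catalog's current metadata pointer with the location this
    // commit tried to write; would have discovered the commit succeeded.
  }
}
```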

As expected, Iceberg then attempts to refresh its metadata and retry the commit. Unfortunately, it has just deleted the object that the metadata pointer references, resulting in:

org.apache.spark.SparkException: Writing job aborted
Caused by: org.apache.iceberg.exceptions.NotFoundException: Location does not exist: s3://fake-bucket-name/database_name.db/table_name/metadata/06814-ec5ff66c-af38-492c-ba38-55610536d9a7.metadata.json
Caused by: software.amazon.awssdk.services.s3.model.NoSuchKeyException: The specified key does not exist. (Service: S3, Status Code: 404, Request ID: <removed>, Extended Request ID: <removed>)

While investigating, I noticed the same sequence of events would also cause DynamoDbTableOperations, which uses an AWS client configured in the same way, to take the same incorrect action, with the same outcome of the table becoming inaccessible.
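
One possible mitigation, sketched below but not something we have validated, is to disable SDK-level retries on the catalog's Glue client (for example via a custom client factory supplied through the client.factory catalog property), so that Iceberg's own exception handling sees the first, ambiguous failure and calls checkCommitStatus instead of a self-inflicted ConcurrentModificationException from a later attempt:

```java
import software.amazon.awssdk.core.retry.RetryPolicy;
import software.amazon.awssdk.services.glue.GlueClient;

class NoRetryGlueClient {
  // Sketch: a Glue client with SDK-level retries disabled, so a timeout on
  // the first attempt surfaces directly to Iceberg's exception handling.
  static GlueClient create() {
    return GlueClient.builder()
        .overrideConfiguration(o -> o.retryPolicy(RetryPolicy.none()))
        .build();
  }
}
```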

Note: Some solution-specific details have been redacted from the error logs above.
