Spark write abort results in table missing its metadata location file  #8927

@dyno

try {
    persistTable(tbl, updateHiveTable);
    lock.ensureActive();
    commitStatus = CommitStatus.SUCCESS;
} catch (LockException le) {
    // The metastore update may already have been persisted, so the outcome is unknown.
    throw new CommitStateUnknownException(
        "Failed to heartbeat for hive lock while "
            + "committing changes. This can lead to a concurrent commit attempt be able to overwrite this commit. "
            + "Please check the commit history. If you are running into this issue, try reducing "
            + "iceberg.hive.lock-heartbeat-interval-ms.",
        le);
}

Recently we encountered a few cases where a write to an Iceberg table was aborted and the table then became unusable, failing with an error message like:

Caused by: org.apache.iceberg.exceptions.NotFoundException: Failed to open input stream for file: s3://some/path/to/table/metadata/13637-45c53fb2-5124-4891-ace3-c63ed91e1d26.metadata.json

The symptom seems to be that the Hive commit is persisted in the metastore, but the Spark write abort then deletes the metadata location file, which leaves the table unusable.
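To make the suspected sequence concrete, here is a minimal sketch of the invariant that appears to be violated: the new metadata file should only be removed when the commit is known to have failed. This is a toy model, not Iceberg's code; `CommitCleanupSketch`, `Outcome`, `safeToDeleteNewMetadata`, `readableAfterAbort`, and the file names are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: none of these names come from Iceberg.
class CommitCleanupSketch {
    // Hypothetical stand-in for the possible commit outcomes.
    enum Outcome { SUCCESS, FAILURE, UNKNOWN }

    // Deleting the new metadata file is safe only if the commit definitely did not land.
    static boolean safeToDeleteNewMetadata(Outcome outcome) {
        return outcome == Outcome.FAILURE;
    }

    // Returns true if the metastore pointer still references an existing file
    // after abort handling. pointerSwapped says whether the metastore update
    // was actually persisted, regardless of what the writer believes.
    static boolean readableAfterAbort(Outcome believedOutcome, boolean pointerSwapped) {
        Set<String> files = new HashSet<>();
        files.add("old.metadata.json");
        files.add("new.metadata.json");
        String pointer = pointerSwapped ? "new.metadata.json" : "old.metadata.json";
        if (safeToDeleteNewMetadata(believedOutcome)) {
            files.remove("new.metadata.json");
        }
        return files.contains(pointer);
    }

    public static void main(String[] args) {
        // The writer believes the commit failed, but the metastore was updated:
        // this is the state described in this issue.
        System.out.println("believed FAILURE, pointer swapped, readable: "
            + readableAfterAbort(Outcome.FAILURE, true));
        // Treating the same situation as UNKNOWN keeps the file in place.
        System.out.println("believed UNKNOWN, pointer swapped, readable: "
            + readableAfterAbort(Outcome.UNKNOWN, true));
    }
}
```

In this model the table only breaks when cleanup runs on a commit that actually landed, which matches what the error above and the S3 access log suggest, though the real code path may differ.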

E1024 22:07:51.303 pool-38-thread-273   o.a.s.s.e.d.v.OverwriteByExpressionExec:77] Data source write support IcebergBatchWrite(table=<redacted>, format=PARQUET) is aborting.
W1024 22:07:51.303 pool-38-thread-273   o.a.i.s.s.SparkWrite:226] Skipping cleanup of written files
E1024 22:07:51.303 pool-38-thread-273   o.a.s.s.e.d.v.OverwriteByExpressionExec:77] Data source write support IcebergBatchWrite(table=<redacted>, format=PARQUET) aborted.

Through the S3 access log we can confirm that the metadata location file was deleted along the way.

We then have to fix the table by restoring the metadata location in Hive to the previous one.

-- get previous_metadata_location.
show tblproperties xxx;

alter table xxx set tblproperties('metadata_location' = '{previous_metadata_location}');
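For repeated incidents, the manual step above could be scripted. The following is a rough sketch, assuming the table properties have already been read into a map (for example from the output of `show tblproperties`); the class and method names are hypothetical, and it only builds the statement rather than running it.

```java
import java.util.Map;

// Hypothetical helper: builds the ALTER TABLE statement used in the manual fix.
class RestoreMetadataLocation {
    static String restoreStatement(String table, Map<String, String> tblProperties) {
        String previous = tblProperties.get("previous_metadata_location");
        if (previous == null || previous.isEmpty()) {
            throw new IllegalStateException(
                "No previous_metadata_location recorded for " + table);
        }
        // Escape single quotes so the location cannot break out of the SQL literal.
        String escaped = previous.replace("'", "\\'");
        return "ALTER TABLE " + table
            + " SET TBLPROPERTIES ('metadata_location' = '" + escaped + "')";
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        Map<String, String> props = Map.of(
            "metadata_location", "s3://bucket/tbl/metadata/00002-new.metadata.json",
            "previous_metadata_location", "s3://bucket/tbl/metadata/00001-old.metadata.json");
        System.out.println(restoreStatement("db.tbl", props));
    }
}
```

Before running the generated statement, it is worth checking that the previous metadata file still exists, since this rollback discards whatever the aborted write committed.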
