The relevant code is in iceberg/hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java, lines 244 to 255 at commit 333227f:
```java
try {
  persistTable(tbl, updateHiveTable);
  lock.ensureActive();
  commitStatus = CommitStatus.SUCCESS;
} catch (LockException le) {
  throw new CommitStateUnknownException(
      "Failed to heartbeat for hive lock while "
          + "committing changes. This can lead to a concurrent commit attempt be able to overwrite this commit. "
          + "Please check the commit history. If you are running into this issue, try reducing "
          + "iceberg.hive.lock-heartbeat-interval-ms.",
      le);
```
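For context, `persistTable` may have already succeeded in the metastore by the time the lock heartbeat check fails, which is why a `CommitStateUnknownException` is thrown: the commit outcome cannot be determined, and callers are expected to leave all written files in place. Below is a minimal sketch of that expected handling on the caller side, assuming a generic `TableOperations` commit call (`CommitHandlingSketch`, `commitSafely`, and `cleanupWrittenFiles` are illustrative names, not actual Iceberg or Spark code):

```java
import org.apache.iceberg.TableMetadata;
import org.apache.iceberg.TableOperations;
import org.apache.iceberg.exceptions.CommitFailedException;
import org.apache.iceberg.exceptions.CommitStateUnknownException;

class CommitHandlingSketch {
  // ops, base, and updated stand in for a real commit call site.
  void commitSafely(TableOperations ops, TableMetadata base, TableMetadata updated) {
    try {
      ops.commit(base, updated);
    } catch (CommitStateUnknownException e) {
      // The commit may have been persisted in the metastore. Deleting any
      // written file here (data files, manifests, or the new metadata.json)
      // risks corrupting a table that actually committed, which is the
      // failure mode described in this issue. Re-throw without cleanup.
      throw e;
    } catch (CommitFailedException e) {
      // The commit definitely did not happen, so files written for this
      // attempt can be removed safely.
      cleanupWrittenFiles();
      throw e;
    }
  }

  private void cleanupWrittenFiles() {
    // Placeholder for deleting this attempt's data and metadata files.
  }
}
```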
Recently we encountered a few cases where a write to an Iceberg table aborted and the table was then left unusable, failing with an error like:

```
Caused by: org.apache.iceberg.exceptions.NotFoundException: Failed to open input stream for file: s3://some/path/to/table/metadata/13637-45c53fb2-5124-4891-ace3-c63ed91e1d26.metadata.json
```

The symptom seems to be that the Hive commit was persisted in the metastore, but the aborting Spark write then deleted the metadata file that the metastore now points to, leaving the table unusable:
```
E1024 22:07:51.303 pool-38-thread-273 o.a.s.s.e.d.v.OverwriteByExpressionExec:77] Data source write support IcebergBatchWrite(table=<redacted>, format=PARQUET) is aborting.
W1024 22:07:51.303 pool-38-thread-273 o.a.i.s.s.SparkWrite:226] Skipping cleanup of written files
E1024 22:07:51.303 pool-38-thread-273 o.a.s.s.e.d.v.OverwriteByExpressionExec:77] Data source write support IcebergBatchWrite(table=<redacted>, format=PARQUET) aborted.
```
Through the S3 access log we can confirm that the metadata location file was deleted along the way. We had to repair the table by restoring the metadata location in Hive to the previous one:
```sql
-- get previous_metadata_location
show tblproperties xxx;
alter table xxx set tblproperties('metadata_location' = '{previous_metadata_location}');
```
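Since the abort can delete files, it may be worth confirming that the previous metadata file still exists before pointing `metadata_location` back at it. A minimal sketch using Iceberg's `HadoopFileIO` (the path argument is a placeholder for the `previous_metadata_location` value from `SHOW TBLPROPERTIES`; this assumes S3 credentials and an s3 filesystem implementation are on the classpath):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.hadoop.HadoopFileIO;

public class CheckPreviousMetadata {
  public static void main(String[] args) {
    // Placeholder: the previous_metadata_location value from SHOW TBLPROPERTIES.
    String previousMetadataLocation = args[0];

    HadoopFileIO io = new HadoopFileIO(new Configuration());
    if (io.newInputFile(previousMetadataLocation).exists()) {
      System.out.println("Previous metadata file exists; safe to restore metadata_location.");
    } else {
      System.out.println("Previous metadata file is missing; try an older metadata.json.");
    }
  }
}
```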