@sumeetgajjar
Contributor

@sumeetgajjar sumeetgajjar commented Aug 26, 2022

#3056 added support for the PURGE flag in Spark.
However, the corresponding DROP TABLE documentation is outdated.
This PR documents the updated behavior of the DROP TABLE DDL command.

Closes #5646

@github-actions github-actions bot added the docs label Aug 26, 2022
@sumeetgajjar sumeetgajjar changed the title Update drop table behavior in spark-ddl docs Docs: Update drop table behavior in spark-ddl docs Aug 26, 2022
@sumeetgajjar sumeetgajjar changed the title Docs: Update drop table behavior in spark-ddl docs [Docs] Update drop table behavior in spark-ddl docs Aug 26, 2022
@sumeetgajjar
Contributor Author

A gentle ping @Fokko @samredai

Prior to 0.14, running `DROP TABLE` would remove the table from the catalog and delete the table contents as well.

From 0.14 onwards, `DROP TABLE` only removes the table from the catalog.
To delete the table contents, use `DROP TABLE PURGE`.
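
As a rough illustration of the two behaviors described above, the statements might look like the following in Spark SQL. The table name `prod.db.sample` is a hypothetical placeholder, and the two statements are alternatives rather than a sequence:

```sql
-- Hypothetical table name, used only for illustration.
-- Run one statement or the other, not both in sequence.

-- From 0.14 onwards: removes the table from the catalog only;
-- the table contents are left in place.
DROP TABLE prod.db.sample;

-- Removes the table from the catalog and also deletes the table contents.
DROP TABLE prod.db.sample PURGE;
```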
Contributor

Is it worth mentioning that gc.enabled has to be true for the contents to be deleted?

Contributor Author

I believe it is implied that any operation that removes table files requires gc.enabled to be true.
That said, I didn't find any occurrence of gc.enabled in the Iceberg Spark docs, so I decided to follow the same pattern.
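
For readers unfamiliar with the property under discussion, `gc.enabled` is an Iceberg table property. A minimal sketch of setting it before purging is shown below, assuming a hypothetical table named `prod.db.sample`. How `DROP TABLE PURGE` behaves when the property is false is not spelled out in this thread, so that should be checked against the Iceberg version in use:

```sql
-- Hypothetical table name, used only for illustration.
-- Make sure file deletion is allowed for this table.
ALTER TABLE prod.db.sample SET TBLPROPERTIES ('gc.enabled' = 'true');

-- Drop the table from the catalog and delete its contents.
DROP TABLE prod.db.sample PURGE;
```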

Contributor

@samredai samredai left a comment

LGTM, I just left a small suggestion. 👍


### `DROP TABLE PURGE`

To delete a table, run:
Contributor

To make the distinction a bit clearer with DROP TABLE, how about rewording this to:

To drop the table from the catalog and delete the table's contents, run:

Contributor Author

Done.

Contributor Author

Hi @samredai, thank you for your comment, took care of it in the latest commit.

@sumeetgajjar
Contributor Author

Hi @wypoon @samredai, thank you for your comments and review on this PR.
Could you also review the same change for the 0.14 branch: #5647?

@sumeetgajjar
Contributor Author

sumeetgajjar commented Sep 30, 2022

@pvary @flyrain @Fokko
Since this PR is already reviewed, can you please help us merge it?

@pvary pvary merged commit 36c95a0 into apache:master Sep 30, 2022
@pvary
Contributor

pvary commented Sep 30, 2022

Thanks for the fix, @sumeetgajjar, and thanks @wypoon and @samredai for the review!

@sumeetgajjar
Contributor Author

Thanks @pvary for the prompt response!
Should we also merge the cherry-pick of this fix to the 0.14 branch (#5647)?

@sumeetgajjar
Contributor Author

Thank you @pvary for merging this PR and @wypoon and @samredai for the review!

Successfully merging this pull request may close these issues.

Documented drop table behavior for Spark is outdated.

4 participants