@munendrasn
Contributor

Fixes #6763

  • The dynamodb.catalog.schema.format-version property specifies the schema version. The default value is v1
  • Doc updates and changes to the DynamoDbCatalog Python implementation are not included. I will address them once this is reviewed
  • Unlike GlueCatalog, DynamoDbCatalog allows tables to be created without a corresponding namespace. Again, this is not addressed in this PR, assuming it was by design. If not, I will update it based on the review
  • listTables fix is included in AWS: set lastEvaluatedKey for listTables in DynamoDb Catalog #6823

@jackye1995 FYI

@github-actions github-actions bot added the AWS label Feb 16, 2023
@jackye1995 jackye1995 self-requested a review February 21, 2023 04:23
@munendrasn
Contributor Author

Hi @jackye1995
This is ready for review. Kindly let me know if anything needs to be addressed before the review.


public static final String DYNAMODB_V2_SCHEMA_DEFAULT_TABLE_NAME = "iceberg_v2";

public static final String DYNAMODB_CATALOG_SCHEMA_FORMAT =

Contributor

I think we can use a simpler name, just dynamodb.schema-version should work? Any reason to create sublevels?

Contributor Author

The idea was to indicate that this property is related to the DynamoDb catalog, essentially to differentiate it from the DynamoDb lock manager.
Let me know if I should follow the current naming convention. Also, what about dynamodb.catalog.schema-version instead of the existing dynamodb.catalog.schema.format-version?

Contributor Author

Modified the config property to dynamodb.catalog.schema-version. Let me know if this looks good.

public static final String DYNAMODB_CATALOG_SCHEMA_FORMAT =
"dynamodb.catalog.schema.format-version";

public static final DynamoDbSchemaVersion DYNAMODB_DEFAULT_SCHEMA_VERSION =

Contributor

why not just use int for this?

Contributor Author

sure, will use int for versions

Contributor Author

changed version to use ints
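
For readers following the thread, here is a minimal sketch of what reading the schema version as a plain int could look like. It is not the PR's code: the class `SchemaVersionSketch` and the method `schemaVersion` are hypothetical, and the accepted set of versions (1 and 2) is an assumption based on the v1 and v2 schemas mentioned in this conversation. The property name and the default of 1 are the ones discussed in this review.

```java
import java.util.Map;

// Hypothetical helper for illustration only; not taken from the PR.
final class SchemaVersionSketch {
  // Property name and default as discussed in this review thread.
  static final String DYNAMODB_CATALOG_SCHEMA_VERSION = "dynamodb.catalog.schema-version";
  static final int DYNAMODB_DEFAULT_SCHEMA_VERSION = 1;

  private SchemaVersionSketch() {}

  // Returns the configured schema version, or the default when the property is absent.
  static int schemaVersion(Map<String, String> properties) {
    String raw = properties.get(DYNAMODB_CATALOG_SCHEMA_VERSION);
    if (raw == null) {
      return DYNAMODB_DEFAULT_SCHEMA_VERSION;
    }
    int version;
    try {
      version = Integer.parseInt(raw.trim());
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException("Schema version must be an integer: " + raw, e);
    }
    // Assumption: only versions 1 and 2 exist, as in this discussion.
    if (version != 1 && version != 2) {
      throw new IllegalArgumentException("Unsupported schema version: " + version);
    }
    return version;
  }

  public static void main(String[] args) {
    System.out.println(schemaVersion(Map.of()));
    System.out.println(schemaVersion(Map.of(DYNAMODB_CATALOG_SCHEMA_VERSION, "2")));
  }
}
```

An int keeps the property value and the constant directly comparable, at the cost of validating the range by hand.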

@munendrasn munendrasn force-pushed the dynamodb-namespace branch from c22ceaf to e24f7e0 Compare March 2, 2023 17:29
@munendrasn
Contributor Author

@jackye1995
I have incorporated the suggestions. Please let me know if any further changes are required.

Contributor

@jackye1995 jackye1995 left a comment

Mostly looks good to me, just a nit comment


public static final String DYNAMODB_TABLE_NAME_DEFAULT = "iceberg";

public static final String DYNAMODB_V2_SCHEMA_DEFAULT_TABLE_NAME = "iceberg_v2";

Contributor

nit: can we name it DYNAMODB_TABLE_NAME_DEFAULT_V2? And the default iceberg_v2 is a bit confusing, can we use iceberg_dynamodb_v2?

Contributor Author

Updated default name
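
As a rough illustration of how a per-version default table name might be selected, here is a small sketch. It assumes the reviewer's suggested constant name `DYNAMODB_TABLE_NAME_DEFAULT_V2` and value `iceberg_dynamodb_v2` were adopted as written, which this thread does not confirm; the class `DefaultTableNameSketch` and the method `defaultTableName` are hypothetical.

```java
// Hypothetical helper for illustration only; not taken from the PR.
final class DefaultTableNameSketch {
  // Existing default shown in the diff excerpt above.
  static final String DYNAMODB_TABLE_NAME_DEFAULT = "iceberg";
  // Assumption: the name and value suggested in the review comment.
  static final String DYNAMODB_TABLE_NAME_DEFAULT_V2 = "iceberg_dynamodb_v2";

  private DefaultTableNameSketch() {}

  // Picks the default DynamoDB table name for a given schema version.
  static String defaultTableName(int schemaVersion) {
    switch (schemaVersion) {
      case 1:
        return DYNAMODB_TABLE_NAME_DEFAULT;
      case 2:
        return DYNAMODB_TABLE_NAME_DEFAULT_V2;
      default:
        throw new IllegalArgumentException("Unsupported schema version: " + schemaVersion);
    }
  }

  public static void main(String[] args) {
    System.out.println(defaultTableName(1));
    System.out.println(defaultTableName(2));
  }
}
```

Keeping separate default names per version means a v2 catalog does not silently reuse a table that was laid out with the v1 schema.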


public static final String DYNAMODB_CATALOG_SCHEMA_VERSION = "dynamodb.catalog.schema-version";

public static final int DYNAMODB_DEFAULT_SCHEMA_VERSION = 1;

Contributor

I am wondering if we should just make it default to v2, since there are not many users with prod dependencies on the Dynamo catalog as of now. Any thoughts? @munendrasn @amogh-jahagirdar

Contributor Author

AFAIK, some of the bugs encountered in DynamoDbCatalog so far might not surface, depending on how users are using the catalog:

  • listTables is an uncommon operation, and the bug would surface only at higher cardinality
  • the default warehouse bug might not surface either if the user always provides an explicit location for the table, or if the warehouse path is not set in the namespace

There are probably users with dependencies on DynamoDbCatalog, so we would need to consider the impact on them.
Still, it might be okay to make v2 the default if we call this change out in the release as a major change. Just a cautionary note.
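
To make the listTables point concrete, here is an SDK-free sketch of the pagination pattern involved. Everything in it is hypothetical (`PaginationSketch`, `Page`, `query`, `listAll`) and it only simulates a paged source; it does not call DynamoDB or reproduce the catalog's code. It shows why a missing lastEvaluatedKey handoff would go unnoticed while all results fit in a single page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation for illustration only; no AWS SDK involved.
final class PaginationSketch {
  // Stand-in for one page of a paged query response.
  static final class Page {
    final List<String> items;
    final Integer lastEvaluatedKey; // null when there are no more pages

    Page(List<String> items, Integer lastEvaluatedKey) {
      this.items = items;
      this.lastEvaluatedKey = lastEvaluatedKey;
    }
  }

  private PaginationSketch() {}

  // Returns up to pageSize items, starting at exclusiveStartKey (or at 0 when null).
  static Page query(List<String> all, int pageSize, Integer exclusiveStartKey) {
    if (pageSize <= 0) {
      throw new IllegalArgumentException("pageSize must be positive");
    }
    int from = exclusiveStartKey == null ? 0 : exclusiveStartKey;
    int to = Math.min(from + pageSize, all.size());
    Integer next = to < all.size() ? to : null;
    return new Page(new ArrayList<>(all.subList(from, to)), next);
  }

  // Carries the last evaluated key into the next request until no key is returned.
  static List<String> listAll(List<String> all, int pageSize) {
    List<String> result = new ArrayList<>();
    Integer startKey = null;
    do {
      Page page = query(all, pageSize, startKey);
      result.addAll(page.items);
      startKey = page.lastEvaluatedKey;
    } while (startKey != null);
    return result;
  }

  public static void main(String[] args) {
    List<String> tables = List.of("t1", "t2", "t3", "t4", "t5");
    System.out.println(listAll(tables, 2));
  }
}
```

With few tables the first page already holds everything, so a loop that never forwards the key still returns a complete list; the gap shows up only once results span several pages.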

Contributor

Cool, let's keep it at v1 then, see how this goes, and flip the default after 1-2 releases.

@jackye1995
Contributor

@amogh-jahagirdar do you have any additional comments?

@munendrasn
Contributor Author

@jackye1995 @amogh-jahagirdar Please review. I have rebased the changes onto the latest main/master branch.

@munendrasn
Contributor Author

@jackye1995 @amogh-jahagirdar Please let me know if this is still required given the recent conversations around the catalog. If not, I think we can close this.

@munendrasn munendrasn closed this Apr 11, 2024

Successfully merging this pull request may close these issues.

ACL when using DynamoDb based Catalog