@Ppei-Wang
Contributor

Spark 3.4, 3.5: In the DESCRIBE EXTENDED view command, fix the wrong view catalog and namespace, and add the view text.
Before this fix, the view catalog and namespace shown by the DESCRIBE EXTENDED view command were the current catalog and namespace at the time the view was created, not the catalog and namespace of the view itself.

Adding the view text is compatible with the usage habits of other engines, such as the widely used Hive views.
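
To make the distinction above concrete, here is a minimal, self-contained Java sketch. `ViewInfo`, `describedBeforeFix`, and `describedAfterFix` are hypothetical names used only for illustration; they are not the Iceberg or Spark classes touched by this PR, and the sample catalog and namespace values are made up. The "before" method is a simplified model of the reported behavior, not the actual implementation.

```java
import java.util.List;

// Hypothetical stand-in types for illustration; not the classes changed in this PR.
class DescribeViewSketch {

  /** Minimal model of the view metadata relevant to DESCRIBE EXTENDED. */
  static final class ViewInfo {
    final String viewCatalog;              // catalog that holds the view
    final List<String> viewNamespace;      // namespace part of the view identifier
    final String creationCatalog;          // session catalog when CREATE VIEW ran
    final List<String> creationNamespace;  // session namespace when CREATE VIEW ran

    ViewInfo(
        String viewCatalog,
        List<String> viewNamespace,
        String creationCatalog,
        List<String> creationNamespace) {
      this.viewCatalog = viewCatalog;
      this.viewNamespace = viewNamespace;
      this.creationCatalog = creationCatalog;
      this.creationNamespace = creationNamespace;
    }
  }

  /** Simplified model of the reported bug: the creation-time session context is shown. */
  static String describedBeforeFix(ViewInfo view) {
    return view.creationCatalog + "." + String.join(".", view.creationNamespace);
  }

  /** Intended behavior: show where the view itself lives. */
  static String describedAfterFix(ViewInfo view) {
    return view.viewCatalog + "." + String.join(".", view.viewNamespace);
  }

  public static void main(String[] args) {
    // The view is created as test_namespace.<view> while the session sits in `default`.
    ViewInfo view =
        new ViewInfo(
            "spark_catalog", List.of("test_namespace"),
            "spark_catalog", List.of("default"));
    System.out.println(describedBeforeFix(view)); // spark_catalog.default
    System.out.println(describedAfterFix(view));  // spark_catalog.test_namespace
  }
}
```

The two values only differ when the view is created with a qualified name outside the session's current namespace, which is the case the description refers to.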

@github-actions github-actions bot added the spark label Dec 11, 2024
@Ppei-Wang
Contributor Author

@rdblue Hello, could you please help approve the workflow runs and assign a reviewer?

@nastra nastra self-requested a review December 12, 2024 07:36
toCatalystRow("View Catalog and Namespace", viewCatalogAndNamespace.quoted, "") ::
toCatalystRow("View Query Output Columns", outputColumns, "") ::
toCatalystRow("View Properties", viewProperties, "") ::
toCatalystRow("View Text", viewText, "") ::
Contributor

We omitted the view text on purpose. See also #9513 (comment) for some context.

Contributor Author

OK, I got it.

Contributor

I think it is worth leaving a code comment so that other developers avoid adding the view text in the future.

Contributor Author

@ebyhr Thanks for your suggestion. I think you are right, so I added a code comment.

}

@TestTemplate
public void describeExtendedViewWithoutCurrentNamespace() {
Contributor

@nastra nastra Dec 12, 2024

I took a look at the issue and I think it lies less with DESCRIBE itself and more with the namespace being used during view creation:

These 2 tests should be all we need

@TestTemplate
  public void createViewInDefaultNamespace() {
    String viewName = viewName("createViewInDefaultNamespace");
    String sql = String.format("SELECT id, data FROM %s WHERE id <= 3", tableName);

    sql("CREATE VIEW %s (id, data) AS %s", viewName, sql);
    View view = viewCatalog().loadView(TableIdentifier.of(NAMESPACE, viewName));
    assertThat(view.currentVersion().defaultCatalog()).isNull();
    assertThat(view.currentVersion().defaultNamespace()).isEqualTo(NAMESPACE);
  }

  @TestTemplate
  public void createViewWithoutCurrentNamespace() {
    String viewName = viewName("createViewWithoutCurrentNamespace");
    Namespace namespace = Namespace.of("test_namespace");
    String sql = String.format("SELECT id, data FROM %s WHERE id <= 3", tableName);

    sql("CREATE NAMESPACE IF NOT EXISTS %s", namespace);
    sql("CREATE VIEW %s.%s (id, data) AS %s", namespace, viewName, sql);
    View view = viewCatalog().loadView(TableIdentifier.of(namespace, viewName));
    assertThat(view.currentVersion().defaultCatalog()).isNull();
    assertThat(view.currentVersion().defaultNamespace()).isEqualTo(namespace);
  }

I still need to double-check some things, but most likely the fix we need is this:

--- a/spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateV2ViewExec.scala
+++ b/spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateV2ViewExec.scala
@@ -48,7 +48,7 @@ case class CreateV2ViewExec(
   override protected def run(): Seq[InternalRow] = {
     val currentCatalogName = session.sessionState.catalogManager.currentCatalog.name
     val currentCatalog = if (!catalog.name().equals(currentCatalogName)) currentCatalogName else null
-    val currentNamespace = session.sessionState.catalogManager.currentNamespace
+    val currentNamespace = ident.namespace()

Contributor Author

@Ppei-Wang Ppei-Wang Dec 12, 2024

@nastra But org/apache/spark/sql/connector/catalog/View.java defines the following:

public interface View {
...
/**
  * The current catalog when the view is created.
  */
 String currentCatalog();

 /**
  * The current namespace when the view is created.
  */
 String[] currentNamespace();
...
}

I think the changes above conflict with this a bit?
My suggestion is to add the catalog and namespace of the view itself to spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkView.java, and to change the initialization accordingly in the create view command and the desc extended view command. What do you think of this idea?
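
One reason the creation-time namespace matters, sketched under simplifying assumptions: unqualified table references inside the view SQL are resolved against the namespace that was current when the view was created, so overwriting it with the view's own namespace could change what the view query points at. The class and method names below (`DefaultNamespaceSketch`, `resolve`) are hypothetical, and the logic ignores catalogs, quoting, and everything else a real resolver handles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Spark or Iceberg.
class DefaultNamespaceSketch {

  /**
   * Qualifies a table reference from the view SQL. A reference that already
   * contains a namespace is returned as-is; an unqualified one is prefixed
   * with the namespace stored at view creation time.
   */
  static List<String> resolve(List<String> creationNamespace, String reference) {
    List<String> parts = Arrays.asList(reference.split("\\."));
    if (parts.size() > 1) {
      return parts; // already qualified, the stored namespace is not used
    }
    List<String> resolved = new ArrayList<>(creationNamespace);
    resolved.add(reference);
    return resolved;
  }

  public static void main(String[] args) {
    List<String> creationNamespace = List.of("default");
    // The view lives in test_namespace, but its SQL was written against `default`.
    System.out.println(String.join(".", resolve(creationNamespace, "tbl")));          // default.tbl
    System.out.println(String.join(".", resolve(creationNamespace, "other_ns.tbl"))); // other_ns.tbl
  }
}
```

Under this model, replacing the stored namespace with `test_namespace` would make the unqualified `tbl` resolve to `test_namespace.tbl`, which is not what the view author wrote the query against.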

Contributor

Sorry, I actually confused myself and that's exactly the part I had to double-check. So you're right about how the current namespace is being used.
Can you please update the tests to this?

+
+  @TestTemplate
+  public void createAndDescribeViewInDefaultNamespace() {
+    String viewName = viewName("createViewInDefaultNamespace");
+    String sql = String.format("SELECT id, data FROM %s WHERE id <= 3", tableName);
+
+    sql("CREATE VIEW %s (id, data) AS %s", viewName, sql);
+    TableIdentifier identifier = TableIdentifier.of(NAMESPACE, viewName);
+    View view = viewCatalog().loadView(identifier);
+    assertThat(view.currentVersion().defaultCatalog()).isNull();
+    assertThat(view.name()).isEqualTo(ViewUtil.fullViewName(catalogName, identifier));
+    assertThat(view.currentVersion().defaultNamespace()).isEqualTo(NAMESPACE);
+
+    String location = viewCatalog().loadView(identifier).location();
+    assertThat(sql("DESCRIBE EXTENDED %s.%s", NAMESPACE, viewName))
+        .contains(
+            row("id", "int", ""),
+            row("data", "string", ""),
+            row("", "", ""),
+            row("# Detailed View Information", "", ""),
+            row("Comment", "", ""),
+            row("View Catalog and Namespace", String.format("%s.%s", catalogName, NAMESPACE), ""),
+            row("View Query Output Columns", "[id, data]", ""),
+            row(
+                "View Properties",
+                String.format(
+                    "['format-version' = '1', 'location' = '%s', 'provider' = 'iceberg']",
+                    location),
+                ""));
+  }
+
+  @TestTemplate
+  public void createAndDescribeViewWithoutCurrentNamespace() {
+    String viewName = viewName("createViewWithoutCurrentNamespace");
+    Namespace namespace = Namespace.of("test_namespace");
+    String sql = String.format("SELECT id, data FROM %s WHERE id <= 3", tableName);
+
+    sql("CREATE NAMESPACE IF NOT EXISTS %s", namespace);
+    sql("CREATE VIEW %s.%s (id, data) AS %s", namespace, viewName, sql);
+    TableIdentifier identifier = TableIdentifier.of(namespace, viewName);
+    View view = viewCatalog().loadView(identifier);
+    assertThat(view.currentVersion().defaultCatalog()).isNull();
+    assertThat(view.name()).isEqualTo(ViewUtil.fullViewName(catalogName, identifier));
+    assertThat(view.currentVersion().defaultNamespace()).isEqualTo(NAMESPACE);
+
+    String location = viewCatalog().loadView(identifier).location();
+    assertThat(sql("DESCRIBE EXTENDED %s.%s", namespace, viewName))
+        .contains(
+            row("id", "int", ""),
+            row("data", "string", ""),
+            row("", "", ""),
+            row("# Detailed View Information", "", ""),
+            row("Comment", "", ""),
+            row("View Catalog and Namespace", String.format("%s.%s", catalogName, namespace), ""),
+            row("View Query Output Columns", "[id, data]", ""),
+            row(
+                "View Properties",
+                String.format(
+                    "['format-version' = '1', 'location' = '%s', 'provider' = 'iceberg']",
+                    location),
+                ""));
+  }

Contributor Author

@nastra Thanks for your suggestions, I added the tests above. Do you think it is necessary to modify spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkView.java, the create view command, and the desc extended view command?

Contributor

@Ppei-Wang what exactly did you want to modify in the other places? All the other places should be correct (because they refer to the default catalog that was set when the view was created)

Contributor Author

@nastra OK, I got it. Sorry, I was a bit influenced by the previous comment. There is no problem with following my original idea.

systemProp.defaultHiveVersions=2
systemProp.knownHiveVersions=2,3
-systemProp.defaultSparkVersions=3.5
+systemProp.defaultSparkVersions=3.5,3.4
Contributor

Suggested change
-systemProp.defaultSparkVersions=3.5,3.4
+systemProp.defaultSparkVersions=3.5

Contributor Author

@nastra Thanks for the suggestion, updated.

Contributor

@nastra nastra left a comment

LGTM, thanks for fixing this @Ppei-Wang

@nastra nastra merged commit b428fbc into apache:main Dec 18, 2024
31 checks passed