MR: support imported data reads in input format by using name mapping #3312

edgarRd · 2021-10-18T22:59:54Z

This adds name mapping to the readers built in IcebergInputFormat. It allows Hive to read data files imported during a Hive -> Iceberg table migration. Without it, the read fails with an NPE while trying to resolve the Iceberg types in the Parquet/ORC files.
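To illustrate the idea, here is a minimal sketch of how a name mapping can stand in for missing field IDs. It does not use Iceberg's classes: `NameMappingSketch`, `assignIds`, and the column names are hypothetical, and the mapping is modeled as a plain name-to-ID map. It only shows the fallback from column names to field IDs, and one possible way to treat unmapped columns as missing instead of failing.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a name mapping; this is not Iceberg's actual API.
public class NameMappingSketch {

  // Assigns a field ID to each column found in an imported file by looking up
  // its name in the mapping. Columns with no mapping entry are skipped, so a
  // reader could treat them as missing rather than dereferencing a null ID.
  static Map<String, Integer> assignIds(List<String> fileColumns, Map<String, Integer> nameToId) {
    Map<String, Integer> resolved = new LinkedHashMap<>();
    for (String column : fileColumns) {
      Integer id = nameToId.get(column);
      if (id != null) {
        resolved.put(column, id);
      }
    }
    return resolved;
  }

  public static void main(String[] args) {
    // Assumed mapping for a migrated table: column name -> table field ID.
    Map<String, Integer> mapping = new HashMap<>();
    mapping.put("id", 1);
    mapping.put("data", 2);

    // An imported file written by Hive carries column names but no field IDs.
    List<String> importedFileColumns = List.of("id", "data", "legacy_col");

    Map<String, Integer> resolved = assignIds(importedFileColumns, mapping);
    System.out.println(resolved); // prints {id=1, data=2}
  }
}
```

In this sketch the null check plays the role that the `nameMapping != null` guard plays in the change: the mapping is applied only when one is available.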

Since the change is pretty straightforward - and we have this same behavior in all other readers - I've not added more unit tests. We also already have extensive name mapping unit tests. However, if folks feel we need a unit test for this I'm happy to add it. I've validated this change by reading via Hive using the IcebergInputFormat on a migrated table.

PTAL @pvary @aokolnychyi

rdblue · 2021-10-18T23:39:07Z

mr/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java

      if (reuseContainers) {
        avroReadBuilder.reuseContainers();
      }
+      if (nameMapping != null) {


Nit: empty line between control flow statements

rdblue

Looks good to me. A minor nit on formatting, but otherwise good.

rdblue · 2021-10-18T23:40:17Z

Thanks, @edgarRd!

edgarRd · 2021-10-18T23:57:45Z

Thanks, @rdblue!

MR: support imported data reads in input format by using name mapping

32863b9

github-actions bot added the MR label Oct 18, 2021

rdblue reviewed Oct 18, 2021

rdblue approved these changes Oct 18, 2021

rdblue merged commit 34e72b5 into apache:master Oct 18, 2021

edgarRd deleted the mr-name-mapping branch October 19, 2021 14:12