@ismailsimsek
Contributor

@ismailsimsek ismailsimsek commented Jan 9, 2025

resolves #10844
resolves #11914

Copied over the kafka-connect-transforms code with no code changes,
applied code formatting,
and updated build.gradle accordingly.

cc: @bryanck could you please take a look at this when you have a chance?

@ismailsimsek
Contributor Author

@jbonofre @bryanck it's ready for review.

@Fokko Fokko requested a review from bryanck January 15, 2025 14:04
@ismailsimsek ismailsimsek force-pushed the kafka-smt-copy branch 2 times, most recently from 2e2c727 to 9e64c5b Compare January 15, 2025 16:45
@bryanck
Contributor

bryanck commented Jan 16, 2025

Thanks @ismailsimsek for porting this over!

@ismailsimsek ismailsimsek force-pushed the kafka-smt-copy branch 2 times, most recently from 823cc83 to 34445e6 Compare January 16, 2025 17:56
@liko9
Contributor

liko9 commented Feb 15, 2025

Can someone please add this to the milestone for 1.9.0?

@Fokko Fokko added this to the Iceberg 1.9.0 milestone Feb 15, 2025
@ismailsimsek
Contributor Author

@Fokko @bryanck @danielcweeks @jbonofre could you please review?

Currently, I believe the only open question is about the LICENSE/NOTICE changes; I'm not sure what to do there. Should we merge and follow up with a new PR for the LICENSE/NOTICE updates?

@jbonofre
Member

@ismailsimsek let me do a pass and suggest updates to LICENSE/NOTICE if needed.

@jbonofre
Member

I did a pass on LICENSE/NOTICE.

First, in kafka-connect-transforms, I see use of the debezium package, but for specific code (not code copied from debezium), so it's OK.

As kafka-connect-transforms is used in the kafka-connect-runtime distributions, I did a pass on the transitive dependencies.
Here's what I see in the PR:

  1. The versions in LICENSE don't match the ones in the distributions (I suggest doing a rebase to fix that).
  2. bson should be in LICENSE. As bson comes from https://github.com/mongodb/mongo-java-driver and there's no NOTICE there, there is no need to update NOTICE in the distributions. The LICENSE change should be made in this PR.
  3. detector-resources-support (from Google OpenTelemetry) and exporter-metrics (from Google OpenTelemetry) are not in LICENSE. I will check that (on main) as I think I fixed it already.

So, @ismailsimsek, specifically for this PR, bson should be added to LICENSE. If you want, I can create a commit in this PR for that.

@ismailsimsek
Contributor Author

Thanks for the review, @jbonofre, I appreciate your feedback. Yes, that's OK with me; feel free to commit it directly to the PR, otherwise I can incorporate the changes soon.

bryanck and others added 15 commits February 18, 2025 17:20
(cherry picked from commit 639b0d5b41b827d984aae04efe594315ec2b2b91)
(cherry picked from commit 63cde8e7f6c12392c7741922d5e6ad807051f24a)
(cherry picked from commit b9cd15de938e57bf178e1ccb443e481fde881224)
(cherry picked from commit c17c6f734dac96975959e6165416396e4058332c)
(cherry picked from commit 03dcf40b484f40f62c551b9ddf5cefea93a3440a)
(cherry picked from commit d0adaf9f961ceb89aaa408c03874788b3cf2c422)
(cherry picked from commit 5812322e595cee663d920aedaed21998fffa9bdf)
(cherry picked from commit bf82d607dc2b5e816c8b6f59bcbdc48281154e98)
(cherry picked from commit 89f533b2e689cbd1935c3bd1b82eea5e9dc0cd07)
(cherry picked from commit 92e4d984fe41c20faf68b1c36e6fd20759e0a19f)
* smt-nested-json-as-map

- parse json objects into Maps rather than Structs prior to handing to the iceberg connector, for users with unstructured json data.

(cherry picked from commit 303435aa794d8df1728f83ca5179e896b17ca4ff)
* option-to-inject-kafka-metadata

- SMT to add Kafka metadata (topic, partition, offset, timestamp) to Struct and Map types

(cherry picked from commit 423b4a8b0f2e42f2dd7de315631e944c285dcb09)
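
For readers unfamiliar with the idea behind the metadata commit above, here is a minimal sketch in plain Java of adding Kafka metadata (topic, partition, offset, timestamp) to a Map-typed record value. It is an illustration only, not the code ported in this PR: the class name `KafkaMetadataSketch`, the method `withMetadata`, and the `_kafka_metadata_` field prefix are all hypothetical, and it deliberately avoids the Kafka Connect API so that it runs on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not the transform shipped in this PR.
class KafkaMetadataSketch {
  // Assumed field prefix, chosen for this example only.
  static final String PREFIX = "_kafka_metadata_";

  // Returns a copy of the record value with the Kafka coordinates added.
  // The input map is left unmodified.
  static Map<String, Object> withMetadata(
      Map<String, Object> value, String topic, int partition, long offset, Long timestamp) {
    Map<String, Object> result = new LinkedHashMap<>(value);
    result.put(PREFIX + "topic", topic);
    result.put(PREFIX + "partition", partition);
    result.put(PREFIX + "offset", offset);
    // A record may carry no timestamp, so the field is added only when present.
    if (timestamp != null) {
      result.put(PREFIX + "timestamp", timestamp);
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Object> value = new LinkedHashMap<>();
    value.put("id", 42);
    System.out.println(withMetadata(value, "orders", 0, 1001L, 1700000000000L));
  }
}
```

Copying the value instead of mutating it keeps the sketch side-effect free; the commit also covers Struct values, which would additionally need the new fields added to the schema.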
* matf-non-flattening-mongodb-debezium-smt

- adds debezium mongo SMT for converting BSON before/after into typed Struct before/after

(cherry picked from commit 21d741e53ce77547edbb5838f1b2b49db619be0c)
@ismailsimsek
Contributor Author

@jbonofre rebased and updated the LICENSE file. Please feel free to commit or change anything that is missing.

@jbonofre
Member

@ismailsimsek awesome! Thanks. I'm preparing a PR for your branch 😄

@ismailsimsek
Contributor Author

@jbonofre @danielcweeks @Fokko @ajantha-bhat could you please review? I believe all open points are addressed.

Note: We'll want to merge this without squashing.

It looks good! We'll want to merge this without squashing.

@bryanck
Contributor

bryanck commented Feb 24, 2025

It is OK with me if we lose my history. Could you look into the failing tests though?

@jbonofre
Member

@ismailsimsek I created a PR to "fix" the LICENSE/NOTICE: ismailsimsek#201. Maybe worth considering?

Update kafka-connect-runtime distributions LICENSE and NOTICE with addition of transforms
@ismailsimsek
Contributor Author

ismailsimsek commented Feb 24, 2025

@jbonofre merged it, and the tests are passing now.

@bryanck
Contributor

bryanck commented Feb 24, 2025

LGTM, I'll leave it open for one more day and see if there is any more feedback.

@bryanck bryanck merged commit 7d9e96f into apache:main Feb 25, 2025
43 checks passed
@bryanck
Contributor

bryanck commented Feb 25, 2025

Thanks @ismailsimsek for porting this over!

Successfully merging this pull request may close these issues.

Official iceberg kafka-connect is missing SMTs from original Databricks/Tabular repository
Kafka Connect: Add SMTs for Debezium and AWS DMS

7 participants