
Conversation

@linguoxuan linguoxuan commented Jan 12, 2026

This closes FLINK-38889.

Purpose

This PR fixes an issue where the YAML Kafka sink connector fails to serialize complex types (MAP, ARRAY, ROW) to JSON formats (Debezium / Canal), even though the Kafka SQL connector handles them without problems.
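For context, the failure surfaces in YAML-defined pipelines like the minimal sketch below. All hostnames, credentials, and table names are placeholders, and the option keys are assumptions based on the Flink CDC Kafka pipeline connector docs, where `value.format` selects `debezium-json` or `canal-json`:

```yaml
# Hypothetical minimal pipeline: a MySQL source whose tables contain
# complex-typed columns, written to Kafka as Debezium-style JSON.
source:
  type: mysql
  hostname: localhost
  port: 3306
  username: app_user        # placeholder credentials
  password: app_pass
  tables: app_db.orders

sink:
  type: kafka
  properties.bootstrap.servers: localhost:9092
  # Either JSON format was affected; canal-json hits the same code path.
  value.format: debezium-json

pipeline:
  name: mysql-to-kafka
  parallelism: 1
```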

Root Cause

The issue was in the TableSchemaInfo class, which is responsible for converting CDC's RecordData format to Flink's RowData format before JSON serialization. The createFieldGetter() method lacked the necessary conversion logic for complex types.

Changes

  1. Added complex type conversion methods in TableSchemaInfo.java to support the ARRAY, MAP, and ROW types
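The fix follows the field-getter pattern: build a converter per column up front, recursing into element, key, value, and nested-row types so arbitrarily nested ARRAY/MAP/ROW values are handled. The snippet below is a self-contained sketch of that pattern using stand-in type classes; the class names mirror the Flink type system but none of this is the actual TableSchemaInfo code:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class FieldGetterSketch {

    // Minimal stand-ins for the logical types; illustrative only, not the real API.
    interface LogicalType {}
    static class IntType implements LogicalType {}
    static class StringType implements LogicalType {}
    static class ArrayType implements LogicalType {
        final LogicalType elementType;
        ArrayType(LogicalType elementType) { this.elementType = elementType; }
    }
    static class MapType implements LogicalType {
        final LogicalType keyType, valueType;
        MapType(LogicalType k, LogicalType v) { keyType = k; valueType = v; }
    }
    static class RowType implements LogicalType {
        final List<LogicalType> fieldTypes;
        RowType(List<LogicalType> fieldTypes) { this.fieldTypes = fieldTypes; }
    }

    // Build one converter per field up front; recursion into nested types
    // gives arbitrary-depth ARRAY/MAP/ROW support.
    static Function<Object, Object> createConverter(LogicalType type) {
        if (type instanceof IntType) {
            return v -> v; // primitives pass through unchanged
        } else if (type instanceof StringType) {
            return v -> String.valueOf(v);
        } else if (type instanceof ArrayType) {
            Function<Object, Object> elem = createConverter(((ArrayType) type).elementType);
            return v -> {
                List<Object> out = new ArrayList<>();
                for (Object e : (List<?>) v) out.add(elem.apply(e));
                return out;
            };
        } else if (type instanceof MapType) {
            Function<Object, Object> key = createConverter(((MapType) type).keyType);
            Function<Object, Object> val = createConverter(((MapType) type).valueType);
            return v -> {
                Map<Object, Object> out = new LinkedHashMap<>();
                for (Map.Entry<?, ?> e : ((Map<?, ?>) v).entrySet()) {
                    out.put(key.apply(e.getKey()), val.apply(e.getValue()));
                }
                return out;
            };
        } else { // RowType: one pre-built converter per field
            List<Function<Object, Object>> fields = new ArrayList<>();
            for (LogicalType ft : ((RowType) type).fieldTypes) fields.add(createConverter(ft));
            return v -> {
                List<?> row = (List<?>) v;
                List<Object> out = new ArrayList<>();
                for (int i = 0; i < fields.size(); i++) out.add(fields.get(i).apply(row.get(i)));
                return out;
            };
        }
    }

    public static void main(String[] args) {
        // ARRAY<ROW<INT, STRING>>: the kind of nested shape the old code rejected.
        LogicalType type = new ArrayType(new RowType(List.of(new IntType(), new StringType())));
        Object converted = createConverter(type).apply(List.of(List.of(1, "a"), List.of(2, "b")));
        System.out.println(converted);
    }
}
```

Building the converters once per schema, rather than dispatching on types for every record, mirrors how Flink's own RowData field getters are structured.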

Testing

  1. TableSchemaInfoTest.java
  2. DebeziumJsonSerializationSchemaTest.java
  3. CanalJsonSerializationSchemaTest.java
  4. KafkaDataSinkITCase.java

@yuxiqian yuxiqian left a comment

Thanks Skyler for the quick fix! Just left some trivial comments.

@linguoxuan

Thanks @yuxiqian for the review! I have made the changes as suggested. Since the suggestions focused on code formatting, I force-pushed the code to make it clearer. PTAL.

yuxiqian commented Jan 13, 2026

Thanks for the quick response! Just pushed another commit to simplify IT case and docs style.

Would @lvyanquan like to take another look?

Copilot AI left a comment

Pull request overview

This PR adds support for serializing complex types (MAP, ARRAY, ROW) to JSON format in the Kafka sink connector for both Debezium and Canal JSON formats. Previously, only the Kafka SQL connector supported these complex types, while the YAML-configured Kafka sink connector would fail when encountering them.
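For intuition, a change event for a row with complex-typed columns serializes roughly as follows under the Debezium JSON envelope (before/after/op layout per the debezium-json format; the column names and values here are made up for illustration):

```json
{
  "before": null,
  "after": {
    "id": 1,
    "tags": ["a", "b"],
    "attrs": {"color": "red", "size": "L"},
    "address": {"city": "Beijing", "zip": "100000"}
  },
  "op": "c"
}
```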

Changes:

  • Refactored type conversion logic from TableSchemaInfo into a new RecordDataConverter utility class
  • Added conversion support for ARRAY, MAP, and ROW types with recursive handling for nested structures
  • Added comprehensive test coverage including unit tests and integration tests for various complex type scenarios

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Summary per file:

  • RecordDataConverter.java: New utility class that handles conversion of CDC RecordData to Flink SQL RowData, including support for complex types (ARRAY, MAP, ROW) with recursive nesting
  • TableSchemaInfo.java: Refactored to delegate field getter creation to RecordDataConverter, removing duplicate conversion logic
  • TableSchemaInfoTest.java: Added test for nested ROW types within ARRAY to verify complex type conversion
  • DebeziumJsonSerializationSchemaTest.java: Added test to verify Debezium JSON serialization of complex types
  • CanalJsonSerializationSchemaTest.java: Added test to verify Canal JSON serialization of complex types
  • KafkaDataSinkITCase.java: Added comprehensive integration tests covering basic complex types, nested arrays, maps with array values, null/empty collections, and deeply nested structures

linguoxuan commented Jan 24, 2026

Hi @lvyanquan, I have modified the code as suggested by Copilot. Please take another look if you have time.
