Flink: FLIP-27 Iceberg source and builder that puts everything together #5109
…P-27 Flink source does a deep copy to array pool RowData
| protected ArrayData buildList(ReusableArrayData list) { | ||
| list.setNumElements(writePos); | ||
| return list; | ||
| // Since ReusableArrayData is not accepted by Flink, use GenericArrayData temporarily to walk around it. |
The FLIP-27 source reader needs to do a deep copy when handing over a batch/array of records from the reader thread to the Flink operator thread. I reverted this change from PR #4712 so that the CI build for this PR can pass for now.
We should discuss how to fix this going forward. @yittg, I would love to get your input.
We may need to update/fix the ArrayDataSerializer#copy method from Flink first.
@Override
public ArrayData copy(ArrayData from) {
  if (from instanceof GenericArrayData) {
    return copyGenericArray((GenericArrayData) from);
  } else if (from instanceof ColumnarArrayData) {
    return copyColumnarArray((ColumnarArrayData) from);
  } else if (from instanceof BinaryArrayData) {
    return ((BinaryArrayData) from).copy();
  } else {
    return toBinaryArray(from);
  }
}
I just realized that ReusableArrayData actually comes from the Iceberg code, in FlinkParquetReaders.
I don't really get it; I think the fix in Flink is included in 1.15.
The FLIP-27 source reader uses this RowDataUtil utility method to clone the RowData as it batches records for thread handover. It depends on ArrayDataSerializer to clone the field. With the change from PR #4712, the ReusableArrayData object is reused instead of cloned, which corrupts the batched RowData array.
public static RowData clone(RowData from, RowData reuse, RowType rowType, TypeSerializer[] fieldSerializers) {
  GenericRowData ret;
  if (reuse instanceof GenericRowData) {
    ret = (GenericRowData) reuse;
  } else {
    ret = new GenericRowData(from.getArity());
  }
  ret.setRowKind(from.getRowKind());
  for (int i = 0; i < rowType.getFieldCount(); i++) {
    if (!from.isNullAt(i)) {
      RowData.FieldGetter getter = RowData.createFieldGetter(rowType.getTypeAt(i), i);
      ret.setField(i, fieldSerializers[i].copy(getter.getFieldOrNull(from)));
    } else {
      ret.setField(i, null);
    }
  }
  return ret;
}
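To illustrate the failure mode described above without pulling in Flink or Iceberg, here is a minimal, self-contained sketch. `ReusingReader` is a hypothetical stand-in for a reader that returns the same mutable array on every call (as ReusableArrayData is described to do); it is not actual Iceberg or Flink code. Batching by reference leaves every entry pointing at the last record, while a deep copy per record preserves the values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReuseCorruptionSketch {
  /** Hypothetical reader that hands out one shared, mutable array for every record. */
  static final class ReusingReader {
    private final int[] reused = new int[2];

    int[] read(int seed) {
      reused[0] = seed;
      reused[1] = seed * 10;
      return reused; // same instance on every call
    }
  }

  /** Batches records by reference, so every entry aliases the reader's buffer. */
  static List<int[]> batchWithoutCopy(int count) {
    ReusingReader reader = new ReusingReader();
    List<int[]> batch = new ArrayList<>();
    for (int i = 1; i <= count; i++) {
      batch.add(reader.read(i));
    }
    return batch;
  }

  /** Batches records after a deep copy, so each entry owns its own array. */
  static List<int[]> batchWithDeepCopy(int count) {
    ReusingReader reader = new ReusingReader();
    List<int[]> batch = new ArrayList<>();
    for (int i = 1; i <= count; i++) {
      int[] record = reader.read(i);
      batch.add(Arrays.copyOf(record, record.length));
    }
    return batch;
  }

  public static void main(String[] args) {
    // The first record was overwritten by the second read.
    System.out.println(Arrays.toString(batchWithoutCopy(2).get(0))); // [2, 20]
    // The first record survives because it was copied before the next read.
    System.out.println(Arrays.toString(batchWithDeepCopy(2).get(0))); // [1, 10]
  }
}
```

This mirrors the symptom reported in this thread, where the array field ends up with the same value for all records in a batch.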
@stevenzwu Do you have a failing test case? I'd like to take a look or reproduce it locally. I'm still confused about why it does not work.
@yittg you can check out the dev branch for this PR and run TestIcebergSourceBounded. The testCustomizedFlinkDataTypes method should fail, because the array field has the same value for all records.
To make the diff easier to read, you can change the record count from 10 to 2.
List<Record> records = RandomGenericData.generate(schema, 10, 0L);
Thanks @stevenzwu,
I finally got the point: the ArrayDataSerializer in Flink should be renewed each time because it reuses the BinaryArrayData internally. I think we can change the signature of RowDataUtil#clone to accept a supplier of serializers to work around it for now.
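The supplier idea can be sketched roughly as follows. This is not the actual RowDataUtil#clone signature: `BufferReusingCopier` is a hypothetical stand-in for a serializer that writes every copy into one internal buffer, and `cloneAll` is an illustrative helper. The point is only that asking a supplier for a fresh copier per record avoids the shared buffer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

public class SerializerSupplierSketch {
  /** Hypothetical copier that reuses a single internal buffer for every copy it returns. */
  static final class BufferReusingCopier implements UnaryOperator<int[]> {
    private int[] buffer;

    @Override
    public int[] apply(int[] from) {
      if (buffer == null || buffer.length != from.length) {
        buffer = new int[from.length];
      }
      System.arraycopy(from, 0, buffer, 0, from.length);
      return buffer; // callers share this instance
    }
  }

  /** Copies each record with a copier obtained from the supplier. */
  static List<int[]> cloneAll(List<int[]> records, Supplier<UnaryOperator<int[]>> copiers) {
    List<int[]> out = new ArrayList<>();
    for (int[] record : records) {
      out.add(copiers.get().apply(record));
    }
    return out;
  }

  public static void main(String[] args) {
    List<int[]> records = Arrays.asList(new int[] {1}, new int[] {2});

    // One shared copier: both results alias the same buffer.
    BufferReusingCopier shared = new BufferReusingCopier();
    System.out.println(cloneAll(records, () -> shared).get(0)[0]); // 2

    // A fresh copier per record: each result owns its buffer.
    System.out.println(cloneAll(records, BufferReusingCopier::new).get(0)[0]); // 1
  }
}
```

Creating a copier per record costs extra allocations, which is presumably why this is framed in the thread as a temporary workaround rather than the final fix.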
I looked into Flink to see why map data works but array data does not, despite the similar implementation, and found that MapDataSerializer#toBinaryMap always has copy semantics implicitly, while ArrayDataSerializer#toBinaryArray does not.
See https://issues.apache.org/jira/browse/FLINK-28214
| } | ||
| public StreamingStartingStrategy startingStrategy() { | ||
| public StreamingStartingStrategy streamingStartingStrategy() { |
ScanContext is an internal class, so this change shouldn't break any users.
| try (TableLoader loader = tableLoader) { | ||
| return loader.loadTable(); | ||
| } catch (IOException e) { | ||
| throw new RuntimeException("Failed to close table loader", e); |
Should this be UncheckedIOException instead of generic RuntimeException?
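As a rough illustration of that suggestion, the sketch below wraps the IOException from close() in java.io.UncheckedIOException. `FailingLoader` and `loadAndClose` are hypothetical names used so the example runs without Iceberg on the classpath; only the try-with-resources shape is taken from the snippet under review.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

public class UncheckedCloseSketch {
  /** Hypothetical loader whose close() always fails, standing in for a table loader. */
  static final class FailingLoader implements Closeable {
    String loadTable() {
      return "table";
    }

    @Override
    public void close() throws IOException {
      throw new IOException("close failed");
    }
  }

  /** Loads the table and rethrows a close failure as an unchecked, IO-specific exception. */
  static String loadAndClose(FailingLoader tableLoader) {
    try (FailingLoader loader = tableLoader) {
      return loader.loadTable();
    } catch (IOException e) {
      throw new UncheckedIOException("Failed to close table loader", e);
    }
  }

  public static void main(String[] args) {
    try {
      loadAndClose(new FailingLoader());
    } catch (UncheckedIOException e) {
      // Callers can still reach the original IOException through getCause().
      System.out.println(e.getCause().getMessage()); // close failed
    }
  }
}
```

Compared with a generic RuntimeException, this keeps the cause typed as IOException, so callers that care about IO failures can catch them specifically.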
| @Override | ||
| public Boundedness getBoundedness() { | ||
| return scanContext.isStreaming() ? Boundedness.BOUNDED : Boundedness.CONTINUOUS_UNBOUNDED; |
This is backwards. Good catch, @stevenzwu!
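For clarity, the corrected mapping presumably swaps the two branches. The sketch below uses a local `Boundedness` enum that only mirrors the two constant names from the snippet, and `boundednessFor` is a hypothetical helper, so it runs without Flink.

```java
public class BoundednessSketch {
  /** Local stand-in that mirrors the two constants used in the reviewed snippet. */
  enum Boundedness {
    BOUNDED,
    CONTINUOUS_UNBOUNDED
  }

  /** A streaming scan never finishes, so it maps to the unbounded constant. */
  static Boundedness boundednessFor(boolean streaming) {
    return streaming ? Boundedness.CONTINUOUS_UNBOUNDED : Boundedness.BOUNDED;
  }

  public static void main(String[] args) {
    System.out.println(boundednessFor(true));  // CONTINUOUS_UNBOUNDED
    System.out.println(boundednessFor(false)); // BOUNDED
  }
}
```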
| return this; | ||
| } | ||
| public IcebergSource<T> build() { |
You could use public <T> IcebergSource<T> build() here to make type customization a bit easier.
| } | ||
| } | ||
| public static <T> Builder<T> builder() { |
You could use builderForRowData here also.
If you do that, you may want to pass the ReaderFunction<T> in here.
When I tried to implement the RowData factory method, I found that it needs to duplicate three of the ScanContext configs for the RowDataReaderFunction.
public static Builder<RowData> builderForRowData(Configuration readConfig, Table table, ScanContext context) {
  ReaderFunction<RowData> readerFunction = new RowDataReaderFunction(readConfig, table.schema(), context.project(),
      context.nameMapping(), context.caseSensitive(), table.io(), table.encryption());
  return new Builder<RowData>()
      .readerFunction(readerFunction);
}
Personally, I would actually like to make ScanContext public and expose it to users. Then it is OK to have a method like Builder<RowData> builderForRowData(Configuration readConfig, Table table, ScanContext context). We can also avoid duplicating more than a dozen methods from ScanContext in IcebergSource$Builder. Users can construct the ScanContext once.
We can probably do better here. Why not initialize readerFunction in the build method if it is null?
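One possible shape for that idea is sketched below. All names (`Source`, `Builder`, `DEFAULT_READER`) are hypothetical and the reader is modeled as a plain `Function<String, T>`; this is not the IcebergSource builder. The unchecked cast is an assumption of the sketch: it is only safe when callers who skip readerFunction(...) really do want the default record type.

```java
import java.util.function.Function;

public class DefaultingBuilderSketch {
  /** Hypothetical default reader, standing in for a RowData reader function. */
  static final Function<String, String> DEFAULT_READER = split -> "row:" + split;

  /** Hypothetical source that simply holds the reader it was built with. */
  static final class Source<T> {
    final Function<String, T> readerFunction;

    Source(Function<String, T> readerFunction) {
      this.readerFunction = readerFunction;
    }
  }

  static final class Builder<T> {
    private Function<String, T> readerFunction;

    Builder<T> readerFunction(Function<String, T> newReaderFunction) {
      this.readerFunction = newReaderFunction;
      return this;
    }

    @SuppressWarnings("unchecked")
    Source<T> build() {
      if (readerFunction == null) {
        // Assumption: with no explicit reader, T is the default record type (String here).
        readerFunction = (Function<String, T>) DEFAULT_READER;
      }
      return new Source<>(readerFunction);
    }
  }

  public static void main(String[] args) {
    // No reader configured: build() falls back to the default.
    Source<String> defaulted = new Builder<String>().build();
    System.out.println(defaulted.readerFunction.apply("a")); // row:a

    // Explicit reader: build() leaves it untouched.
    Source<Integer> custom = new Builder<Integer>().readerFunction(String::length).build();
    System.out.println(custom.readerFunction.apply("abc")); // 3
  }
}
```

The trade-off is that the default path gives up compile-time type safety, whereas a dedicated factory method such as builderForRowData would keep the type parameter fixed.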
| return this; | ||
| } | ||
| public Builder caseSensitive(boolean newCaseSensitive) { |
@rdblue I would like to revisit the discussion of exposing ScanContext directly to users (instead of replicating the ScanContext methods in the source builder here). Not only can we avoid code duplication, we can also avoid potential out-of-sync problems. In the past, I have seen cases where we added new methods to ScanContext but forgot to add them to the source builder.
I think this is making the API more complicated for the convenience of developers, so I don't think that we should expose it.
Thanks, @stevenzwu! I know we still need to discuss the builder that makes reading as RowData easy, but I went ahead and merged this because we can add that later.
@klam-shop @zoucao This is the PR for the source builder that puts everything together. You can try out the MVP version from the master branch. Your feedback on the source builder/construction is also welcome.
Thanks for your great work, @stevenzwu. We will try it as soon as possible.
Thank you @stevenzwu!