Conversation

@rodmeneses
Contributor

@rodmeneses rodmeneses commented Apr 9, 2024

This is how I prepared this PR:

  1. cd flink
  2. git mv v1.18/ v1.19
  3. git commit -am "Flink: Move flink/v1.18 to flink/v1.19"
  4. rm -rf v1.18/
  5. cp -R v1.19/ v1.18
  6. git add v1.18/
  7. git commit -am "Flink: Recover flink/1.18 files from history"
  8. Code and property changes so that all unit tests pass
  9. git commit -am"Flink: Refactoring code and properties to make Flink 1.19 to work"

A new PR deleting v1.16 will follow immediately after this one is merged.

@rodmeneses rodmeneses changed the title from "Flink: Adds support for 1.19 version" to "Flink: Adds support for Flink 1.19 version" on Apr 9, 2024
@rodmeneses rodmeneses marked this pull request as draft April 9, 2024 20:39
@rodmeneses rodmeneses force-pushed the addingFlink119 branch 7 times, most recently from 370f098 to d002a96 on April 10, 2024 at 19:48
@rodmeneses rodmeneses marked this pull request as ready for review April 10, 2024 20:36
@rodmeneses
Contributor Author

cc: @pvary @stevenzwu @mas-chen please take a look at your earliest convenience. Thanks

* @param ifExists If we should use the 'IF EXISTS' when dropping the database
*/
protected void dropDatabase(String database, boolean ifExists) {
sql("CREATE DATABASE IF NOT EXISTS temp");
Contributor

Do we have a default database to use? I find it strange that we are creating a database to drop a database 😄
What happens in Flink if we do not create/use a database before creating a table?

public void clean() {
sql("DROP TABLE IF EXISTS %s", TABLE_NAME);
sql("DROP DATABASE IF EXISTS %s", DATABASE_NAME);
dropDatabase(DATABASE_NAME, true);
Contributor

Do we really need this? Does something behind the scenes issue a USE DATABASE command?

If so, this could be a breaking change which we might want to highlight, as creating a database and then immediately dropping it won't work anymore.

Contributor Author

After a discussion offline with @pvary, we have decided to create a new PR with better cleanup logic that will be applied to the unit tests. After that is merged, we can continue with this PR.
Thanks!

Contributor Author

@rodmeneses rodmeneses Apr 14, 2024

I ended up changing this implementation of dropDatabase to:

  protected void dropDatabase(String database, boolean ifExists) {
    // Remember which catalog the test is currently using.
    String currentCatalog = getTableEnv().getCurrentCatalog();
    // Switch the session away from the database being dropped, since Flink 1.19
    // (FLINK-33226) does not allow dropping the database that is currently in use.
    sql("USE CATALOG %s", DEFAULT_CATALOG_NAME);
    sql("USE %s", getTableEnv().listDatabases()[0]);
    // Return to the original catalog and drop the target database.
    sql("USE CATALOG %s", currentCatalog);
    sql("DROP DATABASE %s %s", ifExists ? "IF EXISTS" : "", database);
  }

which is similar to our current dropCatalog implementation.

ContinuousSplitPlannerImpl splitPlanner =
new ContinuousSplitPlannerImpl(
tableResource.tableLoader().clone(), scanContextWithInvalidSnapshotId, null);

Contributor

seems like unnecessary changes here

Contributor Author

Hi @nastra, and thanks a lot for your reviews.
Some comments:

  1. The best way to review this PR is by looking at individual commits. The most important one is the 4th commit called "Flink: Refactoring code and properties to make Flink 1.19 to work". If you review that particular one, you will notice that I'm not introducing the extra line changes.
  2. The reason you're seeing those changes is the way GitHub presents the differences: it shows the complete set of differences for the whole PR, and it ends up diffing the new 1.19 code against the old 1.16 code.
  3. Now, why do we even have those changes? This is because of a previous backport of changes between versions. There were some inconsistencies when the backport was made, and those old commits introduced the extra lines (and the use of Assertions.assertThatThrownBy instead of the statically imported form).
  4. So, again, these changes you're referring to are not part of this PR.
  5. I updated the PR description with the exact steps I took to create the PR.

Thanks a lot!

.monitorInterval(Duration.ofMillis(100))
.maxPlanningSnapshotCount(0)
.build();

Contributor

unnecessary change

Contributor Author

please see my comment above

import org.apache.iceberg.util.StructLikeSet;

public class TestFlinkInputFormatReaderDeletes extends TestFlinkReaderDeletesBase {

Contributor

I think it would be good to remove all of these whitespace changes, which would make it much easier to review the PR

Contributor Author

please see my comment above

.asStruct());
// Adding a required field should fail because Iceberg's SchemaUpdate does not allow
// incompatible changes.
Assertions.assertThatThrownBy(() -> sql("ALTER TABLE tl ADD (pk STRING NOT NULL)"))
Contributor

can be statically imported, similar to assertThat()
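
For illustration, a minimal sketch of how the statically imported form reads (not code from this PR; the class and the thrown exception are placeholders):

  import static org.assertj.core.api.Assertions.assertThatThrownBy;

  public class StaticImportSketch {
    void example() {
      // With the static import, the call site drops the Assertions. prefix:
      assertThatThrownBy(() -> { throw new IllegalStateException("boom"); })
          .isInstanceOf(IllegalStateException.class)
          .hasMessage("boom");
    }
  }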

Contributor Author

please see my comment above

@ajantha-bhat
Member

@rodmeneses: Just curious to know what command you have used for the step "Flink: Recover flink/1.18 files from history"?
It is really nice.

I thought the only way was to add back the folder (which shows file was added).

@rodmeneses
Contributor Author

@rodmeneses: Just curious to know what command you have used for the step "Flink: Recover flink/1.18 files from history"? It is really nice.

I thought the only way was to add back the folder (which shows file was added).

Thanks @ajantha-bhat . I updated the PR description with the steps I took. Please take a look! 😄

@rodmeneses rodmeneses force-pushed the addingFlink119 branch 2 times, most recently from e388a59 to c3a8bdd on April 14, 2024 at 21:32
sql("DROP DATABASE `%s`", databaseName());
}
}
testCreateConnectorTable();
Contributor Author

Up to this point in the unit test, only one catalog (default_catalog) and one database (default_database) have been created. So trying to drop that database will fail: we are currently using it, and because of FLINK-33226 it is impossible to drop the database that is in use.
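
As a minimal, self-contained illustration of the behavior described above (a sketch against a plain Flink TableEnvironment, not code from this PR; temp_db is a placeholder name):

  import org.apache.flink.table.api.EnvironmentSettings;
  import org.apache.flink.table.api.TableEnvironment;

  public class DropCurrentDatabaseSketch {
    public static void main(String[] args) {
      TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());
      tEnv.executeSql("CREATE DATABASE IF NOT EXISTS temp_db");
      tEnv.executeSql("USE temp_db");
      // With FLINK-33226 (Flink 1.19), dropping the database currently in use is rejected,
      // so the session has to switch to another database first:
      tEnv.executeSql("USE default_database");
      tEnv.executeSql("DROP DATABASE temp_db");
    }
  }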

@rodmeneses rodmeneses force-pushed the addingFlink119 branch 2 times, most recently from 6591b3c to f03df0b on April 14, 2024 at 23:13
FlinkSource.Builder builder = FlinkSource.forRowData();
Optional.ofNullable(options.get("case-sensitive"))
.ifPresent(value -> builder.caseSensitive(Boolean.parseBoolean(value)));
.ifPresent(value -> builder.caseSensitive(Boolean.getBoolean(value)));
Contributor

parseBoolean is correct here (this has just been recently changed on main)
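
For context, a small standalone sketch (not from this PR) of why the two calls differ: Boolean.getBoolean(name) reads a JVM system property named name, while Boolean.parseBoolean(value) parses the string itself, which is what the option handling wants here.

  public class BooleanParsingSketch {
    public static void main(String[] args) {
      String value = "true";
      // Parses the string itself:
      System.out.println(Boolean.parseBoolean(value)); // true
      // Looks up the system property literally named "true", which is unset here:
      System.out.println(Boolean.getBoolean(value));   // false
    }
  }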

@rodmeneses rodmeneses force-pushed the addingFlink119 branch 7 times, most recently from a76dcd2 to 6656ba9 on April 15, 2024 at 17:08
Contributor

@pvary pvary left a comment

@nastra: Any comments? I would like to merge this soon, as any merge to Flink code path will make this PR stale, and @rodmeneses needs to recreate the whole PR.

Thanks,

@manuzhang
Member

Why do we remove Flink 1.16 in this PR?

}

createIcebergTable(tablePath, table, ignoreIfExists);
Preconditions.checkArgument(table instanceof ResolvedCatalogTable, "table should be resolved");
Contributor

is this new code that's required for Flink 1.19?

Contributor

it's not clear to me why there's a diff on FlinkCatalog

Contributor

I think the problem is that the diff is still from a previous attempt where changes were done based on Flink 1.16. What about removing the Flink 1.16 build from the Gradle files but doing the actual removal of the Flink 1.16 folders in a separate PR? I think that would avoid these diffs here.

@nastra
Contributor

nastra commented Apr 16, 2024

@nastra: Any comments? I would like to merge this soon, as any merge to Flink code path will make this PR stale, and @rodmeneses needs to recreate the whole PR.

Thanks,

I think there's still an issue as there are a bunch of files/diffs that are because Flink 1.16 is being removed and git detects it as a move (with some additional changes). This can also be seen when looking at the file path, where a Flink 1.16 file is moved to a Flink 1.19 file, while also adding some diffs where it's not clear why the diff is there in the first place.

My suggestion would be to do the actual removal of the 1.16 directory as a separate PR in an immediate follow-up. This would mean to skip tests 8 + 9 from the PR description, but it's fine to update gradle files to not build 1.16 anymore. Thoughts on the suggestion?

@rodmeneses
Contributor Author

@nastra: Any comments? I would like to merge this soon, as any merge to Flink code path will make this PR stale, and @rodmeneses needs to recreate the whole PR.
Thanks,

I think there's still an issue as there are a bunch of files/diffs that are because Flink 1.16 is being removed and git detects it as a move (with some additional changes). This can also be seen when looking at the file path, where a Flink 1.16 file is moved to a Flink 1.19 file, while also adding some diffs where it's not clear why the diff is there in the first place.

My suggestion would be to do the actual removal of the 1.16 directory as a separate PR in an immediate follow-up. This would mean to skip tests 8 + 9 from the PR description, but it's fine to update gradle files to not build 1.16 anymore. Thoughts on the suggestion?

Hi @nastra, thanks for your review and comments.

I think there's still an issue as there are a bunch of files/diffs that are because Flink 1.16 is being removed and git detects it as a move (with some additional changes).
This is because you are looking at the changes in the whole PR. If you see each of the 4 commits individually, you'll find that everything makes sense.

I tried the approach of not deleting v1.16 and updated this PR. But if you view the changes as a whole PR, it's now even worse than before, because we don't see the history properly.

Given this, I'd suggest moving forward with deleting v1.16 in this same PR.
Thoughts? @nastra @pvary

Contributor

@nastra nastra left a comment

I checked the individual commits and they do what I was expecting, so LGTM. Thanks for updating this. @pvary and I also just talked offline about doing the deletion of Flink 1.16 in a separate PR.

@pvary
Contributor

pvary commented Apr 16, 2024

Why do we remove Flink 1.16 in this PR?

@manuzhang: This is how we usually do these changes. We support the last 3 versions of Flink, so when we add a new version, we remove the oldest one.
Also, we do the changes this way to keep the history of the main directory (in our case 1.18 -> 1.19).

Old PRs:

@pvary pvary merged commit b3ebcf1 into apache:main Apr 16, 2024
@pvary
Contributor

pvary commented Apr 16, 2024

Merged to main.
Thanks for the PR @rodmeneses and @nastra for the review!
