ESQL: Drop null columns in text formats #117643

Merged: bpintea merged 18 commits into elastic:main on Dec 16, 2024

kanoshiou
Contributor

This PR resolves the issue where, despite setting drop_null_columns=true, columns that are entirely null are still returned when using format=txt, format=csv, or format=tsv.

Closes #116848
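
For illustration, here is a minimal sketch of what dropping entirely-null columns from a delimited text output can look like. It is not the PR's code: the class `DropNullColumnsSketch`, its methods, and the plain `List` inputs are hypothetical stand-ins for the real response and formatter types, and the sketch does no CSV quoting or escaping.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the Elasticsearch implementation.
public class DropNullColumnsSketch {

    /**
     * Flags every column that has no non-null value in any row.
     * With zero rows every column counts as all-null (vacuously);
     * a real implementation may choose to treat that case differently.
     */
    static boolean[] allNullColumns(int columnCount, List<List<Object>> rows) {
        boolean[] drop = new boolean[columnCount];
        Arrays.fill(drop, true);
        for (List<Object> row : rows) {
            for (int c = 0; c < columnCount; c++) {
                if (row.get(c) != null) {
                    drop[c] = false;
                }
            }
        }
        return drop;
    }

    /** Renders a header line and one line per row, skipping flagged columns. */
    static String toCsv(List<String> names, List<List<Object>> rows, boolean dropNullColumns) {
        boolean[] drop = dropNullColumns
            ? allNullColumns(names.size(), rows)
            : new boolean[names.size()]; // all false: keep every column
        StringBuilder sb = new StringBuilder();
        appendLine(sb, names, drop);
        for (List<Object> row : rows) {
            appendLine(sb, row, drop);
        }
        return sb.toString();
    }

    private static void appendLine(StringBuilder sb, List<?> values, boolean[] drop) {
        boolean first = true;
        for (int c = 0; c < values.size(); c++) {
            if (drop[c]) {
                continue; // column is entirely null, omit it
            }
            if (first == false) {
                sb.append(',');
            }
            Object value = values.get(c);
            sb.append(value == null ? "" : value); // nulls render as empty cells
            first = false;
        }
        sb.append("\r\n");
    }
}
```

The key point is that the same `drop` mask is applied to the header and to every row, so the columns stay aligned.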

elasticsearchmachine added the needs:triage, v9.0.0, and external-contributor labels on Nov 27, 2024
astefan added the :Analytics/ES|QL label and removed the needs:triage label on Nov 28, 2024
astefan requested a review from nik9000 on Nov 28, 2024
elasticsearchmachine added the Team:Analytics label on Nov 28, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

Member

nik9000 left a comment

Neat! I left a small thing but I think it's great!

@bpintea, you wrote the text formatter stuff, would you like to review this one too?

kanoshiou changed the title from "Drop null columns in text formats" to "ESQL: Drop null columns in text formats" on Dec 5, 2024
bpintea self-assigned this on Dec 9, 2024
Contributor

bpintea left a comment

Looks good, left some small remarks.

@@ -60,17 +60,23 @@ public TextFormatter(EsqlQueryResponse response) {
     /**
      * Format the provided {@linkplain EsqlQueryResponse} optionally including the header lines.
      */
-    public Iterator<CheckedConsumer<Writer, IOException>> format(boolean includeHeader) {
+    public Iterator<CheckedConsumer<Writer, IOException>> format(boolean includeHeader, boolean dropNullColumns) {
Contributor

I know that includeHeader was already passed here as a parameter, but this object is only created once per response. Maybe it's worth configuring it right from the start by passing these options to the constructor and building the dropColumns array there, rather than passing that array down the method chain.
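
A rough sketch of the shape suggested here, assuming a simplified formatter that works on plain lists. `ConfiguredFormatterSketch` and its fields are hypothetical names, not the actual `TextFormatter`; the point is only that both options are fixed at construction time and the drop mask is computed once.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of constructor-time configuration; not the real TextFormatter.
public final class ConfiguredFormatterSketch {
    private final List<String> names;
    private final List<List<Object>> rows;
    private final boolean includeHeader;
    private final boolean[] dropColumns; // computed once, in the constructor

    public ConfiguredFormatterSketch(
        List<String> names,
        List<List<Object>> rows,
        boolean includeHeader,
        boolean dropNullColumns
    ) {
        this.names = names;
        this.rows = rows;
        this.includeHeader = includeHeader;
        this.dropColumns = new boolean[names.size()]; // all false by default
        if (dropNullColumns) {
            Arrays.fill(dropColumns, true);
            for (List<Object> row : rows) {
                for (int c = 0; c < dropColumns.length; c++) {
                    if (row.get(c) != null) {
                        dropColumns[c] = false;
                    }
                }
            }
        }
    }

    /** No options here: everything was decided when the object was built. */
    public String format() {
        StringBuilder sb = new StringBuilder();
        if (includeHeader) {
            appendLine(sb, names);
        }
        for (List<Object> row : rows) {
            appendLine(sb, row);
        }
        return sb.toString();
    }

    private void appendLine(StringBuilder sb, List<?> values) {
        String separator = "";
        for (int c = 0; c < values.size(); c++) {
            if (dropColumns[c]) {
                continue;
            }
            sb.append(separator).append(values.get(c)); // a null value prints as "null"
            separator = "\t";
        }
        sb.append('\n');
    }
}
```

With this layout the helper methods read `dropColumns` from the instance, so no array has to be threaded through each call.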

@kanoshiou
Contributor Author

Thank you, @bpintea. I have modified the code based on your comments. Please review it when you have time. If any other changes are needed, please let me know; I am happy to make them.

@bpintea
Contributor

bpintea commented Dec 13, 2024

@elasticsearchmachine test this please

hasHeader(request) && esqlResponse.columns().isEmpty() == false
hasHeader(request) && esqlResponse.columns() != null
Contributor Author

I believe columns will never be null. However, changing the check to see if they are empty might cause some test cases to fail. Perhaps it should be updated to handle empty values and adjust the affected test cases accordingly.

Contributor

I didn't try it out, but it looks to me like it should produce the same result. But yes, seems like esqlResponse.columns() will currently never be null.
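
To make the difference between the two checks concrete, here is a small sketch with hypothetical helper names; `hasHeader` is reduced to a plain boolean and the column list to a `List<String>`. The two conditions agree for any non-empty list and differ only when the list is empty (or null).

```java
import java.util.List;

// Hypothetical helpers that mirror the two conditions quoted above.
public class HeaderCheckSketch {

    /** Mirrors `hasHeader(request) && esqlResponse.columns() != null`. */
    static boolean headerIfNonNull(boolean hasHeader, List<String> columns) {
        return hasHeader && columns != null;
    }

    /** Mirrors `hasHeader(request) && esqlResponse.columns().isEmpty() == false`. */
    static boolean headerIfNonEmpty(boolean hasHeader, List<String> columns) {
        // Would throw a NullPointerException if columns were null.
        return hasHeader && columns.isEmpty() == false;
    }
}
```

For an empty column list the first helper still asks for a header while the second does not, which is the case that could change what the affected tests observe.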

Comment on lines 1276 to 1283
 case "csv" -> {
-    assertEquals(initialValue, "\r\n");
+    assertEquals("\r\n", initialValue);
     initialValue = "";
 }
 case "tsv" -> {
-    assertEquals(initialValue, "\n");
+    assertEquals("\n", initialValue);
     initialValue = "";
 }
Contributor Author

This is the test case I mentioned earlier that failed. When I first hit the failure, the expected and actual values in the message were reversed, which confused me. Therefore, I think it would be better to flip the arguments.
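
A small illustration of why the argument order matters, assuming the usual JUnit-style convention of `assertEquals(expected, actual)`. The helper below is a hypothetical stand-in written for this sketch, not JUnit itself, and its message format is only an approximation.

```java
import java.util.Objects;

// Hypothetical stand-in for a JUnit-style assertion, used only to show the message.
public class AssertOrderSketch {

    /** First argument is the expected value, second is the value under test. */
    static void assertEquals(Object expected, Object actual) {
        if (Objects.equals(expected, actual) == false) {
            throw new AssertionError("expected:<" + expected + "> but was:<" + actual + ">");
        }
    }

    /** Runs a check and returns its failure message, or null if it passed. */
    static String failureMessage(Runnable check) {
        try {
            check.run();
            return null;
        } catch (AssertionError e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        String initialValue = "unexpected";
        // Correct order: the message names "\n" as the expected value.
        System.out.println(failureMessage(() -> assertEquals("\n", initialValue)));
        // Swapped order: the message wrongly reports initialValue as the expected value.
        System.out.println(failureMessage(() -> assertEquals(initialValue, "\n")));
    }
}
```

Both calls fail in the same situations; only the wording of the failure changes, which is what makes a swapped call misleading to debug.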

@kanoshiou
Contributor Author

@bpintea, I have modified the code to resolve the test failure. However, I can only see this one failure because my IDEA crashes every time I run the checks (maybe I should upgrade my equipment 😂).

It might be more convenient for external contributors if the elasticsearch-pull-request pipeline were open to the public, similar to the logstash-pull-request-pipeline.

@bpintea
Contributor

bpintea commented Dec 16, 2024

buildkite test this

Contributor

bpintea left a comment

LGTM, thanks @kanoshiou!

bpintea added the auto-backport label on Dec 16, 2024
bpintea merged commit 6d6eac2 into elastic:main on Dec 16, 2024
18 checks passed
bpintea pushed a commit to bpintea/elasticsearch that referenced this pull request Dec 16, 2024
This PR resolves the issue where, despite setting `drop_null_columns=true`, columns that are entirely null are still returned when using `format=txt`, `format=csv`, or `format=tsv`.

Closes elastic#116848
@elasticsearchmachine
Collaborator

💚 Backport successful

Status Branch Result
8.x

elasticsearchmachine pushed a commit that referenced this pull request Dec 16, 2024
This PR resolves the issue where, despite setting `drop_null_columns=true`, columns that are entirely null are still returned when using `format=txt`, `format=csv`, or `format=tsv`.

Closes #116848

Co-authored-by: kanoshiou <[email protected]>
@bpintea
Contributor

bpintea commented Dec 16, 2024

@kanoshiou

It might be more convenient for external contributors if the elasticsearch-pull-request pipeline were open to the public

A bit late for this PR, but it is from now on.

@kanoshiou
Contributor Author

Thank you very much, @bpintea!

Labels
:Analytics/ES|QL, auto-backport, >bug, external-contributor, Team:Analytics, v8.18.0, v9.0.0
Development

Successfully merging this pull request may close these issues.

ESQL: drop_null_columns doesn't work with format=txt
5 participants