
Returning ignored fields in the simulate ingest API #117214


Merged
masseyke merged 22 commits into elastic:main on Dec 23, 2024

Conversation

masseyke
Member

@masseyke masseyke commented Nov 20, 2024

As described in #116497, a new ignored_fields array is returned by the simulate ingest API if indexing the document would ignore any fields in the input. For example:

curl -X POST "localhost:9200/_ingest/_simulate?pretty" -H 'Content-Type: application/json' -d'
{
  "docs": [
    {        
      "_index": "simulate-test",
      "_id": "y9Es_JIBiw6_GgN-U0qy",
      "_score": 1,
      "_source": {
        "abc": "sfdsfsfdsfsfdsfsfdsfsfdsfsfdsf"
      }    
    }      
  ],
  "index_template_substitutions": {
    "ind_temp": {
      "index_patterns": ["simulate-test"],
      "composed_of": ["simulate-test"]
    }    
  },
  "component_template_substitutions": {
    "simulate-test": {
      "template": {
        "mappings": {
          "dynamic": false,
          "properties": {
            "abc": {
              "type": "keyword",
              "ignore_above": 1
            }
          }  
        }    
      }    
    }      
  }
}
'
{
  "docs" : [
    {
      "doc" : {
        "_id" : "y9Es_JIBiw6_GgN-U0qy",
        "_index" : "simulate-test",
        "_version" : -3,
        "_source" : {
          "abc" : "sfdsfsfdsfsfdsfsfdsfsfdsfsfdsf"
        },
        "executed_pipelines" : [ ],
        "ignored_fields" : [
          {
            "field": "abc"
          }
        ]
      }
    }
  ]
}
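
To make the example above concrete: the substituted mapping sets ignore_above to 1 on the keyword field abc, and the value sent is 30 characters long, so the field would be skipped at index time and is therefore reported under ignored_fields. The following is a minimal sketch of that rule only. The class and method names are hypothetical, it is not Elasticsearch code, and it uses String.length() as a simplifying assumption for how the limit is measured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; not Elasticsearch code.
class IgnoreAboveSketch {

    /**
     * Returns the names of string fields whose value is longer than the
     * ignore_above limit configured for that field. Fields with no
     * configured limit are never reported.
     */
    static List<String> ignoredFields(Map<String, String> source, Map<String, Integer> ignoreAbove) {
        List<String> ignored = new ArrayList<>();
        for (Map.Entry<String, String> entry : source.entrySet()) {
            Integer limit = ignoreAbove.get(entry.getKey());
            // Assumption: the limit is compared against String.length().
            if (limit != null && entry.getValue().length() > limit) {
                ignored.add(entry.getKey());
            }
        }
        return ignored;
    }

    public static void main(String[] args) {
        // Same field, value, and limit as the request above.
        Map<String, String> source = Map.of("abc", "sfdsfsfdsfsfdsfsfdsfsfdsfsfdsf");
        Map<String, Integer> limits = Map.of("abc", 1);
        System.out.println(ignoredFields(source, limits)); // prints [abc]
    }
}
```

With a limit of 30 or more for abc, the same input would produce an empty list.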

@masseyke masseyke added the :Data Management/Ingest Node (Execution or management of Ingest Pipelines including GeoIP) label Nov 20, 2024
@masseyke masseyke added the >enhancement, v8.18.0, and auto-backport (Automatically create backport pull requests when merged) labels Dec 17, 2024
@elasticsearchmachine
Collaborator

Hi @masseyke, I've created a changelog YAML for you.

@masseyke masseyke marked this pull request as ready for review December 17, 2024 20:25
@masseyke masseyke requested a review from dakrone December 17, 2024 20:26
@elasticsearchmachine elasticsearchmachine added the Team:Data Management (Meta label for data/management team) label Dec 17, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Member

@dakrone dakrone left a comment

LGTM, thanks Keith!

@@ -168,11 +177,12 @@ protected void doInternalExecute(
/**
* This creates a temporary index with the mappings of the index in the request, and then attempts to index the source from the request
* into it. If there is a mapping exception, that exception is returned. On success the returned exception is null.
- * @parem componentTemplateSubstitutions The component template definitions to use in place of existing ones for validation
+ * @param componentTemplateSubstitutions The component template definitions to use in place of existing ones for validation
Member

Hah, nice catch :)

@@ -360,6 +372,23 @@ private void validateUpdatedMappings(
0
);
});
final Collection<String> ignoredFields;
if (result == null) {
ignoredFields = List.of();
Member

We could probably just do return List.of(); here and avoid having to create the ignoredFields local variable? It's not a big deal either way though.

Member Author

I kind of prefer having a single return statement rather than 3 -- it makes debugging easier for me.

Member

I'm okay either way :)
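
For readers following this thread, the two styles under discussion can be sketched side by side. This is only an illustration with hypothetical names and a plain list standing in for the parse result; it is not the code from this PR.

```java
import java.util.List;

// Hypothetical stand-ins for illustration; not the PR's actual code.
class SingleReturnSketch {

    // Early-return form suggested in the review: no local variable needed.
    static List<String> earlyReturn(List<String> parsed) {
        if (parsed == null) {
            return List.of();
        }
        return List.copyOf(parsed);
    }

    // Single-exit form the author prefers: the final local is assigned on
    // every branch and there is exactly one return to put a breakpoint on.
    static List<String> singleReturn(List<String> parsed) {
        final List<String> ignoredFields;
        if (parsed == null) {
            ignoredFields = List.of();
        } else {
            ignoredFields = List.copyOf(parsed);
        }
        return ignoredFields;
    }
}
```

Both forms return the same values; because the local is declared final, the compiler rejects any branch that forgets to assign it.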

ignoredFields = List.of();
} else {
List<LuceneDocument> luceneDocuments = result.parsedDoc().docs();
if (luceneDocuments != null && luceneDocuments.size() == 1) {
Member

Should we add an assert luceneDocuments.size() == 1 somewhere here to ensure that we fail if in the future a single index request results in more than one doc? (We'd silently ignore the response if we didn't.)

Member Author

Sounds good.
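
One way the suggested assertion could look is sketched below. The names are hypothetical and each document is modeled simply as a list of its ignored field names, so this is not the code that was merged. Note that Java assert statements only run when assertions are enabled (for example with -ea), so the size check in the conditional is kept as the fallback for runs where they are disabled.

```java
import java.util.List;

// Hypothetical stand-in for illustration; not the PR's actual code.
class SingleDocAssertSketch {

    /**
     * Each inner list models the ignored field names of one parsed document.
     * A single index request is expected to yield exactly one document.
     */
    static List<String> ignoredFieldsOf(List<List<String>> docs) {
        final List<String> ignoredFields;
        if (docs == null) {
            ignoredFields = List.of();
        } else {
            // Fails loudly when assertions are enabled if the expectation
            // of exactly one document ever stops holding.
            assert docs.size() == 1 : "expected exactly one document but got " + docs.size();
            // With assertions disabled, fall back to an empty result.
            ignoredFields = docs.size() == 1 ? List.copyOf(docs.get(0)) : List.of();
        }
        return ignoredFields;
    }
}
```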

@masseyke masseyke merged commit 43e6fad into elastic:main Dec 23, 2024
15 of 16 checks passed
@masseyke masseyke deleted the simulate-ingest-return-ignored-fields branch December 23, 2024 21:53
masseyke added a commit to masseyke/elasticsearch that referenced this pull request Dec 23, 2024
@elasticsearchmachine
Collaborator

💚 Backport successful

Status Branch Result
8.x

Labels
auto-backport (Automatically create backport pull requests when merged)
:Data Management/Ingest Node (Execution or management of Ingest Pipelines including GeoIP)
>enhancement
Team:Data Management (Meta label for data/management team)
v8.18.0
v9.0.0

3 participants