aws/data_stream/route53_public_logs: Handle non-%HOSTNAME entries #9249
… log, so non-hostname logs can be parsed.
5a39b42 to d49c066
Hi! We just realized that we haven't looked into this PR in a while. We're sorry! We're labeling this issue as
Yup, still relevant and does not contain a dependency on xz 🤪
/test
@cafuego: Can we please resolve the merge conflicts?
/test
@shmsr Yes, but buildkite seems to be unhappy about something; it's not completing the test or pinging back with the result (which has been fine previously).
Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)
- '%{BASE10NUM} %{TIMESTAMP_ISO8601:_tmp.timestamp} %{DATA:aws.route53.hosted_zone_id} %{DATA:_tmp.question} %{WORD:dns.question.type} %{WORD:dns.response_code} %{WORD:network.transport} %{EDGE_LOCATION:aws.route53.edge_location} %{IP:source.address} (%{SUBNET:aws.route53.edns_client_subnet}|-)'
pattern_definitions:
The alternative would be to provide an augmented hostname pattern that includes all the codepoints that occur in DNS names. This only requires that we add _ to the possible codepoints.
- '%{BASE10NUM} %{TIMESTAMP_ISO8601:_tmp.timestamp} %{DATA:aws.route53.hosted_zone_id} %{DATA:_tmp.question} %{WORD:dns.question.type} %{WORD:dns.response_code} %{WORD:network.transport} %{EDGE_LOCATION:aws.route53.edge_location} %{IP:source.address} (%{SUBNET:aws.route53.edns_client_subnet}|-)'
pattern_definitions:
- '%{BASE10NUM} %{TIMESTAMP_ISO8601:_tmp.timestamp} %{DATA:aws.route53.hosted_zone_id} %{DNS_QUESTION:_tmp.question} %{WORD:dns.question.type} %{WORD:dns.response_code} %{WORD:network.transport} %{EDGE_LOCATION:aws.route53.edge_location} %{IP:source.address} (%{SUBNET:aws.route53.edns_client_subnet}|-)'
pattern_definitions:
  DNS_QUESTION: \b(?:[_0-9A-Za-z][_0-9A-Za-z-]{0,62})(?:\.(?:[_0-9A-Za-z][_0-9A-Za-z-]{0,62}))*(\.?|\b)
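For anyone following along, here is a rough Python sketch of what the suggested DNS_QUESTION definition changes. The HOSTNAME regex below is an assumption (it is meant to mirror the stock grok HOSTNAME definition, so check it against the pattern library you actually run), and Python's `re` is only a stand-in for the grok regex engine. The names are illustrative, not taken from real logs.

```python
import re

# Assumption: mirrors the stock grok HOSTNAME definition (no underscore allowed).
HOSTNAME = r"\b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b)"
# The suggested DNS_QUESTION definition from this thread (underscore added).
DNS_QUESTION = r"\b(?:[_0-9A-Za-z][_0-9A-Za-z-]{0,62})(?:\.(?:[_0-9A-Za-z][_0-9A-Za-z-]{0,62}))*(\.?|\b)"


def matches(pattern: str, name: str) -> bool:
    """Return True when the whole name is covered by the pattern."""
    return re.fullmatch(pattern, name) is not None


# Illustrative query names only.
names = ["example.com", "_dmarc.example.com", "_spf.example.com"]
results = {n: (matches(HOSTNAME, n), matches(DNS_QUESTION, n)) for n in names}

for name, (as_hostname, as_question) in results.items():
    print(f"{name}: HOSTNAME={as_hostname} DNS_QUESTION={as_question}")
```

Under those assumptions, the underscore-prefixed names only match once `_` is part of the character class, which is the case this suggestion is aimed at.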
At the very least you also need to add an escaped backslash, or you'll still miss any encoded UTF-8 data.
DNS entries can also contain any binary rubbish you'd care to put in (although Route53 may not allow you to add that) so from a log analysis perspective it's important to grab those logs and allow them to be analysed.
DNS entries can also contain any binary rubbish you'd care to put in
That's horrifying, but point well made.
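The backslash point can be sketched in Python as well. Everything here is a simplification and an assumption: the line regex approximates every field except the question with `\S+` instead of the real grok sub-patterns, both log lines are made up for illustration, and the `\195\169` escape form is hypothetical rather than a confirmed Route 53 encoding.

```python
import re

# The DNS_QUESTION definition suggested in this thread.
DNS_QUESTION = r"\b(?:[_0-9A-Za-z][_0-9A-Za-z-]{0,62})(?:\.(?:[_0-9A-Za-z][_0-9A-Za-z-]{0,62}))*(\.?|\b)"
# grok DATA is a lazy match-anything pattern.
DATA = r".*?"


def line_regex(question_pattern: str) -> re.Pattern:
    # Simplified stand-in for the grok expression: ten space-separated fields,
    # with only the fourth (the DNS question) using the pattern under test.
    return re.compile(
        r"^\S+ \S+ \S+ (?P<question>" + question_pattern + r") \S+ \S+ \S+ \S+ \S+ \S+$"
    )


def question_of(line: str, question_pattern: str):
    """Return the extracted question, or None when the line fails to parse."""
    m = line_regex(question_pattern).match(line)
    return m.group("question") if m else None


# Hypothetical log lines, made up for illustration.
UNDERSCORE_LINE = "1.0 2017-12-13T08:16:02.130Z Z123412341234 _dmarc.example.com TXT NOERROR UDP FRA6 192.168.1.1 -"
ESCAPED_LINE = r"1.0 2017-12-13T08:16:02.130Z Z123412341234 caf\195\169.example.com A NOERROR UDP FRA6 192.168.1.1 -"

for line in (UNDERSCORE_LINE, ESCAPED_LINE):
    print(question_of(line, DNS_QUESTION), "|", question_of(line, DATA))
```

In this sketch the stricter pattern drops the escaped line entirely, while DATA keeps it available for analysis, which is the trade-off being discussed.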
Co-authored-by: Dan Kortschak <[email protected]>
/test
packages/aws/data_stream/route53_public_logs/_dev/test/pipeline/test-route53.log-expected.json
Aww, sad trombone. I'll update the test.
…e/test-route53.log-expected.json Co-authored-by: Dan Kortschak <[email protected]>
/test
🚀 Benchmarks reportTo see the full report comment with |
💚 Build Succeeded
Thanks
Package aws - 2.21.0 containing this change is available at https://epr.elastic.co/search?package=aws
@efd6 thanks mate!
Proposed commit message
Change the HOSTNAME pattern to DATA in the grok snippet that matches route53 public logs, so that DNS records containing characters outside [0-9A-Za-z][0-9A-Za-z-] can still be processed. This notably includes entries such as _dmarc, _spf and various types of domain ownership verification records.
That is a non-exhaustive list of entries being dropped from my logs; note that the DNS spec doesn't prevent other characters being present. The spec also does not constrain what a client might query for, which is what we're dealing with here, not a zone file.
Checklist
changelog.yml file.
Author's Checklist
How to test this PR locally
Use the provided patterns in packages/aws/data_stream/route53_public_logs/_dev/test/pipeline/test-route53.log