aws/data_stream/route53_public_logs: Handle non-%HOSTNAME entries #9249

Merged
merged 11 commits into from
Jul 22, 2024

Conversation

cafuego
Contributor

@cafuego cafuego commented Feb 28, 2024

Proposed commit message

Change the HOSTNAME pattern to DATA in the grok snippet to match route53 public logs, so that DNS records that contain characters not included in [0-9A-Za-z][0-9A-Za-z-] can still be processed.

This notably includes entries such as _dmarc, _spf and various types of domain ownership verification records.

This is a non-exhaustive list of the entries being dropped from my logs; the DNS spec doesn't prevent other characters from being present. Nor does the spec constrain what a client might query for, and client queries, not a zone file, are what we're dealing with here.
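
To illustrate the difference, here is a minimal sketch in Python rather than in the ingest pipeline itself. It assumes the stock grok definitions of HOSTNAME and DATA and uses a deliberately shortened, hypothetical log line (zone id, question, record type) instead of the full Route 53 format; the zone id and names are made up.

```python
import re

# Stock grok definitions, transcribed as Python regular expressions.
HOSTNAME = r"\b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b)"
DATA = r".*?"


def build(question_pattern):
    # Shortened stand-in for the real line: zone id, question, record type.
    return re.compile(
        rf"^(?P<zone>\S+) (?P<question>{question_pattern}) (?P<qtype>\w+)$"
    )


def question(pattern, line):
    # Return the captured question name, or None when the line does not parse.
    m = pattern.match(line)
    return m.group("question") if m else None


strict = build(HOSTNAME)   # behaviour before this change
lenient = build(DATA)      # behaviour after this change

plain = "Z123EXAMPLE www.example.com A"        # hypothetical sample line
dmarc = "Z123EXAMPLE _dmarc.example.com TXT"   # hypothetical sample line

print(question(strict, plain), question(strict, dmarc))
print(question(lenient, plain), question(lenient, dmarc))
```

With the strict pattern the `_dmarc` line fails to parse at all, which is why such entries were being dropped; the lenient pattern captures the question verbatim.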

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

Author's Checklist

  • [ ]

How to test this PR locally

Use the provided patterns in packages/aws/data_stream/route53_public_logs/_dev/test/pipeline/test-route53.log

@cafuego cafuego requested review from a team as code owners February 28, 2024 05:43
@cafuego cafuego force-pushed the route53-log-hostnames branch from 5a39b42 to d49c066 Compare February 28, 2024 05:46
@botelastic

botelastic bot commented Mar 29, 2024

Hi! We just realized that we haven't looked into this PR in a while. We're sorry! We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1. Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Mar 29, 2024
@botelastic botelastic bot removed the Stalled label Mar 29, 2024
@cafuego
Contributor Author

cafuego commented Mar 29, 2024

Yup, still relevant and does not contain a dependency on xz 🤪

@botelastic

botelastic bot commented Apr 29, 2024

Hi! We just realized that we haven't looked into this PR in a while. We're sorry! We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1. Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Apr 29, 2024
@ishleenk17
Member

/test

@botelastic botelastic bot removed the Stalled label Apr 29, 2024
@ishleenk17
Member

@cafuego: Can we please resolve the merge conflicts?

@shmsr shmsr changed the title Enhancement: - handle non-%HOSTNAME entries in route53 logs aws/data_stream/route53_public_logs: Handle non-%HOSTNAME entries Apr 29, 2024
@shmsr shmsr added the enhancement New feature or request label Apr 29, 2024
@cafuego
Contributor Author

cafuego commented Apr 29, 2024

/test

@cafuego
Contributor Author

cafuego commented Apr 29, 2024

@shmsr Yes, but buildkite seems to be unhappy about something; it's not completing the test or pinging back with the result (which has been fine previously)

@botelastic

botelastic bot commented May 29, 2024

Hi! We just realized that we haven't looked into this PR in a while. We're sorry! We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1. Thank you for your contribution!

@botelastic botelastic bot added the Stalled label May 29, 2024
@botelastic botelastic bot removed the Stalled label May 29, 2024
@cafuego cafuego removed their assignment May 29, 2024
@botelastic

botelastic bot commented Jun 28, 2024

Hi! We just realized that we haven't looked into this PR in a while. We're sorry! We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1. Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Jun 28, 2024
@botelastic botelastic bot removed the Stalled label Jul 18, 2024
@andrewkroh andrewkroh added the Team:Security-Service Integrations Security Service Integrations team [elastic/security-service-integrations] label Jul 19, 2024
@elasticmachine

Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)

Comment on lines +34 to 35
- '%{BASE10NUM} %{TIMESTAMP_ISO8601:_tmp.timestamp} %{DATA:aws.route53.hosted_zone_id} %{DATA:_tmp.question} %{WORD:dns.question.type} %{WORD:dns.response_code} %{WORD:network.transport} %{EDGE_LOCATION:aws.route53.edge_location} %{IP:source.address} (%{SUBNET:aws.route53.edns_client_subnet}|-)'
pattern_definitions:
Contributor

The alternative would be to provide an augmented hostname pattern that includes all the codepoints that occur in DNS names. This only requires that we add _ to the possible codepoints.

Suggested change
- - '%{BASE10NUM} %{TIMESTAMP_ISO8601:_tmp.timestamp} %{DATA:aws.route53.hosted_zone_id} %{DATA:_tmp.question} %{WORD:dns.question.type} %{WORD:dns.response_code} %{WORD:network.transport} %{EDGE_LOCATION:aws.route53.edge_location} %{IP:source.address} (%{SUBNET:aws.route53.edns_client_subnet}|-)'
- pattern_definitions:
+ - '%{BASE10NUM} %{TIMESTAMP_ISO8601:_tmp.timestamp} %{DATA:aws.route53.hosted_zone_id} %{DNS_QUESTION:_tmp.question} %{WORD:dns.question.type} %{WORD:dns.response_code} %{WORD:network.transport} %{EDGE_LOCATION:aws.route53.edge_location} %{IP:source.address} (%{SUBNET:aws.route53.edns_client_subnet}|-)'
+ pattern_definitions:
+   DNS_QUESTION: \b(?:[_0-9A-Za-z][_0-9A-Za-z-]{0,62})(?:\.(?:[_0-9A-Za-z][_0-9A-Za-z-]{0,62}))*(\.?|\b)
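
As a rough check of what the suggested DNS_QUESTION pattern accepts, the sketch below transcribes it into a Python regular expression and tests a few names. The helper `is_dns_question` and the sample names are hypothetical and only there for illustration; grok in Elasticsearch uses a different regex engine, so treat this as an approximation.

```python
import re

# The suggested DNS_QUESTION definition, transcribed as a Python regex.
DNS_QUESTION = re.compile(
    r"\b(?:[_0-9A-Za-z][_0-9A-Za-z-]{0,62})"
    r"(?:\.(?:[_0-9A-Za-z][_0-9A-Za-z-]{0,62}))*(\.?|\b)"
)


def is_dns_question(name):
    # True when the whole name is covered by the pattern.
    return DNS_QUESTION.fullmatch(name) is not None


# Hypothetical query names of the kind mentioned in the PR description.
for name in ("_dmarc.example.com", "_acme-challenge.www.example.com", "example.com."):
    print(name, is_dns_question(name))
```

Underscore-prefixed labels pass because `_` was added to both character classes; anything outside letters, digits, `_`, `-` and `.` is still rejected.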

Contributor Author

At the very least you also need to add an escaped backslash, or you'll still miss any encoded UTF-8 data.

DNS entries can also contain any binary rubbish you'd care to put in (although Route53 may not allow you to add that) so from a log analysis perspective it's important to grab those logs and allow them to be analysed.
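
A small sketch of the point about characters outside the augmented class: the helper below (hypothetical, not part of the pipeline) lists the characters in a name that the suggested DNS_QUESTION pattern would not accept. The backslash-escaped sample names are assumptions about how encoded octets might appear in a log line, not confirmed Route 53 output.

```python
import re

# Characters the suggested DNS_QUESTION pattern accepts anywhere in a name.
ALLOWED = re.compile(r"[_0-9A-Za-z.-]")


def unsupported_chars(name):
    # Sorted list of distinct characters the pattern would reject.
    return sorted({ch for ch in name if not ALLOWED.fullmatch(ch)})


# Hypothetical names; the escaped forms are an assumption for illustration.
for name in ("_dmarc.example.com", r"\052.example.com", r"caf\195\169.example.com"):
    print(name, unsupported_chars(name))
```

Any name for which the list is non-empty would fail a character-class based pattern and the log line would be dropped, whereas DATA captures it regardless.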

Contributor

> DNS entries can also contain any binary rubbish you'd care to put in

That's horrifying, but point well made.

@efd6
Contributor

efd6 commented Jul 22, 2024

/test

@cafuego
Contributor Author

cafuego commented Jul 22, 2024

Aww, sad trombone. I'll update the test.

…e/test-route53.log-expected.json

Co-authored-by: Dan Kortschak <[email protected]>
@efd6
Contributor

efd6 commented Jul 22, 2024

/test

@elasticmachine

🚀 Benchmarks report

To see the full report comment with /test benchmark fullreport

@elasticmachine

💚 Build Succeeded

Contributor

@efd6 efd6 left a comment

Thanks

@efd6 efd6 merged commit 782c4ea into elastic:main Jul 22, 2024
5 checks passed
@elasticmachine

Package aws - 2.21.0 containing this change is available at https://epr.elastic.co/search?package=aws

@cafuego
Contributor Author

cafuego commented Jul 22, 2024

@efd6 thanks mate!

Labels
enhancement New feature or request Integration:aws AWS Team:Security-Service Integrations Security Service Integrations team [elastic/security-service-integrations]

6 participants