
Log Analytics with Amazon Kinesis and Amazon Elasticsearch Service
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
What to do with a terabyte of logs?
Log analytics architecture

data source → Amazon Kinesis Firehose → Amazon Elasticsearch Service → Kibana
Amazon Elasticsearch Service is a cost-effective managed service that makes it easy to deploy, manage, and scale open source Elasticsearch for log analytics, full-text search, and more.
Amazon Elasticsearch Service benefits

• Easy to use
• Scalable
• Highly available
• Open-source compatible
• Secure
• AWS integrated
Adobe Developer Platform (Adobe I/O)

Data sources → Amazon Kinesis Streams → Spark Streaming → Amazon Elasticsearch Service

PROBLEM
• Cost-effective monitoring for an XL amount of log data
• Over 200,000 API calls per second at peak – destinations, response times, bandwidth
• Integrate seamlessly with other components of the AWS ecosystem

SOLUTION
• Log data is routed with Amazon Kinesis to Amazon Elasticsearch Service, then displayed using Kibana

BENEFITS
• Management and operational simplicity
• Flexibility to try out different cluster configurations during dev and test
• The Adobe team can easily see traffic patterns and error rates, quickly identifying anomalies and potential challenges
McGraw Hill Education

PROBLEM
• Supporting a wide catalog across multiple services in multiple jurisdictions
• Over 100 million learning events each month
• Tests, quizzes, and learning modules begun / completed / abandoned

SOLUTION
• Search and analyze test results, student/teacher interaction, teacher effectiveness, and student progress
• Analytics of applications and infrastructure are now integrated to understand operations in real time

BENEFITS
• Confidence to scale throughout the school year – from 0 to 32 TB in 9 months
• Focus on their business, not their infrastructure
Get set up right
Amazon ES overview

[Architecture diagram: the Elasticsearch API, integrated with IAM, CloudTrail, CloudWatch, Elastic Load Balancing, and Amazon Route 53.]
Data pattern

• One index per day: logs_01.21.2017, logs_01.22.2017, ... logs_01.27.2017
• Each index has multiple shards (Shard 1, Shard 2, Shard 3)
• Each shard contains a set of documents
• Each document contains a set of fields and values (host, ident, auth, timestamp, etc.), all held in an Amazon ES cluster – see the sketch below
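
Below is a minimal sketch of writing one document under this daily-index pattern. The domain endpoint is a placeholder, and the plain unsigned HTTP call assumes an open access policy; signed requests are covered later.

import datetime
import json
import requests

ES_ENDPOINT = "https://search-mydomain.us-east-1.es.amazonaws.com"  # placeholder

def index_log_event(event):
    # One index per day, named like logs_01.21.2017 as above
    index_name = datetime.date.today().strftime("logs_%m.%d.%Y")
    resp = requests.post(f"{ES_ENDPOINT}/{index_name}/log",
                         data=json.dumps(event),
                         headers={"Content-Type": "application/json"})
    resp.raise_for_status()

index_log_event({"host": "199.72.81.55", "ident": "-", "auth": "-",
                 "timestamp": "2017-01-21T00:00:01Z", "verb": "GET"})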
Deployment of indices to a cluster

[Diagram: an Amazon ES cluster holding Index 1 (Shards 1–3) and Index 2 (Shards 1–3). Each shard has a primary and a replica, distributed across Instance 1, Instance 2, and Instance 3; one instance acts as master.]
How many instances?

The index size will be about the same as the corpus of source documents
• Double this if you are deploying an index replica

Size based on storage requirements
• Either local storage or up to 1.5 TB of EBS per instance
• Example: a 2 TB corpus will need 4 instances
– Assuming a replica and using EBS
– Or with i2.2xlarge nodes (1.6 TB ephemeral storage)

A worked sizing sketch follows.
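
As a rough sketch of the arithmetic above (the same formula appears under best practices later): double the corpus for a replica, then divide by per-node storage. Note that the raw division yields 3 nodes for the 2 TB example, so the slide's 4 instances presumably include headroom.

import math

def data_nodes_needed(corpus_tb, replicas=1, storage_per_node_tb=1.5):
    # Index size is roughly the corpus size; each replica adds another copy
    storage_needed_tb = corpus_tb * (1 + replicas)
    # Data nodes = storage needed / storage per node, rounded up
    return math.ceil(storage_needed_tb / storage_per_node_tb)

print(data_nodes_needed(2.0))                           # EBS-backed nodes: 3 minimum
print(data_nodes_needed(2.0, storage_per_node_tb=1.6))  # i2.2xlarge: 3 minimum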
Instance type recommendations

Instance   Workload
T2         Entry point. Dev and test.
M3, M4     Equal read and write volumes.
R3, R4     Read-heavy or workloads with high memory demands (e.g., aggregations).
C4         High concurrency/indexing workloads.
I2         Up to 1.6 TB of SSD instance storage.
Cluster with no dedicated masters

[Diagram: an Amazon ES cluster of three data nodes (Instance 1, Instance 2, Instance 3) holding the shards; one of the data nodes also acts as master.]
Cluster with dedicated masters

[Diagram: dedicated master nodes alongside the data nodes (Instance 1, Instance 2, Instance 3); the data nodes serve queries and updates.]
Master node selection

• < 10 nodes: m3.medium, c4.large
• 11–20 nodes: m4.large, r4.large, m3.large, r3.large
• 21–40 nodes: c4.xlarge, m4.xlarge, r4.xlarge, m3.xlarge
Cluster with zone awareness

[Diagram: four instances split across Availability Zone 1 and Availability Zone 2, with each shard's primary and replica placed in different zones.]
Small use cases

• Logstash co-located on the application instance
• SigV4 signing via the provided output plugin (see the sketch below)
• Up to 200 GB of data
• m3.medium + 100 GB EBS data nodes
• 3x m3.medium master nodes
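
A minimal sketch of the SigV4 signing the output plugin performs, using botocore's signer directly; the region and endpoint are placeholders.

import botocore.session
import requests
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

REGION = "us-east-1"                                                # placeholder
URL = "https://search-mydomain.us-east-1.es.amazonaws.com/_search"  # placeholder

credentials = botocore.session.Session().get_credentials()
request = AWSRequest(method="GET", url=URL)
SigV4Auth(credentials, "es", REGION).add_auth(request)  # "es" is the signing service name

resp = requests.get(URL, headers=dict(request.headers))
print(resp.status_code)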
Large use cases

• Data flows from instances and applications via Lambda, from sources such as Amazon DynamoDB, Amazon S3, and Amazon CloudWatch; CloudWatch Logs ingestion is implicit
• SigV4 signing via Lambda execution roles
• Up to 5 TB of data
• r3.2xlarge + 512 GB EBS data nodes
• 3x m3.medium master nodes
XL use cases

• Ingest supported through high-volume technologies like Spark (on Amazon EMR) or Amazon Kinesis
• Up to 60 TB of data
• r3.8xlarge + 640 GB data nodes
• 3x m3.xlarge master nodes
Best practices

• Data nodes = storage needed / storage per node
• Use GP2 EBS volumes
• Use 3 dedicated master nodes for production deployments
• Enable Zone Awareness
• Set indices.fielddata.cache.size = 40
A domain configuration sketch covering these settings follows.
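
A hedged sketch of a domain that follows these practices, using the boto3 es client; the domain name, instance types, and counts are illustrative, not prescriptive.

import boto3

es = boto3.client("es")
es.create_elasticsearch_domain(
    DomainName="log-analytics",                            # hypothetical name
    ElasticsearchClusterConfig={
        "InstanceType": "r3.2xlarge.elasticsearch",
        "InstanceCount": 4,                                # storage needed / storage per node
        "DedicatedMasterEnabled": True,
        "DedicatedMasterType": "m3.medium.elasticsearch",
        "DedicatedMasterCount": 3,                         # 3 dedicated masters for production
        "ZoneAwarenessEnabled": True,                      # requires an even instance count
    },
    EBSOptions={"EBSEnabled": True, "VolumeType": "gp2", "VolumeSize": 512},
    AdvancedOptions={"indices.fielddata.cache.size": "40"},
)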
Amazon Kinesis

Amazon Kinesis: Streaming Data Made Easy
Services make it easy to capture, deliver, and process streams on AWS:

• Amazon Kinesis Streams
• Amazon Kinesis Firehose
• Amazon Kinesis Analytics
Amazon Kinesis Streams

• Easy administration
• Build real-time applications with the framework of your choice
• Low cost
Amazon Kinesis Firehose

• Zero administration
• Direct-to-data store integration
• Seamless elasticity
Amazon Kinesis Analytics

• Interact with streaming data in real time using SQL
• Build fully managed and elastic stream-processing applications that process data for real-time visualizations and alarms
Amazon Kinesis - Firehose vs. Streams

Amazon Kinesis Streams is for use cases that require custom processing, per incoming record, with sub-1-second processing latency, and a choice of stream processing frameworks.

Amazon Kinesis Firehose is for use cases that require zero administration, the ability to use existing analytics tools based on Amazon S3, Amazon Redshift, and Amazon Elasticsearch, and a data latency of 60 seconds or higher.
Kinesis Firehose overview

• Delivery stream: the underlying AWS resource
• Destination: Amazon ES, Amazon Redshift, or Amazon S3
• Record: put records on the stream to deliver to the destination (see the sketch below)
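
A minimal sketch of putting a record on a delivery stream with boto3; the stream name and record contents are placeholders.

import json
import boto3

firehose = boto3.client("firehose")

log_line = {"host": "199.72.81.55", "verb": "GET", "timestamp": "2017-01-21T00:00:01Z"}
firehose.put_record(
    DeliveryStreamName="log-delivery-stream",              # hypothetical
    Record={"Data": (json.dumps(log_line) + "\n").encode("utf-8")},
)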
Kinesis Firehose Data Transformation

• Firehose buffers up to 3 MB of ingested data
• When the buffer is full, Firehose automatically invokes a Lambda function, passing an array of records to be processed (sketched after the example records below)
• The Lambda function processes and returns an array of transformed records, with a status for each record
• Transformed records are saved to the configured destination
[{" [{
"recordId": "1234", "recordId": "1234",
"data": "encoded-data" "result": "Ok"
}, "data": "encoded-data"
{ },
"recordId": "1235", {
"data": "encoded-data" "recordId": "1235",
} "result": "Dropped"
] "data": "encoded-data"
}
]
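
A sketch of a transformation function matching the contract above: Firehose hands the Lambda base64-encoded records, and each recordId must come back with a result of Ok, Dropped, or ProcessingFailed. The drop rule and the added field are purely illustrative.

import base64
import json

def handler(event, context):
    # Firehose invokes this with a batch; every recordId must be returned
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))
        if payload.get("verb") == "HEAD":                  # hypothetical filter rule
            output.append({"recordId": record["recordId"],
                           "result": "Dropped",
                           "data": record["data"]})
            continue
        payload["processed"] = True                        # illustrative enrichment
        data = base64.b64encode((json.dumps(payload) + "\n").encode("utf-8")).decode("utf-8")
        output.append({"recordId": record["recordId"], "result": "Ok", "data": data})
    return {"records": output}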
Kinesis Firehose delivery architecture with transformations

data source → Firehose delivery stream → data transformation function → transformed records → Amazon Elasticsearch Service

Source records, transformation failures, and delivery failures are written to an S3 bucket.
Kinesis Firehose features for ingest

• Serverless scale
• Error handling
• S3 backup
Best practices

• Use smaller buffer sizes to increase throughput, but be careful of concurrency
• Use index rotation based on sizing (see the sketch below)
• Default stream limits: 2,000 transactions/second, 5,000 records/second, and 5 MB/second
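
A hedged sketch of wiring buffering and index rotation into an Elasticsearch destination at stream creation time; the ARNs, names, and buffer values are illustrative.

import boto3

firehose = boto3.client("firehose")
firehose.create_delivery_stream(
    DeliveryStreamName="log-delivery-stream",              # hypothetical
    ElasticsearchDestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-role",     # placeholder
        "DomainARN": "arn:aws:es:us-east-1:123456789012:domain/logs",  # placeholder
        "IndexName": "logs",
        "TypeName": "log",
        "IndexRotationPeriod": "OneDay",                   # one index per day, as earlier
        "BufferingHints": {"IntervalInSeconds": 60, "SizeInMBs": 1},   # small buffer
        "S3BackupMode": "FailedDocumentsOnly",
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::123456789012:role/firehose-role",
            "BucketARN": "arn:aws:s3:::my-backup-bucket",              # placeholder
        },
    },
)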
Log analysis with aggregations

Amazon ES aggregations

• Buckets – a collection of documents meeting some criterion
• Metrics – calculations on the content of buckets
Metric: count
Bucket: time

Example: host:199.72.81.55 with <histogram of verb>

Field data for 199.72.81.55 (time, verb):
1 GET; 4 GET; 8 POST; 12 GET; 30 PUT; 42 GET; 58 GET; 100 POST; ...

Resulting buckets and counts: GET 5, POST 2, PUT 1

Look up → Field data → Buckets → Counts
A more complicated aggregation

• Bucket: ARN
• Bucket: Region
• Bucket: eventName
• Metric: Count
A query sketch for this nesting follows.
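
A sketch of that nesting as an Elasticsearch query body sent over HTTP; the index name and the CloudTrail-style field names are assumptions.

import json
import requests

ES_ENDPOINT = "https://search-mydomain.us-east-1.es.amazonaws.com"  # placeholder

query = {
    "size": 0,                                             # aggregations only, no hits
    "aggs": {
        "by_arn": {
            "terms": {"field": "userIdentity.arn"},        # assumed field name
            "aggs": {
                "by_region": {
                    "terms": {"field": "awsRegion"},       # assumed field name
                    "aggs": {
                        "by_event": {"terms": {"field": "eventName"}}
                    },
                }
            },
        }
    },
}
resp = requests.post(f"{ES_ENDPOINT}/logs_01.21.2017/_search",
                     data=json.dumps(query),
                     headers={"Content-Type": "application/json"})
# Each bucket carries a doc_count, which is the Metric: Count
print(json.dumps(resp.json()["aggregations"], indent=2))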
Best practices

• Make sure that your fields are not_analyzed (a mapping sketch follows this list)
• Visualizations are based on buckets/metrics
• Use a histogram on the x-axis first, then sub-aggregate
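
A sketch of a mapping that keeps string fields not_analyzed so terms aggregations bucket on whole values (Elasticsearch 2.x mapping syntax, current for this deck); the index, type, and field names are illustrative.

import json
import requests

ES_ENDPOINT = "https://search-mydomain.us-east-1.es.amazonaws.com"  # placeholder

mapping = {
    "mappings": {
        "log": {                                           # illustrative type name
            "properties": {
                "host": {"type": "string", "index": "not_analyzed"},
                "verb": {"type": "string", "index": "not_analyzed"},
                "timestamp": {"type": "date"},
            }
        }
    }
}
# Create the next daily index with the mapping in place before ingesting
resp = requests.put(f"{ES_ENDPOINT}/logs_01.28.2017",
                    data=json.dumps(mapping),
                    headers={"Content-Type": "application/json"})
print(resp.json())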


Run Elasticsearch in the AWS cloud with Amazon Elasticsearch Service
Use Kinesis Firehose to ingest data simply
Use Kibana for monitoring and Elasticsearch queries for deeper analysis
What to do next

Qwiklab:
https://qwiklabs.com/searches/lab?keywords=introduction%20to%20amazon%20elasticsearch%20service

Centralized logging solution:
https://aws.amazon.com/answers/logging/centralized-logging/

Our overview page on AWS:
https://aws.amazon.com/elasticsearch-service/
Q&A

Thank you for joining!
