Exercise 2 - Building A Log-Analytics Solution With Amazon Kinesis
Note
The exercises in this course will have an associated charge in your AWS account. In
this exercise, you will create the following resources:
The final exercise task includes instructions to delete all the resources that you
create for this exercise.
In this exercise, you produce data with the Kinesis agent, which runs on an EC2 instance.
The agent simulates one of the web servers in your organization’s large server farm. Then,
you ingest some dummy access logs with Kinesis Data Firehose. You move those logs to
Amazon S3. Then, you use Kinesis Data Analytics to read that data and aggregate the data
points. You send the aggregated data to another Kinesis Data
Firehose delivery stream that outputs the data to Amazon OpenSearch Service. Finally, you
visualize the data with OpenSearch Dashboards. The following schematic provides an
overview of your workflow:
Flowchart of exercise tasks
Setting up
When you sign in to the AWS Management Console, you must first ensure that you have
appropriate Identity and Access Management (IAM) users, roles, or policies to work with
cloud resources. IAM ensures that only the right users have permissions to perform certain
tasks. With IAM, you can securely control access to your account and set up granular
permissions on an as-needed basis.
In this exercise, you use an AWS CloudFormation template to configure backend
resources. A CloudFormation template is a JSON or YAML file that describes the AWS
resources to provision. We provide the template for you later in this section.
Before you upload the template in AWS CloudFormation, you must create an IAM role with
full administrative privileges for CloudFormation based on the EC2 use case.
1. In the AWS Management Console, enter IAM in the search field. Choose IAM from the
list.
6. Choose Next.
8. In the list of available options, select the AWS managed policy that provides full
access to AWS CloudFormation. Choose Next.
With the new IAM role, you now have access to AWS CloudFormation. You can also use
IAM roles to share temporary access with users who might need to access the AWS
resources associated with your account. For more information about IAM, see What is IAM?
1. In the search field of the AWS Management Console, enter CloudFormation. Choose
CloudFormation from the list.
2. At the top right of the console, make sure you are in the US East (N. Virginia) -
us-east-1 Region.
5. Select Choose file and browse to where you downloaded the exercise-2-kinesis
template.
7. Choose Next.
1. In the search field of the AWS Management Console, search for and open Amazon
OpenSearch Service.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "*"
        ]
      },
      "Action": [
        "es:*"
      ],
      "Resource": "arn:aws:es:us-east-1:<FMI>:domain/web-log-summary/*"
    }
  ]
}
Replace the FMI in the JSON code with your account number. When you replace the
FMI with your own value, make sure that you also delete the angle brackets (<>). You
can find your account number in the top-right area of the console, next to the Region.
It should have a format similar to 0000-0000-0000. Remove the dashes before you
save your changes. For example:
],
"Resource": "arn:aws:es:us-east-1:000000000000:domain/web-log-summary/*"
}
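The substitution described above can be sketched in Python. This is only an illustration of the text edit, not part of the exercise: the function name build_access_policy and its parameters are hypothetical, and the sketch just strips the dashes from the account number and fills it into the Resource ARN shown above.

```python
import json
import re


def build_access_policy(account_number: str,
                        domain: str = "web-log-summary",
                        region: str = "us-east-1") -> str:
    """Return the domain access policy with the account number filled in.

    Hypothetical helper for illustration; the exercise itself edits the
    policy by hand in the console.
    """
    # The console shows the number as 0000-0000-0000; the ARN needs 12 bare digits.
    account_id = account_number.replace("-", "").strip()
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError(f"expected a 12-digit account number, got {account_number!r}")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": ["*"]},
                "Action": ["es:*"],
                # No angle brackets and no dashes remain in the final ARN.
                "Resource": f"arn:aws:es:{region}:{account_id}:domain/{domain}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    print(build_access_policy("0000-0000-0000"))
```

Validating the 12-digit shape before building the ARN catches the two mistakes the text warns about: leftover dashes and leftover angle brackets.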
1. In the search field of the AWS Management Console, search for and open Kinesis.
Choosing Create for Destination settings opens a separate browser tab so that you
can create an S3 bucket.
As an example, you can use the following naming convention and replace the FMI with
your initials.
<FMI>-web-log-ingestion-bucket
Example:
emr-web-log-ingestion-bucket
5. For Region, keep US East (N. Virginia) us-east-1. Your OpenSearch Service cluster
and Kinesis streams are provisioned in this Region.
8. Next to the S3 bucket box, choose Browse and select the S3 bucket that you
created. If the bucket isn’t listed, choose the refresh icon.
10. Choose Create delivery stream. This process can take up to 5 minutes to complete.
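As a rough aid for the bucket naming convention used in this task, the following Python sketch builds a name from your initials and checks it against a conservative subset of the S3 naming rules (3 to 63 characters; lowercase letters, digits, and hyphens; starting and ending with a letter or digit). The helper name ingestion_bucket_name is hypothetical. The check cannot tell you whether the name is already taken, and bucket names must be globally unique.

```python
import re

# Conservative subset of the S3 bucket naming rules: 3-63 characters,
# lowercase letters, digits, and hyphens, starting and ending with a
# letter or digit. (S3 also permits dots, which this sketch rejects.)
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def ingestion_bucket_name(initials: str,
                          suffix: str = "web-log-ingestion-bucket") -> str:
    """Build '<initials>-<suffix>' and reject names S3 would refuse."""
    name = f"{initials.strip().lower()}-{suffix}"
    if not _BUCKET_NAME.fullmatch(name):
        raise ValueError(f"{name!r} is not a valid bucket name under these rules")
    return name


if __name__ == "__main__":
    print(ingestion_bucket_name("EMR"))
```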
1. In the search field of the AWS Management Console, search for and open EC2.
3. Open the instance summary information of the Dummy Web Server instance by
choosing its Instance ID.
4. Choose Connect.
5. In Connect to instance, choose the Session Manager tab and choose Connect.
This action loads a session, and you should be presented with a shell prompt: sh-4.2$.
6. In the session terminal, install the Kinesis Agent on the instance by running the
following command:
It may take a minute for the command to start installing the Kinesis Agent.
1. Confirm that you want to install the required packages by pressing Y on your
keyboard.
Next, you will edit the agent.json file by replacing the value of the deliveryStream
key with the name of the delivery stream that you created.
2. In your text editor of choice, open the /etc/aws-kinesis/agent.json file. The following
example uses Vim:
sudo vim /etc/aws-kinesis/agent.json
3. View the current contents of agent.json , which should be similar to this example:
{
  "cloudwatch.emitMetrics": true,
  "kinesis.endpoint": "",
  "firehose.endpoint": "",
  "flows": [
    {
      "filePattern": "/tmp/app.log*",
      "kinesisStream": "yourkinesisstream",
      "partitionKeyOption": "RANDOM"
    },
    {
      "filePattern": "/tmp/app.log*",
      "deliveryStream": "yourdeliverystream"
    }
  ]
}
4. Use the arrow keys to navigate the file and press i to start editing.
Delete the existing contents of agent.json and paste the following code. Make sure that
you replace the FMI value for deliveryStream with the name of your delivery stream
(which should be web-log-ingestion-stream).
{
  "cloudwatch.emitMetrics": true,
  "kinesis.endpoint": "",
  "firehose.endpoint": "",
  "flows": [{
    "filePattern": "/tmp/logs/access_log*",
    "deliveryStream": "<FMI>",
    "dataProcessingOptions": [{
      "optionName": "LOGTOJSON",
      "logFormat": "COMMONAPACHELOG"
    }]
  }]
}
Note: If you gave the delivery stream a different name, exit the agent.json file and
retrieve the name of your delivery stream by running the following command:
{
  "DeliveryStreamNames": [
    "web-log-ingestion-stream"
  ],
  "HasMoreDeliveryStreams": false
}
The updated agent.json file should have the following:
{
  "cloudwatch.emitMetrics": true,
  "kinesis.endpoint": "",
  "firehose.endpoint": "",
  "flows": [{
    "filePattern": "/tmp/logs/access_log*",
    "deliveryStream": "web-log-ingestion-stream",
    "dataProcessingOptions": [{
      "optionName": "LOGTOJSON",
      "logFormat": "COMMONAPACHELOG"
    }]
  }]
}
5. Press the Esc key to switch back to command mode. If you are using Vim, enter :wq
to save and exit the file.
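If you want to double-check the edited file before relying on it, a small script along the following lines can help. It is a minimal sketch, not part of the exercise: the function check_agent_config is hypothetical, and it only verifies that the text is valid JSON, that a flow exists, and that deliveryStream no longer holds the placeholder. The SAMPLE string mirrors the updated agent.json shown above; to check the real file, you would read /etc/aws-kinesis/agent.json instead.

```python
import json

# Mirrors the updated agent.json from this exercise.
SAMPLE = """
{
  "cloudwatch.emitMetrics": true,
  "kinesis.endpoint": "",
  "firehose.endpoint": "",
  "flows": [{
    "filePattern": "/tmp/logs/access_log*",
    "deliveryStream": "web-log-ingestion-stream",
    "dataProcessingOptions": [{
      "optionName": "LOGTOJSON",
      "logFormat": "COMMONAPACHELOG"
    }]
  }]
}
"""


def check_agent_config(text, expected_stream="web-log-ingestion-stream"):
    """Return a list of problems found in agent.json text (empty if none)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(config, dict):
        return ["top-level value must be a JSON object"]
    problems = []
    flows = config.get("flows") or []
    if not flows:
        problems.append("no flows defined")
    for i, flow in enumerate(flows):
        stream = flow.get("deliveryStream", "")
        if "<" in stream or ">" in stream:
            problems.append(f"flow {i}: deliveryStream still contains the placeholder")
        elif stream != expected_stream:
            problems.append(f"flow {i}: deliveryStream is {stream!r}, expected {expected_stream!r}")
        if not flow.get("filePattern"):
            problems.append(f"flow {i}: filePattern is missing")
    return problems


if __name__ == "__main__":
    print(check_agent_config(SAMPLE) or "agent.json looks consistent")
```

If you gave the delivery stream a different name, pass that name as expected_stream.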
If you want to see what is happening, or if you want to troubleshoot the agent, you can
use the tail -f command on the /var/log/aws-kinesis-agent/aws-kinesis-agent.log
file.
Note: It might take approximately 5 minutes before data starts to appear in your
Firehose delivery stream.
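The LOGTOJSON option with the COMMONAPACHELOG format asks the agent to turn each access-log line into a JSON record before sending it to the delivery stream. The following Python sketch approximates that conversion to show the idea. It is an assumption-laden illustration, not the agent's implementation: the regular expression, the field names (host, ident, authuser, datetime, request, response, bytes), and the sample log line are chosen here for demonstration.

```python
import json
import re

# Common Log Format: host ident authuser [datetime] "request" status bytes
_COMMON_LOG = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<authuser>\S+) '
    r'\[(?P<datetime>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<response>\d{3}) (?P<bytes>\S+)'
)


def common_log_to_json(line: str) -> str:
    """Convert one Common Log Format line to a JSON string (illustrative)."""
    match = _COMMON_LOG.match(line)
    if match is None:
        raise ValueError(f"not a Common Log Format line: {line!r}")
    # A "-" in the log marks a missing value; emit it as JSON null.
    record = {key: (None if value == "-" else value)
              for key, value in match.groupdict().items()}
    return json.dumps(record)


if __name__ == "__main__":
    # Made-up sample line, not taken from the exercise's dummy logs.
    sample = ('192.168.0.10 - - [10/Oct/2022:13:55:36 +0000] '
              '"GET /index.html HTTP/1.1" 200 2326')
    print(common_log_to_json(sample))
```

Structured records like these are what make the later aggregation step possible, because fields such as the response code can be referenced by name.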
2. On the Data Firehose card, choose Create delivery stream and configure the
following settings.
Source: Direct PUT
Destination: Amazon OpenSearch Service
Delivery stream name: web-log-aggregated-data
Destination settings: Browse to and select the web-log-summary domain
Index: request_data
Backup settings: Create
Again, this action opens a new browser tab so that you can create an Amazon S3
bucket.
<FMI>-web-log-aggregated-errors
Example:
emr-web-log-aggregated-errors
8. Select the bucket that you created for this stream, and then choose Choose.
9. Expand Buffer hints, compression and encryption, and configure the following
settings.
10. Choose Create delivery stream and wait for the Status to become Active.
11. In the navigation pane of the AWS Management Console, choose Services, and
search for and open IAM.
12. In the IAM navigation pane, under Access management, choose Roles.
14. Copy the role’s ARN value, which should look similar to the following:
arn:aws:iam::000000000000:role/service-role/KinesisFirehoseServiceRole-web-log-aggre
15. Return to the OpenSearch Service console, and in the navigation pane, choose
Domains.
17. Choose the link for the OpenSearch Dashboards URL. This action launches the
dashboard.
20. If you see a Select your tenant dialog box, keep the Private setting, and choose
Confirm.
21. Expand the navigation pane by choosing the menu icon (in the upper-left area of the
OpenSearch Dashboards console).
24. Search for the all_access role and open its details by choosing the Role link.
25. Choose the Mapped users tab and choose Manage mapping.
26. In the Users box, paste the IAM role ARN that you copied earlier and press Enter.
arn:aws:iam::000000000000:role/service-role/KinesisFirehoseServiceRole-web-log-aggre
The Mapped users list should have an admin user and a user with the IAM role ARN
that you copied. They should look similar to this example:
7. Scroll to the Source tab, and in the Source stream section, choose Configure.
Example schema
10. At the top of the web-log-aggregation-app pane, expand Steps to configure your
application.
11. In Step 2, choose Configure SQL. With the SQL editor, you can write queries to
process streaming data.
The application should start after 30–90 seconds. You should see the message
Application web-log-aggregation-app has been successfully started.
You have now configured the application source and real-time analytics.
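To give a feel for what aggregating data points over a stream means, here is a plain-Python sketch of a tumbling-window count. It does not reproduce the application's SQL, which is not shown in this section. The grouping by response code, the 10-second window, and the function name tumbling_window_counts are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def tumbling_window_counts(records, window_seconds=10):
    """Count records per response code in fixed, non-overlapping windows.

    `records` is an iterable of (epoch_seconds, response_code) pairs.
    Returns {window_start_seconds: Counter({response_code: count})}.
    """
    windows = defaultdict(Counter)
    for timestamp, response in records:
        # Every timestamp belongs to exactly one window, identified by its start.
        window_start = int(timestamp) - int(timestamp) % window_seconds
        windows[window_start][response] += 1
    return dict(windows)


if __name__ == "__main__":
    # Made-up events: (seconds since some start, HTTP response code).
    events = [(0, "200"), (3, "200"), (9, "404"), (10, "200"), (19, "500")]
    for start, counts in sorted(tumbling_window_counts(events).items()):
        print(start, dict(counts))
```

Each window produces one small summary record per response code, which is far less data than the raw log lines and is the kind of output that suits a dashboard.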
5. On the Sample web logs tile, choose Add data. For more information on how to
explore data in OpenSearch, see Getting started with OpenSearch Dashboards.
This solution uses Amazon Kinesis to ingest the raw data, and then saves the data in
Amazon S3. It delivers only a refined version of the data to OpenSearch Service.
Depending on how you want to architect your data lake solution, and on other
considerations such as budget, you might want to take a different
approach. For example, you might send the raw logs directly to OpenSearch Service, and
then do all the filtering in OpenSearch Dashboards. How you design your solution is up
to you, but remember that it’s important to match the workload to the need.
Cleaning up
In this task, you delete the AWS resources that you created for this exercise.
© 2022 Amazon Web Services, Inc. or its affiliates. All rights reserved. This work may not be
reproduced or redistributed, in whole or in part, without prior written permission from Amazon Web
Services, Inc. Commercial copying, lending, or selling is prohibited. Corrections, feedback, or
other questions? Contact us at https://support.aws.amazon.com/#/contacts/aws-training. All
trademarks are the property of their owners.