Big Data Visualizer Course Notes
Level 01
Introduction
After completing this course you will be able to identify the principles needed
to leverage data of large volume, variety and growth.
As a Big Data visualizer you can develop applications that obtain, clean and process all
types of data from various sources. The goal is to generate and present
graphs that give you a view of the behavior of an organization. This view is
highly valued, since it can suggest the direction of an entire company with the objective of
improving processes, minimizing costs and finding growth opportunities.
Lesson 01
Big Data refers to the use of an immense amount of data, requiring the ability to obtain, store,
manipulate and analyze millions of records, which would be impossible to do with
conventional analysis tools (data capture, relational databases, pivot tables).
Main features
Volume
Variety (Structured and unstructured data)
Velocity
Other features
Veracity
Value
Purposes of Big Data
Improve operations
Complex decision making
Cost reduction
Time reduction
Deployment of personalized offers
Business intelligence
Lesson 02
Ecosystem
Workflow
Data collection
Big Data
Big Analytics
Users
Lifecycle
Clusters. A cluster is a group of computers connected to each other; each one is known as a node.
Advantages:
Parallel work
High performance
High workload support
Scalability
Level 02
Lesson 01
API Types
Before starting data acquisition, install and import the required libraries in Python
(a social network library and the Requests library), then declare the following variables:
import facebook  # social network library (facebook-sdk package)
import requests
token = "EAAC."  # access token, truncated in these notes
graph = facebook.GraphAPI(token)
quantityComments = 100  # number of comments to collect
PageId = 13254565767
LikesCount = 0
CommentList = []
Flag = False
# Fetch the posts published on the page feed
Comments = graph.get_connections(PageId, 'feed')
Lesson 02
Types of databases
Key-Value Oriented
Document-oriented
Column oriented
Graph-oriented
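As a rough illustration of the four database types listed above, the sketch below lays out the same kind of information in each style using plain Python structures. The customer names, keys and helper functions are invented for the example; real database engines expose their own APIs.

```python
# One customer record laid out in the four styles listed above.
# All names and values here are made up for illustration.

# Key-value: an opaque value looked up by a single key.
key_value_store = {"customer:42": '{"name": "Ana", "city": "Lima"}'}

# Document: a nested, self-describing record.
document_store = {
    "customers": [
        {"_id": 42, "name": "Ana", "orders": [{"sku": "A1", "qty": 2}]}
    ]
}

# Column-oriented: values grouped by column rather than by row.
column_store = {
    "id": [42, 43],
    "name": ["Ana", "Luis"],
    "city": ["Lima", "Quito"],
}

# Graph: nodes plus the edges (relationships) between them.
graph_store = {
    "nodes": {42: "Ana", 43: "Luis"},
    "edges": [(42, "FOLLOWS", 43)],
}


def column_values(store, column):
    """Read one whole column without touching the others."""
    return store[column]


def neighbours(store, node_id):
    """Return the nodes that node_id points to."""
    return [dst for src, _, dst in store["edges"] if src == node_id]


print(column_values(column_store, "city"))
print(neighbours(graph_store, 42))
```

The two helpers hint at why the layout matters: a column store reads a single attribute cheaply, while a graph store answers relationship questions directly.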
Level 03
Lesson 01
Identify:
Patterns
Correlations
Trends
Customer preferences
Goals
Benefits
Improvement of services
More efficient operations that give an advantage over the
competition.
Predictive analysis
Data mining
Text analysis
Statistical analysis
Machine Learning
Data visualization
Other tools based on NoSQL data analysis
Sectors that use Big Data
Machine Learning refers to the study of data in a way that lets the machine learn without needing to
be explicitly programmed. It is achieved by building algorithms that create a model from
a sample of data, on which machines base their predictions or decisions.
It is suited to complex models that are difficult to build conventionally, since
these end up being built on their own.
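A minimal sketch of "creating a model from a sample of data" is a least-squares line fit: the slope and intercept are not written by the programmer but computed from the sample, and the resulting model is then used to predict. The function names and the sample values are assumptions made for this illustration.

```python
def fit_line(xs, ys):
    """Learn slope and intercept from sample data by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical sample: the relationship y = 2x + 1 is never coded explicitly.
sample_x = [1, 2, 3, 4]
sample_y = [3, 5, 7, 9]

model = fit_line(sample_x, sample_y)
print(predict(model, 5))
```

This is a regression example; the other techniques in these notes (classification, clustering, and so on) follow the same idea of deriving the model from data.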
Classification
Statistical classification
Clustering
Regression
Anomaly detection
Association rules
Sentiment analysis
Programming logic
import requests
Apikey = 'ab881ef9-5941-45d7-95a7-595fc89d129d'
Language = 'eng'
# Request URL template: {0} is the text, {1} the language and {2} the API key
ligaPetition = 'https://api.havenondemand.com/1/api/sync/analyzesentiment/v1?text={0}&language={1}&apikey={2}'
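The request URL can be assembled from its three parts roughly as follows. This is only a sketch: the helper name `build_request_url` and the placeholder key are assumptions, the text is URL-encoded so that spaces survive in the query string, and whether the service still responds is not checked here.

```python
from urllib.parse import quote

# URL template with three slots: text, language and API key.
ligaPetition = ("https://api.havenondemand.com/1/api/sync/analyzesentiment/v1"
                "?text={0}&language={1}&apikey={2}")


def build_request_url(text, language, apikey):
    """Fill the template, URL-encoding the text to analyze."""
    return ligaPetition.format(quote(text), language, apikey)


# "YOUR-API-KEY" is a placeholder, not a real credential.
url = build_request_url("I love this product", "eng", "YOUR-API-KEY")
print(url)
```

With the Requests library, the call itself would then be `requests.get(url)`, and the JSON body of the response would carry the sentiment result.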
Lesson 02
Web Scraping
Take the following into account when developing this type of program:
Use the find method to locate a substring within the text. This method returns the position
of the substring within the entire text, so you must count positions to delimit the data you want from the
beginning to the end. Write functions to obtain the data of interest. The part that
processes the acquired data is where you program the logic to follow. Consider the
type of update, and keep the program in a loop so that it runs continuously until you want to
stop it.
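The find-based approach described above can be sketched as a small helper that locates a start marker, skips past it, and then looks for the end marker. The page content, the markers and the function name are assumptions chosen for the example.

```python
def extract_between(text, start_marker, end_marker):
    """Return the text between two markers, or None if either is missing."""
    start = text.find(start_marker)
    if start == -1:
        return None
    # Move past the marker so the slice begins at the data itself.
    start += len(start_marker)
    end = text.find(end_marker, start)
    if end == -1:
        return None
    return text[start:end]


# Hypothetical page fragment standing in for a downloaded document.
page = '<html><span id="price">152.30</span></html>'
price = extract_between(page, '<span id="price">', '</span>')
print(price)
```

In a continuously running scraper, a call like this would sit inside a loop that downloads the page, extracts the value, processes it, and then waits before the next pass.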
Lesson 03
Creation of web graphics
In an HTML file add the HTML structure, libraries and graphics titles
Add the plotOptions object to add the percentage of each bar
Add the series object with the total for each bar; clicking a total will display
the detail of that bar.
Then add the drilldown property, with its respective name, to each total.
Add another object called drilldown, followed by a series list, to specify the detail
of each bar.
Add an object with detailed information about each bar and give it the ID with the
name that you indicated as drilldown in the totals objects.
Reload the page and check the detail functionality in each bar.
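The relationship between totals and details can be sketched as data. The structure below assumes a Highcharts-style options object (the notes do not name the charting library), written as a Python dictionary that could be serialized to JSON for the page; the browser figures and the helper `unmatched_drilldowns` are invented for the example.

```python
import json

# Totals: one point per bar, each naming the detail series it opens.
totals = [
    {"name": "Chrome", "y": 60.0, "drilldown": "chrome"},
    {"name": "Firefox", "y": 40.0, "drilldown": "firefox"},
]

# Details: one series per bar, whose id matches the drilldown name above.
details = [
    {"id": "chrome", "name": "Chrome", "data": [["v97", 35.0], ["v96", 25.0]]},
    {"id": "firefox", "name": "Firefox", "data": [["v96", 30.0], ["v95", 10.0]]},
]

options = {
    "chart": {"type": "column"},
    "title": {"text": "Browser share"},
    # plotOptions shows the percentage on top of each bar.
    "plotOptions": {
        "series": {"dataLabels": {"enabled": True, "format": "{point.y:.1f}%"}}
    },
    "series": [{"name": "Browsers", "data": totals}],
    "drilldown": {"series": details},
}


def unmatched_drilldowns(opts):
    """List drilldown names in the totals that have no detail series."""
    ids = {s["id"] for s in opts["drilldown"]["series"]}
    return [
        p["drilldown"]
        for s in opts["series"]
        for p in s["data"]
        if p["drilldown"] not in ids
    ]


print(json.dumps(options, indent=2))
print(unmatched_drilldowns(options))
```

A mismatch between a total's drilldown name and the detail id is a common reason for a bar that does nothing when clicked, which is what the helper checks.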
A web service is a program that runs on the server side to exchange information
between applications. Web services can provide information in two formats: XML and JSON. A
web service requires the following:
Go to the folder where your server files are hosted and paste the “.dll” library in
the extensions folder, rename the library.
Open the configuration file and register the library, restart the server.
Check the installed extensions.
To develop an application that works in real time, you need to query a database using a
PHP web service that returns a JSON. To program it, follow these steps:
To draw a graph with HTML and JavaScript in real time, starting from an online graph,
follow these steps:
Add the series variable in the load part, so that the graph is updated every time a new
record is entered into the database.
Add two temporary variables in the document ready part, one for x and one for y, where
you save the last record added to the graph.
Create the get and set functions that will allow you to read and modify those two
variables.
Add an ajax request, which lets you make queries and update the graph without
refreshing the page. The request requires two pieces of information: the url of
the web service with the request to bring all the data, and the type of request, in this case
get.
Scan the JSON content of each response and save each of its elements in an
array.
Store the last x and y values in the temporary variables.
Finally, assign the array to the data parameter so that the data is drawn on the graph.
Now you can see how the data is plotted on the graph, but it is not yet in real time.
Modify the interval function that already comes with the graph, adding a request to the
web service, but this time with the parameter set to 1 to bring only the last record.
Compare the returned record with the last record on the graph.
Save the new records in temporary variables.
Finally, assign an interval of 1000 milliseconds to run this request again.
In this way you can create applications that allow you to observe and analyze in real
time the behavior of users on social networks, up to the latest movements in the stock
markets.
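The update logic of the steps above (load everything once, remember the last point, then poll for the newest record and append it only if it changed) can be sketched independently of the page. Here the web service is simulated by a local function, and the names `fetch`, `poll` and `series` are assumptions for the illustration; the parameter values 0 and 1 mirror the "all records" and "last record" requests.

```python
# Stand-in for the database behind the web service (hypothetical data).
records = [{"x": "1", "y": "2"}, {"x": "2", "y": "5"}]


def fetch(consult):
    """Simulated web service: 0 returns every record, 1 only the last."""
    return records[-1:] if consult == 1 else list(records)


# Initial load: draw everything and remember the last point.
series = [(int(r["x"]), int(r["y"])) for r in fetch(0)]
last_x, last_y = series[-1]


def poll():
    """One tick of the interval: append the newest record if it is new."""
    global last_x, last_y
    newest = fetch(1)[0]
    x, y = int(newest["x"]), int(newest["y"])
    if (x, y) != (last_x, last_y):
        series.append((x, y))
        last_x, last_y = x, y
        return True
    return False


first = poll()                            # nothing new yet
records.append({"x": "3", "y": "8"})      # a record arrives in the database
second = poll()                           # the new point is appended
print(first, second, series)
```

In the page itself, the equivalent of `poll` would run every 1000 milliseconds, so the graph follows the database with about a one-second delay.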
Structure of a dashboard
Define the KPIs of the organization and list them in order of importance. Select the three or
four most important; if there are more than four, divide them into different
groups so as not to saturate the dashboard. Select the graphs that show the behavior of the
chosen KPIs. Arrange the graphs in a single panel and add filters, sorting functions and
descriptions.
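The rule of at most four KPIs per group can be checked with a few lines. This is a sketch with made-up KPI names; it assumes the list is already sorted by importance.

```python
def split_into_dashboards(kpis, max_per_dashboard=4):
    """Split an ordered KPI list into groups of at most four."""
    return [
        kpis[i:i + max_per_dashboard]
        for i in range(0, len(kpis), max_per_dashboard)
    ]


# Ten hypothetical KPIs, already sorted by importance.
kpis = ["KPI {}".format(n) for n in range(1, 11)]
dashboards = split_into_dashboards(kpis)
print(len(dashboards))
print([len(group) for group in dashboards])
```

With ten KPIs this yields three groups, the first two holding four KPIs each and the last holding two.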
Assessment
Assuming that the request returns a JSON, what would the following code segment
print in the console?
$.ajax({
url: "data.php?Consult=0",
type: 'get',
success: function(RecoveredData) {
RecoveredData = JSON.parse(RecoveredData);
$.each(RecoveredData, function(i, o) {
console.log(parseInt(o.x));
console.log(parseInt(o.y));
});
}
});
User response:
The JSON coordinates that were sent as a response
Result:
Correct!
Question results
The following line of code queries the NoSQL database for the last
record entered. How should you modify the statement to bring all the elements in
ascending order?
User response:
$cursor = $collection->find()->sort(array('$natural' => 1))->asd(1);
Result:
You need to reinforce the topic: Web Services and Creation of real-time graphics
Question results
Ivan is programming an application to display a line graph, but the graph does NOT
display the data. If he is using the following object for the data, what should Ivan do with
the data to solve the problem?
Correct!
Question results
What output does a PHP web service produce with the following lines of code, assuming
that the variable "$cursor" contains this list of objects:
[{"x": "1", "y": "2"}, {"x": "2", "y": "5"}, {"x": "3", "y": "8"}]
You need to reinforce the topic: Web Services and Creation of real-time graphics
Question results
Correct!
Question results
Assuming that the "ajax" request in the following code segment returns a JSON, what
would it print in the console:
$.ajax({
url: "data.php?Consult=0",
type: 'get',
success: function(RecoveredData) {
$.each(RecoveredData, function(i, o) {
console.log(parseInt(o.x));
console.log(parseInt(o.y));
});
}
});
User response:
The JSON coordinates that were sent as a response
Result:
title: {
title: 'User account'
},
User response:
Changing title name to text
Result:
Correct!
Question results
Maria is installing a NoSQL database library in PHP and has already placed the file in the
extensions folder. What does she have to do to be able to use the library?
User response:
Check the installed extensions
Result:
The following line of code queries the NoSQL database for the last
record entered. How should you modify the statement to bring the first record?
$cursor = $collection->find()->sort(array('$natural' => -1))->limit(1);
User response:
$cursor = $collection->find()->sort(array('$natural' => 1))->limit(1);
Result:
Correct!
Question results
José has the following output from a web service and is using this instruction to read
the data. What will José obtain as a result?
<Sender>
<Name>Sender name</Name>
<Mail> Sender's email </Mail>
</Sender>
<Recipient>
<Name>Name of recipient</Name>
<Mail>Recipient's email</Mail>
</Recipient>
</Sender>
User response:
A graph without data because the information you are trying to graph does not
correspond to the type of graph
Result:
José has the following output from a web service and is using this instruction to read
the data. What will José obtain as a result?
<Sender>
<Name>Sender name</Name>
<Mail> Sender's email </Mail>
</Sender>
<Recipient>
<Name>Name of recipient</Name>
<Mail>Recipient's email</Mail>
</Recipient>
</Sender>
User response:
An error in the read because it is not a valid input for the instruction it uses
Result:
Correct!
Question results
How many dashboards do you need if you selected 10 KPIs?
User response:
3
Result:
Correct!
Question results
Yvette is programming a graph, but it does NOT display the x-axis title of the graph and
she found the error in this part of the program. What is this error due to?
X axis: {
title: {
enabled: true,
text: 'Height (cm)'
}
},
User response:
The object name is incorrect
Result:
Correct!
Question results
The following line of code queries the NoSQL database for the last
record entered. How should you modify the statement to bring the first record?
$cursor = $collection->find()->sort(array('$natural' => -1))->limit(1);
User response:
$cursor = $collection->find()->sort(array('$natural' => 1))->limit(1);
Result:
Correct!
Question results
What output does a PHP web service produce with the following lines of code, assuming
that the variable "$cursor" contains this list of objects:
[{"x": "1", "y": "2"}, {"x": "2", "y": "5"}, {"x": "3", "y": "8"}]
Correct!
Question results
What should you do if the KPIs you selected to create a dashboard exceed 4?
User response:
Distribute them across different dashboards
Result:
Correct!
Question results
Assuming that the "ajax" request in the following code segment returns a JSON, what
would it print in the console:
$.ajax({
url: "data.php?Consult=0",
type: 'get',
success: function(RecoveredData) {
$.each(RecoveredData, function(i, o) {
console.log(parseInt(o.x));
console.log(parseInt(o.y));
});
}
});
User response:
A graph with the data that included the JSON of the request
Result:
chart: {
type: 'graph-bar'
},
User response:
The object you are programming does not exist
Result:
Correct!
Question results
Karen is having trouble programming a php web service and needs to fetch the kitchen
collection from a NoSQL database called "Departments". What is the error in the code?
Correct!
Level 04
Lesson 01
Data science
Its objective is to gain a better understanding of Big Data using study techniques other
than conventional ones. Data science is a mix of statistics and mathematics, computer
science, and business administration.
Features
For a data scientist to obtain knowledge of Big Data, they must perform the following
tasks with the data:
Acquire them
Analyze them
Filter them
Extract them
Represent them
Refine them
Interact with them
The structure that serves as the basis for developing and organizing software with
various integrated tools is called a framework.
The most used in Big Data is Apache Hadoop, which allows the processing of large
data sets distributed across clusters with the help of simple programming models. It is
designed to scale horizontally, from a single server up to thousands of computers, where each one
offers local storage and processing.
Common
Distributed file system
YARN
MapReduce
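To give a feel for the MapReduce programming model named in the list above, here is a single-machine word count in plain Python. It is only a sketch of the map, shuffle and reduce idea and does not use Hadoop's own API; in a real cluster each chunk would be processed on a different node. The function names and the input text are assumptions.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


# Each string stands in for a block of a file stored on a different node.
chunks = ["big data big analytics", "data visualizer"]

mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
counts = reduce_phase(shuffle(mapped))
print(counts)
```

Because the map step treats every chunk independently, adding more nodes lets more chunks be processed at the same time, which is where the scalability of the model comes from.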
The requirements to start taking advantage of Big Data are the following:
The entire company must know the impact of the appropriate use of data on the
business and its daily work
It is not necessary to know technical aspects
The company director must be the first to understand the benefit that this
technology brings and communicate it to others
Hire a Big Data specialist
The specialist must define the needs and opportunity areas of the company and the appropriate
technology to satisfy those needs; this may be applications with Machine Learning,
other types of real-time analytics or just more robust business intelligence. It must also
be defined whether the processing will be in the cloud or in-house.
Project Manager
Cloud Computing
If you want to work internally, then you must hire more staff:
Another necessary person is the Data Scientist, who is an expert in Big Data analytics, or
at least a business analyst who relies on cloud tools.
Assessment
Which of the protocols is recommended to connect your Web service with the relational
database?
User response:
HTTP
Result:
Correct!
Question results
What type of database schema is most recommended for your integrator application?
User response:
Star
Result:
Correct!
Question results
In the relational database you created, the following are required fields for your model,
except:
User response:
Comment
Result:
Correct!
Question results
User response:
388594514598480
Result:
Correct!
Question results
In your application, which of the following methods is correct to display the frequency of
likes based on time?
User response:
Histogram
Result:
Correct!
Question results
In the application you created, what is the minimum number of nested loops you need
to get comments for each post on the social network?
User response:
2
Result:
Correct!
Question results
User response:
Likes per week
Result:
Correct!
Question results
To enable your application server web page, you must perform the following tasks
except:
User response:
Upload files via FTP
Result:
Correct!