
Azure Advanced Analytics Overview

The document discusses transforming business through data and intelligence. It outlines Azure data services that combine relevant data from any source and leverage advanced analytics to detect patterns, predict outcomes, and automate decisions. This transforms data into intelligent action in the cloud or on-premises by reducing costs, scaling infinitely, aggregating any data type, and connecting information wherever needed.

Uploaded by

murilove
Copyright
© All Rights Reserved

Agenda

Context

Solutions

Azure Data Services

Demo
Connected data
CLOUD

MOBILE
Data Cloud Intelligence
Decision
Transforming key aspects of business

Systems of Intelligence

• Transform your products
• Engage your customers
• Optimize your operations
• Empower your employees
$1.6T: Additional business value captured by companies that are leaders in using data assets to their advantage (Source: IDC, 2014)
10%: Percent of organizations expected to have a highly profitable business unit specifically for productizing and commercializing their data by 2020 (Source: Gartner, 2016)
Data Dividend
Incremental Gains Made by Leaders in Data and Analytics

• Product & Service Innovation: $235
• Customer Facing: $158
• Operations: $456
• Productivity: $674
Gains in $ Billions
IDC Data Dividend Study and Survey, N=2,020, April 2014
EXAMPLE SOLUTIONS
[Diagram: the Data Science Team and the Data Dividend, spanning Data Science, Data Engineering, Business Acumen, Application Development, and Data Management]
Data Science: Reasoning; Discovery; Prediction; Assessment
Data Engineering: Diversity; Ontology; Transformation; Pipelines; Scale-Up
Business Acumen: Business Needs; Domain Expertise; Reasoning; Translation; Innovation
Data Management: Stability; Durability; Elasticity; Security; Governance
Application Development: Relevant; Usable; Embedded; Mobilized
Typical advanced analytics lifecycle

Preparation and Modeling
Ingest → Transform → Explore → Model → Deploy
Operationalization
Score → Visualize → Measure


Data scientists should be creating and testing models
Data scientist average wage: 350K per year


But the reality is different …
Data scientist focus time
• Preparation: 80%
• Modeling: 15%
• Operationalization: 5%
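To make the focus-time split concrete, here is a small arithmetic sketch. It assumes the slide's percentages map to preparation (80%), modeling (15%), and operationalization (5%), and it treats the 350K yearly wage as a plain number because the slide names no currency. The function name wage_by_phase is illustrative only.

```python
def wage_by_phase(yearly_wage, focus_percent):
    """Split a yearly wage across phases in proportion to time spent.

    focus_percent maps a phase name to an integer percentage; the
    percentages are assumed to add up to 100.
    """
    if sum(focus_percent.values()) != 100:
        raise ValueError("focus percentages must add up to 100")
    return {phase: yearly_wage * pct / 100 for phase, pct in focus_percent.items()}


# Assumed mapping of the slide's 80% / 15% / 5% figures to phases.
focus_percent = {"preparation": 80, "modeling": 15, "operationalization": 5}
split = wage_by_phase(350_000, focus_percent)

for phase, cost in split.items():
    print(f"{phase}: {cost:,.0f}")
```

Under these assumptions, most of the wage pays for data preparation rather than for the modeling work the previous slide says data scientists should be doing.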


Talent Scarcity
• Academic Rigor
• Talent Competition
Low Productivity
• Integration Complexity
• Tool, Skill & Culture Gaps
Complex Infrastructure
• Data Volume, Diversity
• Security & Governance Constraints
• Rapid Platform Evolution
Slow Innovation
• Low Experimentation Rate
• Complex Operationalization
High Cost
• Legacy Products
• Irregular Workload
Broaden The Talent Pool
• Democratize Data Science
• Skill Re-Use
Increase Productivity
• Transparent Scaling
• Facilitate Collaboration
Modernize Infrastructure
• Decouple Data Science from Platforms
• Leverage Hybrid Cloud Architecture
Maximize Innovation
• Accelerate Experimentation
• Streamline Deployment
Drive Down TCO
• Embrace Open Source
• Evolutionary Path to Cloud
Earning our credibility
We needed to leverage data and analytics to grow our products.
Key Innovation…
Using vastly accelerated experimentation cycles: more experiments by more people!

So we…
• Built an Exabyte-scale data lake for everyone to put their data.
• Built tools approachable by any developer.
• Built machine learning tools for collaborating across large experiment models.
Transform data into intelligent action in the cloud

DATA: Data Sources (Apps; Sensors and devices)
INTELLIGENCE: Cortana Intelligence
ACTION: People; Apps; Automated Systems


On Prem or in the Cloud

DATA: Data Sources (Apps; Sensors and devices)
INTELLIGENCE: Cortana Intelligence + Microsoft R Server & SQL R Services
ACTION: People; Apps; Automated Systems


• Reduce costs by collecting, storing, and processing data in the cloud
• Scale infinitely and manage planned or unexpected events with elastic data stores
• Aggregate any type of data and connect information to wherever you need it
• Detect subtle patterns and insights by analyzing massive amounts of data from many sources
• Shape new business outcomes by predicting what may happen in the future
• Automate decision-making to accelerate business and aid competitive advantage
Cortana Intelligence Suite
Data
• Data Sources: Apps; Sensors and devices
Intelligence
• Information Management: Data Factory; Data Catalog; Event Hubs
• Big Data Stores: Data Lake Store; SQL Data Warehouse
• Machine Learning and Analytics: Machine Learning; Data Lake Analytics; HDInsight (Hadoop and Spark); Stream Analytics
• Intelligence: Cognitive Services; Bot Framework; Cortana
• Dashboards & Visualizations: Power BI
• Also shown: DocumentDb; Blob Storage
Action
• People
• Apps: Web; Mobile; Bots
• Automated Systems


It combines relevant data from anywhere
• Reduce costs by collecting, storing, and processing data in the cloud
• Scale infinitely and manage planned or unexpected events with elastic data stores
• Aggregate any type of data and connect information to wherever you need it
And leverages advanced analytics
• Detect subtle patterns and insights by analyzing massive amounts of data from many sources
• Shape new business outcomes by predicting what may happen in the future
• Automate decision-making to accelerate business and aid competitive advantage
Transform data into intelligent action
[Diagram: the Cortana Intelligence Suite pipeline from Data (Data Sources: Apps; Sensors and devices) through Intelligence (Information Management, Big Data Stores, Machine Learning and Analytics, Intelligence, Dashboards & Visualizations) to Action (People, Apps, Automated Systems)]


Information Management
[Diagram: Data Sources (Apps; Sensors and devices) → Information Management (Data Factory; Data Catalog; Event Hubs)]
Compose and orchestrate data services at scale
Data Factory
• Create, schedule, orchestrate, and manage data pipelines
• Visualize data lineage
• Connect to on-premises and cloud data sources
• Monitor data pipeline health
• Automate cloud resource management
• Move relational data for Hadoop processing
• Transform with Hive, Pig, or custom code
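As a rough illustration of what "orchestrating a data pipeline" means, the sketch below runs a set of activities in dependency order. It is a generic Python example (standard library only, Python 3.9+ for graphlib), not the Data Factory API; the activity names and the run_pipeline helper are hypothetical.

```python
from graphlib import TopologicalSorter


def run_pipeline(activities, depends_on):
    """Run each activity only after the activities it depends on.

    activities: maps an activity name to a zero-argument callable.
    depends_on: maps an activity name to the set of names it waits for.
    Returns the order in which activities ran.
    """
    order = list(TopologicalSorter(depends_on).static_order())
    for name in order:
        # A real orchestrator would also schedule, retry, and monitor here.
        activities[name]()
    return order


# Hypothetical three-step pipeline: ingest, then transform, then publish.
executed = []
activities = {
    "ingest": lambda: executed.append("ingest"),
    "transform": lambda: executed.append("transform"),
    "publish": lambda: executed.append("publish"),
}
depends_on = {"transform": {"ingest"}, "publish": {"transform"}}

print(run_pipeline(activities, depends_on))
```

Declaring dependencies instead of hard-coding the call order is what lets a scheduler visualize lineage and rerun only the affected steps.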
Get more value from your enterprise data assets
Data Catalog
• Spend less time looking for data, and more time getting value from it
• Register enterprise data sources, discover data assets and unlock their potential, and capture tribal knowledge to make data understandable
• Bridge the gap between IT and the business, allowing everyone to contribute their insights, tags, and descriptions
• Intuitive search and filtering to understand the data sources and their purpose
• Let your data live where you want; connect using tools you choose
• Integrate into existing tools and processes with open REST APIs
Ingest events from websites, apps and devices at cloud scale
Event Hubs
[Diagram: Event Hubs alongside Apps, Sensors and devices, API Management, Backend Services, Stream Analytics, SQL Database, Machine Learning, Azure Storage, HDInsight, and Power BI]
• Log millions of events per second in near real time
• Connect devices using flexible authorization and throttling
• Use time-based event buffering
• Get a managed service with elastic scale
• Reach a broad set of platforms using native client libraries
• Pluggable adapters for other cloud services
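The idea of time-based event buffering can be sketched in a few lines. This is a generic illustration, not how Event Hubs implements it: the TimeBuffer class is hypothetical, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
class TimeBuffer:
    """Collect events and release them as one batch once a time window has elapsed."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = []
        self._window_start = None

    def add(self, event, now):
        """Add an event at time `now`; return the previous batch if its window expired."""
        flushed = None
        if self._window_start is not None and now - self._window_start >= self.window_seconds:
            flushed = self.flush()
        if self._window_start is None:
            self._window_start = now
        self._events.append(event)
        return flushed

    def flush(self):
        """Return all buffered events and start over with an empty buffer."""
        batch, self._events = self._events, []
        self._window_start = None
        return batch


buffer = TimeBuffer(window_seconds=5)
first = buffer.add("a", now=0)    # window opens at t=0
second = buffer.add("b", now=3)   # still inside the 5-second window
third = buffer.add("c", now=6)    # window expired, so the earlier batch is released
remaining = buffer.flush()
print(third, remaining)
```

Batching by time bounds how long an event can wait before downstream consumers see it, regardless of how many events arrive.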
Big Data Stores
[Diagram: Data Sources (Apps; Sensors and devices) → Information Management (Data Factory; Data Catalog; Event Hubs) → Big Data Stores (Data Lake Store; SQL Data Warehouse)]
A hyper-scale repository for big data analytics workloads

Data Lake Store
[Diagram: data from Devices, Social, LOB Applications, Video, Web, Sensors, Relational, and Clickstream sources lands in ADL Store and is used by ADL Analytics, HDInsight, R, Spark, and Machine Learning]
• A Hadoop Distributed File System for the cloud
• No fixed limits on file size
• No fixed limits on account size
• Unstructured and structured data in their native format
• Massive throughput to increase analytic performance
• High durability, availability, and reliability
• Azure Active Directory access control
Elastic data warehouse as a service with enterprise-class features
SQL Data Warehouse
[Diagram: SQL Data Warehouse alongside Power BI, Hadoop, SQL Database, App Service, Intelligent App, and Machine Learning]
• Petabyte scale with massively parallel processing
• Independent scaling of compute and storage, in seconds
• Transact-SQL queries across relational and non-relational data
• Full enterprise-class SQL Server experience
• Works seamlessly with Power BI, Machine Learning, HDInsight, and Data Factory
Machine Learning and Analytics
[Diagram: Data Sources (Apps; Sensors and devices) → Information Management (Data Factory; Data Catalog; Event Hubs) → Big Data Stores (Data Lake Store; SQL Data Warehouse) → Machine Learning and Analytics (Machine Learning; Data Lake Analytics; HDInsight (Hadoop and Spark); Stream Analytics)]
Easily build, deploy, and share predictive analytics solutions
Machine Learning
• Simple, scalable, cutting edge. A fully managed cloud service that enables you to easily build, deploy, and share predictive analytics solutions.
• Deploy in minutes. Azure Machine Learning means business. You can deploy your model into production as a web service that can be called from any device, anywhere and that can use any data source.
• Publish, share, monetize. Share your solution with the world in the Gallery or on the Azure Marketplace.
Big data analytics made easy
Data Lake Analytics
[Diagram: Data Lake Analytics over SQL DW, SQL DB, Data Lake Store, Storage Blobs, and SQL DB in a VM]
• Analyze data of any kind and size
• Develop faster, debug and optimize smarter
• Interactively explore patterns in your data
• No learning curve: use U-SQL, Spark, Hive, HBase and Storm
• Managed and supported with an enterprise-grade SLA
• Dynamically scales to match your business priorities
• Enterprise-grade security with Azure Active Directory
• Built on YARN, designed for the cloud
Comprehensive set of managed Apache big data projects
HDInsight (Hadoop and Spark)
Core Engine
• Batch: Map Reduce
• Script: Pig
• SQL: Hive
• NoSQL: HBase
• Streaming: Storm
• In-Memory: Spark

• Scale to petabytes on demand
• Process unstructured and semi-structured data
• Develop in Java, .NET, and more
• Skip buying and maintaining hardware
• Deploy in Windows or Linux
• Spin up an Apache Hadoop cluster in minutes
• Visualize your Hadoop data in Excel
• Easily integrate on-premises Hadoop clusters
Real-time stream processing in the cloud
Stream Analytics
[Diagram: Stream Analytics with Event Hubs, Blob Storage, SQL Database, Table Storage, and Power BI]
• Perform real-time analytics for your Internet of Things solutions
• Stream millions of events per second
• Get mission-critical reliability and performance with predictable results
• Create real-time dashboards and alerts over data from devices and applications
• Correlate across multiple streams of data
• Use familiar SQL-based language for rapid development
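A common building block in stream processing is the tumbling window: events are grouped into fixed, non-overlapping time buckets and aggregated per bucket. The sketch below shows the idea in plain Python. It is not Stream Analytics query syntax, and the function name and sample events are made up for illustration.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window start, key) over fixed, non-overlapping windows.

    events: iterable of (timestamp_in_seconds, key) pairs.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Every timestamp falls into exactly one window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical sensor readings as (timestamp, device id).
events = [(1, "sensor-a"), (4, "sensor-b"), (9, "sensor-a"), (12, "sensor-a")]
window_counts = tumbling_window_counts(events, window_seconds=10)
print(window_counts)
```

In a SQL-based streaming language, the same grouping would typically be expressed as a GROUP BY over the key and a time window.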
Intelligence
[Diagram: Data Sources (Apps; Sensors and devices) → Information Management → Big Data Stores → Machine Learning and Analytics → Intelligence (Cognitive Services; Bot Framework; Cortana)]
Build applications that understand people
Cognitive Services
• Faces, images, emotion recognition and video intelligence
• Spoken language processing, speaker recognition, custom speech recognition
• Natural language processing, sentiment and topics analysis, spelling errors
• Complex tasks processing, knowledge exploration, intelligent recommendations
• Bing engine capabilities for Web, Autosuggest, Image, Video and News
Your bots – wherever your users converse
Bot Framework
• Bot Connector Service: A service to register your bot, configure channels and publish to the Bot Directory. Connect your bot(s) seamlessly to text/sms, Office 365 mail, Skype, Slack, Twitter, and more.
• Bot Builder SDK: An open source SDK hosted on GitHub. Everything you need to build great dialogs within your Node.js or C# bot
• Bot Directory: A public directory of bots registered through the Bot Connector Service. Discover, try, and add bots to conversation experiences
Get things done in more helpful, proactive and natural ways
Cortana
Here are some of the things I can help you with…
Answers
• Cortana for Consumers (today): Public reference data answers – “How far is it from Los Angeles to San Francisco?”
• With the Cortana Intelligence Suite: Answers from organizational data in Power BI – “What were our biggest deals that closed last month?”
Predictions
• Cortana for Consumers (today): Event predictions – “Who do you think is going to win the Germany Italy game?”
• With the Cortana Intelligence Suite: Integration with prediction solutions – “Which of our customers are most likely to churn in the next quarter?”
Monitoring & Alerts
• Cortana for Consumers (today): Flight status, traffic conditions, changes in weather, …
• With the Cortana Intelligence Suite: Monitoring KPIs and preemptive alerting – “Alert me if this customer ever has a 90% chance of churn in the next 30 days”
Task Completion
• Cortana for Consumers (today): Setting reminders, scheduling meetings, getting directions, …
• With the Cortana Intelligence Suite: Line of business process integration – Assistance with expense report submission on-time within policy
Dashboards & Visualizations
[Diagram: Data Sources (Apps; Sensors and devices) → Information Management → Big Data Stores → Machine Learning and Analytics → Intelligence → Dashboards & Visualizations (Power BI)]
Keep a pulse on your business with live, interactive dashboards
Power BI
[Diagram: Power BI drawing on Stream Analytics, Event Hubs, Machine Learning, Storage, SQL database, and HDInsight]
• Analytics for everyone, even non-data experts
• Your whole business on one dashboard
• Create stunning, interactive reports
• Drive consistent analysis across your organization
• Embed visuals in your applications
• Get real-time alerts when things change
Transform data into intelligent action
[Diagram: the Cortana Intelligence Suite pipeline from Data (Data Sources: Apps; Sensors and devices) through Intelligence (Information Management, Big Data Stores, Machine Learning and Analytics, Intelligence, Dashboards & Visualizations) to Action (People, Apps, Automated Systems)]


• Analytics: Easy & Actionable
• Big Data: Secure & Scalable
• Intelligence: Cognitive & Contextual
From Data To Action On Premises

DATA: Data Sources (Apps; Sensors and devices)
INTELLIGENCE: Cortana Intelligence + Microsoft R Server & SQL R Services
ACTION: People; Apps; Automated Systems


What is R?
Language Platform
• A statistics programming language
• A data visualization tool
• Open source
Community
• 2.5+M users
• Taught in most universities
• New and recent grads use it
• Thriving user groups worldwide
Ecosystem
• 7000+ free algorithms in CRAN
• Scalable to big data
• Rich application & platform integration
Tool Use for Data Science: O’Reilly Data Science Survey 2014 (max=80%)
Language Popularity: IEEE Spectrum Top Programming Languages, July 2015 (#9: R)


Challenges posed by open source R

• Limited Data Scale
• Inadequate Modeling Performance
• Lack of Commercial Support
• Complex Deployment Processes
R from Microsoft brings
• Peace of mind
• Efficiency
• Speed and scalability
• Flexibility and agility
April 6, 2015

“This acquisition will help customers use advanced analytics within Microsoft data platforms.”
Community: R Open
Commercial: SQL Server R Services; R Server (Windows, Red Hat, SUSE, Hadoop, Teradata)
CRAN, MRO, MRS Comparison
Datasize
• CRAN: In-memory
• Microsoft R Open: In-memory
• Microsoft R Server: In-memory or disk based
Speed of Analysis
• CRAN: Single threaded
• Microsoft R Open: Multi-threaded
• Microsoft R Server: Multi-threaded, parallel processing 1:N servers
Support
• CRAN: Community
• Microsoft R Open: Community
• Microsoft R Server: Community + Commercial
Analytic Breadth & Depth
• CRAN: 7500+ innovative analytic packages
• Microsoft R Open: 7500+ innovative analytic packages
• Microsoft R Server: 7500+ innovative packages + commercial parallel high-speed functions
Licence
• CRAN: Open Source
• Microsoft R Open: Open Source
• Microsoft R Server: Commercial license. Supported release with indemnity
• Escapes R’s traditional memory limits
• Scales predictive modeling using parallelization
• Distributes computation across cores & nodes
• Minimizes data movement using in-database, in-MapReduce and in-Apache Spark execution
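One general way to escape in-memory limits is to process data in chunks and combine small per-chunk summaries, which also makes the work easy to spread over cores or nodes. The sketch below shows that pattern in Python for a simple mean. It illustrates the idea only and is not the R Server or ScaleR implementation; all function names are hypothetical.

```python
def chunked(values, size):
    """Yield successive lists of at most `size` items without loading everything."""
    chunk = []
    for value in values:
        chunk.append(value)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def partial_stats(chunk):
    """Summarize one chunk; each chunk could be handled by a separate worker."""
    return len(chunk), sum(chunk)


def combine_mean(stats):
    """Merge per-chunk (count, total) pairs into an overall mean."""
    count = sum(n for n, _ in stats)
    total = sum(s for _, s in stats)
    if count == 0:
        raise ValueError("no data")
    return total / count


def chunked_mean(values, chunk_size):
    return combine_mean([partial_stats(c) for c in chunked(values, chunk_size)])


print(chunked_mean(range(1, 101), chunk_size=8))
```

Only one chunk and a handful of counters are held in memory at a time, so the data set can be far larger than RAM.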
In-Database Example: From 5+ hours to 40 seconds

[Chart: minutes versus rows, comparing R on a server pulling data via SQL with R on a server invoking RRE ScaleR inside the EDW]
Linux, Windows, Hadoop & Teradata

High-performance, Scalable R
R Server Technology
Simplicity and agility
• Enterprise speed and scale
• Near-DB analytics
• Parallel threading and processing
• Reuse SQL skills for data engineering
Scalability and choice
• In-database deployment
• Memory and disk scalability
• No R memory limits
• Write once, deploy anywhere
Cost effectiveness
• Included in SQL Server 2016
• Reuse and optimize existing R code
• Eliminate data movement
Cloud

Hadoop & Spark


R Server portfolio
R Server Technology
EDW

RDBMS

Desktops & Servers

Write Once – Deploy Anywhere


Hybrid Architecture

Cortana
Analytics Suite

SQL Server 2016


In the Cloud: Prepare, Model, Operationalize
On-Premises (SQL 2016): Prepare, Model, Operationalize
• StretchDB Example: Cost Reduce Growth of EDW
• Polybase Example: Federate CRM Data with Cloud-Borne Demographics
• On-Prem Scoring Example: Purchase Propensity Prediction
• On-Prem Scoring Example: Near-Real-Time Fraud or Anomaly Detection
Convergence with Flexibility

Scalable Algorithms

Templates & Samples

R: Write Once Deploy Anywhere


Cortana Intelligence Microsoft R Server Family
R & Python to AML Interop.
In 3 Years, we will help you achieve:
Discover. Learn. Share.

Azure Machine Learning Gallery


IoT + Analytics Scenarios
Platform Services

Security & Hybrid


Management Cloud Service
Operations
Services Fabric Web Apps API Apps
SQL Data DocumentDB
Portal Azure AD
Database Warehouse
Health Monitoring
Batch
Azure Active RemoteApp AD Privileged
Directory Mobile Logic Apps Identity
Redis Azure Storage
Apps Cache Management
Search Tables
Azure AD
B2C Domain Services

Multi-Factor API Notification


Authentication Management Hubs
Storage BizTalk Backup
Queues Services
Automation
HDInsight Machine Stream Data Operational
Hybrid Service Bus Learning Analytics Lake Analytics
Scheduler Connections

Azure Import/Export
Visual Studio SDK Data Event Data
Key Vault Catalog
Factory Hubs

Store/ Azure Site


Marketplace Media Content VS Online App IoT Hub Mobile Recovery
Services Delivery Insights Engagement
Network (CDN)
StorSimple
VM Image Gallery
& VM Depot

Infrastructure Services
Use the right PaaS store for the job
Relational store
• Because: Transactions, joins, structured data, familiar SQL query
• But not for: Quickly changing data schemas
• Use: SQL Database
NoSQL key-value pair store
• Because: Low-cost, fast, massive scale
• But not for: Rich query
• Use: Tables
NoSQL JSON document store
• Because: Flexible schema, familiar SQL query, low latency
• But not for: Complex joins
• Use: DocumentDB
NoSQL wide-column store
• Because: Open-source, integration with Hadoop analytics
• But not for: Operational simplicity
• Use: HBase on HDInsight
Cache
• Because: Increasing speed of an app
• But not for: Primary data store
• Use: Redis Cache
Search service
• Because: Integrating search into an app
• But not for: Primary data store
• Use: Azure Search
OLAP
• Because: MPP Processing
• But not for: Transactions
• Use: Azure SQL DW
IoT, Big Data
• Because: Petabyte scale storage and processing
• But not for: Transactions
• Use: Azure Data Lake
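The store-selection guidance on this slide can be captured as a small lookup table, for example to drive a checklist or an internal tool. The sketch below is one hypothetical encoding in Python: the dictionary keys, the pick_store helper, and the reading of each row are assumptions based on the flattened slide text.

```python
# Each need maps to the suggested store and the case the slide warns against.
STORE_GUIDE = {
    "relational store": ("SQL Database", "Quickly changing data schemas"),
    "nosql key-value pair store": ("Tables", "Rich query"),
    "nosql json document store": ("DocumentDB", "Complex joins"),
    "nosql wide-column store": ("HBase on HDInsight", "Operational simplicity"),
    "cache": ("Redis Cache", "Primary data store"),
    "search service": ("Azure Search", "Primary data store"),
    "olap": ("Azure SQL DW", "Transactions"),
    "iot, big data": ("Azure Data Lake", "Transactions"),
}


def pick_store(need):
    """Return (store, not_for) for a need, matching case-insensitively."""
    try:
        return STORE_GUIDE[need.strip().lower()]
    except KeyError:
        raise KeyError(f"no guidance recorded for: {need!r}") from None


store, not_for = pick_store("Cache")
print(f"Use {store}, but not for: {not_for}")
```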
E-Commerce Sample PaaS Architecture
Azure Websites (part of App Service) + Azure Redis Cache
• autoscale enabled
• Authentication: SQL, Facebook, Twitter, Microsoft, User/Pass
Media Services + Storage
• Uploaded Community Videos
• Secure upload and streaming
• Static and Dynamic Packaging to all formats
SQL Database
• Scale via Elastic Database
• Product Orders
DocumentDB
• Scale via additional Collections
• Product Catalog
• Community Posts
Search
• Scale via Search Units
• Product Catalog
• Community Posts
Client User
Azure IoT Suite Sample Architecture
Azure IoT Suite Remote Monitoring
People

Web/Mobile App
Power BI

Web
Storage blobs DocumentDB

Mobile

Apps
IoT Hub Stream Analytics Event Hub Web Jobs Logic Apps
Bots

Sensors
and
devices

Azure Automated
Active Directory Systems
Cortana Intelligence Sample Architecture
ADL Analytics
Devices Social

HDInsight
LOB ADL Store
Applications Video
R
Power BI
Web Sensors
Spark

Relational Clickstream Machine Learning

Azure SQL Database
Fully managed relational database service
• Built for SaaS and Enterprise applications
• Predictable performance & pricing
• Elastic database pool for unpredictable SaaS workloads
• 99.99% availability built-in
• Geo-replication and restore services for data protection
• Secure and compliant for your sensitive data
• Fully compatible with SQL Server 2014 databases
Tables
For:
• Web app user data
• Address books
• Device information
• Other metadata
DocumentDB
For:
• Catalog data
• Preferences and state
• Event store
• User generated content
• Data exchange
• IoT device registry
HBase on HDInsight
For:
• Interactive websites
• Sensor data
• Message systems
• Real-time query (using Phoenix)
• Writing transactional data to Azure Blobs
Redis Cache
For:
• All apps
Azure Search
[Illustration: typing “high” in a search box suggests “high heels”, “high tops”, “high arch”]
• People use search as a natural, low friction way to interact with apps
• Web search engines have set the bar high for search
  • Instant results, auto-complete, hit highlighting, great ranking, linguistics
• Search is hard and rarely a core expertise area
  • From infrastructure standpoint: availability, durability, scale, operations
  • From the functionality standpoint: ranking, geo-spatial, input handling
Typical scenarios
• Ecommerce and Online Retail
• User Generated Content
• Line of Business Applications
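The auto-complete behavior illustrated on this slide (typing "high" and seeing "high heels", "high tops", "high arch") can be approximated with a sorted list and a prefix lookup. This is a minimal Python sketch of the idea, not the Azure Search API; the suggest function and the sample catalog are made up for illustration.

```python
from bisect import bisect_left


def suggest(terms, prefix, limit=5):
    """Return up to `limit` terms that start with `prefix`, in alphabetical order."""
    ordered = sorted(term.lower() for term in terms)
    prefix = prefix.lower()
    # Binary search finds the first term that is >= the prefix.
    start = bisect_left(ordered, prefix)
    matches = []
    for term in ordered[start:]:
        if not term.startswith(prefix):
            break  # sorted order means no later term can match
        matches.append(term)
        if len(matches) == limit:
            break
    return matches


catalog = ["high heels", "high tops", "high arch", "hiking boots", "sandals"]
suggestions = suggest(catalog, "high")
print(suggestions)
```

A production search service adds what the slide lists as the hard parts, such as ranking, linguistics, and scale, on top of this basic lookup.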
