Azure Advanced Analytics Overview
Context
Solutions
Demo
Connected data
CLOUD
MOBILE
Data Cloud Intelligence
Decision
Transforming key aspects of business
Systems of Intelligence
Data
Data
Dividend
Business
Acumen
Science Team
Application
Development
Data
Management
Data Data Business Data Application
Science Engineering Acumen Management Development
Preparation Modeling
ƒ(x)
Operationalization
Preparation Modeling
5%
ƒ(x)
Operationalization
High Cost
• Legacy Products
• Irregular Workload
Broaden The Talent Pool
• Democratize Data Science
• Skill Re-Use
So we…
Built an exabyte-scale data lake where everyone can put their data.
Built tools approachable by any developer.
Built machine learning tools for collaborating across large experiment models.
Transform data into intelligent action in the cloud
Data
Sources People
Apps Apps
Cortana Intelligence
Sensors Automated
and Systems
devices
Data
Sources People
Cortana Intelligence
Apps
+ Apps
Sensors Automated
and Systems
devices Microsoft R Server & SQL R Services
future
Automate decision-making to accelerate business and aid competitive advantage
Cortana Intelligence Suite
DocumentDB
Blob Storage
Apps HDInsight
Event Hubs (Hadoop and Cortana Mobile
Spark) Apps
Dashboards &
Visualizations
Sensors Automated
and Power BI Systems
devices
Mobile
HDInsight
Event Hubs (Hadoop and Cortana Apps
Spark)
Bots
Sensors
Stream Dashboards &
and devices
Analytics Visualizations Automated
Systems
Power BI
Data
Data Factory
Apps
Data Catalog
Event Hubs
Sensors
and devices
Data
Compose and orchestrate data services at scale
Information
Management SQL
SQL
DATA SOURCES
<>
Data Catalog
Event Hubs
SQL {}
• Create, schedule, orchestrate, and manage data pipelines
• Visualize data lineage
• Connect to on-premises and cloud data sources
• Monitor data pipeline health
• Automate cloud resource management
• Move relational data for Hadoop processing
• Transform with Hive, Pig, or custom code
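To make "create, schedule, orchestrate" concrete, here is a minimal sketch of dependency-ordered pipeline execution. It is plain Python, not the Data Factory API: the activity names, the `run_pipeline` helper, and the `depends_on` mapping are all hypothetical and exist only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(activities, depends_on):
    """Run each activity after the activities it depends on.

    activities: dict mapping activity name -> zero-argument callable
    depends_on: dict mapping activity name -> set of upstream activity names
    Returns the order in which the activities were executed.
    """
    ordered = list(TopologicalSorter(depends_on).static_order())
    for name in ordered:
        activities[name]()
    return ordered


# Hypothetical three-step pipeline: copy, transform, publish.
staged = []
activities = {
    "copy_relational_data": lambda: staged.append("rows copied"),
    "hive_transform": lambda: staged.append("rows transformed"),
    "publish_output": lambda: staged.append("output published"),
}
depends_on = {
    "hive_transform": {"copy_relational_data"},
    "publish_output": {"hive_transform"},
}

run_order = run_pipeline(activities, depends_on)
print(run_order)
```

The point of the sketch is only the ordering rule: an activity never starts before its upstream activities have finished. A managed orchestration service adds scheduling, monitoring, and retries on top of that idea.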
Get more value from your enterprise data assets
Information
Management
Data Factory
Data Catalog
Event Hubs
• Spend less time looking for data, and more time getting value from it
• Register enterprise data sources, discover data assets and unlock their potential, and capture tribal knowledge to make data understandable
• Bridge the gap between IT and the business, allowing everyone to contribute their insights, tags, and descriptions
• Intuitive search and filtering to understand the data sources and their purpose
• Let your data live where you want; connect using tools you choose
• Integrate into existing tools and processes with open REST APIs
Ingest events from websites, apps and devices at cloud scale
Information Data
Management sources
Data Factory
Apps Azure
API
Data Catalog Management Storage HDInsight
• Log millions of events per second in near real time
• Connect devices using flexible authorization and throttling
• Use time-based event buffering
• Get a managed service with elastic scale
• Reach a broad set of platforms using native client libraries
• Pluggable adapters for other cloud services
Big Data Stores
Information Big Data Stores
Data Management
Sources
Event Hubs
Sensors
and devices
Data
A hyper-scale repository for big data analytics workloads
• A Hadoop Distributed File System for the cloud
• No fixed limits on file size
• No fixed limits on account size
• Unstructured and structured data in their native format
• Massive throughput to increase analytic performance
• High durability, availability, and reliability
• Azure Active Directory access control
Elastic data warehouse as a service with enterprise-class features
Big Data Stores
Power BI Hadoop
Data Lake Store
SQL Data
Warehouse
SQL Database
App Service SQL Data Warehouse
Intelligent App
Machine Learning
• Petabyte scale with massively parallel processing
• Independent scaling of compute and storage, in seconds
• Transact-SQL queries across relational and non-relational data
• Full enterprise-class SQL Server experience
• Works seamlessly with Power BI, Machine Learning, HDInsight, and Data Factory
Machine Learning and Analytics
Information Big Data Stores Machine Learning
Data Management and Analytics
Sources
HDInsight
(Hadoop and
Event Hubs Spark)
Sensors
and devices Stream
Analytics
Data Intelligence
Easily build, deploy, and share predictive analytics solutions
Machine Learning
and Analytics
Machine
Learning
Data Lake
Analytics
HDInsight
(Hadoop and
Spark)
Stream
Analytics
• Simple, scalable, cutting edge. A fully managed cloud service that enables you to easily build, deploy, and share predictive analytics solutions.
• Deploy in minutes. Azure Machine Learning means business. You can deploy your model into production as a web service that can be called from any device, anywhere, and that can use any data source.
• Publish, share, monetize. Share your solution with the world in the Gallery or on the Azure Marketplace.
Big data analytics made easy
Machine Learning
and Analytics
Machine
Learning Data Lake Analytics
Data Lake
Analytics
HDInsight
(Hadoop and
Spark)
• Analyze data of any kind and size
• Develop faster, debug and optimize smarter
• Interactively explore patterns in your data
• No learning curve: use U-SQL, Spark, Hive, HBase and Storm
• Managed and supported with an enterprise-grade SLA
• Dynamically scales to match your business priorities
• Enterprise-grade security with Azure Active Directory
• Built on YARN, designed for the cloud
Comprehensive set of managed Apache big data projects
Machine Learning
and Analytics
Machine
Learning
Batch Script SQL NoSQL Streaming In-Memory
Map Reduce Pig Hive HBase Storm Spark
Data Lake
Analytics
HDInsight
(Hadoop and
Spark)
Core Engine
Stream
Analytics
Machine
Learning Event Hubs Blob Storage
Data Lake
Event Hubs
Analytics
Stream
Analytics Table Storage
HDInsight
(Hadoop and
Spark) Power BI
Blob Storage
Stream
Analytics
• Perform real-time analytics for your Internet of Things solutions
• Stream millions of events per second
• Get mission-critical reliability and performance with predictable results
• Create real-time dashboards and alerts over data from devices and applications
• Correlate across multiple streams of data
• Use familiar SQL-based language for rapid development
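As a rough illustration of what a windowed streaming query computes, the sketch below averages sensor readings per device over fixed, non-overlapping (tumbling) windows and then flags windows above a threshold. It is plain Python, not the Stream Analytics query language; the device names, the readings, and the 35.0 alert threshold are made-up values.

```python
from collections import defaultdict


def tumbling_average(readings, window_seconds):
    """Average value per (window_start, device) over tumbling windows.

    readings: iterable of (timestamp_seconds, device, value) tuples
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [running sum, count]
    for timestamp, device, value in readings:
        # Every timestamp falls into exactly one window.
        window_start = (timestamp // window_seconds) * window_seconds
        entry = totals[(window_start, device)]
        entry[0] += value
        entry[1] += 1
    return {key: total / count for key, (total, count) in sorted(totals.items())}


# Hypothetical readings: (timestamp in seconds, device, temperature).
readings = [
    (1, "sensor-a", 20.0),
    (3, "sensor-a", 22.0),
    (4, "sensor-b", 30.0),
    (12, "sensor-a", 40.0),
]

averages = tumbling_average(readings, window_seconds=10)
alerts = [key for key, average in averages.items() if average > 35.0]
print(averages)
print(alerts)
```

In a SQL-based streaming language the same logic would typically be a grouped aggregate over a window; the loop here just spells out the grouping step by hand.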
Intelligence
Information Big Data Stores Machine Learning Intelligence
Data Management and Analytics
Sources
HDInsight
(Hadoop and
Event Hubs Spark)
Sensors Stream
and devices Analytics
Data
Build applications that understand people
Intelligence
Cognitive
Services
Bot
Framework
Cortana
• Faces, images, emotion recognition and video intelligence
• Spoken language processing, speaker recognition, custom speech recognition
• Natural language processing, sentiment and topics analysis, spelling errors
• Complex tasks processing, knowledge exploration, intelligent recommendations
• Bing engine capabilities for Web, Autosuggest, Image, Video and News
Your bots – wherever your users converse
Intelligence
Cognitive
Services
Bot
Framework
Cortana
• Bot Connector Service: A service to register your bot, configure channels and publish to the Bot Directory. Connect your bot(s) seamlessly to text/SMS, Office 365 mail, Skype, Slack, Twitter, and more.
• Bot Builder SDK: An open source SDK hosted on GitHub. Everything you need to build great dialogs within your Node.js or C# bot.
• Bot Directory: A public directory of bots registered through the Bot Connector Service. Discover, try, and add bots to conversation experiences.
Get things done in more helpful, proactive and natural ways
Intelligence
Cognitive
Services
Bot
Framework
Cortana
Here are some of the things I can help you with…
Answers
• Cortana for Consumers (today): Public reference data answers: “How far is it from Los Angeles to San Francisco?”
• With the Cortana Intelligence Suite: Answers from organizational data in Power BI: “What were our biggest deals that closed last month?”
Predictions
• Cortana for Consumers (today): Event predictions: “Who do you think is going to win the Germany Italy game?”
• With the Cortana Intelligence Suite: Integration with prediction solutions: “Which of our customers are most likely to churn in the next quarter?”
HDInsight
Event Hubs (Hadoop and Cortana
Spark)
Sensors
Stream Dashboards &
and devices
Analytics Visualizations
Power BI
Data
Data Intelligence
Keep a pulse on your business with live, interactive dashboards
Stream Analytics
Event Hubs
Power BI
Machine Learning
Storage
Power BI
Dashboards &
Visualizations
SQL database
Power BI HDInsight
Power BI
• Analytics for everyone, even non-data experts
• Your whole business on one dashboard
• Create stunning, interactive reports
• Drive consistent analysis across your organization
• Embed visuals in your applications
• Get real-time alerts when things change
Transform data into intelligent action
Information Big Data Stores Machine Learning Intelligence
Management and Analytics
Data
People
Sources
Machine Cognitive
Data Factory Data Lake Store
Learning Services
Mobile
HDInsight
Event Hubs (Hadoop and Cortana Apps
Spark)
Bots
Sensors
Stream Dashboards &
and devices
Analytics Visualizations Automated
Systems
Power BI
Data
Cognitive
Intelligence&
Contextual
From Data To Action On Premises
Data
Sources People
Cortana Intelligence
Apps
+ Apps
Sensors Automated
and Systems
devices Microsoft R Server & SQL R Services
What is R?
Community
• 2.5+M users
• Taught in most universities
• New and recent grads use it
• Thriving user groups worldwide
#9: R
“This acquisition will help customers use advanced analytics within Microsoft data platforms.”
Community Commercial
SQL Server
R Open R Services R Server
Windows Red Hat SUSE
Hadoop Teradata
CRAN, MRO, MRS Comparison
         | CRAN      | Microsoft R Open | Microsoft R Server
Datasize | In-memory | In-memory        | In-memory or disk based
Support  | Community | Community        | Community + Commercial
R on a R on a server
server Invoking RRE
pulling data ScaleR Inside
Minutes
Rows
Linux, Windows, Hadoop & Teradata
High-performance, Scalable R
R Server Technology
Simplicity Scalability Cost
and agility and choice effectiveness
RDBMS
Cortana
Analytics Suite
Prepare Model
Operationalize
On-Premises
SQL
2016 SQL SQL Prepare Model
Operationalize
2016 2016
Prepare Model Operationalize
Scalable Algorithms
Azure Import/Export
Visual Studio SDK Data Event Data
Key Vault Catalog
Factory Hubs
Infrastructure Services
Use the right PaaS store for the job
When you need… | Because… | But not for… | Use…
Relational store | Transactions, joins, structured data, familiar SQL query | Quickly changing data schemas | SQL Database
NoSQL key-value pair store | Low-cost, fast, massive scale | Rich query | Tables
NoSQL JSON document store | Flexible schema, familiar SQL query, low latency | Complex joins | DocumentDB
NoSQL wide-column store | Open-source, integration with Hadoop analytics | Operational simplicity | HBase on HDInsight
Cache | Increasing speed of an app | Primary data store | Redis Cache
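The store-selection guidance above can be written down as a small lookup, shown here as a sketch. The dictionary keys (`"relational"`, `"key-value"`, and so on) and the `recommend_store` helper are hypothetical names chosen for this example; the recommendations themselves are taken from the slide.

```python
# Each entry mirrors one row of the slide: what to use, and what to avoid it for.
STORE_GUIDE = {
    "relational": {"use": "SQL Database", "not_for": "Quickly changing data schemas"},
    "key-value": {"use": "Tables", "not_for": "Rich query"},
    "json-document": {"use": "DocumentDB", "not_for": "Complex joins"},
    "wide-column": {"use": "HBase on HDInsight", "not_for": "Operational simplicity"},
    "cache": {"use": "Redis Cache", "not_for": "Primary data store"},
}


def recommend_store(need):
    """Return (store, caveat) for a hypothetical need keyword."""
    entry = STORE_GUIDE[need]
    return entry["use"], entry["not_for"]


store, caveat = recommend_store("json-document")
print(f"Use {store}, but not for: {caveat}")
```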
Client User
DocumentDB
• Scale via additional Collections
• Product Catalog
• Community Posts
Search
• Scale via Search Units
• Product Catalog
• Community Posts
Azure IoT Suite Sample Architecture
Azure IoT Suite Remote Monitoring
People
Web/Mobile App
Power BI
Web
Storage blobs DocumentDB
Mobile
Apps
IoT Hub Stream Analytics Event Hub Web Jobs Logic Apps
Bots
Sensors
and
devices
Azure Automated
Active Directory Systems
Cortana Intelligence Sample Architecture
ADL Analytics
Devices Social
HDInsight
LOB ADL Store
Applications Video
R
Power BI
Web Sensors
Spark
SQL Database
For:
Web app user data
Address books
Device information
Other metadata
SQL Database
Tables
DocumentDB
HBase on HDInsight
Redis Cache
Azure Search
For:
Catalog data
Preferences and state
Event store
User generated content
Data exchange
IoT device registry
For:
Interactive websites
Sensor data
Message systems
Real-time query (using Phoenix)
Writing transactional data to Azure Blobs
For:
All apps
Azure Search
high
high heels
high tops
high arch
People use search as a natural, low friction way to interact with apps
Web search engines have set the bar high for search
• Instant results, auto-complete, hit highlighting, great ranking, linguistics
Search is hard and rarely a core expertise area
• From the infrastructure standpoint: availability, durability, scale, operations
• From the functionality standpoint: ranking, geo-spatial, input handling
Scenarios:
• Ecommerce and Online Retail
• User Generated Content
• Line of Business Applications
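The "high" example on this slide is an auto-complete scenario. The sketch below shows one simple way to produce such suggestions with a sorted term list and binary search. It is plain Python rather than the Azure Search API, and the extra catalog terms ("hiking boots", "low tops") are invented so that the example has something to exclude.

```python
from bisect import bisect_left


def suggest(prefix, sorted_terms, limit=5):
    """Return up to `limit` terms that start with `prefix`.

    sorted_terms must be sorted; all matches then form one contiguous run
    that begins at the position found by binary search.
    """
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # past the run of matching terms
        matches.append(term)
        if len(matches) == limit:
            break
    return matches


# Terms from the slide plus two hypothetical non-matching entries.
catalog = sorted(["high heels", "high tops", "high arch", "hiking boots", "low tops"])

suggestions = suggest("high", catalog)
print(suggestions)
```

Prefix matching is only the first step of the features listed above; ranking, hit highlighting, and linguistic handling are what make a managed search service worth using instead of a hand-rolled lookup.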