Teradata - Architecture of Teradata
Introduction To Teradata
Teradata Company Highlights
• Founded 1979 – West LA
• First product to market – 1984
• First Terabyte system – 1987
• Acquired by AT&T and merged with NCR (also acquired by AT&T) – 1992
• Tri-vested as part of NCR - 1997
• Teradata Corporation – (re)Launched October 1, 2007
– Global Leader in Enterprise Data Warehousing
• EDW/ADW Database Technology
• Analytic Solutions
– Positioned in Gartner’s Leaders Quadrant
in data warehousing since 1999
• Top 10 U.S. publicly-traded software company
– S&P 500 Member
– Listed NYSE: “TDC”
– 2007 - $1.7B revenue
Continuous (R)evolution
Hardware
+ Database
+ Consulting
+ Data models and
reports
+ Analytic applications
Continuous (R)evolution
Sell the HW, give everything else
away
Sell the SW with some HW to
run on
Sell solving business problems – and technology to
solve them
Sell applications with consulting, SW
and HW inside
Continuous (R)evolution
90% R&D
10% integration
80286
70% R&D
30% integration
i486
20% R&D
80% integration
Pentium
10% R&D
90% integration
Xeon Quad Core
Scale
• Every dimension of the technology must scale to meet today’s requirements
– Data, Data model complexity, Users, Performance, queries, Data loading, …
• What is a big Data Warehouse?
• Total spinning disk?
– 2.5 Petabytes
• Big table?
– 150 billion rows
• Number of tables?
– 300,000
• Insert/Update per day?
– 5 billion records
• Identified users?
– 100,000
• Queries per day?
– 5 million
• Data Turnover rate?
– 1TB per 5 seconds
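To make the daily figures above easier to picture, the short sketch below converts them to per-second rates. It assumes the load is spread evenly over 24 hours and that "TB" means a decimal terabyte (1000 GB); neither assumption comes from the slide, and real workloads will peak well above these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted on the slide.
queries_per_day = 5_000_000
changes_per_day = 5_000_000_000   # inserts/updates per day
turnover_tb = 1                   # 1 TB turned over ...
turnover_seconds = 5              # ... every 5 seconds

# Assumption: load is spread evenly across the day.
queries_per_second = queries_per_day / SECONDS_PER_DAY
changes_per_second = changes_per_day / SECONDS_PER_DAY

# Assumption: decimal units, 1 TB = 1000 GB.
turnover_gb_per_second = turnover_tb * 1000 / turnover_seconds

print(f"{queries_per_second:.0f} queries per second")
print(f"{changes_per_second:.0f} inserts/updates per second")
print(f"{turnover_gb_per_second:.0f} GB per second of data turnover")
```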
The Problem
10 > 09/2009
Accts. Payable
Accts. Receivable
Invoicing
Sales/Orders
Finance G/L
Customer Support
HR
Payroll
Purchasing
Order Fulfillment
Manufacturing
Inventory …
Marketing
Supply Chain
Finance
Risk Management
Maintenance
Sales
Operations
Inventory
Call Center …
Operational Systems Decision Makers
The EDW Solution
Accts. Payable
Accts. Receivable
Invoicing
Sales/Orders
Finance G/L
Customer Support
HR
Payroll
Purchasing
Order Fulfillment
Manufacturing
Inventory …
Enterprise Data Warehouse (EDW)
Marketing
Supply Chain
Finance
Risk Management
Maintenance
Sales
Operations
Inventory
Call Center …
Operational Systems Decision Makers
Active Enterprise Intelligence™
An Obvious Trend: More Speed, More Users
Strategic Intelligence Operational Intelligence
Enterprise Data Warehouse
BI Tools & reports
Analysis & visualization
Predictive Analytics
EDW Enterprise Integration
Mixed workload management
SOA, BPMS, IDEs
Portals/composite applications
Days
Seconds
Active Enterprise Intelligence™ enabled by an
Active Data Warehouse™
[Figure: Active Data Warehouse diagram. Labels follow.]
Strategic Intelligence, Operational Intelligence
Business Intelligence Tools and Applications
Teradata Warehouse
Workflow & Applications
Active Events, Active Access
Suppliers, Customers, Call Center, Logistics, Marketing, Finance, Product/Services, Executive
Active Enterprise Integration, Active Availability, Active Workload Management, Active Load
Active Enterprise Intelligence™ in Retail
Detecting Retail Fraud
Situation
Thieves make copies of cash register receipts, walk into
the store, pick up merchandise, and return items for
cash.
Problem
Associates in returns department did not have historical
POS receipt retrieval access to verify against previously
“returned” receipts or to do returns without receipts.
Solution
Associates query Teradata to quickly check if a return
has already occurred on that receipt number. Also used
by analysts to understand and prevent excessive
returns.
Impact
(for 500-store chain)
• 100% ROI in 5 months
• Stopped a crime ring on the
first day of rollout
• “Cost savings have been
huge”
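The receipt check described in the Solution above can be sketched as a keyed lookup before a return is accepted. This is only an illustration: it uses Python's built-in sqlite3 as a stand-in for the warehouse, and the table name, column names, and sample values are hypothetical rather than taken from the retailer's system.

```python
import sqlite3

# In-memory database standing in for the warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE returns ("
    "receipt_no TEXT PRIMARY KEY, store_id INTEGER, returned_on TEXT)"
)

def try_return(conn, receipt_no, store_id, returned_on):
    """Accept a return only if no return exists yet for this receipt number."""
    already = conn.execute(
        "SELECT store_id, returned_on FROM returns WHERE receipt_no = ?",
        (receipt_no,),
    ).fetchone()
    if already is not None:
        return False  # receipt was already used for a return
    conn.execute(
        "INSERT INTO returns VALUES (?, ?, ?)",
        (receipt_no, store_id, returned_on),
    )
    conn.commit()
    return True

# Hypothetical receipt presented twice, at two different stores.
first_attempt = try_return(conn, "R-1001", 12, "2009-09-01")
second_attempt = try_return(conn, "R-1001", 47, "2009-09-02")
print(first_attempt, second_attempt)
```

Because the check runs against one shared history, a copied receipt is rejected even when it is presented at a different store.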
Active Enterprise Intelligence™ in Retail
Single View of the Customer Across All Channels
Situation
Needed to add Web channel for selling shoes.
Problem
Too much time and cost to keep multiple customer
systems synchronized. Realized they needed just
one customer database, not one more for the Web,
in addition to Call Center, and POS/Store databases.
Solution
Adopted an ADW strategy, moved all customer data
to one Teradata system, revised data models to
cover all channels, added web channel for
commerce, used web services, added TASM to
handle multiple workload types
Impact
• 1M tactical hits to the
EDW per day from the
POS, Call Center, and
Web with 0.11 sec
response time
• Runs simultaneously
with back-office BI,
reports, and ETL
workloads
• Eliminated all other
customer data systems
What is the Measure of a Great
Architecture?
Handle huge changes of underlying technologies and
dependent components while continuing to deliver the
key value proposition.
Processor Roadmap: CPU power radically increasing
[Figure: processor roadmap. Timeline years shown: 2003, 2005, 2007, 2009, 2011. Process generations: 90nm, 65nm, 45nm, 32nm, 22nm. Milestones: Hyper-Threading, Dual Core, Multi Core. Inset chart: SPECInt2000 single-core performance versus dual/multi-core performance, 2000 to 2008+, annotated "5X" with 2004 marked.]
What Does Shared Nothing Mean?
• 1985 – Every hardware part, every line of software – “pure” shared
nothing
• 1995 – Multiple units of parallelism sharing CPU, memory
• 2004 – Multiple units of parallelism sharing multiple cores, memory
• 2009 – Multiple units of parallelism sharing same physical spindles
– but still not sharing data
• Future – Multiple units of parallelism in Virtual machines/cloud
not even knowing what physical machine it is on or sharing
Copyright Teradata © 2007-2009
– All rights Reserved
Teradata MPP Server Architecture
• Nodes
– Incrementally scalable to 1024
nodes
• Operating System
– Linux, Windows, Unix
• Storage
– Independent I/O
– Scales per node
• BYNET Interconnect
– Fully scalable bandwidth
• Connectivity
– Fully scalable
– Channel – ESCON/FICON
– LAN, WAN
• Server Management
– One console to view
the entire system
SMP Node1 SMP Node2 SMP Node3 SMP Node4
Server
Management
Dual BYNET Interconnects
CPU1 CPU2
Memory
Operating Sys
CPU1 CPU2
Memory
Operating Sys
CPU1 CPU2
Memory
Operating Sys
CPU1 CPU2
Memory
Operating Sys
Shared Nothing - Dividing the Work
• “Virtual processors” (vprocs) do the work
• Two types
– AMP: owns and operates on the data
– PE: handles SQL and external interaction
• Configure multiple vprocs per hardware node
– Take full advantage of SMP CPU and memory
• Each vproc has many threads of execution
– Many operations executing concurrently
– Each thread can do work for any user, transaction
• Software is equivalent regardless of configuration
– No user changes as system grows from small SMP to huge MPP
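The division of labour between the two vproc types can be sketched as follows. This is a toy model, not Teradata code: the class names, the modulo placement rule, and the use of a thread pool are all assumptions made for illustration. The point it tries to show is that the PE only dispatches and merges, while each AMP touches nothing but its own rows.

```python
from concurrent.futures import ThreadPoolExecutor

class Amp:
    """Toy AMP: owns one slice of the rows and is the only reader of it."""
    def __init__(self, amp_id):
        self.amp_id = amp_id
        self.rows = []

    def local_sum(self, column):
        # Work is done where the data lives; no other AMP is consulted.
        return sum(row[column] for row in self.rows)

class ParsingEngine:
    """Toy PE: accepts requests, fans them out to AMPs, merges the answers."""
    def __init__(self, amps):
        self.amps = amps

    def insert(self, row, key):
        # Assumption: a simple modulo stands in for the real hash placement.
        self.amps[row[key] % len(self.amps)].rows.append(row)

    def total(self, column):
        # All AMPs run the same step concurrently on their own slices.
        with ThreadPoolExecutor(max_workers=len(self.amps)) as pool:
            partials = list(pool.map(lambda amp: amp.local_sum(column), self.amps))
        return sum(partials)

amps = [Amp(i) for i in range(4)]
pe = ParsingEngine(amps)
for order_id in range(100):
    pe.insert({"order_id": order_id, "amount": order_id}, key="order_id")

rows_per_amp = [len(amp.rows) for amp in amps]
grand_total = pe.total("amount")
print(rows_per_amp, grand_total)
```

Changing `range(4)` to a larger number of AMPs requires no change to the calling code, which mirrors the "no user changes as system grows" bullet.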
Shared Nothing - Dividing the Work
• Basis of Teradata scalability
– Each AMP owns an equal slice of the disk
– Only that AMP reads that slice
• No single point of control for any operation
– I/O, Buffers, Locking, Logging, Dictionary
– Nothing centralized
– Exponential communication costs avoided
[Figure: coordination cost versus number of nodes. Labels: AMPs, Logs, Locks, Buffers, I/O, Teradata.]
Teradata Data Distribution
• Rows automatically distributed evenly by hash partitioning
– Even distribution results in scalable performance
– Done in real-time as data are loaded, appended, or changed.
– Hash map defined and maintained by the system
• 2**32 hash codes, 64K buckets distributed to AMPs
– Primary Index (PI) column(s) are hashed
– Hash is always the same - for the same values
– No reorgs, repartitioning, space management
[Figure: rows of Table A, Table B, and Table C pass through the Teradata Parallel Hash Function on the Primary Index and are spread across AMP1, AMP2, AMP3, AMP4 … AMPn. Each stored row consists of a RowHash (Hash Bucket) followed by the Data Fields.]
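The path from Primary Index value to AMP can be sketched in a few lines. The slide gives only the sizes (2**32 hash codes, 64K buckets); everything else here is an assumption for illustration: CRC-32 stands in for Teradata's hash function, the bucket is taken as the high 16 bits of the hash code, and buckets are dealt to AMPs round-robin.

```python
import zlib
from collections import Counter

NUM_BUCKETS = 64 * 1024  # "64K buckets" from the slide

def row_hash(pi_value):
    """32-bit hash code. CRC-32 is a stand-in, not Teradata's hash function."""
    return zlib.crc32(pi_value.encode("utf-8"))  # 0 .. 2**32 - 1

def bucket_of(hash_code):
    """Assumption: the bucket number is the high 16 bits of the hash code."""
    return hash_code >> 16

def build_hash_map(num_amps):
    """Assumption: buckets are dealt to AMPs round-robin."""
    return [bucket % num_amps for bucket in range(NUM_BUCKETS)]

def amp_for(pi_value, hash_map):
    """Same value, same hash, same bucket, same AMP, every time."""
    return hash_map[bucket_of(row_hash(pi_value))]

hash_map = build_hash_map(num_amps=4)
placement = Counter(amp_for(f"customer-{i}", hash_map) for i in range(100_000))

# Print rows per AMP so the spread can be inspected.
for amp, count in sorted(placement.items()):
    print(f"AMP{amp + 1}: {count} rows")
```

Since placement depends only on the value and the system-maintained map, a lookup by Primary Index value can go straight to one AMP without consulting any central directory.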
Disk Capacity Exploding
with Little Increase in Performance
Disk Drive Capacity    Disk Drive Bandwidth (MB/Sec)    Performance per Capacity (MB/Sec/GB)
36 GB                  5.5                              .155
73 GB                  6.0                              .080
146 GB                 6.4                              .044
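The performance-per-capacity figures are simply bandwidth divided by capacity, which the sketch below recomputes from the capacity and bandwidth values on the slide. The computed ratios land close to, but not exactly on, the slide's .155, .080, and .044, presumably because the chart labels were rounded.

```python
# (capacity in GB, bandwidth in MB/sec) as quoted on the slide.
drives = [(36, 5.5), (73, 6.0), (146, 6.4)]

ratios = []
for capacity_gb, bandwidth in drives:
    ratio = bandwidth / capacity_gb  # MB/sec available per GB stored
    ratios.append(ratio)
    print(f"{capacity_gb:>4} GB  {bandwidth:.1f} MB/sec  {ratio:.3f} MB/sec/GB")

# Growth from the smallest to the largest drive.
capacity_growth = drives[-1][0] / drives[0][0]
bandwidth_growth = drives[-1][1] / drives[0][1]
print(f"capacity x{capacity_growth:.2f}, bandwidth x{bandwidth_growth:.2f}")
```

Capacity grows about fourfold while bandwidth grows by well under a fifth, so each stored gigabyte gets far less I/O.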
Platform Change
• Focus used to be
– Optimization of expensive CPU cycles
– Micro-management of precious disk space
• Now
– Manage I/O
– Balance CPU power to the I/O capacity
– Find new ways to optimize I/O, trading for CPU use as necessary
– Pulling 2.5GB/sec per node continuous
• Discontinuity coming
– SSDs become price competitive and reliable
File System
• Teradata wrote a new rule book
– Old one written by IBM 35 years ago, used by all mainstream DBMSs today - except Teradata
• File system built of raw slices
• Rows stored in blocks
– Variable length
– Grow and shrink on demand
– Rows located dynamically
• May be moved to reclaim space, defrag
– Maximum block size is configurable
• System default or per table
• 8K to 128K
• Change dynamically
• Indexes are just rows in tables
• Has evolved from direct management of single spindles to completely virtualized storage, not even
knowing spindle location
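A variable-length block that grows on insert and shrinks on delete can be sketched as below. This is a toy model under stated assumptions, not the Teradata file system: keeping rows ordered by row hash and splitting a block in half when it exceeds its maximum size are illustrative choices, and the `DataBlock` class is hypothetical. Only the configurable maximum block size comes from the slide.

```python
import bisect

class DataBlock:
    """Toy variable-length block of (row_hash, payload) pairs."""
    def __init__(self, max_bytes=128 * 1024):
        self.max_bytes = max_bytes  # configurable maximum block size
        self.rows = []              # kept sorted by row_hash (assumption)

    @property
    def size(self):
        return sum(len(payload) for _, payload in self.rows)

    def insert(self, row_hash, payload):
        """Grow the block; return a new sibling block if a split was needed."""
        bisect.insort(self.rows, (row_hash, payload))
        if self.size <= self.max_bytes:
            return None
        # Assumption: an over-full block is split roughly in half.
        middle = len(self.rows) // 2
        sibling = DataBlock(self.max_bytes)
        sibling.rows = self.rows[middle:]
        self.rows = self.rows[:middle]
        return sibling

    def delete(self, row_hash):
        """Shrink the block by dropping every row with this hash."""
        self.rows = [row for row in self.rows if row[0] != row_hash]

block = DataBlock(max_bytes=8 * 1024)
sibling = None
for h in range(9):                      # nine 1000-byte rows exceed 8K
    sibling = block.insert(h, b"x" * 1000) or sibling

print(block.size, sibling.size)
block.delete(0)
print(block.size)
```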
Workload Management Evolution
• 1984 – pure timeshare
• 1987 – 4 priorities, defined by user
• 1995 – multiple priorities in multiple partitions
• 2000 – weighted workload groups
• 2004 – queuing, reserved resources, focus on tactical work
• 2009 – Visualization and detailed workgroup management
• Future – Set service level goals, our job to deliver
Active Workload Management
• Manage workloads
– Reduce server congestion
• Dynamically adjust
in-flight task priority
– Turn the dial – change priorities
• Fast active access queries
– Performance, performance,
performance
• Get maximum throughput
[Figure: Active Data Warehouse serving four workloads (Active Events, Active Access, Query and Reporting, Active Load), each with its own speed setting; settings shown: 10, 60, 75, 25.]
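"Turning the dial" can be pictured as changing a workload's weight while work is in flight and recomputing each workload's share of the machine. The sketch below is a generic weighted-share calculation with hypothetical workload names and weights; it is not TASM's actual algorithm.

```python
def cpu_shares(weights):
    """Each workload's share is its weight divided by the sum of all weights."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}

# Hypothetical workloads and weights.
weights = {"tactical": 60, "reporting": 25, "load": 15}
before = cpu_shares(weights)

# Turn the dial: raise the tactical weight while other work keeps running.
weights["tactical"] = 120
after = cpu_shares(weights)

for name in weights:
    print(f"{name:<10} {before[name]:.2%} -> {after[name]:.2%}")
```

Raising one weight shrinks the other shares proportionally without stopping them, which is the behaviour a mixed workload needs when short tactical queries must stay fast.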
TASM Reporting/Monitoring - 13.10
Availability Requirements
[Figure: number of users (log scale, 10 to 1,000,000) across the range from Strategic Intelligence to Operational Intelligence. User groups shown: IT, Finance, Planners, Power Users, Data Miners; Executives, Middle Managers, Marketing; Category Managers, Line Managers, Service Managers; Operational Employees; Consumers, Suppliers, B2B. Annotations: Mission Critical, Dual Active.]
“Always ON” – An Elusive Challenge
• Unplanned downtime
– Hardware faults
– Software faults
– Hangs
• Planned downtime
– Software upgrade
– Hardware upgrade
– Data center maintenance
• “Disasters”
– Multi-component failures
– Building disasters
– Area disasters
• And optimize resource value to the business
• And avoid hidden costs and surprises
– E.g., major performance variations
• Major opportunity for research – but must be holistic
– Reaches far beyond core database
Real time Operational Actions
Strategic
Intelligence
Operational
Intelligence
1. Customer makes
multi-segment
travel reservation
2. Flight rerouted
causing missed
connections.
“Active”
Enterprise Data
Warehouse
3. What is each customer’s
flying history?
4. How profitable is each
customer?
5. Which customers
experienced delays or
other problems in last 6
months?
WebSphere MQ,
Oracle AQ,
Microsoft MSMQ
6. Customer re-booked
and notified.
7. Airport operations
adjusted
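The flow above (an operational event arrives on a queue, the warehouse answers questions about the affected customers, and actions go back out) can be sketched as follows. Python's in-process `queue.Queue` stands in for the message-queue products named on the slide, and the customer data, field names, and the rule of rebooking the most profitable customers first are all hypothetical.

```python
import queue

# Hypothetical warehouse extract: customer id -> profile.
warehouse = {
    "c1": {"id": "c1", "profit": 120, "recent_delays": 0},
    "c2": {"id": "c2", "profit": 900, "recent_delays": 2},
    "c3": {"id": "c3", "profit": 900, "recent_delays": 0},
}

def rebooking_order(event, warehouse):
    """Rank affected customers: higher profit first, then more recent delays."""
    affected = [warehouse[cid] for cid in event["passengers"]]
    ranked = sorted(affected, key=lambda c: (-c["profit"], -c["recent_delays"]))
    return [c["id"] for c in ranked]

# Step 2: the operational system publishes a "flight rerouted" event.
events = queue.Queue()
events.put({
    "type": "flight_rerouted",
    "flight": "XX100",                  # hypothetical flight number
    "passengers": ["c1", "c2", "c3"],
})

# Steps 3 to 6: consume the event, consult the warehouse, decide who to rebook first.
event = events.get_nowait()
order = rebooking_order(event, warehouse)
print(order)
```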
Real Time Customer Management
Strategic
Intelligence
Operational
Intelligence
1. Customer inserts
Total Rewards
Card at Slot
Machine
“Active”
Enterprise Data
Warehouse
2. What is the customer’s past
spending history in all our
casinos?
3. What is a significant loss
for this person based on
market segment, past and
predicted behavior?
4. Is this customer
approaching the
predicted loss rate for
their segment?
5. What offers are
available for this
customer?
TIBCO
6. Message sent to floor
Luck Ambassador with
customer offer to
prevent additional
losses.
That’s a Wrap!
• Business requires a new level of decision making
– Many more decisions by many more people much faster
– Current representation of the state of the enterprise
• Data Warehouse must evolve to support the requirements of Active
Enterprise Intelligence
• Technology must evolve to deal with the new requirements
– Rich area for research and innovation
– Change view of what data warehouse/BI means
• Teradata driving an aggressive roadmap to meet real business
requirements
For more information, see:
http://vibranttechnologies.co.in/teradata-classes-in-mumbai.html
Thank You!

Teradata - Architecture of Teradata

  • 4. Teradata Company Highlights • Founded 1979 – West LA • First product to market – 1984 • First Terabyte system – 1987 • Acquired by AT&T and merged with acquired NCR – 1992 • Tri-vested as part of NCR - 1997 • Teradata Corporation – (re)Launched October 1, 2007 – Global Leader in Enterprise Data Warehousing • EDW/ADW Database Technology • Analytic Solutions – Positioned in Gartner’s Leaders Quadrant in data warehousing since 1999 • Top 10 U.S. publicly-traded software company – S&P 500 Member – Listed NYSE: “TDC” – 2007 - $1.7B revenue
  • 6. Continuous (R)evolution Hardware + Database + Consulting + Data models and reports + Analytic applications
  • 7. Continuous (R)evolution Sell the HW, give everything else away Sell the SW with some HW to run on Sell solving business problems – and technology to solve them Sell applications with consulting, SW and HW inside
  • 8. Continuous (R)evolution 90% R&D 10% integration 80286 70% R&D 30% integration i486 20% R&D 80% integration Pentium 10% R&D 90% integration Xeon Quad Core
  • 9. Scale • Every dimension of the technology must scale to meet today’s requirements – Data, Data model complexity, Users, Performance, queries, Data loading, … • What is a big Data Warehouse? • Total spinning disk? – 2.5 Petabytes • Big table? – 150 billion rows • Number of tables? – 300,000 • Insert/Update per day? – 5 billion records • Identified users? – 100,000 • Queries per day? – 5 million • Data Turnover rate? – 1TB per 5 seconds
  • 10. The Problem 10 > 09/2009 Accts. Payable Accts. Receivable Invoicing Sales/Orders Finance G/L Customer Support HR Payroll Purchasing Order Fulfillment Manufacturing Inventory … Marketing Supply Chain Finance Risk Management Maintenance Sales Operations Inventory Call Center … Operational Systems Decision Makers
  • 11. The EDW Solution Accts. Payable Accts. Receivable Invoicing Sales/Orders Finance G/L Customer Support HR Payroll Purchasing Order Fulfillment Manufacturing Inventory … EnterpriseEnterprise DataData WarehouseWarehouse (EDW)(EDW) Marketing Supply Chain Finance Risk Management Maintenance Sales Operations Inventory Call Center … Operational Systems Decision Makers
  • 12. Active Enterprise Intelligence™ An Obvious Trend: More Speed, More Users Strategic Intelligence Operational Intelligence Enterprise Data Warehouse BI Tools & reports Analysis & visualization Predictive Analytics EDW Enterprise Integration Mixed workload management SOA, BPMS, IDEs Portals/composite applications Days Seconds
  • 13. Active Enterprise Intelligence™ enabled by an Active Data Warehouse™ STRATEGIC INTELLIGENCEOPERATIONAL INTELLIGENCE Business Intelligence Tools and Applications Teradata Warehouse Workflow & Applications Active EventsActive Access Suppliers Customers Call Center Logistics MarketingFinanceProduct/ Services Executive Active Enterprise Integration Active Availability Active Workload Management Active Load
  • 14. Active Enterprise Intelligence™ in Retail Detecting Retail Fraud Situation Thieves make copies of cash register receipts, walk into the store, pick up merchandise, and return items for cash. Problem Associates in returns department did not have historical POS receipt retrieval access to verify against previously “returned” receipts or to do returns without receipts. Solution Associates query Teradata to quickly check if a return has already occurred on that receipt number. Also used by analysts to understand and prevent excessive returns. Impact (for 500-store chain) • 100% ROI in 5 months • Stopped a crime ring on the first day of rollout • “Cost savings have been huge”
  • 15. Active Enterprise Intelligence™ in Retail Single View of the Customer Across All Channels Situation Needed to add Web channel for selling shoes. Problem Too much time and cost to keep multiple customer systems synchronized. Realized they needed just one customer database, not one more for the Web, in addition to Call Center, and POS/Store databases. Solution Adopted an ADW strategy, moved all customer data to one Teradata system, revised data models to cover all channels, added web channel for commerce, used web services, added TASM to handle multiple workload types Impact • 1M tactical hits to the EDW per day from the POS, Call Center, and Web with 0.11 sec response time • Runs simultaneously with back-office BI, reports, and ETL workloads • Eliminated all other customer data systems
  • 16. What is the Measure of a Great Architecture? Handle huge changes of underlying technologies and dependent components while continuing to deliver the key value proposition.
  • 18. Processor RoadmapCPU power radically increasing 2003 2005 2009 2011 90nm process 45nm process 65nm process 32nm process 22nm process Hyper-Threading Dual Core Multi Core 20002000 2008+2008+ SPECInt2000SPECInt2000 5X5X SINGLE-CORESINGLE-CORE PERFORMANCEPERFORMANCE DUAL/MULTI-CORE PERFORMANCE 2007 20042004
  • 19. What Does Shared Nothing Mean? • 1985 – Every hardware part, every line of software – “pure” shared nothing • 1995 – Multiple units of parallelism sharing CPU, memory • 2004 – Multiple units of parallelism sharing multiple cores, memory • 2009 – Multiple units of parallelism sharing same physical spindles – but still not sharing data • Future – Multiple units of parallelism in Virtual machines/cloud not even knowing what physical machine it is on or sharing 19 > 09/2009 Copyright Teradata © 2007-2009 – All rights Reserved
  • 20. Teradata MPP Server Architecture • Nodes – Incrementally scalable to 1024 nodes • Operating System – Linux, Windows, Unix • Storage – Independent I/O – Scales per node • BYNET Interconnect – Fully scalable bandwidth • Connectivity – Fully scalable – Channel – ESCON/FICON – LAN, WAN • Server Management – One console to view the entire system SMP Node1 SMP Node2 SMP Node3 SMP Node4 Server Management Dual BYNET Interconnects CPU1 CPU2 Memory Operating Sys CPU1 CPU2 Memory Operating Sys CPU1 CPU2 Memory Operating Sys CPU1 CPU2 Memory Operating Sys
  • 21. Shared Nothing - Dividing the Work • “Virtual processors” (vprocs) do the work • Two types – AMP: owns and operates on the data – PE: handles SQL and external interaction • Configure multiple vprocs per hardware node – Take full advantage of SMP CPU and memory • Each vproc has many threads of execution – Many operations executing concurrently – Each thread can do work for any user, transaction • Software is equivalent regardless of configuration – No user changes as system grows from small SMP to huge MPP
  • 22. Shared Nothing - Dividing the Work • Basis of Teradata scalability – Each AMP owns an equal slice of the disk – Only that AMP reads that slice • No single point of control for any operation – I/O, Buffers, Locking, Logging, Dictionary – Nothing centralized – Exponential communication costs avoided [Figure: coordination cost versus number of nodes, with labels AMPs, Logs, Locks, Buffers, I/O, and a Teradata curve]
  • 23. Teradata Data Distribution • Rows automatically distributed evenly by hash partitioning – Even distribution results in scalable performance – Done in real-time as data are loaded, appended, or changed. – Hash map defined and maintained by the system • 2**32 hash codes, 64K buckets distributed to AMPs – Primary Index (PI) column(s) are hashed – Hash is always the same for the same values – No reorgs, repartitioning, space management [Figure: rows of Table A, Table B, and Table C pass through the Teradata parallel hash function on the Primary Index and are spread across AMP1 through AMPn; each stored row consists of a RowHash (hash bucket) and data fields]
  • 24. Disk Capacity Exploding with Little Increase in Performance [Chart: disk drive bandwidth (MB/sec) and performance per capacity (MB/sec/GB) by disk drive capacity] • 36 GB: 5.5 MB/sec, .155 MB/sec/GB • 73 GB: 6.0 MB/sec, .080 MB/sec/GB • 146 GB: 6.4 MB/sec, .044 MB/sec/GB
  • 25. Platform Change • Focus used to be – Optimization of expensive CPU cycles – Micro-management of precious disk space • Now – Manage I/O – Balance CPU power to the I/O capacity – Find new ways to optimize I/O, trading for CPU use as necessary – Pulling 2.5GB/sec per node continuous • Discontinuity coming – SSDs become price competitive and reliable
  • 26. File System • Teradata wrote a new rule book – Old one written by IBM 35 years ago, used by all mainstream DBMSs today - except Teradata • File system built of raw slices • Rows stored in blocks – Variable length – Grow and shrink on demand – Rows located dynamically • May be moved to reclaim space, defrag – Maximum block size is configurable • System default or per table • 8K to 128K • Change dynamically • Indexes are just rows in tables • Has evolved from direct management of single spindles to completely virtualized storage, not even knowing spindle location
  • 27. Workload Management Evolution • 1984 – pure timeshare • 1987 – 4 priorities, defined by user • 1995 – multiple priorities in multiple partitions • 2000 – weighted workload groups • 2004 – queuing, reserved resources, focus on tactical work • 2009 – Visualization and detailed workgroup management • Future – Set service level goals, our job to deliver
  • 28. Active Workload Management • Manage workloads – Reduce server congestion • Dynamically adjust in-flight task priority – Turn the dial – change priorities • Fast active access queries – Performance, performance, performance • Get maximum throughput [Figure: Active Data Warehouse workloads (Active Events, Active Access, Query and Reporting, Active Load) shown with speed limits of 10, 25, 60, and 75]
  • 30. Availability Requirements [Figure: number of users (10 to 1,000,000) by user type, spanning Strategic Intelligence to Operational Intelligence, with availability levels Mission Critical and Dual Active. User types: IT, Finance, Planners, Power Users, Data Miners; Executives, Middle Managers, Marketing; Category Managers, Line Managers, Service Managers; Operational Employees; Suppliers, B2B; Consumers]
  • 31. “Always ON” – An Elusive Challenge • Unplanned downtime – Hardware faults – Software faults – Hangs • Planned downtime – Software upgrade – Hardware upgrade – Data center maintenance • “Disasters” – Multi-component failures – Building disasters – Area disasters • And optimize resource value to the business • And avoid hidden costs and surprises – e.g., major performance variations • Major opportunity for research – but must be holistic – Reaches far beyond core database
  • 32. Real Time Operational Actions (Strategic Intelligence, Operational Intelligence) 1. Customer makes multi-segment travel reservation. 2. Flight rerouted, causing missed connections. 3. What is each customer’s flying history? 4. How profitable is each customer? 5. Which customers experienced delays or other problems in the last 6 months? 6. Customer re-booked and notified. 7. Airport operations adjusted. [Figure labels: “Active” Enterprise Data Warehouse; WebSphere MQ, Oracle AQ, Microsoft MSMQ]
  • 33. Real Time Customer Management (Strategic Intelligence, Operational Intelligence) 1. Customer inserts Total Rewards Card at slot machine. 2. What is the customer’s past spending history in all our casinos? 3. What is a significant loss for this person based on market segment, past and predicted behavior? 4. Is this customer approaching the predicted loss rate for their segment? 5. What offers are available for this customer? 6. Message sent to floor Luck Ambassador with customer offer to prevent additional losses. [Figure labels: “Active” Enterprise Data Warehouse; TIBCO]
  • 34. That’s a Wrap! • Business requires a new level of decision making – Many more decisions by many more people much faster – Current representation of the state of the enterprise • Data Warehouse must evolve to support the requirements of Active Enterprise Intelligence • Technology must evolve to deal with the new requirements – Rich area for research and innovation – Change view of what data warehouse/BI means • Teradata driving an aggressive roadmap to meet real business requirements
  • 36. For more information, follow the link below: http://vibranttechnologies.co.in/teradata-classes-in-mumbai.html Thank You!
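The row distribution described on slide 23 (hash the Primary Index value, map the hash to one of 64K buckets, map the bucket to an AMP) can be sketched roughly as follows. This is an illustration only, not Teradata's actual hash function or hash map: CRC32 as the 32-bit hash, the high-order 16 bits as the bucket number, the round-robin bucket-to-AMP map, and every function name here are assumptions made for the example.

```python
import zlib
from collections import Counter

NUM_BUCKETS = 64 * 1024  # 64K hash buckets, as stated on slide 23


def row_hash(primary_index_value: str) -> int:
    """Return a 32-bit hash of the Primary Index value (CRC32 is a stand-in)."""
    return zlib.crc32(primary_index_value.encode("utf-8")) & 0xFFFFFFFF


def hash_bucket(hash_code: int) -> int:
    """Assumed rule: the high-order 16 bits of the hash select the bucket."""
    return hash_code >> 16


def build_hash_map(num_amps: int) -> list:
    """Hypothetical hash map: buckets are dealt to AMPs round-robin."""
    return [bucket % num_amps for bucket in range(NUM_BUCKETS)]


def amp_for_row(primary_index_value: str, hash_map: list) -> int:
    """Same Primary Index value -> same hash -> same bucket -> same AMP."""
    return hash_map[hash_bucket(row_hash(primary_index_value))]


if __name__ == "__main__":
    hash_map = build_hash_map(num_amps=4)
    counts = Counter(
        amp_for_row(f"customer-{i}", hash_map) for i in range(100_000)
    )
    for amp in sorted(counts):
        print(f"AMP{amp + 1}: {counts[amp]} rows")
```

Because the AMP is derived only from the Primary Index value and the system-maintained map, any component can locate a row without consulting a central directory. In this sketch, growing the system would mean reassigning some buckets in the map, not rehashing the rows.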

Editor's Notes

  • #15: Retail fraud is a $16 B per year problem in the USA alone. With web receipts and better copying capabilities, thieves can make multiple copies of a single receipt and make multiple returns for cash or other merchandise. Or they can bring back shoplifted items and try to exchange them for cash. The problem is that associates in the Returns department often don’t have access to past sales information and can’t easily keep track of returned merchandise. This is especially problematic if the policy is to accept returns without receipts. So the solution is straightforward: hook up the Point of Sale systems so that within seconds the Teradata data warehouse is updated with sales, return, exchange, and void data, and provide the Returns department with the entire history of purchases by that customer, so they can ensure that a sold product can only be returned once. The impact? Huge, according to one Teradata customer who has already built this system. They stopped a crime ring on the first day of their rollout, a group that had defrauded the company of thousands of dollars. They saw a 100% payback on their investment in just 5 months, and continue to reap the benefits of this example use of Active Enterprise Intelligence.
  • #25: In this chart, we have three different disk drive sizes, and you can see that, per generation, disk drive bandwidth hasn’t increased very much. As disk capacities get larger (36 GB → 73 GB → 146 GB), the performance-per-capacity ratio (disk bandwidth relative to capacity, on the right side of the chart) declines significantly. The key metric on this slide is performance per capacity (MB/sec/GB). Look at this slide! Capacity is doubling, but throughput per gigabyte is diminishing! If you fill all the drives up with data, you will not have enough I/O bandwidth! Choosing twice as much storage capacity in a configuration without increasing the number of physical disks (to keep I/O constant) will result in performance degradation.
  • #29: Assuming workloads are categorized, this illustration shows “speed limits” which are actually resource limits for each workload. Each workload is allowed to consume a limited amount of resources at any given time to ensure other workloads get their rightful share. Dynamic Resource Prioritization Inside every fully utilized active data warehouse, there’s a major turf battle going on. Each job in the database is engaged in an ongoing struggle for more and more resources for its own work, often competing against other diverse activities. In most databases, these me-first conflicts result in short, resource-light queries falling victim to the heavier jobs. Those batch fraud-detection reports and long-running market share analysis queries essentially take ownership of the database and all it has to give. But Teradata Database lets your specific business needs determine how your precious database resources are divided. Once a definition for equitable sharing of database assets is in place, it automatically controls what percent of the CPU and disk I/O those batch reports and complex queries, as well as those vulnerable short queries, will receive. When there’s a handful of users on the system, Teradata Database spreads available resources out relative to the priorities and assignments that have been made to those particular users, without a single sub-second of CPU being wasted. Teradata Database has made job scheduling and prioritization of the work a core competency since 1988. And recently, that technology has deepened and matured offering even more flexibility. Teradata’s Priority Scheduler can be used to ensure that the event-driven work coming from the web is allowed to cut into line to grab the CPU it needs to get that promotion back to the client quickly. 
For example, if the tactical query that comes up with that promotion returns an answer in 1 second when running alone in the database, that same query, if armed with a high Teradata Database priority, can maintain a similar turnaround even if multiple complex inventory adjustment queries begin executing at the same time. For the active data warehouse, it will be critical to keep more resource-hungry complex queries from dominating the resources in the system, starving out the shorter tactical work. Teradata’s Dynamic Workload Manager will play a big role in enabling favored work to be as near to real time as it needs to be.
  • #31: While no two-dimensional drawing can accurately portray such complex issues, this graphic frames the discussion around when to move to mission critical and dual active solutions. In general, the type of users often correlates with the population of users. For example, we know that the consumer population for many industries can mean tens of thousands to millions of possible users via the internet. Similarly, for some industries, the population of supplier employees who access your data warehouse can be enormous, maybe not always in concurrent users but certainly in potential users. At the other end of the spectrum, planning, analysis, and power users tend to be a small community, albeit an influential one. In the middle of the graphic we see overlaps of many kinds because line managers (category managers, sales managers, service managers, etc.) often bounce between strategic decisions and operational decisions, with probably more time spent on the operational tasks. Business critical is not a well-defined term in our industry. It tends to mean anything less than mission critical. These users can often tolerate downtime, from a few hours to perhaps even an entire day. But many data warehouse sites have become so dependent on the EDW that they have “hardened” the server, software, and procedures to a mission critical level. This means the executives realize how many decisions are made daily based on BI tool reporting, so they are willing to fund the project to increase system availability. Mission critical can begin in the EDW and certainly extends all the way to the end of the graphic. These clients understand that large populations of front line users will demand 24X7 data availability. With operational employees you MIGHT be able to tolerate a 10-20 minute outage every month. It depends very much on the business use of the EDW. 
As the EDW evolves to larger populations and more operational ACTIVE tasks, outages become increasingly expensive, so additional investments in availability become mandatory. In some cases, an active data warehouse becomes so critical to the operational employee that it becomes necessary to step up to a dual active configuration. This is particularly true in retail with 100s of concurrent employees and suppliers using the data, but it may also occur with large call centers or sales staff. Finally, we hope it is obvious that when consumers gain access to the data warehouse, it is typically for eCommerce purchasing. No downtime is tolerated in this case because the loss of revenue cannot be tolerated.
  • #33: Problem: Lack of ability to track customer gaming behavior and Comp redemption. No mechanism to communicate or react to specific behaviors and trends Solution: Player Contact System - when a patron swipes his/her card at a casino that information is sent to Teradata. The player profile is accessed and it is determined if the casino should make personal contact with that player. Allows Harrah’s to provide real-time offers to customers at each gaming point Enables Harrah’s to track the redemption of any comp provided to a guest as the comp is redeemed or partially redeemed. Allows them not to “over-comp” guests. Future: “Marketing At The Slots” initiative. This implementation has a BusinessWorks process receiving inbound card-swipes from the Slot Data System and building an EDW query. It then makes a Request/Reply call to Teradata to solicit and compile an XML message which is then published back out on the TIB for consumption by other applications. This will drive CRM to a new “real-time” level allowing interaction with the customer while they are gaming.
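The performance-per-capacity metric discussed in note #25 is simply disk bandwidth divided by capacity. A minimal sketch of that arithmetic, using the three drive sizes and bandwidth figures from slide 24, is shown below. The pairing of each bandwidth value with a capacity is read from the chart labels and is an assumption, as are the function and variable names.

```python
# (capacity in GB, bandwidth in MB/sec), as read from the slide 24 chart labels
DRIVES = [(36, 5.5), (73, 6.0), (146, 6.4)]


def perf_per_capacity(capacity_gb: float, bandwidth_mb_s: float) -> float:
    """Bandwidth available per gigabyte stored, in MB/sec/GB."""
    return bandwidth_mb_s / capacity_gb


if __name__ == "__main__":
    for capacity, bandwidth in DRIVES:
        ratio = perf_per_capacity(capacity, bandwidth)
        print(f"{capacity:>4} GB: {bandwidth} MB/sec -> {ratio:.3f} MB/sec/GB")
```

The computed ratios (about 0.153, 0.082, and 0.044 MB/sec/GB) are close to the values printed on the chart (.155, .080, .044). Going from 36 GB to 146 GB drives leaves roughly 3.5 times less bandwidth per gigabyte, which is why the note recommends adding spindles along with capacity.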