
Introduction to Teradata

Teradata Architecture

LEVEL – LEARNER
Module 1: Teradata basics

Objectives:
After completing this chapter you will be able to answer the
questions below:
• What is Teradata?
• What are the unique features of Teradata?
• What are the Teradata components and their functions?
• What is the Teradata Architecture?
Introduction to Teradata Database

• Teradata is a relational database management system that drives a
company's data warehouse
• It is compatible with industry standards (ANSI compliant)
• The architecture supports both single-node Symmetric Multiprocessing
(SMP) systems and multinode Massively Parallel Processing (MPP) systems
• It uses parallelism to manage terabytes of data
• It is built on a parallel architecture
• Its scalability ranges from 10 GB to 100+ TB of data
• Teradata runs on UNIX MP-RAS and Windows 2000 Server platforms
• It is capable of supporting many concurrent users from various platforms
over TCP/IP or IBM channel connections
Unique Features of Teradata

• Parallel processing
– Each AMP holds a portion of the data, and the AMPs process their portions in parallel
• Linear Scalability
– Double the AMPs and double the speed
• Mature Optimizer
– The Parsing Engine (PE) contains the mature optimizer
• Automatic Data distribution
– Each table has a Primary Index whose value is hashed, so rows are distributed
to the AMPs automatically
• Shared Nothing Architecture
– Each AMP has its own memory, CPU and disk, hence the name Shared
Nothing Architecture
• Single Data Store
– Teradata's scalability allows all data to be on one system. This is a single
data store
Teradata – Parallel processing

• The rows of a Teradata table are spread across the AMPs, so
each AMP can process its rows in parallel when a user queries
the table.

[Picture labels: Parsing Engine (PE), BYNET]
Teradata – Linear Scalability

Teradata systems can add AMPs for linear scalability.

Linear scalability means that if you double your AMPs and their
supporting nodes, the performance doubles!
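
The arithmetic behind this claim can be sketched in a few lines of Python. This is only an idealized model: the function names and the per-AMP scan rate are hypothetical, and it assumes rows are spread perfectly evenly and that each AMP works independently.

```python
def avg_rows_per_amp(total_rows: int, amp_count: int) -> float:
    """Rows each AMP must scan when rows are evenly distributed."""
    return total_rows / amp_count


def scan_seconds(total_rows: int, amp_count: int, rows_per_second_per_amp: float) -> float:
    """Elapsed time for a full scan when all AMPs work in parallel.

    Every AMP scans only its own share, so the elapsed time is the
    time a single AMP needs for that share (idealized assumption).
    """
    return avg_rows_per_amp(total_rows, amp_count) / rows_per_second_per_amp


# Doubling the AMPs halves the work per AMP, and so the elapsed time.
ten_amps = scan_seconds(1_000_000, 10, 1_000)
twenty_amps = scan_seconds(1_000_000, 20, 1_000)
```

In this model `twenty_amps` is half of `ten_amps`; real systems approach this only when the data is not skewed.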
Teradata Architecture

Teradata Components
• Parsing engine (PE)
• BYNET (BanYan NETwork)
• AMP
• Disk
What is a Node?

• Gateway and channel-driver software run as processes.
• Users connecting via the mainframe access Teradata
through the channel, and all other users use the LAN
gateway.
• The Parallel Database Extensions (PDE) control the Access
Module Processors (AMPs) and Parsing Engines (PEs), which
are referred to as Virtual Processors (vprocs) and
reside in the node's memory.
• The operating system running the node is Linux.
Node

Each node is attached via a network to a disk farm.

• A Teradata AMP is assigned a virtual disk (Vdisk) to store its
tables and rows.
• Only the AMP assigned to the virtual disk can read or write
to that disk.
• A node holds 40-50 AMPs.
Number of Nodes and AMPs

Query to identify the nodes in a Teradata server (one row is returned per node):

SELECT NodeID FROM dbc.ResUsageSPma
GROUP BY 1;

Query to identify the AMPs in a Teradata server (one row is returned per AMP vproc):

SELECT Vproc FROM dbc.diskspace
GROUP BY 1;
SMP Node

• SMP stands for symmetric multiprocessing, which means
each CPU performs equally, and all CPUs share a
pool of memory and operate under one operating system.
MPP

• Two or more SMP nodes connected via the BYNETs form one
Massively Parallel Processing (MPP) system.
Teradata Functional Overview

[Picture: LAN connections for a network-attached client]
Teradata Functional Overview

[Picture: mainframe connection to Teradata]

Parsing Engine

• When a user logs into Teradata, a PE logs them in and is
responsible for their entire session.
• The PE checks the SQL syntax.
• The PE checks security, creates the EXPLAIN plan and builds a
plan for the AMPs to follow. Hence the PE is also known as the
'Optimizer'.
• The PE converts EBCDIC (from mainframe queries) to
ASCII on the way in, and the AMPs are responsible for
converting from ASCII to EBCDIC on the way out.
• The PE always delivers the final answer set to the user.

The Parsing Engine's biggest responsibility is
building a parallel-aware, cost-based plan for the AMPs to follow
to retrieve the data.
Parsing Engine Components

Parsing Engine elements and their processes:

Session Control
• Manages session activities, such as logon,
password validation, and logoff.
• Recovers sessions following client or server
failures.
Parser
• Decomposes SQL into relational data
management processing steps.
Optimizer
• Determines the most efficient path to access
data.
Dispatcher
• Receives processing steps from the parser
and sends them to the appropriate AMPs via
the BYNET.
• Monitors the completion of steps and
handles errors encountered during
processing.
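
The dispatcher's role can be illustrated with a small Python sketch. It is a toy model, not Teradata code: the `AMPS` data, `amp_step` and `parsing_engine` are hypothetical names, and a thread pool merely stands in for the BYNET carrying the same step to every AMP.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "AMPs": each one owns its own slice of the table.
AMPS = [
    [{"emp_no": 1001, "dept": 10}, {"emp_no": 1004, "dept": 20}],
    [{"emp_no": 1002, "dept": 10}],
    [{"emp_no": 1003, "dept": 30}],
]


def amp_step(rows, dept):
    """Work done by one AMP: scan only the rows it owns."""
    return [row for row in rows if row["dept"] == dept]


def parsing_engine(dept):
    """Dispatch one step to every AMP, wait for all, merge the answer set."""
    with ThreadPoolExecutor(max_workers=len(AMPS)) as bynet:
        partials = list(bynet.map(lambda rows: amp_step(rows, dept), AMPS))
    merged = [row for part in partials for row in part]
    return sorted(merged, key=lambda row: row["emp_no"])


answer_set = parsing_engine(10)
```

Each "AMP" touches only its own rows; the final merge and delivery happen in one place, mirroring the PE delivering the final answer set.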
How does the PE build the best plan?

The PE uses COLLECTED STATISTICS to build the best
plan (the least-cost plan).

Collected statistics determine how confident the PE is when estimating
how many rows a step will access. They record, for example, how many
unique values and how many NULL values a column has, and this information
is stored in the data dictionary. When you submit a query in Teradata, the
Parsing Engine checks whether statistics are available for the
requested table. If statistics were collected earlier, the PE generates a
plan with "high confidence"; in the absence of collected statistics, the plan
is generated with "low confidence".
BYNET

• The BYNET connects the PEs and AMPs and passes instructions and the
corresponding outputs between them.
• In a Teradata system, there are two BYNETs, 'BYNET 0' and
'BYNET 1'. If one BYNET fails, the other one carries
the traffic. Having two also speeds up communication and hence improves query
performance.
• Symmetric Multiprocessing (SMP) node: it has a boardless BYNET and no
physical BYNET.
• Massively Parallel Processing (MPP) system: nodes are connected by the
two physical BYNET boards.
• The BYNET is responsible for broadcast, multicast and point-to-point
communication between nodes and virtual processors.
AMP

• AMPs are responsible for storing and retrieving rows from their
assigned disk (Vdisk).
• AMPs lock the tables and rows.
• AMPs sort rows and do all aggregation.
• AMPs handle all space management and space accounting.
• AMPs convert ASCII to EBCDIC when returning answer sets to the
mainframe.
• In Teradata 13, the number of AMP Worker Tasks (AWTs) per AMP was increased for better
performance.
All Teradata tables are spread across ALL AMPs.
Disk Array
• Each AMP vproc is assigned to a virtual disk (Vdisk)
• A Vdisk may contain 119 GB of disk space
Teradata Components

• The maximum number of vprocs per node can be as high as
128
• Each Parsing Engine (PE) can manage up to 120 individual
sessions
• Each node holds up to 40-50 AMPs
• The maximum number of vprocs that can be supported in a
single system is 16,384
• Each BYNET supports up to 1024 nodes in a system
Test Your Understanding

Questions:

1. What is the Parsing Engine?
2. What does AMP stand for?
3. What is the function performed by the BYNET?
4. How many BYNET systems are there in Teradata? Explain
their functionalities.
5. What is TDP?

Summary

This chapter gives an overview of the following
processes in Teradata:
• The PE checks the syntax of the query and also checks the
security rights of the user submitting it.
• The PE comes up with the best optimized plan for execution
of the query.
• The PE passes this plan through the BYNET to the AMPs.
• The AMPs follow the plan to retrieve data from their disks.
• The AMPs pass the data to the PE through the BYNET.
• The PE then passes the data to the user.

Module 2: RDBMS Overview

Objectives:
• After completing this chapter you will be able to answer the
following questions
• What is RDBMS?
• Describe Logical/Relational Modeling?
• What is the relationship between primary and
foreign keys?
• What are the advantages of Relational Modeling?
Introduction to RDBMS

A database is a collection of permanently stored data that is:

• Logically related – data relates to other data
• Shared – many users may access data
• Protected – access to data is controlled
• Managed – data has integrity and value
• Based on the relational model
Logical/Relational Model

• The Logical Model
– Should be designed without regard to usage
– Can accommodate a wide variety of front-end tools
– Allows the database to be created more quickly
– Should be the same regardless of data volume
– Represents the real-world business in a tabular (relational) form
– Includes all the data definitions within the scope of the
enterprise or application
– Is generic: the logical model is the template for physical
implementation on any RDBMS platform
• Teradata supports fully normalized logical models
– Ability to perform 64-table joins
– Ability to perform large aggregations
Logical/Relational Model

• A relational database contains a set of logically related tables
• A table is a two-dimensional representation of data consisting of
rows and columns
• A column always contains like data
• A row is one instance of all the columns in a table
• In a relational database, tables are defined as a named collection of
one or more named columns that can have zero or many rows of
related information
• Each row represents an occurrence of the entity defined by the table. An
entity is defined as a person, place, thing or event about which the
table contains information.
• In relational math, the following hold true:
– Table = a relation
– Row = a tuple
– Column = an attribute
Primary and Foreign keys

Primary Key rules:
• A Primary Key is required for every table.
• Only one Primary Key is allowed in a table.
• Primary Keys may consist of one or more columns.
• Primary Keys cannot have duplicate values (ND).
• Primary Keys cannot be Null (NN).
• Primary Keys are considered non-changing values (NC).
Foreign Key rules:
• Foreign Keys (FKs) are optional.
• More than one Foreign Key is allowed in a table.
• FKs may consist of one or more columns.
• Foreign Keys can have duplicate values.
• Foreign Keys can be Null.
• Changes to Foreign Keys are allowed.
• Each FK value must exist somewhere as a Primary Key value (referential integrity).
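
The key rules above can be expressed as simple checks. The following Python sketch is illustrative only: the helper names and the sample `departments` / `employees` rows are invented for this example, and it covers single-column keys.

```python
def check_primary_key(rows, pk):
    """Return rule violations for the single-column primary key `pk`."""
    problems, seen = [], set()
    for row in rows:
        value = row.get(pk)
        if value is None:
            problems.append("NULL primary key")                # NN rule
        elif value in seen:
            problems.append(f"duplicate primary key {value}")  # ND rule
        seen.add(value)
    return problems


def check_foreign_key(child_rows, fk, parent_rows, pk):
    """Return FK values with no matching PK in the parent table.

    NULL foreign keys and repeated foreign keys are allowed.
    """
    parent_keys = {row[pk] for row in parent_rows}
    return [row[fk] for row in child_rows
            if row[fk] is not None and row[fk] not in parent_keys]


departments = [{"dept_no": 10}, {"dept_no": 20}]
employees = [
    {"emp_no": 1, "dept_no": 10},
    {"emp_no": 2, "dept_no": None},   # NULL FK is allowed
    {"emp_no": 3, "dept_no": 30},     # breaks referential integrity
]
orphans = check_foreign_key(employees, "dept_no", departments, "dept_no")
```

Here `orphans` contains only department 30, the one foreign key value that does not exist as a primary key.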
Relational Advantage
Advantages of a relational database:
Ease of use: Representing information as tables consisting of rows and columns makes it much easier to
understand.

Flexibility: Information from different tables can be linked and extracted easily, using
operators such as project and join, to present it in the form desired.

Security: Security control and authorization can also be implemented more easily by moving sensitive
attributes in a given table into a separate relation with its own authorization controls. If authorization
requirements permit, a particular attribute can be joined back with the others to enable full information
retrieval.

Data Independence: Data independence is achieved more easily with the normalized structure used in
a relational database than in the more complicated tree or network structures.

Data Manipulation Language: Responding to queries by means of a language based
on relational algebra and relational calculus, e.g. SQL, is easy in the relational database approach. For data
organized in other structures, the query language either becomes complex or is extremely limited in its
capabilities.

Cater for future requirements: By having data held in separate tables, it is simple to add records
that are not yet needed but may be in the future. For example, the city table could be expanded to
include every city and town in the country, even though no other records are using them all as yet. A flat
file database cannot do this.
Module 3: Teradata Index

Objectives:
After completing this chapter you will be able to answer the
questions below:
• What is a Primary Index?
• What is a Secondary Index?
• How are data rows stored and retrieved?
Indexing

An index is the physical mechanism used to store and access the data.

Primary keys Vs. Primary Indexes

Indexes are conceptually different from keys.

• A PK is a relational modeling convention which allows each
row to be uniquely identified
• A PI is a Teradata convention which determines how rows will
be stored and accessed
Primary Index

• The Primary Index is defined when the table is created.
• The Primary Index cannot be changed. Changing the PI
requires dropping and recreating the table.
• It is the mechanism used to assign a row to an AMP.

When the Primary Index is not specified, Teradata will default to
the first column in the table, and it will be defined as
Non-Unique.
Unique Primary Index (UPI)

• If the column chosen as the index is unique, then it is a UPI.
• A UPI results in even distribution of the rows of the table
across all AMPs
Unique Primary Index (UPI)

• Use the Primary Index column in your SQL WHERE clause
and only one AMP retrieves
• A UPI retrieval is a one-AMP operation and returns one row
Non-Unique Primary Index (NUPI)

• If the column chosen as the index is not unique, then it is a NUPI.
• With a NUPI, how evenly the rows of the table are distributed is proportional to
the degree of uniqueness of the index.
• A Non-Unique Primary Index (NUPI) will have duplicates grouped together on
the same AMP, so the data will always be somewhat skewed (uneven). The skew in
the pictured example is reasonable.
Non-Unique Primary Index (NUPI)

• Use the Primary Index column in your SQL WHERE clause
and only one AMP retrieves.
• A NUPI retrieval is a one-AMP operation and can return multiple rows
Multi-Column Primary Index

A table can have only one Primary Index, but you can combine
up to 64 columns to form one Multi-Column
Primary Index.
Multi-Column Primary Index

• Use the Primary Index columns in your SQL WHERE clause,
and only one AMP retrieves
NO Primary Index

• A table that specifically states NO PRIMARY INDEX will
receive no primary index. It will distribute the data evenly
but randomly, and this is often used as a staging table.
NO Primary Index

To retrieve a record, Teradata performs a full table scan, as
there is no primary index.
NO Primary Index

• NoPI is generally preferred when the need is to load records
temporarily into a staging table.
• Data can be quickly loaded from the source to the staging
table. From the staging table the data can be moved to the
production table using an INSERT/SELECT statement.
How Teradata distributes and retrieves data

• The Teradata Parsing Engine takes the Primary Index value of a row and
runs a math calculation called the Hash Formula on that Primary Index
column value.
• This produces a 32-bit Row Hash, which equates to an integer.
• The Row Hash points to a bucket in the Hash Map, and the bucket assigns the row to an
AMP.
32-bit row hash 00000000000000000000000000001101 = 13

• Every Teradata system has one Hash Map with a million buckets. Inside the
buckets are AMP numbers.
Placing rows on AMP

• In the pictured example, Emp_No 1001 (the Primary Index value) was hashed and the
output was a Row Hash of 13. Teradata counted over to bucket 13 in the
Hash Map, which has the number one (1) inside it. This means
that this row will go to AMP 1.
• Emp_No 1002 (the Primary Index value) was hashed and the output was a Row Hash of 5.
Teradata counted over to bucket 5 in the Hash Map, which has the number
two (2) inside it. This means that this row will go to AMP 2.
• There is one Hashing Formula in Teradata, and it is consistent.

[Picture: Hash Map lookups for Emp_No 1001 and Emp_No 1002]
Review of Hashing process

• Hash the Primary Index Value for a row with the Hash
Formula.
• The output of the Hash Formula is a 32-bit Row Hash.
• Take the Row Hash and find its corresponding bucket in the
Hash Map.
• Send the row and its Row Hash to the AMP listed in the
Hash Map Bucket.
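
The four steps above can be sketched in Python. This is a toy model under stated assumptions: Teradata's real Hash Formula is not described in this material, so `zlib.crc32` is used purely as a stand-in that yields a consistent 32-bit value; the map has 16 buckets instead of about a million; the bucket is chosen with a simple modulo; and the round-robin assignment of buckets to AMPs is invented for the example.

```python
import zlib

NUM_BUCKETS = 16   # toy size; the slides describe about a million buckets
NUM_AMPS = 4

# Hash Map: bucket number -> AMP number (assigned round-robin here).
HASH_MAP = [bucket % NUM_AMPS for bucket in range(NUM_BUCKETS)]


def row_hash(pi_value) -> int:
    """Stand-in for the Hash Formula: a 32-bit integer from the PI value."""
    return zlib.crc32(str(pi_value).encode("utf-8")) & 0xFFFFFFFF


def amp_for(pi_value) -> int:
    """Hash the PI value, find its bucket, read the AMP from the Hash Map."""
    bucket = row_hash(pi_value) % NUM_BUCKETS
    return HASH_MAP[bucket]


# The same Primary Index value always lands on the same AMP,
# which is what lets the PE find the row again at query time.
placement = {emp_no: amp_for(emp_no) for emp_no in (1001, 1002, 1003, 1004)}
```

Because the function is deterministic, storing a row and retrieving it later both resolve to the same AMP without any lookup table of rows.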
Skew Factor

• Skew refers to the row distribution on AMPs. If the data is highly
skewed, some AMPs hold many more rows than others,
i.e. the data is not evenly distributed. This in turn
results in poor performance. Indexes should be chosen
with the utmost care to avoid skew.

• NULL values in the Primary Index are a main reason for skew. A
table with a Unique Primary Index can have only one NULL value,
but a NUPI table can have many NULL values, and each NULL
value hashes to the same AMP.
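
A short Python sketch can make the effect of NULLs on skew concrete. Everything here is illustrative: `toy_amp_for` is a hypothetical placement rule (not the Hash Formula), and the skew measure used, (1 - average / maximum) * 100, is one commonly quoted formula that these slides do not themselves define.

```python
from collections import Counter

NUM_AMPS = 4


def toy_amp_for(pi_value):
    """Hypothetical placement rule: every NULL (None) lands on AMP 0."""
    return 0 if pi_value is None else pi_value % NUM_AMPS


def rows_per_amp(pi_values):
    """Count how many rows each AMP receives."""
    counts = Counter(toy_amp_for(value) for value in pi_values)
    return [counts.get(amp, 0) for amp in range(NUM_AMPS)]


def skew_factor(counts):
    """(1 - average / maximum) * 100; 0 means a perfectly even spread."""
    biggest = max(counts)
    if biggest == 0:
        return 0.0
    return (1 - (sum(counts) / len(counts)) / biggest) * 100


unique_pi = rows_per_amp(range(16))                       # 16 distinct values
nupi_with_nulls = rows_per_amp(list(range(8)) + [None] * 8)
```

With 16 distinct values every AMP gets the same number of rows; with eight NULLs in the index, all of them pile onto one AMP and the skew factor rises sharply.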
Uniqueness Value

• Each AMP places a Uniqueness Value after the Row Hash
to track duplicate values
• The Hash Formula is consistent, so every Smith has the
same Row Hash and the same goes for each Jones and each
Patel. Therefore, duplicate values land on the same AMP.
• Row-ID equals the Row Hash of the Primary Index column
plus the Uniqueness Value.
Row ID

UNIQUE PRIMARY INDEX
• The Uniqueness Value on each Row-ID is 1.
• Each AMP sorts its rows by the Row-ID.

NON-UNIQUE PRIMARY INDEX
• The Uniqueness Value increases on all duplicate names.
• Each AMP sorts its rows by the Row-ID.

AMPs sort rows by Row-ID so that like data is grouped
together and so that binary searches can be used.
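
How a Row-ID is built and why sorting by it enables binary search can be sketched with Python's standard `bisect` module. The `Amp` class is a hypothetical toy, not Teradata's storage format; Row Hash values are passed in directly rather than computed.

```python
import bisect


class Amp:
    """Toy AMP that keeps rows sorted by Row-ID = (row hash, uniqueness value)."""

    def __init__(self):
        self.row_ids = []   # sorted list of (row_hash, uniqueness) tuples
        self.rows = {}      # row_id -> row data

    def _range(self, row_hash):
        """Binary-search the slice of Row-IDs that share this row hash."""
        lo = bisect.bisect_left(self.row_ids, (row_hash, 0))
        hi = bisect.bisect_left(self.row_ids, (row_hash + 1, 0))
        return lo, hi

    def insert(self, row_hash, row):
        """Store a row; duplicates of a row hash get the next uniqueness value."""
        lo, hi = self._range(row_hash)
        row_id = (row_hash, hi - lo + 1)
        bisect.insort(self.row_ids, row_id)
        self.rows[row_id] = row
        return row_id

    def find(self, row_hash):
        """Return every row with this row hash, located by binary search."""
        lo, hi = self._range(row_hash)
        return [self.rows[row_id] for row_id in self.row_ids[lo:hi]]


amp = Amp()
first = amp.insert(7, "Smith, A")
second = amp.insert(7, "Smith, B")   # duplicate row hash: uniqueness increases
other = amp.insert(3, "Jones")
```

Rows that share a Row Hash end up adjacent in the sorted list, so one binary search finds the whole group.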
Example

SEL * FROM Employee_table WHERE
last_name = 'Smith';

Plan:
1. The PE sees that last_name is the Primary Index
2. It hashes 'Smith' and gets the Row Hash
3. Row Hash = 7
4. It counts over to bucket 7 in the Hash Map,
which says AMP 1
5. It passes a message to AMP 1 through the
BYNET to retrieve the rows with Row Hash 7
6. AMP 1 brings back all columns for Row Hash 7
('Smith')
Binary Search - Example

SEL * FROM order_table WHERE
Order_Number = 50;
Plan:
1. The PE sees that Order_Number is the Primary
Index
2. It hashes 50 and gets the Row Hash
3. Row Hash = 75
4. It counts over to bucket 75 in the Hash Map,
which says AMP 1
5. It passes a message to AMP 1 through the
BYNET to retrieve the row with Row Hash 75
6. AMP 1 performs a binary search
Primary Index Example

• A Unique Primary Index will spread the data perfectly
evenly.
• A Non-Unique Primary Index will NOT spread the
data perfectly evenly.
Primary Index Example

• A Multi-Column Primary Index is often used to fix a
data skew problem.
• With No Primary Index, all AMPs read all of their rows
(full table scan) because there is no Primary Index.
Secondary Index

• A Secondary Index can be created and dropped dynamically

• Syntax

• A Secondary Index requires a separate physical structure (the
subtable), but a Primary Index does NOT require a separate
physical structure
• A Unique Secondary Index (USI) subtable contains two
columns
1. Emp_No (the USI column)
2. Row-ID of the real Primary Index of the base table
Primary Index Vs Secondary Index
How Parsing Engine uses the USI Subtable

• Parsing Engine plan - it is a two-AMP operation

Emp_No is a USI.
The PE hashes 1004 to see which AMP holds the row in the subtable (AMP 3).
The PE has the BYNET contact AMP 3, which retrieves the subtable row for 1004 (single AMP).
That AMP passes the real Row-ID of the base table row (1,4) back up to the PE.
The PE uses the Row-ID to find the base table row with another single-AMP retrieve.

• A USI is a two-AMP operation
• The first AMP reads the subtable and the second reads the base table.
• Two binary searches are performed in total, and one row is returned.
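
A toy Python model shows why a USI retrieve touches at most two AMPs. The names, the modulo-based `amp_of` placement, and the Row-ID values are all hypothetical; the point is only that the subtable row is placed by hashing the USI value, while the base row is placed by hashing the Primary Index value.

```python
NUM_AMPS = 4


def amp_of(value):
    """Toy stand-in for hashing a value to an AMP."""
    return value % NUM_AMPS


# Base table (Primary Index = dept_no): AMP -> {row_id: row}.
base = {amp: {} for amp in range(NUM_AMPS)}
# USI subtable on emp_no: AMP -> {emp_no: (base_amp, row_id)}.
usi = {amp: {} for amp in range(NUM_AMPS)}


def insert(emp_no, dept_no, name):
    base_amp = amp_of(dept_no)                       # base row follows the PI
    row_id = (dept_no, len(base[base_amp]) + 1)      # illustrative Row-ID
    base[base_amp][row_id] = {"emp_no": emp_no, "dept_no": dept_no, "name": name}
    usi[amp_of(emp_no)][emp_no] = (base_amp, row_id)  # subtable row follows the USI


def select_by_usi(emp_no):
    """Return (row, list of AMPs touched)."""
    touched = [amp_of(emp_no)]                 # first AMP: read the subtable row
    entry = usi[touched[0]].get(emp_no)
    if entry is None:
        return None, touched
    base_amp, row_id = entry
    touched.append(base_amp)                   # second AMP: read the base row
    return base[base_amp][row_id], touched


insert(1004, 10, "Kyle")
row, amps_touched = select_by_usi(1004)
```

No matter how many AMPs the system has, `amps_touched` never grows beyond two entries.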
Non Unique Secondary Index

• Syntax

• A Non Unique Secondary Index (NUSI) subtable contains two
columns
1. First_Name (the NUSI column)
2. Row-ID of the real Primary Index of the base table
• The NUSI subtable rows get their own Row-ID, but they are not
hashed to different AMPs and stay AMP-local.
NUSI are AMP-Local

• Subtable rows match the base rows on the same
AMP, hence a NUSI is AMP-local.
• A NUSI query always searches all AMPs, but the intent is not
to do a full table scan. If there are 50 AMPs, then a
minimum of 50 binary searches are done.
How Parsing Engine uses the NUSI Subtable

• Parsing Engine plan - it is an all-AMP operation

– First_Name is a NUSI.
– The PE orders each AMP to check whether it has 'Kyle' in its NUSI subtable.
– Each AMP simultaneously performs a binary search on its NUSI subtable.
– If an AMP has 'Kyle', the PE orders it to retrieve the base row.
– If there are 50 AMPs, then all 50 AMPs perform a binary search simultaneously, and
those that find 'Kyle' perform another binary search on the base table.

• A NUSI is an all-AMP operation
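
The all-AMP nature of a NUSI retrieve can be sketched in Python as follows. The three toy AMPs, their rows and the Row-ID values are invented for illustration; a dictionary lookup stands in for the binary search each AMP performs on its local subtable.

```python
# Three toy AMPs. Each holds base rows keyed by Row-ID and an AMP-local NUSI
# subtable mapping first_name -> Row-IDs of base rows on that same AMP.
amps = [
    {"base": {(1, 1): "Kyle Stover", (4, 1): "Mary Hill"},
     "nusi": {"Kyle": [(1, 1)], "Mary": [(4, 1)]}},
    {"base": {(2, 1): "John Marx"},
     "nusi": {"John": [(2, 1)]}},
    {"base": {(3, 1): "Kyle Jones"},
     "nusi": {"Kyle": [(3, 1)]}},
]


def select_by_nusi(first_name):
    """Return (matching rows, number of subtable searches performed)."""
    results, subtable_searches = [], 0
    for amp in amps:                      # every AMP is asked: all-AMP operation
        subtable_searches += 1
        for row_id in amp["nusi"].get(first_name, []):
            # Only AMPs with a hit go on to read their own base rows.
            results.append(amp["base"][row_id])
    return results, subtable_searches


kyles, searches = select_by_nusi("Kyle")
```

The number of subtable searches always equals the number of AMPs, even when only some of them (or none) hold a matching row.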


Primary Index vs. Secondary Index

Index Feature                  UPI    NUPI   USI        NUSI
Required?                      Yes*   Yes*   No         No
Single-AMP Retrieve            Yes    Yes    No         No
Number of Binary Searches      1      1      2          Many
Number per Table               1      1      0-32       0-32
Max Columns                    64     64     64         64
Unique                         Y      N      Y          N
Affects Row Distribution       Y      Y      N          N
Created/Dropped Dynamically    N      N      Y          Y
Improves Access                Y      Y      Y          Y
Can be multiple data types     Y      Y      Y          Y
Separate physical structure    N      N      Sub-table  Sub-table
Extra Processing Overhead      N      N      Y          Y
May be ordered by value        N      N      N          Y
May be partitioned             Y      Y      N          N
* Teradata has a NoPI table now in V13.10
Full-Table Scans

• Teradata Database always uses a full-table scan to access
the data of a table if a query:
– Accesses a NoPI table that does not have an index
defined on it
– Does not specify a WHERE clause
– Does not use the index columns
– Uses an index in a non-equality test
– Specifies a range of values for the primary index
• A full-table scan is always an all-AMP operation, and should
be avoided when possible
Summary

• An index is the physical mechanism used to store and access the data
• A PK is a relational modeling convention which allows each row to be
uniquely identified
• The Primary Index is defined when the table is created.
• A table can have only one Primary Index, but you can combine up to 64
columns to form one Multi-Column Primary Index.
• Hash the Primary Index value for a row with the Hash Formula.
• The output of the Hash Formula is a 32-bit Row Hash.
• Row-ID equals the Row Hash of the Primary Index column plus the
Uniqueness Value.
• A Secondary Index can be created and dropped dynamically
• A Non Unique Secondary Index (NUSI) subtable contains two columns
– First_Name (the NUSI column)
– Row-ID of the real Primary Index of the base table
• NUSI are AMP-local
Test Your Understanding

1. How are both tables sorted?
2. What was the Row-ID when Minal was hashed?
3. Looking in the subtable, what is the Row-ID of the base row for employee
1006?
4. When 1006 was placed in the subtable, which bucket in the Hash
Map was chosen?
5. How many times is the Hash Map consulted on a query using a USI in
the WHERE clause?
Module 4: Space

Objectives:
After completing this chapter, you will be able to answer the
following questions:
• What are a Teradata database and a Teradata user?
• How is space allocated to Teradata objects?
• What is the hierarchy of objects in a Teradata system?
Space

There are three types of space in Teradata:

Perm Space: PERM space houses permanent tables,
Secondary Indexes, Join Indexes and Permanent Journals.
Temp Space: TEMP space is used to store temporary tables.
Spool Space: Spool space is used by each AMP in order to
build the answer set for the user.
A Teradata Database (Example)

A Teradata database is a logical repository for:
• Tables (require perm space)
• Views (use no perm space)
• Macros (use no perm space)
When a system arrives, there is only one user, called DBC.
USER DBC
• System user DBC contains all Teradata Database software components and all system
tables.

Syntax:
CREATE DATABASE new_db FROM existing_db
AS
PERMANENT = 20000000
,SPOOL = 50000000
,TEMPORARY = 20000000;

'new_db' is owned by 'existing_db'.

A database is empty until objects are created within it.
A database with no PERM space can have views and macros but not tables.
A Teradata User

A Teradata user is a database with an assigned password.

A Teradata user may also own tables, views, macros and triggers, but users with no
perm space may not own tables.
A user may log on to Teradata and access objects within:
• Itself
• Other databases for which it has access rights

Syntax:
CREATE USER new_user FROM existing_user
AS
PERMANENT = 10000000
,PASSWORD = Acdmy
,SPOOL = 50000000
,TEMPORARY = 20000000;

'new_user' is owned by 'existing_user'.

A user is empty until objects are created within it.
The Teradata Hierarchy

• Initially DBC owns 10 TB of PERM space. DBC then creates
Spool_Reserve (4 TB), USER Retail (2 TB) and USER
Financial (2 TB), after which DBC has only 2 TB of PERM
space left.
• USER Retail and USER Financial can in turn create the databases
and users they need, as pictured.
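
The bookkeeping in this example can be followed with a small Python sketch. The `Space` class is hypothetical; it only models the idea that PERM space given to a new database or user is taken from its owner.

```python
class Space:
    """Toy model of a database or user that owns PERM space (in TB)."""

    def __init__(self, name, perm_tb):
        self.name = name
        self.perm_tb = perm_tb
        self.children = []

    def create(self, name, perm_tb):
        """Create a child database/user, carving its PERM out of the owner's."""
        if perm_tb > self.perm_tb:
            raise ValueError(f"{self.name} has only {self.perm_tb} TB of PERM left")
        self.perm_tb -= perm_tb
        child = Space(name, perm_tb)
        self.children.append(child)
        return child


dbc = Space("DBC", 10)
dbc.create("Spool_Reserve", 4)
retail = dbc.create("Retail", 2)
financial = dbc.create("Financial", 2)
```

After the three creations `dbc.perm_tb` is 2, matching the slide, and `retail` can hand out at most its own 2 TB to anything it creates.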
Difference between PERM and Spool space

Assume User 'A' has 2 TB of permanent space and 10
GB of spool space, and has 1000 users under it.
• User 'A' can create and load up to 2 TB of table
data in its PERM space.
• Each of the 1000 users under 'A' (say A1, A2, A3, ...) can
run queries that use up to 10 GB of spool space,
all at the same time.
Test Your Understanding

• What is the difference between a
Teradata Database and a Teradata
User?
Module 5: Data Protection

Objectives:
After completing this module you will be able to answer the
questions below:
• How do locks prevent loss of data integrity?
• What are the types of locking provided by Teradata?
• What are FALLBACK tables?
Locks

There are four types of locks:

Exclusive Lock: This is placed only on a database or table, never on rows, when the
object is going through a structural change. It prevents any other type of
concurrent access to the database or table.
Write Lock: This is placed on an INSERT, DELETE, or UPDATE request. It
blocks other Read, Write and Exclusive locks.
Read Lock: This is placed in response to a SELECT request. It restricts
access by users who require Exclusive or Write locks. If you have a multi-user
environment with updates occurring and you need to keep data
consistent, you want a Read lock.
Access Lock (Dirty Read or Stale Read): An Access lock permits the
user to READ an object that may already be locked for READ or
WRITE. An Access lock does not restrict access by another user except
when an Exclusive lock is required. It is placed in response to a user-defined
LOCKING FOR ACCESS phrase. A user requesting an Access lock should not
be concerned with data consistency.
Locks

• Locks are applied at 3 levels
1. Database: Applies to all
tables/views in the database
2. Table/View: Applies to all rows
in a table
3. Row Hash: Applies to all rows
with the same Row Hash
Rule:
Lock requests are queued
behind all outstanding incompatible
lock requests for the same object.
Row Hash Lock Syntax:
LOCKING ROW FOR ACCESS
SELECT * FROM TABLE_A;
Compatibility between Read Locks

Read Locks are compatible with each other, but Write Locks are not.

Assume four SQL statements are submitted against Employee_Table: the first two are SELECTs, the third is
an INSERT and the fourth is a SELECT.

Compatibility:
• A Read lock is compatible with other Read locks and with Access locks
• A Write lock is compatible only with Access locks
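
The compatibility rules and the queuing rule can be sketched together in Python. The `COMPATIBLE` table is derived from the lock descriptions in this module; the `schedule` function and the request names (S1, S2, I1, S3) are hypothetical and model a single object whose locks are never released.

```python
# For each requested lock: the locks it can coexist with.
COMPATIBLE = {
    "ACCESS":    {"ACCESS", "READ", "WRITE"},
    "READ":      {"ACCESS", "READ"},
    "WRITE":     {"ACCESS"},
    "EXCLUSIVE": set(),
}


def schedule(requests):
    """Split requests (in arrival order) into granted and waiting.

    A request is queued behind every outstanding incompatible request
    for the same object, whether that request is granted or still waiting.
    """
    outstanding, granted, waiting = [], [], []
    for name, lock in requests:
        if all(other in COMPATIBLE[lock] for other in outstanding):
            granted.append(name)
        else:
            waiting.append(name)
        outstanding.append(lock)
    return granted, waiting


# Two SELECTs, then an INSERT, then another SELECT on the same table.
granted, waiting = schedule([
    ("S1", "READ"), ("S2", "READ"), ("I1", "WRITE"), ("S3", "READ"),
])
```

The first two SELECTs share the table, the INSERT waits for them, and the last SELECT waits behind the INSERT rather than jumping the queue.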
Cliques

• A clique is a defined set of nodes with failover capability
• A clique protects against a node failure
• All nodes in a clique must be able to access all Vdisks for all
AMPs in the clique
• If a node fails, its AMPs migrate to the remaining nodes
in the clique
• When a node fails:
– Teradata resets
– On the restart, the AMPs of the failed node (Node 1 in the pictured example) migrate
– The system is degraded but still able to function
– The down node is fixed
– Another reset is done and the AMPs return home
• Each node can support up to 128 vprocs (AMPs and PEs)
Cliques

• An example of a four-node clique

• Node 1 fails and its AMPs are migrated to the other nodes


Fallback

• Fallback protects against an AMP failure.
• Fallback makes a duplicate copy of every row in a table and keeps that copy
on a different AMP.
• If an AMP goes down, the system can still process the query because the
rows on the failed AMP are also held by another AMP.
• Data changed while an AMP was offline is restored automatically.
• It is critical for high-availability applications.
Cost of Fallback:
• The table is twice as big and uses twice the
space.
• Twice as many inserts, updates, and deletes are needed.
Table with Fallback:
CREATE TABLE Emp_Intl, FALLBACK
(Emp_No INTEGER
, Dept_No SMALLINT
, First_Name VARCHAR(12)
, Last_Name CHAR(20)
, Salary DECIMAL(10,2))
UNIQUE PRIMARY INDEX ( Emp_No );

Table with no Fallback:
CREATE TABLE Emp_Intl, NO FALLBACK
(Emp_No INTEGER
, Dept_No SMALLINT
, First_Name VARCHAR(12)
, Last_Name CHAR(20)
, Salary DECIMAL(10,2))
UNIQUE PRIMARY INDEX ( Emp_No );

Note: Default is No Fallback
Fallback Clusters

• A cluster is a group of AMPs that act as a single fallback
unit.
• Fallback rows for the AMPs in a cluster reside in the same cluster.
• Loss of one AMP in a cluster permits continued table access.
• Loss of 2 AMPs in the same cluster causes the RDBMS to halt.
[Picture: 2 clusters with 2 AMPs each]

System performance can be adversely affected when any
AMP has a disproportionate burden.
Fallback Vs. Non-Fallback tables

Fallback tables
• One AMP down
– Data fully available
• Two or more AMPs down
– In different clusters
• Data fully available
– In the same cluster
• System halts.

Non-Fallback tables
• One AMP down
– Data partially available
– Queries avoiding the down AMP succeed
• Two or more AMPs down
– In different clusters
• Data partially available
• Queries avoiding the down AMP succeed
– In the same cluster
• System halts.
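
The availability rules listed above reduce to a short decision function. This Python sketch is illustrative: `table_status` and the `cluster_of` mapping (AMP number to cluster number) are hypothetical names, and the example layout is 2 clusters with 2 AMPs each.

```python
def table_status(fallback, down_amps, cluster_of):
    """Availability of a table given the AMPs that are down.

    cluster_of maps AMP number -> cluster number.
    """
    clusters_hit = [cluster_of[amp] for amp in down_amps]
    if len(clusters_hit) != len(set(clusters_hit)):
        return "system halts"               # two down AMPs share a cluster
    if not down_amps or fallback:
        return "data fully available"       # fallback copies cover the loss
    return "data partially available"       # only queries avoiding the down AMP succeed


# 2 clusters with 2 AMPs each: AMPs 0 and 1 in cluster 0, AMPs 2 and 3 in cluster 1.
cluster_of = {0: 0, 1: 0, 2: 1, 3: 1}

one_down_fallback = table_status(True, [0], cluster_of)
one_down_no_fallback = table_status(False, [0], cluster_of)
same_cluster = table_status(True, [0, 1], cluster_of)
```

The only difference Fallback makes is in the non-halting cases: a Fallback table stays fully available, while a non-Fallback table is only partially available.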
RAID

RAID – Redundant Array of Independent Disks

Two types of disk array protection:
• RAID 1 (Mirroring)
– RAID 1 provides each AMP two disks for storing data and two disks
for mirroring.
– The data disk and the mirror disk are called a mirrored pair.
– RAID 1 costs 50% of the disk space, but it ensures a 99% up time for
customers.
– If a single disk goes down, it is easily replaced and Teradata isn't
even affected.
RAID

RAID 5 (Parity):
• For every 3 blocks of data, there is a parity block on a 4th disk.
• If a disk fails, any missing block may be reconstructed using the
other three disks.
• Array controller reconstruction of a failed disk takes longer than with
RAID 1.
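
Parity reconstruction can be demonstrated with byte-wise XOR, the standard way RAID 5 parity is computed. The block contents and function names in this Python sketch are made up for illustration; a real array controller works on fixed-size disk stripes.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks on three disks, and their parity block on a fourth disk.
data = [b"AMP1", b"AMP2", b"AMP3"]
parity = xor_blocks(data)


def rebuild(surviving_blocks, parity_block):
    """Recover the one missing data block from the survivors and the parity."""
    return xor_blocks(surviving_blocks + [parity_block])


# The disk holding the second block fails; rebuild it from the other three disks.
recovered = rebuild([data[0], data[2]], parity)
```

Every rebuild has to read all surviving disks, which is why reconstruction is slower than simply reading a RAID 1 mirror.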

Summary:
• RAID 1: Good Performance with disk failures. Higher cost in
terms of disk space
• RAID 5: Reduced Performance with disk failures. Lower cost in
terms of disk space
Test Your Understanding

1. List the types of locks in Teradata.
2. Which locks are compatible with each other?
3. What is a Dirty Read lock?
4. How is the system protected against a node failure?
5. What is RAID?
6. Is it mandatory to have FALLBACK for all tables?
Summary

• An Exclusive Lock is placed only on a database or table when
the object is going through a structural change.
• A Write Lock is placed on an INSERT, DELETE, or UPDATE
request.
• A Read Lock is placed in response to a SELECT request.
• An Access Lock is also known as a Dirty Read or Stale Read.
• A clique is a defined set of nodes with failover capability.
• Fallback protects against an AMP failure.
• RAID 1 shows good performance with disk failures.
Source
• Tera Tom e-book
• Teradata Database Design (PDF)
• www.teradataforum.com
• www.teradata.com

Disclaimer: Parts of the content of this course are based on the materials available from the
websites and books listed above. The materials that can be accessed from the linked sites
are not maintained by Cognizant Academy and we are not responsible for the contents
thereof. All trademarks, service marks, and trade names in this course are the marks of the
respective owner(s).

Change Log

Version Number   Changes made
V1.0             Initial Version
V1.1             See below

Slide No.   Changed By            Effective Date   Changes Effected
1-86        Bhuvanya.M (221634)   05/05/2015       Base line content

Introduction to Teradata

You have successfully completed the
session on Teradata Architecture
