
TechTalk 202404020

Uploaded by

imusmanahmed
Copyright
© All Rights Reserved

Teradata

• Teradata Architecture
• Secondary Index
• Partition Primary Index

Classification: Public - عام


Major Components of Teradata

[Diagram: an SQL request enters and an answer set response leaves through the Parsing Engines; the Message Passing Layer connects the Parsing Engines to the AMPs; each AMP has its own Vdisk.]

Parsing Engines (PE)
• Manage sessions for users
• Parse, optimize, and send your request to the AMPs as execution steps
• Return the answer set response back to the client

Message Passing Layer (MPL)
• Allows PEs and AMPs to communicate with each other

Access Module Processors (AMP)
• Each AMP owns and manages its storage
• Each AMP performs the steps sent by the PEs

Virtual Disks (Vdisk)
• Space that is owned by the AMP and used to hold user data (rows within tables)
• Maps to physical space in a disk array

AMPs store and retrieve rows to and from disk.

Teradata Storage Architecture

Records From Client (in random sequence)
2 32 67 12 90 6 54 75 18 25 80 41

The Parsing Engine dispatches a request to insert a row.
The Message Passing Layer ensures that a row gets to the appropriate AMP (Access Module Processor).
The AMP stores the row on its associated (logical) disk.
An AMP manages a logical or virtual disk, which is mapped to multiple physical disks in a disk array.

[Diagram: Parsing Engine(s) above the Message Passing Layer, which sits above the AMPs; the records end up on the AMPs' disks as follows.]

AMP 0   AMP 1   AMP 2   …   AMP x
2       12      80          25
18      54      90          67
41      75      32          6
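The slide does not say how the "appropriate AMP" is chosen. The following is a minimal sketch that assumes each row key is hashed and the hash selects one AMP. The function names `amp_for` and `insert_rows` are hypothetical, and `zlib.crc32` is only a stand-in for Teradata's own hashing, so the placement it prints will not match the diagram.

```python
import zlib

# Row keys from the slide, in the order the client sent them.
RECORDS = [2, 32, 67, 12, 90, 6, 54, 75, 18, 25, 80, 41]


def amp_for(key: int, amp_count: int) -> int:
    """Pick one AMP for a row key (stand-in hash, not Teradata's real algorithm)."""
    return zlib.crc32(str(key).encode("ascii")) % amp_count


def insert_rows(records, amp_count):
    """Route every row to exactly one AMP; each AMP's virtual disk is modelled as a list."""
    vdisks = {amp: [] for amp in range(amp_count)}
    for key in records:
        vdisks[amp_for(key, amp_count)].append(key)
    return vdisks


vdisks = insert_rows(RECORDS, amp_count=4)
for amp, rows in vdisks.items():
    print(f"AMP {amp}: {rows}")
```

Because the target AMP depends only on the key, the same key always lands on the same AMP, which is what later allows a lookup by that key to activate a single AMP.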
Teradata Retrieval Architecture

Rows retrieved from table
2 32 67 12 90 6 54 75 18 25 80 41

The Parsing Engine dispatches a request to retrieve one or more rows.
The Message Passing Layer ensures that the appropriate AMP(s) are activated.
The AMP(s) locate and retrieve the desired row(s) in parallel.
The Message Passing Layer returns the retrieved rows to the PE.
The PE returns the row(s) to the requesting client application.

[Diagram: same layout as the storage slide, with the Parsing Engine(s) above the Message Passing Layer and AMP 0, AMP 1, AMP 2 … AMP x, each reading rows from its own disk.]
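The retrieval flow can be sketched in the same spirit. This is an illustrative model only: the names `amp_scan` and `retrieve` are hypothetical, threads stand in for AMPs, and `zlib.crc32` is an assumed placeholder for the real row-placement hash.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

ROWS = [2, 32, 67, 12, 90, 6, 54, 75, 18, 25, 80, 41]  # row keys from the slide
AMP_COUNT = 4


def amp_for(key: int) -> int:
    """Stand-in placement hash (an assumption, not Teradata's algorithm)."""
    return zlib.crc32(str(key).encode("ascii")) % AMP_COUNT


# Each AMP owns a private list of rows, standing in for its virtual disk.
VDISKS = {amp: [] for amp in range(AMP_COUNT)}
for key in ROWS:
    VDISKS[amp_for(key)].append(key)


def amp_scan(amp: int, predicate):
    """One AMP searches only its own virtual disk."""
    return [row for row in VDISKS[amp] if predicate(row)]


def retrieve(predicate, amps=None):
    """Activate the chosen AMPs (all of them by default) in parallel and merge the results."""
    targets = list(range(AMP_COUNT)) if amps is None else list(amps)
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        partials = list(pool.map(lambda amp: amp_scan(amp, predicate), targets))
    return sorted(row for part in partials for row in part)


# All-AMP retrieval: every AMP scans its own rows at the same time.
big_rows = retrieve(lambda row: row > 50)
print(big_rows)  # [54, 67, 75, 80, 90]

# Single-AMP retrieval: the key tells us which AMP to activate.
one_row = retrieve(lambda row: row == 54, amps=[amp_for(54)])
print(one_row)  # [54]
```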
Multiple Tables on Multiple AMPs

Rows from each table will usually be stored on each AMP.
Each AMP may have rows from all tables.
Ideally, each AMP will hold roughly the same amount of data.

[Diagram: the EMPLOYEE, DEPARTMENT, and JOB tables sit above the Parsing Engine and Message Passing Layer; each of AMP 0, AMP 1, AMP 2, and AMP x holds EMPLOYEE rows, DEPARTMENT rows, and JOB rows.]


Linear Growth and Expandability

• Teradata is a linearly expandable RDBMS.
• Components may be added as requirements grow.
• Linear scalability allows for increased workload without decreased throughput.
• Performance impact of adding components is shown below.

[Diagram: Parsing Engines (SESSIONS), AMPs (PARALLEL PROCESSING), and Disks (DATA) are each shown as stacks to which more units can be added.]

QUERIES   AMPs     DATA     Performance
Same      Same     Same     Same
Double    Double   Same     Same
Same      Double   Double   Same
Same      Double   Same     Double
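The four rows of the table are consistent with a simple idealized model in which performance scales with the number of AMPs and inversely with the number of queries and the amount of data. The sketch below encodes that model. The formula and the name `relative_performance` are assumptions made for illustration, not something the slide states.

```python
def relative_performance(queries: float, amps: float, data: float) -> float:
    """Relative performance versus the baseline under an idealized linear model.

    Each argument is a multiplier against the baseline system
    (1 = same, 2 = double). Assumption: work grows with queries * data
    and is shared evenly across the AMPs.
    """
    return amps / (queries * data)


# (label, queries, AMPs, data) for the four rows of the table
SCENARIOS = [
    ("baseline", 1, 1, 1),
    ("double queries, double AMPs", 2, 2, 1),
    ("double AMPs, double data", 1, 2, 2),
    ("double AMPs only", 1, 2, 1),
]

for label, queries, amps, data in SCENARIOS:
    print(f"{label}: {relative_performance(queries, amps, data):.1f}x")
```

Only the last scenario prints 2.0x. In the other two upgrade scenarios the extra AMPs are absorbed by the extra queries or data, which is the "Same" entry in the Performance column.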


The Parsing Engine

[Diagram: an SQL request enters the Parsing Engine and an answer set response leaves it; the Parsing Engine contains the Parser, Optimizer, and Dispatcher and sits above the Message Passing Layer and the AMPs.]

The Parsing Engine is responsible for:
• Managing individual sessions (up to 120)
• Parsing and optimizing your SQL requests
• Dispatching the optimized plan to the AMPs
• Input conversion (EBCDIC / ASCII), if necessary
• Sending the answer set response back to the requesting client


Message Passing Layer

[Diagram: SQL request and answer set response at the Parsing Engine; the Message Passing Layer (PDE and BYNET) sits between the Parsing Engine and the AMPs.]

The Message Passing Layer or Communications Layer is responsible for:
• Carrying messages between the AMPs and PEs
• Point-to-Point, Multi-Cast, and Broadcast communications
• Merging answer sets back to the PE
• Making Teradata parallelism possible

The Message Passing Layer or Communications Layer is a combination of:
• Parallel Database Extensions (PDE) Software
• BYNET Software
• BYNET Hardware for MPP systems
The Access Module Processor (AMP)

[Diagram: SQL request and answer set response at the Parsing Engine, which sits above the Message Passing Layer and the AMPs.]

The AMPs are responsible for:
• Accessing storage using Teradata's File System Software
• Lock management
• Sorting rows
• Aggregating columns
• Join processing
• Output conversion and formatting
• Creating the answer set for the client
• Disk space management
• Accounting
• Special utility protocols
• Recovery processing

Teradata File System Software:
• Translates DatabaseID/TableID/RowID into location on storage
• Controls a portion of physical storage
• Allocates storage space by “Cylinders”

AMPs store and retrieve rows to and from disk.
Teradata Parallelism

Parallelism is built into Teradata from the ground up!

[Diagram: three PEs handle Sessions A and B, Sessions C and D, and Sessions E and F; below the Message Passing Layer, AMP 0 runs Tasks 1 to 3, AMP 1 runs Tasks 4 to 6, AMP 2 runs Tasks 7 to 9, and AMP 3 runs Tasks 10 to 12.]
Notes:
• Each PE can handle up to 120 sessions in parallel.
• Each Session can handle multiple REQUESTS.
• The Message Passing Layer can handle all message activity in parallel.
• Each AMP can perform up to 80 tasks in parallel (can be configured for more).
• All AMPs can work together in parallel to service any request.
• Each AMP can work on several requests in parallel.
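
As a quick worked example of the limits in these notes, the sketch below computes the upper bounds for the configuration drawn on this slide (three PEs, four AMPs). The function name `capacity` is hypothetical, and the figures are simply the per-PE and per-AMP limits quoted above multiplied out.

```python
SESSIONS_PER_PE = 120  # per-PE session limit quoted in the notes
TASKS_PER_AMP = 80     # per-AMP task default quoted in the notes (configurable)


def capacity(pe_count: int, amp_count: int, tasks_per_amp: int = TASKS_PER_AMP):
    """Return (max parallel sessions, max parallel AMP tasks) for a configuration."""
    return pe_count * SESSIONS_PER_PE, amp_count * tasks_per_amp


sessions, tasks = capacity(pe_count=3, amp_count=4)
print(f"3 PEs, 4 AMPs: up to {sessions} sessions and {tasks} AMP tasks in parallel")
```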
Teradata Objects
Examples of objects within a Teradata database or user include:
Tables – rows and columns of data
Views – predefined subsets of existing tables
Macros – predefined, stored SQL statements
Triggers – SQL statements associated with a table
Stored Procedures – program stored within Teradata
User-Defined Function – function (C program) to provide additional SQL functionality
Join and Hash Indexes – separate index structures stored as objects within a database
Permanent Journals – table used to store before and/or after images for recovery

A DATABASE or USER can have a mix of various objects.
These objects are created, maintained, and deleted using SQL.
Object definitions are stored in the DD/D.

[Diagram: a database or user containing TABLE 1 *, TABLE 2 *, TABLE 3 *, VIEW 1, VIEW 2, VIEW 3, MACRO 1, Stored Procedure 1 *, TRIGGER 1, UDF 1 *, Join/Hash Index 1 *, and Permanent Journal *. A callout beside Join/Hash Index 1 and Permanent Journal reads: "These aren't directly accessed by users."]
* - require Permanent Space


The Data Dictionary/Directory (DD/D)
The DD/D ...
– is an integrated set of system tables
– contains definitions of and information about all objects in
the system
– is entirely maintained by the Teradata Database
– is “data about the data” or “metadata”
– is distributed across all AMPs like all tables
– may be queried by administrators or support staff
– is normally accessed via Teradata supplied views

Examples of DD/D views:


DBC.TablesV – information about objects (e.g., tables) in a database/user
DBC.UsersV – information about all users
DBC.AllRightsV – information about access rights
DBC.AllSpaceV – information about space utilization



EXPLAIN Facility
The EXPLAIN modifier in front of any SQL statement generates an English translation of the
Parser’s plan.
The request is fully parsed and optimized, but not actually executed.
EXPLAIN returns:
• Text showing how a statement will be processed (a plan)
• An estimate of how many rows will be involved
• A relative cost of the request (in units of time)

This information is useful for:


• predicting row counts
• predicting performance
• testing queries before production
• analyzing various approaches to a problem

EXPLAIN SELECT * FROM Employee WHERE DeptNumber = 1018;


:
3) We do an all-AMPs RETRIEVE step from HR.Employee by way of index # 4 "HR.Employee.DeptNumber =
1018" with no residual conditions into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1
is estimated with high confidence to be 9 rows (657 bytes). The estimated time for this step is 0.02 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is
0.02 seconds.
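
Since EXPLAIN output is plain English text, the row and time estimates can be pulled out programmatically when comparing approaches. The following is a minimal sketch that parses the excerpt above with regular expressions. The patterns are assumptions fitted to this one excerpt and may not match every phrasing EXPLAIN can produce.

```python
import re

# The EXPLAIN excerpt from this slide (steps 3 and 4 plus the summary).
EXPLAIN_TEXT = """\
3) We do an all-AMPs RETRIEVE step from HR.Employee by way of index # 4
"HR.Employee.DeptNumber = 1018" with no residual conditions into Spool 1
(group_amps), which is built locally on the AMPs. The size of Spool 1 is
estimated with high confidence to be 9 rows (657 bytes). The estimated time
for this step is 0.02 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved in
processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 0.02 seconds.
"""

# Undo the line wrapping so that phrases split across lines still match.
flat = " ".join(EXPLAIN_TEXT.split())

size = re.search(
    r"estimated with (.+?) confidence to be ([\d,]+) rows? \(([\d,]+) bytes\)", flat
)
total = re.search(r"total estimated time is ([\d.]+) seconds", flat)

confidence = size.group(1)
estimated_rows = int(size.group(2).replace(",", ""))
estimated_bytes = int(size.group(3).replace(",", ""))
total_seconds = float(total.group(1))

print(confidence, estimated_rows, estimated_bytes, total_seconds)
# high 9 657 0.02
```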
Create Secondary Index - Example
Thank You
