
Admtttt

Big data is categorized into structured, unstructured, and semi-structured types, each requiring different storage and processing methods. A Data Mart is a specialized subset of a data warehouse focused on specific business needs, with types including dependent, independent, and hybrid. The star schema is a common data modeling approach that enhances query performance and understanding by organizing data into a central fact table and surrounding dimension tables.

Uploaded by

vinayak
Copyright
© All Rights Reserved

1) Explain types of big data.

Big data can be categorized into three main types based on its structure: structured,
unstructured, and semi-structured. Each type of data is stored, processed, and analyzed
differently depending on its characteristics. Here's a breakdown:
1. Structured Data: This type of data is organized in a predefined format, typically stored in
rows and columns (as in databases or spreadsheets).
2. Unstructured Data: Data that does not have a specific structure or organized format (for
example text documents, images, audio, and video). It is often more complex to process and
analyze because it does not fit neatly into traditional database tables.
3. Semi-Structured Data: Data that does not conform to a rigid structure but still contains
some organizational properties, such as tags or keys (for example JSON or XML), that make
it easier to process than unstructured data.
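As a rough illustration of the three types, the Python sketch below holds one small sample of each kind. The sample values and field names are made up for illustration only.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields (hypothetical sample).
structured_csv = "id,name,amount\n1,Asha,250\n2,Ravi,400\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: self-describing keys, but fields may vary from record to record.
semi_structured = json.loads('{"id": 3, "name": "Meera", "tags": ["new", "online"]}')

# Unstructured: free text with no predefined fields; needs parsing or text mining.
unstructured = "Customer called to say the delivery arrived late but the product was fine."

print(rows[0]["name"])          # access by column name
print(semi_structured["tags"])  # access by key, nested values allowed
print(len(unstructured.split()))  # only crude measures such as word count are direct
```

Structured rows can be queried by column immediately, the semi-structured record can be navigated by key, and the unstructured text needs further processing before it yields fields.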

3) What is a Data Mart?


A Data Mart is a subset of a data warehouse that is focused on a specific business line,
department, or function. It contains a smaller, more specialized collection of data tailored
to the needs of a particular group of users, making it easier to access and analyze relevant
information without dealing with the vast amount of data stored in the enterprise-wide
data warehouse.
Types of Data Marts:
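1. Dependent Data Mart: Built from data drawn from an existing enterprise data warehouse.
2. Independent Data Mart: Built directly from operational or external sources, without a
central data warehouse.
3. Hybrid Data Mart: Combines data from a data warehouse with data from other sources.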

4) Explain any four advantages of Data Warehouse.


A data warehouse gives complete control over the four main areas of data management
systems:
1. Clean data
2. Query processing: multiple options
3. Indexes: multiple types
4. Security: data and access

5) Explain the Pivot OLAP operation with an example.


1. Since OLAP servers are based on a multidimensional view of data, OLAP operations are
discussed in terms of multidimensional data.
2. Pivot (also called rotate) is a visualization operation that rotates the data axes in view
to provide an alternative data presentation.
3. Example: a pivot operation in which the item and location axes in a 2-D slice are rotated.
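The rotation of the item and location axes can be sketched in a few lines of Python. This is a minimal illustration, not an OLAP server implementation; the city names, item names, and sales figures are hypothetical sample data.

```python
# 2-D slice of a sales cube: rows are locations, columns are items.
# All names and numbers are hypothetical sample data.
locations = ["Mumbai", "Pune"]
items = ["phone", "laptop", "tablet"]
sales = [
    [605, 825, 14],  # Mumbai
    [400, 512, 31],  # Pune
]

def pivot(row_labels, col_labels, cells):
    """Rotate the axes: former columns become rows and vice versa."""
    rotated = [list(col) for col in zip(*cells)]
    return col_labels, row_labels, rotated

new_rows, new_cols, rotated = pivot(locations, items, sales)

# After the pivot, each row is an item and each column is a location.
for label, values in zip(new_rows, rotated):
    print(label, dict(zip(new_cols, values)))
```

The data values do not change; only the presentation does, which is why pivot is described as a visualization operation.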
Q2 1) Write a short note on the star schema with an example.
The star schema is the most common modeling paradigm, in which the data warehouse contains:
1. A large central table (fact table) containing the bulk of the data, with no redundancy.
2. A set of smaller attendant tables (dimension tables), one for each dimension.
3. The schema graph resembles a starburst, with the dimension tables displayed in a radial
pattern around the central fact table.
4. Keys in star schema:
1) Primary key 2) Foreign key 3) Surrogate key
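A small example of a star schema, sketched with Python's built-in sqlite3 module. The table names (fact_sales, dim_item, dim_location), columns, and rows are assumptions chosen for illustration; a real warehouse would have more dimensions, such as time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one per dimension, each with its own primary key.
CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT);
CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY, city TEXT);

-- Central fact table: foreign keys to every dimension plus the measures.
CREATE TABLE fact_sales (
    item_key INTEGER REFERENCES dim_item(item_key),
    location_key INTEGER REFERENCES dim_location(location_key),
    units_sold INTEGER
);

INSERT INTO dim_item VALUES (1, 'phone'), (2, 'laptop');
INSERT INTO dim_location VALUES (1, 'Mumbai'), (2, 'Pune');
INSERT INTO fact_sales VALUES (1, 1, 10), (1, 2, 5), (2, 1, 3);
""")

# A typical star query: join the fact table to one dimension and aggregate.
query = """
SELECT l.city, SUM(f.units_sold)
FROM fact_sales f
JOIN dim_location l ON l.location_key = f.location_key
GROUP BY l.city
ORDER BY l.city
"""
totals = conn.execute(query).fetchall()
print(totals)
```

Every dimension is reached from the fact table through a single join, which is what gives the schema its star shape.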

Advantages of Star schema:


1) Query performance: A limited number of tables and clear join paths mean queries run faster
than against an OLTP schema.
2) Load performance and administration: The simple structure (dimension tables and the fact
table are separate) reduces the load effort.
3) Built-in referential integrity: The primary key (PK) of each dimension table is a foreign
key (FK) in the fact table.
4) Easily understood: Simple to understand and navigate, as dimensions are joined only
through the fact table.
Q3 1) List and explain the basic tasks involved in Data Transformation.
This section covers the specific types of transformation tasks most commonly performed on
the extracted data before it is moved into the data warehouse.
Format revision
These revisions include changes to the data types and lengths of individual data fields.
Decoding of fields
When the data comes from multiple source systems, the same data items may have been
described by different field values. The most common example is the coding for gender, with
one system using 0 and 1 for male and female, another using M and F, and a third using
male and female.
Data with cryptic codes must also be decoded before being moved into the data warehouse.
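The gender example can be sketched as a lookup table per source system. The system names (system_a, system_b, system_c), the target codes M and F, and the fallback U for unknown values are assumptions made for this illustration.

```python
# Hypothetical per-source code tables mapping each system's values to one standard.
GENDER_CODES = {
    "system_a": {"0": "M", "1": "F"},
    "system_b": {"m": "M", "f": "F"},
    "system_c": {"male": "M", "female": "F"},
}

def decode_gender(source, raw_value):
    """Return the standard code, or 'U' (unknown) if the value is not recognized."""
    key = str(raw_value).strip().lower()
    return GENDER_CODES[source].get(key, "U")

print(decode_gender("system_a", 0))
print(decode_gender("system_b", "F"))
print(decode_gender("system_c", "Female"))
```

Keeping the mappings in data rather than in code makes it easy to add another source system later.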
Splitting of fields
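Earlier legacy systems stored names and addresses in large text fields. For the data
warehouse, such a field is split into its individual components (for example first name and
last name, or city and postal code) so that each can be stored and queried separately.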
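A minimal sketch of splitting a name field, assuming names are space-separated in the order first, middle, last. Real legacy data usually needs more rules (titles, suffixes, different name orders), so treat this only as an illustration.

```python
def split_name(full_name):
    """Split 'First [Middle ...] Last' into parts; assumes space-separated names."""
    parts = full_name.split()
    if not parts:
        return {"first": "", "middle": "", "last": ""}
    first = parts[0]
    last = parts[-1] if len(parts) > 1 else ""
    middle = " ".join(parts[1:-1])  # empty when there are fewer than three parts
    return {"first": first, "middle": middle, "last": last}

print(split_name("Anil K Sharma"))
```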
Merging of information
This type of data transformation is not the opposite of splitting fields, nor does it mean
merging a number of fields to form a single field; instead, it means bringing together the
relevant information from different data sources.
Character set conversion
This type of data transformation is applied to textual data to convert its character set to
an agreed standard character set. Some of the legacy systems on the mainframes may have the
source data in EBCDIC characters, while in other source systems the data may be stored using
the ASCII character set. So you need to convert the data from one character set to the other.
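A short sketch of an EBCDIC to ASCII conversion using Python's built-in codecs. It assumes the mainframe data uses EBCDIC code page 037 (Python codec name cp037); an actual source system may use a different code page, and the sample text is made up.

```python
# Simulate a field as it would arrive from the mainframe (assumed code page 037).
ebcdic_bytes = "SALES 2024".encode("cp037")

# Step 1: decode the EBCDIC bytes into text.
text = ebcdic_bytes.decode("cp037")

# Step 2: encode the text in the agreed standard character set (ASCII here).
ascii_bytes = text.encode("ascii")

print(ebcdic_bytes)
print(ascii_bytes)
```

The same characters are represented by different byte values in the two character sets, which is why the raw bytes cannot simply be copied across.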
Conversion of units
Many companies have global branches, so the sales amount may be represented in different
currencies in different source systems. Before moving the data into the data warehouse, you
need to convert the figures into a common unit of measurement.
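The currency case can be sketched as follows. The choice of USD as the common unit and the exchange rates in RATES_TO_USD are placeholder assumptions, not real market values.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical fixed rates to the common currency; real loads would use dated rates.
RATES_TO_USD = {
    "USD": Decimal("1"),
    "EUR": Decimal("1.10"),
    "INR": Decimal("0.012"),
}

def to_usd(amount, currency):
    """Convert an amount in the given currency to USD, rounded to cents."""
    converted = Decimal(str(amount)) * RATES_TO_USD[currency]
    return converted.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(to_usd(100, "EUR"))
print(to_usd(5000, "INR"))
```

Decimal is used instead of float so that monetary amounts are not affected by binary rounding errors.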
Date and time conversion
The date and time values also need to be represented in a standard format.
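A minimal sketch of date standardization, assuming ISO 8601 (YYYY-MM-DD) as the target format and a small list of source formats. The list KNOWN_FORMATS is an assumption; note that a value such as 03/04/2023 is read day-first here, which must be confirmed for each source system.

```python
from datetime import datetime

# Source formats assumed for this illustration, tried in order.
KNOWN_FORMATS = ["%d/%m/%Y", "%Y-%m-%d", "%Y%m%d"]

def to_iso_date(raw):
    """Parse a date string in any known format and return it as YYYY-MM-DD."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next format
    raise ValueError(f"unrecognized date format: {raw!r}")

print(to_iso_date("25/12/2023"))
print(to_iso_date("20231225"))
```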
Summarization
This type of transformation is done to derive summarized (aggregate) data from the most
granular data. The summarized data is then loaded into the data warehouse instead of the
most granular level of data.
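Summarization can be sketched as rolling daily sales up to monthly totals per item. The records in daily_sales and the choice of month and item as the summary level are hypothetical.

```python
from collections import defaultdict

# Granular source data: (date, item, units sold). Hypothetical sample rows.
daily_sales = [
    ("2024-01-05", "phone", 3),
    ("2024-01-20", "phone", 2),
    ("2024-02-01", "phone", 4),
    ("2024-01-11", "laptop", 1),
]

def summarize_by_month(records):
    """Aggregate units sold per (month, item) from daily records."""
    totals = defaultdict(int)
    for day, item, units in records:
        month = day[:7]  # 'YYYY-MM' prefix of an ISO date
        totals[(month, item)] += units
    return dict(totals)

monthly = summarize_by_month(daily_sales)
print(monthly)
```

Only the rows in monthly would be loaded, so the warehouse holds three rows here instead of four; the saving grows with the volume of granular data.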
Key restructuring
While extracting data from the data sources, you have to form the primary keys for the fact
tables and the dimension tables. You cannot keep the primary keys of the source data tables
as the primary keys for the fact and dimension tables because the primary keys of the source
data have built-in meaning.
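One common way to restructure keys is to assign system-generated surrogate keys that carry no built-in meaning. The sketch below assumes that approach; the class name SurrogateKeyMap, the source system names, and the sample product code are hypothetical.

```python
import itertools

class SurrogateKeyMap:
    """Assign a meaningless integer key to each (source system, source key) pair."""

    def __init__(self):
        self._next = itertools.count(1)
        self._keys = {}

    def key_for(self, source_system, source_key):
        natural = (source_system, source_key)
        if natural not in self._keys:
            self._keys[natural] = next(self._next)
        return self._keys[natural]

products = SurrogateKeyMap()
# 'W1-PRD-0042' stands for a source key with built-in meaning (e.g. warehouse + product).
print(products.key_for("crm", "W1-PRD-0042"))
print(products.key_for("crm", "W1-PRD-0042"))  # same source key, same surrogate key
print(products.key_for("erp", "W1-PRD-0042"))  # same code in another system, new key
```

If the product later moves to another warehouse and its source code changes, the dimension row can keep its surrogate key, so the fact table is unaffected.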
