Unit 1
MANAGEMENT
ABINAYA.G/CSE-SRMIST
Why Information Storage and Management?
• Information is the knowledge derived from data
• Growth of digital information has resulted in information
explosion
• We live in an on-command, on-demand world
We need information when and where required
• Increasing dependency on fast and reliable access to information
• Businesses seek to store, protect, optimize, and leverage the
information
To gain competitive advantage
To derive new business opportunity
What is Data?
Types of Data
• Data can be classified as:
Structured
Unstructured
• Majority of data being created is unstructured
Unstructured (90%): e-mail attachments, PDFs, X-rays, manuals, instant messages, images, documents, forms, web pages, contracts, rich media, invoices, audio, video
Structured (10%): database
Big Data
It refers to data sets whose sizes are beyond the ability of commonly used
software tools to capture, store, manage, and process within acceptable
time limits.
Storage
• Stores data created by individuals and organizations
Provides access to data for further processing
• Examples of storage devices are:
Media card in a cell phone or digital camera
DVDs, CD-ROMs
Disk drives
Disk arrays
Tapes
Evolution of Storage Architecture
[Figure: information-centric storage architecture; labels: storage network, storage device]
Data Center
Data Center: Online Order Transaction System Example
[Figure: online order transaction system; labels: client, user interface, LAN/WAN network, OS and DBMS, storage]
Key Characteristics of a Data Center
Availability
Manageability
Performance
Capacity
Scalability
Managing Data Center
• Key management activities include
Monitoring
Continuous process of gathering information on various elements
and services running in a data center
Reporting
Details on resource performance, capacity, and utilization
Provisioning
Configuration and allocation of resources to meet the capacity,
availability, performance, and security requirements
• Virtualization and cloud computing have changed the way data
center infrastructure resources are provisioned and managed
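The three management activities above can be illustrated with a minimal sketch. The Volume class, the 80% threshold, and the function names are hypothetical and chosen only for illustration; a real data center would rely on dedicated management software.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """A hypothetical storage resource with a fixed capacity."""
    name: str
    capacity_gb: int
    used_gb: int = 0


def monitor(volumes):
    """Monitoring: gather the current utilization of every volume."""
    return {v.name: v.used_gb / v.capacity_gb for v in volumes}


def report(utilization, threshold=0.8):
    """Reporting: list the volumes whose utilization exceeds the threshold."""
    return sorted(name for name, u in utilization.items() if u > threshold)


def provision(volumes, size_gb):
    """Provisioning: allocate space on the volume with the most free capacity."""
    target = max(volumes, key=lambda v: v.capacity_gb - v.used_gb)
    if target.capacity_gb - target.used_gb < size_gb:
        raise ValueError("no volume has enough free capacity")
    target.used_gb += size_gb
    return target.name


# Assumed sample data: two volumes with different fill levels.
volumes = [Volume("vol-a", 100, 90), Volume("vol-b", 200, 50)]
utilization = monitor(volumes)      # snapshot taken before provisioning
hot = report(utilization)           # volumes above 80% utilization
chosen = provision(volumes, 100)    # request for 100 GB of new capacity
```

Here the report is computed from the monitoring snapshot, and provisioning picks the volume with the most free space; other placement policies are equally possible.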
Virtualization: An Overview
• Virtualization is a technique of abstracting physical resources and
making them appear as logical resources
For example, partitioning of raw disks
• Pools physical resources and provides an aggregated view of
physical resource capabilities
• Virtual resources can be created from pooled physical resources
Improves utilization of physical IT resources
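Pooling can be sketched as follows. The StoragePool class and the disk sizes are hypothetical; the sketch only shows the idea that virtual disks are carved from the aggregated capacity rather than from one physical disk.

```python
class StoragePool:
    """Hypothetical pool that aggregates the capacity of physical disks."""

    def __init__(self, physical_disks_gb):
        # Aggregated view: the pool exposes one total capacity.
        self.capacity_gb = sum(physical_disks_gb)
        self.allocated = {}  # virtual disk name -> size in GB

    @property
    def free_gb(self):
        return self.capacity_gb - sum(self.allocated.values())

    def create_virtual_disk(self, name, size_gb):
        """Create a logical disk from the pooled capacity."""
        if name in self.allocated:
            raise ValueError(f"virtual disk {name!r} already exists")
        if size_gb > self.free_gb:
            raise ValueError("not enough free capacity in the pool")
        self.allocated[name] = size_gb

    def utilization(self):
        return sum(self.allocated.values()) / self.capacity_gb


# Three physical disks (sizes assumed) appear as one 2000 GB pool.
pool = StoragePool([500, 500, 1000])
# The virtual disk is larger than any single physical disk in the pool.
pool.create_virtual_disk("vdisk-1", 1200)
```

Because the virtual disk draws on the whole pool, capacity that would sit idle on individual disks can still be used.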
Cloud Computing: An Overview
• Enables individuals and organizations to use IT resources as a
service over a network
• Enables self-service requesting and automates the
request-fulfillment process
Enables users to scale up or scale down the usage of computing
resources quickly
• Enables consumption-based metering
Consumers pay only for the resources they use
Example: CPU hours used, amount of data transferred, and
gigabytes of data stored
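Consumption-based metering can be shown with a small worked calculation. The rates and usage figures below are invented for illustration and do not reflect any provider's pricing.

```python
from decimal import Decimal

# Hypothetical price per metered unit (assumed values, in currency units).
RATES = {
    "cpu_hours": Decimal("0.05"),
    "gb_transferred": Decimal("0.01"),
    "gb_stored": Decimal("0.02"),
}


def charge(usage):
    """Bill only for what was consumed: sum of quantity x rate per metric."""
    return sum(
        (RATES[metric] * Decimal(quantity) for metric, quantity in usage.items()),
        Decimal("0"),
    )


# Assumed usage for one billing period.
usage = {"cpu_hours": 100, "gb_transferred": 250, "gb_stored": 500}
bill = charge(usage)  # 5.00 + 2.50 + 10.00
```

A consumer with no usage in the period is charged nothing, which is the point of metering by consumption.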
Key Challenges in Managing Information
• Exploding digital universe
Multifold growth of information
Some Constraints to Meeting the Requirements
Constraints include:
• Cost
• Physical environment
• Maintenance and support
• Compliance – regulatory and legal
• Hardware and software infrastructure
• Interoperability and compatibility
Data Center Environment – Application
• An application is a computer program that provides the logic for
computing operations
• The application sends requests to the underlying operating
system to perform read/write (R/W) operations on the storage
devices
• Some examples of these applications are e-mail, enterprise
resource planning (ERP), decision support system (DSS), resource
management, backup, authentication and antivirus applications,
and so on.
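The read/write path described above can be sketched with operating system calls from Python's standard library. The function name write_then_read is hypothetical; the os and tempfile calls are standard.

```python
import os
import tempfile


def write_then_read(payload: bytes) -> bytes:
    """Ask the operating system to write data to storage, then read it back."""
    fd, path = tempfile.mkstemp()  # the OS creates and opens a temporary file
    try:
        os.write(fd, payload)           # write request passed to the OS
        os.fsync(fd)                    # ask the OS to flush to the storage device
        os.lseek(fd, 0, os.SEEK_SET)    # move back to the start of the file
        return os.read(fd, len(payload))  # read request passed to the OS
    finally:
        os.close(fd)
        os.remove(path)


data = write_then_read(b"order #1")
```

The application never addresses the storage device directly; it only issues requests that the operating system carries out.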
Database Management System (DBMS)
The Contemporary Database Environment
Components of a DBMS
• Modeling language to define schema of each database hosted
according to the data model
• Data structures optimized to deal with huge data stored on a
permanent data storage device
• Query language and report writer
• Transaction mechanism to ensure data integrity
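Several of these components can be seen in a minimal sketch using SQLite through Python's built-in sqlite3 module. The account table and the transfer function are hypothetical; SQLite is used here only as a convenient example of a DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Modeling language: define the schema, including an integrity rule.
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO account (id, balance) VALUES (?, ?)", [(1, 100), (2, 50)]
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Transaction mechanism: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # valid transfer
failed = transfer(conn, 1, 2, 500)  # would make a balance negative

# Query language: read the resulting state.
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
```

The second transfer credits one account before the debit fails the CHECK rule, and the rollback undoes that credit, so the data stays consistent.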
Functions of a DBMS
• Manages the data dictionary, or the definition and structure of
data
• Manages data storage
• Responsible for data transformation & presentation
• Manages data integrity & security
• Controls multi-user access
• Manages backup & recovery
• Manages database language & API (source code)
• Manages database communication interface
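The data dictionary function can be illustrated with SQLite, which keeps the definitions of its tables in a catalog that can itself be queried. The invoice table is a hypothetical example; sqlite_master and PRAGMA table_info are SQLite features.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice ("
    " id INTEGER PRIMARY KEY,"
    " customer TEXT NOT NULL,"
    " total REAL)"
)

# The DBMS records the table definition in its own catalog.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Column names and declared types come from the same data dictionary.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(invoice)")]
```

Other DBMS products expose the same kind of metadata through their own catalog views.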
Importance of a DBMS
• A DBMS facilitates effective & efficient data management