Managing Data Resources
Chapter 5
Data Resource Management
Data are a vital organizational resource that needs to be managed
Distributed database
◦ can reside on network servers on the World Wide Web, on corporate intranets
or extranets, or on other company networks
◦ may be copies of operational or analytical databases, hypermedia or discussion
databases, or any other type of database
◦ Ensuring that the data in an organization’s distributed databases are consistently
and concurrently updated is a major challenge
Types of Databases
Distributed database
◦ One primary advantage of a distributed database lies in the protection
of valuable data: a failure at one location does not put all of the data at risk
◦ Another advantage of distributed databases is found in their storage
requirements.
◦ The primary challenge is maintaining data accuracy.
◦ One additional challenge associated with distributed databases is the
extra computing power and bandwidth necessary to access multiple
databases in multiple locations.
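The accuracy and bandwidth challenges above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Replica` class, the site names, and the version-number scheme are hypothetical and do not come from any particular database product. Each update costs one message per site, and a site that misses an update ends up holding stale data.

```python
class Replica:
    """One copy of the database at a single (hypothetical) location."""

    def __init__(self, name):
        self.name = name
        self.rows = {}  # key -> (version, value)

    def apply(self, key, version, value):
        current = self.rows.get(key)
        # Accept only newer versions so late or repeated updates are ignored.
        if current is None or version > current[0]:
            self.rows[key] = (version, value)


def write_everywhere(replicas, key, version, value):
    """Send the same update to every replica; return the messages needed."""
    for replica in replicas:
        replica.apply(key, version, value)
    return len(replicas)


def stale_replicas(replicas, key):
    """Names of replicas whose copy of `key` is behind the newest version."""
    versions = {r.name: r.rows.get(key, (0, None))[0] for r in replicas}
    newest = max(versions.values())
    return [name for name, version in versions.items() if version < newest]


sites = [Replica("london"), Replica("tokyo"), Replica("austin")]
write_everywhere(sites, "cust-17", 1, "12 Elm St")
# Simulate an update that reaches only two of the three sites.
write_everywhere(sites[:2], "cust-17", 2, "98 Oak Ave")
print(stale_replicas(sites, "cust-17"))
```

Real distributed databases use far more elaborate replication protocols; the sketch only shows why every additional location adds both traffic and a chance of inconsistency.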
Hypermedia Database
◦ consists of hyperlinked pages of multimedia (text, graphic and
photographic images, video clips, audio segments, and so on)
Data Warehousing and Data Mining
Data Warehouse
◦ stores data that have been extracted from the various operational,
external, and other databases of an organization
◦ central source of the data that have been cleaned, transformed, and
cataloged so that they can be used by managers and business
professionals for data mining
◦ Data marts hold subsets of data from the data warehouse that focus on
specific aspects of a company
Figure: The components of a complete data warehouse system.
Data Warehouse
• data from various operational and external databases are captured,
cleaned, and transformed into data that can be better used for analysis
• acquisition process might include activities like consolidating data from
several sources, filtering out unwanted data, correcting incorrect data,
converting data to new data elements, or aggregating data into new data
subsets
• data are then stored in the enterprise data warehouse, from which they
can be moved into data marts or to an analytical data store that holds
data in a more useful form for certain types of analysis
• Metadata (data that define the data in the data warehouse) are stored in a
metadata repository and cataloged by a metadata directory.
• Finally, a variety of analytical software tools can be provided to query,
report, mine, and analyze the data for delivery via Internet and intranet
Web systems to business end users
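The acquisition steps listed above (consolidating, filtering, correcting, converting, aggregating) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a real warehouse pipeline: the two source lists, the field names, and the helper functions `clean`, `load_warehouse`, and `aggregate_by` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records from two operational sources.
orders_db = [
    {"customer": " Acme ", "region": "east", "amount": "120.50"},
    {"customer": "Birch", "region": "WEST", "amount": "80"},
    {"customer": "Acme", "region": "East", "amount": "n/a"},  # unusable amount
]
web_sales = [
    {"customer": "birch", "region": "west", "amount": "19.50"},
    {"customer": "Test Account", "region": "east", "amount": "1"},  # unwanted
]


def clean(record):
    """Correct and convert one raw record; return None to filter it out."""
    name = record["customer"].strip().title()  # correct inconsistent spelling
    if name == "Test Account":
        return None  # filter out unwanted data
    try:
        amount = float(record["amount"])  # convert text to a numeric element
    except ValueError:
        return None  # drop values that cannot be corrected
    return {"customer": name, "region": record["region"].lower(), "amount": amount}


def load_warehouse(*sources):
    """Consolidate several sources into one list of cleaned rows."""
    rows = []
    for source in sources:
        for record in source:
            cleaned = clean(record)
            if cleaned is not None:
                rows.append(cleaned)
    return rows


def aggregate_by(rows, field):
    """Aggregate amounts into a new subset keyed by `field`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[field]] += row["amount"]
    return dict(totals)


warehouse = load_warehouse(orders_db, web_sales)
# A data mart is a subset of the warehouse focused on one aspect of the business.
west_mart = [row for row in warehouse if row["region"] == "west"]
print(aggregate_by(warehouse, "region"))
```

The `west_mart` list stands in for a data mart: it holds only the warehouse rows relevant to one region.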
Data in a data warehouse are static: once the data are gathered,
formatted for storage, and stored in the data warehouse, they will
never change
Data Mining
◦ the data in a data warehouse are analyzed to reveal hidden patterns and
trends in historical business activity
◦ analysis can be used to help managers make decisions about strategic
changes in business operations to gain competitive advantages in the
marketplace
◦ can discover new correlations, patterns, and trends in vast amounts of
business data stored in data warehouses
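As one small illustration of discovering patterns in historical data, the sketch below counts which items are bought together across past transactions. It is only an assumed example: the transactions are invented, and the function name `frequent_pairs` and the `min_support` threshold are hypothetical choices, not part of any specific data mining tool.

```python
from collections import Counter
from itertools import combinations

# Hypothetical historical transactions taken from a data warehouse.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
    {"milk", "bread"},
]


def frequent_pairs(baskets, min_support):
    """Return item pairs found together in at least `min_support` of baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair is always counted under the same ordering.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}


patterns = frequent_pairs(transactions, min_support=0.5)
print(patterns)
```

A manager might read the result as a hint to place or promote the paired items together; production data mining tools apply the same idea to far larger data sets and richer kinds of patterns.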
Traditional file processing
data are organized, stored, and processed in independent files of data
records
Problems of File Processing
◦ Data Redundancy: Independent data files included a lot of duplicated data
◦ This duplication caused problems when data had to be updated
◦ Separate file maintenance programs had to be developed and coordinated
to ensure that each file was properly updated
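The update problem caused by data redundancy can be shown with a minimal Python sketch. The two "files", their fields, and the helper functions are hypothetical and exist only to illustrate the point: the same address is stored twice, so an update applied to one file leaves the other out of date.

```python
# Two independent "files", each holding its own copy of the customer address.
billing_file = [{"customer_id": 7, "address": "12 Elm St", "balance": 40.0}]
shipping_file = [{"customer_id": 7, "address": "12 Elm St", "carrier": "ground"}]


def update_address(records, customer_id, new_address):
    """Maintenance routine for ONE file; every file needs its own call."""
    for record in records:
        if record["customer_id"] == customer_id:
            record["address"] = new_address


def inconsistent_customers(file_a, file_b):
    """Customer ids whose address differs between the two files."""
    addresses = {r["customer_id"]: r["address"] for r in file_a}
    return [
        r["customer_id"]
        for r in file_b
        if r["customer_id"] in addresses and addresses[r["customer_id"]] != r["address"]
    ]


# The billing file is updated, but the shipping file is forgotten.
update_address(billing_file, 7, "98 Oak Ave")
print(inconsistent_customers(billing_file, shipping_file))
```

Only a second, coordinated call to `update_address` on the shipping file removes the inconsistency, which is the extra maintenance effort the bullets describe.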