Data Management TURBAN
Businesses run on data that have been processed into information and
knowledge, which managers apply to business problems and
opportunities. This transformation of data into knowledge and solutions is
accomplished in several ways.
1. New data collection occurs from various sources.
2. The data are temporarily stored in a database and then preprocessed to
fit the format of the organization's data warehouse or data
marts.
3. Users then access the warehouse or data mart and take a
copy of the needed data for analysis.
4. Analysis (looking for patterns) is done with:
   • Data analysis tools
   • Data mining tools
5. The result of all these activities is the generation of decision
support and knowledge.
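The flow from collection to decision support can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record fields, the `preprocess` function, and the use of a plain Python list as the "warehouse" are hypothetical stand-ins, not part of any real warehouse product.

```python
from collections import Counter

# Step 1: hypothetical raw records collected from different sources.
raw_records = [
    {"source": "internal", "region": " east ", "sales": "120"},
    {"source": "external", "region": "West", "sales": "80"},
    {"source": "personal", "region": "EAST", "sales": "100"},
]

def preprocess(record):
    """Step 2: reshape a staged record to fit the (assumed) warehouse format."""
    return {
        "source": record["source"],
        "region": record["region"].strip().lower(),
        "sales": int(record["sales"]),
    }

# Step 2: load the preprocessed records into the warehouse (a plain list here).
warehouse = [preprocess(r) for r in raw_records]

# Step 3: the user takes a copy of only the data needed for analysis.
analysis_copy = [dict(row) for row in warehouse if row["sales"] >= 100]

# Step 4: a trivial "pattern": total sales per region.
sales_by_region = Counter()
for row in analysis_copy:
    sales_by_region[row["region"]] += row["sales"]

# Step 5: the result supports a decision, such as which region to focus on.
best_region = max(sales_by_region, key=sales_by_region.get)
print(best_region, sales_by_region[best_region])  # east 220
```

Because the analysis runs on a copy, the warehouse contents are left untouched, which mirrors step 3.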
The data life cycle begins with the acquisition of data from data sources.
These sources can be classified as internal, personal, and external.
• Internal Data Sources hold data about people, products, services, and
processes, usually stored in the corporate database.
• Personal Data is documentation on the expertise of corporate employees,
usually maintained by the employee. It can take the form of:
   • Estimates of sales
   • Opinions about competitors
   • Business rules
   • Procedures
   • Etc.
• External Data Sources range from commercial databases to government
reports.
• Internet and Commercial Database Services are accessible through the
Internet.
The task of data collection is fairly complex, which can create data-quality
problems that require validation and cleansing of the data.
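Validation and cleansing might look like the following minimal sketch. The rules are assumptions chosen for illustration (a customer id is required, the amount must parse as a non-negative number, and duplicate ids are rejected); the field names and the `cleanse` function are hypothetical.

```python
def cleanse(records):
    """Split records into cleaned rows and rejected rows.

    Assumed rules (illustrative only): a customer id is required,
    the amount must be a non-negative number, and ids must be unique.
    """
    clean, rejected = [], []
    seen_ids = set()
    for rec in records:
        cust_id = (rec.get("customer_id") or "").strip()
        try:
            amount = float(rec.get("amount", ""))
        except (TypeError, ValueError):
            amount = None  # the amount could not be parsed
        if not cust_id or amount is None or amount < 0:
            rejected.append(rec)   # failed validation
        elif cust_id in seen_ids:
            rejected.append(rec)   # duplicate of an earlier record
        else:
            seen_ids.add(cust_id)
            clean.append({"customer_id": cust_id, "amount": amount})
    return clean, rejected

collected = [
    {"customer_id": " C1 ", "amount": "19.50"},
    {"customer_id": "", "amount": "5"},
    {"customer_id": "C2", "amount": "abc"},
    {"customer_id": "C1", "amount": "3"},
    {"customer_id": "C3", "amount": "-1"},
]
clean, rejected = cleanse(collected)
print(len(clean), len(rejected))  # 1 4
```

Keeping the rejected rows, rather than silently dropping them, lets someone review why the collected data failed validation.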
One way to improve data collection from multiple external sources is to use
a data flow manager (DFM), which takes information from external sources
and puts it where it is needed, when it is needed, in a usable form.
• A DFM consists of:
   • a decision support system
   • a central data request processor
   • a data integrity component
   • links to external data suppliers
   • the processes used by the external data suppliers.
A data warehouse is a repository of subject-oriented historical data that is
organized to be accessible in a form readily acceptable for analytical
processing activities.
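• Characteristics of a data warehouse:
   • Time variant. The data are kept for many years so they can be used for
   trends, forecasting, and comparisons over time.
   • Nonvolatile. Once entered into the warehouse, data are not updated.
   • Relational. Typically the data warehouse uses a relational structure.
   • Client/server. The data warehouse uses the client/server architecture
   mainly to provide the end user with easy access to its data.
   • Web-based. Data warehouses are designed to provide an efficient
   computing environment for Web-based applications.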
How It Works.
Queries allow users to request information from the computer that is not
available in periodic reports. Query systems are often based on menus or,
if the data are stored in a database, on a structured query language (SQL)
or a query-by-example (QBE) method.
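As an illustration of an ad hoc SQL query, the sketch below uses Python's standard `sqlite3` module with an in-memory database. The `sales` table, its columns, and the sample rows are hypothetical and exist only to make the example runnable.

```python
import sqlite3

# An in-memory database stands in for the corporate database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("East", "Widget", 120.0),
        ("West", "Widget", 80.0),
        ("East", "Gadget", 50.0),
    ],
)

# An ad hoc question that no periodic report answers:
# total sales of one product, broken down by region.
query = """
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE product = ?
    GROUP BY region
    ORDER BY total DESC
"""
rows = conn.execute(query, ("Widget",)).fetchall()
print(rows)  # [('East', 120.0), ('West', 80.0)]
conn.close()
```

The `?` placeholder passes the product name as a parameter instead of pasting it into the SQL text, which is the usual way to keep user input out of the query string.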
Data mining is a tool for analyzing large amounts of data. It derives its
name from the similarities between searching for valuable business
information in a large database, and mining a mountain for a vein of
valuable ore.
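One very simple form of pattern search is counting which items tend to appear together in transactions. The sketch below is an illustrative toy, not a description of any particular data mining product; the baskets, the `frequent_pairs` function, and the support threshold are all assumptions.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase transactions, one set of items per basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "milk", "jam"},
]

def frequent_pairs(baskets, min_support):
    """Return item pairs that occur together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sorting makes ("bread", "milk") and ("milk", "bread") the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

pairs = frequent_pairs(transactions, min_support=3)
print(pairs)  # {('bread', 'milk'): 3}
```

On a real database the same idea has to cope with far more items and baskets, which is why dedicated mining tools are used instead of a brute-force count like this one.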
Multidimensionality Visualization:
• Measures:
   • Money
   • Sales volume
   • Head count
   • Inventory profit
   • Actual versus forecasted results
• Time:
   • Daily
   • Weekly
   • Monthly
   • Quarterly
   • Yearly
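Viewing one measure at different points along the time dimension amounts to rolling detailed figures up to a coarser period. The following sketch assumes a hypothetical list of daily sales amounts and a `roll_up` helper written for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily values of a single measure (sales volume in money).
daily_sales = [
    (date(2024, 1, 15), 100.0),
    (date(2024, 2, 3), 50.0),
    (date(2024, 4, 20), 70.0),
    (date(2024, 4, 21), 30.0),
]

def roll_up(rows, level):
    """Aggregate daily amounts to a monthly, quarterly, or yearly view."""
    totals = defaultdict(float)
    for day, amount in rows:
        if level == "monthly":
            key = (day.year, day.month)
        elif level == "quarterly":
            key = (day.year, (day.month - 1) // 3 + 1)
        elif level == "yearly":
            key = (day.year,)
        else:
            raise ValueError(f"unsupported level: {level}")
        totals[key] += amount
    return dict(totals)

print(roll_up(daily_sales, "quarterly"))  # {(2024, 1): 150.0, (2024, 2): 100.0}
```

The same rows answer a monthly or yearly question by changing only the `level` argument, which is the practical point of treating time as a dimension.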
Data warehouses and data marts serve end users in all functional areas.
Most current databases are static: they simply gather and store information.
Today's business environment also requires specialized databases.
• Some of the data management solutions
discussed are very expensive and justifiable only in large corporations. Smaller organizations
can make the solutions cost-effective if they leverage existing databases rather than create
new ones. A careful cost-benefit analysis must be undertaken before any commitment to the
new technologies is made.
• Should data be distributed close to their users? This
could potentially speed up data entry and updating, but it adds replication and security risks.
Or should data be centralized for easier control, security, and disaster recovery? Centralization
carries communications and single-point-of-failure risks.
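• Data mining may suggest that a company send catalogs or promotions to
only one age group or one gender. A man sued Victoria's Secret Corp. because his female
neighbor received a mail order catalog with deeply discounted items and he received only the
regular catalog (the discount was actually given for volume purchasing). Settling
discrimination charges can be very expensive.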
• Should a firm invest in internally collecting, storing, maintaining,
and purging its own databases of information? Or should it subscribe to external databases,
where providers are responsible for all data management and data access?
• Can an organization's business processes, which have become dependent on
databases, recover and sustain operations after a natural or other type of information system disaster?
How can a data warehouse be protected? At what cost?
• Are the company's competitive data safe from external snooping or
sabotage? Are confidential data, such as personnel details, safe from improper or illegal access and
alteration? Who owns such personal data?
• Paying for use of data. Compilers of public-domain information, such as Lexis-Nexis, face a
problem of people lifting large sections of their work without first paying royalties. The Collection of
Information Antipiracy Act (Bill 2652 in the U.S. Congress) would provide greater protection from
online piracy. This and other intellectual property issues are being debated in Congress and
adjudicated in the courts.
• Collecting data in a warehouse and conducting data mining may result in the invasion of
individual privacy. What will companies do to protect individuals? What can individuals do to protect
their privacy?
• One very real issue, often known as the legacy data acquisition problem, is what to
do with the mass of information already stored in a variety of systems and formats. Data in older,
perhaps obsolete, databases still need to be available to newer database management systems.
Many of the legacy application programs used to access the older data simply cannot be converted
into new computing environments without considerable expense. Basically, there are three
approaches to solving this problem. One is to create a database front end that can act as a translator
from the old system to the new. The second is to integrate applications with the new
system, so that data can be seamlessly accessed in the original format. The third is to migrate the
data into the new system by reformatting it.
• Moving data efficiently around an enterprise is often a major problem. The
inability to communicate effectively and efficiently among different groups in different geographical
locations is a serious roadblock to implementing distributed applications properly, especially given the
many remote sites and mobility of today's workers.
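The migration approach to the legacy data problem, reformatting old records for a new system, can be sketched as follows. The fixed-width layout, the field names, and the `migrate_record` function are hypothetical; a real legacy format would have to be taken from that system's own documentation.

```python
# Hypothetical fixed-width layout of a legacy record:
# columns 0-5 customer id, 6-25 name, 26-33 signup date as YYYYMMDD.
LEGACY_LAYOUT = {
    "customer_id": (0, 6),
    "name": (6, 26),
    "signup": (26, 34),
}

def migrate_record(line):
    """Reformat one legacy fixed-width line into a dict for the new system."""
    fields = {
        name: line[start:end].strip()
        for name, (start, end) in LEGACY_LAYOUT.items()
    }
    # Assumed target format: ISO-style dates (YYYY-MM-DD).
    raw = fields["signup"]
    fields["signup"] = f"{raw[:4]}-{raw[4:6]}-{raw[6:8]}"
    return fields

legacy_line = "C00042" + "Ada Lovelace".ljust(20) + "19991231"
migrated = migrate_record(legacy_line)
print(migrated)
# {'customer_id': 'C00042', 'name': 'Ada Lovelace', 'signup': '1999-12-31'}
```

The same parsing function could also sit behind a front end that translates on demand, which corresponds to the first approach rather than a one-time migration.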