
Managing Data as a Resource


Intended Learning Outcomes


 State what data resource management is
 Explain the importance of, and the trend towards, data/information resource management
 Give a brief historical account of the development of data resource management


Data Resource Management


 According to the Data Management Association (DAMA), an international body on data management:
 "Data Resource Management is the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle needs of an enterprise."
 The definition is broad and covers even non-technical concerns.
 An alternative definition:
 "Data management is the development, execution and supervision of plans, policies, programs and practices that control, protect, deliver and enhance the value of data and information assets."


Data Resource Management


[Figure: The Data Resource Management Wheel]

 Why would we need Data Resource Management (DRM)?


 Why is DRM important?


Historical Perspective: The Six Generations of Data Management

Zeroth Generation: Record Managers, 4000 BC-1900

 The first known writings recorded royal assets and taxes in Sumeria
 The ancients stored data on clay tablets, papyrus reeds and parchment, and later on paper, over a period of 6,000 years
 Data processing was manual


First Generation: Record Managers, 1900-1955

 The Jacquard loom produced fabric from patterns represented on punched cards
 Hollerith used punched-card technology to tabulate the 1890 US Census
 Households, states and congressional districts were represented as binary holes in a card

First Generation: Record Managers, 1900-1955

 By 1955, many companies kept large stores of punched cards, along with sorters and tabulators
 The cards were processed by rudimentary programs that selectively reproduced them on paper and on other cards
 This would have taken far longer as a manual process
 These were electromechanical machines


Second Generation: Programmed Unit Record Equipment, 1955-1970

 Around the 1940s, as stored-program computers began to be developed, UNIVAC developed magnetic tape
 A tape could store information equivalent to thousands of punched cards

Second Generation: Programmed Unit Record Equipment, 1955-1970

 The UNIVAC was delivered to the US Census Bureau in 1951
 It could process hundreds of records in seconds
 It improved on punched cards in space, time, convenience and reliability
 Standard packages such as general ledger, payroll, inventory control, subscription management, banking and document libraries began to emerge
 Software was key to this era
 Languages such as COBOL and FORTRAN made it easier to sort, analyze and process data


Second Generation: Programmed Unit Record Equipment, 1955-1970

 Large businesses recorded even more information
 Demand for faster equipment increased
 Hardware prices dropped, so even medium-sized businesses could afford computers
 Programs were typically file-oriented
 They read data from files (e.g. tape) sequentially and produced new files as output

Second Generation: Programmed Unit Record Equipment, 1955-1970

 The emergence of operating systems provided a means to store and manage these files
 Another key aspect of this era was batch processing
 Data was collected cumulatively until the end of the day, then processed in one batch
 The problem: if there was an error, it would take time to detect
 Correcting such an error meant rerunning the whole batch
 The correctness of the data was thus not guaranteed
 This led to the next phase: online processing


Third Generation: Online Network Databases, 1965-1980

 Some applications, such as stock market trading and airline reservations, needed current, up-to-date information
 They could not rely on day-old information, as in batch processing
 There was a need for online, interactive processing of data
 Early systems providing online interactive processing were teleprocessing monitors
 These sent requests to a centralized server for processing and provided immediate feedback

[Figure: Teleprocessing]


Third Generation: Online Network Databases, 1965-1980

 These early systems stored data in a manner that gave rise to redundancy: repetition of data
 Data about a single entity would be stored in multiple locations
 This made updates problematic, a consequence of the network and hierarchical data models in use
 To maintain consistency, updates had to be applied several times

[Figure: Duplication in the Hierarchical Model]


Third Generation: Online Network Databases, 1965-1980

 Another problem arose: these systems had to accommodate concurrent use of data by multiple users/transactions
 This was solved by "locking" records during processing so that other transactions could not access them (see the sketch below)
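
The idea can be illustrated with a lock guarding each record; a minimal sketch in Python (all names are hypothetical, and this is not the mechanism those early systems actually used):

```python
import threading

# Each record has its own lock; a transaction must hold the lock
# before touching the record, so concurrent updates cannot interleave.
balances = {"acct-1": 100}
locks = {"acct-1": threading.Lock()}

def debit(record_id, amount):
    with locks[record_id]:        # other transactions block here
        balances[record_id] -= amount

threads = [threading.Thread(target=debit, args=("acct-1", 10)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(balances["acct-1"])  # always 50; without the lock, updates could be lost
```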

Fourth Generation: Relational Databases and Client-Server Computing, 1980-1995

 The systems of the third generation were successful
 They were, however, difficult to design, program and maintain
 In 1970, Dr. E. F. Codd proposed a different method of storing data


Fourth Generation: Relational Databases and Client-Server Computing, 1980-1995

 The idea of the relational model is to represent both entities and relationships in a uniform way
 The relational data model has a unified language for data definition, data navigation and data manipulation, rather than separate languages for each task

Fourth Generation: Relational Databases and Client-Server Computing, 1980-1995

 The relational data model and its operators give much shorter and simpler programs for record-management tasks
 Through the Structured Query Language (SQL), it became easy to retrieve and manipulate data
 Unlike the hierarchical and network models, where one had to navigate through individual records, SQL provides a non-procedural way of manipulating data
 You specify only what data is needed, not how to retrieve it (see the sketch below)
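
For instance, with Python's built-in sqlite3 module (the table and data below are hypothetical, chosen only to illustrate the declarative style):

```python
import sqlite3

# A small in-memory relational database with a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Atieno", "Sales", 52000), ("Kamau", "IT", 61000), ("Wanjiru", "IT", 58000)],
)

# Declarative: the query states WHAT is wanted (IT staff, highest paid
# first), not HOW to navigate records; the engine picks the access path.
for name, salary in conn.execute(
    "SELECT name, salary FROM employee WHERE dept = 'IT' ORDER BY salary DESC"
):
    print(name, salary)
```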


Fourth Generation: Relational Databases and Client-Server Computing, 1980-1995

 Many companies started producing database management systems based on the relational model, e.g. Microsoft Access, Oracle and IBM DB2
 SQL became a standard language across different vendor platforms

Fourth Generation: Relational Databases and Client-Server Computing, 1980-1995

 Another development in this era was the rise of client-server computing
 It is a computing paradigm where the client side provides software to input/capture data
 The captured data is sent over a network to the server, which processes it and returns the output to the client
 The server also stores data persistently (see the sketch below)
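
A minimal sketch of the paradigm using Python sockets on the loopback interface (the address, port and message are arbitrary, and a real server would handle many clients at once):

```python
import socket
import threading

HOST, PORT = "127.0.0.1", 5050   # arbitrary loopback address for the sketch
ready = threading.Event()

def server():
    with socket.create_server((HOST, PORT)) as srv:
        ready.set()                      # listening; the client may connect
        conn, _ = srv.accept()
        with conn:
            data = conn.recv(1024)       # data captured on the client side
            conn.sendall(data.upper())   # "processing" done on the server

threading.Thread(target=server, daemon=True).start()
ready.wait()

with socket.create_connection((HOST, PORT)) as client:
    client.sendall(b"order: 3 widgets")  # client captures and sends input
    print(client.recv(1024))             # server returns the processed result
```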


Fourth Generation: Relational Databases and Client-Server Computing, 1980-1995

 Parallel database processing also arose in this era
 It means performing a task by subdividing it into smaller components that are processed separately and later combined to give the final result (see the sketch below)
 Data-mining tasks on large data sets that might have taken hours or days can be completed in a short time through parallel processing
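
A minimal sketch of the divide-process-combine idea in Python (the task, a large summation, is made up purely for illustration):

```python
from concurrent.futures import ProcessPoolExecutor

# Subdivide the data, process the parts in separate worker
# processes, then combine the partial results into the final answer.
if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = [data[i::4] for i in range(4)]        # subdivide into 4 parts
    with ProcessPoolExecutor(max_workers=4) as pool:
        partials = list(pool.map(sum, chunks))     # process parts separately
    print(sum(partials))                           # combine: 499999500000
```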

Fourth Generation: Relational Databases and Client-Server Computing, 1980-1995

[Figure]


Fourth Generation: Relational Databases and Client-Server Computing, 1980-1995

 Graphical user interfaces (GUIs) are also key in this era
 Users are able to retrieve and visualize data, e.g. as pie charts, graphs and other graphically rendered forms
 Tools such as report generators exist to allow users to do this

Fifth Generation: Multimedia Databases, 1995-

 Specialized applications needed specialized ways of storing data
 For example, geographic information systems needed ways to store maps, large text databases needed ways in which text could be indexed and retrieved, images needed to be stored, etc.


Fifth Generation: Multimedia Databases, 1995-

 The relational model traditionally separated data from the programs that access it
 Relational databases are generally good at storing numbers and text, but not very accommodating to images, large texts and geographical data
 SQL extensions have been made to try to cater to this need
 Big vendors like Oracle have developed object-oriented databases that store both the data and the procedures/applications that access it

Fifth Generation: Multimedia Databases, 1995-

 These are the new generation of data management software
 Unifying data and their procedures gives rise to the ability to create "workflows"
 For example, a simple purchase by a client could trigger a workflow of steps: (1) buyer request, (2) bid, (3) agree, (4) ship, (5) invoice, (6) pay
 Systems to script, execute and manage workflows are becoming common (a toy sketch follows the figure below)


[Figure: Workflow]
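
A toy workflow script in Python, using the six steps from the purchase example above (the handler functions are hypothetical placeholders):

```python
# Hypothetical one-line handlers for the six purchase steps.
def buyer_request(order): print("request:", order); return order
def bid(order):           print("bid received");    return order
def agree(order):         print("terms agreed");    return order
def ship(order):          print("shipped");         return order
def invoice(order):       print("invoiced");        return order
def pay(order):           print("paid");            return order

WORKFLOW = [buyer_request, bid, agree, ship, invoice, pay]

def run(order):
    # Execute each step in sequence; a real workflow engine would also
    # persist state, branch on conditions, and resume after failures.
    for step in WORKFLOW:
        order = step(order)

run({"item": "widgets", "qty": 3})
```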

Sixth Generation: 2000s, NoSQL and NewSQL

 XML databases are a type of structured, document-oriented database that allows querying based on XML document attributes
 Data is conveniently viewed as a collection of documents, with a structure that can vary from the very flexible to the highly rigid (see the sketch below)
 NoSQL databases are often very fast and do not require fixed table schemas
 Cloud-based storage
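
A minimal sketch of attribute-based querying over documents, using Python's standard xml.etree module on a made-up XML fragment (a stand-in for a real XML database):

```python
import xml.etree.ElementTree as ET

# Two documents in one collection; note they need not share the same
# fields ("format" appears on only one) -- schemas can vary.
doc = ET.fromstring("""
<library>
  <book year="1995"><title>Records on Clay</title></book>
  <book year="2009" format="ebook"><title>Documents in the Cloud</title></book>
</library>
""")

# Query on a document attribute: books published after 2000.
for book in doc.findall("book"):
    if int(book.get("year")) > 2000:
        print(book.findtext("title"), "-", book.get("format"))
```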


