Information Integration, Life-Cycle and Visualization & Group Projects
Thilanka Munasinghe
Xinformatics – ITEC, CSCI, ERTH 4400/6400
Module 5, February 11th, 2020
Contents
• Review of last class, reading
• Information integration
• Group Projects – exploring ideas and setting
up groups
• Information life-cycle & Management
• Information visualization
• Assignment 2
• Assignment 3
Assignment 2 (Available on LMS)
Assignment: Based on the use-case that you developed in Assignment 1, address the
question of assessing information uncertainty in different aspects of the use case and
determine possible ways to condition the system to reduce uncertainty in achieving the
goals of the use case. [You will need to take into account that since the use case is not
implemented, this is a hypothetical exercise]. The weighting score for each question is
included in the rubric on LMS. Please use the question numbering (1 and 2) below for your
written responses for this assignment.
1. Choose a signage ‘system’. Pick an analog or digital information system that utilizes
‘signs’ (icons, indices, symbols). It can be one you like or dislike. Write min. 1-2
sentences on why you made your choice. Include a graphic of your chosen information
system. (1%)
2. Describe signs in the system you chose and why it is a “system” and use the Class 2
system “properties, attributes and leverage points” to frame your description. Write min.
2-3 sentences per sign for at least 3 signs. Graduate question (6400-level): calculate or
estimate the uncertainty in the information content for part or all of the information
system. (3%)
3. Semiotic analysis: classify the signs according to categories defined in class, e.g. what
is the signifier and what is signified; what is the index and indicate which are icons, or
symbols? Min. 1 sentence per sign component. Describe the “code” or paradigm used.
Min. 1-2 sentences. (3%)
4. For your chosen signage system what library, cognitive and/ or social science principles
have been applied in their development? What, if any, attention is given to syntax,
semantics, and pragmatics; describe them - min. 3-4 sentences (3%)
5. Present in class. Discuss the relevant informatics considerations from questions 1 (why),
2 (system aspects), 3 (semiotics), and 4 (principles applied). Present for ~5 mins (~3-5
slides) with a few questions to follow (5%).
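For the graduate question in part 2, one common way to quantify uncertainty in information content is Shannon entropy. The sketch below is only an illustration: it assumes that entropy is an acceptable measure for the assignment, and the sign names and counts are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(observations):
    """Return the Shannon entropy, in bits, of a sequence of observed signs."""
    counts = Counter(observations)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical log of signs seen along one route (not real data).
signs = ["stop", "yield", "stop", "exit", "stop", "yield", "exit", "stop"]

entropy_bits = shannon_entropy(signs)
max_bits = math.log2(len(set(signs)))  # upper bound: all signs equally likely
print(f"entropy = {entropy_bits:.3f} bits (maximum {max_bits:.3f} bits)")
```

The closer the entropy is to the maximum, the less predictable the next sign is for a viewer.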
Readings (on LMS)
Information Integration
Readings
Information Lifecycle
• Information Lifecycle
• Information Lifecycle Management (ILM)
• The New Buzzwords: Information Lifecycle Management
• Database Archiving: A Critical Component of Information Lifecycle Management
• Information Lifecycle Management – Wikipedia
Readings
Information Visualization
Data to become useful
• For the data to become useful (Davenport and
Prusak define information as data that is
organized somehow) we have to do something
to it. It needs to be transformed.
• Davenport and Prusak suggest that there are
"5 Cs" to how we might do that. These are:
Resource/Reference: https://thecdm.ca/news/how-do-we-create-information
“5 Cs”
1. How is the data contextualized? Do we know why the data
was gathered?
2. How has the data been categorized? Do we know the units
of analysis, the key components of the data?
3. How was the data calculated? Has there been some
mathematical or statistical analysis, such as changes over
time, averages, etc.?
4. What corrections have been applied to the data? Do we
know how and whether or not errors have been removed?
5. And finally, has the data been condensed? Are there
summaries, tables, graphics?
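The five questions above can be read as steps in a small processing pipeline. The following is a minimal sketch, not Davenport and Prusak's own method: the rainfall readings, the -999 error flag, and the function name are all hypothetical.

```python
from statistics import mean

# Hypothetical raw readings: (station, rainfall in mm). -999 marks a sensor error.
raw = [("A", 12.0), ("A", -999.0), ("A", 8.0), ("B", 30.0), ("B", 34.0)]

def to_information(readings, purpose, error_flag=-999.0):
    """Apply the five Cs to raw readings and return a small summary."""
    # Corrected: drop readings carrying the error flag.
    corrected = [(s, v) for s, v in readings if v != error_flag]
    # Categorized: group values by the unit of analysis (the station).
    by_station = {}
    for station, value in corrected:
        by_station.setdefault(station, []).append(value)
    # Calculated: a simple statistic per category.
    averages = {s: mean(vs) for s, vs in by_station.items()}
    # Contextualized and condensed: record why the data exists, and summarize it.
    return {
        "context": purpose,
        "units": "mm",
        "mean_rainfall": averages,
        "dropped_readings": len(readings) - len(corrected),
    }

summary = to_information(raw, purpose="weekly flood-risk briefing")
print(summary)
```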
From Information to Knowledge
• Turning information into action is the next
step up the pyramid and what defines
knowledge.
• Some call knowledge "actionable
information." Another transformation is called
for: Davenport and Prusak helpfully provide
another list (and another set of "Cs"):
1. The information is compared. How does this
situation compare to other situations we
have been in?
2. The consequences are identified. What
implications does the information have for
decisions and actions?
3. Connections have been made. How does
this bit of knowledge relate to others?
4. A conversation is initiated. What do other
people think about this information?
Information integration
• Involves combining information residing in
different sources and providing users with a
unified view of them.
• This process becomes significant in a variety
of situations both commercial (e.g. when two
similar companies need to merge their
databases) and scientific (e.g. combining
research results from different bioinformatics
repositories).
• Integration appears with increasing frequency
as the volume of existing information and the
need to share it grow explosively.
• Combines information from disparate data
sources and displays it in a single integrated
framework
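A minimal sketch of a "unified view" over two sources follows. The two schemas, field names, and records are hypothetical, and real integration systems do far more (schema matching, conflict resolution, provenance).

```python
# Two hypothetical sources describing the same customers with different schemas.
crm_rows = [{"cust_id": 1, "full_name": "Ada Lovelace", "email": "ada@example.org"}]
billing_rows = [
    {"customer": 1, "balance_usd": 42.5},
    {"customer": 2, "balance_usd": 0.0},
]

def empty_record(customer_id):
    """Return a unified record with every field present but unknown."""
    return {"id": customer_id, "name": None, "email": None, "balance_usd": None}

def unified_view(crm, billing):
    """Merge both sources on the customer id into one record per customer."""
    view = {}
    for row in crm:
        record = view.setdefault(row["cust_id"], empty_record(row["cust_id"]))
        record["name"] = row["full_name"]
        record["email"] = row["email"]
    for row in billing:
        record = view.setdefault(row["customer"], empty_record(row["customer"]))
        record["balance_usd"] = row["balance_usd"]
    return [view[key] for key in sorted(view)]

for record in unified_view(crm_rows, billing_rows):
    print(record)
```

Fields left as None make gaps visible: customer 2 exists in billing but is unknown to the CRM source.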
Information integration
• It has become the focus of extensive
theoretical work, and numerous open
problems remain unsolved.
• In management circles, people frequently
refer to data integration as "Enterprise
Information Integration" (EII) (Wikipedia)
• Is this an information management challenge
(rhetorical question)?
• Integration discussion context
– Data Integration vs. Data Interoperability
An example – Geospatial Data
• Much of the work on information integration
has focused on the dynamic integration of
structured data sources, such as
databases or XML data.
• With the more complex geospatial data
types, such as imagery, maps, and vector
data, researchers have focused on the
integration of specific types of information,
such as placing points or vectors on maps,
but much of this integration is only partially
automated.
• The challenge is that the dynamic
integration of online data and geospatial
data is beyond the state of the art of
existing integration systems.
• An example – Geospatial Data in Year 2001
Resource/Reference/Image Credit: http://www.isi.edu/integration/TerraWorld/
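"Placing points on maps" can be illustrated with the simplest possible case. This sketch assumes a global map image in a plain equirectangular projection, which is an assumption made for illustration and not a description of how the TerraWorld project worked; the function name and sample points are hypothetical.

```python
def lonlat_to_pixel(lon, lat, width, height):
    """Map longitude/latitude (degrees) to (x, y) pixels on a global equirectangular image.

    Assumes the image spans -180..180 degrees east-west and 90..-90 north-south,
    with the pixel origin at the top-left corner.
    """
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError("coordinates out of range")
    x = (lon + 180.0) / 360.0 * width
    y = (90.0 - lat) / 180.0 * height
    return x, y

# Hypothetical points of interest: name -> (longitude, latitude).
points = {"origin": (0.0, 0.0), "north_west_corner": (-180.0, 90.0)}
for name, (lon, lat) in points.items():
    print(name, lonlat_to_pixel(lon, lat, width=360, height=180))
```

Imagery in other projections or with partial coverage needs georeferencing metadata before any such mapping is possible, which is part of why this integration is hard to automate.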
In Class Work:
Explain your Use Case to someone you have NOT
met before in the class
In-Class Exercise
• Break into groups of 3 – 4 persons
• Choose a signage “system.” Pick an analog or digital information system
that utilizes ‘signs’ (icons, indices, symbols). It can be one you like or
dislike.
• Describe signs in the system you chose and why it is a ‘system’.
Describe the “properties, attributes, and leverage points” to frame your
description of the system. Write min. 2-3 sentences per sign for at least 3
signs.
Group Project
• This is the term project that you will
work on until the end of the semester
• 4-6 members in a team (No more than 6
students in a group)
• Some ideas to explore:
Weather information
Disaster and Risk Management
Disease outbreaks
Early Warning Systems (Malaria/Dengue EWS)
Brainstorm ideas on Information
System for the Group Project
• Identify the area(s) in informatics that you want to
work on (Ex. Urban, Astro, Healthcare
informatics…)
• Create mind maps…
• Develop or refine a use case around a particular
area of informatics that you choose as a group
• Take Notes during the group discussions
• Use a collaborative editor such as Google Docs, and
use Google Drive/Dropbox/Box to share your
notes, data and other resources with the group
members.
• Discuss how a prototype implementation will
address areas defined in lecture materials
covering information uncertainty, semiotics,
cognition, and architectures.
• Use the template from Assignment 1
• Develop a conceptual model for the use case
you chose as a group. This model should
include relations among the “content” (things)
and application of information theory and
architecture principles (e.g. interfaces) and
include diagrams
Consider the following application areas...
• Ask Questions: How do we use the
information to build a better future?
https://pmm.nasa.gov/GPM
Image Resource: https://pmm.nasa.gov/sites/default/files/document_files/GPM%20Mission%20Brochure.pdf
Information on Global Precipitation
Information on Weather & Climate
• Enhanced Prediction Skills for Weather and
Climate
• Improve Forecasting Capabilities for Floods,
Drought and Landslides
Better Agricultural Crop Forecasting
• The agricultural community needs to know the timing and amount of precipitation
to forecast crop yields and warn of freshwater shortages that might affect
irrigation and production.
• Satellite data from the GPM mission will provide global precipitation estimates
over land that can be incorporated into forecast models
Monitoring Freshwater Resources
Water resource managers rely on accurate precipitation measurements to monitor
freshwater resources necessary for human activities including public consumption,
irrigation, sanitation, mining, livestock and powering industries. Global observations
of precipitation from the GPM constellation of satellites will allow scientists to better
understand and predict changes in freshwater supply.
Part 2 of the lecture:
Elements/ Forms of Information
• Structured/ un-structured, content, context
• Syntax-semantics-pragmatics
Elements/ Forms of Information
• Integration poses an important challenge
here
– Two forms presented/ organized differently
– Different structure, semantics…
Aiding integration
• Usually an integration capability is HIGHLY
curated or left entirely to the end user
• If left to the user, the result is a new product
which must also be managed and shared
• “I can’t integrate what I don’t understand”
• Key idea: provide for integratability !!!
– Standards – formats for sure but also
– Metadata
– Semantics
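The point that metadata enables integration can be sketched with units. In this minimal illustration, the dataset layout, the "units" field, and the conversion table are hypothetical; the idea is only that integration fails loudly when the metadata needed to interpret a dataset is missing.

```python
# Hypothetical datasets; the "units" metadata is what makes them integrable.
dataset_a = {"metadata": {"variable": "precipitation", "units": "mm"}, "values": [10.0, 25.4]}
dataset_b = {"metadata": {"variable": "precipitation", "units": "in"}, "values": [1.0, 0.5]}

TO_MM = {"mm": 1.0, "in": 25.4}  # conversion factors to a shared unit

def integrate(*datasets):
    """Combine datasets of the same variable after converting values to millimetres."""
    variables = {d["metadata"]["variable"] for d in datasets}
    if len(variables) != 1:
        raise ValueError(f"cannot integrate different variables: {sorted(variables)}")
    merged = []
    for d in datasets:
        units = d["metadata"].get("units")
        if units not in TO_MM:
            # "I can't integrate what I don't understand"
            raise ValueError(f"missing or unknown units: {units!r}")
        merged.extend(v * TO_MM[units] for v in d["values"])
    return {"metadata": {"variable": variables.pop(), "units": "mm"}, "values": merged}

combined = integrate(dataset_a, dataset_b)
print(combined["metadata"], combined["values"])
```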
Informatics considerations
• Be aware of what means of integration are
available and which can actually be used
Life Cycle
Life cycle - definitions
• Life-cycle elements
– Acquisition: Process of recording or generating
a concrete artefact from the concept (see
transduction)
– Curation: The activity of managing the use of
data from its point of creation to ensure it is
available for discovery and re-use in the future
(http://www.dcc.ac.uk/FAQs/data-curator)
– Preservation: Process of retaining usability of
data in some source form for intended and
unintended use
– Stewardship: Process of maintaining integrity
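One widely used technique for checking that stored data has kept its integrity is a fixity check: record a checksum at ingest and compare it later. The slides do not name this technique, so treat the sketch below as one possible illustration; the sample bytes and function names are hypothetical.

```python
import hashlib

def fixity(data: bytes) -> str:
    """Return the SHA-256 hex digest, used here as a fixity value."""
    return hashlib.sha256(data).hexdigest()

def is_intact(data: bytes, expected: str) -> bool:
    """Check that the data still matches the digest recorded earlier."""
    return fixity(data) == expected

# Hypothetical file content; the digest would be stored alongside it at ingest.
original = b"station,rain_mm\nA,10.0\n"
recorded = fixity(original)

print(is_intact(original, recorded))            # unchanged copy
print(is_intact(original + b"B,3.0\n", recorded))  # altered copy
```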
The nature of the challenge
• To architect information systems today
– You may play many roles
– You may not get all the metadata or
information you need even if you get the data
– You will need skills that you were not taught
• To work with end-users today
– You may have lots of technical experience
– You will need new skills in addressing the
changing use of data and information
– One ‘size’ does not fit all
Acquisition
• Learn / read what you
can about the means of
acquisition
– Documents may not be
easy to find
– Bias is everywhere!!!
Preservation
• ‘Archiving’ is only one component
– Where are your class notes from last term?
– This term?
• Involves steps that may not be conventionally
thought of
• Think 10, 20, 50, 200 years forward. Looking
historically gives some guide to future
considerations
• …So, how would you preserve your class
notes from this class?
Information Life Cycle
• The life cycle applies within, before and
after your use case…
Information Lifecycle Governance (ILG)
• Information lifecycle governance (ILG) helps
you manage your business information
throughout its lifecycle — from creation to
deletion. It automates critical data
operation requirements like records
management, electronic discovery,
compliance, storage optimization and data
migration initiatives.
Reference/Resource: https://www.ibm.com/analytics/information-lifecycle-governance?
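The "creation to deletion" idea can be sketched as a retention rule. This is a generic illustration and not the API of any IBM product: the record kinds, retention periods, and the legal_hold flag are all hypothetical.

```python
from datetime import date, timedelta

# Hypothetical retention periods per record kind, in days.
RETENTION_DAYS = {"invoice": 7 * 365, "chat_log": 90}

def due_for_deletion(records, today):
    """Return ids of records whose retention period has elapsed as of `today`."""
    expired = []
    for rec in records:
        keep_for = timedelta(days=RETENTION_DAYS[rec["kind"]])
        on_hold = rec.get("legal_hold", False)  # held records are never deleted
        if rec["created"] + keep_for <= today and not on_hold:
            expired.append(rec["id"])
    return expired

records = [
    {"id": "r1", "kind": "chat_log", "created": date(2019, 1, 1)},
    {"id": "r2", "kind": "invoice", "created": date(2019, 1, 1)},
    {"id": "r3", "kind": "chat_log", "created": date(2019, 1, 1), "legal_hold": True},
]
print(due_for_deletion(records, today=date(2020, 2, 11)))
```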
How the information is created
• Systemic
• Environmental
• Trial-and-error (or ad-hoc)
How is information delivered?
• White paper (a document)
• Web site FAQ
• Web site informational
• Web site directed (link sent with e-mail, and
so on) to a specific Web site
• One-to-one presentation:
– Word of mouth / communication
How the information is managed
• Complexity of the information
• Complexity of the creation process
• Complexity of the management system
• Financial impact of creation
Complexity = Uncertainty?
Type of information created
• Tacit (created and stored informally):
– Human memory
– Localized, e.g. hard drive of the computer
– Movement of tacit information into a formalized
structure
• Explicit (created and stored formally):
– Network shared
– Network Web site/intranet
– Informal knowledge-management system
– Document-management system
– Formal KM system
For information creation:
• Consider the
– Value of the source
– Age of the information
– Source of the information, and previous
interactions with that specific source
Life cycle is a complex issue
• Must be managed
• Documented
Data-Information-Knowledge Ecosystem
[Figure: Data-Information-Knowledge ecosystem; labels: Producers, Consumers, Experience, Context]
Why visualization?
• Reducing amount of data
• Patterns
• Features
• Events
• Trends
• Irregularities
• Exit points for analysis
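Two of the goals above, reducing the amount of data and spotting irregularities, can be sketched without any plotting library. The threshold of two standard deviations and the readings below are hypothetical choices for illustration only.

```python
from statistics import mean, pstdev

def irregularities(series, threshold=2.0):
    """Return (index, value) pairs more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return []  # a constant series has no irregular points
    return [(i, v) for i, v in enumerate(series) if abs(v - mu) / sigma > threshold]

# Hypothetical sensor readings with one obvious spike.
readings = [10, 11, 9, 10, 12, 10, 11, 60, 10, 9]

print("mean:", mean(readings))
print("irregular points:", irregularities(readings))
```

A chart would highlight only the flagged points, giving the viewer an entry point for further analysis.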
Visualization formats
• Many – vector, raster
(image), animation,
multi-dimensional,
However, information, data..
• Assignment 3 - your presentations will be on
semiotics and the visual representations of
information systems – both good and bad
• Not just a matter of the ‘producer’ view…
consider the ‘consumer’ view, i.e. what is the
goal of the visualization?
• This is a time when
– Experience helps a lot
– But so does listening and gaining external
feedback
Remember metadata!
• Many formats already contain metadata or
fields for metadata, use them!
• How do you visualize metadata?
Visualization
Managing visualization products
• The importance of a ‘self-describing’ product
• http://www.smashingmagazine.com/2007/08/02/data-visualization-modern-approaches/
• http://agbeat.com/business-marketing/piktochart-simple-infographic-creator-online-for-the-busy-professional/
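A "self-describing" product carries the metadata needed to interpret it inside the product itself. The sketch below uses JSON as the container; the field names (title, units, source) and the sample values are hypothetical, and many real formats provide their own metadata fields for this purpose.

```python
import json

def make_product(values, title, units, source):
    """Bundle data values with the metadata needed to interpret them."""
    return {
        "metadata": {"title": title, "units": units, "source": source, "count": len(values)},
        "data": list(values),
    }

def describe(product):
    """Build a caption from the embedded metadata alone."""
    meta = product["metadata"]
    return f'{meta["title"]} ({meta["units"]}), n={meta["count"]}, source: {meta["source"]}'

product = make_product([1.2, 0.0, 3.4], "Daily precipitation", "mm", "hypothetical gauge")

# Round-trip through JSON: the description survives without any side files.
restored = json.loads(json.dumps(product))
print(describe(restored))
```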