Types of Data (Contd)
Structured Data
Structured Data is the term used when it is clear what data elements exist, where they exist, and
in what form. In its simplest form, structured data is a fixed record layout; however, there are
also variable record layouts, including XML, that make it clear what data points exist, where, and
in what form.
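As a minimal sketch of the two layouts described above, the snippet below reads the same record once from a fixed-width line and once from XML. The field names, column offsets, and sample values are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fixed record layout: (field name, start offset, end offset).
LAYOUT = [
    ("customer_id", 0, 6),
    ("name", 6, 26),
    ("balance", 26, 36),
]


def parse_fixed_record(line):
    """Slice one fixed-width line into a dict of stripped field values."""
    return {name: line[start:end].strip() for name, start, end in LAYOUT}


# Build a sample line that follows the hypothetical layout exactly.
fixed_line = "000042" + "Ada Lovelace".ljust(20) + "1250.75".rjust(10)
fixed_fields = parse_fixed_record(fixed_line)

# The same record in a variable layout: the tags say what each value is,
# so position and width no longer matter.
xml_record = (
    "<customer>"
    "<customer_id>000042</customer_id>"
    "<name>Ada Lovelace</name>"
    "<balance>1250.75</balance>"
    "</customer>"
)
xml_fields = {child.tag: child.text for child in ET.fromstring(xml_record)}

print(fixed_fields)
print(xml_fields)
```

In both cases the program knows in advance which data elements exist and where to find them, which is what makes the data structured.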
The most common form of structured data is file and database data. This includes the content of
the many databases and files within an enterprise company where there is a formal file layout or
database schema. This data is typically the result of business applications collecting books and
records data for the enterprise.
The next most common form of structured data is machine-generated output (e.g., logs) produced by
various types of software products, such as application systems, database management systems,
networks, and security software. The ability to search, monitor, and analyze machine-generated
output from across the operational environment can provide significant benefit to any large
company.
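The following sketch illustrates searching and analyzing machine-generated output. The log format, host names, and messages are invented for this example; real products each define their own formats, so the pattern would need to be adapted.

```python
import re
from collections import Counter

# Assumed line format: "<date> <time> <LEVEL> <source>: <message>".
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+ \S+) (?P<level>[A-Z]+) (?P<source>\S+): (?P<message>.*)$"
)

# Invented sample output from several hypothetical systems.
SAMPLE_LOG = """\
2024-01-15 10:02:11 INFO app-server: request completed in 35 ms
2024-01-15 10:02:12 ERROR db-server: connection pool exhausted
2024-01-15 10:02:13 WARNING firewall: blocked inbound packet
2024-01-15 10:02:14 ERROR app-server: upstream timeout
"""


def parse_log(text):
    """Return one dict per line that matches the assumed format."""
    entries = []
    for line in text.splitlines():
        match = LOG_PATTERN.match(line)
        if match:
            entries.append(match.groupdict())
    return entries


entries = parse_log(SAMPLE_LOG)

# Monitor: count ERROR entries per source system.
errors_by_source = Counter(
    entry["source"] for entry in entries if entry["level"] == "ERROR"
)
print(errors_by_source)
```

Because every line follows a known layout, each entry can be split into named fields and then filtered or aggregated like any other structured record.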
Unstructured Data
Unstructured Data is the term used when it is not clear what data elements exist, where
they exist, and the form they may be in. Common examples include written or spoken
language, although heuristics can often be applied to discern some sampling of structured data from
them.
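One simple kind of heuristic is pattern matching. The sketch below pulls a date and a currency amount out of a free-text sentence; the sentence and both patterns are assumptions made for illustration, and real text would need far more robust rules.

```python
import re

# Invented example of written language with a few recoverable data points.
text = (
    "Please ship 3 units to the Boston office by 2024-03-01; "
    "the invoice total is $149.50."
)

# Heuristic 1: ISO-style dates (four digits, two digits, two digits).
dates = re.findall(r"\b\d{4}-\d{2}-\d{2}\b", text)

# Heuristic 2: dollar amounts with an optional two-digit cents part.
amounts = re.findall(r"\$\d+(?:\.\d{2})?", text)

print(dates)
print(amounts)
```

The heuristics recover only a sampling of the content: the date and the amount become structured values, while the rest of the sentence stays unstructured.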
The most unstructured data does not even have data elements. These forms of unstructured data
include signal feeds from sensors involving streaming video, sound, radar, radio waves, sonar, light
sensors, and charged particle detectors.
Often some degree of structured data may be known or inferred with even these forms of
unstructured data, such as its time, location, source, and direction.
Semi-Structured Data
Semi-Structured Data is the term used when some data elements are clearly identifiable while the
rest of the content is free-form. The most common forms of semi-structured data include electronic
documents, such as PDFs, diagrams, presentations, word processing documents, and spreadsheet
documents, as distinct from spreadsheets that strictly represent flat files of structured data.
Another common form of semi-structured data includes messages that originate from individuals,
such as email, text messages, and tweets.
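An email message is a convenient illustration: its headers are named fields, while its body is free text. The sketch below uses Python's standard email package on an invented message; the addresses and wording are hypothetical.

```python
from email import message_from_string

# Invented message: header lines, a blank line, then a free-text body.
raw_email = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Quarterly report\n"
    "\n"
    "Hi Bob, the draft is attached. Let me know what you think.\n"
)

message = message_from_string(raw_email)

# The headers behave like structured fields with known names.
structured_part = {
    "from": message["From"],
    "to": message["To"],
    "subject": message["Subject"],
}

# The body is written language with no predefined data elements.
unstructured_part = message.get_payload()

print(structured_part)
print(unstructured_part)
```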
Interval (or Scale) Data are values recorded along a numeric scale, e.g. reading the length
along a ruler, or noting an output reading on a pH meter. The differences (intervals) on the
scale represent true measures, e.g. the difference between 40 degrees Celsius and 80 degrees
Celsius is twice the temperature difference between 0 degrees Celsius and 20 degrees Celsius.
Interval data can be discrete, where values are recorded to a limited precision, or
continuous, in which values can be recorded to any possible precision.
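The arithmetic in the temperature example can be checked with a short sketch. The helper name and the sample meter reading are hypothetical; rounding to one decimal place stands in for an instrument with limited precision.

```python
def interval(low, high):
    """Return the difference between two readings on the same scale."""
    return high - low


# Temperature example from the text, in degrees Celsius.
wide = interval(40, 80)    # 40-degree difference
narrow = interval(0, 20)   # 20-degree difference
ratio = wide / narrow      # how many times larger the first difference is

# Discrete versus continuous recording of the same hypothetical reading.
continuous_reading = 7.38214                     # recorded as precisely as available
discrete_reading = round(continuous_reading, 1)  # limited to one decimal place

print(wide, narrow, ratio)
print(continuous_reading, discrete_reading)
```

The comparison is made between differences, not between the raw readings, which is the property the paragraph above attributes to interval data.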
Ordinal Data is nominal data where the different categories show a sense of