
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/377415936

DATA ANALYTICS FOR CIVIL ENGINEERS_ Module 1

Presentation · January 2024

DOI: 10.13140/RG.2.2.24118.88644

CITATIONS: 0    READS: 1,079

1 author:

Jayaram M.A
RASTA - Center for Road Technology VOLVO Construction Equipment Campus
190 PUBLICATIONS 1,485 CITATIONS


All content following this page was uploaded by Jayaram M.A on 16 January 2024.


RASTA
Center for Road Technology
VOLVO-CE Campus
Peenya Industry, Bengaluru, India

Course Title

DATA ANALYTICS FOR CIVIL ENGINEERS
Course Contributor

Dr.M.A.Jayaram
Professor
RASTA
Module 1
Data and Knowledge
Assessment of Knowledge
Statistics: Descriptive and Inferential
Definition of Data Analytics
Data Analytics Process Models: KDD, SEMMA, CRISP-DM
Methods
Tasks
Tools
RASTA

Data Analytics for Engineers

Course Code : 22CHT335

Module 1: Introduction
Dr.M.A.JAYARAM
Professor
RASTA - Center for Road Technology
VOLVO-CE Campus, Peenya Industrial Area
Bengaluru, India
INTRODUCTION
Data analytics involves the process of converting raw data into actionable insights. It
includes a range of tools, technologies, and processes used to find trends and solve
problems by using data. Data analytics can shape business processes, improve decision-
making, and foster business growth.
Motivation: the availability of huge volumes of data from sensors, the Internet, search engines,
public-domain data sets, archives, etc.
We are drowning in information, but starving for knowledge.

As a consequence, a new research area has been developed, which has become known under
the name of data mining. The goal of this area was to meet the challenge to develop tools that
can help humans find potentially useful patterns in their data and solve the problems they are
facing by making better use of their data.

RASTA, Highway Technology, Elective Course ,Data

Analytics for Civil Engineers_22CHT335
INTRODUCTION: DATA Vs KNOWLEDGE
DATA
The quantities, characters, or symbols on which operations are performed
by a computer, which may be stored and transmitted in the form of electrical
signals and recorded on magnetic, optical, or mechanical recording media.
Examples:
Specific gravity of cement (Gc): 3 - 3.15
Specific gravity of bitumen (Gb): 0.95 - 1.10
Strength of concrete (Sc): 30 - 40 MPa
Traffic volume on a particular road (Tv): 200 vehicles/hour
Plasticity index of soil (PI): 15%
Grade of concrete (CG): M40
Absolute viscosity of VG20 bitumen (Vi): 1600 - 2200 poise at 60 °C
Rutting in a particular road stretch (Rd): 9 - 14 mm
Optimum bitumen content (Obc): 5.4%
CBR: 19%
Grade of steel: Fe415
etc.
In summary: DATA
• Refer to single instances
(single objects, people, events, points in time, materials, etc.)
• Describe individual properties
(compressive strength, OMC, coefficient of thermal expansion, stiffness, viscosity, tenacity)
• Are often available in large amounts
(databases, archives, sensors, measuring instruments, continuous observations, etc.)
• Are often easy to collect or obtain
(e.g., public databases, publications, lab experiments)
• Have utility limited to a particular domain of application
• Do not allow us to make predictions or forecasts
In summary: KNOWLEDGE
• Refers to classes of instances
(sets of objects, people, events, points in time, etc.)
• Describes general patterns, structures, laws, principles, etc.
• Consists of as few statements as possible
(this is an explicit goal)
• Is often difficult and time-consuming to find or to obtain
(e.g., natural laws, education, experience, observation, lab tests, etc.)
• Allows us to make predictions and forecasts
• Is much more valuable than (raw) data
• Its generality and the possibility of making predictions about the properties of new
cases are the main reasons for this superiority
KNOWLEDGE
Awareness of or familiarity with a fact or situation, gained through experience or education.
Knowledge is also defined as justified true belief (JTB). This definition identifies three essential features:
it is (i) a belief that is (ii) true and (iii) justified. Truth is a widely accepted feature of
knowledge.

Knowledge is expressed through statements (spoken or written).

Examples:
• With specific gravity, we can calculate the bearing capacity of soil.
• The specific gravity of straight-run and cut-back bitumen is essential for calculating rates of spread, asphaltic
concrete mix properties, etc.
• Keeping other parameters constant, a lower W/C ratio (0.2 - 0.4) corresponds to a strength range of 40 - 60 MPa.
• The lower the W/C ratio, the higher the strength.
• As per IS 1206, the absolute viscosity of VG 30 paving grade bitumen is 2400 - 3600 poise (at 60 °C) and its kinematic viscosity is
350 mm²/s (cSt).
• The maximum dry unit weight and the optimum moisture content of the sand are 18.698 kN/m³ and 10.271%,
respectively.
• Bitumen has two origins: it can be produced in a petroleum refinery, or it can be extracted from mines as
gilsonite.
• Tar can be obtained through the destructive distillation of organic materials such as coal, wood, peat, or petroleum.
Assessment of the Knowledge
• Not all kinds of knowledge are equally valuable. Not all
general statements are equally important, equally substantial, equally
significant, or equally useful.

• Therefore, knowledge has to be assessed.

• Assessment ensures the relevance of the knowledge and eliminates
irrelevant knowledge.

Criteria for assessment

i. Correctness: through probability, success rate, factual observation, and field
verification.
Examples: Assessed by conducting experiments

The density of bitumen is in the range 1.01 - 1.05 g/cm³ in the temperature
range of 25 °C to 85 °C.

Tar density in this temperature range is higher than that of bitumen and is in the range 1.15
to 1.4 g/cm³.

The asphalt density is 2.2 to 2.4 g/cm³. The higher the density of the asphalt,
the more durable it is.

The pavement condition index (PCI) is a numerical index between 0 and
100, which is used to indicate the general condition of a pavement section:
85-100: Good, 70-85: Satisfactory, < 40: not passable (assessed through
observation of various kinds of distress).
ii. Generality (domain and conditions of validity)
One drawback of a rigid pavement is that its
installation is very costly, although the cost of maintenance is
reasonable.
Flexible pavements are applied in layers. The weakest materials are laid at
the very bottom, whereas the more durable materials are laid at the very
top to ensure the structural integrity and adaptability of the entire
structure.
iii. Usefulness (relevance, predictive power)
Fiber-optic cables embedded in the road detect wear and tear, and
communication between vehicles and roads can improve traffic management.

In a smart road system, data can be obtained from various sources, such as a large number
of sensors, smart cards, satellite systems, cameras, and social networks.
iv. Comprehensibility (simplicity, clarity, parsimony)
• The crushing test is carried out to assess the strength of coarse aggregate when a
compressive load is gradually applied.
• Aggregate crushing value = (W2 / W1) x 100, where W1 = total weight of the dry
sample and W2 = weight of material passing through the 2.36 mm IS sieve.

• The value should not exceed 35% for CC pavements and 45% for wearing surfaces.
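The crushing value formula above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample weights are hypothetical, and the two limits are simply the figures quoted on this slide.

```python
def aggregate_crushing_value(w1_dry_sample: float, w2_passing: float) -> float:
    """Aggregate crushing value in percent: ACV = (W2 / W1) x 100."""
    if w1_dry_sample <= 0:
        raise ValueError("W1 (dry sample weight) must be positive")
    if not 0 <= w2_passing <= w1_dry_sample:
        raise ValueError("W2 must lie between 0 and W1")
    return w2_passing / w1_dry_sample * 100.0


# Hypothetical weights in grams, for illustration only (not test results).
acv = aggregate_crushing_value(w1_dry_sample=3000.0, w2_passing=720.0)

# Limits as quoted on the slide above.
LIMIT_CC_PAVEMENT = 35.0
LIMIT_WEARING_SURFACE = 45.0

print(f"ACV = {acv:.1f}%")
print("Within the quoted CC pavement limit:", acv <= LIMIT_CC_PAVEMENT)
print("Within the quoted wearing surface limit:", acv <= LIMIT_WEARING_SURFACE)
```

Keeping the limits as named constants makes it easy to substitute the values required by whichever specification governs a given project.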

v. Novelty (previously unknown, unexpected)

The solar pavement has become one of the most researched new highway transportation
infrastructures with a goal to transform the road system from the energy consumer to the
energy provider, and eliminate or alleviate pollution from the source of energy of
transforming
Among the data emerging in the field of ITS, visual data are among the most
voluminous kind. Computer Vision studies enable the analysis of both images
and videos and provide detailed information about the traffic situation.

In the domain of science and technology, the focus is on correctness, generality, and
simplicity (parsimony): one way of characterizing science and
technology is to say that it is the search for a minimal correct description of the world.

In the economy and the construction industry, however, the emphasis is placed on usefulness,
comprehensibility, and novelty: the main goal is to gain a competitive edge and thus to
increase revenues.

Nevertheless, neither of the two areas can afford to neglect the other criteria.
What is Data Analytics?
Data analytics converts raw data into actionable insights. It includes a range of tools,
technologies, and processes used to find trends and solve problems by using data. Data
analytics can shape business processes, improve decision-making, and foster business
growth.
By analyzing factors such as weather conditions, traffic patterns, and regulatory
requirements, project managers can make more accurate project schedules and resource
allocations.
Transportation data analytics offer highly granular datasets suitable for complex modelling,
including route information, trip speed, length, and duration, travel mode (e.g. driving
vs. cycling), O-D patterns, and more.
There are two approaches to maintaining roads—either address the damage that has already occurred thus
being reactive, or try and prevent the damage from occurring in the first place as a proactive measure.
Advances in analytics, cloud-based mobile technologies, and the Internet of Things (IoT) are making it possible
for authorities to adopt the latter approach. It not only helps them avoid accidents caused by poor road
conditions but also creates more cost savings that can be diverted toward road maintenance.
Data analytics is often thought of as a kind of statistics.

Statistics has a long history and originated from collecting and analyzing
data about the population and the state.

Statistics has two branches:
Descriptive Statistics
Inferential Statistics
Descriptive Statistics
Descriptive statistics describes states and processes based on observed data.

The main tools to tackle this task are the computation of characteristic measures and
tabular and graphical representations.

Characteristic measures: central tendency, dispersion measures, and distribution measures

Tabular representations: correlation matrix, covariance matrix, etc.

Graphical representations: histograms, pie charts, box plots, mosaic charts,
scatter diagrams, star plots, spider plots, parallel coordinates, etc.
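As a minimal sketch of the characteristic measures, the snippet below uses Python's standard `statistics` module. The rutting and PCI values are the five rows of the sample table shown later in this module (units are not stated there); the helper name `describe` is an assumption made for this example.

```python
import statistics as st

# Five sample stretches, values taken from the sample table in the KDD section.
rutting = [5.0, 6.1, 12.4, 10.2, 4.1]
pci = [50, 65, 40, 48, 70]


def describe(values):
    """Return central tendency and dispersion measures for one attribute."""
    q1, _, q3 = st.quantiles(values, n=4)  # quartiles (default 'exclusive' method)
    return {
        "mean": st.mean(values),
        "median": st.median(values),
        "stdev": st.stdev(values),          # sample standard deviation
        "range": max(values) - min(values),
        "iqr": q3 - q1,                     # interquartile range
    }


summary = {"rutting": describe(rutting), "pci": describe(pci)}
for name, measures in summary.items():
    print(name, {k: round(v, 2) for k, v in measures.items()})
```

With only five observations these figures are indicative at best; the point is the kind of summary, not the numbers.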
Figures: Commuters' satisfaction with public transportation; Evolution of transport
Inferential Statistics
Inferential statistics is a branch of statistics that makes use of various analytical tools
to draw inferences about the population from sample data.

It helps in making generalizations about the population by using various analytical tests
and tools.

Inferential statistics can be classified into hypothesis testing and regression analysis.
Hypothesis testing also includes the use of confidence intervals to test the parameters of a
population.
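A small sketch of both ideas (a confidence interval and a hypothesis test) is given below. The strength values are hypothetical, and the normal approximation is an assumption made to stay within the standard library; for a sample this small a Student t quantile would give a somewhat wider interval.

```python
import math
import statistics as st

# Hypothetical 28-day cube strengths in MPa (illustrative, not measured data).
sample = [38.2, 41.5, 39.8, 40.6, 37.9, 42.1, 40.3, 39.4, 41.0, 38.7]

n = len(sample)
mean = st.mean(sample)
std_err = st.stdev(sample) / math.sqrt(n)   # standard error of the mean

# 95% confidence interval for the population mean (normal approximation).
z = st.NormalDist().inv_cdf(0.975)
ci_low, ci_high = mean - z * std_err, mean + z * std_err

# Two-sided test of H0: population mean = 40 MPa.
mu0 = 40.0
z_stat = (mean - mu0) / std_err
p_value = 2 * (1 - st.NormalDist().cdf(abs(z_stat)))

print(f"mean = {mean:.2f} MPa, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
print(f"z = {z_stat:.2f}, p = {p_value:.2f}")
```

If `mu0` lies inside the interval, the test does not reject H0 at the 5% level, which is the link between confidence intervals and hypothesis testing mentioned above.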

Examples

The study of the statistical significance of the predictor variables (e.g., “age group,”
“manufacturer,” “season,” and “color”) on the luminous intensities of traffic signals

The study on the pattern of crashes involving young drivers.

Travel Time reliability in urban areas/ metropolitan cities

Other methods
Exploratory Data Analysis: concerned with generating hypotheses from the collected data. It makes
no, or at least considerably weaker, model assumptions about the data-generating process. Its
methods are mostly universal ones designed to achieve a certain goal, but they are not based
on a rigorous model as in inferential statistics.

Data Mining: concerned with finding patterns in huge databases. For this, certain data mining
algorithms and tools are applied. With these, any kind of desired knowledge could be squeezed out of
a given database automatically, with little or no human interference.

Knowledge Discovery in Databases (KDD): as every project is unique, we need specific approaches.
KDD is a process of identifying valid, novel, potentially useful, and ultimately understandable patterns
in data.

Data Analytics: coined by David Hand (1997). Data analytics is the collection, transformation, and
organization of data in order to draw conclusions, make predictions, and drive informed decision
making.
Data analytics is a multidisciplinary field that employs a wide range of analysis techniques,
including math, statistics, and computer science, to draw insights from data sets. Data
analytics is a broad term that includes everything from simply analyzing data to theorizing
ways of collecting data and creating the frameworks needed to store it.

Data analytics is important across many industries, as many business leaders use data to
make informed decisions. An expert in smart health monitoring of structures might look
at sensor data to determine which components of a structure can remain in service with
simple repairs and which should be dismantled. An environmental engineer may look at
ambient atmospheric data gathered through devices to determine the degree of
pollution and hence to devise possible remedial methods.

Ex. Health care analytics, business analytics, inventory analytics, material behavior
analytics, visual data analytics, structural health data analytics, pavement
deterioration data analytics, etc.
Data analytics requires a wide range of skills to be performed effectively:
Structured Query Language (SQL): a programming language commonly used for databases

Statistical programming languages: such as R and Python, commonly used to create advanced
data analysis programs

Machine learning: a branch of artificial intelligence that involves using algorithms to spot data
patterns and enabling computers/devices to learn from the given data.

Probability and statistics: in order to better analyze and interpret data trends

Data management: or the practices around collecting, organizing and storing data

Data visualization: or the ability to use charts and graphs to tell a story with data

Econometrics: or the ability to use data trends to create mathematical models that forecast
future trends.
DATA ANALYTICS PROCESS MODELS
Process models are meant to establish standards in analytic processes, methods,
and tasks for academics, researchers, and people in industry alike.

The models center on the attempt to formulate a general framework for data analysis.

Three models are popular:

Knowledge Discovery in Databases (KDD)

Sample, Explore, Modify, Model, Assess (SEMMA)

CRoss-Industry Standard Process for Data Mining (CRISP-DM)
The KDD Process Model
KDD, as presented in (1996), is the process of using data mining methods to extract
what is deemed knowledge, using a database along with any required preprocessing,
subsampling, and transformation of the database.

Stage 1: Selection – This stage consists of creating a target data set, or focusing on a subset of
variables or data samples, on which discovery is to be performed.
Sample Key   Pots    Rutting   Raveling   Alligator Cracks   PCI
HTB_20       2       5.0       1          2.0                50
HTB_21       3       6.1       3          3.5                65
HTB_22       1       12.4      4          4.6                40
HTB_23       0       10.2      2          7.2                48
HTB_24       5       4.1       5          3.2                70
-----        -----   -----     -----      -----              -----
-----        -----   -----     -----      -----              -----
Stage 2: Preprocessing – This stage consists of cleaning and preprocessing the target data in
order to obtain consistent data.
Stage 3: Transformation – This stage consists of transforming the data using
dimensionality reduction or transformation methods.
Stage 4: Data Mining – This stage consists of searching for patterns of interest in a
particular representational form, depending on the data mining objective (usually, prediction).
Stage 5: Interpretation/Evaluation – This stage consists of the interpretation and evaluation of
the mined patterns.

The KDD process must be preceded by the development of an understanding of the application
domain, the relevant prior knowledge, and the goals of the end user. It must also be followed
by knowledge consolidation, in which this knowledge is incorporated into the system.
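The five KDD stages can be walked through on the sample table from Stage 1. The sketch below is one possible reading, not a prescribed implementation: the record HTB_99 is hypothetical (added so that the cleaning step has something to do), min-max scaling stands in for the transformation stage, and a least-squares line stands in for the data mining stage.

```python
# Records from the sample table; HTB_99 is hypothetical and has a missing value.
records = [
    {"key": "HTB_20", "pots": 2, "rutting": 5.0, "raveling": 1, "alligator": 2.0, "pci": 50},
    {"key": "HTB_21", "pots": 3, "rutting": 6.1, "raveling": 3, "alligator": 3.5, "pci": 65},
    {"key": "HTB_22", "pots": 1, "rutting": 12.4, "raveling": 4, "alligator": 4.6, "pci": 40},
    {"key": "HTB_23", "pots": 0, "rutting": 10.2, "raveling": 2, "alligator": 7.2, "pci": 48},
    {"key": "HTB_24", "pots": 5, "rutting": 4.1, "raveling": 5, "alligator": 3.2, "pci": 70},
    {"key": "HTB_99", "pots": 1, "rutting": None, "raveling": 2, "alligator": 3.0, "pci": 55},
]

# Stage 1, Selection: focus on a subset of variables.
selected = [{k: r[k] for k in ("key", "rutting", "pci")} for r in records]

# Stage 2, Preprocessing: drop records with missing values.
clean = [r for r in selected if None not in r.values()]

# Stage 3, Transformation: min-max scale rutting to the range [0, 1].
lo = min(r["rutting"] for r in clean)
hi = max(r["rutting"] for r in clean)
for r in clean:
    r["x"] = (r["rutting"] - lo) / (hi - lo)

# Stage 4, Data mining: least-squares line pci = intercept + slope * x.
n = len(clean)
mx = sum(r["x"] for r in clean) / n
my = sum(r["pci"] for r in clean) / n
slope = sum((r["x"] - mx) * (r["pci"] - my) for r in clean) / sum(
    (r["x"] - mx) ** 2 for r in clean
)
intercept = my - slope * mx

# Stage 5, Interpretation/Evaluation: coefficient of determination R^2.
ss_res = sum((r["pci"] - (intercept + slope * r["x"])) ** 2 for r in clean)
ss_tot = sum((r["pci"] - my) ** 2 for r in clean)
r_squared = 1 - ss_res / ss_tot

print(f"pci = {intercept:.1f} + ({slope:.1f}) * scaled rutting, R^2 = {r_squared:.2f}")
```

A negative slope says that PCI falls as rutting grows in these five rows; with so few records the pattern would still need the domain check that the paragraph above calls for.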

KDD FLOW DIAGRAM

SEMMA PROCESS
SEMMA, developed and standardized by the SAS Institute, considers a cycle with five stages for the
process. SAS Institute is an American multinational developer of analytics software based in
Cary, North Carolina. SAS develops and markets a suite of analytics software, which helps
access, manage, analyze, and report on data to aid in decision-making.

Stage 1: Sample – This stage consists of sampling the data by extracting a portion of a large
data set big enough to contain the significant information, yet small enough to manipulate
quickly. This stage is pointed out as being optional.

Stage 2: Explore – This stage consists of exploring the data by searching for
unanticipated trends and anomalies in order to gain understanding and ideas.

Stage 3: Modify – This stage consists of modifying the data by creating, selecting,
and transforming the variables to focus the model selection process.
Stage 4: Model – This stage consists of modeling the data by allowing
the software to search automatically for a combination of data that
reliably predicts a desired outcome.

Stage 5: Assess – This stage consists of assessing the data by
evaluating the usefulness and reliability of the findings from the data
mining process and estimating how well it performs. Here, model
performance is evaluated against the test data.

SEMMA offers an easy-to-understand process, allowing an organized
and adequate development and maintenance of data mining
projects. It thus provides a structure for their conception, creation, and
evolution, helping to present solutions.
CRISP-DM Model
The CRISP-DM process was developed through the efforts of a consortium initially
composed of DaimlerChrysler, SPSS (Statistical Package for the Social Sciences, IBM), and
NCR. CRISP-DM stands for CRoss-Industry Standard Process for Data Mining. It consists of a
cycle that comprises six stages.
Stage 1. Domain / Business Understanding
The domain understanding phase focuses on understanding the objectives and requirements of the project. This
phase has four tasks; three of them are foundational project management activities that are universal to most projects:
Determine objectives: You should first “thoroughly understand, from a business perspective, what the project
engineer really wants to accomplish,” and then define project success criteria.
Assess situation: Determine resource availability and project requirements, assess risks and contingencies, and
conduct a cost-benefit analysis.
Determine data mining goals: In addition to defining the business objectives, you should also define what success
looks like from a technical data mining perspective.
Produce project plan (the outcome of this stage): Select technologies and tools and define detailed plans for each project
phase. While many teams hurry through this phase, establishing a strong domain understanding is like building the
foundation of a structure.
CRISP-DM PROCESS MODEL FLOW DIAGRAM
Stage 2. Data Understanding

The focus is to identify, collect, and analyze the data sets that can help you accomplish the
project goals. This phase also has four tasks:

Collect initial data: Acquire the necessary data and (if necessary) load it into your analysis tool.
Describe data: Examine the data and document its surface properties like data format, number
of records, or field identities.
Explore data: Dig deeper into the data. Query it, visualize it, and identify relationships among
the data.
Verify data quality: How clean/dirty is the data? Document any quality issues.

If the data are not understood properly, the domain understanding stage has to be
revisited.
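The "describe", "explore", and "verify data quality" tasks above might look like the following in practice. This is a sketch using only the standard library; the CSV text reuses the sample pavement table from the KDD section, with one rutting value deliberately blanked (an assumption made here) so that the quality check has something to report.

```python
import csv
import io

# Sample pavement table as CSV text; the rutting value of HTB_23 has been
# blanked on purpose for this illustration.
raw = """key,pots,rutting,raveling,alligator,pci
HTB_20,2,5.0,1,2.0,50
HTB_21,3,6.1,3,3.5,65
HTB_22,1,12.4,4,4.6,40
HTB_23,0,,2,7.2,48
HTB_24,5,4.1,5,3.2,70
"""

# Collect initial data: load it into the analysis tool (here, a list of dicts).
rows = list(csv.DictReader(io.StringIO(raw)))

# Describe data: number of records and field identities.
n_records = len(rows)
fields = list(rows[0].keys())

# Verify data quality: count missing values per field.
missing = {f: sum(1 for r in rows if r[f] == "") for f in fields}

# Explore data: value range of one numeric field, skipping blanks.
rutting = [float(r["rutting"]) for r in rows if r["rutting"] != ""]

print(f"{n_records} records, fields: {fields}")
print("missing values per field:", missing)
print(f"rutting ranges from {min(rutting)} to {max(rutting)}")
```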

Stage 3. Data Preparation

A common observation is that 80% of the project is data preparation.

This phase, which is often referred to as “data munging”, prepares the final data set(s) for
modeling. It has five tasks:
Select data: Determine which data sets will be used and document reasons for
inclusion/exclusion.
Clean data: Often this is the lengthiest task. Without it, it will be garbage-in, garbage-out. A
common practice during this task is to correct, impute, or remove erroneous values.
Construct data: Derive new attributes that will be helpful, for example, angular ratio, PI, etc.
Integrate data: Create new data sets by combining data from multiple sources.
Format data: Re-format data as necessary. For example, you might convert string values that
store numbers to numeric values so that you can perform mathematical operations.
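Three of these tasks (format, clean, construct) can be sketched on a small soil data set. The samples and field names below are hypothetical; the derived attribute uses the standard definition of plasticity index, PI = liquid limit - plastic limit, and median imputation is just one of several reasonable cleaning choices.

```python
import statistics as st

# Hypothetical Atterberg limit results stored as text, one value missing.
raw = [
    {"id": "S1", "liquid_limit": "42", "plastic_limit": "24"},
    {"id": "S2", "liquid_limit": "38", "plastic_limit": ""},
    {"id": "S3", "liquid_limit": "51", "plastic_limit": "27"},
    {"id": "S4", "liquid_limit": "45", "plastic_limit": "25"},
]


def to_float(text):
    """Format data: convert a numeric string to float, a blank to None."""
    return float(text) if text.strip() else None


prepared = [
    {"id": r["id"], "ll": to_float(r["liquid_limit"]), "pl": to_float(r["plastic_limit"])}
    for r in raw
]

# Clean data: impute missing plastic limits with the median of observed values.
pl_median = st.median(r["pl"] for r in prepared if r["pl"] is not None)
for r in prepared:
    if r["pl"] is None:
        r["pl"] = pl_median

# Construct data: derive the plasticity index as a new attribute.
for r in prepared:
    r["pi"] = r["ll"] - r["pl"]

for r in prepared:
    print(r)
```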

Stage 4 : Modeling

In this stage, various models are developed and assessed based on several different
modeling techniques. This phase has four tasks:
Select modeling techniques: Determine which algorithms to try (e.g. regression, neural net).
Generate test design: Depending on your modeling approach, you might need to split the data into
training, test, and validation sets.
Build model: Develop the model(s) so that they are available for assessment.
Assess model: Generally, multiple models are competing against each other, and the data
scientist needs to interpret the model results based on domain knowledge, the pre-defined
success criteria, and the test design.

If the model is not acceptable, data preparation is to be revisited/redone.
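The test design and model assessment tasks can be illustrated as follows. Everything here is an assumption made for the sketch: the data are synthetic (a made-up linear W/C-strength trend plus random noise), the split is 60/20/20, and the two competing models are a mean baseline and a least-squares line compared by RMSE.

```python
import math
import random

random.seed(42)  # fixed seed so that the split is reproducible

# Synthetic (made-up) data: W/C ratio -> strength in MPa, linear trend plus noise.
data = []
for i in range(30):
    wc = 0.30 + 0.01 * i
    data.append((wc, 75.0 - 60.0 * wc + random.gauss(0.0, 1.0)))

# Generate test design: shuffle, then split into training/validation/test sets.
random.shuffle(data)
train, valid, test = data[:18], data[18:24], data[24:]


def fit_line(points):
    """Least-squares fit y = a + b * x; returns (a, b)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    slope = sum((x - mx) * (y - my) for x, y in points) / sum(
        (x - mx) ** 2 for x, _ in points
    )
    return my - slope * mx, slope


def rmse(predict, points):
    """Root mean squared error of a prediction function on a data set."""
    return math.sqrt(sum((predict(x) - y) ** 2 for x, y in points) / len(points))


# Build models on the training set only.
mean_y = sum(y for _, y in train) / len(train)
a, b = fit_line(train)
models = {
    "mean baseline": lambda x: mean_y,
    "linear regression": lambda x: a + b * x,
}

# Assess models on the validation set, then report the winner on the test set.
valid_scores = {name: rmse(model, valid) for name, model in models.items()}
best = min(valid_scores, key=valid_scores.get)
test_rmse = rmse(models[best], test)

print({name: round(score, 2) for name, score in valid_scores.items()})
print(f"selected: {best}, test RMSE = {test_rmse:.2f} MPa")
```

The test set is touched only once, after model selection, so that the reported error is not biased by the choice between models.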

Stage 5 : Evaluation

The Evaluation phase looks more broadly at which model best meets the business needs and what
to do next. This phase has three tasks:

Evaluate results: Do the models meet the business success criteria? Which one(s) should we
approve for the business?

Review process: Review the work accomplished. Was anything overlooked? Were all steps
properly executed? Summarize findings and correct anything if needed.

Determine next steps: Based on the previous two tasks, determine whether to proceed to
deployment, iterate further, or initiate new projects.

If the model fails, proves to be erroneous, or does not work for new data, the process is
to be repeated from domain understanding.
Stage 6 : Deployment

Depending on the requirements, the deployment phase can be as simple as generating a
report or as complex as implementing a software system suited to the enterprise/project.
This phase has four tasks:

Plan deployment: Develop and document a plan for deploying the model.

Plan monitoring and maintenance: Develop a thorough monitoring and maintenance plan to
avoid issues during the operational phase (or post-project phase) of a model.

Produce final report: The project team documents a summary of the project which might
include a final presentation of data mining results.

Review project: Conduct a project retrospective about what went well, what could have been
better, and how to improve in the future.

Comparison of KDD, SEMMA and CRISP-DM

Methods, Tasks, and Tools
Every data analysis problem is different.

To avoid the effort of inventing a completely new solution for each problem, it is helpful to
think of different problem categories and consider them as building blocks from which a
solution may be composed.

Methods
Classification
Regression
Clustering/Segmentation
Association
Deviation Analysis
Classification
Predict the outcome of an experiment with a finite number of possible results.
Predict a class label for an entity/sample/material/structure.
Examples: Yes/No, Class1/Class2/Class3, Not distressed/Distressed/Severely
distressed,
Palatable/Not palatable/Neutral, Moderately durable/Durable/Highly durable,
High/Moderate/Low (angularity, workability, strength).

We may be interested in a prediction because the true result will emerge only in the future
or because it is expensive, difficult, or cumbersome to determine it.

Is this material worthy of consideration/selection?

Is the distress level of this stretch of highway acceptable?
Does the material/component have adequate service life?
Does the material possess the required property (angularity, workability, tensile strength, etc.)?
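As a sketch of a classification method, the snippet below assigns a distress class to a road stretch with a k-nearest-neighbours vote. The labelled stretches, the two features, and the choice of k-NN are all assumptions made for illustration; any classifier could take its place.

```python
import math
from collections import Counter

# Hypothetical labelled stretches: (rutting in mm, cracked area in %) -> class.
training = [
    ((3.0, 1.0), "Not distressed"),
    ((4.0, 2.0), "Not distressed"),
    ((5.0, 1.5), "Not distressed"),
    ((9.0, 5.0), "Distressed"),
    ((10.0, 6.0), "Distressed"),
    ((11.0, 5.5), "Distressed"),
    ((16.0, 11.0), "Severely distressed"),
    ((17.0, 12.0), "Severely distressed"),
    ((18.0, 10.5), "Severely distressed"),
]


def classify(features, k=3):
    """Return the majority class among the k nearest labelled stretches."""
    nearest = sorted(training, key=lambda item: math.dist(item[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


for stretch in [(4.2, 1.8), (9.5, 5.2), (17.5, 11.0)]:
    print(stretch, "->", classify(stretch))
```

In a real study the features would be scaled to comparable ranges before distances are computed, since an attribute with large numeric values would otherwise dominate the vote.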
Regression
Regression is, just like classification, also a prediction task, but here the value of interest
is numerical in nature. Regression equations may be developed.

How will the strength of concrete develop?

How much will it cost to build a certain piece of infrastructure 5 years from now?

What will be the degree of distress/deterioration of a structure 5 years from now?

What will be the optimum bitumen content, given other parameter values?

What is the maximum duration of a project, given the values of other parameters?
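The first question above (strength development) can be turned into a small regression sketch. The observations are hypothetical, and the model form strength = a + b ln(age) is an assumed empirical relation chosen for illustration, not a code-specified formula.

```python
import math

# Hypothetical (age in days, compressive strength in MPa) observations.
observations = [(3, 16.0), (7, 24.0), (14, 30.0), (28, 36.0)]

# Least-squares fit of strength = a + b * ln(age).
xs = [math.log(age) for age, _ in observations]
ys = [strength for _, strength in observations]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = my - b * mx


def predict_strength(age_days):
    """Predicted strength in MPa at the given age, from the fitted equation."""
    return a + b * math.log(age_days)


print(f"strength = {a:.2f} + {b:.2f} * ln(age)")
print(f"predicted 56-day strength: {predict_strength(56):.1f} MPa")
```

Predicting at 56 days extrapolates beyond the observed ages, so the result should be read with more caution than a prediction inside the 3 to 28 day range.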

Clustering / Segmentation
Summarize the data to get a better overview by forming groups of similar cases (called
clusters or segments).

Instead of examining a large number of similar data, we need to inspect the group
summary only.

We may also obtain some insight into the structure of the whole data set. Cases that do
not belong to any group may be considered abnormal or outliers.
Do the data about viscosity, shear strength, and flowability of bitumen divide into different
groups?
Can fly ashes be grouped into three groups based on their chemical, physical, and morphological
features?
Group-specific properties of materials.
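A minimal clustering sketch is shown below: a one-dimensional k-means on hypothetical bitumen viscosity readings. The data, the choice of k = 3, and the deterministic starting centroids are all assumptions made for the example; the function expects k of at least 2.

```python
def kmeans_1d(values, k, iterations=100):
    """Plain k-means on one attribute; returns (centroids, clusters)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids over the sorted values.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        updated = [
            sum(c) / len(c) if c else centroids[j] for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


# Hypothetical absolute viscosity readings in poise (illustrative only).
viscosity = [820, 1650, 2500, 850, 1700, 2600, 900, 1800, 2450]
centroids, clusters = kmeans_1d(viscosity, k=3)

for centre, members in zip(centroids, clusters):
    print(f"group around {centre:.0f} poise: {sorted(members)}")
```

Each group can then be summarized by its centroid instead of inspecting every reading, which is the overview that clustering is meant to provide.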
Association Analysis
Find any correlations or associations to better understand or describe the
interdependencies of all the attributes.

The focus is on relationships between all attributes rather than on a single
target variable or the cases.

Which two attributes go together in increasing the strength of
concrete/bituminous mix/asphalt?

How do superplasticizer and chemical admixtures influence the properties of SCC?

What are the significant attributes that contribute to the good quality of material xx?
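One simple way to look at all attributes at once is a correlation matrix. The sketch below computes Pearson correlations for the five rows of the sample pavement table from the KDD section; using correlation as the association measure is an assumption of this example, and with only five rows the coefficients are indicative at best.

```python
import math

# Columns of the sample pavement table (five stretches).
table = {
    "pots": [2, 3, 1, 0, 5],
    "rutting": [5.0, 6.1, 12.4, 10.2, 4.1],
    "raveling": [1, 3, 4, 2, 5],
    "alligator": [2.0, 3.5, 4.6, 7.2, 3.2],
    "pci": [50, 65, 40, 48, 70],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


names = list(table)
corr = {a: {b: pearson(table[a], table[b]) for b in names} for a in names}

# Print the matrix, then the most strongly associated pair of attributes.
print(" " * 10 + "".join(f"{name:>10}" for name in names))
for a in names:
    print(f"{a:>10}" + "".join(f"{corr[a][b]:>10.2f}" for b in names))

pairs = [(abs(corr[a][b]), a, b) for i, a in enumerate(names) for b in names[i + 1:]]
strength, attr_a, attr_b = max(pairs)
print(f"strongest association: {attr_a} and {attr_b} (|r| = {strength:.2f})")
```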

Deviation Analysis
Knowing already the major trends/structures in a material/process/cost, find any
exceptional subgroup that behaves differently with respect to some target attribute.

Under which circumstances does the material/component of a structure/pavement behave
differently?

Which properties do those materials share that do not follow the pattern?
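A simple residual-based sketch of this idea: fit the known major trend, then flag the cases that lie far from it. The mixes below are hypothetical, the trend is assumed to be linear, and the threshold of two residual standard deviations is an arbitrary choice for illustration. It flags individual cases; finding what the flagged cases have in common would be the next step.

```python
import statistics as st

# Hypothetical mixes: (name, W/C ratio, 28-day strength in MPa).
mixes = [
    ("M1", 0.30, 57.5), ("M2", 0.35, 53.6), ("M3", 0.40, 51.2), ("M4", 0.45, 47.7),
    ("M5", 0.50, 36.0), ("M6", 0.55, 42.3), ("M7", 0.60, 38.8), ("M8", 0.65, 36.2),
]

# Major trend: least-squares line strength = a + b * wc.
n = len(mixes)
mx = sum(wc for _, wc, _ in mixes) / n
my = sum(s for _, _, s in mixes) / n
b = sum((wc - mx) * (s - my) for _, wc, s in mixes) / sum(
    (wc - mx) ** 2 for _, wc, _ in mixes
)
a = my - b * mx

# Deviation: residual of each mix from the trend.
residuals = {name: s - (a + b * wc) for name, wc, s in mixes}
spread = st.pstdev(list(residuals.values()))

# Flag mixes whose residual exceeds two standard deviations (assumed threshold).
flagged = [name for name, res in residuals.items() if abs(res) > 2 * spread]

print(f"trend: strength = {a:.1f} + ({b:.1f}) * wc")
print("mixes deviating from the trend:", flagged)
```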
The most frequent categories in transportation data analytics are classification and
regression, because decision making always becomes much easier if reliable predictions of
the near future are available.
When a completely new area or domain (such as ITS, IoT, or Smart City) is explored, cluster analysis
and association analysis may help to identify relationships among attributes or instances.
Once the major relationships are understood (e.g., by a domain expert), a deviation analysis
can help to focus on exceptional situations that deviate from regularity.
Major Tasks
Pattern Finding
If the domain (and therefore the data) is new to us or if we expect to find interesting
relationships, we explore the data for new, previously unknown patterns.
For this, we may apply methods from, for instance, segmentation, clustering,
association analysis, or deviation analysis.
Techniques available:
Statistical Techniques
Structural Techniques
Template Matching
Neural Network Approach
Fuzzy Model
Hybrid Models
Examples of patterns:
Crack length, width, and number over a unit area
Variation in rutting over a stretch of highway
Prediction of serviceability aspects of pavement
Clustering of highway distress and traffic flow intensities based
on related attributes
Finding Explanations
We have a special interest in some target variable and wonder why and how it varies from case
to case.
The primary goal is to gain new insights (knowledge) that may influence our decision
making, but we do not necessarily intend automation.
We may apply methods from classification, regression, association analysis, or deviation
analysis.

Examples:
The variation of vehicular registration
The variation of cement production
The variation of strength of concrete/rigid pavement in relation to several influencing
parameters
Finding Predictors
We have a special interest in the prediction of some target variable, but it (possibly)
represents only one building block of our full problem, so we do not really care about the
how and why but are just interested in the best-possible prediction.
We may apply methods from classification or regression.
Examples:
• Prediction of 28-day strength of concrete, using large data sets.
• Prediction of optimal bitumen content in bituminous mixes.
• Prediction of infrastructure (road, industrial, railway, metro) expenditure in India for the next 5 years.
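When only the quality of the prediction matters, a model is typically judged on data it has not seen. The sketch below uses a k-nearest-neighbour regressor and reports the root mean squared error (RMSE) on held-out mixes. The functions knn_predict and rmse, the choice of k = 2, and all data values are assumptions for illustration only.

```python
from math import sqrt

def knn_predict(train, query, k=2):
    """Average the targets of the k training points nearest to `query`.

    `train` is a list of (feature, target) pairs with a single feature.
    """
    nearest = sorted(train, key=lambda pair: abs(pair[0] - query))[:k]
    return sum(target for _, target in nearest) / k

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

# Hypothetical data: (water-cement ratio, 28-day strength in MPa).
train = [(0.40, 48.0), (0.45, 43.5), (0.50, 39.0), (0.55, 35.5), (0.60, 31.0)]
held_out = [(0.42, 46.0), (0.58, 33.0)]

predictions = [knn_predict(train, wc) for wc, _ in held_out]
error = rmse([strength for _, strength in held_out], predictions)
print(f"held-out RMSE = {error:.2f} MPa")
```

Unlike the explanatory setting, nothing here says why strength varies; the model is kept or discarded purely on its held-out error.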

Available tools (several of them open source):

RStudio, Weka, Neuro Intelligence, Tableau, RapidMiner, KNIME, Pentaho
Data Analysis with Open Source Tools
by Philipp K. Janert, 2010, O'Reilly Media, Inc.
