De MSC Handbook 2022
Data Engineering
Master of Science
Subject-specific Examination Regulations for Data Engineering
The subject-specific examination regulations for Data Engineering are defined by this program
handbook and are valid only in combination with the General Examination Regulations for Master
degree programs (“General Master Policies”).
This handbook also contains the program-specific Study and Examination Plan (Appendix 1).
Upon graduation students in this program will receive a Master of Science (MSc) degree with a scope
of 120 ECTS credit points (for specifics see chapter 2 of this handbook).
http://www.jacobs-university.de/data-engineering
Fall 2022 – V1, Sep 01, 2022
May 22, 2019: V1 originally approved by the Academic Senate
Contents
Network Approaches in Biology and Medicine
Applied Dynamical Systems
Remedial Modules
Discovery Area (15 CP)
Current Topics in Data Engineering
Advanced Project 1
Advanced Project 2
Career Area (15 CP)
Language Skills
Academic Writing Skills/Intercultural Training
Communication & Presentation Skills for Executives
Ethics and the Information Revolution
Master Thesis (30 CP)
5 Appendices
Study and Examination Plan
Intended Learning Outcomes Assessment-Matrix
1 Program Overview
Concept
Today we are “drowning in data and starving for information”, while acknowledging that “data is the
new gold”. However, deriving value from all the data now available requires a transformation in data
analysis, in how we see, maintain, share and understand data. Data Engineering is an emerging
profession concerned with the task of acquiring large collections of data and extracting insights from
them. It is driving the next generation of technological innovation and scientific discovery, which is
expected to be strongly data-driven.
The graduate program in Data Engineering offers a fascinating and profound insight into the methods
and technologies of this rapidly growing area. The program combines the big data aspects of “Data
Analytics” as well as of “Data Science” with the technological challenges of data acquisition,
curation, and management. Thus, the program provides the essentials for paving the way to a
successful career: computer skills and mathematical understanding paired with practical experience
in selected application fields.
The program is embedded into the School of Computer Science & Engineering at Jacobs University.
This school investigates the mobility of people, goods, and information. Even though the Data
Engineering program is centered in the School of Computer Science & Engineering, it includes
contributions from and supports applications in the two other research schools: the School of Science
(bioactive substances) and the School of Business, Social & Decision Sciences (in modern societies).
Moreover, the Data Engineering program attracts students with diverse career goals, backgrounds,
and prior work experience. Therefore, the program offers four focus tracks within which the students
can choose to specialize further: Computer Science, Geo-Informatics, Bio-Informatics and Business
& Supply Chain Engineering. These tracks are a preparation for the Advanced Projects within the
Discovery Area and the Master Thesis.
In particular, the Computer Science specialization track gives students the skills to go beyond the
mere use of existing toolboxes and to develop innovative data analysis techniques of their own design.
Another specialization track is Bioinformatics and the analysis of biomedical data. Integration and
model-based interpretation of high-throughput data are severe bottlenecks in biomedical and
pharmaceutical research. Data Engineering prepares students for the novel computational challenges
in these fields.
A third specialization track is Geo-Informatics, which provides an introduction to Geographic
Information System techniques, principles of spatial analysis, and data mining with integration of
remote sensing and GPS. It thereby provides an early exposure to earth science data and its handling.
Students can also choose the specialization track of Business & Supply Chain Engineering. A vast
amount of data is collected as part of business processes, in particular along supply chains. In this
specialization track, students concentrate on the full data analysis cycle, including pre-processing
of data, data analysis, and deployment of model results within the business process.
The graduate program in Data Engineering is tailored to a diverse student body (see also Section 1.3)
with a wide variety of interests, academic backgrounds, and previous experiences. Small group sizes,
a low student-to-teacher ratio, and personalized supervision/advising allow the program to cater to the
21-year-old student who has just graduated with a Bachelor's degree, as well as to a person who has
already been employed in a data-intensive company and who wants to keep up with current data
engineering practices.
Qualification Aims
Educational Aims
The program aims to provide an in-depth understanding of the essential aspects of data-based
decision-making and the skills required to apply and implement these powerful methods in a
successful and responsible manner. Apart from the necessary programming skills, this comprises:
▪ methods of data acquisition both from the internet and from sensors;
▪ methods to efficiently store and access data in large and distributed databases;
▪ statistical model building including a wide range of data mining methods, signal
processing, and machine learning techniques;
▪ visualization of relevant information;
▪ construction and use of confidence intervals, hypothesis testing, and sensitivity analyses;
▪ the legal foundations of Data Engineering;
▪ scientific qualification;
▪ competence to take up qualified employment in Data Engineering;
▪ competence for responsible involvement in society;
▪ personal growth.
Intended Learning Outcomes
Upon completion of this program, students will be able to:
▪ critically assess and creatively apply technological possibilities and innovations driven by
big data;
▪ use sensors and microcontrollers to collect data and to transmit them to databases on
servers or the internet in general;
▪ set up and use databases to efficiently and securely manage and access large amounts of
data;
▪ apply statistical concepts and use statistical models in the context of real-life data
analytics;
▪ use, adapt and improve visualization techniques to support data-based decision-making;
▪ design, implement and exploit various representations of data for classification and
regression including supervised machine learning methods and core ideas of deep
learning;
▪ apply and critically assess data acquisition methods and analytical techniques in real life
situations, organizations and industries;
▪ independently investigate complex problems and undertake scientific or applied research
into a specialist area utilizing appropriate methods, also taking methods and insights of
other disciplines into account;
▪ professionally communicate their conclusions and recommendations, the underlying
information and their reasons to both specialists and non-specialists, clearly and
unambiguously on the basis of the state of research and application;
▪ assess and communicate social, scientific and ethical insights that also derive from the
application of their knowledge and their decisions;
▪ engage ethically with the academic, professional and wider communities and actively
contribute to a sustainable future;
▪ take responsibility for their own learning, personal development, and role in society,
evaluating critical feedback and self-analysis;
▪ take on lead responsibility in a diverse team;
▪ adhere to and defend ethical, scientific and professional standards.
Target Audience
The Data Engineering graduate program is targeted towards students who have completed their BSc
in areas such as computer science, physics, applied mathematics, statistics, electrical engineering,
communications engineering or related disciplines, and who want to deepen their knowledge and
proceed to research-oriented work towards a master's or ultimately a PhD degree. Typical examples
are:
▪ a bachelor in computer science who wants to acquire skills in data analysis and
micro/macroeconomics for a career in computational finance;
▪ a bachelor in business with a solid statistics and analysis foundation and programming
experience;
▪ a bachelor in geology who wants to become a data scientist and needs to deepen his/her
mathematical and statistical skills;
▪ a student with a bachelor's or master's degree in one of the natural sciences who wishes to
boost his/her career in empirical research or industrial research and development, where
professional handling of very large-scale data collections has become a prime bottleneck
for success;
▪ a bachelor in mathematics or physics who wants to capitalize on his/her theoretical
knowledge of modeling methods by learning about the hands-on side of data analysis,
interesting fields for applications, and options for employment;
▪ a student with an undergraduate degree in the life sciences wishing to expand their skill
sets towards computational methods and to specialize in bioinformatics and the analysis
of biomedical data.
In order to facilitate the integration of students with diverse backgrounds, we offer remedial courses
in the first semester. Placement tests in the orientation week before the beginning of the first semester
help students to identify contents that they need to refresh or remedy.
Career Options
The demand for Data Engineers is massive. Typical fields of work encompass the finance sector, the
automotive and health industries, as well as retail and telecommunications. Companies and institutions
in almost every domain need:
▪ experts for data acquisition who find out how to collect the data needed;
▪ experts for data management who know how to store, enhance, protect and process large amounts
of data efficiently;
▪ experts for data analysis who evaluate and interpret the collected data correctly and are able to
visualize the findings clearly.
▪ Graduates of the program work as data analysts, data managers, data architects, business
consultants, software and web developers, or system administrators.
▪ An MSc degree in Data Engineering also allows students to move on to a PhD and a career in
academia and research institutions.
The employability of Data Engineering graduates is promoted by organizing contacts with industry
and research institutes throughout the curriculum. In the first semester, in the Current Topics in Data
Engineering seminar, companies and research groups introduce their fields of interest. The advanced
projects in the second and third semesters can be combined with internships in research institutes
or companies. In the second and third semesters, participation in public big data challenges is
organized as an integral part of the curriculum.
Admission Requirements
Applicants need to submit the following documents in order to be considered for admission:
• Letter of motivation
• Curriculum vitae (CV)
• Certified university transcripts in English or German
• Bachelor’s degree certificate or equivalent (may be handed in later)
• Two letters of recommendation
• Language proficiency test results (TOEFL, IELTS or equivalent) as outlined on the website.
2 The Curriculum
See Chapter 3 “Modules” of this handbook for the detailed module descriptions or refer to CampusNet (https://campusnet.jacobs-university.de).
Study and Examination Plan
MSc Degree in Data Engineering
Matriculation Fall 2022
Module Code Program-Specific Modules Type Assessment Period¹ Status² Semester CP
Semester 1 30
CORE Area 10
MCO003-BigData Module: Big Data Challenge m 1 5
MCO003-051003 Big Data Challenge Lecture Term paper (Project report) During semester
MCO011-DataAnaDE Module: Data Analytics m 1 5
MCO011-340131 Data Analytics Lecture Written examination Examination period
Elective Area me 5
- students choose one module from those listed below
Methods Area 5
MMM014-IntroDataMan Module: Introduction to Data Management with Python m 1 5
MMM014-350200 Introduction to Data Management with Python Lecture/Tutorial Written examination / Programming assignments Examination period / During semester
Discovery Area 5
MRD004-CurTopDE Module: Current Topics in Data Engineering m 1
Current Topics in Data Engineering Colloquium Poster Presentation During semester
Career Area 5
MCA006-Commun Module: Communication and Presentation Skills for Executives m 1 2.5
MCA006-051464 Communication and Presentation Skills for Executives Seminar Oral presentation During semester
JTLA-xxx Module: Language 1 m 1 2.5
German is the default language. Native German speakers take modules in another offered language.
JTLA-xxx Language 1 Seminar Various Various me
Semester 2 27.5
Semester 3 32.5
Master Thesis 30
MMT003-MasterThesis Module: Master Thesis MSc DE m 4 30
MMT003-340003 Master Thesis
Total CP 120
¹ Each lecture period lasts 14 semester weeks and is followed by reading and examination days. Written examinations are centrally scheduled during weeks 15 and 16. For all other assessment types, the timeframes indicated in the above table stipulate
the period during which module work has to be handed in or presented. Specific information on dates of topic announcement as well as submission deadlines is communicated in the syllabus which is made available to the students at the beginning of
each semester. Academic dates are published in the university-wide Academic Calendar (see http://www.jacobs-university.de/academic-calendar).
² m = mandatory, me = mandatory elective
Elective Area
Students choose 15 CP of mandatory electives
Computer Science Track 20
MECS001-StatMod Module: Principles of Statistical Modeling me 2 5
MECS001-340101 Principles of Statistical Modeling Lecture Project Report During semester
MECS002-NetworkTheo Module: Network Theory me 1 or 3 5
MECS002-340212 Network Theory Lecture Written examination Examination period
MCO012-AdvDataBase Module: Advanced Databases me 2 5
MCO012-340152 Advanced Databases Lecture Written examination Examination period 2.5
MCO012-340153 Advanced Databases Lab Lab Lab project During semester 2.5
MECS004-ParDisCom Module: Parallel and Distributed Computing me 3 5
MECS004-30040 Parallel and Distributed Computing Lecture Written examination Examination period
Geoinformatics Track 10
MEGI001-GeoInf Module: Geoinformatics me 1 or 3 5
MEGI001-210213 Geo-Information Systems Lecture Term paper Examination period m 2.5
MEGI001-210103 Introduction to Earth System Data Lecture Term paper Examination period m 2.5
MEGI002-GeoInfLab Module: Geoinformatics Lab me 2 5
MEGI002-210214 Geoinformatics Lab Lecture Term paper Examination period
Bio-Informatics Track 15
MEBI001-IntroSysBio Module: Introduction to Systems Biology me 2 5
MEBI001-550432 Introduction to Systems Biology Lecture Written examination Examination period
MDE-BIO-03 Management and Analysis of Biological and Medical Data me 1 or 3 5
MDE-BIO-03 Management and Analysis of Biological and Medical Data Seminar Oral Examination Examination period
Business & Supply Chain Engineering Track 10
MESC001-DataMin Module: Data Mining me 2 5
MESC001-340122 Data Mining Lecture Term paper (Project report) During semester
MCO008-DataAnaSCM Module: Data Analytics in Supply Chain Management me 1 or 3 5
MCO008-051008 Data Analytics in Supply Chain Management Lecture Term paper (Project report) During semester
Total CP 65
Methods Area
Students take "Introduction to Data Management with Python" in the first semester and choose 2 modules from the list below in semesters 2 and 3.
20
MMM004-ModDynSys Module: Modeling and Control of Dynamical Systems me 2 5
MMM004-340103 Modeling and Control of Dynamical Systems Seminar Written examination Examination period
MMM005-ModSigProc Module: Modern Signal Processing me 2 5
MMM005-340153 Modern Signal Processing Seminar Oral presentation During semester
MMM007-NetBioMed Module: Network Approaches in Biology and Medicine me 3 5
MMM007-550443 Network Approaches in Biology and Medicine Lecture Oral presentation During semester
MMM008-ApplDynSys Module: Applied Dynamical Systems me 2 5
MMM008-110231 Applied Dynamical Systems Lecture Term paper (Project report) During semester
Remedial Courses (Methods Area) 10
MMM009-CalLinAlg Module: Calculus and Linear Algebra for Graduate Students me 1 5
MMM009-340181 Calculus and Linear Algebra for Graduate Students Lecture Written examination Examination period
MMM011-ProbabGS Module: Probabilities for Graduate Students me 1 5
MMM011-340171 Probabilities for Graduate Students Lecture Written examination Examination period
Total CP 30
Core Modules
Students may choose any combination of the modules listed below. Each track may be followed
completely and/or complemented with other modules (as necessary in case of the tracks with 10 CP).
In addition to the modules offered within these focus tracks, 3rd year modules from the undergraduate
curriculum or other graduate programs at Jacobs University can be taken with the approval of the
program coordinator. Please see CampusNet (https://campusnet.jacobs-university.de) for current
offerings.
To enhance flexibility, students may transfer modules between the Elective and the Methods Areas
(except for remedial modules) after consulting their academic advisor.
Elective Modules
Geo-Informatics Track
Bio-Informatics Track
Methods Area (15 CP)
In the Methods Area advanced concepts, methods and technologies of data engineering are
introduced with a view towards industrial applications. Students can choose freely from the modules
in this area. To enhance flexibility, students may transfer modules between the Elective and the
Methods Areas (except for remedial modules) after consulting their academic advisor.
Methods Modules
Within the Methods Area Jacobs University offers special remedial modules, which are recommended
to refresh knowledge or to fill knowledge gaps, preparing students to successfully take the Data
Engineering Core Area modules. Based on a placement test in the orientation week, the academic
advisor will propose which of the modules are useful depending on prior knowledge of the student.
Discovery Area (15 CP)
In the first semester, this area features a Project Seminar introducing students to Current Topics
and Challenges in Data Engineering. It is followed by two advanced projects in Data Engineering
in semesters 2 and 3, each worth 5 CP. The projects can be done in research groups
at Jacobs University or during internships at companies, and are supervised by Jacobs
University faculty.
Discovery Modules
Career Area (15 CP)
In this area students acquire skills to prepare them for a career as data engineers in industry.
Career Modules
3 Data Engineering Modules
Intended Learning Outcomes
• contribute knowledgeably to the current debate about big data, digitalization and industry 4.0;
• explain and discuss pros and cons of digitalization from a business perspective as well as a societal
perspective;
• perform a SWOT analysis on current big data initiatives;
• evaluate technological possibilities and innovations driven by big data;
• assess the business opportunities of current big data developments.
Indicative Literature
McLellan (2013): Big Data: An Overview
https://www.zdnet.com/article/big-data-an-overview/
S. Akter & S. Fosso Wamba, Big data analytics in e-commerce: A systematic review and agenda for future
research, 2016. Electronic Markets, 26 173-194.
Z. Lv, H. Song, P. Basanta-Val, A. Steed and M. Jo. "Next-Generation Big Data Analytics: State of the Art,
Challenges, and Future Research Topics," in IEEE Transactions on Industrial Informatics, vol. 13, no. 4, pp.
1891-1899, Aug. 2017.
Usability and Relationship to other Modules
▪ For DE: This module provides an overview of practical big data applications. The computational
details will then be studied in MDE-CS-04.
▪ For SCM: Concepts are applied in MSCM-CO-03 Trends & Challenges in Supply Chain Management.
Project management concepts taught in MSCM-CO-01 will be applied. Academic writing skills taught
in MSCM-CAR-01 facilitate the completion of the tasks in this module.
Examination Type: Module Examination
IT Law
Indicative Literature
Lloyd (2020). Information Technology Law. Oxford: Oxford University Press (9th ed).
▪ For DSSB students: It is one of the three Career modules (IT Law, Language III, and Ethics and the
Information Revolution) that can be chosen for replacement by the internship. Students need to
replace 10 CP for the internship.
Data Security and Privacy
Data Analytics
Module Components
In this module students will learn concepts and various techniques for data analysis. They will be rigorously
applied in MDE-CS-03 as well as in the applied projects MDE-DIS-02 and MDE-DIS-03, and typically also in
the master thesis.
Examination Type: Module Examination
Machine Learning
Module Components
▪ design, implement and exploit elementary supervised ML methods for classification and regression with
expert care given to dimension reduction preprocessing and regularization;
▪ understand and practically use PCA and linear regression;
▪ understand the core ideas behind feedforward neural networks and the backpropagation algorithm, as the
basis for accessing "deep learning" methods.
Indicative Literature
T. M. Mitchell, Machine Learning, McGraw-Hill, 1997, IRC: Q325.5.M58.
Data Visualization and Image Processing
Module Components
A. C. Telea, Data Visualization: Principles and Practice, Second Edition, A K Peters, 2014, ISBN: 9781466585263.
Usability and Relationship to other Modules
As this module introduces visualization techniques for data sets, it builds on courses introducing data systems,
particularly the Data Analytics module MDE-CO-02 and the Data Mining module MDE-BSC-01.
Examination Type: Module Examination
Data Acquisition Technologies and Sensor Networks
Module Components
▪ acquire data from different sensors and use a microcontroller to process them;
▪ transmit data from the microcontroller to a database on a server;
▪ collect data from web browsers and transmit them to a database on a server;
▪ visualize the data on computers or smart devices;
▪ set up a wireless sensor network and communicate data among different components.
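The step of transmitting readings to a database can be illustrated with a minimal sketch. It makes several assumptions that are not part of the module description: Python's built-in sqlite3 module stands in for the server database, the table name `readings` and the functions `store_readings` and `mean_per_sensor` are hypothetical, and the readings are made-up values in place of data sent by a microcontroller.

```python
import sqlite3
from datetime import datetime, timezone


def store_readings(conn, readings):
    """Insert (sensor_id, value) pairs with a UTC timestamp; return rows stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        "sensor_id TEXT NOT NULL, value REAL NOT NULL, recorded_at TEXT NOT NULL)"
    )
    now = datetime.now(timezone.utc).isoformat()
    rows = [(sensor_id, float(value), now) for sensor_id, value in readings]
    # Parameterized inserts keep sensor input separate from the SQL text.
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def mean_per_sensor(conn):
    """Return {sensor_id: mean value}, with the aggregation done by the database."""
    cursor = conn.execute(
        "SELECT sensor_id, AVG(value) FROM readings GROUP BY sensor_id"
    )
    return dict(cursor.fetchall())


# Hypothetical readings standing in for values sent by a microcontroller.
conn = sqlite3.connect(":memory:")
stored = store_readings(conn, [("temp-1", 21.5), ("temp-1", 22.5), ("hum-1", 40.0)])
means = mean_per_sensor(conn)
```

In a real deployment the in-memory database would be replaced by a database server reachable over the network; the insert and query pattern stays the same.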
Indicative Literature
M. Kooijman, Building wireless sensor networks using Arduino: leverage the powerful Arduino and XBee
platforms to monitor and control your surroundings, Packt Publishing, 2015 ISBN:9781784397159
1784397156.
H. E. Williams, D. Lane, Web database applications with PHP and MySQL, O'Reilly Media, 2004, ISBN:
0596005431 9780596005436.
Usability and Relationship to other Modules
This module offers the techniques of wireless acquisition of the data that will later be processed and analyzed by
techniques studied in the Data Analytics module MDE-CO-02, the Machine Learning module MDE-CO-04, and
the Data Analytics in Supply Chain Management module MSCM-CO-07.
Examination Type: Module Examination
Elective Area (15 CP)
Intended Learning Outcomes
Upon completion of this module, students will be able to:
3.2.1.2 Network Theory
▪ communicate in scientific language using advanced field-specific technical terms.
Indicative Literature
M. Newman, Networks: An Introduction, Oxford Univ. Press, 2010, ISBN: 9780199206650.
A.-L. Barabási, Network Science, Cambridge University Press, Cambridge, 2016, ISBN-10: 1107076269.
Usability and Relationship to other Modules
This course prepares for the courses MDE-CO-04 Machine Learning and MDE-CS-03 Principles of Statistical
Modeling.
Examination Type: Module Examination
3.2.1.3 Advanced Databases
N.A.
Content and Educational Aims
This course deepens knowledge and skills in managing and serving Big Data, with an emphasis on flexibility and
scalability. As a result of this course, students will know the state of the art in data management for particularly
large and complex data, including cloud-based data setups. Building on the Data Engineering Core lecture Data
Management, the course starts with a review of classical SQL, leading to an overview of SQL query
processing. Based on this understanding, opportunities for optimization and parallelization are discussed.
Subsequently, novel developments in Big Data services are discussed. NoSQL approaches with their new data
models, such as documents, graphs and arrays, are inspected. These are contrasted with NewSQL systems and their
novel techniques for competitive performance. Dedicated architectures, such as MapReduce, are discussed. This leads
to general scalability considerations, with an emphasis on large-scale parallel and distributed processing.
Throughout the course, practical considerations play an important role, including practitioner hints on database
modeling, tuning, and security. Guided hands-on exercises complement this.
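As an illustration of the MapReduce architecture named above, the following is a minimal single-process sketch of the map, shuffle and reduce steps applied to word counting. It is not taken from the course material: the function names and the sample documents are hypothetical, and a real framework would distribute the map and reduce calls across machines rather than run them in one loop.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    # Each map call depends only on its own document, which is what allows a
    # real framework to run the calls in parallel; here they run sequentially.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(pairs))


# Hypothetical input documents.
counts = word_count(["big data big services", "data services scale"])
```

The scalability argument rests on the independence of the map calls and of the per-key reduce calls; the shuffle step is the only point where data from different documents has to meet.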
Intended Learning Outcomes
▪ Summarize the state of the art in data management for particularly large and complex data
▪ Establish criteria for selecting adequate, scalable data management technology
▪ Establish a state-of-the-art database schema for a given application scenario
▪ Tune a relational database for best performance on some given query workload
▪ Adequately consider security aspects in databases
▪ Develop applications using Web and database technology
Indicative Literature
McLellan (2013): Big Data: An Overview
https://www.zdnet.com/article/big-data-an-overview/
S. Akter & S. Fosso Wamba, Big data analytics in e-commerce: A systematic review and agenda for future
research, 2016. Electronic Markets, 26 173-194.
Z. Lv, H. Song, P. Basanta-Val, A. Steed and M. Jo. "Next-Generation Big Data Analytics: State of the Art,
Challenges, and Future Research Topics," in IEEE Transactions on Industrial Informatics, vol. 13, no. 4, pp.
1891-1899, Aug. 2017.
Usability and Relationship to other Modules
Prerequisite: Introduction to Data Management with Python.
Completion: To pass this module, the examination of each module component has to be passed with at least
45%.
3.2.1.4 Parallel and Distributed Computing
J.C. Daniel, Data Science with Python and Dask, Manning Publications.
Z. Radtka, D. Miner, Hadoop with Python, O'Reilly.
Geoinformatics Track
3.2.2.1 Geoinformatics
Module Name Module Code Level (type) CP
Geoinformatics MDE-GEO-01 Year 1 (Elective) 5
Module Components
By the end of this module, students will be able to:
▪ design, implement and exploit elementary supervised ML methods for classification and regression with
expert care given to dimension reduction preprocessing and regularization;
▪ understand and practically use PCA and linear regression;
▪ understand the core ideas behind feedforward neural networks and the backpropagation algorithm, as the
basis for accessing "deep learning" methods.
Indicative Literature
The course is based on a self-contained, detailed set of online lecture notes.
Nevertheless, the following provides a good overview of the material covered:
P. A. Longley, M. F. Goodchild, D. J. Maguire, D. W. Rhind, Geographic Information Systems and Science, 2nd
Edition, Wiley, 2005, 560 p. ISBN 0470721448.
Jake VanderPlas, Python Data Science Handbook, 2016,
https://jakevdp.github.io/PythonDataScienceHandbook/.
▪ This module is a natural companion to the "Principles of Statistical Modeling" (PSM) module
MDE-CS-03.
▪ The ML module focuses on practical ML skills, whereas the PSM module focuses on rigorous mathematical
formalism and analysis.
▪ For students not familiar with graph theory, it is recommended to take the first semester course
MDE-CS-01 Network Theory, which introduces concepts used in this Machine Learning module.
Examination Type: Module Examination
3.2.2.2 Geoinformatics Lab
Module Name Module Code Level (type) CP
Geoinformatics Lab MDE-GEO-02 Year 1 (Elective) 5
Module Components
▪ design, implement and exploit elementary supervised ML methods for classification and regression with
expert care given to dimension reduction preprocessing and regularization;
▪ understand and practically use PCA and linear regression;
▪ understand the core ideas behind feedforward neural networks and the backpropagation algorithm, as the
basis for accessing "deep learning" methods.
Indicative Literature
J. VanderPlas, Python Data Science Handbook, 2016, https://jakevdp.github.io/PythonDataScienceHandbook/
B. Day, J. Bruner, A. Moser, Geospatial Data and Analysis, O'Reilly Media, 2017, ISBN: 9781491984314
Usability and Relationship to other Modules
Bio-Informatics Track
U. Alon, An Introduction to Systems Biology: Design Principles of Biological Circuits. Chapman & Hall/CRC,
2006.
B. O. Palsson, Systems Biology – Properties of reconstructed networks, Cambridge University Press, 2006.
Usability and Relationship to other Modules
N.A.
Examination Type: Module Examination
3.2.3.2 Modeling and Analysis of Complex Systems
(1) the dynamics of diseases such as HIV, (2) the microbial growth in batch and chemostat cultures, (3) the
dynamics of plankton ecosystems in the oceanic mixed layer, and (4) examples of life acting as a regulating force
at a planetary scale. In addition, the lecturer introduces Agent-Based Modelling techniques with applications to
cultural segregation problems and spatially explicit predator-prey interactions.
Intended Learning Outcomes
▪ independently design and develop models (from the basic conceptual aspects, to the mathematical
equations and the numerical code) for tackling problems in the natural and social sciences
▪ undertake numerical equilibrium and stability analyses, evaluate model performance, and identify
uncertainties in model results.
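The equilibrium and stability analysis mentioned in the learning outcomes can be sketched for a small example. The classical Lotka-Volterra predator-prey equations are chosen here purely for illustration; the module description does not specify which models are used, and the parameter values and function names below are hypothetical. The sketch computes the equilibria, evaluates the Jacobian there, and classifies each equilibrium by the real parts of the eigenvalues.

```python
import cmath


def lotka_volterra_equilibrium(a, b, c, d):
    """Coexistence equilibrium of dx/dt = a*x - b*x*y, dy/dt = -c*y + d*x*y."""
    return c / d, a / b


def jacobian(a, b, c, d, x, y):
    """Jacobian matrix of the system evaluated at the point (x, y)."""
    return ((a - b * y, -b * x), (d * y, -c + d * x))


def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def classify(eigs, tol=1e-12):
    """Classify an equilibrium by the real parts of its eigenvalues."""
    real_parts = [e.real for e in eigs]
    if all(r < -tol for r in real_parts):
        return "stable"
    if any(r > tol for r in real_parts):
        return "unstable"
    return "marginal (linearization inconclusive)"


# Hypothetical parameter values, chosen only for illustration.
a, b, c, d = 1.0, 0.5, 0.75, 0.25
x_eq, y_eq = lotka_volterra_equilibrium(a, b, c, d)
coexistence_eigs = eigenvalues_2x2(jacobian(a, b, c, d, x_eq, y_eq))
origin_eigs = eigenvalues_2x2(jacobian(a, b, c, d, 0.0, 0.0))
coexistence_type = classify(coexistence_eigs)
origin_type = classify(origin_eigs)
```

For this model the eigenvalues at the coexistence equilibrium are purely imaginary, so the linearization alone does not decide stability, while the origin has one positive eigenvalue and is therefore unstable.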
Indicative Literature
The course is based on a self-contained, detailed set of online lecture notes and practical exercises.
Usability and Relationship to other Modules
N.A.
Examination Type: Module Examination
Module Name Module Code Level (type) CP
Management and Analysis of Biological and Medical Data MDE-BIO-03 Year 1/2 (Elective) 5
Module Components
Duration Workload
In the first sessions of the course, we define small research projects based on the selected databases. In the
rest of the course these research projects will be pursued in small groups and the results will be reported and
discussed.
Intended Learning Outcomes
Upon completion of this module, students will be able to:
1. identify and process a variety of data formats and data standards in biology and medicine
2. access and use the main bioinformatics databases
3. download and analyze diverse biological and medical data
4. derive research questions from scientific publications
5. apply concepts from data science to biological and medical databases.
Indicative Literature
Business and Supply Chain Engineering Track
▪ be able to implement and apply advanced data mining methods with appropriate tools
▪ be able to evaluate and compare the suitability, scalability and efficiency of different methods in
practical settings
▪ have gained experience in performing a full cycle of data mining and data analysis
▪ have acquired practical skills to tackle data mining problems
Indicative Literature
G. James, D. Witten, T. Hastie, R. Tibshirani, An Introduction to Statistical Learning: with Applications in R,
Springer, 2013 (ISLR).
J. VanderPlas, Python Data Science Handbook, 2016 - https://jakevdp.github.io/PythonDataScienceHandbook/.
3.2.4.2 Data Analytics in Supply Chain Management
Module Components
Prof. Dr.-Ing. Hendro Wicaksono
• MSc Supply Chain Management
Mandatory elective for SCM and DE
• apply methods and tools to collect and integrate data from different sources in the context of supply
chain management;
• apply machine learning and statistical analytics methods and tools to uncover hidden patterns,
correlations, trends, and knowledge that are useful for improving supply chain management processes;
• evaluate data analytics results in different scenarios and solve the problems that might occur throughout
the entire data analytics process, from data collection to analysis;
• develop deployment architecture concepts by integrating existing tools/software;
• develop business model and ecosystem concepts.
Indicative Literature
N.A.
Methods Area (15 CP)
None.
Content and Educational Aims
This module introduces data engineering students to the field of data management with Python. Data management
covers the broad range of methodologies used to collect, store, process, and provision data. The module takes a
very applied view of these tasks. Since Python has become the de facto standard in the field, the initial part of
the module gives a basic introduction to the core concepts of imperative programming in Python. Data structures
and fundamental algorithms are explored in a hands-on fashion, including basic numerical and data analysis tasks
based on NumPy/SciPy. Relational databases are one source from which data can be collected and in which data can
be stored; the course introduces the Structured Query Language (SQL) to access this data source. More recently,
data is frequently stored in DataFrames, a data structure provided by the Python library Pandas, which also
provides functionality for data analysis tasks. Data analysis outputs are provisioned using basic 2D
visualization techniques.
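The workflow described above (store data in a relational database, retrieve it with SQL, then analyze it in
Python) can be sketched with Python's built-in sqlite3 module. The table name, the column names, and the
measurement values are hypothetical and chosen only for illustration.

```python
import sqlite3
from statistics import mean

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")  # hypothetical schema

# Hypothetical measurements; parameter binding avoids assembling SQL strings by hand.
rows = [("a", 1.0), ("a", 3.0), ("b", 10.0), ("b", 14.0)]
conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)

# Aggregate inside the database with SQL.
query = "SELECT sensor, AVG(value) FROM readings GROUP BY sensor ORDER BY sensor"
sql_means = dict(conn.execute(query).fetchall())

# The same aggregation in plain Python, as a cross-check.
values_by_sensor = {}
for sensor, value in conn.execute("SELECT sensor, value FROM readings"):
    values_by_sensor.setdefault(sensor, []).append(value)
py_means = {sensor: mean(values) for sensor, values in values_by_sensor.items()}

conn.close()
```

If Pandas is installed, calling `pandas.read_sql_query(query, conn)` before the connection is closed would load
the same query result into a DataFrame.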
Intended Learning Outcomes
• understand and apply DataFrames and data analysis using Pandas
• visualize simple data by different types of 2D plots using Matplotlib
Indicative Literature
Jake VanderPlas, Python Data Science Handbook, O'Reilly.
Cay S. Horstmann, Rance D. Necaise, Python For Everyone, 3rd Edition, Wiley.
Usability and Relationship to other Modules
The course provides the necessary background knowledge for courses such as “Advanced Databases” or “Machine
Learning”.
Examination Type: Module Component Examinations
Scope: All intended learning outcomes of this module excluding practical aspects.
Completion: To pass this module, the examination of each module component has to be passed with at least
45%.
Modeling and Control of Dynamical Systems
Module Name: Modeling and Control of Dynamical Systems
Module Code: MDE-MET-04
Level (type): Year 1/2 (Methods)
CP: 5.0
Module Components
H. Stark & J. Woods, Probability and Random Processes with Applications to Signal Processing, Westview Press,
2002.
Usability and Relationship to other Modules
Complementary to the Machine Learning module MDE-CO-04, this module focuses on a theory-based design of
models. Such models, if available, are usually “smaller” and easier to parameterize.
Examination Type: Module Examination
Modern Signal Processing
• further develop their programming skills in Matlab (or an equivalent programming language with sufficient
support for mathematical libraries);
• gain a deeper and a modern understanding of crucial mathematical tools such as linear algebra (vectors
and matrices) and functional analysis (Hilbert spaces, inner products, basic calculus), in the context of
their application to data engineering.
Indicative Literature
P. Walk and P. Jung, Compressed Sensing: Applications to Communication and Digital Signal Processing,
Springer, 2019.
S. Oh, Matrix Completion: Fundamental Limits and Efficient Algorithms, Stanford University, 2010.
J. Dattorro, Convex Optimization and Euclidean Distance Geometry, Meboo Publishing, 2008.
I. Rish, G. Grabarnik, Sparse Modeling: Theory, Algorithms, and Applications, CRC Press, 2014.
Network Approaches in Biology and Medicine
Module Name Module Code Level (type) CP
Here, the application of network analysis to biology and medicine is discussed. The module covers the standard
networks considered in Systems Biology (gene regulatory networks, metabolic networks, signaling networks, and
protein-protein interaction networks), in which each link corresponds to a specific biological process. This is
complemented by a discussion of relational networks, which can serve as very efficient sources of data
integration and interpretation: the diseasome, a network in which a disease is linked to a gene whenever there is
data evidence relating the gene to the disease; and the drug-target network, in which drugs and proteins are
linked by drug-target associations.
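To make the diseasome idea concrete, the following minimal sketch builds a bipartite disease-gene network and
derives a disease-disease network from it. The association data are invented toy values, and the projection step
(connecting two diseases that share a gene) is one common way to analyze such a network; it is an assumption of
this sketch, not a method stated in the module description.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical disease-gene associations (toy data, not from any real database).
associations = [
    ("disease_A", "gene_1"),
    ("disease_A", "gene_2"),
    ("disease_B", "gene_2"),
    ("disease_B", "gene_3"),
    ("disease_C", "gene_4"),
]

# Bipartite network: each gene maps to the set of diseases linked to it.
diseases_by_gene = defaultdict(set)
for disease, gene in associations:
    diseases_by_gene[gene].add(disease)

# Disease projection: two diseases are connected if they share at least one gene;
# the edge weight counts the shared genes.
shared_genes = defaultdict(int)
for diseases in diseases_by_gene.values():
    for pair in combinations(sorted(diseases), 2):
        shared_genes[pair] += 1

# Degree of each disease in the projected network.
degree = defaultdict(int)
for d1, d2 in shared_genes:
    degree[d1] += 1
    degree[d2] += 1
```

In this toy data set only disease_A and disease_B share a gene (gene_2), so they form the single edge of the
projected network, while disease_C remains isolated.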
In addition to standard review articles and textbooks on Network Science, material from recent scientific literature
is incorporated in the module.
Intended Learning Outcomes
▪ understand the basic principles of network science applications to Biology and Medicine;
▪ use and access the main bioinformatics databases to obtain biological networks;
▪ analyze biological networks;
▪ combine multiple data analysis tools for a comprehensive analysis of molecular data;
▪ describe in some detail essential facts and theoretical concepts derived from recent scientific literature;
▪ identify open questions from the scientific literature and synthesize information from the literature into a
scientific presentation.
Indicative Literature
Barabási, A.-L. (2016). Network Science. Cambridge University Press.
Alon, U. (2007). Network motifs: theory and experimental approaches. Nature Reviews Genetics, 8(6):450–461.
Barabási, A.-L. (2012). The network takeover. Nature Physics, 8(1):14–16.
Barabási, A.-L., Gulbahce, N. and Loscalzo, J. (2011). Network medicine: a network-based approach to human
disease. Nature Reviews Genetics, 12(1):56.
Barabási, A.-L. and Oltvai, Z. N. (2004). Network biology: understanding the cell’s functional organization.
Nature Reviews Genetics, 5(2):101.
Radde, N. E. and Hütt, M.-T. (2016). The physics behind systems biology. EPJ Nonlinear Biomedical Physics,
4(1):7.
Strogatz, S. H. (2001). Exploring complex networks. Nature, 410(6825):268.
Applied Dynamical Systems
Steven Strogatz, Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and
Engineering, Westview Press, second edition, 2014.
Usability and Relationship to other Modules
This module is complementary to the module MDE-MET-04 Modeling and Control of Dynamical Systems.
Examination Type: Module Examination
This module introduces and refreshes the essential Calculus and Linear Algebra required in most of the modules
of the data engineering program. A placement test is offered in the orientation week before the start of the
first semester to help students find out whether they need to take this remedial course.
Examination Type: Module Examination
3.4.6.2 Probabilities for Graduate Students
Examination Type: Module Examination
Discovery Area (15 CP)
Examination Type: Module Examination
Advanced Project 1
Module Name: Advanced Project 1
Module Code: MDE-DIS-02
Level (type): Year 1 (Discovery)
CP: 5
Module Components
Advanced Project 2
Module Name: Advanced Project 2
Module Code: MDE-DIS-03
Level (type): Year 2 (Discovery)
CP: 5
Module Components
Examination Type: Module Examination
Career Area (15 CP)
Language Skills
The descriptions of the language modules are provided in a separate document, the “Language
Module Handbook”, which can be accessed here:
https://www.jacobs-university.de/study/learning-languages
Academic Writing Skills/Intercultural Training
Fraedrich, J. & Ferrell, O.C. (2014): Business Ethics: Ethical Decision Making & Cases. Cengage Learning.
• understand labor conditions in Germany.
• understand the typical business cultures in German companies.
Indicative Literature
The literature is provided individually to each student by each instructor for the respective advanced project.
Communication & Presentation Skills for Executives
Module Name Module Code Level (type) CP
▪ collaborate effectively in intercultural teams.
Indicative Literature
This course utilizes lecture formats, case studies and interactive presentations, discussions, role play and
peer-to-peer coaching. The course will also use internet resources, videos, and home assignments to illustrate
and practice specific communication aspects.
Usability and Relationship to other Modules
This module is recommended to be taken together with the elective modules in the Bio-Informatics track.
Ethics and the Information Revolution
Module Name Module Code Level (type) CP
The module pursues three goals: (1) participants will immerse themselves in and learn about core ethical
theories; (2) they will integrate this theoretical knowledge and develop a “Big Data Ethics”; and (3) they will
put it into practice. For the second and third goals, in-classroom discussions and interactions are indispensable
for identifying possible dilemmas and conflicts of interest and for balancing contradictions to derive practical
solutions and policy advice.
Usability and Relationship to other Modules
This is one of the three Career modules (IT Law, Language III, and Ethics and the Information Revolution) that
can be replaced by the internship. Students need to replace 10 CP for the internship.
Examination Type: Module Examination
Master Thesis (30 CP)
Module Name: Master Thesis
Module Code: MDE-THE-01
Level (type): Year 2
CP: 30
Module Components
• MDE-DIS-03 Advanced Project II
Recommendations for Preparation
Read the Syllabus.
• writing a research thesis such that it could be submitted to a scientific publication venue, or as a project
report to a funding agency or industrial client;
• presentation of project results for specialists and non-specialists.
Indicative Literature
N.A.
Scope: Mainly presentation of project results, but the presentation touches on all intended learning outcomes.
Completion: This module is passed with an assessment-component weighted average grade of 45% or higher.
4 Data Engineering Graduate Program Regulations
In exceptional cases, certain necessary deviations from the regulations of this study handbook
might occur during the course of study (e.g., a change of the semester sequence, the assessment
type, or the teaching mode of courses).
Jacobs University Bremen therefore reserves the right to change or modify the regulations of
the program handbook at any time after its publication and at its sole discretion.
Degree
Upon successful completion of the program, students are awarded a Master of Science (M.Sc.)
degree in Data Engineering.
Graduation Requirements
In order to graduate, students need to obtain 120 CP. In addition, the following graduation
requirements apply:
5 Appendices
MCO015 – DataAcquiSens
MECS002 – NetworkTheo
MCO014 – DataVisImage
MCA002 – Language MA
Parallel and Distributed
MECS003 – DataComp
MEGI002 – GeoinfLab
MRD004 – CurTopDE
Advanced Databases
MRD005 – AdvProj1
MRD006 – AdvProj2
MEGI001 – Geoinf
Master's Thesis
MDE-BIO-03
Computing,
Semester 1 1 2 2 3 1, 2 or 3 2 1 or 3 1 or 3 2 3 1 or 3 2 1 2 1 or 3 2 3 2 1 or 3 2 1 1 1 1 2 3 4 1,2,3 1,3 1,3 2
Mandatory/optional M M M M M M O O O O O O O M O O O O O O O O O O M M M M M M M M
Credits 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 30 5 5 5 2.5
Competencies*
Program Learning Outcomes A E P S
ILO 1 critically assess and creatively apply
technological possibilities and innovations x x x x x x x x x x x x x x x x x x x
driven by big data
ILO 2 use sensors and microcontrollers to
collect data and to transmit them to
x x x x x x x
databases on servers or the internet in
general
ILO 3 set up and use databases to
efficiently and securely manage and access x x x x x x x x x x x x x x x x x
large amounts of data
ILO 4 apply statistical concepts and use
statistical models in the context of real-life x x x x x x x x x x x x x x x x x
data analytics
ILO 5 use, adapt and improve visualization
techniques to support data-based decision x x x x x x x x x x x
making
ILO 6 design, implement and exploit
various representations of data for
classification and regression including x x x x x x x x x x x x x x
supervised machine learning methods and
ILO 7 apply and critically assess data
acquisition methods and analytical x x x x x x x x x x x x x x
techniques in real life situations,
organizations and industries
ILO 8 independently investigate complex
problems and undertake scientific or
applied research into a specialist area x x x x x x x x x x x x
utilizing appropriate methods, also taking
methods and insights of other disciplines
ILO 9 professionally communicate their
conclusions and recommendations, the
underlying information and their reasons x x x x x x x x x x x x x x x x x x
to specialists and non-specialists both
clearly and unambiguously on the basis of
ILO 10 assess and communicate social,
scientific and ethical insights that also
x x x x x x x x x x x x x x x
derive from the application of their
knowledge and their decisions
ILO 11 engage ethically with academic, x x
professional and wider communities and x x x x x x x
actively contribute to a sustainable future
ILO 12 take responsibility for their own x x
learning, personal development and role in
x x x x x x x x x x x
society, evaluating critical feedback and
self-analysis
ILO 13 take on lead responsibility in a x x
x x x x x x
diverse team
ILO 14 adhere to and defend ethical, x
x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
scientific and professional standards
Assessment Type
oral examination x x x x x
final written exam x x x x x x x x x
project x x x x x x x x x x x x x
essay x
lab report
poster presentation x
presentation
various x
*Competencies: A-scientific/academic
proficiency; E-competence for qualified
employment; P-development of
personality; S-competence for engagement
in civil society