
Development_of_a_Modularized_Undergraduate_Data_Science_and_Big_Data_Curricular_Using_No-Code_Software_Development_Tools

This paper presents a modularized undergraduate curriculum for Data Science that utilizes no-code software development tools to aid non-computer science majors in learning programming concepts. The study demonstrates that visual programming languages can enhance students' understanding and performance in data science, providing evidence-based recommendations for integrating these tools into formal education. The curriculum aims to reduce the learning curve associated with traditional programming languages, making data science more accessible to a diverse range of students.


Received 19 June 2024, accepted 10 July 2024, date of publication 16 July 2024, date of current version 29 July 2024.

Digital Object Identifier 10.1109/ACCESS.2024.3429241

Development of a Modularized Undergraduate Data Science and Big Data Curricular Using No-Code Software Development Tools
HARRY D. MAFUKIDZE1, ACTION NECHIBVUTE1, ABID YAHYA2 (Senior Member, IEEE),
IRFAN ANJUM BADRUDDIN3, SARFARAZ KAMANGAR3, AND MOHAMED HUSSIEN4
1 Department of Applied Physics and Telecommunications, Midlands State University, Gweru, Zimbabwe
2 Department of Electrical, Computer and Telecommunications Engineering, Botswana International University of Science and Technology, Palapye, Botswana
3 Department of Mechanical Engineering, College of Engineering, King Khalid University, Abha 61421, Saudi Arabia
4 Department of Chemistry, Faculty of Science, King Khalid University, Abha 61413, Saudi Arabia

Corresponding author: Harry D. Mafukidze ([email protected])


The authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University for funding this work
through Large Research Project under grant number RGP.2/127/45.

ABSTRACT Over the last decade, Data Science has emerged as one of the most important subjects
that has had a major impact on industry. This is due to the continual development of scientific methods,
algorithms, processes, and computational tools that help to extract knowledge from raw data efficiently
and cost-effectively, compared with early-generation tools. Professional data scientists create code that
processes, analyses and extracts actionable insights from high volumes of data. This process requires a
deep understanding of mathematical principles, statistics, business knowledge, and computer science. But
most importantly, the data science development chain requires knowledge of a high-level programming tool
and its dependencies. This is a major problem in some aspects due to the steep learning curve. In this
paper, we describe and present a modularized Data Science curriculum for undergraduate learners that
relies on no-code software development tools as programming aids for non-computer science majors.
No-code development tools have been added to the traditional teaching pedagogy to improve students’
motivation and conceptual understanding of coding despite their limited programming skills. The study
aims to assess the impacts of visual programming languages on the performance of non-computer science
majors on programming problems. The study’s sample consists of 50 fourth-year students from the Faculty
of Science and Technology at the Midlands State University. A post-survey questionnaire and assessment
items were administered to the control and experimental groups. Results show that the students drawn
from the experimental group benefited from the use of a visual programming language. These results offer
evidence-based recommendations for incorporating high-performance no-code software development tools
in the formal curriculum to aid teaching and learning data science programming for students of diverse
academic backgrounds.

INDEX TERMS Curriculum, data science, education, no-code tools, visual programming languages.

The associate editor coordinating the review of this manuscript and approving it for publication was Martin Reisslein.

© 2024 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
VOLUME 12, 2024 100939
H. D. Mafukidze et al.: Development of a Modularized Undergraduate Data Science and Big Data Curricular

I. INTRODUCTION
The recent technological advances in computing, coupled with the increase in the demand to process high volumes of data, have led to the development of computer algorithms that extract information and knowledge from different sources of data. Consequently, this has also pushed the demand for specialists, analysts, and engineers who develop and maintain that code. An earlier report by the McKinsey Global Institute (MGI) in 2011 estimated that hundreds of thousands of data-related jobs would be required in the next few years [1]. This means that there is an absolute need to train more data scientists to meet this ever-increasing demand. To democratize training in this field, especially in universities, a high-powered delegation comprising 25 undergraduate faculty members from a variety of institutions across the US met to develop a series of curriculum guidelines for an undergraduate data science course [2]. Drawn from the three major disciplines of mathematics, statistics and computer science, the guidelines stipulate that a graduating student majoring in data science must be proficient in subjects such as computational and statistical thinking, mathematics, model building, algorithms, data modelling and communication. Such guidelines define the skills that learners are supposed to have after completion of the course [3].

As the report by MGI states, data is now a key asset for companies, and data analytics can improve a company's key operations or help launch new business models to expand its markets. There are two ways to achieve this goal: (i) engaging professional data scientists, or (ii) up-skilling existing professionals who are not data scientists with the data-based skills required to meet the needs of industry. Producing data scientists is straightforward: students graduate with a major specialization in data science and are then deployed in various industries. On the other hand, developing data literacy skills in graduate students who do not have the prerequisite programming experience may be challenging due to the steep learning curve of the text-based programming languages that have traditionally been used to teach or learn the practical aspects of data science. However, a different programming paradigm has been developed over the years. It relies on visual, drag-and-drop, no-code computer programming tools, where instructions are encapsulated in blocks instead of text-based formal languages. Several blocks can then be connected sequentially to solve a programming problem. Such no-code models offer several advantages, especially to new learners. Mainly, they enable the learner to focus on algorithm development instead of struggling with the intricacies of the structure or style of the programming language.

This work is motivated by our experiences in teaching Data Science as a module in the Department of Applied Physics and Telecommunications under the Faculty of Science and Technology at Midlands State University in Zimbabwe. The Department offers a four-year Bachelor of Engineering Degree in Telecommunication Engineering, where Data Science is offered in the final year. In the module, students learn the fundamentals of data science as well as the applications of data science and big data in various domains. The Department also offers a four-year Bachelor of Science Degree in Industrial Physics and Instrumentation. Of late, data science has emerged as a favourite among students, with the majority of them implementing data science concepts in their final-year projects. However, we realized that although our students have a strong mathematical background, the majority of them, especially those from BSc Industrial Physics and Instrumentation, often struggle with developing practical computer code. As a result, this has affected the majority of students who intend to apply data science concepts in their final-year projects. To assist our students, we came across visual programming languages (VPLs), which have helped them develop data-related projects without worrying about the intricacies of coding.

The primary motive driving the present work is the need to reduce the data science learning curve for non-majors, especially on the practical side, which is mainly characterized by heavy programming. Currently, minimal effort has been made to formally merge data science education with these new visual programming tools, despite their promising advantages. As of now, the two fields have existed in parallel with each other, thereby failing to provide the broad-based curriculum required to support professional development in data science. This paper presents a data science curriculum initiative using No-Code tools (NCTs). The curriculum has been designed in such a way that the knowledge base, course structure and content are similar, in principle, to the curriculum that is based on traditional programming languages. It should be noted, however, that the proposed curriculum has not added or removed content from the existing data science curriculum; rather, the present work proposes the integration of emerging and flexible programming environments in data science education initiatives. We define the appropriate teaching and evaluation methods that are suitable for this type of programming. The primary objectives of our work are to integrate data science education with NCTs and to accelerate data science education using such tools, reducing the amount of time required to up-skill a non-data professional. We summarize the key contributions of our work as follows:

1) We describe key features and processes for visual programming approaches.
2) We demonstrate the feasibility of no-code paradigms as a potential aid for programming in data science education.
3) We evaluate the performance of visual programming environments, and demonstrate that they are comparable with text-based programming languages in terms of data science education.
4) We provide a survey on NCTs, as well as empirical evidence on the use of NCTs in education.
5) We demonstrate, through experiments, that NCTs are enablers of student success, especially for non-computer science majors in data science programming.

This paper is organized as follows. Section II describes the data science curriculum initiative. The need for a supplementary data science programming tool and the current state of visual programming and data science education are discussed. Furthermore, empirical evidence on the use of VPLs in formalized educational environments is presented. In Section III, we discuss the data science topics that can be implemented using VPLs and develop assessment items for Python and visual programming languages. Section IV discusses the teaching methods suitable for visual programming languages and Section V discusses the assessment methods. Results of the experiments conducted are presented in Section VI. Finally, we conclude the paper in Section VII.

II. THE DATA SCIENCE CURRICULUM INITIATIVE
This section addresses the major concerns raised in the literature about existing coding practices in data science education and discusses the goal of integrating visual programming languages in coding data science algorithms. Next, the section presents the basic architecture of VPLs, as well as the key competencies students must have in Data Science. Lastly, the section concludes by offering empirical evidence on the use of NCTs/VPLs in education.

A. THE NEED FOR A SUPPLEMENTARY DATA SCIENCE PROGRAMMING ENVIRONMENT
Data science is usually offered as an independent degree spanning a few years. The course features specializations in subjects such as data structures, algorithms, statistics, computer science, machine learning, and mathematics. There is a need, however, to present this course as a condensed module to students specializing in different fields such as Physics, Engineering, Telecommunications or any other field, so that they can make data-driven decisions without having extensive computer programming skills. To support non-computer science majors in this field, special considerations must be made to teach these students to apply a range of data science technical skills in their areas of specialization. In short, this study contributes to the development of the objectives, subject content, teaching methods, and evaluation instruments, as well as effective data science learning outcomes for non-computer science audiences. Further, we explore activities and learning strategies that demonstrate the data science workflow using visual tools. This requires simplifying the domain problem, knowledge of data formats, and the appropriate analytical techniques. To achieve this, the following three major questions arise:

1) Do non-computer science learners and professionals find no-code development tools helpful?
2) Do visual programming tools provide the necessary tools to equip learners with the knowledge and skills to solve data science problems?
3) How should test items be structured to assess and evaluate data science education based on NCTs?

The goal of this project was to define a set of no-code data science curriculum structures and guidelines to be incorporated into current and future undergraduate modules to support effective skill acquisition and analytical thinking for non-computer science majors. The proposed work is intended to address key unknowns such as what to teach and how to teach data science using no-code development tools. We examined a set of workflows on five key areas of data science, summarized by Hastie et al. [4] as data acquisition and storage, data pre-processing and cleaning, exploratory data analysis (EDA), predictive modelling and machine learning, and data visualization and communication, using no-code software platforms. Potential tools include (1) Orange, developed by Demsar et al. [5], (2) KNIME, from KNIME AG [6], and (3) Neural Designer, from Artelnics [7].

In this work, we present programming workflows implemented in Orange [5], as well as a scaffolded project that can be implemented using NCTs. This project highlights the data science life cycle as described by Wing [8] and Zhang et al. [9], as shown in Figure 1, and it takes learners through a practical learning experience to assist them in discovering patterns in the data and hence gaining valuable insights.

FIGURE 1. The data science workflow [8], [9]. The data science process starts with the generation of data. This could be raw, real-world data or synthetic data that is artificially created using mathematical models or simulations to mimic real-world data. Next, according to Wing et al., not all data generated will be used [8], so the following process concerns the systematic gathering of information from various sources. The collected data is then processed. This may include conducting preparatory steps on the raw data before analysis or modelling. Next, the data goes through several stages, such as data cleaning, storage, transformation, analysis, and visualization, all of which enable extraction of valuable insights from the data.

B. CURRENT STATE OF VISUAL PROGRAMMING AND DATA SCIENCE EDUCATION
1) VISUAL PROGRAMMING LANGUAGES
We intend to address the data science skills gap that exists, especially during the current era of rapid growth of data-driven industries. We address this gap by outlining how an emerging model of no-code computer programming tools can support data science education and help undergraduate learners extract information from data in their relevant field. Additionally, we hope that the uptake of such NCTs will help in up-skilling existing data science tutors and professionals alike in data-related roles in various industries. Determining the strengths, limitations and future of such NCTs will help us to achieve our objective. In this section, we survey the literature on visual programming languages, since the two terms refer to the same tools under different nomenclature.

A central feature of NCTs is that algorithms or functions are encapsulated into containers called widgets. These are the basic building blocks of the data analysis pipeline, and they can be connected to create a visual workflow, as shown in Figure 2. Although the appearance of a widget may vary depending on the platform, its structure typically consists of the components shown in Table 1.

The classical definition of NCTs has been described by early pioneers in the context of Visual Programming.
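As a rough illustration of the widget model described above and of the two-widget layout in Figure 2, the following Python sketch mimics widgets that receive data on an input port, process it, and relay the result through an output port. The `Widget` class, its `connect` and `receive` methods, and the sample values are hypothetical names chosen for this example; they are not part of Orange, KNIME, or any other tool discussed here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Widget:
    """Hypothetical stand-in for a no-code widget with one input and one output port."""
    name: str
    process: Callable[[Any], Any]  # the algorithm encapsulated by the widget
    downstream: List["Widget"] = field(default_factory=list)
    output: Any = None  # last value placed on the output port

    def connect(self, other: "Widget") -> "Widget":
        """Link this widget's output port to another widget's input port."""
        self.downstream.append(other)
        return other

    def receive(self, data: Any) -> None:
        """Accept data on the input port, process it, and forward the result."""
        self.output = self.process(data)
        for widget in self.downstream:
            widget.receive(self.output)


# Widget A cleans the data; widgets B and C each reveal a different view of it.
widget_a = Widget("drop missing", lambda rows: [r for r in rows if r is not None])
widget_b = Widget("mean", lambda rows: sum(rows) / len(rows))
widget_c = Widget("count", len)

widget_a.connect(widget_b)
widget_a.connect(widget_c)
widget_a.receive([4.0, None, 6.0, 8.0])

print(widget_b.output)  # mean of the cleaned values
print(widget_c.output)  # number of cleaned values
```

Connecting two downstream widgets to `widget_a` corresponds to the case where multiple widgets are attached to one widget to reveal several instances of the same data.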


TABLE 1. Components of a widget.

FIGURE 2. Basic layout and connection of widgets in NCTs. Two widgets are connected sequentially via a communication link that relays information between the two. In this workflow, widget A processes the data and transmits it through its output port to the next widget connected to it. The input of widget B receives this data, processes it further, and transmits it using its own output port. Depending on the configuration, multiple widgets could be connected to one widget to reveal several instances of the data.

In 1986, Myers defined visual programming as any system that could be programmed by the user in two or more dimensions [10]. This requires environments that use graphical techniques to aid the entire process of programming and developing computer applications [10], [11]. Unlike conventional text-based languages, visual programming is motivated by the idea that graphical techniques, particularly pictures, can convey information more concisely than one-dimensional, linear text-based programming languages [11]. In addition, pictures can break the language barrier, simplifying the process of programming for users regardless of the language they speak [11]. In summary, graphical tools provide two things: (i) a visual environment for programming, and (ii) a language interface for expressing visual information flow [12]. These are some of the predominant factors that have influenced the development of no-code programming languages.

To establish a common understanding, Kuhail et al. [13] combined the taxonomies of Myers [10] and Burnett and Baker [14] to divide existing VPLs into four broad categories: form-based, diagram-based, icon-based, and block-based languages. According to the authors, form-based programming allows end-user programmers to configure a graphical form, whereby parameters are added or changed using drop-down menus or windows. Diagram-based languages enable end-user developers to connect basic shapes such as rectangles, parallelograms, circles, etc. by arrows, lines or connectors to represent programming constructs. On the other hand, icon-based languages rely on the use of icons to visualize the organization, design and flow of a program. Lastly, block-based languages enable end-user developers to drag and drop components of a program in the form of blocks. Several blocks can then be connected to define program flow. In general, using visual or graphical expressions as a way of writing computer programs greatly reduces syntax errors and is easily comprehended by users of diverse backgrounds, since the human visual system and visual information processing are greatly optimized for multi-dimensional data [10].

Many VPLs have been developed in the literature. However, a large number of those do not possess the features of a true VPL. Although there is no consensus on what makes up a complete VPL [15], certain criteria must be met for a tool to be considered a VPL. It is generally agreed that the language of a visual programming environment must be able to convey meaningful information for programming, rather than just cosmetic graphics [12] or rich graphical user interface features. To extend the criteria, Burnett and Baker developed a classification scheme for VPL research papers [14]. In their work, they highlight a set of important features of a VPL, which can then be used for comparison. A detailed set of criteria is presented by Kiper et al. [15]. They suggest that VPLs can be assessed based on visual nature, functionality, ease of comprehension, paradigm support, and scalability.

Although this field has received considerable interest in the past, work is ongoing to address inherent problems that plagued early generations of visual programming tools. In the past, researchers faced difficulty developing large programs or processing large datasets [12]. This problem has been solved by recent advances in computer graphics, abstraction and cloud computing. It is now possible to fit a large number of blocks, icons or diagrams on the same computer screen. Users are now able to navigate large programs through the use of multiple sheets. The wide uptake of cloud computing and related services has played a crucial role in processing large datasets efficiently, and hence improves the functionality of visual programming tools. Elsewhere, early researchers cited a lack of functionality as another major drawback of early visual programming tools. Indeed, even some modern VPLs are hindered by this problem. Besides the lack of functionality, novice programmers may face limited or no room at all to develop and integrate customized program elements, due to proprietary software and the steep learning curve of the tool. Another aspect that characterized early generations of VPLs is inefficiency. This was a major challenge due to slow program execution [12] and large memory requirements. However, this is no longer a problem, as most tools leverage web-based online environments to deliver programming tools with high computational performance. Most platforms are developed using high-level programming languages such as C/C++, which are widely known for managing memory well.

TABLE 2. Some of the widgets that can be used to build programming workflows in the Orange software.

FIGURE 3. Screenshot of the Orange Development Environment showing widget repository and workspace. The workflow shown here presents a scatter plot of the dataset provided by the file widget.
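To make the correspondence between a visual workflow and text-based code concrete, the sketch below mirrors the two-widget workflow of Figure 3 (a File widget feeding a Scatter Plot widget) using only the Python standard library. The function names `file_widget` and `scatter_plot_widget`, the column names, and the sample values are assumptions made for illustration. This is not code generated by Orange, and it stops at the list of points a plot would draw rather than rendering one.

```python
import csv
import io

# Hypothetical sample data standing in for the file loaded by the File widget.
SAMPLE_CSV = """sepal_length,petal_length,species
5.1,1.4,setosa
7.0,4.7,versicolor
6.3,6.0,virginica
"""


def file_widget(text):
    """Stage 1 (File): parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def scatter_plot_widget(rows, x, y):
    """Stage 2 (Scatter Plot): extract the (x, y) points a plot would draw."""
    return [(float(row[x]), float(row[y])) for row in rows]


# Connecting the two stages is the text-based analogue of linking two widgets.
rows = file_widget(SAMPLE_CSV)
points = scatter_plot_widget(rows, "sepal_length", "petal_length")
print(points)
```

Each function plays the role of one widget, and passing the return value of one into the next plays the role of the communication link, which is the part a learner draws rather than types in a no-code tool.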

a: PERFORMANCE OF VISUAL PROGRAMMING LANGUAGES
A key question that arises is how to evaluate the performance of VPLs against set metrics. A deficiency in the literature, however, is an apparent lack of benchmarks for comparing VPLs against text-based programming languages. This is mainly due to the different architectures and operating principles of the programming environments. As already discussed, VPLs can be assessed based on attributes such as (i) visual nature, (ii) functionality, (iii) ease of comprehension, (iv) paradigm support, and (v) scalability [15]. There is a need, however, to develop cross-platform evaluation metrics to assess how VPLs fare against well-established and well-supported high-level programming languages such as C/C++, Python, Java, MATLAB, etc. On the other hand, it is possible to generate or export text-based code such as Python or C/C++ from VPLs and run it on supported hardware or platforms just like any program written natively in that language. This also paves the way for students to transition from VPL-based to text-based programming.

These, and many more improvements, especially in graphical techniques and data management, have motivated the uptake of such tools in modern software development, and we believe that they will be the cornerstone of data science education for years to come.

2) ORANGE AS A VISUAL PROGRAMMING ENVIRONMENT FOR DATA SCIENCE EDUCATION
The developers describe the Orange software as fruitful and fun, offering a visual programming environment that facilitates the implementation of the entire data science programming chain with only a few steps. Aided by a vast library of functions encapsulated in graphical blocks called widgets, Orange allows users to drag and drop widgets from different categories such as Data, Transform, Visualize, Model, Evaluate, and Unsupervised, among others, to build a data science workflow. Some of the widgets that can be used are shown in Table 2. Several VPLs exist, and to select the "best" VPL, Dobesova et al. conducted a comparative study to evaluate the performance of Orange software in teaching machine learning tasks in the Department of Geoinformatics at Palacký University Olomouc [16]. They show that the graphic representation of the program workflow, as well as the design procedures in the Orange application, is simple to use and that its visual language is semantically transparent [16] compared with others. Multiple works in the literature corroborate their statement [17], [18], [19], [20], and this motivated us to apply Orange as a programming aid for our final-year students. An example workflow is shown in Figure 3.

3) DATA SCIENCE EDUCATION
The concept of data science is not entirely new. In fact, the evolution of data science can be attributed to the interdisciplinary integration of various subjects such as statistics, mathematics, computer science and domain knowledge [21]. To put this into perspective, Conway [22] drafted what has come to be known as the "data science Venn diagram", as shown in Figure 4. This is a diagrammatic representation illustrating the interdisciplinary nature of data science.


FIGURE 4. Data science Venn diagram. The field of data science lies at the intersection of three primary fields: domain knowledge, computer science, and mathematics & statistics [22].

From the Venn diagram in Figure 4, it is quite evident that the three fields existed independently before the introduction of the data science nomenclature. It was only through the efforts of several authors, such as Tukey [23], Chambers [24], and Cleveland [25], among others, who called for the establishment of an interdisciplinary field based on the expansion of statistics and its integration with computer science and domain knowledge, that data science emerged [26], [27].

We have witnessed rapid growth in the field of data science over the past few years. The explosion of data, advances in technology and the demand for data science professionals are some of the major drivers behind this rapid increase [28]. To bridge the gap between professional data scientists and the data deluge, there have been concerted efforts along two fronts to expedite data analytics training. First, traditional educational institutions such as colleges and universities throughout the world have introduced several specializations in data science at various levels of education. Second, apart from mainstream formal data science education, there has also been a significant push through massive open online courses (MOOCs). The latter model offers several online data-related programs, often for free, in collaboration with tutors at major universities. This ensures customized and flexible learning modes, which greatly appeal to learners who are already working and might not have the time to commit to a full-time or part-time formal educational system. Regardless of the learning mode, every graduating data scientist must be proficient in certain skills and knowledge, which we shall refer to in this report as "key competencies in data science".

a: KEY COMPETENCIES IN DATA SCIENCE
Studies on the core competencies required for the graduating data scientist are scattered throughout the literature. However, the report by Donoho on 50 years of data science unambiguously articulates the six data activities that can be studied or taught in a data science course: (i) data exploration and preparation, (ii) data representation and transformation, (iii) computing with data, (iv) data modelling, (v) data visualization and presentation, and (vi) science about data science [27].

The curricula of data science courses offered by many educational institutions throughout the world often revolve around these activities. They may differ in wording or nomenclature, but the fundamentals do not differ. Further, Hazzan et al. suggested that such data activities promote computational thinking, statistical/mathematical thinking, and data thinking, all of which are cognitive skills [29]. In the report "A Guide to Teaching Data Science", Hicks et al. present five basic guiding principles for developing a data science curriculum [26]. They suggest that: (i) the course must be organized around a set of diverse case studies, (ii) computing must be integrated into every facet of the course, (iii) reliance on mathematical notation must be minimized, while promoting abstraction, (iv) the course activities must be structured to mimic a data scientist's experience, and lastly, (v) the importance of critical thinking must be demonstrated through examples [26].

These guiding principles and activities have formed the cornerstone of data science education for years. They have been able to develop data literacy among students, enabling them to gain actionable insights and extract meaning from data. However, in our view, one aspect that is often neglected in this process is the challenge of developing computational thinking skills among students. The challenge recognized here is the need to simplify the development of computer code that supports the entire process of data analysis, especially for non-computer science majors. Teaching programming has traditionally relied on text-based languages, which have been cited by many novice programmers as one of the barriers to entry in programming [30], and consequently in data science. Here, we believe that using engaging learning environments, such as NCTs, for programming will make developing computer code manageable for a larger number of students, while still developing the same key competencies in data science as text-based programming languages.

4) EMPIRICAL EVIDENCE ON THE USE OF NCTS/VPLS IN EDUCATION
This section seeks to establish the usefulness of NCTs to early-stage learners and professionals. It then determines whether VPLs provide the necessary tools to equip learners with the knowledge and skills required to solve programming problems. Finally, the section establishes the effectiveness of NCTs in facilitating the understanding of complex programming concepts. To address these questions, we conduct a short survey to review works that have implemented VPLs in any formal educational activities, from junior to tertiary education.

A significant effort to document the benefits of visual programming languages, compared with textual languages

100944 VOLUME 12, 2024


H. D. Mafukidze et al.: Development of a Modularized Undergraduate Data Science and Big Data Curricular

is reported in [30]. In their work, Kelleher et al. survey a variety of graphics-oriented programming languages that can be used for different application areas. They cite syntax and program style as primary barriers to programming and demonstrate that simplifying the mechanics of programming, especially for novice programmers, greatly reduces the barrier to learning to program. Later in 2007, the authors considered using storytelling to motivate programming [31]. In their later work, they attribute a falling interest in Computer Science in the US to the uninspiring courses taught. To inspire middle-school girls' interest in programming, they use the Storytelling Alice programming environment to create custom 3D animated movies from in-built characters and environments. Their results show most of the participants were able to create a sequential program in Storytelling Alice within just two hours of programming, while 87% were able to create a program with multiple flow control mechanisms. Based on these results, the authors concluded that offering computer programming in the form of storytelling encouraged the target group to learn to develop computer programs.

To support self-directed learning among young learners in developing computer programs, Maloney et al. developed the Scratch programming language and environment [32]. Maloney's programming environment allows students, especially those of primary-school age, to create engaging projects such as animated stories, games and simulations [32]. A distinguishing feature of Scratch is that program flow is constructed sequentially by joining together building blocks that represent actions or flow control mechanisms. Its primary goal is to introduce programming education to learners who have little or no programming experience. This goal has motivated the worldwide use of the tool, and by 2010, the program had been offered in nearly fifty languages and had supported almost two million users. By January 2024, the number of subscribers had risen to over one hundred million registered users [33], signifying the importance of graphical-based teaching of computer programming. The program has been so popular that it has been incorporated into formal education streams, targeting early-stage programmers in different fields.

A pilot study by Friss et al. at ORT Uruguay University during the 2nd semester of 2007 experimented with Scratch in two scenarios. They incorporated Scratch in (i) a university course and (ii) vocational studies environments to improve students' capabilities in computer science courses [34]. In their work entitled Scratch: Applications in Computer Science 1, the authors conduct formative and summative assessments on a group of students who were randomly selected from the class. They administered Scratch over three weeks, with the control group solving the same programming tasks manually. For their results, 88% of students who had used the Scratch programming environment described their learning experiences as "motivating" or "easy", while 80% of the control group described their learning experiences as "normal" or "difficult".

In perhaps an interesting and related application area, Estevez et al. introduced Artificial Intelligence to high-school students using Scratch. Their work is motivated by a strong need to grasp the attention of young learners through the use of interactive graphical programming tools in computational sciences, which is usually characterized by a lack of appeal of the presentations [35]. In their work, they teach young learners two basic methods of Artificial Intelligence: data clustering and artificial neural network learning. They selected 37 students and followed a scaffolded teaching approach, where they provided a pre-built template for the students to fill the gaps among lines of code. Just like the authors in the previous work, Estevez's approach also conducts a formative and summative assessment of the target group. A quantitative analysis of the results reveals that the students acquired confidence to understand the fundamentals of Artificial Intelligence algorithms, and its holistic view.

A case study by Ase et al. to teach engineering modules using computer-aided animations was conducted at the University of Hertfordshire. The study focused on applying visualization and 3D animations in automotive engineering courses to help improve conventional teaching resources. They explore the benefits of automation, with particular emphasis on 3D computer-aided animation tools for automotive studies. This is an innovative paradigm shift from the conventional methods of delivery that are based on 1D flowcharts, schematics and static objects. In their results, they report that over 61% of the students reported a better understanding of automotive engineering modules taught using animations.

We have provided empirical evidence where VPL tools have been applied successfully in education. Results show that such programming environments are helpful to early-stage learners and that they have the necessary tools to foster learning. Furthermore, it has been shown that such tools grasp the attention of learners, promote their motivation to learn, and improve their learning experiences, without focusing on the mechanics and intricacies of programming, which have been shown to be a barrier to programming. With these results, we are convinced that such tools can also be integrated with data science education to improve problem-solving, creativity, motivation, collaboration and data science communication at the tertiary level.

III. DATA SCIENCE PROGRAMMING USING NCTs
The following Chapter discusses the sections of the data science curriculum and develops section-specific assessment items for evaluating Python coding against NCT workflows, as shown in Tables 4 to 12.

It should be noted, however, that the chapters for this proposed curriculum have been adapted from the conventional data science curriculum, as found in modern textbooks such as [4] and [36], and teaching has been modified to support data science education using no-code programming environments. This was done to ensure learners would be exposed to the same content that is offered in a conventional


data science curriculum while being taught in dynamic and interactive learning environments. This curriculum can also be used by tutors who want to study introductory data science using such tools.

TABLE 3. Overview of the proposed curriculum: The proposed data science curriculum using NCTs, including the aims, knowledge area and learning objectives in nine chapters. The design follows a typical data science structure for majors, except that the practical component does not rely on textual programming.

A. CHAPTER 1: INTRODUCTION TO PRACTICAL DATA SCIENCE AND NCTs
As shown in Table 3, the proposed curriculum begins with an overview of practical data science approaches using NCTs. This introduction takes learners through the fundamental concepts of the application of data science and gives a detailed discussion of the working principles of NCTs. Additionally, this introductory part discusses the importance of data science in industry, the opportunities, challenges, future and the impacts of data science education using highly abstractive frameworks. The learning objective of the first module is to understand the most popular NCTs, in addition to basic knowledge of data science and its applications, especially in the student's application area. At the end of the first chapter, new learners must have developed a thorough understanding of data science, and its application areas, along with the knowledge of visual tools for developing data-centric frameworks.

Table 4 shows some of the assessment items for the introduction to practical data science and NCTs chapter. Here, the main emphasis is on introducing NCTs to the learners and installing the software locally on their machines.

TABLE 4. Assessment items for Chapter 1.

B. CHAPTER 2: DATA COLLECTION AND PREPARATION
The following chapter is by far the most critical. In fact, some texts in the literature cite this phase as the most time-intensive [37]. The data collection and preparation phase requires a substantial investment of time, money, and resources due to the intricacies of data, planning and design processes, and ethical and legal considerations. It should be noted, however, that data collection is not platform-specific. It is the same across different platforms. On the other hand, data preparation is different. It requires extensive knowledge of the data structure and procedures to sort and prepare it for subsequent stages.


Therefore, the curriculum for data collection and preparation focuses on students understanding the various data types, formats, and pre-processing stages that are performed on datasets using NCTs, especially very large datasets.

The starting point of any data-related problem is the collection of usable, representative and unbiased data. This is a critical process that requires (1) a prior understanding of the problem at hand, (2) formulating research questions, and (3) a thorough comprehension of the subsequent objectives of the data analysis [38]. Several authors in the literature discuss various principles and procedures that must be followed to ensure data integrity; however, that is beyond the scope of this work. A comprehensive overview is provided by [39]. In this chapter, we will only focus on working with the data that has already been collected. However, the rule of thumb is to ask "What data?". Navigating this space will require identifying the relevant data sources and planning the data collection and processing methods. Research shows, however, that many students and novice data scientists often struggle with this part [40], hence the use of NCTs to simplify the data collection and preparation process.

Table 5 shows some of the assessment items for the data collection and preparation chapter. The primary focus is to take learners through the processes of collecting data and preparing it for the subsequent steps.

TABLE 5. Assessment items for Chapter 2.

The learning outcomes of this chapter have been developed as follows: (i) identify the sources of data for a particular project, (ii) evaluate the reliability of data sources and the data collection procedures, (iii) collect that data from different sources and (iv) clean and pre-process the data to detect and handle inconsistencies such as missing values and outliers.

C. CHAPTER 3: DATA VISUALIZATION
Data visualization is considered to be one of the most important topics in data science [41]. As such, a lot of emphasis has been placed on the development of programming languages, visualization libraries and frameworks to enable data-driven decision-making. Recent efforts are in huge graph visualization with big data infrastructure [42].

The concept of data visualization can be traced back centuries to ancient Greek mathematicians who utilized latitude and longitude information to visualize geographic information [43]. Subsequent developments in coordinate systems and Cartesian graphs by scientists, mathematicians and philosophers in the 17th Century are widely considered to have laid the foundations of modern data visualization techniques [44]. However, it is in the 20th Century that data visualization rose to prominence due to major developments in computer graphics, technology, scientific visualization, personal computers, and software tools [43]. The continual developments in software tools into the 21st Century, especially open-source tools, enabled users to create custom visualizations using simple, yet powerful data visualization libraries such as Pandas, Matplotlib and Seaborn [45], among others.

The objectives of the data visualization stage in the data processing pipeline vary depending on many factors. However, any data-proficient student must be able to effectively communicate insights and findings, support informed decision-making, and identify patterns, trends, and correlations derived from the data analysis, irrespective of the platform. NCTs support data visualization through the use of widgets that create visual elements such as scatter plots, line plots, histograms, bar charts, and heat maps, among others. At the end of this chapter, learners must be able to confidently create interactive and informative visualizations that (1) facilitate the understanding of relatively straightforward or complex data and (2) provide a holistic view of the data, thus facilitating more informed knowledge discovery.

Table 6 shows some of the assessment items for the data visualization chapter. Here, the focus is on using relevant widgets to visualize relations in the data.

TABLE 6. Assessment items for Chapter 3.

D. CHAPTER 4: UNSUPERVISED LEARNING
The subject of unsupervised learning rose from a strong need to detect anomalies or discover hidden structures or trends in unlabeled datasets. A central feature of these algorithms is that they do not require prior knowledge or output labels of the datasets. In other words, they do not require labeled training datasets to learn data dependencies; instead, they learn features on their own from uncategorized data on the fly. This chapter aims to study a range of unsupervised machine learning algorithms for clustering such as K-means, hierarchical clustering, and density-based spatial clustering of applications with noise (DBSCAN). The chapter proceeds with a discussion on dimensionality reduction techniques. These are a set of algorithms that reduce the number of variables or features, creating a lower dimensional representation of the dataset. Principal component analysis is widely used for this. Lastly, the chapter explores algorithms for anomaly detection. This is a very critical and broad area that has witnessed significant research over the years due to its capability of detecting


unusual or inconsistent data points in various applications. These algorithms can be applied in areas such as network security, fraud detection, and quality control, among others, to distinguish between normal and unexpected behaviour. Finally, at this stage, we are only interested in three learning outcomes for this chapter. Students must: (i) understand unsupervised learning algorithms, (ii) apply unsupervised learning models on real-world data and (iii) evaluate the performance of unsupervised learning algorithms.

TABLE 7. Assessment items for Chapter 4.
TABLE 8. Assessment items for Chapter 5.

Table 7 shows some of the assessment items for the unsupervised learning chapter. Learners can work with different widgets such as k-means, DBSCAN, and PCA, among others, to identify distinct clusters in the dataset.
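
To make concrete what a k-means widget does behind the scenes, the following is a minimal pure-Python sketch of the standard Lloyd iteration (assign each point to its nearest centroid, then recompute each centroid as the mean of its members). It is an illustration only: the function name `k_means`, the toy 2-D points and the hand-picked starting centroids are hypothetical and are not taken from the curriculum or from any particular NCT.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def k_means(points, centroids, max_iter=100):
    """Plain Lloyd iteration on a list of coordinate tuples.

    `centroids` holds caller-supplied starting centres (an assumption of
    this sketch; real tools usually choose them automatically).
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep the centre of an empty cluster
        if new_centroids == centroids:  # converged: no centroid moved
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy data: two visibly separated groups of 2-D points.
toy_points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
              (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centres = k_means(toy_points, centroids=[toy_points[0], toy_points[3]])
print(labels)   # cluster index assigned to each point
print(centres)  # final cluster centres
```

In a no-code tool the same two steps are hidden behind a single widget; the sketch is meant only to show what learners are configuring when they set the number of clusters.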
E. CHAPTER 5: SUPERVISED LEARNING
Unlike unsupervised learning algorithms that do not require prior knowledge of the data, there exists another learning paradigm that requires knowledge of the class labels to make predictions or decisions. Supervised learning algorithms require a lot of training data that consists of input features and their corresponding output labels. The goal is to put related objects with similar features in the same class or label. The algorithms then learn these features or study the patterns or relationships between members of the same class. Training is realized through an iterative process of adjusting the model's parameters based on minimizing the training error. Several supervised learning algorithms have been developed in the past, and some of the most popular include neural networks, linear regression, support vector machines, and decision trees, among others. After model training and evaluation, the model is ready to make predictions on new or unseen datasets. It is a common goal for a well-trained model to make accurate predictions on new datasets. The overarching aim of this chapter is to take students through the entire process of training, testing/validating, making predictions/inferences and deploying supervised learning algorithms using NCTs.

Table 8 shows some of the assessment items for the supervised learning chapter. These questions test the student's ability to create models that learn from the data.

F. CHAPTER 6: FEATURE ENGINEERING AND SCALING
The next chapter in the curriculum covers feature engineering and selection using no-code tools. This is another important practice that involves creating or selecting better representations or informative features in the dataset for machine learning algorithms. A major advantage of selecting features that carry more information is that the overall performance of the machine-learning model is improved. In addition to that, the overall computational cost and memory requirements are reduced due to an overall reduction in the number of variables. This is another research area that has received considerable attention, and progress has led to the development of algorithms that use computational methods for automatic feature engineering and selection. On another front, deep neural networks learn the relevant features automatically, and this has wide applications in image, video or speech processing. Although automated feature engineering has shown considerable success in the literature, there is a need for data science learners to grasp the background knowledge of such an important aspect. As such, the goal of this chapter is to equip students with the skills of creating and selecting valuable features from the data, along with feature transformation and scaling using NCTs.

Table 9 shows some of the assessment items for the feature engineering and scaling chapter. The questions ask learners to create new features from the dataset.

G. CHAPTER 7: MODEL EVALUATION AND DEPLOYMENT
After model development, a natural question that typically follows is "How good is that model?" Although this may seem a trivial question, there is a strong need to know if the model can make accurate predictions on unseen data. If poorly trained, machine learning models tend to memorize the training data


very well, leading to poor predictions on new samples. This is a huge problem, especially in safety-critical applications, or the development of new algorithms. Answering this question requires a scientific methodology that addresses a specific question: "How do we evaluate the generalization performance of machine learning models?"

TABLE 9. Assessment items for Chapter 6.

The next chapter in the curriculum discusses concepts of model evaluation and deployment using NCTs. In simplest terms, model evaluation refers to the process of determining the performance of models using certain metrics or other approaches. Typical evaluation metrics include but are not limited to accuracy, precision, recall, and F1 score [46]. At the end of this chapter, students must be able to assess machine learning models using various metrics and techniques, along with deployment in real-world scenarios.

Table 10 shows some of the assessment items for the model evaluation and deployment chapter.

TABLE 10. Assessment items for Chapter 7.

H. CHAPTER 8: TIME SERIES ANALYSIS
The content so far has only focused on modelling the spatial relationships in the data. There is a certain area, however, that studies the temporal dependencies in the data. For example, problems such as power output forecasting, predicting stock prices, weather prediction, energy usage, fuel consumption monitoring, and many more, require historical data at pre-determined intervals over a longer period to predict future values. Time Series Analysis gives insights into the behaviour or trends of time-varying observations. This involves studying the temporal sequence of features using statistical techniques, recurrent neural networks with memory blocks, or 3-dimensional deep learning models. Several architectures with different mechanisms for time series data exist, and their performance has been widely documented in multiple texts in the literature [47], [48]. Since this is a large area with so many application scenarios, this curriculum hopes to equip students with the knowledge of forecasting and trend analysis of sequential data.

Table 11 shows some of the assessment items for the time series analysis chapter. The questions determine the student's ability to extract temporal information from the data.

TABLE 11. Assessment items for Chapter 8.

I. CHAPTER 9: INTRODUCTION TO MACHINE LEARNING AUTOMATION
The last component in this curriculum deals with machine learning automation. This is a particularly interesting subject that uses computer algorithms and tools to perform tasks such as data preparation, feature engineering, best model selection, training, and hyperparameter tuning, among others. This came from an urgent need to address a pertinent question in the literature: "How can users design more efficient and effective machine learning models?" Although this is a broad question, an answer to this has led to the removal of the "human-in-the-loop", since some aspects of the machine learning workflow rely on a trial and error basis, for example, determining the best learning rate for the training dataset. The aim of automating machine learning workflows is to speed up the development and subsequent deployment of machine learning models. This is helpful, especially to novice data science professionals; however, human expertise is still crucial in the process. At the end of this chapter, students must be able to leverage the power of NCTs to develop and implement automated machine learning workflows and use the knowledge they have gained to interpret the results.

TABLE 12. Assessment items for Chapter 9.

In this section, we have presented a nine-chapter syllabus that takes students through the entire data analytic workflow using NCTs. The data science curriculum based on NCTs is not in any way different from the conventional curriculum. Students still learn the basics of data science, but using a different tool for programming. The structure follows traditional


content that consists of an introduction to data science, data collection, data visualization, unsupervised learning, supervised learning, feature engineering, model evaluation and deployment, time series analysis, and machine learning automation. In that regard, we believe that more emphasis should be placed on developing teaching, assessment and evaluation methods suitable for NCT-based education.

TABLE 13. Methods of creating project-based learning workflow.

Table 12 shows some of the assessment items for the machine learning automation chapter. The emphasis is on testing students' abilities to automate the entire process of machine learning.
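
One of the trial-and-error steps that machine learning automation takes over is the choice of a learning rate. As a rough sketch of the idea, the Python snippet below tries a few candidate rates for gradient descent on a one-parameter least-squares fit and keeps the rate with the lowest final error. The data, the candidate rates and the function names (`train`, `pick_learning_rate`) are hypothetical and chosen purely for illustration; NCTs and automated machine learning tools generally rely on more elaborate search strategies.

```python
def mse(w, xs, ys):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(lr, xs, ys, steps=50):
    """Fit w by plain gradient descent and return (w, final error)."""
    w = 0.0
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w, mse(w, xs, ys)


def pick_learning_rate(candidates, xs, ys):
    """Automated trial and error: try every rate, keep the best one."""
    results = {lr: train(lr, xs, ys) for lr in candidates}
    best = min(results, key=lambda lr: results[lr][1])
    return best, results


# Hypothetical toy data generated from y = 3x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]
best_lr, results = pick_learning_rate([0.0001, 0.01, 0.2], xs, ys)
print(best_lr, results[best_lr])
```

With these toy values the smallest rate barely moves the parameter and the largest one diverges, so the search settles on the middle candidate. Interpreting why that happens is the kind of human judgement the chapter still expects from students.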

IV. TEACHING METHODS
A key pillar in the curriculum is teaching. In this Chapter, we review the teaching methods that are suitable for NCTs and provide relevant examples of their implementation. We discuss lecture-based, instructor-led problem-based and project-based learning approaches with respect to their strengths and weaknesses, and further give an example of implementing project-based learning on a sample fall detection project.

There are several approaches to learning, and each approach produces different outcomes [49]. Since the new data science curriculum is composed mainly of theoretical and practical aspects, we believe that a combination of multiple learning strategies is more suitable to teach programming for data science using NCTs. We suggest a combination of lecture-based, instructor-led problem-based and project-based learning approaches. In the past, lecture-based learning has been the primary mode of instruction at most tertiary institutions. This instructional approach traditionally involves a classroom set-up, where the lecturer leads the class and presents information while the students listen and take notes [50]. This model is relevant in this curriculum since the students need organized and structured background information and literature from the lecturer to understand the basics. Although this mode offers efficient delivery of information, it continues to be the subject of criticism in various disciplines [51], [52]. Zhao and Potter argue that lecture-based learning is mainly lecturer-centred, and promotes the superficial acquisition of knowledge [52]. We believe that this criticism is valid, to a larger extent, if the entire curriculum is based solely on this delivery method.

To reduce the gap between lecture material and practice, problem-based learning led by an instructor is envisioned to compensate for deficiencies in an all lecture-based learning model. Problem-based pedagogy is an instructional method that has origins in the medical field and has been reported to foster active learning by creating a need to solve an authentic problem [53]. This is a mainly student-centred pedagogical approach where the starting point is to solve a context-specific problem that the students can handle [53]. Problems can then be solved by the students in small collaborative groups, often led by a lecturer [54]. This area has had a long history and has received a lot of attention in the literature. Different authors have proposed distinctive guidelines for implementing and assessing problem-based learning in schools. In this curriculum, we adopt the seven-step method described by Graaff et al. to help students analyze the problem [55]. These are (i) clarify the concepts, (ii) define the problem, (iii) analyze the problem, (iv) find the explanation, (v) formulate the learning objective, (vi) search for further information and (vii) report and test new information [55]. This is envisaged to intrinsically motivate students, improve communication and effective collaborative skills, and provide a more enjoyable learning experience [56].

Project-based learning is another student-centred approach to learning with particular emphasis on investigating real-world systems and collaboration to gain knowledge [57]. Although it shares similar characteristics with other instructional approaches such as problem-based learning, here the focus is on the use of projects to promote learning [58]. Condliffe argues that solving projects stimulates learning [59]. This is aided by working on and implementing the design principles that are being taught. Table 13 outlines the methods of creating project-based learning workflows, as described by two prominent authors, Grant [60] and Krajcik [61]. Figure 5 presents an anatomy of a project-based learning approach inspired by Grant [60]. This discusses several steps that can be implemented using NCTs such as Orange. We believe Grant offers a stronger approach to project-based learning due to important and relevant stages. We also provide an example of implementing project-based learning using NCTs on a project to develop a real-time fall detection and monitoring system for the elderly.

All three instructional approaches will be adopted in this curriculum to help new learners develop skills in data science. In this curriculum, each chapter will begin with conducting lectures, following the conventional lecture-based method. As students understand the concepts, problem-based learning is gradually introduced, and it covers the rest of the chapter. Towards the end of the module, students are given


a project to solve individually, or in groups. An example project, along with the implementation layout, is shown in Figure 5.

V. ASSESSMENT METHODS
This Chapter addresses five important questions: (i) how
do you assess learners who use graphical tools for
programming?, (ii) who has reported on the use of visual
programming languages, especially in higher and tertiary
education scenarios?, (iii) has it been successful?, (iv) what
were their recommendations?, and (v) how can this impact
our curriculum design?
The question of student assessment, especially in general
andragogy, has been addressed thoroughly throughout the
literature [63], [64], [65]. This is mainly conducted to
evaluate how well students have performed against a set of
learning outcomes at various stages of learning. This provides
quantifiable evidence that can then be used by both students
and lecturers to evaluate the knowledge and skills gained
through learning [66]. In [67], Llamas-Nistal et al. discussed
the two main categories of assessment as continuous, and
summative assessment. Continuous assessment is usually
carried out during the instructional process to gather and
analyze information on students' performance [68]. On the
other hand, summative assessment is carried out towards or
at the end of the learning process to evaluate cumulative
knowledge and skills gained [68].
According to the computing curriculum developed in
2013 by the Association for Computing Machinery – IEEE-
Computer Society, the generic learning outcome of any
programming course is to design, implement, test, and
debug a program that incorporates some basic programming
constructs [69]. This is a guideline that has been used
by many authors in the literature to assess students in
programming and has been used as a basis to judge the
coding abilities of first-year computer science students [70].
Assessing students who use graphical tools for programming
is not in any way different from assessing students who
use conventional programming tools. With VPL tools, tutors
can also test students’ main learning outcomes such as
designing, implementing, testing, and debugging a program
that incorporates basic to advanced programming constructs.
Here, students will connect various widgets of a VPL
to demonstrate program sequence, selection functions, and
iteration loops. Afterwards, the tutor will run their programs
to evaluate these concepts.
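
For comparison with a widget-based workflow, the three constructs named above (sequence, selection and iteration) look as follows in a text-based language. This is a minimal Python sketch; the grading task, the pass mark of 50 and the function name `summarise_marks` are hypothetical and serve only as an example of the kind of logic a tutor might ask students to reproduce with widgets.

```python
def summarise_marks(marks, pass_mark=50):
    """Count passes and fails and compute the average mark."""
    # Sequence: statements run one after another.
    passed = 0
    failed = 0
    total = 0

    # Iteration: repeat the same steps for every mark.
    for mark in marks:
        total += mark
        # Selection: choose a branch based on a condition.
        if mark >= pass_mark:
            passed += 1
        else:
            failed += 1

    average = total / len(marks) if marks else 0.0
    return {"passed": passed, "failed": failed, "average": average}


summary = summarise_marks([72, 45, 60, 38])
print(summary)  # {'passed': 2, 'failed': 2, 'average': 53.75}
```

Running the program and checking its output against a known answer mirrors the step in which the tutor runs a student's workflow to evaluate these concepts.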

FIGURE 5. Project-based learning implementation on a sample project. The capstone project was developed as an example to assist students through the workflow. The starting point is an introduction that presents the subject area and provides the background of the project. Next, the task presents several activities to be accomplished. The resources outline the materials required to do the project. For this example, we used the fall dataset for the elderly [62]. This is followed by processes, scaffolding, collaborative learning and finally, reflection.

A. VPL-BASED CURRICULUM ASSESSMENT AND EVALUATION
1) PARTICIPANTS
Participants were students enrolled in the Department of Applied Physics and Telecommunications and the Department of Computer Science under the Faculty of Science and Technology at Midlands State University. Students were enrolled in one of the 3 Degree programs: BEng Honors


in Telecommunication Engineering Degree, BSc Honors Degree in Industrial Physics and Instrumentation and BSc Honors Degree in Computer Systems Engineering. The study targeted final-year students who were working on completing their dissertation projects. Three classes from six different semesters over three years were studied. Class sizes ranged from 10 to 25 students. Most participants were male. All participants completed a first-year introduction to computer programming module. Participants from the Telecommunications Engineering and Industrial Physics Degrees had studied at least two computer science-related modules, and are thus considered non-majors. Participants from the Computer Science Degree have vast computer programming experience, after having studied at least 20 computer science modules, and are thus considered majors.

FIGURE 6. Students’ summaries of the responses to the questionnaire.

2) METHODS
The research process focused on implementing and assessing the effectiveness of three pedagogical methods, a lecture-based model [50], [51], [52], an instructor-led problem-based model [53], [54], [56], and a project-based model [57], [58], [59], [60], [61], using VPLs on data science problems. The research was implemented in two interventions. The first intervention motivated this research, and it took place over six semesters covering three years from March 2021 to January 2024. Here, the study focused on qualitatively observing students from the BEng Honors in Telecommunication Engineering class and the BSc Honors in Industrial Physics and Instrumentation class on their ability to implement final-year dissertation projects using VPLs (project-based learning). Post-study interviews were conducted to determine the usefulness of VPLs in solving data science and related programming problems.
The second intervention centred on two academic semesters, from August 2023 to January 2024, of pedagogical practice in data science education using the lecture-based model and the instructor-led problem-based model. Here, the study focused on quantitatively observing students from the three classes on the VPL assessment items mentioned previously in Chapter III and on post-study questionnaires.

VI. RESULTS
This Chapter presents the results of qualitative and quantitative data analysis of the first intervention performed on the effectiveness of VPLs in teaching programming aspects of data science. Specifically, the first research question, “Do non-computer science learners and professionals find no-code development tools helpful?”, is addressed by qualitatively assessing the performance of students in successfully implementing a data science-related project using VPLs. The second research question, “Do visual programming tools provide the necessary tools to equip learners with the knowledge and skills to solve data science problems?”, is answered through questionnaires on students’ experiences with the VPL. Finally, the third research question, “How should test items be structured to assess and evaluate data science education based on NCTs?”, is answered by quantitatively assessing the performance of students on sample assessment items.

A. STUDENTS’ PERFORMANCE ON VPL-BASED PROJECT-BASED LEARNING
The first research question is addressed through qualitative analysis, observing the performance of non-computer science students in implementing a final-year dissertation project based on VPLs. Students were observed throughout the development of the project, which allowed us to evaluate the usefulness of VPLs. There were two groups of students: the control group, which used Python code to develop and implement a dissertation project, and the experimental group, which used VPLs to develop and implement a slightly different dissertation project. The observations revealed that students who used Python code took longer to complete their projects and sought more assistance, asking many questions of lecturers and fellow students and consulting online forums and tutorials. On the other hand, students who used VPLs completed their projects with minimal help. From these results, it can be stated that NCTs or VPLs are useful, especially for students who do not have vast programming experience.

B. VPLS’ ABILITY TO SOLVE DATA SCIENCE PROBLEMS
The second research question is focused on establishing students’ views and experiences on the use of VPLs in solving data science programming problems. To address this question, quantitative data were collected from post-study questionnaires. The questionnaires consisted of 10 questions, as shown in Table 14, measuring students’ opinions, experiences and motivation toward the use of VPLs in data science education. A numerical rating scale from 1 (“minimum”) to 5 (“maximum”) was used [71]. Fifty students completed the post-study questionnaire.
From the responses to the questions in Table 14, 77% of the experimental group students admitted to struggling with writing their code using Python or C/C++ before the study, while only 4% demonstrated proficiency in developing code. 88% of the students found the VPL tool to be easy to use,


and 80% said VPLs facilitated the implementation of data science algorithms in their projects. An impressive 96% of the students from the experimental group highly recommended using VPLs for programming to their fellow students who did not have vast programming experience. The responses are summarised in Figure 6.

TABLE 14. Questions regarding students’ views and experiences on the use of VPLs in their data science projects.

C. ASSESSMENT AND EVALUATION OF DATA SCIENCE PROGRAMMING USING PYTHON AND NCTs
The third research question uses the second intervention, which centres on quantitatively assessing and evaluating the performance of students from the three classes (the BEng Honors in Telecommunication Engineering class, the BSc Honors in Industrial Physics and Instrumentation class and the BSc Honors in Computer Systems Engineering class) on Python and VPL assessment items. The control group is drawn from the BSc Honors in Computer Systems Engineering class while the experimental group is drawn from the non-majors (the BEng Honors in Telecommunication Engineering and the BSc Honors in Industrial Physics and Instrumentation classes).
Code snippets of the assessment items of the two groups were analyzed to assess their performance in developing code using Python against VPLs. All sessions were supervised, and they were conducted in a typical computer laboratory setup. An analysis of the results is given in Table 15.
The benefits of using visual programming languages to develop code were also confirmed in how students performed in VPL-based assessment items against standard Python-based assessment items. The performance of the students from the two groups was compared with a two-sample t-test, which found a significant difference between the control group and the experimental group. The test gave a t-value of 12.57 with p < 0.0001, well below the 0.05 significance level. The results are shown in Table 15. This provides sufficient evidence to suggest that the use of visual programming languages increased students’ abilities to develop code for solving data science challenges.

TABLE 15. t-Test – Statistical analysis of the performance of non-computer science majors (Physics and Telecommunications students) on Python-based and VPL-based assessment items.

VII. CONCLUSION
As the data industry grows exponentially, so does the need to train new professionals or upskill existing professionals with data-related skills and new knowledge. This paper has structured a modularized undergraduate curriculum for data science education using no-code programming tools. It is modular in the sense that it makes up only one component of a course and the learning content is spread across one university semester. Guided by well-established educational philosophies, this curriculum adopts effective teaching and learning methods that are interactive and student-centred. Students benefit mainly from the advantages of graphical-based tools for learning programming and also the theoretical concepts of data science, which are delivered using the conventional lecture-based approach. The main teaching methods identified for the new method of learning data science are discussed in detail. Moreover, this curriculum adheres to the data science guidelines that define the key competencies in data science for undergraduate learners.
This work demonstrates the benefits of using visual programming languages to develop code for implementing data science concepts. Overall, the students’ engagement in learning data science increased, and their assessment marks greatly improved. Notably, the control group obtained a mean of 46.28% on Python questions, while the experimental group obtained a mean of 71.76% on VPL-based questions. For the sake of comparison, the computer science class obtained means of 64.28% and 75.32% on Python and VPL questions respectively. There was a small difference of 3.56 percentage points between the means of non-computer science majors and computer science majors who had used VPLs to solve data science programming questions. This means that VPLs can assist


non-computer science majors to achieve higher marks that are comparable to the marks of computer science majors.
In this work, VPLs have shown great potential as a pedagogical aid to data science students who do not have a strong programming background. Assisting learners to transition from no-code tools to text-based programming languages is left for further research.
In conclusion, the overarching contribution of this work will support students, tutors, educational institutions and data industries by (i) reducing the time and resources required to learn data science programming, (ii) offering an alternative approach to text-based programming in data science education, and (iii) providing detailed procedures for developing teaching, learning, evidence-based assessment and evaluation methods using interactive learning environments.

APPENDIX A
OVERVIEW OF ARTIFICIAL NEURAL NETWORKS AND DATA SCIENCE
The human brain consists of several interconnected cells that transmit information encapsulated in electrical and chemical signals from various parts of the brain. These cells or neurons receive sensory inputs, process the signals and relay the output to other neurons. Neurons can work together to learn the solution to a problem by creating a neural pathway. This pathway becomes more accurate through trial and error by identifying neurons that regularly communicate. Through regular practice, the brain learns to solve a problem.
This simple, yet complex operation is the fundamental principle that has influenced the development of the artificial neuron as shown in Figure 7. Each neuron can be seen as an individual computing node that accepts one or more inputs $x_i$ and produces an output $y$ based on an activation function $f(z)$. Non-linear functions such as the sigmoid are typically used; however, several other activation functions are suitable for this purpose. A summary of the most common activation functions for neural networks is presented by Parhi and Nowak in [72]. A collection of these neurons forms an artificial neural network that learns from data by adjusting its parameters and finding the correct solutions on its own, thereby mimicking human intelligence. In what follows, we describe the technical principles that govern the operation of artificial neural networks.

FIGURE 7. Basic representation of an artificial neuron. The artificial neuron is analogous in operation to the biological neuron. It has a bias parameter (constant), and accepts two or more inputs, which are multiplied by their respective coefficients ($w_i$). These weighted inputs are summed together to produce an output $z = \sum_{i=0}^{n} w_i x_i$, which is then passed through an activation function to produce $y = f(z)$ [72].

REFERENCES
[1] M. Analytics, “The age of analytics: Competing in a data-driven world,” in McKinsey Global Institute Research. McKinsey & Company, 2016. [Online]. Available: https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-age-of-analytics-competing-in-a-data-driven-world
[2] R. D. De Veaux et al., “Curriculum guidelines for undergraduate programs in data science,” Annu. Rev. Statist. Appl., vol. 4, pp. 15–30, Aug. 2017.
[3] M. J. Ramzan, S. U. R. Khan, Inayat-Ur-Rehman, T. A. Khan, A. Akhunzada, and C. Naseeb, “A conceptual model to support the transmuters in acquiring the desired knowledge of a data scientist,” IEEE Access, vol. 9, pp. 115335–115347, 2021.
[4] T. Hastie, R. Tibshirani, and J. H. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, vol. 2. Springer, 2009.
[5] J. Demsar, T. Curk, A. Erjavec, C. Gorup, T. Hočevar, M. Milutinovič, M. Možina, M. Polajnar, M. Toplak, A. Starič, M. Štajdohar, L. Umek, L. Žagar, J. Žbontar, M. Žitnik, and B. Župan, “Orange: Data mining toolbox in Python,” J. Mach. Learn. Res., vol. 14, no. 1, pp. 2349–2353, 2013.
[6] M. R. Berthold, N. Cebron, F. Dill, T. R. Gabriel, T. Kötter, T. Meinl, P. Ohl, K. Thiel, and B. Wiswedel, “KNIME–the Konstanz information miner: Version 2.0 and beyond,” ACM SIGKDD Explorations Newslett., vol. 11, no. 1, pp. 26–31, Nov. 2009.
[7] A. Santos. (2023). Home. [Online]. Available: https://www.neuraldesigner.com/
[8] J. M. Wing, “The data life cycle,” Harvard Data Sci. Rev., vol. 1, no. 1, p. 6, 2019.
[9] A. X. Zhang, M. Muller, and D. Wang, “How do data science workers collaborate? Roles, workflows, and tools,” Proc. ACM Hum.-Comput. Interact., vol. 4, no. CSCW1, pp. 1–23, May 2020.
[10] B. A. Myers, “Visual programming, programming by example, and program visualization: A taxonomy,” ACM SIGCHI Bull., vol. 17, no. 4, pp. 59–66, Apr. 1986.
[11] N. C. Shu, “Visual programming languages: A perspective and a dimensional analysis,” in Visual Languages. Cham, Switzerland: Springer, 1986, pp. 11–34.
[12] N. C. Shu, “Visual programming: Perspectives and approaches,” IBM Syst. J., vol. 38, no. 2, pp. 199–221, 1999.
[13] M. A. Kuhail, S. Farooq, R. Hammad, and M. Bahja, “Characterizing visual programming approaches for end-user developers: A systematic review,” IEEE Access, vol. 9, pp. 14181–14202, 2021.
[14] M. M. Burnett and M. J. Baker, “A classification system for visual programming languages,” J. Vis. Lang. Comput., vol. 5, no. 3, pp. 287–300, Sep. 1994.
[15] J. D. Kiper, E. Howard, and C. Ames, “Criteria for evaluation of visual programming languages,” J. Vis. Lang. Comput., vol. 8, no. 2, pp. 175–192, Apr. 1997.
[16] Z. Dobesova, “Evaluation of orange data mining software and examples for lecturing machine learning tasks in geoinformatics,” in Computer Applications in Engineering Education. Hoboken, NJ, USA: Wiley, 2024.
[17] U. Thange, V. K. Shukla, R. Punhani, and W. Grobbelaar, “Analyzing COVID-19 dataset through data mining tool ‘Orange,’” in Proc. 2nd Int. Conf. Comput., Autom. Knowl. Manage. (ICCAKM), Jan. 2021, pp. 198–203.
[18] A. Abdelmagid and A. Qahmash, “Utilizing the educational data mining techniques,” Inf. Sci. Lett., vol. 12, no. 3, pp. 1415–1431, 2023.
[19] I. Popchev and D. Orozova, “Algorithms for machine learning with orange system,” Int. J. Online Biomed. Eng., vol. 19, no. 4, pp. 109–123, Apr. 2023.
[20] J. Demsar and B. Zupan, “From experimental machine learning to interactive data mining,” in Proc. Knowl. Discovery Databases, 2005, pp. 537–539.
[21] B. Cukic, D. Hague, and M. Lou Maher, “An innovative interdisciplinary undergraduate data science program: Pathways and experience,” in Proc. IEEE Frontiers Educ. Conf. (FIE), Oct. 2020, pp. 1–5.
[22] D. Conway. (2010). The Data Science Venn Diagram. [Online]. Available: http://www.dataists.com/2010/09/the-data-science-venn-diagram
[23] J. W. Tukey, “The future of data analysis,” Ann. Math. Statist., vol. 33, no. 1, pp. 1–67, 1962.
[24] J. M. Chambers, “Greater or lesser statistics: A choice for future research,” Statist. Comput., vol. 3, no. 4, pp. 182–184, Dec. 1993.


[25] W. S. Cleveland, “Data science: An action plan for expanding the technical areas of the field of statistics,” Stat. Anal. Data Mining: ASA Data Sci. J., vol. 7, no. 6, pp. 414–417, Dec. 2014.
[26] S. C. Hicks and R. A. Irizarry, “A guide to teaching data science,” Amer. Statistician, vol. 72, no. 4, pp. 382–391, 2018.
[27] D. Donoho, “50 years of data science,” J. Comput. Graph. Statist., vol. 26, no. 4, pp. 745–766, Oct. 2017.
[28] N. Corte-Real, P. Ruivo, T. Oliveira, and A. Popovic, “Unlocking the drivers of big data analytics value in firms,” J. Bus. Res., vol. 97, pp. 160–173, Apr. 2019.
[29] O. Hazzan and K. Mike, Guide to Teaching Data Science: An Interdisciplinary Approach. Springer, 2023.
[30] C. Kelleher and R. Pausch, “Lowering the barriers to programming: A taxonomy of programming environments and languages for novice programmers,” ACM Comput. Surveys, vol. 37, no. 2, pp. 83–137, 2005.
[31] C. Kelleher and R. Pausch, “Using storytelling to motivate programming,” Commun. ACM, vol. 50, no. 7, pp. 58–64, Jul. 2007.
[32] J. Maloney, M. Resnick, N. Rusk, B. Silverman, and E. Eastmond, “The scratch programming language and environment,” ACM Trans. Comput. Educ., vol. 10, no. 4, pp. 1–15, Nov. 2010.
[33] Scratch Statistics. Accessed: Feb. 7, 2024. [Online]. Available: https://scratch.mit.edu/statistics/
[34] I. F. de Kereki, “Scratch: Applications in computer science 1,” in Proc. 38th Annu. Frontiers Educ. Conf., Oct. 2008, pp. 1–7.
[35] J. Estevez, G. Garate, and M. Graña, “Gentle introduction to artificial intelligence for high-school students using scratch,” IEEE Access, vol. 7, pp. 179027–179036, 2019.
[36] R. J. Brunner and E. J. Kim, “Teaching data science,” Proc. Comput. Sci., vol. 80, pp. 1947–1956, Dec. 2016.
[37] H. Habibzadeh, K. Dinesh, O. Rajabi Shishvan, A. Boggio-Dandry, G. Sharma, and T. Soyata, “A survey of healthcare Internet of Things (HIoT): A clinical perspective,” IEEE Internet Things J., vol. 7, no. 1, pp. 53–71, Jan. 2020.
[38] H. Hu, Y. Wen, T.-S. Chua, and X. Li, “Toward scalable systems for big data analytics: A technology tutorial,” IEEE Access, vol. 2, pp. 652–687, 2014.
[39] A. K. Pandey, A. I. Khan, Y. B. Abushark, Md. M. Alam, A. Agrawal, R. Kumar, and R. A. Khan, “Key issues in healthcare data integrity: Analysis and recommendations,” IEEE Access, vol. 8, pp. 40612–40628, 2020.
[40] B. K. Daniel, “Big data and data science: A critical review of issues for educational research,” Brit. J. Educ. Technol., vol. 50, no. 1, pp. 101–113, Jan. 2019.
[41] X. Qin, Y. Luo, N. Tang, and G. Li, “DeepEye: An automatic big data visualization framework,” Big Data Mining Analytics, vol. 1, no. 1, pp. 75–82, Mar. 2018.
[42] A. Perrot and D. Auber, “Cornac: Tackling huge graph visualization with big data infrastructure,” IEEE Trans. Big Data, vol. 6, no. 1, pp. 80–92, Mar. 2020.
[43] M. Aparicio and C. J. Costa, “Data visualization,” Commun. Design Quart., vol. 3, no. 1, pp. 7–11, Jan. 2015, doi: 10.1145/2721882.2721883.
[44] R. Descartes, The Philosophical Works of Descartes, 2 vols. Dover, 1955.
[45] E. Bisong, “Matplotlib and seaborn,” in Building Machine Learning and Deep Learning Models on Google Cloud Platform. Cham, Switzerland: Springer, 2019, pp. 151–165.
[46] J. Davis and M. Goadrich, “The relationship between precision-recall and ROC curves,” in Proc. 23rd Int. Conf. Mach. Learn., 2006, pp. 233–240.
[47] A. N. Shewalkar, “Comparison of RNN, LSTM and GRU on speech recognition data,” Comput. Sci. Masters Papers, 2018.
[48] Y. Bai, J. Xie, C. Liu, Y. Tao, B. Zeng, and C. Li, “Regression modeling for enterprise electricity consumption: A comparison of recurrent neural network and its variants,” Int. J. Electr. Power Energy Syst., vol. 126, Mar. 2021, Art. no. 106612.
[49] K. H. Lycke, P. Grøttum, and H. I. Strømsø, “Student learning strategies, mental models and learning outcomes in problem-based and traditional curricula in medicine,” Med. Teacher, vol. 28, no. 8, pp. 717–722, Jan. 2006.
[50] M. Khatiban, S. N. Falahan, R. Amini, A. Farahanchi, and A. Soltanian, “Lecture-based versus problem-based learning in ethics education among nursing students,” Nursing Ethics, vol. 26, no. 6, pp. 1753–1764, Sep. 2019.
[51] L. D. Kantar and S. Sailian, “The effect of instruction on learning: Case based versus lecture based,” Teaching Learn. Nursing, vol. 13, no. 4, pp. 207–211, Oct. 2018.
[52] B. Zhao and D. D. Potter, “Comparison of lecture-based learning vs discussion-based learning in undergraduate medical students,” J. Surgical Educ., vol. 73, no. 2, pp. 250–257, Mar. 2016.
[53] W. Hung, D. H. Jonassen, and R. Liu, “Problem-based learning,” Handbook Res. Educ. Commun. Technol., vol. 3, no. 1, pp. 485–506, 2008.
[54] M. A. Albanese and L. C. Dast, “Problem-based learning,” in Understanding Medical Education: Evidence, Theory and Practice. Wiley, 2013, pp. 61–79.
[55] E. de Graaff and A. Kolmos, “Characteristics of problem-based learning,” Int. J. Eng. Educ., vol. 19, pp. 657–662, Jan. 2003.
[56] C. Onyon, “Problem-based learning: A review of the educational and psychological theory,” Clin. Teacher, vol. 9, no. 1, pp. 22–26, Feb. 2012.
[57] D. Kokotsaki, V. Menzies, and A. Wiggins, “Project-based learning: A review of the literature,” Improving Schools, vol. 19, no. 3, pp. 267–277, Nov. 2016.
[58] N. Hosseinzadeh and M. R. Hesamzadeh, “Application of project-based learning (PBL) to the teaching of electrical power systems engineering,” IEEE Trans. Educ., vol. 55, no. 4, pp. 495–501, Nov. 2012.
[59] B. Condliffe, “Project-based learning: A literature review. working paper,” in Proc. MDRC, 2017, pp. 1–11.
[60] M. M. Grant, “Getting a grip on project-based learning: Theory, cases and recommendations,” Meridian, A Middle School Comput. Technol. J., vol. 5, no. 1, p. 83, 2002.
[61] J. S. Krajcik and P. C. Blumenfeld, Project-Based Learning. Cambridge Univ. Press, 2006.
[62] M. Saleh, M. Abbas, and R. B. Le Jeannès, “FallAllD: An open dataset of human falls and activities of daily living for classical and deep learning applications,” IEEE Sensors J., vol. 21, no. 2, pp. 1849–1858, Jan. 2021.
[63] S. C. dos Santos, “PBL-SEE: An authentic assessment model for PBL-based software engineering education,” IEEE Trans. Educ., vol. 60, no. 2, pp. 120–126, May 2017.
[64] P. Abichandani, V. Sivakumar, D. Lobo, C. Iaboni, and P. Shekhar, “Internet-of-Things curriculum, pedagogy, and assessment for STEM education: A review of literature,” IEEE Access, vol. 10, pp. 38351–38369, 2022.
[65] G. V. Helden, V. Van Der Werf, G. N. Saunders-Smits, and M. M. Specht, “The use of digital peer assessment in higher education – An umbrella review of literature,” IEEE Access, vol. 11, pp. 22948–22960, 2023.
[66] H.-P. Yueh, T.-L. Chen, L.-A. Chiu, S.-L. Lee, and A.-B. Wang, “Student evaluation of teaching effectiveness of a nationwide innovative education program on image display technology,” IEEE Trans. Educ., vol. 55, no. 3, pp. 365–369, Aug. 2012.
[67] M. Llamas-Nistal, F. A. Mikic-Fonte, M. Caeiro-Rodríguez, and M. Liz-Domínguez, “Supporting intensive continuous assessment with BeA in a flipped classroom experience,” IEEE Access, vol. 7, pp. 150022–150036, 2019.
[68] J. Moreno and A. F. Pineda, “A framework for automated formative assessment in mathematics courses,” IEEE Access, vol. 8, pp. 30152–30159, 2020.
[69] S. Draft, Computer Science Curricula. New York, NY, USA: ACM, 2013.
[70] M. McCracken, V. Almstrum, D. Diaz, M. Guzdial, D. Hagan, Y. B.-D. Kolikant, C. Laxer, L. Thomas, I. Utting, and T. Wilusz, “A multi-national, multi-institutional study of assessment of programming skills of first-year cs students,” in Working Group Reports From ITiCSE on Innovation and Technology in Computer Science Education. Association for Computing Machinery, 2001, pp. 125–180.
[71] A. Joshi, S. Kale, S. Chandel, and D. Pal, “Likert scale: Explored and explained,” Brit. J. Appl. Sci. Technol., vol. 7, no. 4, pp. 396–403, Jan. 2015.
[72] R. Parhi and R. D. Nowak, “The role of neural network activation functions,” IEEE Signal Process. Lett., vol. 27, pp. 1779–1783, 2020.

HARRY D. MAFUKIDZE received the B.Sc. degree (Hons.) in physics from Midlands State University, Gweru, Zimbabwe, in 2009, and the M.Eng. degree in electronic engineering from the University of Stellenbosch, Stellenbosch, South Africa, in 2014. He is currently with the Department of Applied Physics and Telecommunications, Midlands State University. His research interests include radar signal processing, data science, machine learning, and deep learning and their applications.


ACTION NECHIBVUTE received the B.Sc. degree in physics from Midlands State University, in 2001, the B.Sc. degree in mathematics from the University of Zimbabwe, in 2001, the M.Sc. degree in physics from the University of Botswana, in 2008, and the Ph.D. degree in physics in the area of energy harvesting for wireless sensor devices from Midlands State University, in 2015. He is currently an Academic Researcher with Midlands State University.

ABID YAHYA (Senior Member, IEEE) received the bachelor’s degree in electrical and electronic engineering, with a major in telecommunication, from the University of Engineering and Technology Peshawar, Peshawar, Pakistan, and the M.Sc. and Ph.D. degrees in wireless and mobile systems from Universiti Sains Malaysia. He began his career on an engineering path, which is rare among other researcher executives. He is currently with Botswana International University of Science and Technology. He is also a Professional Engineer certified by Botswana Engineers Registration Board (ERB). He has many research publications in numerous reputable journals, conference articles, and book chapters. He received several awards and grants from various funding agencies and supervised several master’s and Ph.D. candidates. His four recent books, Emerging Technologies in Agriculture, Livestock, and Climate (Springer, 2020), Mobile WiMAX Systems: Performance Analysis of Fractional Frequency Reuse (CRC Press|Taylor & Francis, 2019), Steganography Techniques for Digital Images, and LTE-A Cellular Networks: Multi-hop Relay for Coverage, Capacity and Performance Enhancement (Springer International Publishing, July 2018 and January 2017), are being followed in national and international universities.

IRFAN ANJUM BADRUDDIN received the Graduate degree in mechanical engineering, in 1998, the Master of Technology degree, in 2001, and the Ph.D. degree in heat transfer from Universiti Sains Malaysia, in 2007. He is currently a Professor with the Department of Mechanical Engineering, King Khalid University, Saudi Arabia. He works in interdisciplinary fields. He has more than 300 articles to his credit.

SARFARAZ KAMANGAR received the Ph.D. degree in mechanical engineering. He is currently an Assistant Professor with the Department of Mechanical Engineering, King Khalid University, Saudi Arabia. He has more than 13 years of research and teaching experience at well-known universities. He has published more than 100 articles in international journals and conferences.

MOHAMED HUSSIEN was born in Menofia, Egypt, in 1981. He received the B.Sc. degree in chemistry and the Ph.D. degree in organic chemistry from Menofia University, in 2002 and 2023, respectively. He is currently a Lecturer in organic chemistry with the Department of Chemistry, Faculty of Science, King Khalid University, Abha, Saudi Arabia. He has published many research articles in international journals on heterocyclic chemistry, covering compounds used against pests and as anticancer agents. His research interests include the synthesis and chemical reactivity of phosphorus compounds containing bioactive heterocyclic systems.
