“Exploring big data traits and data quality dimensions for big data analytics application using partial least squares structural equation modelling”

Abstract
The popularity of big data analytics (BDA) has boosted the interest of organisations in exploiting their large-scale data. This technology can become a strategic stimulus for organisations to achieve competitive advantage and sustainable growth. Previous BDA research, however, has focused on introducing more traits, known as the Vs of big data, while ignoring the quality of data when examining the application of BDA. Therefore, this study aims to explore the effect of big data traits and data quality dimensions on BDA application. The study formulated 10 hypotheses covering the relationships among the big data traits, accuracy, believability, completeness, timeliness, ease of operation, and BDA application constructs. A survey was conducted using a questionnaire as the data collection instrument, and the partial least squares structural equation modelling technique was then used to analyse the hypothesised relationships between the constructs. The findings revealed that big data traits significantly affect all data quality dimension constructs and that the ease of operation construct has a significant effect on BDA application. This study contributes to the literature by bringing new insights to the field of BDA and may serve as a guideline for future researchers and practitioners when studying BDA application.

Introduction
Driven by globalisation and increasing market competition, various industries have turned to big data analytics (BDA) for its ability to transform enormous amounts of raw data into decision-making tools [1]. BDA consists of a set of advanced analytical techniques adapted from related fields, such as artificial intelligence, statistics, and mathematics, which are used to identify trends, detect patterns, and unveil hidden knowledge from a huge amount of data [2]. This technology has been applied in different fields, including finance [3], insurance [4], and cyber security [5], to name a few. The emergence of BDA can be linked to the inability of traditional database management tools to handle structured and unstructured data simultaneously [6]. Structured data refers to data that have a schema, metadata, rules, and constraints to follow, whilst unstructured data have no structure at all or an unknown structure [7]. These types of data are collected or received from diverse platforms, such as network sensors, social media, and the Internet of Things. Although it is vital to exploit structured and unstructured data for BDA, they are usually incomplete, inaccurate, inconsistent, and vague or ambiguous, which could lead to false decisions [8–11]. Salih et al. [12] and Wamba et al. [13] have highlighted the lack of data quality mechanisms applied in BDA prior to data usage. Several studies have considered the potential of data quality for BDA application [14–18], yet specific questions about what drives the dimensions of data quality remain unanswered. Nevertheless, studies on data quality and BDA are still underway and have not reached a good level of maturity [7]. Thus, there is an urgent need for in-depth study of data quality to determine the most important dimensions for BDA application.

Several theories or models for understanding data quality problems have been suggested, such as resource-based theory (RBT), organisational learning theory (OLT), firm performance (FPER), and the data quality framework (DQF). However, these theories or models do not fit BDA application since they concentrate primarily on service quality as opposed to data quality [19]. Moreover, most studies related to BDA focus on the perspective held at the organisational or firm level [8, 10, 20, 21], and studies focusing on the individual perspective are lacking. Since academics are encouraged to participate in research on pedagogical support for teaching about BDA [22], this study determined that university students can represent perspectives at the individual level. Students were chosen because it is crucial to prepare and expose them to BDA, especially in a mandatory setting [23]. Meanwhile, numerous traits have been studied to explain the characteristics of big data, such as 3Vs [24], 4Vs [25], 5Vs [26, 27], 7Vs [28], 9Vs [29], 10Vs [30], 10Bigs [31], and 17Vs [32]. These attempts to assign the maximum number of characteristics to big data show the lack of uniform consensus regarding the core of big data characteristics [33]. Although big data characteristics and data quality are viewed as distinct domains, several studies have found that these two domains are interconnected and closely related [9, 14, 17]. A better understanding of the core characteristics of big data and the dimensions of data quality is therefore needed. Hence, this study seeks to expand the knowledge on big data characteristics, hereafter known as big data traits (BDT), and data quality dimensions (DQD), as well as to explore how they could affect the application of BDA.
Literature Survey

Big data and analytics are two different fields that are widely used to exploit the exponential growth of data in recent years. The term ‘big data’ represents a large volume of data, while the term ‘analytics’ indicates the application of mathematical and statistical tools to a collection of data [34]. These two terms have been merged into ‘big data analytics’ to represent various advanced digital techniques that are formulated to identify hidden patterns of information within gigantic data sets [35, 36]. Scholars have suggested varying definitions for BDA. For instance, Verma et al. [23] defined BDA as a suite of data management and analytical techniques for handling complex data sets, which in turn leads to a better understanding of the underlying process. Faroukhi et al. [37] defined BDA as a process of analysing raw data in order to obtain information that is understandable to humans but hard to observe using direct analysis. Davenport [38] simply defined BDA as a “focus on very large, unstructured and fast moving data”. Nowadays, BDA application has helped numerous organisations improve their performance because it can handle problems instantly and assist organisations in making better and smarter decisions [35, 39].

The advantages of BDA application for organisational performance have been proven by numerous studies. For instance, Mikalef et al. [20] found four alternative solutions surrounding BDA that can lead to higher performance, whereby different combinations of BDA resources play a greater or lesser role in organisational performance. Similarly, Wamba et al. [40] applied RBT and sociomaterialism theory to examine organisational performance. Their empirical work showed that hierarchical BDA has both direct and indirect impacts on organisational performance. Based on this same set of views, Wamba et al. [13] highlighted the importance of capturing the quality dimensions of BDA. Their findings proved the existence of a significant relationship between the quality of data in BDA and organisational performance.

Some scholars perceive data quality as equivalent to information quality [41–44]. Data quality generally refers to the degree to which the data are fit for use [45]. Meanwhile, the concept of information quality is defined as how well the information supports the task [46]. Haryadi et al. [14] asserted that data quality is focused on data that have not been analysed, while information quality is focused on the analysis that has been done on the data. This study, however, opines that data quality should focus on the wellness and appropriateness of data, either before or after it has been analysed, in which it should meet the requirements of organisations [12]. The notion of quality represents a multidimensional construct, whereby it is essential to combine its dimensions and express them in a solid structure [46]. Initially, Wang and Strong [45] used factor analysis to identify DQD and found 179 dimensions that were eventually reduced to 20. Then, they organised these dimensions into four primary categories, namely intrinsic, contextual, representational, and accessibility. The intrinsic category denotes datasets that have quality in their own right, while the contextual category highlights the requirement that data quality must be considered within the context of the task. The representational category describes data quality in relation to the presentation of the data, and the accessibility category emphasises the importance of computer systems that provide access to data [18]. Each category has several dimensions that are used as specific data quality measurements. For instance, accuracy and objectivity are dimensions in the intrinsic category, while relevance and timeliness are dimensions in the contextual category. Interpretability and understandability are dimensions in the representational category, and access security and ease of operation are dimensions in the accessibility category. Table 1 presents all DQD according to their categories. Various studies have been conducted to analyse the relationships between DQD and BDA application. For instance, Côrte-Real et al. [8] analysed the direct and indirect effects of DQD on BDA capabilities in a multi-regional survey (European and American firms).

Abbreviations

BDA: Big data analytics; DQD: Data quality dimensions; BDT: Big data traits; PLS-SEM:
Partial least squares structural equation modelling
Software requirement specification

Requirements gathering is one of the most important phases of a software development life cycle. It is the phase that tells us what the system is supposed to do, and it drives the other phases in the life cycle. Requirements gathering for the SOS app started with brainstorming and discussion with other students about which features are the most essential in a panic situation. This led to the most basic, initial draft of requirements for the application. Requirements were also collected by looking at other devices, such as personal locator beacons and satellite messengers, that are commercially available. A brief study of the functionality of these devices helped me to refine and narrow down the requirements even further. One important lesson from these devices was the simplicity of their design, which helped me to create an effective and simple UI design for my application. The next step in understanding the requirements was to look at existing solutions and similar applications in the Android market. These applications were studied carefully, other important features were added, and unnecessary features were removed.

Conceptual model and hypotheses

Based on the literature review in the previous section, this study proposes a new model for BDA application based on the integration of BDT and DQD, as depicted in Fig. 1. It should be noted that different applications can have different requirements, as not all dimensions and constructs are always applicable. Nevertheless, most studies on BDA application focus on the organisation or firm level and not on the individual level. Hence, this study is based on an individual’s perception.

Big data traits

According to Sun [31], various Vs are used to define BDT, while conventional data quality is defined by a number of DQD [17]. Hence, this study considered BDT as a single construct because the different Vs overlap with the DQD. DQD categories that are generally accepted and frequently used in the application of BDA were also included in this study, namely the intrinsic, contextual, and accessibility categories. The intrinsic category was chosen because of the importance of data correctness in BDA application; it comprises two constructs, namely accuracy and believability. Meanwhile, the contextual category was chosen because the application of BDA commonly depends on the context in which the data are used. This study considered two constructs in the contextual category, namely completeness and timeliness. Finally, the accessibility category was chosen because the computer system needs to facilitate the accessing and storing of data in BDA application. Thus, ease of operation is considered as a construct in the accessibility category for this study. The significant influence of BDT on the constructs of DQD, namely accuracy, believability, completeness, timeliness, and ease of operation, was explored through the following hypotheses:

H1: Big data traits have a significant influence on accuracy.

H2: Big data traits have a significant influence on believability.

H3: Big data traits have a significant influence on completeness.

H4: Big data traits have a significant influence on timeliness.

H5: Big data traits have a significant influence on ease of operation.

Accuracy

Accuracy means that the data must depict facts accurately and must come from a valid source [45, 51]. The effective use of BDA relies on the accuracy of data, which is necessary to produce reliable information [8]. As higher data accuracy may facilitate the routines and activities of BDA, this study proposes that accuracy be included as an enabler of BDA application. Hence, this study proposes the following hypothesis:

H6: Accuracy has a significant influence on big data analytics application.

Believability

Believability represents the degree to which the data are considered valid and reliable [44]. There are concerns regarding the credibility of BDA findings due to insufficient insight into the trustworthiness of the data source [52]. The believability of data sources might be difficult to gauge, as people may alter facts or even publish false information. Therefore, data sources need to be treated as believable in BDA application. Hence, this study proposes the following hypothesis:

H7: Believability has a significant influence on big data analytics application.

Completeness

Completeness refers to the degree to which there is no lack of data and the data are largely appropriate for the task at hand [45]. It also refers to the validity of the values of all components in the data [53]. As big data sources are rather large and the architectures are complicated, the completeness of data is crucial to avoid errors and inconsistencies in the outcome of BDA application. Hence, this study proposes the following hypothesis:

H8: Completeness has a significant influence on big data analytics application.

Timeliness

Timeliness refers to the degree to which data from the appropriate point in time reflect the truth [50]. Timeliness is identified as one of the most significant dimensions of data quality, since making decisions based on outdated data will ultimately lead to incorrect insights [54]. Additionally, the more rapidly data are generated and processed, the more timely the data will be for use in BDA application [17]. Hence, this study proposes the following hypothesis:

H9: Timeliness has a significant influence on big data analytics application.

Ease of operation

Ease of operation refers to the degree to which data can be easily merged, changed, updated, downloaded or uploaded, aggregated, reproduced, integrated, customised, and manipulated, as well as used for multiple purposes. Users will undeniably face challenges and complexity in utilising BDA, depending on the technical approaches used for handling this technology. If the BDA application is relatively easy to operate, the user will be willing to use it in the long term. Hence, this study proposes the following hypothesis:

H10: Ease of operation has a significant influence on big data analytics application.

Research methodology

The methodological procedures in this study were conducted in two phases, namely, research instrument development, and data collection and analysis. Figure 2 shows the methodology in sequence.

Research instrument

This study used a survey questionnaire with two sections to explore the hypothesised relationships in the proposed conceptual model. The first section included questions related to the respondents’ profiles, such as gender, year of study, and area of study, while the second section contained measurements of the constructs with 28 indicators. These constructs were BDT, accuracy, believability, completeness, timeliness, ease of operation, and BDA application. The indicators to measure BDT (velocity, veracity, value, and variability) were self-developed based on the definitions proposed by Arockia et al. [32]. The accuracy, believability, completeness, timeliness, and ease of operation constructs, each with four indicators, were adapted from [8, 47], and [48]. BDA application, with four indicators, was adapted from [23] and [55]. All indicators were measured using a 7-point Likert scale, ranging from 1 (strongly disagree) to 7 (strongly agree). The questionnaire was pretested among academics, and several items were reworded to improve the clarity of the questions. The questionnaire was then used in a pilot test to confirm the reliability of all shortlisted constructs. This test involved 30 respondents, and the Cronbach’s alpha values for all seven constructs on the reliability scale were found to be appropriate and acceptable.
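As a rough illustration of the pilot-test reliability check, the sketch below computes Cronbach’s alpha for a single four-indicator construct. The indicator codes and the simulated 7-point responses are hypothetical; the paper only reports that all seven constructs reached acceptable alpha values with 30 pilot respondents.

```python
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Simulated stand-in for 30 pilot responses on a 7-point Likert scale.
rng = np.random.default_rng(0)
pilot = pd.DataFrame(rng.integers(1, 8, size=(30, 4)),
                     columns=["AC1", "AC2", "AC3", "AC4"])  # hypothetical indicator codes
print(f"Cronbach's alpha: {cronbach_alpha(pilot):.3f}")
```

With real survey data, the same function would simply be applied to each construct’s indicator columns in turn.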

Data collection and analysis


This study used random sampling to select respondents who have knowledge of BDA. As a preliminary study, 200 survey invitations were sent to Computer Science students at the National Defence University of Malaysia. These students were chosen because of the knowledge they had gained during the Big Data Analytics or Data Mining course that they had previously attended. Data were collected through a web survey, which was conducted from July to August 2020. A total of 108 complete responses were received, resulting in a 54% response rate. There were 84 male (77.78%) and 24 female (22.22%) respondents involved in this study. Most of the respondents were in their second year of study (52.78%) and in the area of artificial intelligence (50.93%). The key profiles of these respondents are shown in Table 2. Subsequently, partial least squares structural equation modelling (PLS-SEM) was applied to analyse the survey-based cross-sectional data, since this technique is able to explain the variance in key target constructs [56]. This technique amalgamates the concepts of factor analysis and multiple regression in order to validate the measurement instruments and test the research hypotheses. Since PLS-SEM is a modest and practical technique for creating rigour in complex modelling [57], this study also utilised it for analysing and validating the complex hypothesised relationships of the proposed model.
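To make the “factor analysis plus multiple regression” idea concrete, the sketch below condenses hypothetical indicator columns into construct scores and then regresses one construct on another. This is only a simplified proxy under assumed column names and simulated data, not the iterative weighting scheme used by SmartPLS.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Simulated stand-in for the 108 usable responses (indicator names hypothetical).
survey = pd.DataFrame(rng.integers(1, 8, size=(108, 7)),
                      columns=["EO1", "EO2", "EO3", "BA1", "BA2", "BA3", "BA4"])

def construct_score(df: pd.DataFrame, indicators: list[str]) -> pd.Series:
    # Equal-weight proxy: standardise each indicator, then average them.
    z = (df[indicators] - df[indicators].mean()) / df[indicators].std(ddof=1)
    return z.mean(axis=1)

ease_of_operation = construct_score(survey, ["EO1", "EO2", "EO3"])
bda_application = construct_score(survey, ["BA1", "BA2", "BA3", "BA4"])

# "Multiple regression" half of the idea: regress the target construct on a predictor.
X = np.column_stack([np.ones(len(survey)), ease_of_operation])
beta, *_ = np.linalg.lstsq(X, bda_application.to_numpy(), rcond=None)
print(f"path-coefficient analogue: {beta[1]:.3f}")
```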

Measurement model

A measurement model was used to assess the reliability and validity of the constructs. The standard steps for assessing a measurement model are convergent validity and discriminant validity. Convergent validity was analysed by calculating the factor loadings of the indicators, composite reliability (CR), and average variance extracted (AVE) [58]. The convergent validity results in Table 3 show that the factor loadings for all indicators are higher than 0.708, as suggested by Hair et al. [59], after the elimination of three indicators (AC3, BE2, and EO4) from the original 28 indicators. Meanwhile, the CR values ranged from 0.822 to 0.917, which exceeded the suggested threshold of 0.7 [59]. An adequate AVE is 0.50 or greater, meaning that at least 50% of the variance of a construct can be explained by its indicators [56]. As shown in Table 3, all AVE values range from 0.536 to 0.786, indicating that the convergent validity of the measurement model is achieved. Once convergent validity had been established, discriminant validity was examined using the Fornell-Larcker criterion, in which the square root of the AVE should be greater than the correlations between constructs [60]. Table 4 demonstrates that the square roots of the AVEs are in all cases greater than the off-diagonal elements in their corresponding row and column. Therefore, discriminant validity has been achieved.
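The reliability and validity figures above follow standard formulas, illustrated in the sketch below with made-up loadings and a made-up two-construct correlation matrix (the actual values are those reported in Tables 3 and 4).

```python
import numpy as np

# Hypothetical standardised outer loadings for one construct.
loadings = np.array([0.81, 0.77, 0.85, 0.72])

# AVE = mean of squared loadings; adequate when >= 0.50.
ave = np.mean(loadings ** 2)

# CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
# with each indicator's error variance taken as 1 - loading^2; adequate when > 0.70.
cr = loadings.sum() ** 2 / (loadings.sum() ** 2 + np.sum(1 - loadings ** 2))
print(f"AVE = {ave:.3f}, CR = {cr:.3f}")

# Fornell-Larcker criterion: sqrt(AVE) of each construct must exceed its
# correlations with the other constructs (two constructs shown for illustration).
sqrt_ave = np.sqrt(np.array([ave, 0.58]))
correlations = np.array([[1.00, 0.41],
                         [0.41, 1.00]])
off_diagonal_max = np.abs(correlations - np.eye(2)).max(axis=1)
print("Discriminant validity:", bool(np.all(sqrt_ave > off_diagonal_max)))
```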
Structural model

A structural model was used to examine the magnitude of the relationships among the constructs. The goodness of fit of the structural model can be assessed by examining the R² measure (the coefficient of determination) and the significance level of the path coefficients (β values) [56]. The results of the research model were satisfactory, with the R² value for BDA application at 0.444, which suggests that 44.4% of the variance in BDA application can be explained by the DQD. Furthermore, the R² values for the accuracy (42.6%), believability (45.5%), and completeness (33.2%) constructs were also satisfactory, while timeliness (26.8%) and ease of operation (31.5%) were only moderately explained by BDT. Figure 3 illustrates the R² values obtained from the SmartPLS 3.2 software. The path coefficients of the structural model were calculated using bootstrap analysis (resampling = 5000) to assess their statistical significance. Table 5 shows the results of the path coefficients and their level of significance. An analysis of the t-values and p-values showed that six hypotheses were significant, namely, H1, H2, H3, H4, H5, and H10. Overall, H1 to H5 concerned the influence of BDT on the DQD, whereas only one DQD hypothesis, H10, was identified as significant in evaluating the influence of ease of operation on BDA application. The results also showed that accuracy, believability, completeness, and timeliness had no significant effect on BDA application. Thus, H6, H7, H8, and H9 were rejected.
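As a rough sketch of how bootstrapping judges path significance, the code below resamples hypothetical construct scores with replacement, re-estimates a simple path each time, and forms a t-value from the bootstrap standard error. The actual estimates come from SmartPLS; this is only a simplified stand-in with simulated scores.

```python
import numpy as np

def path_estimate(x: np.ndarray, y: np.ndarray) -> float:
    # Standardised simple-regression slope as a stand-in for a PLS path coefficient.
    return np.corrcoef(x, y)[0, 1]

rng = np.random.default_rng(2)
x = rng.normal(size=108)                 # e.g. hypothetical ease-of-operation scores
y = 0.5 * x + rng.normal(size=108)       # e.g. hypothetical BDA-application scores

estimate = path_estimate(x, y)
n_resamples = 5000                       # same resampling count as reported above
boot = np.empty(n_resamples)
for b in range(n_resamples):
    idx = rng.integers(0, x.size, x.size)
    boot[b] = path_estimate(x[idx], y[idx])

t_value = estimate / boot.std(ddof=1)    # |t| > 1.96 roughly corresponds to p < 0.05 (two-tailed)
print(f"path = {estimate:.3f}, t = {t_value:.2f}")
```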
Software Engineering Model Used

Waterfall Model

The waterfall model is the earliest SDLC approach used for software development. It is also referred to as a linear-sequential life cycle model. It is very simple to understand and use. In a waterfall model, each phase must be completed before the next phase can begin, and the phases do not overlap.
Following is a diagrammatic representation of the distinct phases of the waterfall model.

Waterfall Model

In the waterfall approach, the full process of software development is divided into separate phases. In the waterfall model, the outcome of one phase typically acts as the input for the next phase, sequentially. The sequential phases in the waterfall model are:

• Requirement gathering and analysis

All possible requirements of the system to be developed are captured in this phase and documented in a requirement specification document.

• System design

The requirement specifications from the first phase are studied in this phase and the system design is prepared. The design helps in specifying hardware and system requirements and also helps in defining the overall system architecture.

• Implementation

With inputs from the system design, the system is first developed in small programs called units, which are integrated in the next phase. Each unit is developed and tested for its functionality, which is referred to as unit testing.

• Integration and testing

All the units developed in the implementation phase are integrated into a system after the testing of each unit. Post integration, the entire system is tested for any faults and failures.

Feasibility Study:

A feasibility study is an analysis used to measure the ability and likelihood of completing a project successfully, including all relevant factors.

Depending on the result of the initial investigation, the survey is expanded into a more detailed feasibility study. A feasibility study is a test of a system proposal in terms of its workability, its impact on the application, and the effective use of resources.

Operational Feasibility:

Multiple operations are performed in the application so that it is more effective and more reliable.

Every system is evaluated based on its performance. Many tools are available to develop a system, but each system is chosen based on its performance and the operations it supports. Users always want their system to be more effective and more responsive, so, considering all the above scenarios, the best tools were chosen to develop the system effectively.
Economic Feasibility:

The main aim of this system is to be cost effective. The cost to develop this application is low, because every customer wants software developed at a lower cost. It saves evaluation time by automating some of the manual tasks of a paper evaluator. It is easy to use for all new users.

Motivational feasibility:

Motivational feasibility concerns providing an effective user interface and keeping the developer motivated. The system has many stakeholders, who help shape the flow of the application according to their needs. The stakeholders provide the details to the developer in an easily understandable way.

Schedule feasibility

This application produces its output in very little time. The application is compatible with all devices and has a rich user interface, which makes it easy to operate and quick to execute.
Discussion

The present study has explored the BDT and DQD constructs for BDA application. The findings showed that the accessibility category of DQD (ease of operation) can significantly influence BDA application. This result shows that the ease of obtaining data plays an important role in providing users with effective access to reduce the digital divide in BDA application endeavours. This result is corroborated by the findings of Zhang et al. [61], who considered that ease of functional properties would ensure the quality of BDA application. Janssen et al. [9] similarly proposed that the easier it is to operate BDA, the more application systems would be integrated and sufficient for handling this technology. Akter et al. [57] found a significant influence of DQD (completeness, accuracy, format, and currency) on BDA application. On the other hand, the results of this study showed that accuracy, believability, completeness, and timeliness have no significant influence on the decision to apply BDA. These results were unexpected. These outcomes could be because the respondents were novice users, who assumed the availability of technical teams to solve any accuracy, believability, completeness, and timeliness problems in BDA application. Meanwhile, the four indicators of BDT (velocity, veracity, value, and variability) showed a significantly high impact on all constructs of DQD (accuracy, believability, completeness, timeliness, and ease of operation). These findings are in agreement with the results obtained by Wahyudi et al. [17], whereby a high correlation was found between BDT, and timeliness and ease of operation. The significant influence of BDT on DQD showed interesting results, which demonstrated how users recognise the importance of BDT for assessing data quality. This observation is in agreement with Taleb et al. [62], who claimed that BDT could enforce quality evaluation management to achieve quality improvements. The findings also showed that while many researchers have proposed numerous BDT, in this context, velocity, veracity, value, and variability are the most critical for assessing data quality in BDA application.
SYSTEM DESIGN

Introduction

The Software Design Document provides documentation that will be used to aid software development by giving the details of how the software should be built. The Software Design Document contains narrative and graphical documentation of the software design for the project, including use case models, sequence diagrams, collaboration models, object behaviour models, and other supporting requirement information.

Purpose

The purpose of the Software Design Specification (SDS) document is to specify a high-level view of the architecture of our system and of the interaction between the user and the system, and to detail a low-level view of each component of the software and how the components interact with each other.

This document's purpose is to provide a high-level design framework around which to build our project. It also provides a list of requirements against which to test the final project and determine whether we were able to successfully implement the system according to the design.

Scope

The System Design (SD) describes how the functional and non-functional requirements gathered in the requirements phase, together with the preliminary user-oriented functional design, are transformed into more technical system specifications from which the system is built. This phase describes the design goals and considerations, provides a high-level overview of the system architecture, and describes the data design associated with the human-machine interface and operational scenarios.

Overview

The system design documents and tracks the necessary information required to effectively define the architecture and design of the application system, in order to give the development team guidance on the architecture of the system to be developed. Design documents are incrementally and iteratively produced during the system development life cycle, based on the particular circumstances of the SMS App.

System Architecture

Architecture focuses on looking at a system as a combination of many different components and how they interact with each other to produce the desired result. It involves the process of defining a collection of hardware and software components and their interfaces to establish the framework for the development of an Android application.

Big Data Overview


Detailed Design of DFD

When it comes to conveying how information flows through systems (and how that data is transformed in the process), data flow diagrams (DFDs) are the method of choice over technical descriptions for three principal reasons.

• DFDs are easier to understand by technical and nontechnical audiences.

• DFDs can provide a high-level system overview, complete with boundaries and connections to other systems.

• DFDs can provide a detailed representation of system components.

DFDs help system designers and others during initial analysis stages visualize a
current system or one that may be necessary to meet new requirements.
Systems analysts prefer working with DFDs, particularly when they require a
clear understanding of the boundary between existing systems and postulated
systems. DFDs represent the following:

• External devices sending and receiving data.

• Processes that change that data.


Data Flow Diagram
Sequence Diagram

A sequence diagram shows object interactions arranged in a time sequence. It depicts the objects and classes involved in the scenario and the sequence of messages exchanged between the objects needed to carry out the functionality of the scenario.

Sequence diagrams are typically associated with use case realisations in the logical view of the system under development. Sequence diagrams are sometimes called event diagrams or event scenarios.
Implementation Approaches

In this stage, the physical system specifications are converted into a working and reliable solution. This is where the system is developed. It involves careful planning, investigation of the current system and its constraints on implementation, design of methods to achieve the changeover, and an evaluation of changeover methods.

The objective of the system implementation phase is to implement a fully functional system, with deliverables met in order of priority. The result of this phase is an accurately functioning system. The more complex the system being implemented, the more involved the system analysis and design effort required for implementation.

The implementation phase deals with issues of quality, performance, baselines, libraries, and debugging. The end deliverable is the product itself. During the implementation phase, the system is built according to the specifications from the previous phases. This includes writing code, performing code reviews, performing tests, selecting components for integration, configuration, and integration.
Screenshot
Code

Testing
Introduction

Web applications run on devices with limited memory, CPU power, and power supply. The behaviour of the application also depends on external factors such as connectivity, general system utilisation, etc.
Therefore, it is very important to debug, test, and optimise the web application. Having reasonable test coverage for the web application helps to enhance and maintain it.

As it is not possible to test bootstrap web applications on all possible device configurations, it is common practice to run them on typical device configurations. The application should be tested on at least one device with the lowest possible configuration. In addition, it should be tested on one device with the highest available configuration, e.g., pixel density and screen resolution, to ensure that it works well on these devices.

Testing Concepts

Web application testing is based on units. In general, a unit test is a method whose statements test a part of the application. Test methods are organised into classes called test cases, and test cases are grouped into test suites.
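The sketch below illustrates that structure generically with Python's unittest module: test methods inside a test-case class, collected into a suite. It is only an illustration of the concept; the project's actual tests would live in its own Java/NetBeans environment, and the checked helper function is a made-up example.

```python
import unittest

def response_rate(received: int, sent: int) -> float:
    # Made-up helper used only to give the tests something to exercise.
    return received / sent * 100

class ResponseRateTestCase(unittest.TestCase):
    """A test case: a class grouping related test methods."""

    def test_complete_responses(self):
        self.assertAlmostEqual(response_rate(108, 200), 54.0)

    def test_no_invitations_raises(self):
        with self.assertRaises(ZeroDivisionError):
            response_rate(10, 0)

def suite() -> unittest.TestSuite:
    # A test suite: a collection of test cases run together.
    return unittest.TestLoader().loadTestsFromTestCase(ResponseRateTestCase)

if __name__ == "__main__":
    unittest.TextTestRunner(verbosity=2).run(suite())
```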

Unit tests
Local Unit Tests

Unit tests that run on the local machine only. These tests are compiled to run locally in NetBeans to minimise execution time. Use this approach to run unit tests that have no dependencies on the web framework, or that have dependencies which mock objects can satisfy.

Instrumented unit tests

Unit tests that run on a device. These tests have access to instrumentation information, such as the Context of the application under test. Use this approach to run unit tests that have web application dependencies which mock objects cannot easily satisfy.

Integration Tests
This type of test verifies that the target app behaves as expected when a user performs a specific action or enters a specific input in its activities. For example, it allows checking that the target app returns the correct UI output in response to user interactions in the app's activities. UI testing frameworks such as Espresso allow programmatically simulating user actions and testing complex intra-app user interactions.

Cross-app Component

This type of test verifies the correct behaviour of interactions between different user apps, or between user apps and system apps. For example, one might want to test that the app behaves correctly when the user performs an action in the Settings menu. UI testing frameworks that support cross-app interactions, such as UI Automator, allow creating tests for such scenarios.
Test Cases:
A test case is a set of conditions or variables under which a tester will determine whether a system under test satisfies requirements or works correctly. The process of developing test cases can also help find problems in the requirements or design of an application.
The following table shows the various test case scenarios that were generated, along with the required inputs for the given scenarios, the expected outputs, the actual output, and whether the test passes or fails.

Test cases with positive scenarios:

TC No  Positive scenario                Actual output  Test result
1      Identify constructs of interest  As expected    Pass
2      Develop questionnaires           As expected    Pass
3      Pilot testing                    As expected    Pass
4      Data collection                  As expected    Pass
5      PLS-SEM analysis                 As expected    Pass
6      Model validation                 As expected    Pass

Conclusion

This study has proposed practical implications based on perspectives at the individual level. Individual perspectives are imperative since resistance to using technology commonly originates from this level of users. Hence, the results of this study may be beneficial for organisations that have not yet agreed to implement BDA. They could use the results to get a sense of the possibilities of embracing this technology. This study has also shown theoretical implications based on the incorporation of BDT as a single construct and DQD as an underpinning theory for the development of a new BDA application model. This study is the first to investigate the influence of BDT and DQD on BDA application by individual-level users.

Several limitations apply to the interpretation of the results in this study. First, the intrinsic and contextual data quality categories are inadequate to specify the DQD included in the proposed model. Future studies may include other DQD, such as objectivity and reputation, to represent the intrinsic category. Meanwhile, value-added, relevancy, and appropriate amount of data can be used for measuring the contextual category. Second, the chosen undergraduate students who have knowledge of BDA were insufficient to generalise individual-level perceptions towards BDA application. Hence, future studies could include more experienced respondents, such as lecturers or practitioners. Third, although the sample size was statistically sufficient, a larger sample may be useful to reinforce the results of this study. Finally, although this study has attempted to bridge the gaps between BDT and DQD, future studies are encouraged to explore other constructs for a better understanding of BDA application. For instance, future studies could explore the role of security and privacy concerns in BDA application, since data protection is becoming more crucial due to recent big open data initiatives. Therefore, a novel BDA application model that can address security and privacy concerns may be worth exploring. Overall, the findings of this study have contributed to the body of knowledge in the BDA area and offered greater insights for BDA application initiators.

References

1. Seyedan M, Mafakheri F. Predictive big data analytics for supply chain demand forecasting: methods, applications, and research opportunities. J Big Data. 2020. https://doi.org/10.1186/s40537-020-00329-2.

2. Saggi MK, Jain S. A survey towards an integration of big data analytics to big insights for value-creation. Inf Process Manag. 2018;54(5):758–90. https://doi.org/10.1016/j.ipm.2018.01.010.

3. Hasan MM, Popp J, Oláh J. Current landscape and influence of big data on finance. J Big Data. 2020. https://doi.org/10.1186/s40537-020-00291-z.

4. Arumugam S, Bhargavi R. A survey on driving behavior analysis in usage based insurance using big data. J Big Data. 2019. https://doi.org/10.1186/s40537-019-0249-5.

5. Obitade PO. Big data analytics: a link between knowledge management capabilities and superior cyber protection. J Big Data. 2019. https://doi.org/10.1186/s40537-019-0229-9.

6. Boukhalfa A, Hmina N, Chaoui H. Survey on security monitoring and intrusion detection in the big data environment. Int J Adv Trends Comput Sci Eng. 2020;9(4):6175–9.

7. Taleb I, Serhani MA, Dssouli R. Big data quality assessment model for unstructured data. Proc Int Conf 2018 13th Innov Inf Technol IIT. 2018;2019:69–74.

8. Côrte-Real N, Ruivo P, Oliveira T. Leveraging internet of things and big data analytics initiatives in European and American firms: Is data quality a way to extract business value? Inf Manag. 2020;57(1):103141. https://doi.org/10.1016/j.im.2019.01.003.

9. Janssen M, van der Voort H, Wahyudi A. Factors influencing big data decision-making quality. J Bus Res. 2017;70:338–45. https://doi.org/10.1016/j.jbusres.2016.08.007.

10. Ghasemaghaei M. Are firms ready to use big data analytics to create value? The role of structural and psychological readiness. Enterp Inf Syst. 2019;13(5):650–74. https://doi.org/10.1080/17517575.2019.1576228.

11. Hariri RH, Fredericks EM, Bowers KM. Uncertainty in big data analytics: survey, opportunities, and challenges. J Big Data. 2019. https://doi.org/10.1186/s40537-019-0206-3.

12. Salih FI, Ismail SA, Hamed MM, Mohd Yusop O, Azmi A, Mohd Azmi NF. Data quality issues in big data: a review. Adv Intell Syst Comput. 2019;843:105–16.

13. Fosso Wamba S, Akter S, de Bourmont M. Quality dominant logic in big data analytics and firm performance. Bus Process Manag J. 2019;25(3):512–32.

14. Haryadi AF, Hulstijn J, Wahyudi A, Van Der Voort H, Janssen M. Antecedents of big data quality: an empirical examination in financial service organizations. Proc 2016 IEEE Int Conf Big Data. 2016;2016:116–21.

15. Janssen M, Konopnicki D, Snowdon JL, Ojo A. Driving public sector innovation using big and open linked data (BOLD). Inf Syst Front. 2017;19(2):189–95.

16. Merino J, Caballero I, Rivas B, Serrano M, Piattini M. A data quality in use model for big data. Futur Gener Comput Syst. 2016;63:123–30.

17. Wahyudi A, Farhani A, Janssen M. Relating big data and data quality in financial service organizations. Lect Notes Comput Sci. 2018. https://doi.org/10.1007/978-3-030-02131-3_45.

18. Wahyudi A, Kuk G, Janssen M. A process pattern model for tackling and improving big data quality. Inf Syst Front. 2018;20(3):457–69.

19. Panahy PHS, Sidi F, Affendey LS, Jabar MA, Ibrahim H, Mustapha A. A framework to construct data quality dimensions relationships. Indian J Sci Technol. 2013;6(5):4422–31.
