
STUDY UNIT 3

Utilising databases
You have read extensively from various perspectives about data in AIN1501 and the
previous study units. According to Taniar and Rahayu (2022), data is the largest commodity
nowadays. The Economist (https://www.economist.com/leaders/2017/05/06/the-worlds-
most-valuable-resource-is-no-longer-oil-but-data), on 6 May 2017, published an article, “The
world’s most valuable resource is no longer oil, but data”, highlighting that data is the most
valuable resource. Subsequently, the term “data is the new oil” was coined. The article
mentioned the giants that deal with data, such as Google, Amazon, Apple, Facebook and
Microsoft. Oil is useful only if it is used to fuel an engine, whether a vehicle engine, an
aircraft engine or a manufacturing engine. The engine is designed to convert one form of
energy into mechanical energy; in other words, the engine burns the fuel to produce
mechanical power. If data is the new oil, then it should be fuel for data engines, as it will
only be useful if it fuels a data engine. A data engine, like any other engine, produces the
power that enables the information system to move and operate. As organisations’ needs
for efficient data and information storage and retrieval increased over the years, the use of
databases to manage data and information also increased. Databases are widely used in
business nowadays, and their utilisation can differ between organisations. To make the
analysis of massive data sets efficient, the data needs to be organised in a way that
ensures efficient storage and access, using different data management technologies: file
management systems, database management systems (hierarchical, network-based and
relational), data warehouses and SQL (UNISA 2022).

In this study unit

3.1 Introduction

Organisations today know that information technology is essential not only for daily
operations but also for gaining strategic advantage in the marketplace. The importance of
information technology means that information security has also become important.
Breaches in information security can result in litigation, financial losses, damage to brands,
loss of customer confidence, loss of business partner confidence, and can even cause the
organisation to go out of business. As we consider all these issues as a whole, we see how
critically important it is for information security professionals to have strong business
management and organisational skills. Information security professionals must also
communicate with entire user communities to raise their awareness of information security
issues through training and education, thereby promoting a culture attuned to information
security. They must also work with business managers and the user community during risk
assessment (UNISA 2022). Anwar, Panjaitan and Supriati (2021) mention that since the
advent of the digital revolution, information systems have been defined and put into
practice. Thus, Database Administrators (DBAs) must be better aware of the procedures
used to preserve business data, as well as the standards and regulations that may be
applied to the data. In this regard, many businesses want the analytical process to take as
little time as feasible. Therefore, it is critical for businesses to have the ability to execute
analysis and report production from information systems in an effective, efficient, and
integrated manner.

In the previous study unit, we learnt about the database environment, its components
and the terminology used in a relational database. In this study unit, we will briefly discuss
the utilisation of databases in accounting and auditing. We will also look at the use of
databases in data warehouses, data marts and data mining and online analytical
processing.

The learning outcomes of this study unit are as follows:

• Describe the importance of database management systems for accountants and auditors.
• Distinguish the key differences between OLAP and data mining.

3.2 The big picture


Taniar and Rahayu (2022) state that the transformation process from an operational
database to a data warehouse is a small part of the big picture, where operational data
turns into useful and meaningful information in the organisation. Each phase in the big
picture has a certain role to play in the data journey. In the example given, the fuel is the
data, held in data services that include operational data sources, data warehouses, data
marts, data lakes, etc. The entire process is to convert one form of “energy” (e.g., the
operational database) into “organisational mechanical energy” (e.g., reports and decision
models) through complex processes covering data transformation, preparation,
pre-processing, integration and aggregation, all of which are the central activities of data
warehousing. The transformation process from the operational databases to the data
warehouse, which may involve data cleaning, filtering, extraction, integration and
aggregation, amongst others, is known as the extract, transform and load (ETL) process.
Extracting the data from the data warehouse is done by Online Analytical Processing
(OLAP), which then presents the retrieved data to a Business Intelligence (BI) tool for
producing reports and charts.

The figure below shows the entire journey of data in various forms and formats. It starts
from an operational database (or transactional database) which is the backbone of any
information system.

Figure 1.3. The Big picture (Taniar & Rahayu 2022)

3.2.1 Operational database


In the era of digital information every company has a database system, which is the system
used to operate the daily business activities of the company. The daily activities can be
sales, finance, booking systems or any transaction events that are pivotal to the business.
The big picture starts in the operational database, which is often called Online Transaction
Processing (OLTP), according to Taniar & Rahayu (2022). Thus, the use of an operational
database is centred around transactions.

It is called an operational database (also known as a transactional database) because it is
used to support the operation of the business. The data in the operational database
supports the daily operations of the business by primarily focusing on transactions, which
are the foundation of decision support. This data is then transformed into a data warehouse
to support systematic data aggregation. The various aggregation levels used in data
warehousing support data cubes, which allow managers of an organisation to understand
the data at various levels of aggregation.

In the transaction system database, correctness, especially in a concurrent setting, is
crucial because the centrally located database can be accessed simultaneously by various
people. In an operational database, transaction management plays a critical role in
guaranteeing data integrity and consistency. Hence, the primary purpose of the operational
database is to support data integrity, consistency and concurrency. Without an operational
database, the automation of business processes would be severely limited, as was the
case prior to the digital era. With the implementation of operational databases, business
processes can be automated to a far greater degree. Ultimately, the operational databases,
which are used to record transactions, serve as inputs to the data warehouse (UNISA 2022).
Example
An ATM withdrawal is an example of a transaction in the personal banking system, and
providing consistency in the personal banking system (as well as in any other system) is
critical. With the maturity of DBMS technology, transactions will always guarantee database
consistency, accuracy, and correctness.
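As a minimal sketch of how a DBMS provides this guarantee (the Account and
TransactionLog tables and their columns are hypothetical), the withdrawal can be wrapped
in a single atomic transaction:

    -- Hypothetical ATM withdrawal as one atomic transaction:
    -- either both statements take effect, or neither does.
    BEGIN TRANSACTION;   -- some DBMS products use START TRANSACTION

    UPDATE Account
    SET    Balance = Balance - 50
    WHERE  AccountNo = '12345'
      AND  Balance >= 50;           -- guard against an overdraft

    INSERT INTO TransactionLog (AccountNo, Amount, TxnType, TxnDate)
    VALUES ('12345', 50, 'ATM_WITHDRAWAL', CURRENT_DATE);

    COMMIT;                         -- on any failure, ROLLBACK restores the old state

If the system fails between the two statements, the DBMS rolls the transaction back, which
is precisely the consistency guarantee described above.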

3.3 Databases in accounting

Finance departments as well as accounting and auditing firms use accounting software to
record financial transactions. Most financial software uses databases to store and retrieve
financial data. For example, Pastel Partner software, which you will learn more about in
topic 7, uses a relational database to store data. We have learnt more about the different
available accounting information systems (AISs) in AIN1501 study unit 14.

Inadequate Accounting Information System (AIS) security increases the opportunity for
manipulation, falsification or alteration of accounting records. Thus, the accounting
profession recognises the need for increased security over AIS. Security of information has
become a major concern to all types of businesses as all technological advancements in hot
topic lists, represent fundamental changes to business practices, with significant system
security implications. The elevated importance of security stems from the recognition that
inadequate security over a system precludes any assurance that an AIS will produce
reliable information to meet internal and external reporting requirements (UNISA 2022).

3.3.1 System security


A major product of computer systems is information. For businesses, much of this
information is financial information, because financial information is typically processed by
the AIS, where the primary guardian is usually an accountant. Key components of system
security include passwords, firewalls, data encryption, employee participation and protection
from computer viruses. Thus, in developing a new AIS, system security measures should
be a primary focus, as the stability of security measures depends upon informed and
constant monitoring of the system, which is unfortunately often neglected. Although
insurance coverage will help cover lost profits caused by some threats, periodic internal
audits and surprise inspections will reinforce the system.

Generally speaking, and as a security philosophy, system security involves risk assessment
and counter-measure implementation to ensure that such systems will operate, function
correctly and be safe from attack by internal and external adversaries. Central to proper
security is that all stakeholders must understand the general need for security and specific
potential threats faced by the organisation. The system security topics of information
security and control, disaster recovery, and high availability and resiliency of systems are
both philosophical and financial. System security is often viewed in a manner similar to
physical security: buy it once and use it forever. Unfortunately, like physical security,
obsolete policies, procedures and technologies leave systems extremely
vulnerable to external and internal attacks. Most stakeholders find it difficult to accept the
need for constant spending on system security when it is difficult to quantify the benefits.
Even when benefits can be quantified, unenlightened stakeholders may still question the
need for continuous spending in the system security area. In many cases, education can
overcome this philosophical barrier. Unfortunately, often only severe losses from a security
breakdown will prompt appropriate, albeit late action. The benefits of system security can be
calculated and quantified from a loss exposure perspective. Once each system is identified
and prioritised in terms of sustaining daily operations and the dollar amount calculated for
the upper cost limits, the cost of the security system program or upgrade must be compared
to the upper cost limits of system failure. Before undertaking such tasks, the system’s
vulnerabilities in terms of passwords, firewalls, data encryption, and employees must be
understood. The greatest threat to computer security is unauthorised access to data or
equipment. There are five basic threats to security: persons outside the organisation – 5%;
natural disasters – 8%; disgruntled employees – 10%; dishonest employees – 10%; and
human error – 67% (UNISA 2022).

3.3.2 Cybersecurity risks


According to Eaton, Grenier and Layman (2019), by 2020, over one third of data was
projected to live in or pass through the Cloud, and within five years there would be over 50
billion smart connected devices. Accordingly, these trends provide ample opportunity for
cybercriminals looking to compromise an organisation's sensitive data. In fact, 4,500 data
breaches have been publicly disclosed since 2005 in the United States alone, including
recent attacks at well-known companies such as Equifax, Uber, Yahoo, Target, and Home
Depot. Stakeholders (e.g., investors, customers, suppliers) are also demanding more
information about cybersecurity. An FBI director asserted that there are only two types of
companies: those that have been hacked, and those that will be. As a result, managing and
reporting cybersecurity risks are high priorities for the management and board of almost
every company—particularly as proprietary financial and non-financial information, including
information related to customers and suppliers, is increasingly stored on networks and in
the Cloud. As a result, the AICPA, in conjunction with the Auditing Standards Board, issued
a voluntary reporting and assurance framework as a means for companies to communicate
their cybersecurity risk management efforts to interested stakeholders. In total, companies
were projected to spend over $1 trillion on cybersecurity efforts in the five-year period from
2017 to 2021.

Furthermore, public accounting firms are now equipped to help companies secure their
networks and work alongside companies to improve their current security systems. These
practices are growing rapidly as more companies seek help in protecting their information.
Market demand should also increase as companies start externally reporting on their
cybersecurity risk management efforts and obtaining assurance for the reporting (e.g., using
the AICPA Framework). Accordingly, beyond the need to protect organisational data, recent
academic research suggests numerous benefits of cybersecurity disclosures for client firms.
Two of the key benefits are that cybersecurity disclosures can be informative to investors
and can help mitigate the negative impacts of a subsequent breach. However, despite the
recommendations of regulatory bodies and the findings of recent research, firms often fail to
provide disclosures regarding cyber issues.

3.3.3 Blockchain technology as database engine

In their study, Tan and Low (2019) examined the prediction that blockchain technology will
transform accounting and the accounting profession because transactions recorded on a
blockchain can be aggregated into financial statements and confirmed as true and accurate.
They argued that in a blockchain-based AIS, accountants will no longer be the central
authority but will remain the preparer of financial reports required by regulations. However,
they will continue to influence policies such as the choice and accreditation of validators and
serve as validators of last resort. Using the three-tier architecture of the AIS, their study
addressed the gap in the literature about how characteristics of blockchain technology can
influence the implementation of a blockchain-based AIS, with related implications for the
accounting profession. Thus, they argue that blockchain technology affects the database
engine of the accounting information system (AIS) through digitisation of the current paper-
based validation process. However, audit evidence still needs to be gathered for rendering
of an audit opinion in a blockchain-based AIS. Digitisation of the validation process reduces
the error rate and lowers the cost of vouching and tracing, while the immutability of
blockchain data reduces the incentive and opportunities for fraud. Therefore, a blockchain-based AIS
alone does not guarantee that financial reports are true and fair. Lower error rates and
reduced incentives for accounting fraud in a blockchain-based AIS are expected to improve
audit quality. However, this prediction will need to be empirically tested when blockchain-
based AIS become available.

3.3.4 Blockchain as a decentralised ledger technology

Yu, Lin and Tang (2018), in their study shedding light on the potential application of
blockchain technology in financial accounting and its possible impacts, established that
blockchain, as a decentralised ledger technology with its characteristics of being
transparent, secure, permanent and immutable, has been applied in many fields such as
cryptocurrency, equity financing and corporate governance. However, blockchain
technology is still in the experimental stage and several problems have to be solved,
including limited data processing capacity, information confidentiality, and regulatory
difficulties. They argued that in the short run the public blockchain could be used as a
platform for firms to voluntarily disclose information. In the long run, the application could
effectively reduce errors in disclosure and earnings management, increase the quality of
accounting information and mitigate information asymmetry.

3.3.5 Blockchain as a distributed ledger technology

Schmitz and Leoni (2019) posit that blockchain is a distributed ledger technology expected
to have significant impacts on the accounting and auditing profession. Their study,
applicable and timely for both accounting and auditing scholars and practitioners, explored
blockchain technology and its main implications for the accounting and auditing profession.
The research question it addressed was: What are the major themes emerging from
academic research and professional reports and websites debating blockchain technology
in the accounting and auditing context? A literature review of academic literature and
professional reports and websites was performed to identify a taxonomy of emerging
themes. They found that the most discussed themes in scholarly works and professional
sources are governance, transparency and trust issues in the blockchain ecosystem,
blockchain-enabled continuous audits, smart contract applications and the paradigmatic
shift in accountants' and auditors' roles.

3.3.6 Blockchain as a disruptive force

According to Smith and Castonguay (2020), blockchain technology has been a disruptive
force in currency, supply chain, and information sharing practices across a variety of
industries. Its usage has only recently expanded into assurance and financial reporting.
Their study explored blockchain's impact in these areas and provided guidance for
organisations and auditors utilising blockchain by addressing financial data integrity issues,
financial reporting risks, and implications for external auditors and firms' corporate
governance practices. Organisations utilising blockchain must adapt their policies and
procedures regarding internal controls and counterparty risk assessment to address
increasing regulation of the distribution of financial data, while their audit committees must
be prepared to address these challenges leading up to financial statement preparation.
External auditors need to assess blockchain implementation as a financial reporting risk and
balance the potentially more reliable and timelier audit evidence obtained from blockchain-
based reporting systems against the related increase in internal control testing.

3.3.7 Role of accountants


Accountants must ensure the security of their computer systems against internal or external
threats such as sabotage from disgruntled employees, fraud, unintentional data errors,
power failure, damage to computers and loss of information integrity through unauthorised
alterations and modifications to data. Accountants should be familiar with security risks to
protect their own applications and computer use and properly advise clients and others in
their organisations about the various security risks to which they are exposed. Accountants
need to be familiar with the threats to computer security. Security risks are changing
dramatically as technology moves rapidly through various phases of use from mainframe to
PCs to networks and to more powerful network servers that are beginning to take on the
function and personality of mainframe computers (UNISA 2022).

According to Eaton, Grenier and Layman (2019), at a time when data breaches are common
headlines and companies are making massive investments in cybersecurity risk
management and reporting, accounting firms are in a unique position to help. An
analysis of recent major data breaches creates an opportunity to learn how companies'
security systems are compromised and demonstrate how public accounting firms can assist
in those areas. Specifically, the role of accountants should be considered in all stages of
effective cybersecurity risk management: risk identification and measurement, control
system design and testing, external reporting, and independent assurance. It is
important to note that accounting firms in their advisory and/or assurance capacities utilise
multidisciplinary teams comprising traditional accountants who are also trained in
IT/cybersecurity, working alongside IT/cybersecurity specialists who may not have an
accounting background to enhance cybersecurity efforts in the reporting and assurance
stages.

3.3.8 Role of accountants in the context of business environment


According to Saeidi et al. (2014), accountants can assume three roles, namely those of
being a designer, user, and auditor. As a designer of an AIS, the accountant brings
knowledge of accounting and auditing principles, information systems (IS) techniques, and
system development methods. In designing the AIS (or a component of AIS), the
accountant might answer such questions as:
a. What will be recorded (i.e., what is a recordable business event)?
b. How will the event be recorded (i.e., what elements will be captured and where will
they be stored), e.g., what ledger accounts will be used?
c. What controls will be necessary in these databases to provide valid, accurate and
complete records, to protect assets and to ensure that the AIS can be audited?
d. How will the data be retrieved? To perform analyses, to prepare information for
management decision-making and to audit financial records, the accountant must be
able to access and use data from public and private databases.

Databases: other accounting courses have emphasised accounting as a reporting function.
However, the full accounting cycle includes data collection and storage, and these aspects
must become part of the accountant's knowledge base. The advantages of a complete
understanding of AIS can be summarised as follows:
• The variety of databases, both private and public, and the quantity and type of data
available allow for increased efficiency and the expansion of services.
o The profession must find solutions to offer investors and stakeholders up-
to-date, real-time financial information and to increase transparency.
• CPAs must embrace technologies and social media to modernise and enhance
interaction and collaboration with clients.
• Due to increasing fraud that is becoming more difficult to detect, CPAs must
continue to be vigilant in ensuring that data and information are properly
interpreted to determine the quality and relevance of the information used for
decision-making (e.g., evaluating information for the assessment of risks) (UNISA 2022).

The AICPA has been looking to the future on a broad basis, resulting in the CPA Horizons 2025
report. Relative to technology, the report suggests the importance of technology as follows:
• CPAs must stay current with, embrace and exploit technology for their benefit.
Historically, the audit has performed an attest function to determine the reliability
of financial information presented in printed financial statements. This is expanding to
include the following:
– non-financial information not measured in monetary units (e.g.,
accountants might help determine occupancy rates for hotels or apartments)
– use of information technology to create or summarise information from
databases

The variety of opportunities within accounting was confirmed by the reports of the AICPA, the
Institute of Management Accountants (IMA) and the Big Five (at the time) public accounting
firms. Practitioners surveyed reported that accounting graduates would need to be able to
provide services in the areas of financial analysis, financial planning, financial reporting,
strategic consulting and systems consulting.

3.4 Database auditing

According to Anwar, Panjaitan and Supriati (2021), several research studies on database
auditing have been conducted. Some of them include theories that assist the auditing
process. Database auditing is a capability for searching the use of database authority and
resources at a high level. It is a crucial part of database security and governance
requirements and is critical to conduct a database audit to detect malicious activities,
maintain data quality, and optimise system performance. Database auditing is thus an

9
option for investigating transactions that occur and is a vital part of database security and
government regulatory compliance. However, business applications lose track of the
company's business operations due to a lack of data audits. The audit trails created by data
operations allow DBAs to do periodic examination of access patterns and data updates in
the Database Management System (DBMS). Database auditing addresses one of the most
serious issues in information security. Historical or temporal database data is required to
develop audits that track operations and types of activities through time. A database audit trail is a
generalised recording of "who did what to whom, when, and in what sequence." This
information is to be used to satisfy system integrity, recovery, auditing, and security
requirements of advanced integrated database/data communication systems. In this
process, it is important to know what information must be retained in the audit trail to permit
recovery and auditing later, as well as a scheme for organising the contents of the audit trail to
provide the required functions at minimum overhead. In this regard, auditing technologies
and methodologies are continually changing to catch up with business data processing
methods. For instance, the introduction of computers in business forced the creation of
Electronic Data Processing (EDP) auditing. Databases and distributed computing
substantively changed audit risks and necessitated the utilisation of essential new audit
tools. The advent of the internet, the consequent internetworking of applications, and the
progressive electronisation of many corporate processes have accelerated the trend and
the demand for new, more timely assurance processes (UNISA 2022).

3.4.1 Historical data auditing in databases


Anwar, Panjaitan and Supriati (2021) state that research on auditing database design and
on historical data has offered many techniques for performing historical data auditing in
databases. These include: (i) row-based auditing, (ii) column-based audit logs and (iii) semi-
structured table auditing. The first two methodologies may be performed using relational
databases, while semi-structured auditing relies on relational database extensions and
XML (eXtensible Markup Language) technology in databases like IBM DB2 9.5, Oracle 10g
and MS SQL Server.

3.4.2 Row-based relational auditing


Anwar, Panjaitan and Supriati (2021) use the relational model with row-based auditing to
audit DML activity on business transactions, taking into account operation status, valid time
and kind of operation. Based on the results of the above-mentioned research, it can be
determined that the auditing table should be isolated from the operational table. This is done
to separate the transaction workload from the analytic workload. When the audit table is
kept separate from the operational table, the DBMS (Database Management System)
engine may conduct auditing queries faster than when the table is audited as part of the
operational system. Furthermore, the DBA may administer the DBMS more simply;
preventing engine performance deterioration, reducing data redundancy, saving storage
and simplifying audit queries are all benefits of choosing the proper architecture. The
auditing findings can be saved for use by the DBA in analysing access patterns and making
changes to data in the database.
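As a minimal sketch of this architecture (the table, column and trigger names are
hypothetical, and trigger syntax varies between DBMS products; the form below follows
MySQL), a trigger can copy every change on the operational table into a separate audit
table recording who changed what, and when:

    -- Hypothetical row-based audit trail kept apart from the
    -- operational table, so audit queries do not slow down transactions.
    CREATE TABLE CustomerAudit (
        CustomerNo INTEGER      NOT NULL,
        Operation  VARCHAR(10)  NOT NULL,  -- 'INSERT', 'UPDATE' or 'DELETE'
        ChangedBy  VARCHAR(128) NOT NULL,  -- who performed the change
        ChangedAt  TIMESTAMP    NOT NULL   -- when the change happened
    );

    CREATE TRIGGER trg_customer_update
    AFTER UPDATE ON Customer
    FOR EACH ROW
        INSERT INTO CustomerAudit (CustomerNo, Operation, ChangedBy, ChangedAt)
        VALUES (NEW.CustomerNo, 'UPDATE', CURRENT_USER, CURRENT_TIMESTAMP);

The DBA can then query CustomerAudit on its own to analyse access patterns without
touching the operational Customer table.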

As technology develops, the use of electronic audit files in audits has increased. These
audit software applications use databases to store and retrieve data. Examples of audit
software that uses databases to store data are CaseWare and CCH TeamMate (UNISA
2022).

3.4.3 Database synchronisation


Database synchronisation, which is a type of replication in which every copy of a database
holds the same data, allows auditors to perform a variety of tasks, one of which is database
auditing, which logs every database activity. Thus, database synchronisation is a
component of replication, the act of guaranteeing that each copy of the database includes
the same objects and data. It permits data to be updated in real time or on a regular basis
as the data changes, and this can be used to provide audits that record every relevant
database activity. DBAs must be more knowledgeable about the methods used to secure
corporate data, and they must monitor and guarantee that sufficient data security is in
place. Information security issues arise in terms of authentication and authorisation when
information is used for business processes. Although these notions protect information,
they are of little use in investigations. Log files are presented as a way to track database
and system access. The log file's primary goal, however, is to restore (recover)
transactions (UNISA 2022).

3.4.4 Continuous auditing


Emerging information technology (IT) frameworks, such as extensible markup language
(XML) and Web services, can be utilised to facilitate continuous auditing for the next
generation of accounting systems. Relying on a number of components of Web services
technology, researchers have presented a model for continuously auditing business
processes, referred to as the continuous auditing web service (CAWS).
“web service” in the audit firm's computing environment and could be applied at a very
granular level to provide assurance about specific business processes, at a very aggregate
level for providing assurance relating to continuously reported earnings, or to provide
continuous assurance (CA) about the operation of internal controls resident in the audit
client's environment. The primary user of CAWS, given the current audit model, is the audit
firm itself. However, the proposed CAWS approach facilitates a new “pull” model of auditing,
where assurance consumers invoke the CAWS routines to obtain assurance on demand. In
such a model, the auditor would offer restricted views provided by the CAWS routines on a
fee basis to analysts, investors, financial institutions, and other parties interested in
obtaining CA of business performance or other audit objects of interest (UNISA 2022).

Groomer and Murthy (2018) demonstrated an approach to address the unique control and
security concerns in database environments by using audit modules embedded into
application programs. Embedded audit modules (EAM) are sections of code built into
application programs that capture information of audit significance on a continuous basis.
The implementation of EAMs is presented using INGRES, a relational database
management system. An interface that enables the auditor to access audit-related
information stored in the database is also presented.

3.4.5 Continuous online audit


Nowadays, financial statements are not as important to investors as they once were, as
technology has changed the way companies create value and, with it, the practice of
auditing in the era of e-commerce. While these changes pose serious threats to the
economic viability of auditing, they also create new opportunities for auditors to pursue.
Both the American Institute of Certified Public Accountants (AICPA) and the Canadian
Institute of Chartered Accountants (CICA)
Task Force on Assurance Services have identified continuous auditing as a service that
should be offered. Continuous auditing is significantly different from an annual financial
statement audit. A recent research report produced by the CICA defines a continuous audit
as “a methodology that enables independent auditors to provide written assurance on a
subject matter using a series of auditors’ reports issued simultaneously with, or a short
period of time after, the occurrence of events underlying the subject matter”. However,
continuous auditing would present significant technical hurdles. Therefore, as real-time
accounting and electronic data interchange become more widespread, computer-assisted
audit techniques (CAATs) are becoming even more necessary (UNISA 2022). The demand
for timely and forward-looking information hints that
the continuous audit will eventually replace the traditional audit report on year‐end results.

In addition, in the future, the entire concept of audit will change to a loose set of assurance
services, some of which will be statutory in nature. Many management processes
progressively rely on this future infrastructure. Four main issues distinguish assurance
processes from other management support functions: data structures, independent review,
the nature of analytics, and the nature of alarms. The data structures tend to focus on
cross-process metrics and time-series evaluation data. The analytics focus is on cross-
process integrity. E-Schwabe, for example, continuously monitors all trades and filters some
for tighter scrutiny by internal auditors. Its alarms are independently delivered to auditors
(and other parties) and are defined, reviewed, and tested by these assurance professionals.
Researchers propose that continuous assurance (CA) is therefore an aggregate of objectively
provided assurance services, derived from continuous online management information
structures — the objective of which is to improve the accuracy of corporate information
processes. These same services may also provide different forms of attestation including
point-in-time, evergreen, and continuous. The evolving field of continuous assurance is
centrally concerned with corporate monitoring: monitoring corporate IT systems (legacy,
middleware, and internet) for series of real-time administrative processes (such as cash
management or receivables management), high-level corporate metrics (key performance
indicators) and other processes.

3.4.6 Role of auditing

Anwar, Panjaitan and Supriati (2021) suggest that auditing is a process that involves
monitoring and recording activities from a user's database where an audit trail is the output
of the auditing process. Every database action that is audited creates an audit trail of the
information changes performed when auditing is enabled. The audit trail's contents contain
records that detail what occurred to the database. Each DBMS has its own restrictions in
terms of the number of records or event records it can manage. Database auditing,
according to Meg Coffin Murray, may be used to determine who accessed the database,
what actions were performed, and what data was modified. Auditing activities and database
access can aid in identifying and resolving database security concerns. Because auditing
analyses the record of activities, procedures, and behaviour of organisations or individuals,
it plays a critical role in ensuring compliance with the rules. The ability to follow changes in
the data trail, what modification actions were performed, and when they occurred using
historical data is one of the keys to successful auditing. Historical data may be modelled
relationally in databases using a variety of approaches, including distinct tables for historical
records, transaction logs, and multidimensional data.
Many auditors also use Microsoft Access or IDEA data analysis software to interrogate
their clients’ operational and financial data. These interrogations may find anomalies,
exceptions and trends in datasets obtained from clients, which then helps them to perform
sound, high-quality risk-based audits.
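As a minimal sketch of such an interrogation (the Payments table and its columns are
hypothetical), a query can flag potential duplicate payments, a common audit exception
test:

    -- Hypothetical exception test: flag possible duplicate payments
    -- (same supplier, same amount, same date, more than one record).
    SELECT SupplierNo,
           Amount,
           PaymentDate,
           COUNT(*) AS Occurrences
    FROM   Payments
    GROUP  BY SupplierNo, Amount, PaymentDate
    HAVING COUNT(*) > 1
    ORDER  BY Occurrences DESC;

Each row returned is an exception for the auditor to follow up, rather than conclusive
evidence of error or fraud.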

3.5 Data warehouses

Many organisations have multiple databases such as a database for financial information,
operational information, marketing and so forth. Because these databases are not linked, it
becomes difficult and extremely cumbersome to analyse data when it is needed from more
than one database. Some organisations also have massive databases. Running queries on
these huge databases requires a great deal of processing, which may affect operations
owing to the slowness of response times. Data warehouses have been created to overcome
these problems.

A data warehouse is a database populated with current and historical data
extracted from the organisation’s various databases. It may also contain
data from external sources and/or databases (UNISA 2022).

Over the last couple of decades, with the advent of computers with database capabilities,
companies have done more logging and data capture. Many have found that this data
would be quite useful, if only the information were available for statistical analyses. Both
warehouses and marts store information about clients, demographics, interactions and
transactions; this is not limited to commercial gains but can be applied in any number of
fields (from astronomy to zoology: anything that can be measured and has volumes of
data). For example, a transaction log history would keep information like “Joe X withdrew
$50 each weekend at 9 am on Saturday morning” or “the most reliable visual astronomical
observations were made before sunrise”. Of course, as in the last case, the conclusions are
sometimes fairly obvious.

The increasing popularity of intranets, and of the internet itself, has given rise to repositories
of data and engines that can search for correlates for internal use and for "sellable"
information (e.g., what kind of people watch what kind of television shows during what times
of the day). The database model types normally used for data warehouses are relational or
multidimensional (UNISA 2022).

3.5.1 Purpose of the data warehouse


Depending on the purpose of the data warehouse, it sometimes only contains summary
information or only certain specific related data extracted from the various databases. The
data warehouse is updated periodically, usually according to a predetermined frequency,
from the various source databases. These updates can also happen as the source database is
updated; the data warehouse is then referred to as a real-time data warehouse. Bear in
mind that data cannot be modified in the data warehouse. It can only be updated by the
source databases, which means that any updates to data must be done in the source
database. One could therefore say that a data warehouse is “read-only” (UNISA 2022).

As you can imagine, data warehouses are massive because they contain data from various
databases. Running queries on the data warehouse can be painstakingly slow because of
the size of the data warehouse.

3.5.2 Data warehouse transformation


Data warehouse transformation is basically changing the structure of the database to
support data analysis once a data warehouse is built. Data warehouse transformation and
implementation require a systematic and structured approach. Using the data in the data
warehouse is achieved through Online Analytical Processing (OLAP), which is a querying
mechanism to obtain the required data from the data warehouse for further analysis (UNISA
2022).

3.5.3 Data warehouse operational environment


Data warehouses are specially designed to handle different types of queries: queries
based on statistical analysis. Most companies, until recently, were forced to build their own
warehouses. Now, there are several companies which sell warehouse and mart databases.
However, these tools are very costly (mart pricing may run into hundreds of thousands of
dollars, and warehouses often exceed millions) and are more general than custom-built
ones. On the one hand, the advantage of custom-built marts and warehouses is that they
ensure that the structure and queries match the data, but the customisation makes them
very difficult and expensive to maintain. On the other hand, off-the-shelf marts and
warehouses are maintained by the third party but are more general and less useful than
custom ones. In either case, they can easily grow beyond anything manageable, such as in the
case of enterprise data warehouses. Until there are better definitions of what and how to
process the volumes of available data within a company in a meaningful and reliable way,
any company considering implementing a data warehouse or data mart will have to
anticipate a growing monster that will require more IT/IS staff than it currently employs
and will be marginally reliable in reporting "nuggets of market-savvy truth".

According to Fleckenstein and Fellows (2018), data warehouses were traditionally built to
reflect “snapshots” of the enterprise-level operational environment over time. In other
words, a certain amount of operational data was recorded at a particular point in time and
stored in a data warehouse. Originally, such snapshots were typically taken monthly.
Today, they are often taken multiple times per day. Data warehouses provide a history of
the operational environment suitable for trend analysis. This allows analysts and business
executives to plan based on recent trends. Answers to questions such as how revenue or
costs have evolved over time and the ability to “slice and dice” such data are typical
functions asked of a data warehouse.

3.6 Data mart


Owing to the size of data warehouses and the resulting slow updating and querying of the
data warehouse, a need arose to be able to interrogate “smaller” data warehouses. This
requirement gave rise to data marts.

A data mart is a smaller data warehouse extracted from the main data
warehouse and contains specific related data extracted for a specific
organisational user group such as the finance department or the marketing
department (UNISA 2022).

The use of data marts makes running queries much quicker than running queries on the
full data warehouse. In response to the unnatural homogeneity and sheer data collection
problems of data warehouses, data marts tried to cut down the database by focusing on
topics or specific subjects. Focusing on more specific topics helped structure the data in a
more intuitive way and made the information more accessible. The collections would still be
gathered from other sources, including warehouses and other data marts. Lastly, marts
were easier to compartmentalise so that off-the-shelf solutions could be sold. The classical
transaction database is not able to do analytical processing, because

• transactional databases contain only raw data, and thus the processing speed will
be considerably slower
• transactional databases are not designed for queries, reports and analyses
• transactional databases are inconsistent in the way that they represent information
(UNISA 2022).
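A data mart itself is typically derived from the warehouse rather than from these
transactional databases. As a minimal sketch (the table names are hypothetical, and
CREATE TABLE ... AS SELECT syntax varies slightly between DBMS products), the
finance department's mart might be built as a subject-specific extract:

    -- Hypothetical finance data mart: a smaller, subject-specific
    -- extract of the main data warehouse for one user group.
    CREATE TABLE FinanceMart AS
    SELECT s.TimeID,
           s.ProductNo,
           s.Total_Sales
    FROM   SalesFact s
           JOIN TimeDim t ON t.TimeID = s.TimeID
    WHERE  t.Year >= 2022;    -- only the recent periods finance needs

Queries against FinanceMart then scan far fewer rows than the same queries against the
full data warehouse.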

Hurst, Liu, Maxson, Permar, Boulware and Goldstein (2021) state that the development of
an electronic health records (EHR) datamart to support clinical and population health
research is necessary and that EHR systems represent an important research data source. This type
of data is highly complex and can be difficult to access. Typically, EHR data are stored in
an enterprise data warehouse (EDW) along with a number of other data sources such as
billing and claims data, laboratory tracking systems, and scheduling data that underlie
health system operations. These data warehouses require significant expertise and time to
navigate, and access is typically restricted to a small number of individuals to manage
privacy and legal concerns associated with access to large amounts of protected health
information (PHI). One way to make EHR data more accessible and actionable for research
purposes is to organise it into smaller relational databases, referred to as datamarts. These
datamarts are typically organised under Common Data Models (CDMs). CDMs, such as
those used by the National Patient-Centered Clinical Research Network (PCORnet) and/or
the Observational Medical Outcomes Partnership (OMOP), comprise a set of rules for how
to turn raw EHR data into simpler data models. These efforts have stimulated a significant
number of retrospective analyses and innovative multicentre clinical trials.

3.7 Data mining

Data mining often produces impressive quantifiable benefits across a broad range of
industries in a wide variety of applications. Data mining yields firm numbers that can make
the case not only for data mining, but for your whole data warehouse effort. For example, a
large wireless company dramatically increased its profitability using data mining. Faced
with a high churn rate (percentage of customers leaving), 40 per cent of the customer base
still using analogue as opposed to digital services, and a low monthly minutes usage that
resulted in an average revenue per user of less than $50, they turned to data mining. If they
could keep and upgrade more customers, the potential payback was significant. They might
otherwise lose 700,000 customers per month, at an annual replacement cost of $360
million! The data consisted of hundreds of fields, with approximately one third coming from
call detail records. Using SPSS's Clementine to mine the data on a Teradata platform, they
built a series of models that scored customers on their likelihood to leave and succeeded in
finding sets of rules that would predict customer behaviour. They confirmed the wisdom of
delivering the right offer at the right time, which meant talking directly to customers as well
as sending customised direct mail. To succeed, they needed several coordinated teams to
work together (UNISA 2022).

Lee (2017) states that, irrespective of the community, there is one point of consensus
about data mining: data mining is a field of study about discovering useful summary
information from data. As innocent as it may look, there are important questions behind
discovering useful summary information. The following is an approach that can be followed
as a starting point.

3.7.1 What is summary information?


Data should not be considered equivalent to information, at least not to summary
information. Suppose we have a data set of daily weather temperatures for a year. We can
abstract the data into pieces of summary information, each of which is a monthly average,
or into one piece of summary information, an annual average. In essence, summary
information should be an abstraction that preserves certain properties of the data. By
focusing on the statistical and probability properties of the data, we will be able to better
understand how to interpret the meaning behind the information, to verify its correctness in
the analytical process, and to validate its truthfulness with respect to its attempt to
characterise a physical real-world phenomenon (UNISA 2022).
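As a minimal sketch (assuming a hypothetical DailyTemperature table with one reading per
day), both abstractions can be computed directly in SQL:

    -- Twelve pieces of summary information: one average per month.
    SELECT EXTRACT(MONTH FROM ReadingDate) AS Month,
           AVG(Temperature)                AS MonthlyAvg
    FROM   DailyTemperature
    GROUP  BY EXTRACT(MONTH FROM ReadingDate);

    -- One piece of summary information: the annual average.
    SELECT AVG(Temperature) AS AnnualAvg
    FROM   DailyTemperature;

Each query preserves a different property of the underlying data: the first keeps the
seasonal pattern, while the second reduces the whole year to a single number.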

3.7.2 Stages of data mining


A typical data mining project consists of the following stages:
1. A question is floated by senior management (for example: Can we get a better
forecast of next year’s milk production? How can we better manage the replacement
schedule for the city’s infrastructure?).
2. Data requirements are scoped out, and data sources are identified.
3. Data is assembled, cleaned and prepared for analysis.
4. Models are built, predictions are constructed and relevant patterns are identified.
5. The results are sold to management.
6. Implementation (UNISA 2022).

3.7.3 Data mining software


Data mining software is used to analyse data sets to uncover previously
unknown trends, patterns and relationships between data (UNISA 2022).

These analyses can be used in decision making (including strategic decisions), forecasts,
predictive modelling, fraud detection, risk management and so on. The data sets used in
data mining are usually a data warehouse or a data mart, but data mining may also be
performed on source databases. For example, an insurance company using data mining
on its motorcar claims could uncover the fact that red cars with drivers younger than 25
years are more likely to be involved in an accident. The company could use this information
to correctly price insurance premiums for red motorcar drivers 25 years and younger.

Examples of data mining software include RapidMiner, SAS Enterprise Miner, IBM
SPSS Modeler and Orange.

3.7.4 Extraction of data information


The extraction of information from data relies heavily on a group of techniques: (i)
visualisation, (ii) prediction and classification, and (iii) finding patterns using clustering and
association rules. These techniques are based on clustering, dimension reduction,
visualisation and association rules. Collectively, these are tools for statistical learning.
3.7.4.1 Prediction and classification
When making predictions, implicitly or explicitly, we assume that the future data being
predicted is in some sense similar to the data used to construct the prediction rule. In
traditional statistics, it is assumed that the distribution of the data in these two phases
(construction and prediction) is the same. If this is not the case, predictions will be biased,
no matter how big the data sets are. We cannot rely on the “bigness” of big data to protect
us from error.
a. Supervised learning refers to prediction and classification, where we want to predict the
value of a variable (target, response) in terms of other variables (features, predictors,
covariates). It is based on models, where we assume some mathematical relationship
between the target and the features.
b. Unsupervised learning is when we do not differentiate between target and features but
want to identify patterns and understand structure (UNISA 2022).
3.7.4.2 Finding patterns
When finding patterns in data, we must beware of spurious patterns arising by chance or
through some artifact of the data collection process: the phenomenon of typing monkeys
producing the works of Shakespeare. This problem is made worse by the size of modern
data sets (UNISA 2022).
3.7.4.3 Forms of data
Data can assume many forms: database tables, Twitter feeds, images, e-mails,
spreadsheets and SAS data sets. It can be in many formats (raw text, XML, JSON). Most
statistical learning procedures require these data to be processed into rectangular data
sets acceptable to the data mining software, with rows representing measurements on
objects and columns representing variables (the measurements). Often, missing values are
not permitted. Getting data into this form is a very big part of data science.
Computers are getting more powerful, but data is getting bigger. Our rectangular data set
may not fit into computer memory, or it may take too long to process. Thus, parallel
computing, segmentation of computations and data streaming (working on a bit at a time)
are important data science topics (UNISA 2022).
3.7.4.4 Data cleaning and imputation
Much real-world data is “dirty” – full of errors and missing values. Data cleaning (correcting
of errors) and imputation (filling in of missing values) are really important.

3.8 Online analytical processing (OLAP)

Online analytical processing (OLAP) software enables users to
interactively and rapidly analyse large data sets from various viewpoints (i.e.,
OLAP can handle multidimensional queries) (UNISA 2022).

As an example, consider the following question: How many pairs of red shoes were sold per
month to persons aged 20 to 30 years? The multiple dimensions used in the query were
product type (shoes), product colour (red), time period (month) and age (20–30 years).
Because these data sets are usually multidimensional, they are stored in a multidimensional
database, although some OLAP software is also compatible with relational databases.
OLAP is used in business intelligence, budgeting, forecasting, management reporting
and so on. IBM Cognos Business Intelligence and Oracle Database OLAP Option are
examples of OLAP software.

According to Taniar and Rahayu (2022), once a data warehouse is created, the next step is
to use it. Using the data warehouse means extracting data from it for further data analysis.
The query that extracts data from the data warehouse is an Online Analytical Processing
(OLAP) tool. OLAP is implemented using SQL. Because it uses SQL commands to retrieve
data from the data warehouse, the result is a table and the data is in a relational table
format. In short, a data warehouse is a collection of tables; OLAP queries the tables and
the results are also in a table format. The following is an example of an OLAP query that
retrieves total sales of dresses and belts in 2022 and 2023 in South Africa. This query uses
GROUP BY CUBE, which returns not only the totals specified in the WHERE clause but
also the respective subtotals and the grand total:

select
    T.Year, P.ProductName,
    sum(S.Total_Sales) as TotalSales
from
    SalesFact S,
    TimeDim T,
    ProductDim P,
    LocationDim L
where S.TimeID = T.TimeID
  and S.ProductNo = P.ProductNo
  and S.LocationID = L.LocationID
  and T.Year in (2022, 2023)
  and P.ProductName in ('Dresses', 'Belts')
  and L.Country = 'South_Africa'
group by cube (T.Year, P.ProductName);

The results of the above query are shown in the table below as OLAP raw results:

Year   Product Name   Total Sales
2022   Dresses          2 252 600
2022   Belts              675 760
2022                    2 928 360
2023   Dresses          1 684 548
2023   Belts            1 357 179
2023                    3 041 727
       Dresses          3 937 148
       Belts            2 032 939
                        5 970 087

Table 3.1: OLAP raw results (adapted from Taniar & Rahayu 2022).

The blank cells indicate sub-totals. For example, the line containing 2022 with an empty
product name shows the total sales for 2022 (the sub-total for 2022), whereas the line with
an empty year followed by Dresses indicates the total sales for dresses (across 2022 and
2023). The last line in the results is the grand total: the total sales of dresses and belts in
both years. Note that the results contain only the data that satisfies the SQL query, and
there is no fancy formatting. The formatting itself is not part of OLAP. Using SQL
commands, OLAP retrieves raw data, which can later be transformed using any Business
Intelligence (BI) tool. The focus is therefore on the data, as the retrieved data is the most
important part; the BI tool handles further presentation and visualisation. The data retrieved
by the SQL command, as shown in Table 3.1, can later be formatted in a number of ways
depending on the use of the data in the business, as well as the features available in the BI
tools. For example, the data can be shown in a matrix-like format, such as Table 3.2:

            2022          2023          Total
Dresses     R2 252 600    R1 684 548    R3 937 148
Belts       R  675 760    R1 357 179    R2 032 939
Total       R2 928 360    R3 041 727    R5 970 087

Table 3.2: Results in matrix format (adapted from Taniar & Rahayu 2022).

In this matrix format the respective sub-totals are shown more clearly. The data can also be
presented in various graphs. Presentation and visualisation are not the focus of OLAP:
OLAP only retrieves the required raw data. The BI tools that receive this data can present it
in any form: reports, graphs, dashboards and so on. Some BI tools have complex features,
whereas others may be simple but adequate for the business. For example, Microsoft Excel
is often deemed adequate for presenting basic graphs, and R also offers some visualisation
features.
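
The matrix layout itself can also be produced directly in SQL before any BI tool is involved, using conditional aggregation. A minimal sketch, assuming the same tables as before (vendor-specific PIVOT syntax would be an alternative):

-- Pivot years into columns with CASE inside SUM (portable across most DBMSs).
SELECT
    P.ProductName,
    SUM(CASE WHEN T.Year = 2022 THEN S.Total_Sales ELSE 0 END) AS Sales2022,
    SUM(CASE WHEN T.Year = 2023 THEN S.Total_Sales ELSE 0 END) AS Sales2023,
    SUM(S.Total_Sales) AS Total
FROM
    SalesFact S, TimeDim T, ProductDim P, LocationDim L
WHERE S.TimeID = T.TimeID
  AND S.ProductNo = P.ProductNo
  AND S.LocationID = L.LocationID
  AND T.Year IN (2022, 2023)
  AND P.ProductName IN ('Dresses', 'Belts')
  AND L.Country = 'South_Africa'
GROUP BY P.ProductName;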


OLAP is the foundation of data analytics, as it retrieves the required data from the data
warehouse so that the data can then be fed into the analytical engine. The data retrieved by
OLAP is raw data. This raw data needs to be presented in a suitable and accessible format
for management to understand the various aspects of the organisation. This is the role of
business intelligence (BI), which takes the raw data from OLAP and creates various reports,
graphs and other presentation tools (Taniar & Rahayu 2022).

3.9 Similarities and differences between OLAP and data mining


OLAP and data mining are similar, but technically different. Both are business intelligence
tools that complement each other, and they are often used in conjunction. They do, however,
operate on data differently. OLAP is mainly used to summarise data: for example, the
number of smartphones sold per month and per region to persons aged 20 to 30 years.
Data mining is used to break data down in order to uncover trends, patterns and
relationships, and would be used to answer a question such as: What factors resulted in the
increase in smartphone sales in Gauteng during September? OLAP and data mining are
therefore used to answer different kinds of questions. Data mining answers questions such
as: Why have sales increased in Mpumalanga? or What characteristics make a person more
likely to buy our product? OLAP answers questions such as: What is the value of motor
accident claims for women driving green motorcars in KwaZulu-Natal, by month? or What
are the average sales of electric drills by month and by region? (UNISA 2022). The last of
these translates directly into SQL, as sketched below.
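
A minimal sketch of that OLAP-style question, again assuming the hypothetical star schema used earlier (the Region attribute on LocationDim is an assumption):

-- OLAP-style summarisation: a straightforward aggregate over known dimensions.
SELECT
    L.Region, T.Month,
    AVG(S.Total_Sales) AS AvgSales
FROM
    SalesFact S,
    TimeDim T,
    ProductDim P,
    LocationDim L
WHERE S.TimeID = T.TimeID
  AND S.ProductNo = P.ProductNo
  AND S.LocationID = L.LocationID
  AND P.ProductName = 'Electric drill'
GROUP BY L.Region, T.Month
ORDER BY L.Region, T.Month;

A data-mining question such as "Why have sales increased in Mpumalanga?" has no such direct SQL equivalent; it requires algorithmic analysis across many attributes.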

3.10 Comparing OLAP and OLTP


OLAP is read-only, as it focuses on retrieving data from a data warehouse for further
analysis. OLTP, on the other hand, supports the daily operations of the business and
focuses on transactions (e.g. inserting, updating and deleting transaction data) so as to
maintain data integrity and consistency in a concurrent-access environment (UNISA 2022).
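
The contrast is visible in the SQL itself: OLTP statements are small writes wrapped in transactions, while OLAP statements are large, read-only aggregates. A minimal sketch (the Account table and its columns are assumptions for illustration; exact transaction syntax varies by DBMS):

-- OLTP: a short, concurrent write that must preserve integrity.
START TRANSACTION;
UPDATE Account SET Balance = Balance - 500 WHERE AccountID = 1001;
UPDATE Account SET Balance = Balance + 500 WHERE AccountID = 2002;
COMMIT;

-- OLAP: a read-only aggregate scanning large volumes of warehouse data.
SELECT T.Year, SUM(S.Total_Sales) AS TotalSales
FROM SalesFact S, TimeDim T
WHERE S.TimeID = T.TimeID
GROUP BY T.Year;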

3.11 Business Intelligence (BI)


Using visual presentation, data from OLAP can be made more appealing and meaningful to
management. This visualisation is often presented as a dashboard on which managers can
also perform simulations (UNISA 2022).

3.12 Data analytics


The last phase in the big picture is data analytics. Data analytics focuses on algorithms that
determine the relationships between data in order to find patterns that could otherwise not
be found, and to perform predictive analysis. Data analytics applies various analytical
methods to the data in the data warehouse; these methods are specialised and tailored for
data warehouses (UNISA 2022).
3.13 Difference between BI and data analytics
The major difference between BI and data analytics is that data analytics has predictive
capabilities, whereas BI focuses mainly on analysing past data to help inform decision-
making. The data engine takes the data and processes it to produce useful information and
knowledge. This includes reports, decision models and machine learning models, amongst
others. This information and knowledge drive the organisation. The data engine is a type of
processing software that produces these outcomes (UNISA 2022).

3.14 Summary
In this study unit, we briefly examined how databases are used in accounting and auditing.
We also discussed data warehouses, data marts, data mining and OLAP.

The next topic deals with how to develop and create spreadsheets to solve problems in a
business and accounting context using appropriate formats, formulas and functions. We
will also gain an understanding of the risks and controls associated with spreadsheets.

Activity 3.1

For a company that has just been established, e.g., Clothing Store, which
form of data utilisation might add more value to the company – data mining or
OLAP? Share your view on the Discussion Forum with reasons or examples
to support your answer.

Go to Discussion Forum 3.1 and discuss your findings with your fellow students.

Guidelines for participating in forums:


• Compile your post offline and keep record of it.
• Use an academic writing style for referencing and citing the sources you used.
• Post your answer on the forum.
• Reply to contributions of at least two of your fellow students.

References
Anwar, M.R., Panjaitan, R. & Supriati, R. (2021). Implementation of database auditing by
synchronization DBMS. Int. J. Cyber IT Serv. Manag., 1(2), pp. 197-205.

Eaton, T.V., Grenier, J.H. & Layman, D. (2019). Accounting and cybersecurity risk
management. Current Issues in Auditing, 13(2), pp. C1-C9.

Fleckenstein, M. & Fellows, L. (2018). Data warehousing and business intelligence. In
Modern Data Strategy (pp. 121-131). Springer, Cham.

Groomer, S.M. & Murthy, U.S. (2018). Continuous auditing of database applications: An
embedded audit module approach. In Continuous Auditing. Emerald Publishing Limited.

Hurst, J.H., Liu, Y., Maxson, P.J., Permar, S.R., Boulware, L.E. & Goldstein, B.A. (2021).
Development of an electronic health records datamart to support clinical and population
health research. Journal of Clinical and Translational Science, 5(1).

Dean, T., Lee-Post, A. & Hapke, H. (2017). Universal design for learning in teaching large
lecture classes. Journal of Marketing Education, 39(1), 5–16.
https://0-doi-org.oasis.unisa.ac.za/10.1177/0273475316662104

Saeidi, H., Prasad, G.V.B. & Saremi, H. (2014). The Role of Accountants in Relation to
Accounting Information Systems and Difference between Users of AIS and Users of
Accounting. Vol 4 [11] October 2015: 115-123.
Schmitz, J. & Leoni, G. (2019). Accounting and auditing at the time of blockchain
technology: A research agenda. Australian Accounting Review, 29(2), pp. 331-342.

Smith, S.S. & Castonguay, J.J. (2020). Blockchain and accounting governance: Emerging
issues and considerations for accounting and assurance professionals. Journal of Emerging
Technologies in Accounting, 17(1), pp.119-131.

Tan, B.S. & Low, K.Y. (2019). Blockchain as the database engine in the accounting
system. Australian Accounting Review, 29(2), pp.312-318.

Taniar, D. & Rahayu, W. (2022). Data Warehousing and Analytics: Fueling the Data Engine.
Springer Nature.

Vasarhelyi, M.A. & Halper, F.B. (2002). Concepts in continuous assurance. Researching
accounting as an information systems discipline, pp.257-271.

Yeo, D. (2021). Is data mining merely hype? In Data Management (pp. 777-782). Auerbach
Publications.
