Enabling The Forensic Study of Application-Level Encrypted Data in Android Via A Frida-Based Decryption Framework
ARES 2023, August 29–September 01, 2023, Benevento, Italy

this information needs to be discovered by means of suitable analysis techniques, which are however quite complex to carry out;
(2) the data modified by the app needs to be inspected continuously, ideally after each action carried out on the app during the experimentation. This induces a loop in which the data are modified, extracted from device memory, and decrypted, which significantly increases the time and effort of carrying out the forensic study.

In this paper, we propose a framework able to address both the issues mentioned before in scenarios where encryption is achieved by using third-party software components, as discussed below.

1.1 Problem definition
Generally speaking, there are two possible ways for an app to implement encryption. The first one involves using homemade encryption code. This approach, however, is increasingly discouraged, as it is quite difficult to write correct and robust encryption code [14]. The second approach, which is adopted by the vast majority of modern mobile apps, is to use third-party encryption code, which is provided either as a library of functions (e.g., Android Jetpack Security [3] or javax.crypto [2]), or as an encryption service running locally (e.g., encrypted databases like the SQLCipher [37] extension to SQLite or the Realm [35] DBMS). This latter approach is more robust and secure than the homemade one. In this paper, we thus focus on apps using the latter approach, since it allows us to maximize the applicability of the framework we have developed.

Third-party encryption code provides two distinct methods to encrypt application data, which can be used either individually or in combination, namely [27]: (a) encrypt data prior to their storage into a file or a database; (b) store plaintext data into an encrypted database.

In both cases, to obtain the plaintext of these data, the experimenter needs to determine the encryption algorithm, parameters, and key used by the app, which may entail performing a quite complex reverse engineering of that app and of the encryption code. More specifically, reverse engineering of these apps entails the combined use of both static and dynamic analysis techniques, and requires both technical skills, which are typically not in the possession of all experimenters, and a significant amount of manual work [16, 18, 19, 24].

It is important to note that application-level encryption adds an extra layer of encryption to the other encryption layers used by modern mobile operating systems, such as File-Based Encryption [25] (FBE). As a consequence, even after the operating system decrypts its file systems when the experimenter unlocks the device, data encrypted by an app remains encrypted.

1.2 Our framework
The framework we propose in this paper is able to automate the extraction of plaintext data from the memory space of a running app. As a matter of fact, encrypted data stored on persistent storage necessarily needs to be decrypted before being loaded into main memory when the app needs to process it (in other words, these data are encrypted only when they are stored on persistent storage). The corresponding plaintext can therefore be obtained by intercepting at run time the calls made by the app code to all the functions that decrypt data, and by modifying the behavior of these functions in order to force them to export these data to external storage for the perusal of the experimenter.

Function call interception is achieved through a technique known as hooking [13], which consists in dynamically instrumenting the binary code of the app so that (a) the calls to specific functions are intercepted at run time, and (b) external code injected into the application space (the hook) is executed before the actual execution of the function code.

This process is conceptually depicted in Fig. 1, where we show five apps: App 1 and App 2 use Data encryption library X to encrypt their data, while App 3, App 4, and App 5 use Database encryption library Y.

Figure 1: Conceptual approach.

As shown in Fig. 1, in our approach function call interception is carried out at the level of the individual encryption software components, e.g. Data encryption library X and Database encryption library Y. This is achieved by injecting, into the app's binary code, a hook specific to the encryption library it uses.

An important point to make here is that hooks work at the function level, so once the hooks for a given encryption library have been developed, they can be used to get the plaintext out of any app using that library. For instance, in Fig. 1, plaintext data for App 1 and App 2 is extracted using the hooks for Data encryption library X, while for App 3, App 4, and App 5 the hooks for Database encryption library Y are used.

Our framework exploits this fact to provide automation in plaintext extraction. In particular, it automates the injection of hooks into an application under study, provided that the hooks for the encryption library it uses are available, and the extraction and storage of plaintext data. To the best of our knowledge, there is no other similar framework that has been published in the literature.

In its current implementation, our framework supports two prominent and widely-used encryption libraries, namely SQLCipher [37] for SQLite databases and Jetpack Security [3] for files (support for the latter is only partial, but work is ongoing to complete it), as well as Realm [34], a DBMS for mobile apps which is an alternative to SQLite and provides encryption as a built-in feature.

To carry out hooking, our framework relies on the Frida toolkit [28] for binary instrumentation, so it can be implemented on any operating system supported by Frida, and in particular on both
Android and iOS. In this paper, however, we focus on Android, and we leave the iOS case to future work.

As a final consideration, it should be noted that our framework requires that the apps under study are executed with super-user privileges (i.e., on rooted Android devices). This, however, is not a real limitation since, as discussed before, the focus of our work is on the experimental forensic study of mobile apps, which entails the use of test devices under the complete control of the experimenter, and the interaction with the app to elicit the generation of data. In other words, our framework is not intended to be used for the forensic analysis of app data stored on seized devices.

2 Related work
The problem of decrypting the data encrypted by apps, both for Android and other operating systems, has already been tackled in the literature.

[15] studies the decryption of databases of the Telegram X and BBM Enterprise applications, both for Android and for Windows platforms. The authors find that both applications use the SQLCipher encryption library, and perform manual reverse engineering to discover the encryption parameters and the key generation procedure used by them. [18] focuses on the decryption of databases of the NateOn, KakaoTalk, and QQ messengers on Windows platforms. The authors also perform manual reverse engineering to discover the key generation and the encryption procedure used by these apps, all of which use home-brewed encryption code. In [24], the authors analyze 56 note and journal Android apps, and find that, while 95% store their data locally in an insecure way, the remaining 5% use home-brewed encryption libraries for their databases, and perform manual reverse engineering to recover the key generation procedure and the encryption algorithms and parameters. [19] focuses on the decryption of the databases for the Signal, Wickr, and Threema instant messengers. The authors find that all these apps rely on the SQLCipher library, and a thorough manual reverse engineering is carried out in order to determine the key generation process and the encryption parameters. [27] proposes an approach, based on static code analysis, that automates the process of discovering whether an Android app encrypts its database, and in some cases also the encryption method that is used. However, no means of recovering the encryption key and parameters for the analyzed apps is provided.

All these works show the relevance of the problem we tackle with our framework, and that the manual recovery of the encryption key and the other encryption parameters, required to decrypt databases, is a complex, time-consuming, and possibly error-prone process. In contrast, our framework, by automating the extraction of these parameters from prominent database encryption libraries, avoids the problems characterizing manual analysis.

Other works have instead focused on the decryption of backups generated by utilities developed by smartphone vendors, such as [21–23]. These works focus on a problem different from that targeted by this paper, so they do not compare to our work directly.

There are other works in the literature that rely on hooking to obtain relevant information from the memory space of running processes. [26] proposes a hooking framework aimed at monitoring sensitive methods in shared objects, while [13] proposes a framework for the automated analysis of app code with the aim of identifying hooks placed by malicious code. These works, however, deal with problems different from the one we address in this paper.

3 The decryption framework
As we already mentioned, our framework works by dynamically (i.e., at run time) instrumenting the app binary code to hook the decryption functions it calls, so that the calls to these functions can be intercepted and, consequently, the plaintext data they obtain after decryption can be exported outside the app. Furthermore, if the encryption key and parameters are passed to these functions, our framework exports them too.

In this section, we present the above framework by first discussing its architecture (Sec. 3.1), then by illustrating some guidelines on (a) how it can be concretely implemented (Sec. 3.2) and on (b) how it can be used with a specific app (Sec. 3.3).

3.1 Framework architecture
As we anticipated in Sec. 1, our framework is based on Frida, an open source, multi-platform, binary dynamic instrumentation toolkit. Frida is able to inject a custom script (written either in the JavaScript or the TypeScript languages) into a running process, and to hook that script to any function called by the process. In this way, when a hooked function is called by the running process, the corresponding script is executed either prior to or after the execution of the function, depending on which sequence is required to get plaintext data.

The architecture of our decryption framework is schematically depicted in Fig. 2, where the components we developed are highlighted in light gray, while the ones shown in white are those belonging to the Frida toolkit.

Figure 2: Framework architecture.

As shown in the figure, the framework runs on two distinct systems: (a) an Android device, where the app of interest is running, and (b) an experimenter's machine, to which the device is connected. It encompasses four distinct components, two of which belong to Frida, while the other two have been developed by us. In particular:
• the Plaintext extraction agent: it is the code, injected into the app binary code, which takes care of accessing the plaintext data and of exporting them to external storage. The agent includes also a hook, which is the code injected into the
ARES 2023, August 29–September 01, 2023, Benevento, Italy Anglano, et al.
app binary code that starts the execution of the agent when the functions of the Encryption library that decrypt data are called by the app;
• the Automation console: it provides the experimenter with suitable mechanisms to inject specific agents and hooks for the app under study;
• Frida toolkit components: they are used to inject hooks and agents into the running app, and to collect the results generated by the agents above. More specifically, they are:
  – the FRIDA Server: it spawns the process of the app under study, injects the Plaintext extraction agent into this process, hooks it to the encryption/decryption functions, and captures the output generated by it. The FRIDA Server needs super-user privileges to run properly so, as already anticipated, our framework assumes that the Android device is rooted; however, this is not a real restriction, as devices used in the forensic study of apps are typically under the complete control of the experimenter;1
  – the FRIDA Client: it runs on the experimenter's machine, and interacts with both the Automation console and the FRIDA Server. In particular, it accepts user commands, forwards them to the server, waits for the server to complete the command, receives the output from the server, and forwards it to the user.

1 We made this decision in order to make the framework app independent, i.e., able to run with any app without having to customize it for that specific application. The alternative was to run the FRIDA Server without super-user privileges, but this would have required modifying the executable code of each app in order to include in it a special-purpose FRIDA library that gets loaded when the app is started.

3.2 Implementing the framework for a specific encryption library
To support a given encryption library, the only framework component that needs to be implemented for each specific encryption library is the Plaintext extraction agent. The other components are instead library-independent and, therefore, they do not need to be modified when support for a new library is developed. In this section, we illustrate the generic procedure that can be followed to carry out this implementation (we will illustrate how this generic procedure can be instantiated in practice in Sec. 4).

To implement the Plaintext extraction agent for a specific library, the first step to be carried out is the identification of the functions of that library that need to be hooked, by following the guidelines discussed below (where we use the generic term container to denote both a file and a database):
• obtaining plaintext data: the plaintext of all the data that have been saved into an encrypted container while an app is running can be obtained by intercepting the functions that "close" that container (i.e., those functions that are called when the app no longer needs to use those data), and by reading its contents before passing the control to the function that will actually close it. Sometimes, it is necessary to intercept also the functions that "open" the container if the encryption library does not provide "close" functions (e.g., see Sec. 4.3 as an example).
• obtaining the encryption key and parameters: this information is typically (though not always) exposed by the app if the functions used to "open" the container prior to its use require it. In case it is exposed, the encryption key and parameters can be obtained by intercepting the calls to these functions, and having the hook code access and print the values of the function parameters corresponding to the above key and parameters.

These functions are typically identified by examining the documentation of the encryption library and/or its source code. In case neither the source code of the library nor its documentation is available, a reverse engineering phase of the library needs to be performed.

After the functions to be hooked have been identified, the corresponding Plaintext extraction agent is developed by writing a set of hooks for them, as well as the code that is executed by the hook when the corresponding function is called. The implementations of both parts depend on whether the library has been written in Java or as native code (i.e., using the Android NDK toolset [29]). In the remainder of this section we discuss how to associate a hook with the corresponding function, while the implementation of the code executed by the hook (which depends on the specific encryption library it targets) is discussed in Sec. 4 for three distinct widely-used libraries.

Encryption library written in Java. To attach a hook to a method of a generic Java class class_to_hook, which corresponds to the generic <x>.<y>.<class_to_hook> fully qualified class name (e.g., io.realm.RealmConfiguration), the code skeleton shown in Fig. 3 can be used.

1  Java.perform(function x() {
2    const <class_to_hook> = Java.use("<x>.<y>.<class_to_hook>")
3    <class_to_hook>.$init.overload(arg1,arg2,...,argN)
4      .implementation = function (val1,val2,...,valN) {
5        <hook_code>
6        const toRet = this.$init(val1,val2,...,valN);
7        return toRet
8      };
9  });
Figure 3: Attaching a hook to a Java method.

As shown in Fig. 3, the FRIDA Java.perform(fn) function is used (line 1) to attach the hook to the chosen Java method. Within this function, we first specify the class to be hooked by means of its fully qualified class name (line 2). Then, we specify which method of that class we want to hook (line 3), which in Fig. 3 is the constructor ($init) of class class_to_hook, also specifying its signature (i.e., the list of its arguments). Next, we specify the code of the hook using the .implementation keyword (line 4). As indicated in Fig. 3, we can insert any code (generically denoted as <hook_code>) either before (as indicated in the figure) or after the call to the original function (line 6).

Encryption library written in native code. To attach a hook to a function of a native library named, e.g., LIB, the code of the Plaintext extraction agent (expressed in TypeScript), which is shown in Fig. 4, is more complex than the one used with a Java library. To hook a specific native library LIB, we must be sure that LIB has already been loaded into main memory by the app. However, in an Android app, native libraries might not be loaded in memory when the app is started, but later on demand by explicitly
invoking suitable Java methods, that is System.loadLibrary and System.load. Hence, we must hook the above two methods so that, when they are called by the app, they associate the hook with the target library LIB. Such a hooking is actually performed by function loadedHookedLibrary, which is defined in lines 6–12, and is called by both System.loadLibrary (line 19) and System.load (line 31).

1  let libraryLoaded = false
2  Java.performNow(function x() {
3    const System = Java.use('java.lang.System');
4    const Runtime = Java.use('java.lang.Runtime');
5    const VMStack = Java.use('dalvik.system.VMStack');
6    function loadedHookedLibrary(library: string) {
7      if ((library === 'LIB' || library.endsWith('LIB.so')) && !libraryLoaded) {
8        // Make sure to hook the LIB library just once
9        loadNativeHooks() // Load hooks for the LIB library
10       libraryLoaded = true
11     }
12   }
13
14   // Hook for the System.loadLibrary() method
15   System.loadLibrary.implementation = function (libName: string) {
16     try {
17       // Load the given library as expected by the app and then perform function hooking
18       const loaded = Runtime.getRuntime().loadLibrary0(VMStack.getCallingClassLoader(), libName);
19       loadedHookedLibrary(libName)
20       return loaded
21     } catch (ex) {
22       // Code for exception handling (not shown for brevity)
23     }
24   };
25
26   // Hook for the System.load() method
27   System.load.implementation = function (libFile: string) {
28     try {
29       // Load the given library as expected by the app and then perform function hooking
30       const loaded = Runtime.getRuntime().load0(VMStack.getStackClass1(), libFile);
31       loadedHookedLibrary(libFile)
32       return loaded
33     } catch (ex) {
34       // Code for exception handling (not shown for brevity)
35     }
36   };
37 })
Figure 4: Attaching a hook to the System.loadLibrary() and System.load() methods.

These actions need to be performed as soon as possible, to prevent System.load or System.loadLibrary from being invoked by the app before hooks have been installed, thus missing the calls to them. This is achieved by using the Java.performNow(fn) FRIDA method (see line 2 of Fig. 4), which is executed by FRIDA just after the Java Virtual Machine has started but before any app-specific class is loaded.

In particular, we hook the loadLibrary and load methods of both the System and the Runtime classes.2 The hooking of the System.loadLibrary method is achieved in lines 15–24, while that of the System.load() method is achieved in lines 27–36. As can be seen, in both cases when the method is called, the loadedHookedLibrary function (lines 6–12) is called (lines 19 and 31, respectively). Specifically, the hooking of the System.loadLibrary method consists in (1) loading the library libName (as expected by the invoking application) through the Java Runtime.loadLibrary0 method (line 18) and then (2) installing the hooks for the functions of interest (line 9). Similarly, the hooking of the System.load method performs the same steps but it uses the Java Runtime.load0 method to load the library stored in the file libFile.

2 In Fig. 4, because of space constraints and to avoid cluttering, we omit the hooking of the Runtime methods, which however is similar to that of the System methods.

3.3 Using the framework with an app
After a Plaintext extraction agent for a specific encryption library has been developed, using it with a specific app requires determining whether the app uses that library or not. The ways in which this can be determined differ according to whether or not the app source code is available.

If source code is available, then it can be inspected to determine which library it uses. In some cases, the source code may also include a build file that explicitly lists the libraries on which the app depends, and that of course includes also the reference to the encryption library. As an example, the excerpt in Fig. 5, extracted from the source code of the Element [11] secure messaging app (which is known to use encrypted databases), shows the contents of its build.gradle file [12].

1  apply plugin: 'com.android.library'
2  apply plugin: 'kotlin-android'
3  apply plugin: 'kotlin-kapt'
4  apply plugin: 'kotlin-parcelize'
5  apply plugin: "org.jetbrains.dokka"
6  apply plugin: 'realm-android'
7  //[...]
Figure 5: Excerpt from the Element app source code build.gradle.

As can be seen from line 6, the Element app uses the Realm library to encrypt its databases.

If the application is, instead, closed-source, a different approach needs to be used. In particular, its APK file (i.e., the package file which is installed on the device) needs to be obtained first, and then unpacked and de-compiled using a tool like Apktool [30]. Once these steps have been carried out, it is possible to inspect the resulting code to look for the names of the included library files, among which there will be also the encryption library. Typically, the code of the app includes a folder, named after the so-called reverse domain name notation of the library. For instance, in the case of the SQLCipher library, this folder is named net.zetetic.database.sqlcipher.

Once the encryption library used by the app has been determined, and the availability of the corresponding Plaintext extraction agent has been ascertained, then such agent is hooked to the app while it is running.

As mentioned before, this step is carried out by the FRIDA Client, which requires that the user specifies the names of the app (more precisely, its package name, e.g. com.whatsapp) and of the file storing the code of the Plaintext extraction agent to attach, and sends to the FRIDA Server the commands that make it hook the agent with the chosen app.

To illustrate how this is done in practice, let us discuss the excerpt of the FRIDA Client (written in Python) shown in Fig. 6. The first action which is carried out is the spawn of the app on the device (line 3), which is followed by the creation of a FRIDA session whereby [...]
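For the closed-source case described earlier in this section, the search for a folder named after the library's reverse domain name can be automated. The following is only a minimal sketch under stated assumptions, not part of the framework: the function name detect_encryption_libraries and the LIBRARY_MARKERS table are hypothetical, the io.realm prefix is taken from the class names used in this paper, the androidx.security.crypto prefix for Jetpack Security is an assumption, and the unpacked APK is assumed to store packages either as nested folders (e.g. smali/net/zetetic/database/sqlcipher) or as a single folder named in dotted notation.

```python
import os
import tempfile

# Reverse-domain package names mapped to library names (hypothetical table).
# The SQLCipher name comes from the text; the Jetpack Security one is an
# assumption added for illustration.
LIBRARY_MARKERS = {
    "net.zetetic.database.sqlcipher": "SQLCipher",
    "io.realm": "Realm",
    "androidx.security.crypto": "Jetpack Security",
}


def detect_encryption_libraries(unpacked_apk_dir):
    """Return the sorted names of the known libraries whose package folder
    is found anywhere below the unpacked (de-compiled) APK directory."""
    found = set()
    for root, dirs, _files in os.walk(unpacked_apk_dir):
        for dirname in dirs:
            rel = os.path.relpath(os.path.join(root, dirname), unpacked_apk_dir)
            rel_parts = rel.split(os.sep)
            for package, library in LIBRARY_MARKERS.items():
                pkg_parts = package.split(".")
                # Case 1: nested folders, e.g. smali/net/zetetic/database/sqlcipher
                nested = rel_parts[-len(pkg_parts):] == pkg_parts
                # Case 2: a single folder named in dotted notation
                dotted = dirname == package
                if nested or dotted:
                    found.add(library)
    return sorted(found)


if __name__ == "__main__":
    # Build a tiny fake "unpacked APK" tree to exercise the function.
    demo_root = tempfile.mkdtemp()
    os.makedirs(os.path.join(demo_root, "smali", "net", "zetetic", "database", "sqlcipher"))
    os.makedirs(os.path.join(demo_root, "smali_classes2", "io", "realm", "internal"))
    print(detect_encryption_libraries(demo_root))
```

A match only indicates that the library is bundled with the app; whether the app actually uses it to encrypt its data still has to be confirmed while running the app with the corresponding agent attached.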
[...] the right database, in case several of them are simultaneously used by the app. We achieve this goal by having the agent hook the sqlite3_open function (and its variants) to store the database handle returned by such function, and using it later when extracting the plaintext or the encryption key and parameters associated with that database. The TypeScript hook code is reported in Fig. 7, where we see that [...]

1  import frida
2  #[...]
3  pid = frida.get_usb_device().spawn(package)
4  session = frida.get_usb_device().attach(pid)
5  script = session.create_script(agent.read())
6  script.load()
7  frida.get_usb_device().resume(pid)
Figure 6: An excerpt of the FRIDA Client.
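The excerpt in Fig. 6 stops once the app has been resumed. As a hedged sketch (not the authors' client code) of how the client side might then collect what the agent reports, the handler below uses the (message, data) callback signature and the "send"/"error" message types of Frida's Python bindings. The payload layout (the kind, db, name, and value fields) and the make_message_handler name are assumptions made for illustration; the paper only refers to generic Report_key_to_FRIDA_Client and Report_params_to_FRIDA_Client actions.

```python
def make_message_handler(results):
    """Build a callback with the (message, data) signature used for Frida
    script messages; it files agent reports into the `results` dictionary,
    which must contain a "databases" dict and an "errors" list."""

    def on_message(message, data):
        # Errors raised inside the injected script arrive with type "error".
        if message.get("type") == "error":
            results["errors"].append(message.get("description", ""))
            return
        # Values sent by the agent arrive with type "send" and a payload.
        if message.get("type") != "send":
            return
        payload = message.get("payload")
        if not isinstance(payload, dict):
            return
        # Hypothetical payload fields: kind, db, name, value.
        db_name = payload.get("db", "unknown")
        entry = results["databases"].setdefault(
            db_name, {"key_hex": None, "params": {}}
        )
        if payload.get("kind") == "key":
            entry["key_hex"] = payload["value"]
        elif payload.get("kind") == "param":
            entry["params"][payload["name"]] = payload["value"]

    return on_message


if __name__ == "__main__":
    collected = {"databases": {}, "errors": []}
    handler = make_message_handler(collected)
    # Simulated messages, shaped like those delivered to a message callback.
    handler({"type": "send",
             "payload": {"kind": "key", "db": "msg.db", "value": "00ff"}}, None)
    handler({"type": "send",
             "payload": {"kind": "param", "db": "msg.db",
                         "name": "kdf_iter", "value": "64000"}}, None)
    print(collected)
```

With the names of Fig. 6, such a handler would be registered with script.on("message", make_message_handler(collected)) before script.load() (line 6), so that no report sent during start-up is missed.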
1  Interceptor.attach(sqlcipher.getExportByName("sqlite3_close"), {
2    onEnter: function (args) {
3      this.dbHandle = args[0]; // Database handle
4      const db = dbDict.get(this.dbHandle.toString())
5      dumpDbToPlainText(db)
6    }
7  });
8  const callable_sqlite3_exec = new NativeFunction(sqlcipher.getExportByName("sqlite3_exec"), 'int', ['pointer', 'pointer', 'pointer', 'pointer', 'pointer']);
9  function dumpDbToPlainText(nativeDb: NativeDb) {
10   let pathPlaintext = nativeDb.path + ".native_plaintext"
11   const errorMsgPtr = Memory.alloc(Process.pointerSize)
12   const sqlQueryPtr = Memory.allocUtf8String("ATTACH DATABASE '" + pathPlaintext + "' AS plaintext KEY '';SELECT sqlcipher_export('plaintext');DETACH DATABASE plaintext;")
13   const retExec = callable_sqlite3_exec(ptr(nativeDb.handle), sqlQueryPtr, NULL, NULL, errorMsgPtr)
14 }
Figure 8: SQLCipher: hooking of the sqlite3_close function to extract plaintext data from the database.

Name                          Meaning
key                           the encryption key
cipher_kdf_algorithm          key derivation function to be used
kdf_iter                      number of iterations used with the key derivation function
cipher_page_size              page size for the encrypted database
cipher_plaintext_header_size  size of the header of the encrypted database that must not be encrypted
cipher_use_hmac               either enables or disables the use of a per-page HMAC
cipher_hmac_algorithm         the HMAC algorithm to be used
Table 1: Parameters used by SQLCipher to control encryption.

[...] explicitly set, suitable default values will be used for them. However, to successfully decrypt a database, all the above parameters must be set to the same values used at encryption time. Hence, all of them need to be extracted by the SQLCipher Agent in order to enable the offline decryption of a database.

The procedure to extract the above parameters is different for the encryption key and the encryption parameters.

To extract the encryption key of an open database, it is sufficient to hook the sqlite3_key and sqlite3_rekey functions (and their variants), as shown in Fig. 9. As shown in the figure, the SQLCipher Agent extracts the values of the input arguments when entering the hooked function (including the encryption key; lines 3–5) and stores the extracted information in the in-memory dictionary dbDict before leaving that function, once it is sure that the key has been correctly set (lines 10–12).4 Finally, the key is reported to the FRIDA Client (action denoted by the generic function Report_key_to_FRIDA_Client (line 13)).

1  Interceptor.attach(sqlcipher.getExportByName("sqlite3_key"), {
2    onEnter: function (args) {
3      this.dbHandle = args[0]; // Database to be keyed
4      this.key_ptr = args[1]; // The key
5      this.key_size = args[2].toInt32(); // The length of the key (in bytes)
6    },
7    onLeave: function (retval) {
8      if (retval.toInt32() == 0) {
9        // The key has been successfully set in the database
10       let key_bytes = this.key_ptr.readByteArray(this.key_size);
11       const db = dbDict.get(String(this.dbHandle.toString()))
12       db.key_hex = arrayBufferToHexString(key_bytes)
13       Report_key_to_FRIDA_Client(db.key_hex)
14     }
15   },
16 });
Figure 9: SQLCipher: hooking of the sqlite3_key function to extract the encryption key.

4 The hooks for the other variants of the sqlite3_key() function, as well as those for the sqlite3_rekey() function and its variants, are very similar and they are not shown here to avoid repetitions.

Conversely, to extract the encryption parameters, we extend the hook to the sqlite3_close function (already shown in Fig. 8) to include also the queries for these values to the database engine. By doing so, we avoid the need of instrumenting all the possible SQLCipher functions that could change these values at run time. The extended code for the hook to the sqlite3_close function is shown in Fig. 10, where the unchanged parts of the code from Fig. 8 are replaced by comments in square brackets (lines 3 and 12).

1  Interceptor.attach(sqlcipher.getExportByName("sqlite3_close"), {
2    onEnter: function (args) {
3      [CODE FROM LINE 3 TO LINE 5 OF Fig. 8 GOES HERE]
4      // Note: pragmaKeys is a list of the PRAGMA names associated with the encryption parameters
5      for (const pragmaKey of pragmaKeys) {
6        db.params[pragmaKey] = getPragmaValue(db, pragmaKey) // Extract parameter pragmaKey
7        Report_params_to_FRIDA_Client(db.params[pragmaKey])
8      }
9
10   }
11 });
12 [CODE FROM LINE 8 TO LINE 14 OF Fig. 8 GOES HERE]
13 function getPragmaValue(nativeDb: NativeDb, name: string) {
14   // Runs a PRAGMA statement to get the value associated with 'name'
15   const errorMsgPtr = Memory.alloc(Process.pointerSize)
16   let val: string[] = [];
17   const callback = new NativeCallback((_arg1, count, data, columns) => {
18     for (let i = 0; i < count; i++) {
19       const arrayElementPointer = data.add(Process.pointerSize * i).readPointer()
20       const value: string | null = arrayElementPointer.readUtf8String()
21       if (value !== null)
22         val.push(value);
23     }
24     return 0;
25   }, 'int', ['pointer', 'int', 'pointer', 'pointer']);
26   const retExec = callable_sqlite3_exec(ptr(nativeDb.handle), Memory.allocUtf8String("PRAGMA " + name + ";"), callback, NULL, errorMsgPtr)
27   return val;
28 }
Figure 10: SQLCipher: extension to the hook for sqlite3_close to extract the encryption parameters.

As shown in Fig. 10, the encryption parameters are obtained by executing a series of SQL PRAGMA statements (embedded into the getPragmaValue function (lines 13–28)) to retrieve the value of the encryption parameters just before the database is closed. This function is repeatedly called by the onEnter callback (see line 5); the result of each call is stored in the in-memory dictionary db (line 6). At the end of this sequence of calls, the results are reported to the FRIDA Client (action denoted by the generic function Report_params_to_FRIDA_Client (line 7)).

4.2 Decrypting Realm databases
Realm is a database library (available for both Android and iOS) which provides support to create and manage object-oriented databases for mobile applications, and whose adoption by apps is rapidly gaining momentum. As such, a Plaintext extraction agent for this library has been developed and included in our framework.
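Before moving on with Realm, the SQLCipher material of Sec. 4.1 can be rounded off with a sketch of the offline step that the extracted key and the parameters of Table 1 make possible. The following is only an illustration under stated assumptions, not the authors' tooling: the function name build_decrypt_script is hypothetical, the reported hex string is assumed to encode exactly the bytes passed to sqlite3_key (so it is replayed through SQLCipher's hexkey PRAGMA), and the parameter values are assumed to be simple alphanumeric tokens. The export statements mirror those of Fig. 8.

```python
import re

# Parameter names from Table 1 ("key" is handled separately, and first).
PARAM_NAMES = (
    "cipher_page_size",
    "kdf_iter",
    "cipher_hmac_algorithm",
    "cipher_kdf_algorithm",
    "cipher_plaintext_header_size",
    "cipher_use_hmac",
)


def build_decrypt_script(key_hex, params, plaintext_path):
    """Return the SQL statements that replay the extracted key and
    parameters on an encrypted copy of the database and export it to a
    plaintext file. Only the statements are built; nothing is executed."""
    if not re.fullmatch(r"(?:[0-9a-fA-F]{2})+", key_hex):
        raise ValueError("key_hex must be a non-empty, even-length hex string")
    if "'" in plaintext_path:
        raise ValueError("plaintext_path must not contain single quotes")

    # The key must be supplied before any other cipher setting.
    statements = ["PRAGMA hexkey = '%s';" % key_hex]
    for name in PARAM_NAMES:
        if name not in params:
            continue  # parameter not reported: SQLCipher default applies
        value = str(params[name])
        if not re.fullmatch(r"[A-Za-z0-9_]+", value):
            raise ValueError("unexpected value for %s: %r" % (name, value))
        statements.append("PRAGMA %s = %s;" % (name, value))

    # Same export sequence as the one issued by the agent in Fig. 8.
    statements.append("ATTACH DATABASE '%s' AS plaintext KEY '';" % plaintext_path)
    statements.append("SELECT sqlcipher_export('plaintext');")
    statements.append("DETACH DATABASE plaintext;")
    return statements


if __name__ == "__main__":
    for line in build_decrypt_script(
        "00ff", {"kdf_iter": "64000", "cipher_page_size": 4096}, "/tmp/plain.db"
    ):
        print(line)
```

The resulting statements would be fed, in this order, to an SQLCipher-enabled SQLite shell opened on the encrypted database file; validating the values up front keeps data reported by an instrumented app from being pasted unchecked into SQL.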
Realm provides both Java and Kotlin libraries but, because of space constraints, in this section we describe the Java library only.
The analysis of the documentation and of the source code of Realm [32] indicates that:
• Realm uses only an encryption key (and no additional parameters, unlike SQLCipher) to encrypt and decrypt data;
• to obtain the encryption key, it is sufficient to hook the constructor of the RealmConfiguration class, which is used to open an existing database;
• to obtain the plaintext data of an encrypted database, it is sufficient to hook the close method of the Realm class.

The resulting code for these hooks is reported in Fig. 11.

 1 Java.perform(function x() {
 2   const RealmConfiguration = Java.use("io.realm.RealmConfiguration")
 3   const Realm = Java.use("io.realm.Realm")
 4   const File = Java.use("java.io.File");
 5   function dumpDbToPlainText(realmConfigInstance) {
 6     const filePlaintext = File.$new(plaintextPath)
 7     const instanceRealm =
         Realm.getInstance.overload('io.realm.RealmConfiguration').call(Realm,
         realmConfigInstance)
 8     instanceRealm.sharedRealm.value.writeCopy(filePlaintext, null);
 9   }
10   RealmConfiguration.$init.overload(
11     'java.io.File',
12     'java.lang.String',
13     '[B',
14     'long',
15     'io.realm.RealmMigration',
16     'boolean',
17     'io.realm.internal.OsRealmConfig$Durability',
18     'io.realm.internal.RealmProxyMediator',
19     'io.realm.rx.RxObservableFactory',
20     'io.realm.coroutines.FlowFactory',
21     'io.realm.Realm$Transaction',
22     'boolean',
23     'io.realm.CompactOnLaunchCallback',
24     'boolean',
25     'long',
26     'boolean',
27     'boolean')
28   .implementation = function (realmPath,
29       a1, key, a3, a4, a5, a6, a7, a8, a9,
30       a10, a11, a12, a13, a14, a15, a16) {
31     const toRet = this.$init(realmPath, a1, key,
32       a3, a4, a5, a6, a7, a8, a9,
33       a10, a11, a12, a13, a14, a15, a16);
34     Report_key_to_FRIDA_Client(key)
35     return toRet
36   };
37   Realm.close.overload().implementation = function () {
38     dumpDbToPlainText(this.sharedRealm.value.getConfiguration())
39   };
40 })

Figure 11: Realm: hooking of the RealmConfiguration constructor and of the close method.

4.2.1 Extracting plaintext contents. To extract the plaintext contents of the database, we hook the close method of the Realm class (lines 37–39 of Fig. 11) by replacing its implementation with a call to the dumpDbToPlainText function (which is defined in lines 5–9). In this function, the object named filePlaintext (line 6) represents the file in which the plaintext copy of the database is stored, the input parameter realmConfigInstance is used to open the database (line 7) and, finally, the writeCopy function is invoked on the sharedRealm field contained in the Realm instance (line 8) to actually perform the plaintext copy of the database.

4.2.2 Extracting the encryption key. The JavaScript code to hook the RealmConfiguration constructor is reported in lines 10–36 of Fig. 11. In particular, we first specify the overload of the constructor to hook (lines 10–27), and then the code executed by the hook (lines 28–36). The code of the hook first invokes the original constructor to open (and decrypt) the database (line 31), and then sends the encryption key to the FRIDA Client (by means of the generic function Report_key_to_FRIDA_Client (line 34)).

4.3 Decrypting the Android Jetpack Security library
The Jetpack Security library is part of the Android Jetpack suite of libraries and its main goal is to help developers follow security best practices related to securely reading and writing files, as well as key management through the Android Keystore system. Considering the wide adoption of Jetpack libraries in Android apps, we developed a Plaintext extraction agent for the Jetpack Security library and included it in our framework.

The Jetpack Security library provides an API in both the Java and Kotlin programming languages but, since the two APIs are quite similar in the functionality provided and in the interface, and due to space limits, in this paper we describe the Java API only.

Jetpack Security provides two classes to securely read and write data at rest, namely the EncryptedFile class (used to read and write encrypted files) and the EncryptedSharedPreferences class (used to encrypt keys and values in a preference file). Currently, our framework fully supports plaintext extraction from EncryptedSharedPreferences instances only, but we are working to also support plaintext extraction from EncryptedFile instances (which we plan to present in a future work).

From the analysis of the documentation and of the source code of Jetpack Security [4], we found that to obtain the plaintext data of an encrypted preference file, it is sufficient to hook both the create method of the EncryptedSharedPreferences class (invoked to create an instance of this class) and the commit and apply methods of the Editor nested class (invoked to commit preference changes from memory back to the preference file). The reason to hook all the above methods is to cover all possible experimental scenarios, including those where the preference file is not changed (in this case, neither the commit nor the apply method will be invoked; therefore the plaintext extraction is performed in the create method), and also those where the preference file is modified (in this case, hooking the commit and apply methods ensures that the most up-to-date plaintext is extracted).

The resulting code for these hooks is reported in Fig. 12. As we can note, each hook invokes the original (hooked) method and saves the returned value in an auxiliary variable ret, then dumps the contents of the preference file in plaintext to the console through the dumpSharedPrefs function (defined at lines 1–10), and finally returns the value returned by the hooked function that we previously stored in the variable ret. For instance, in the hook for the create method, we first invoke the original create method and save the returned value (representing an instance of an encrypted SharedPreferences) in the variable ret (line 23), then we dump the preferences data in plaintext to the console by calling the dumpSharedPrefs function (line 24), and finally we return the instance of the encrypted SharedPreferences stored in the variable ret (line 25).
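The pattern shared by the Jetpack Security hooks (invoke the original method, keep its result in ret, dump the plaintext, return ret) can be illustrated outside Frida with a plain JavaScript wrapper. The sketch below is only an illustration under stated assumptions: wrapWithDump, fakeEditor and reported are hypothetical names, and fakeEditor is a stand-in object rather than an Android class. In an actual Frida hook the original method is reached through this.commit() inside the replaced implementation, whereas here a reference to it is captured explicitly.

```javascript
// Hypothetical helper: wrap a method so that a dump step runs after it.
function wrapWithDump(original, dump) {
  return function (...args) {
    const ret = original.apply(this, args); // 1. run the original method
    dump(this, ret);                        // 2. export the plaintext view
    return ret;                             // 3. keep the caller-visible result
  };
}

// Stand-in for a preference editor (not an Android class).
const reported = [];
const fakeEditor = {
  prefs: new Map([['token', 'abc']]),
  commit() {
    this.prefs.set('committed', 'yes');
    return true;
  },
};

// Replace commit with the wrapped version, as a hook would.
fakeEditor.commit = wrapWithDump(fakeEditor.commit, (self) => {
  for (const [key, value] of self.prefs) {
    reported.push([key, value]);
  }
});

const result = fakeEditor.commit();
console.log(result, reported);
```

Because the dump step runs after the original method, the entry written by commit is already present when the preferences are reported, which mirrors the reason given above for hooking commit and apply.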
 1 function dumpSharedPrefs(sp) {
 2   var m = sp.getAll();
 3   var HashMapNode = Java.use('java.util.HashMap$Node');
 4   var iterator = m.entrySet().iterator();
 5   console.log("Shared Preferences:");
 6   while (iterator.hasNext()) {
 7     var entry = Java.cast(iterator.next(), HashMapNode);
 8     Report_params_to_FRIDA_Client(entry.getKey(),entry.getValue())
 9   }
10 }
11
12 Java.perform(function() {
13   const EncryptedSharedPrefs =
       Java.use('androidx.security.crypto.EncryptedSharedPreferences');
14   const EncryptedSharedPrefsEditor =
       Java.use('androidx.security.crypto.EncryptedSharedPreferences$Editor');
15
16   EncryptedSharedPrefs.create.overload(
17     'android.content.Context',
18     'java.lang.String',
19     'androidx.security.crypto.MasterKey',
20     'androidx.security.crypto.EncryptedSharedPreferences$PrefKeyEncryptionScheme',
21     'androidx.security.crypto.EncryptedSharedPreferences$PrefValueEncryptionScheme')
22   .implementation = function(context, fileName, masterKey,
       prefKeyEncryptionScheme, prefValueEncryptionScheme) {
23     var ret = this.create(context, fileName, masterKey,
         prefKeyEncryptionScheme, prefValueEncryptionScheme);
24     dumpSharedPrefs(ret);
25     return ret;
26   }
27
28   EncryptedSharedPrefsEditor.apply.implementation = function() {
29     var ret = this.apply();
30     dumpSharedPrefs(this.mEncryptedSharedPreferences.value);
31     return ret;
32   }
33
34   EncryptedSharedPrefsEditor.commit.implementation = function() {
35     var ret = this.commit();
36     dumpSharedPrefs(this.mEncryptedSharedPreferences.value);
37     return ret;
38   }
39 });

Figure 12: Jetpack Security: hooking of the EncryptedSharedPreferences class and of its Editor nested class.

5 Experimental results
In order to validate the decryption framework we developed, we performed experiments in which the decryption parameters are extracted from real-world applications.⁵ In these experiments, we consider a set of applications that use either SQLCipher or Realm to encrypt their databases, and we install and initialize these apps on an Android device (to ease the experimentation we use virtual Android devices with the Android Emulator [10], but a rooted real Android device could have been used instead without changing anything in our experiments). For the experiments, we considered apps whose decryption procedure has already been published in the literature, so as to validate our framework against published results, and in particular Wickr Me (version 5.84.6), Signal (version 5.19.4), and Threema (version 4.8), whose decryption procedure has been published in [19]. Note that while for Signal and Wickr Me we use the same versions considered in [19], as these versions can still be downloaded from the APKMirror site [36], for Threema we use the version currently available on the Google Play Store as, to the best of our knowledge, no previous versions of this app are available on third-party sites like APKMirror.

Since all these apps use SQLCipher, we also consider Element, which instead uses Realm, although there are no published results for it against which our results can be compared.

Concerning the apps using SQLCipher, namely Wickr Me, Signal and Threema, with our framework we are able to extract the same encryption parameters presented in [19] (except for the encryption key, which is obviously different and thus not considered in the validation). This can be verified by comparing the values of the parameters extracted with our framework and reported in Table 2 (where, for each encryption parameter, we report its values extracted from the encrypted database generated by the above apps) with those presented in [19]. For completeness, we also report the encryption keys (truncated to avoid cluttering) extracted by our framework, which however have been placed outside the main table, as these values are, of course, different from those reported in [19], since they have been created on a device different from those used in the above paper.

We also performed some experiments to validate the results obtained with the Realm Plaintext extraction agent. Given that, to the best of our knowledge, there are no published results concerning the decryption of Realm databases, we performed experiments using Element [11], a secure messaging app which uses the Realm library to manage encrypted databases. In particular, we tested our agent with Element version 1.4.3. In our experiments we installed Element on a virtual device and we populated its databases with data by using the app to exchange messages. Then we instrumented it with the Realm Plaintext extraction agent and, by using our framework, we found out that it uses several encrypted databases, which in all cases but two use different encryption keys. The list of the extracted encryption keys is reported in Table 3. In order to validate our agent, we use Realm Studio [33], a developer tool to easily manage Realm databases. In particular, we pass to Realm Studio the encrypted databases and the encryption keys extracted by our agent and we successfully obtain the plaintext database contents.

⁵ Note that we do not report here the plaintext of the data encrypted by these apps, since we had no way to compare them to existing published works. We stress however that, in this kind of validation experiments, our framework reported the correct plaintexts of the above data.

6 Conclusions
In this paper, we have presented a framework for the decryption of data encrypted by apps, which is based on the dynamic instrumentation of the app's binary code by means of hooking. Our framework has been conceived to be used only with test devices used for forensic study purposes, and not with devices that need to be forensically analyzed.

By executing suitable hooks when the app under study is executed on a test device, our framework can export the plaintext of data after they have been decrypted by the app, as well as the corresponding encryption key and parameters (when possible), thus enabling the experimenter to access and analyze them.

Hooking works at the function level, meaning that once the hook for a given decryption function has been developed for the first time, it can be used with any app using the same function. Therefore, by writing hooks for popular encryption libraries, it is possible to support the decryption of data for all apps that use these libraries. Our framework currently supports two prominent and popular encryption libraries, namely SQLCipher [37] and Jetpack Security [3], and the Realm [34] DBMS. We have validated it by
Table 2: Main encryption parameters extracted with our framework from the encrypted databases generated by Wickr Me, Signal and Threema. Their values (except for the encryption keys) are the same as those reported in [19].
Database file                        Encryption key (truncated)
matrix-sdk-auth.realm                0x7878C1FE4364FB8867BE645751917826CDE26C7A1CBDD...
disk_store.realm                     0xF4C024F59FA71E203B8EC24DE83CF80A05CAA15840162...
crypto_store.realm                   0xA35E9873554F56286527FD9926E957829DBC8F05BE1EA...
matrix-sdk-identity.realm            0xF4C024F59FA71E203B8EC24DE83CF80A05CAA15840162...
matrix-sdk-content-scanning.realm    0xF4C024F59FA71E203B8EC24DE83CF80A05CAA15840162...
matrix-sdk-global.realm              0xDDC9411EEBC27CDA10AA8398134D899A5C29D1CC538B8...

Table 3: List of encryption keys extracted by the Realm Plaintext extraction agent. Keys are truncated due to space constraints.
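When several databases are reported by the agent, it can be useful to see at a glance which of them appear to share a key. The sketch below groups the database names of Table 3 by the reported key string. It is only an illustration: groupByKey and extracted are hypothetical names, and the input consists of the truncated prefixes printed in Table 3, so equal prefixes merely suggest, and do not prove, that the full keys are equal.

```javascript
// Truncated key prefixes copied from Table 3 (the full keys are not printed there).
const extracted = [
  ['matrix-sdk-auth.realm', '0x7878C1FE4364FB8867BE645751917826CDE26C7A1CBDD'],
  ['disk_store.realm', '0xF4C024F59FA71E203B8EC24DE83CF80A05CAA15840162'],
  ['crypto_store.realm', '0xA35E9873554F56286527FD9926E957829DBC8F05BE1EA'],
  ['matrix-sdk-identity.realm', '0xF4C024F59FA71E203B8EC24DE83CF80A05CAA15840162'],
  ['matrix-sdk-content-scanning.realm', '0xF4C024F59FA71E203B8EC24DE83CF80A05CAA15840162'],
  ['matrix-sdk-global.realm', '0xDDC9411EEBC27CDA10AA8398134D899A5C29D1CC538B8'],
];

// Hypothetical helper: map each key string to the databases reported with it.
function groupByKey(rows) {
  const groups = new Map();
  for (const [dbName, key] of rows) {
    if (!groups.has(key)) {
      groups.set(key, []);
    }
    groups.get(key).push(dbName);
  }
  return groups;
}

const groups = groupByKey(extracted);
for (const [key, dbNames] of groups) {
  // Print a short key prefix followed by the databases associated with it.
  console.log(key.slice(0, 10) + '...', dbNames.join(', '));
}
```

With the full keys reported by the agent in place of the truncated prefixes, the same grouping would give a definitive answer.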
comparing our results with those reported in the literature for several real-world apps that use encryption.

As future work, we plan to expand the set of encryption libraries supported by the framework, as well as to support iOS.

References
[1] Android. 2022. The Room persistence library. https://developer.android.com/jetpack/androidx/releases/room Accessed on Oct 7, 2022.
[2] Android. 2023. The javax.crypto encryption class. https://developer.android.com/reference/javax/crypto/package-summary Accessed on Feb 2, 2023.
[3] Android. 2023. The Android Jetpack Security Library. https://developer.android.com/jetpack/androidx/releases/security Accessed on Feb 2, 2023.
[4] Android. 2023. The androidx.security.crypto API reference. https://developer.android.com/reference/androidx/security/crypto/package-summary Accessed on Feb 2, 2023.
[5] C. Anglano. 2014. Forensic Analysis of WhatsApp Messenger on Android Smartphones. Digital Investigation 11, 3 (Sept. 2014), 201–213.
[6] Cosimo Anglano, Massimo Canonico, and Marco Guazzone. 2016. Forensic Analysis of the ChatSecure Instant Messaging Application on Android Smartphones. Digital Investigation 19 (Dec. 2016), 44–59.
[7] Cosimo Anglano, Massimo Canonico, and Marco Guazzone. 2017. Forensic analysis of Telegram Messenger on Android smartphones. Digital Investigation 23 (2017), 31–49.
[8] Cosimo Anglano, Massimo Canonico, and Marco Guazzone. 2020. The Android Forensics Automator (AnForA): A tool for the Automated Forensic Analysis of Android Applications. Computers & Security 88 (2020).
[9] SQLite Consortium. 2023. SQLite Home Page. https://www.sqlite.org Accessed on Mar 13, 2023.
[10] Google Developers. 2022. Run apps on the Android Emulator. https://developer.android.com/studio/run/emulator Accessed on Oct 8, 2022.
[11] Element. 2022. Element | Secure Collaboration and Messaging. https://element.io/ Accessed on Oct 7, 2022.
[12] Element. 2022. Element Android build.gradle. https://github.com/vector-im/element-android/blob/develop/matrix-sdk-android/build.gradle Accessed on Oct 7, 2022.
[13] Andrew Case et al. 2019. HookTracer: A System for Automated and Accessible API Hooks Analysis. Digital Investigation 29 (2019), S104–S112.
[14] D. Votipka et al. 2020. Understanding security mistakes developers make: Qualitative analysis from Build It, Break It, Fix It. In 29th USENIX Security Symposium.
[15] Giyoon Kim et al. 2020. A study on the decryption methods of telegram X and BBM-Enterprise databases in mobile and PC. Forensic Science International: Digital Investigation 35 (2020).
[16] Giyoon Kim et al. 2021. Forensic analysis of instant messaging apps: Decrypting Wickr and private text messaging data. Forensic Science International: Digital Investigation 37 (2021), 301138.
[17] H. Zhang et al. 2018. Digital Forensic Analysis of Instant Messaging Applications on Android Smartphones. In 2018 International Conference on Computing, Networking and Communications (ICNC). 647–651.
[18] Jusop Choi et al. 2019. Digital forensic analysis of encrypted database files in instant messaging applications on Windows operating systems: Case study with KakaoTalk, NateOn and QQ messenger. Digital Investigation 28 (2019).
[19] Jihun Son et al. 2022. Forensic analysis of instant messengers: Decrypt Signal, Wickr, and Threema. Forensic Science International: Digital Investigation 40 (2022), 301347.
[20] L. Zhang et al. 2016. The Forensic Analysis of WeChat Message. In 2016 Sixth International Conference on Instrumentation Measurement, Computer, Communication and Control (IMCCC). 500–503.
[21] Myungseo Park et al. 2019. Decrypting password-based encrypted backup data for Huawei smartphones. Digital Investigation 28 (2019).
[22] Myungseo Park et al. 2020. A methodology for the decryption of encrypted smartphone backup data on android platform: A case study on the latest samsung smartphone backup system. Forensic Science International: Digital Investigation 35 (2020).
[23] Soojin Kang et al. 2021. Methods for decrypting the data encrypted by the latest Samsung smartphone backup programs in Windows and macOS. Forensic Science International: Digital Investigation 39 (2021).
[24] Sumin Shin et al. 2022. Forensic analysis of note and journal applications. Forensic Science International: Digital Investigation 40 (2022).
[25] Tobias Groß et al. 2019. Analyzing Android’s File-Based Encryption: Information Leakage through Unencrypted Metadata. In Proceedings of the 14th International Conference on Availability, Reliability and Security (Canterbury, CA, United Kingdom) (ARES ’19). Association for Computing Machinery, New York, NY, USA.
[26] Yu-an Tan et al. 2020. An Android Inline Hooking Framework for the Securing Transmitted Data. Sensors 20, 15 (2020).
[27] Yu Zhang et al. 2020. Android Encryption Database Forensic Analysis Based on Static Analysis. In Proceedings of the 4th International Conference on Computer Science and Application Engineering (Sanya, China) (CSAE 2020). Association for Computing Machinery, New York, NY, USA.
[28] Frida. 2022. Frida: a world-class dynamic instrumentation framework. https://frida.re/ Accessed on Oct. 5th, 2022.
[29] Google Developers. 2022. Android NDK. https://developer.android.com/ndk Accessed on Oct 6, 2022.
[30] iBotPeaches. 2022. Apktool - A tool for reverse engineering Android apk files. https://ibotpeaches.github.io/Apktool/ Accessed on Oct 7, 2022.
[31] T. Mehrotra and B. M. Mehtre. 2013. Forensic analysis of Wickr application on android devices. In 2013 IEEE International Conference on Computational Intelligence and Computing Research. 1–6.
[32] MongoDB. 2022. Encrypt a Realm - Java SDK. https://www.mongodb.com/docs/realm/sdk/java/advanced-guides/encryption/ Accessed on Oct 10, 2022.
[33] MongoDB. 2022. Realm Studio. https://www.mongodb.com/docs/realm-legacy/products/realm-studio.html Accessed on Oct 13, 2022.
[34] MongoDB. 2022. Realm.io. https://www.realm.io Accessed on Oct 10, 2022.
[35] P. Cobley and G. Geneste. 2022. Realm. In Mobile Forensics - The File Format Handbook, C. Hummert and D. Pawlaszczyk (Ed.). Springer, Chapter 8.
[36] Wickr. 2022. APKMirror - Free APK Downloads - Free and safe Android APK downloads. https://www.apkmirror.com/ Accessed on Oct 7, 2022.
[37] Zetetic. 2022. SQLCipher. https://www.zetetic.net/sqlcipher/ Accessed on Oct 9, 2022.
[38] Zetetic. 2022. SQLCipher for Android. https://github.com/sqlcipher/android-database-sqlcipher Accessed on Oct 7, 2022.