Hidden APIs MSA
arXiv:2306.08134v1 [cs.CR] 13 Jun 2023

ABSTRACT
Mobile applications, particularly those from social media platforms such as WeChat and TikTok, are evolving into “super apps” that offer a wide range of services such as instant messaging and media sharing, e-commerce, e-learning, and e-government. These super apps often provide APIs for developers to create “miniapps” that run within the super app. These APIs should have been thoroughly scrutinized for security. Unfortunately, we find that many of them are undocumented and unsecured, potentially allowing miniapps to bypass restrictions and gain higher privileged access. To systematically identify these hidden APIs before they are exploited by attackers, we developed a tool, APIScope, with both static analysis and dynamic analysis, where static analysis is used to recognize hidden undocumented APIs, and dynamic analysis is used to confirm whether the identified APIs can be invoked by unprivileged 3rd-party miniapps. We have applied APIScope to five popular super apps (i.e., WeChat, WeCom, Baidu, QQ, and TikTok) and found that all of them contain hidden APIs, many of which can be exploited due to missing security checks. We have also quantified the hidden APIs that may have security implications by verifying if they have access to resources protected by Android permissions. Furthermore, we demonstrate the potential security hazards by presenting various attack scenarios, including unauthorized access to any web pages, downloading and installing malicious software, and stealing sensitive information. We have reported our findings to the relevant vendors, some of whom have patched the vulnerabilities and rewarded us with bug bounties.

1 INTRODUCTION
Over the past few years, we have witnessed a rapid growth of the miniapp paradigm [33], in which a mobile super app (e.g., WeChat [6] and TikTok [5]) provides a seamless runtime environment for a miniapp, a small web-app-like application, for enhanced user experience (e.g., install-less) and stickiness with the super app (e.g., a user can access almost all the daily services without leaving it). Today, more than 4.3 million miniapps [7] have been developed in WeChat (a super app with 1.2 billion monthly active users [1]), surpassing the total number of Android apps in Google Play (which has about 2.7 million as of November 2022 [2]). These miniapps offer a variety of daily services from transportation (e.g., ride hailing), e-commerce (e.g., online shopping), e-learning, e-government (e.g., pandemic control and contact tracing), mobile gaming, to entertainment (e.g., short-form user videos), and so on. They are developed by both the 1st-party (i.e., the one who makes the super app platform), as well as 3rd-party (i.e., developers who create additional software based on the platform provided by the 1st-party).

Since both the 1st-party and the 3rd-party miniapps are built on top of the APIs provided by the super app platform, one would expect them to use the same set of APIs. However, by performing a manual analysis, we discovered discrepancies in the APIs used by these miniapps. For instance, privileged APIs like openUrl are present in 1st-party miniapps like Tencent Doc [4], which has more than 200 million online consumers. openUrl can open arbitrary URLs, but the 3rd-party miniapps cannot use openUrl and must use the wx.request API to ensure that the URLs are checked by WeChat to prevent the loading of malicious content. Moreover, not all APIs are equally mentioned in the official documentation. The Chinese version of the development documentation comprises 975 APIs [8], while the English version has only 570 APIs [9]. Additionally, none of the privileged APIs, such as openUrl, are ever referenced in the official documentation, regardless of the language. Thus, there may be undocumented APIs in the super app platforms (at least in WeChat). Such undocumented APIs may pose security risks. For example, they may have a higher level of privilege, as they are designed exclusively for use by 1st-party apps. In order to ensure security, super apps should implement proper access controls for these privileged APIs, such as allowing access solely through an approved list for 1st-party miniapps. Otherwise, they may be a weak spot for unauthorized access by 3rd-party miniapps.

Although our manual analysis of the host app and its 1st-party miniapp implementation has yielded surprising findings, it is neither scalable nor complete. Meanwhile, given that so many super apps are available today, it would be extremely helpful to have a tool that identifies all of the hidden APIs, if that is possible, from their implementations. Also, since privileged APIs without any checks can be easily exploited by malicious miniapps, we must inform the super app vendors to patch the missing or misplaced checks. Motivated by these pressing needs, in this paper, we present APIScope, a binary analysis tool that combines static and dynamic analysis to systematically scrutinize hidden (i.e., undocumented) APIs from super app implementations.

Multiple challenges must be addressed while developing APIScope. First, several programming languages have been used to implement a super app at various layers (e.g., JavaScript at the miniapp layer, C/C++ at the JavaScript runtime layer, and Java at the service abstraction layer provided by the host app), and consequently it is challenging to recognize how APIs across these different languages and interfaces are invoked. Second, after identifying an undocumented API, it is also challenging to classify whether it is an API that can be invoked by third-party miniapps. Fortunately, we have addressed these challenges and successfully implemented APIScope. There are two key components inside APIScope: Static API Recognition and Dynamic API Classification. At a high level, it takes a super app binary as well as its list of public APIs as input, and identifies the hidden APIs based on the invariants of the functions and interfaces from the public APIs in the super apps using Static API Recognition. Next, it dynamically executes the identified
This is a preprint of our CCS 2023 paper. Chao Wang, Yue Zhang, and Zhiqiang Lin
APIs to confirm whether they are true APIs, and further classifies them into checked and unchecked ones based on whether they can only be invoked by the 1st-party miniapps using Dynamic API Classification.

[...] all the tested super apps contained hidden APIs. Interestingly, our study found hidden APIs in different categories, with some super apps having more hidden APIs than documented ones. For example, the API category of Payment of WeChat contains 28 hidden APIs, which is significantly more than its documented ones (i.e., only one). We also measure the usage of hidden APIs in both 1st-party miniapps and 3rd-party miniapps. We found that the use of undocumented APIs is common among both 1st-party miniapps and 3rd-party miniapps regardless of their category.

Not every hidden API poses a security risk when misused. Therefore, our objective was to dive into the security implications of hidden APIs. Specifically, we focused on the hidden APIs that lack security checks but can access sensitive Android OS resources. To achieve this, we proposed the use of dynamic analysis techniques. Our dynamic analysis approach involves identifying APIs that call native APIs, which can access sensitive resources. We achieved this by hooking APIs that access sensitive resources and monitoring their use by unchecked and undocumented APIs. After conducting our investigation, we found that WeChat has 39 hidden unchecked APIs (7.77%) that invoke Android APIs protected by permissions. Similarly, WeCom has 40 (6.75%), Baidu has 8 (7.61%), TikTok has 32 (26.23%), and QQ has 38 (12.88%) such APIs, which can have security risks.

To further validate our findings, we conducted several attack case studies by developing a number of malicious miniapps using these hidden APIs. Specifically, in WeChat, we developed a malicious miniapp to exploit the hidden private_openUrl API to access arbitrary malicious content without detection by the super apps. Additionally, by using the installDownloadTask hidden API, we developed a miniapp that can download and install harmful Android apps surreptitiously. Malicious apps have the capability to pilfer a user’s sensitive information. Our demonstration reveals the utilization of hidden APIs such as captureScreen, which enables malicious miniapps to steal screenshots, getLocalPhoneNumber, which permits theft of the user’s phone number, and searchContacts, which facilitates the theft of the user’s contact information.

Contributions. We make the following contributions:
• We are the first to discover that super apps may provide hidden, i.e., undocumented, APIs (for the 1st-party miniapps), and those hidden APIs that do not have permission checks can be exploited by the 3rd-party miniapps for privileged accesses.
• We propose APIScope to systematically identify and classify the hidden APIs in super apps, with two novel techniques to statically recognize the APIs and dynamically execute and classify them.
• We implement APIScope, and evaluate it with 5 super apps and find all of them containing hidden APIs, some of which can be exploited by malicious 3rd-party miniapps. We have made the responsible disclosure to their vendors, and received bug bounties from some of them.

Figure 1: Architecture of Super App Runtime in Android. Layers shown, from top to bottom: Application Layer (miniapps; JavaScript implementation), Framework Layer (JavaScript APIs; JavaScript implementation), Service Abstraction Layer (user data, Bluetooth, and network services, ...; Java implementation), Host App Layer (host app; Java & C/C++ implementation), OS Layer (Android OS; Java & C/C++ implementation).

2 BACKGROUND
Miniapps are programs that run on top of host apps instead of directly on the operating system. Host apps have to function like an operating system and provide resources (e.g., location, phone numbers, addresses, and social network information) to miniapps through APIs. Mobile super apps are organized in a layered architecture, with each layer focusing on different aspects like portability, security, and convenience, but working together to support miniapp execution within host apps, as shown in Figure 1:
• Mini-Application Layer, which is the top layer of a super-app runtime. All miniapps, including 1st-party and 3rd-party miniapps, are located in this layer. To prevent one miniapp from accessing resources of other miniapps, the host app creates an isolated process for each miniapp. If privileged access is given to 1st-party miniapps, it must be controlled and checked to prevent 3rd-party miniapps from using it. Typically, miniapps are implemented using JavaScript [33].
• JavaScript Framework Layer, which provides APIs for resource accesses and management that are consumed by miniapps in the Application Layer. These APIs allow miniapps to access resources (such as location-based services) and manage UI elements (such as opening a new UI window). The JavaScript Framework Layer is also implemented using JavaScript.
• Customized V8 Layer, which provides support for native C/C++ libraries such as WebGL to power the execution of miniapps. It also acts as a bridge between the JavaScript Framework layer and lower layers. When miniapps call APIs such as wx.getLocation, the Framework layer sends the API name and parameters to the Customized V8 layer, which then passes the request to the underlying layers. This layer is usually implemented using C/C++.
• Service Abstraction Layer, which provides an interface to access services from either the super apps (e.g., user account information) or the underlying OS (e.g., Bluetooth, location-based services). In the case of the wx.getLocation API, this layer communicates with the host app using IPC to invoke the Java API getSystemService(LOCATION_SERVICE) to retrieve the current location. This layer is implemented using a combination of Java and C/C++ code for the Android platform.
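The layered hand-off described above can be pictured with a small simulation. This is a sketch under stated assumptions, not code from any super app: `serviceTable`, `bridgeInvoke`, and `frameworkWx` are hypothetical names, the returned coordinates are placeholders, and the only behavior taken from the text is that the framework layer forwards an API name plus parameters and a lower layer resolves that name to a service.

```javascript
// Hypothetical Service Abstraction Layer: maps API names to service handlers.
// A real handler would reach the host app over IPC; this one returns fixed data.
const serviceTable = new Map([
  ["getLocation", (params) => ({ type: params.type, latitude: 0, longitude: 0 })],
]);

// Hypothetical Customized V8 Layer bridge: it sees only a name and parameters,
// so routing depends entirely on the string the caller supplies.
function bridgeInvoke(apiName, params) {
  const handler = serviceTable.get(apiName);
  if (handler === undefined) {
    return { ok: false, reason: "unknown API: " + apiName };
  }
  return { ok: true, data: handler(params) };
}

// Hypothetical JavaScript Framework Layer: the documented wrapper a miniapp calls.
const frameworkWx = {
  getLocation: (params) => bridgeInvoke("getLocation", params),
};

const located = frameworkWx.getLocation({ type: "wgs84" });
console.log(located.ok, located.data.type);
```

Because the bridge in this sketch routes on a plain string, any access control has to be enforced explicitly by one of the layers; nothing in the routing itself distinguishes documented from undocumented names.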
10 a(apiName, params, callbackId) {
11   callbackId = NativeGlobal.invokeHandler(apiName, params,
12     callbackId);
13   invokeCallbackHandler(callbackId, callbackHandler)
14 }(apiName, filteredParams, callbackId)
15 }
16 return this;
17 }(global);
NativeGlobal.invokeHandler("getLocation", 'wgs84',callbackId)
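The fragment above passes a callback ID alongside the API name and parameters. A minimal sketch of how such an ID can tie an asynchronous result back to its caller follows. `NativeGlobalStub`, `sentRequests`, `invokeApi`, and `deliverResult` are hypothetical stand-ins written for illustration, and the real runtime may manage callbacks differently.

```javascript
const sentRequests = [];          // requests handed to the (simulated) lower layers
const callbackTable = new Map();  // callbackId -> handler awaiting a result
let nextCallbackId = 1;

// Stand-in for the native interface: records the request and returns the ID.
const NativeGlobalStub = {
  invokeHandler(apiName, params, callbackId) {
    sentRequests.push({ apiName, params, callbackId });
    return callbackId;
  },
};

// Mirrors the shape of the fragment: invoke first, then register the handler.
function invokeApi(apiName, params, handler) {
  const callbackId = NativeGlobalStub.invokeHandler(apiName, params, nextCallbackId++);
  callbackTable.set(callbackId, handler);
  return callbackId;
}

// Simulated completion coming back from the lower layers; each ID is used once.
function deliverResult(callbackId, result) {
  const handler = callbackTable.get(callbackId);
  if (handler === undefined) return false;
  callbackTable.delete(callbackId);
  handler(result);
  return true;
}

const received = [];
const locationId = invokeApi("getLocation", "wgs84", (res) => received.push(res));
deliverResult(locationId, { latitude: 0, longitude: 0 });
console.log(received.length, sentRequests[0].apiName);
```

Keeping the handler on the JavaScript side and sending only the numeric ID downward means the lower layers never need to hold a JavaScript function, which is one plausible reason for the three-argument shape.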
1  // Implementation of Documented API getLocation
2  package com.tencent.mm.plugin.appbrand.jsapi.m;
3  public class x extends a {
4    public static final int CTRL_INDEX = 17;
5    public static final String NAME = "getLocation";
6
7    @Override
8    public final void b(IAppBrandComponent env, JSONObject data, int cId) {
9      // some other logic
10     env.doCallback(cId, env.Map2JSON(result));
11   }
12 }
13
14 // Implementation of Undocumented API openUrl
15 package com.tencent.mm.plugin.appbrand.jsapi.n;
16 public class y extends a {
17   public static final int CTRL_INDEX = 201;
18   public static final String NAME = "openUrl";
19
20   @Override
21   public final void b(IAppBrandComponent env, JSONObject data, int cId) {
22     // some other logic
23     env.doCallback(cId, env.Map2JSON(result));
24   }
25 }
26
27 // Implementation of Undocumented API private_openUrl
28 package com.tencent.mm.plugin.appbrand.jsapi.n;
29 public class z extends a {
30   public static final int CTRL_INDEX = 406;
31   public static final String NAME = "private_openUrl";
32
33   @Override
34   public final void b(IAppBrandComponent env, JSONObject data, int cId) {
35     // some other logic
36     env.doCallback(cId, env.Map2JSON(result));
37   }
38 }

Figure 2: API implementations of WeChat.

// Documented API Implementation of Baidu
package com.baidu.swan.apps.scheme.actions.f;
public class a extends aa {
  public a (e context) {
    super(context, "/swanAPI/getLocation");
  }

  @Override
  public boolean a (Context c, Scheme s, CallbackHandler cb, SwanApp a){
    // some other logic
  }
}

// Undocumented API Implementation of Baidu
package com.baidu.swan.apps.impl.account.a;
public class f extends aa {
  public f (e context) {
    super(context, "/swanAPI/getBDUSS");
  }

  @Override
  public boolean a (Context c, Scheme s, CallbackHandler cb, SwanApp a){
    // some other logic
  }
}

3 MOTIVATION AND PROBLEM STATEMENT
This section describes the motivation of this work by providing some key observations in §3.1, then defines the problem and the scope in §3.2 and the threat model in §3.3.

3.1 Key Observations
As alluded to earlier, when manually inspecting the implementation of some of the 1st-party miniapps offered by WeChat, we found that other than the public APIs that all the miniapps can access without restrictions, the 1st-party miniapp Tencent Doc actually uses some undocumented APIs (e.g., openUrl for opening arbitrary URLs). Moreover, the designers of WeChat do not make these APIs public (their documentation does not even mention openUrl), and have placed security checks to prevent openUrl from being accessed by arbitrary miniapps. For example, whenever a 3rd-party miniapp attempts to invoke openUrl, WeChat will throw an insufficient permission exception (i.e., “fail: no permission”) and terminate its execution. The use of openUrl in the 1st-party Tencent Doc miniapp prompted us to investigate the possibility of other hidden APIs offered by WeChat without proper security checks. This inspired us to explore the feasibility of identifying and exploiting these APIs, but we faced two challenges: (i) identifying the hidden APIs and (ii) properly invoking them to test for potential vulnerabilities. Through further exploration, we made two key observations to address these challenges.

Observation-I: Undocumented API Recognition. By manually inspecting the implementation of WeChat, we found that multiple suspicious undocumented functions are co-located with their documented APIs. That is, those functions and the public APIs are located in the same super app packages, and their implementations look similar to those of the documented APIs (e.g., they have similar function signatures, parameter types, and return value types). We start by inferring whether those functions are indeed undocumented APIs, since intuitively both the public APIs and the undocumented APIs are APIs, and the developers would have followed the same practice to implement them. Unsurprisingly, we found the implementation of openUrl, which confirms our observation. In Figure 2, we show 3 API implementations of WeChat. Although the code is highly obfuscated (where the names of the classes and methods are replaced with meaningless letters, such as “a”, “b”), we can still observe some invariants: WeChat’s public API getLocation (lines 1–13) and its undocumented API openUrl (lines 14–25) both have the same parameter types and return types, as well as the same superclass (i.e., class a). As such, we can use these invariants (e.g., the superclass of the API, the parameters of the API) collected from the public APIs to search for possible undocumented APIs. For instance, as shown in Figure 2, we identified another function private_openUrl (lines 28–38) that has the same function signature, which is very likely an undocumented API.

Observation-II: Undocumented API Invocation. Although there may be undocumented APIs (e.g., private_openUrl) provided by WeChat, we have to find a way to invoke them (if they are indeed APIs). Interestingly, when we directly invoke undocumented APIs such as private_openUrl in a miniapp, we obtain an error, “fail: not supported”, which is different from the error we observed when invoking openUrl, “fail: no permission”. From the different error messages, we infer that the accessibility of the API private_openUrl is not the same as that of openUrl, and there may be a way to invoke it. We therefore further inspected the normal invocation of the documented APIs, seeking insights from the process.

To be more precise, as described in §2, the JavaScript Framework Layer acquires the invocation request during a regular API call and transfers it to the lower layers via the interfaces exposed by the Customized V8 Layer. In Figure 3, we provide a code snippet illustrating the API invocation chain of WeChat, where the invocation request for the getLocation API (line 3 in the top-left frame) is eventually passed to the NativeGlobal.invokeHandler function (line 11 in the bottom-left frame), which in turn conveys the API invocation request to the underlying layers. Notably, the NativeGlobal.invokeHandler function receives three inputs: the API name (e.g., getLocation), the API parameters, and a callback function ID (which enables the API to manage the asynchronous call).

Given that NativeGlobal.invokeHandler can deliver the normal invocation request to the underlying layers, we infer that it is also capable of delivering undocumented API invocation requests. Therefore, we feed the API name private_openUrl and its parameter (which is a URL) to the interface and let it pass the API name and the URL to the underlying layers. Interestingly, we find that the underlying layers handle the passed API name and the parameter as normal API invocations and further pass the invocation requests to the host apps. As shown in Figure 4, while WeChat restricts miniapps from accessing the undocumented APIs, unfortunately we find that not all undocumented APIs are protected through security checks. In particular, WeChat has enforced the security check for the undocumented API openUrl, but it does not add
the security checks for the undocumented API private_openUrl, which has the exact same functionalities as openUrl. Also, the API name and parameters are not obfuscated since they have to be passed to lower layers.

Figure 4: The Workflow of API invocations from the malicious miniapp through the JavaScript Framework Layer, Customized V8 Layer, and Service Abstraction Layer to the host app. Public API invocation getLocation (green line); Checked Undocumented API openUrl (red line); Unchecked Undocumented API private_openUrl (purple line).

3.2 Problem Statement and Scope
Since our manual investigation has revealed that there are indeed hidden APIs in the super app platform and some of them can be [...]

[...] the convenience and also our expertise, we focus on the super apps running on the Android platform, though in theory our approach should also work for the iOS platform.

3.3 Threat Model
As previously discussed, our objective is to develop techniques for detecting hidden APIs that lack security checks before a malicious app exploits them. In this context, the attacker is a piece of malware that has been installed on the user’s mobile device. We will not delve into the details of how this malware can be installed, as we believe it is practical to assume that super apps are not aware of such types of malware until we report our findings to them. It is worth noting that previous research on super apps has also made similar assumptions [25]. Undocumented APIs refer to functions or APIs that are not included in the official documentation, regardless of whether it is in English or Chinese. An attacker could acquire knowledge about the existence of these hidden APIs by reverse engineering the super app client or by reading technical blogs on the internet. Specifically, undocumented APIs may have access to sensitive resources that are safeguarded by Android OS. If an attacker exploits these APIs, they can launch attacks against the victim users.

4 CHALLENGES AND INSIGHTS

Figure 5 (diagram labels): inputs Super Apps and Public APIs; Decompiler producing Decompiled Code; static stages (I) Automatic Invariants Extraction and (II) Undocumented API Recognition, producing Undocumented APIs; dynamic stages (I) Test Case Generation, (II) Forward Slicing for API Invocation Identification, and (III) Dynamic API Probing for API Category Classification, run on a JavaScript runtime; outputs Undocumented Unchecked API and Undocumented Checked API.
[...] recognize APIs. For example, when implementing the callbacks of the APIs, WeChat uses android.webkit.ValueCallback at the Service Abstraction layer to handle all the callback results. From the callbacks, we can locate the corresponding APIs and extract patterns to pinpoint the remaining APIs. However, there are multiple super apps, each of which could have different implementations. For example, unlike the implementation of WeChat, TikTok uses com.he.jsbinding.JsContext.ScopeCallback at the Service Abstraction layer to handle the callback results of their APIs, and the pattern for WeChat will fail when dealing with TikTok. Moreover, such a pattern-matching approach requires recognizing callbacks first, which may be challenging due to the code obfuscation. As discussed in §3.1, the miniapp is executed on top of the super apps (e.g., Android apps), which are often heavily obfuscated. It is hard to recognize callbacks statically unless we fully understand the obfuscated code, and as such, we need a more obfuscation-resilient approach instead of simple pattern matching.

Insights. We notice that there exist some invariants such as the method signatures of public APIs and their superclasses in the API implementations, as illustrated in §3.1 based on super app WeChat (e.g., every API has the same superclass a, though this name is obfuscated; every public API must contain the name of the API for the references by the miniapps, and this cannot be obfuscated but can be easily recognized). As such, we can first extract these API invariants based on these public API implementations, and use them to recognize the rest of the APIs. This process can be automated since it is easy to identify these API invariants when the implementation of public APIs is provided.

(II) Challenges in API Classification. Once we have identified all these hidden APIs, we still need to further classify them into different categories and determine whether they are invocable (when there is no security check). It will be very challenging if we only use static analysis to decide this, and thus we need to rely on dynamic analysis to dynamically invoke them. However, to invoke a hidden API, we still need to recognize the interface that can communicate with the underlying layers. Although we already know that the interface that communicates with the underlying layers takes the API name as its input (as described in §3.1), it is still challenging to know whether a given interface accepts the API name as its input before we actually execute it (due to the obfuscated JavaScript code). Meanwhile, although multiple dynamic tools are available for JavaScript, they cannot be applied to our case directly due to the highly customized JavaScript framework implementations. For example, most JavaScript analysis tools (e.g., Jalangi2 [30]) are designed for traditional web browsers. They cannot run with the super apps since the offered APIs are different. Moreover, most of these tools need to instrument the testing instances, which involves the modification of the testing instances. In our case, the testing instances are the miniapps (not web applications), which usually have integrity checks and cannot be modified easily.

Insights. To invoke the API for its behavior classification, we need to find the interface, e.g., NativeGlobal.invokeHandler as shown in Figure 3. Interestingly, to identify this interface, we can monitor how a public API is executed, e.g., how it is invoked (its name, parameters), and when it is passed across the boundary of the layers. More specifically, we notice that we can use function trace analysis to identify interfaces such as NativeGlobal.invokeHandler, since the API execution starts from the invocation, and ends at the interface boundary. By tracing all of the function executions with their parameters and then identifying them based on the use of the API name, which is passed as a parameter, we can automatically identify the interface, which is typically the last invocation point in the JavaScript layer. With the identified invocation point, we can then feed it with different API names and invoke them to classify further (e.g., whether they can be invoked by the 3rd-party miniapps).

5 APISCOPE
As shown in Figure 5, our developed APIScope consists of two phases of analysis, static analysis first and then dynamic analysis, with the following two key components:
• Static API Recognition (§5.1). This component takes the binary code of super apps (i.e., APKs) and the list of the official APIs in the documentation as input, and produces the undocumented APIs as output. At a high level, it first decompiles the APKs with Soot [3], automatically extracts the invariants based on the public APIs, and then uses the invariants to recognize the hidden APIs from the implementations of super apps.
• Dynamic API Classification (§5.2). This component takes the hidden APIs as input, and classifies them into three different categories: unchecked hidden APIs (exploitable by 3rd-party miniapps), checked APIs (available to only 1st-party miniapps), and non-APIs, as the final output. At a high level, it first uses the Test Case Generator to produce two types of test cases: one is for API invocation identification executed by a lightweight tracing engine for the monitored execution, and the other is for API classification. With these test cases, APIScope eventually identifies the interfaces as well as the categories of the APIs.
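The static component described above can be illustrated on toy data. The sketch below is an illustration under stated assumptions rather than APIScope's implementation (which, per the text, operates on APKs decompiled with Soot): the class descriptors are hand-written to resemble the obfuscated WeChat classes discussed in §3.1, and `extractInvariants`, `recognizeHidden`, and `invariantKey` are hypothetical helper names. It derives a (superclass, method signature) invariant from classes whose NAME constant is a documented API, then reports other named classes that share the invariant.

```javascript
// Hand-written descriptors resembling decompiled, obfuscated classes (assumed data).
const API_SIGNATURE = "void b(IAppBrandComponent,JSONObject,int)";
const decompiledClasses = [
  { cls: "x", superclass: "a", signature: API_SIGNATURE, name: "getLocation" },
  { cls: "y", superclass: "a", signature: API_SIGNATURE, name: "openUrl" },
  { cls: "z", superclass: "a", signature: API_SIGNATURE, name: "private_openUrl" },
  { cls: "q", superclass: "k", signature: "void c(String)", name: null }, // unrelated helper
];
const documentedApis = new Set(["getLocation"]);

// An invariant here is simply the pair (superclass, handler signature).
const invariantKey = (c) => c.superclass + "|" + c.signature;

// Step 1: learn invariants from classes known to implement documented APIs.
function extractInvariants(classes, documented) {
  return new Set(classes.filter((c) => documented.has(c.name)).map(invariantKey));
}

// Step 2: flag undocumented classes that carry a NAME and match an invariant.
function recognizeHidden(classes, documented) {
  const invariants = extractInvariants(classes, documented);
  return classes
    .filter((c) => typeof c.name === "string" && !documented.has(c.name))
    .filter((c) => invariants.has(invariantKey(c)))
    .map((c) => c.name);
}

const hiddenCandidates = recognizeHidden(decompiledClasses, documentedApis);
console.log(hiddenCandidates);
```

Candidates found this way are only suspected APIs. As the text notes, the dynamic component still has to confirm them and separate checked APIs from unchecked ones and from non-APIs.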
[...] “test” and duration is set to “1”. Using such a template method, we successfully instantiated all the parameters.
• Parameter Order Permutation. Although we have instantiated the parameters, we still do not know the order of those parameters for the undocumented APIs, as the parameters in the Service Abstraction layer are all encapsulated in JSON objects. Therefore, we have to properly order the parameters, and we use a brute-force approach. For example, true and 1234 are two parameters of testAPI, which could have two possible orderings: testAPI(true, 1234) and testAPI(1234, true). We just assume that all those orderings are valid and invoke them one by one (the invalid ones will be filtered out during the API classification, which will be described later). Given that one API can accept no more than 4 parameters according to our static analysis of the code (which results in at most 4! = 24 orderings), we believe such a brute-force approach is acceptable.

We would also like to clarify certain technical details. First, during our dynamic analysis, we only explore a limited range of inputs. This is because dynamic tracing does not require a broad range of inputs to expose hidden APIs. Additionally, the test case generation is sufficient for testing whether the hidden API is protected by security checks. In other words, as long as valid inputs are provided to the API, our tool can trigger the API if there are no security checks. If there are security checks, we can observe errors. Our objective is not to enumerate all possible inputs, as we are not fuzzing the actual hidden API. Second, hidden APIs may require complex parameter types, such as JSON objects. These complex parameter types are combinations of other basic parameter types (e.g., integer, string), and can be recursively decomposed until they become primitive types. For instance, an object may contain a string, an integer, and a boolean. We can simply inflate each parameter based on its respective parameter type. As APIs implemented in the Service Abstraction Layer lack states or context, it is unnecessary to determine their execution state within this layer. Our testing process involves providing our tool with a code snippet containing the API to be tested, which is sufficient for our purposes. The JavaScript Framework Layer handles most of the checks, so

[...] use. Meanwhile, although it is true that different platforms may customize the V8 Engine to enable their desired functionalities, they will not intentionally remove the built-in Profiler since it is also helpful for their own debugging purposes. Therefore, as long as we can find a way to invoke Profiler, we will be able to collect the traces. Fortunately, we can use Frida [16], a dynamic instrumentation toolkit that supports Android, to instrument the V8 Engine at runtime, invoke startProfiling of Profiler, and collect the function traces of documented API execution.

With the collected function traces, we then present how to find the desired interface using function trace analysis, a standard technique widely used in program analysis. As discussed in §3.1, API invocation is a complicated process involving multiple layers. Fortunately, the Profiler only runs inside the JavaScript Framework layer, and we can just monitor the function traces produced at this layer since we aim to identify how to invoke an API from the JavaScript layer. In particular, our analysis starts from the API of interest (e.g., wx.getLocation), identifies all the functions involved based on the dependencies of parameters and API names, and eventually identifies the last invocation function, e.g., NativeGlobal.invokeHandler (see Figure 3), which is the desired interface we aim to discover. Specifically, the dependencies form a chained relationship, and we build them based on the parameters that are fed into the functions (we can monitor the changes of parameters of the functions). For example, when we execute wx.getLocation, we will observe a function named NativeGlobal.invokeHandler that takes a parameter named getLocation as its input. Therefore, we know that wx.getLocation and NativeGlobal.invokeHandler have dependencies.

To provide a detailed explanation of how our trace analysis works, we will use an example that features the implementations of API invocations across three layers, namely the JavaScript Framework layer, the Customized V8 layer, and the Service Abstraction layer. The process begins with the JavaScript Framework layer, which initiates the API invocation by calling NativeGlobal.invokeHandler. This invocation is then handed over to the Customized V8 layer, which is responsible for handling it. As
the API invocation is checked before its order or dependency state shown in Figure 7, this step is represented line 10 of the JavaScript
is resolved. Framework layer’s implementation. Next, the Customized V8 layer
extracts critical information from the API invocation, including the
Step-II: API Invocation Identification. Next, APIScope needs to API name, its parameters, and any corresponding callbacks. This in-
execute the generated test cases on top of our customized V8 engine formation is obtained from lines 28–32 of the Customized V8 layer’s
to identify how the documented API is invoked, so that it can later implementation. The Customized V8 layer then proceeds to invoke
similarly invoke the undocumented ones. Intuitively, when we test the relevant APIs at the Service Abstraction Layer through the use
a specific API, we need to compile and produce a testing miniapp of the Java Native Interface (JNI) [21]. Finally, during the API invo-
that contains the API for our test. However, this approach is not cations at the Service Abstraction layer (line 4), this layer may need
scaled and can slow down our testing performance. Interestingly, to communicate with the Customized V8 layer for additional op-
we notice that we can let the V8 engine directly inject the JavaScript erations, such as performing permission checks if the API requires
code into the JavaScript Framework Layer (the V8 engine has a them. We have omitted this code for the sake of brevity. In summary,
function named script, which accepts JavaScript code as input, our trace analysis provides insight into the entire process of API
and injects the code for the JavaScript Framework Layer to execute). invocations across the three layers of the system. We track the flow
Since the JavaScript code is injected into the JavaScript Framework of control and collect data on API names, parameters, and callbacks
layer, the super apps will handle the code as they handle the code to enable a more comprehensive analysis of the system’s behavior.
in a regular miniapp.
Also, in most cases, V8 Engine has a built-in Profiler, but the Step-III: Dynamic Probing for API Category Classification.
super apps do not directly expose any interfaces for developers to With the identified interfaces of how to invoke a public API, we then
use it to similarly invoke undocumented APIs, by first generating
3 var globalCount = 0;
4
5 function invokeMethod(apiName, pa
6 params = WeixinNativeBuffer.p
7 var filteredParams = paramFil
8 callbackId = ++globalCount
9 callbackQueue[callbackId] = c
10 a(apiName, params, ca
11 callbackId = NativeGl
12 callbackId)
13 invokeCallbackHandler
14 }(apiName, filteredParams
This is a preprint of our CCS 2023 paper. Chao Wang, Yue Zhang, and Zhiqiang Lin 15 }
16 return this;
17 }(global);
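The brute-force parameter ordering used during test case generation, together with the error-signature idea behind the later classification, can be illustrated with a short sketch. This is not APIScope's code: the function names `candidate_orderings` and `classify` are hypothetical, and only the "fail: no permission" message comes from the text. Treating every other error as a non-API is an assumption made for illustration.

```python
from itertools import permutations


def candidate_orderings(values):
    """Yield each distinct ordering of the candidate parameter values."""
    seen = set()
    for ordering in permutations(values):
        # repr() keeps True and 1 apart, which plain equality would not.
        key = tuple(repr(v) for v in ordering)
        if key not in seen:
            seen.add(key)
            yield ordering


def classify(error_message):
    """Map host-app feedback to a category (signatures are illustrative)."""
    if error_message is None:
        return "unchecked"          # invoked like a public API, no error
    if "no permission" in error_message:
        return "checked"            # e.g. "fail: no permission"
    return "not-an-api"             # assumption: any other error


# The testAPI example from the text: two parameters, two orderings.
orderings = list(candidate_orderings([True, 1234]))
for args in orderings:
    print("testAPI" + repr(args))
```

With four parameters the generator yields at most 4! = 24 orderings, which matches the bound stated in the text.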
When executing a particular test case, there could be three types
of outcomes: the tested “API” is a checked API (when invoked, a
permission denial will be observed based on the standard error
messages), the tested “API” is an unchecked API (which can be
invoked successfully), the tested “API” is not an API. As such, we
can use the following strategies to identify them.
• Unchecked APIs. Similar to the public APIs, the unchecked
  undocumented APIs can be invoked without requiring additional
  permissions. As such, we first deliver a public API invocation
  request, such as getLocation, and record the feedback of the
  host app. For example, WeChat and Baidu will not print any
  errors when the invocation request gets approved, and we then
  use this as a signature to see whether an invocation request is
  successfully executed.

[Java listing fragment, truncated in extraction: two classes, the first commented as documented and the second as undocumented, each containing an @Override method.]

an error message is used as a signature to match the non-APIs.
As an example, in the case of WeChat, if we attempt to use the
API openUrl, the super app will generate an error message stating
“fail: no permission”. This error message implies that the API
is a checked hidden API. On the other hand, if we use the API
private_openUrl, the super app will handle the invocation request
as a regular request without displaying any error message. As a
result, we can conclude that this API is an unchecked hidden API.

6 EVALUATION
We have developed a prototype of APIScope with 5K lines of code
on top of open source tools such as Soot [13] for decompilation
and Frida [16] for tracing. In this section, we present the evalua-
Name    Vendor     Version  V8   Date        Installs         1st-party miniapp
                                                              being tested?
Baidu   Baidu      12.21    7.6  08/13/2021  5,000,000+       ✓
QQ      Tencent    8.8      7.2  10/05/2021  10,000,000+      ✓
TikTok  ByteDance  17.9     7.2  10/19/2021  1,000,000,000+   ✗
WeChat  Tencent    8.0      8.0  07/21/2021  100,000,000+     ✓
WeCom   Tencent    3.1      8.0  09/14/2021  100,000+         ✓
Table 1: Summary of the Tested Super Apps

register as their developers. However, they allow individuals to
apply for trial accounts to use their development tools to develop
miniapps, and therefore, we tested Baidu using their trial accounts.

The Tested Miniapps. We believe it is important to measure the
usage of undocumented APIs in 1st-party and 3rd-party miniapps
for two reasons. First, understanding how 1st-party miniapps use
these APIs can help us comprehend the entire ecosystem. Second,
if 3rd-party developers know about these APIs, they may use them,
which can lead to security issues if these APIs have access to
sensitive resources. To analyze the usage of undocumented APIs in
1st-party miniapps, we searched for interfaces provided by host
apps and collected 236 miniapps from WeChat and WeCom, 340
miniapps from Baidu, and 24 miniapps from QQ. We could not
find information about the 1st-party miniapps of TikTok, so we
did not report their API usage. We could not scan all 3rd-party
miniapps because there is no public dataset or crawler available.
Therefore, we can only measure the usage of hidden APIs among
3rd-party miniapps within the WeChat ecosystem. We collected
267,359 miniapps using Mini-Crawler [38] within 3 weeks.

The Testing Environment. We performed our static analysis on
one laptop, which has a 6-core Intel Core i7-10850H (4.90 GHz)
CPU and 64 GB RAM, and our dynamic analysis on a Google Pixel
4 running Android 11 and a Google Pixel 2 running Android 9, since
we particularly focused on the Android version of miniapps.

6.2 Effectiveness
The effectiveness evaluation aims to quantify how APIScope
uncovered the hidden APIs in terms of the specific numbers for the
involved analysis (which is presented in Table 2), and their
qualities (i.e., whether there are any false positives). It is worth
noting that the manually created cases are indeed rare. For example,
for Baidu, we automatically created 423 test cases, and created another
56 test cases manually, so the manual effort is around 12%, i.e.,
56/(56+423) ≈ 0.117. Other super apps required even less manual
effort than Baidu (e.g., WeCom has 2.9% manual effort).
Specifically, the effectiveness of our static analysis is measured
by the identification of API invariants and the number of identified
API candidates (i.e., the functions that are very likely to be APIs).
However, whether those API candidates are really APIs is determined
in dynamic API classification. For the API invariants, while
we have listed four invariants in §5.1, not all of them will exist in
all super apps (e.g., Baidu and QQ do not have the caller invariant),
as shown in Table 2. That is why APIScope aggressively identifies as
many invariants as possible. With these invariants, it sufficiently
recognizes the undocumented APIs even though some of them do
not exist in other super apps. During static API recognition,
APIScope recognized in total 1,829 API candidates for these super apps.
Among them, WeCom contains the most hidden API candidates
(683), followed by WeChat (containing 575 API candidates). TikTok
has fewer API candidates (i.e., 124 API candidates), likely because it
has the smallest codebase (in LoC) among the super apps.
The effectiveness of dynamic analysis is measured by the number
of traced functions during API invocation identification and the
number of test cases used during API classification. Among the test
cases, we also quantify the number of automatically generated test
cases and manually created test cases. We can see that most of the
test cases are automatically generated by our test case generation
algorithm, and the number of automatically generated test cases is
greater than the number of API candidates due to the parameter
order permutation (as discussed in §5.2). With our dynamic
classification for the identified APIs, APIScope detected a large
number of hidden APIs, many of which are unchecked (as reported in
Table 2). WeChat has more APIs (590 public APIs, 502 undocumented
unchecked APIs, and 65 undocumented checked APIs) than the
other super apps. However, TikTok has a relatively small number
of APIs (383 public APIs, 120 undocumented unchecked APIs, and
2 undocumented checked APIs). With respect to the percentage
of undocumented unchecked and checked APIs, WeCom has the
most undocumented unchecked APIs (46.3%) and undocumented
checked APIs (6.4%).

Correctness of Our Result. We quantify whether there are any
false positives or false negatives for the identified hidden APIs.
First, a false positive here means that the identified API is not
hidden, or is not an API. By design, APIScope will not have false
positives for two reasons: (1) the invariants we extracted have very
strict patterns (they have to exist among all public APIs and all of
them have to be present in the undocumented APIs), and (2) our dynamic
probing for API classification can filter out those non-APIs, which
eliminates potential false positives. Nevertheless, we still thoroughly
scrutinized each API identified for WeChat by conducting a manual check
to ensure that there were no false positives. In other words, we
made sure that the tool did not mistakenly classify non-APIs as APIs.
Thanks to our design, we did not come across any false positives
during our examination. Second, with respect to false negatives (i.e.,
a “true” hidden API is missed by APIScope), we note that theoretically
APIScope could have false negatives, for instance, if our invariants
are too strong. However, we will not be able to quantify this, since
we do not have the ground truth, unless we can manually examine
each line of code. Therefore, we leave this to future work.

Categories of the Identified APIs. With the identified APIs,
we can then obtain some insights from them, such as which
category contains more hidden APIs. To this end, we manually walked
through each API and categorized it based on the categories of
the documented ones, to classify the undocumented (i.e., hidden)
APIs. This result is presented in Table 3. Interestingly, we found
that most of the categories contain undocumented unchecked APIs.
In particular, for some of the super apps (e.g., WeChat), their
undocumented unchecked APIs can even outnumber the documented
APIs in some of the categories (e.g., the API category Payment
has 28 undocumented APIs, which is far more than its
documented APIs). Finally, we found that some well-documented APIs
of a specific super app may not be open to the public in other super
apps. For example, getUserInfo is an undocumented API of Baidu,
while WeChat has the same API with the same functionalities,
Table 2: Effectiveness of APIScope with the tested super apps. The terms “Signature”, “Super Class”, “Super Package”, and
“Callers” have consistent meanings with those defined in §5.1.
WeChat WeCom Baidu TikTok QQ
Available APIs
D % UU % UC % D % UU % UC % D % UU % UC % D % UU % UC % D % UU % UC %
Basic 5 71.4 2 28.6 - 0.0 6 66.7 3 33.3 - 0.0 8 72.7 2 18.2 1 9.1 7 63.6 4 36.4 - 0.0 3 100.0 - 0.0 - 0.0
App 13 39.4 14 42.4 6 18.2 13 37.1 16 45.7 6 17.1 8 42.1 10 52.6 1 5.3 6 50.0 6 50.0 - 0.0 9 34.6 17 65.4 - 0.0
Base
Debug 15 88.2 2 11.8 - 0.0 15 88.2 2 11.8 - 0.0 1 3.3 28 93.3 1 3.3 - 0.0 - 0.0 - 0.0 20 100.0 - 0.0 - 0.0
Misc 10 58.8 7 41.2 - 0.0 10 55.6 8 44.4 - 0.0 9 100.0 - 0.0 - 0.0 10 52.6 9 47.4 - 0.0 9 100.0 - 0.0 - 0.0
Interaction 6 46.2 7 53.8 - 0.0 6 46.2 7 53.8 - 0.0 7 41.2 10 58.8 - 0.0 9 81.8 2 18.2 - 0.0 6 40.0 9 60.0 - 0.0
Navigation 4 44.4 5 55.6 - 0.0 4 40.0 6 60.0 - 0.0 4 100.0 - 0.0 - 0.0 5 100.0 - 0.0 - 0.0 4 33.3 8 66.7 - 0.0
UI Animation 32 100.0 - 0.0 - 0.0 32 100.0 - 0.0 - 0.0 21 95.5 1 4.5 - 0.0 1 100.0 - 0.0 - 0.0 31 100.0 - 0.0 - 0.0
WebView - 0.0 22 95.7 1 4.3 - 0.0 24 96.0 1 4.0 - 0.0 3 75.0 1 25.0 - 0.0 3 100.0 - 0.0 - 0.0 16 100.0 - 0.0
Misc 20 27.0 54 73.0 - 0.0 20 25.6 58 74.4 - 0.0 37 77.1 11 22.9 - 0.0 14 73.7 5 26.3 - 0.0 18 42.9 24 57.1 - 0.0
Request 5 55.6 4 44.4 - 0.0 5 55.6 4 44.4 - 0.0 2 66.7 1 33.3 - 0.0 6 60.0 4 40.0 - 0.0 4 66.7 2 33.3 - 0.0
Download 7 24.1 21 72.4 1 3.4 7 23.3 22 73.3 1 3.3 11 100.0 - 0.0 - 0.0 - 0.0 4 100.0 - 0.0 6 60.0 4 40.0 - 0.0
Network Upload 7 50.0 5 35.7 2 14.3 7 46.7 6 40.0 2 13.3 6 100.0 - 0.0 - 0.0 - 0.0 4 100.0 - 0.0 6 75.0 2 25.0 - 0.0
Websocket 14 93.3 1 6.7 - 0.0 14 93.3 1 6.7 - 0.0 13 100.0 - 0.0 - 0.0 7 77.8 2 22.2 - 0.0 13 86.7 2 13.3 - 0.0
Misc 23 88.5 3 11.5 - 0.0 23 85.2 4 14.8 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0 10 55.6 8 44.4 - 0.0
Storage 10 66.7 5 33.3 - 0.0 10 66.7 5 33.3 - 0.0 10 100.0 - 0.0 - 0.0 10 90.9 1 9.1 - 0.0 10 83.3 2 16.7 - 0.0
Map 8 14.3 48 85.7 - 0.0 8 14.3 48 85.7 - 0.0 7 100.0 - 0.0 - 0.0 6 100.0 - 0.0 - 0.0 9 36.0 16 64.0 - 0.0
Image 6 60.0 4 40.0 - 0.0 6 60.0 4 40.0 - 0.0 6 85.7 1 14.3 - 0.0 5 83.3 1 16.7 - 0.0 6 60.0 4 40.0 - 0.0
Video 14 35.0 26 65.0 - 0.0 14 31.8 30 68.2 - 0.0 19 95.0 1 5.0 - 0.0 8 80.0 2 20.0 - 0.0 14 63.6 8 36.4 - 0.0
Audio 64 84.2 9 11.8 3 3.9 64 79.0 14 17.3 3 3.7 44 100.0 - 0.0 - 0.0 44 81.5 10 18.5 - 0.0 61 85.9 10 14.1 - 0.0
Media
Live 26 46.4 30 53.6 - 0.0 26 39.4 40 60.6 - 0.0 8 100.0 - 0.0 - 0.0 19 100.0 - 0.0 - 0.0 23 57.5 17 42.5 - 0.0
Recorder 16 84.2 3 15.8 - 0.0 16 84.2 3 15.8 - 0.0 12 100.0 - 0.0 - 0.0 11 91.7 1 8.3 - 0.0 15 88.2 2 11.8 - 0.0
Camera 9 60.0 6 40.0 - 0.0 9 52.9 8 47.1 - 0.0 9 50.0 9 50.0 - 0.0 20 95.2 1 4.8 - 0.0 4 36.4 7 63.6 - 0.0
Misc 12 75.0 3 18.8 1 6.3 12 75.0 3 18.8 1 6.3 18 100.0 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0 6 100.0 - 0.0 - 0.0
Location 3 42.9 4 57.1 - 0.0 3 42.9 4 57.1 - 0.0 7 100.0 - 0.0 - 0.0 3 100.0 - 0.0 - 0.0 3 100.0 - 0.0 - 0.0
Share 4 33.3 7 58.3 1 8.3 4 16.7 19 79.2 1 4.2 3 100.0 - 0.0 - 0.0 5 71.4 2 28.6 - 0.0 5 35.7 9 64.3 - 0.0
Canvas 60 74.1 21 25.9 - 0.0 60 74.1 21 25.9 - 0.0 46 92.0 4 8.0 - 0.0 49 98.0 1 2.0 - 0.0 48 92.3 4 7.7 - 0.0
File 39 97.5 1 2.5 - 0.0 39 92.9 3 7.1 - 0.0 35 100.0 - 0.0 - 0.0 34 97.1 1 2.9 - 0.0 37 97.4 1 2.6 - 0.0
Login 2 100.0 - 0.0 - 0.0 5 83.3 1 16.7 - 0.0 3 42.9 1 14.3 3 42.9 2 100.0 - 0.0 - 0.0 2 100.0 - 0.0 - 0.0
Navigate 2 33.3 2 33.3 2 33.3 2 22.2 5 55.6 2 22.2 3 100.0 - 0.0 - 0.0 7 100.0 - 0.0 - 0.0 2 50.0 1 25.0 1 25.0
User Info 2 16.7 7 58.3 3 25.0 5 23.8 13 61.9 3 14.3 1 10.0 6 60.0 3 30.0 2 13.3 13 86.7 - 0.0 2 28.6 4 57.1 1 14.3
Open API Payment 1 3.4 13 44.8 15 51.7 1 3.2 15 48.4 15 48.4 1 50.0 - 0.0 1 50.0 1 33.3 1 33.3 1 33.3 2 22.2 7 77.8 - 0.0
Bio-Auth 3 27.3 3 27.3 5 45.5 3 21.4 6 42.9 5 35.7 - 0.0 - 0.0 - 0.0 - 0.0 1 100.0 - 0.0 3 100.0 - 0.0 - 0.0
Enterprise - 0.0 1 100.0 - 0.0 5 17.9 6 21.4 17 60.7 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0
Misc 14 19.4 42 58.3 16 22.2 14 16.7 54 64.3 16 19.0 16 57.1 2 7.1 10 35.7 25 55.6 20 44.4 - 0.0 12 13.0 78 84.8 2 2.2
Wi-Fi 9 100.0 - 0.0 - 0.0 9 100.0 - 0.0 - 0.0 10 100.0 - 0.0 - 0.0 4 100.0 - 0.0 - 0.0 9 100.0 - 0.0 - 0.0
Bluetooth 18 60.0 11 36.7 1 3.3 18 58.1 12 38.7 1 3.2 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0 18 100.0 - 0.0 - 0.0
Contact 1 10.0 5 50.0 4 40.0 1 9.1 6 54.5 4 36.4 1 33.3 2 66.7 - 0.0 - 0.0 - 0.0 - 0.0 1 25.0 2 50.0 1 25.0
Device NFC 5 26.3 14 73.7 - 0.0 9 39.1 14 60.9 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0 5 100.0 - 0.0 - 0.0
Screen 4 36.4 6 54.5 1 9.1 4 36.4 6 54.5 1 9.1 3 100.0 - 0.0 - 0.0 9 100.0 - 0.0 - 0.0 4 100.0 - 0.0 - 0.0
Phone 1 4.3 21 91.3 1 4.3 1 4.3 21 91.3 1 4.3 1 100.0 - 0.0 - 0.0 1 100.0 - 0.0 - 0.0 1 50.0 1 50.0 - 0.0
Misc 28 63.6 15 34.1 1 2.3 28 59.6 18 38.3 1 2.1 21 80.8 5 19.2 - 0.0 16 69.6 7 30.4 - 0.0 28 82.4 6 17.6 - 0.0
CV 19 100.0 - 0.0 - 0.0 19 100.0 - 0.0 - 0.0 18 90.0 2 10.0 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0
AI
Misc - 0.0 - 0.0 - 0.0 - 0.0 1 100.0 - 0.0 11 100.0 - 0.0 - 0.0 7 100.0 - 0.0 - 0.0 - 0.0 - 0.0 - 0.0
AD 19 95.0 1 5.0 - 0.0 19 95.0 1 5.0 - 0.0 9 64.3 4 28.6 1 7.1 13 61.9 8 38.1 - 0.0 3 25.0 9 75.0 - 0.0
Uncategorized 30 38.5 47 60.3 1 1.3 30 36.6 51 62.2 1 1.2 15 53.6 10 35.7 3 10.7 17 68.0 7 28.0 1 4.0 34 68.0 15 30.0 1 2.0
All 590 51.0 502 43.4 65 5.6 606 47.3 593 46.3 82 6.4 464 77.1 113 18.8 25 4.2 383 75.8 120 23.8 2 0.4 506 62.7 295 36.6 6 0.7
Table 3: Categories of Documented and Undocumented APIs. “D” means documented APIs; “UU” means undocumented
unchecked APIs; “UC” means undocumented checked APIs.
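As a quick consistency check, the percentages in the “All” row of Table 3 can be recomputed from the counts in the same row. The sketch below is illustrative only: the counts are copied from the table, while the helper name `shares` is made up for this example.

```python
# (D, UU, UC) counts copied from the "All" row of Table 3.
ALL_ROW = {
    "WeChat": (590, 502, 65),
    "WeCom": (606, 593, 82),
    "Baidu": (464, 113, 25),
    "TikTok": (383, 120, 2),
    "QQ": (506, 295, 6),
}


def shares(counts):
    """Return each count as a percentage of the row total, to one decimal."""
    total = sum(counts)
    return tuple(round(100 * c / total, 1) for c in counts)


for app, counts in ALL_ROW.items():
    d, uu, uc = shares(counts)
    print(f"{app:7s} D={d}% UU={uu}% UC={uc}%")
```

The recomputed shares for WeCom (46.3% unchecked, 6.4% checked) agree with the figures quoted in the effectiveness discussion.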
which is publicly accessible. Finally, since APIScope is a systematic
and mostly automated tool, it can inspect API changes based on
previous versions of the super app implementations as long as we
can obtain both their APKs and documentation. We have a detailed
evaluation of API evolution in Appendix-§C for interested readers.

Usage of Hidden APIs (Among the 1st-party Miniapps). We
obtained many 1st-party miniapps and classified them into
categories based on their meta-data. From the data in Table 4, we found
that the use of undocumented APIs is common among 1st-party
This is a preprint of our CCS 2023 paper. Submission to ACM CCS 2023, 2023
               WeChat             WeCom              Baidu              QQ
Category       Used Total %       Used Total %       Used Total %       Used Total %
E-learning     5    9     55.6    5    9     55.6    12   33    36.4    -    1     0.0
Entertainment  9    17    52.9    9    17    52.9    29   75    38.7    2    2     100.0
Finance        1    1     100.0   1    1     100.0   21   23    91.3    -    -     0.0
Food           -    -     0.0     -    -     0.0     -    5     0.0     -    -     0.0
Games          18   36    50.0    18   36    50.0    -    -     0.0     -    -     0.0
Government     2    7     28.6    2    7     28.6    3    8     37.5    1    1     100.0
Health         2    7     28.6    2    7     28.6    1    5     20.0    -    1     0.0
Job            -    1     0.0     -    1     0.0     -    -     0.0     -    -     0.0
Lifestyle      2    5     40.0    2    5     40.0    3    15    20.0    -    1     0.0
Photo          3    7     42.9    3    7     42.9    -    -     0.0     -    -     0.0
Shopping       1    1     100.0   1    1     100.0   -    2     0.0     -    -     0.0
Social         4    8     50.0    4    8     50.0    1    4     25.0    -    1     0.0
Sports         -    -     0.0     -    -     0.0     -    1     0.0     -    -     0.0
Tool           15   55    27.3    15   55    27.3    16   47    34.0    4    8     50.0
Traffic        3    5     60.0    3    5     60.0    4    10    40.0    -    1     0.0
Travelling     2    2     100.0   2    2     100.0   1    56    1.8     1    2     50.0
Uncategorized  -    -     0.0     -    -     0.0     1    2     50.0    -    -     0.0
Total          87   236   36.9    90   236   38.1    118  340   34.7    9    24    37.5
Table 4: The 1st party miniapps that have used the undocumented APIs.
The first column indicates the number of 1st-party mini-apps using
undocumented APIs, and the second column represents the total number
of 1st-party mini-apps. We calculate the percentage of mini-apps by
using the first column divided by the second.

API                                 Category       #    %
Baidu
swan.getBDUSS                       User Info      4    3.39   ✓
swan.getCommonSysInfo               System         3    2.54   ✓
swan.getUserInfo                    User Info      3    2.54   ✗
swan.getChannelID                   Uncategorized  2    1.69   ✓
WeChat
wx.hideNavigationBar                Bar            28   32.18  ✗
wx.requestSubscribeMessage          Subscribe      25   28.74  ✗
wx.showNavigationBar                Bar            23   26.44  ✗
wx.requestVirtualPayment            Payment        11   12.64  ✓
wx.openUrl                          Misc           8    9.20   ✓
wx.hideHomeButton                   Interaction    8    9.20   ✗
wx.enterContact                     Contact        5    5.75   ✓
wx.drawCanvas                       Canvas         5    5.75   ✗
wx.setPageOrientation               Misc           4    4.60   ✗
wx.operateWXData                    Misc           4    4.60   ✗
wx.getBackgroundFetchData           Misc           3    3.45   ✗
wx.setBackgroundFetchToken          Misc           3    3.45   ✗
wx.startFacialRecognitionVerify     Bio-Auth       3    3.45   ✓
wx.checkIsSupportFacialRecognition  Bio-Auth       2    2.30   ✓
wx.navigateBackApplication          Navigate       2    2.30   ✗
wx.navigateBackNative               Navigate       2    2.30   ✓
wx.onDeviceOrientationChange        Device         2    2.30   ✗
wx.openBusinessView                 View           2    2.30   ✗
wx.verifyPaymentPassword            Payment        2    2.30   ✓
WeCom
wx.hideNavigationBar                Bar            28   31.11  ✗
wx.requestSubscribeMessage          Subscribe      25   27.78  ✗
wx.showNavigationBar                Bar            23   25.56  ✗
wx.requestVirtualPayment            Payment        11   12.22  ✓
wx.openUrl                          Misc           8    8.89   ✓
wx.hideHomeButton                   Interaction    8    8.89   ✗
wx.enterContact                     Contact        5    5.56   ✓
wx.drawCanvas                       Canvas         5    5.56   ✗
wx.setPageOrientation               Misc           4    4.44   ✗
wx.operateWXData                    Misc           4    4.44   ✗
[Figure 8: “API Usage by Super App”. Y-axis: Number of Uses. Legend:
WeChat, WeCom, Baidu, TikTok, QQ. X-axis (Android API classes):
MediaMetadataRetriever, WifiManager, MediaExtractor, WifiInfo, Camera,
AudioManager, MediaPlayer, NdefRecord, NsdManager, WifiNetworkSpecifier,
NetworkRequest, WifiConfiguration, PackageManager, SharedPreferences,
BluetoothGatt, BluetoothManager, MediaFormat, NdefMessage,
ConnectivityManager, IpPrefix, MacAddress, BluetoothDevice,
BluetoothGattCharacteristic, BluetoothGattService, AudioDeviceInfo,
BluetoothAdapter, LocationManager, NfcAdapter, LocalServerSocket,
NetworkInfo, LinkProperties.]

            WeChat         WeCom          Baidu          TikTok         QQ
Resource    # UUS  %       # UUS  %       # UUS  %       # UUS  %       # UUS  %
Bluetooth   3      0.59    3      0.51    -      -       -      -       -      -
Camera      1      0.20    1      0.17    -      -       -      -       1      0.34
Location    -      -       -      -       -      -       -      -       1      0.34
Media       5      0.96    5      0.84    -      -       11     9.17    11     3.73
NFC         3      0.59    3      0.51    -      -       -      -       -      -
Network     16     3.19    16     2.70    7      6.19    20     16.67   24     8.14
Package     3      0.59    4      0.67    1      0.88    -      -       1      0.34
Storage     25     4.98    26     4.38    3      2.65    2      1.67    8      2.71
Telephony   -      -       -      -       -      -       1      0.83    -      -
Total       39     7.77    40     6.75    8      7.08    32     26.67   38     12.88
unchecked sensitive APIs. Please note that a single hidden
API may have access to multiple types of resources. Therefore,
the total number of hidden APIs may not be equal to
the sum of all the APIs that have been identified for each
individual resource type.
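The caption's remark that the total need not equal the sum of the per-resource counts follows from counting each API once, however many resource types it touches. A minimal sketch of that counting is shown below. The API names appear in the paper, but their assignment to resource types here is a hypothetical example and not data taken from the table.

```python
# Hypothetical assignment of hidden APIs to the resource types they touch.
apis_by_resource = {
    "Storage": {"captureScreen", "installDownloadTask"},
    "Network": {"installDownloadTask", "private_openUrl"},
    "Telephony": {"getLocalPhoneNumber"},
}

# Per-resource counts, as they would appear in the table rows.
per_resource = {name: len(apis) for name, apis in apis_by_resource.items()}

# Adding the rows counts installDownloadTask twice.
sum_of_rows = sum(per_resource.values())

# The "Total" row counts each API once, so it is a set union.
unique_total = len(set().union(*apis_by_resource.values()))

print(per_resource, sum_of_rows, unique_total)
```

In this example the rows add up to 5 while only 4 distinct APIs exist, which is the kind of gap seen between the table's row sums and its "Total" row.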
API Packages. We exclude the API packages that are invoked only
once. It can be observed from Figure 8 that the most commonly
used API is SharedPreferences. This is reasonable, as many of the
APIs involve file operations. The available APIs consist of those
dedicated to saving screenshots onto disks, which can be utilized
to launch A3. Besides file access APIs, numerous hidden APIs make
use of Internet access APIs for different purposes, including
payment processing, network resource access, and more. The currently
available APIs comprise those responsible for website access, which
can be leveraged to trigger A1, APIs created for APK downloading
and installation, which can be utilized to launch A2, and APIs for
querying contact information, which can be employed to initiate A5.
Please note that there are also APIs that access NFC, Camera, and
Telephony Manager (which can be used to launch A4). However,
since they have only been invoked once, we have excluded them
from the figure.

7.2 Attack Case Studies
We present a few case studies to demonstrate how we can exploit
those hidden unchecked (i.e., unprotected) APIs. For proof of
concept, we present five case studies ranging from arbitrary webpage
access to information theft, as shown in Table 9.

Table 9: Summary of the attacks we tested

(A1) Arbitrary Web Page Access. We made a malicious miniapp
that can open any webpage using the hidden API private_openUrl.
Super apps usually have an allowlist of approved domains to prevent
users from accessing untrusted sources (i.e., miniapps usually utilize
the official API wx.request to access websites, and any network
requests made through this API will be thoroughly vetted), but our
malware can bypass these restrictions and navigate to any webpage
without being vetted. This vulnerability allows our miniapp to open
phishing websites and steal sensitive information, which is more
powerful than previous phishing attacks [25]. We were successful
in this attack on several super apps but could not test it on TikTok
because it does not have the necessary APIs. This vulnerability is a
significant security risk for super apps because they have a unique
threat model that differs from web browsers. Super apps only allow
access to specific domains, unlike web browsers that can access any
website. This vulnerability has been confirmed as a high-severity
vulnerability by Tencent.
(A2) Malware Download and Installation. We developed a
malicious miniapp that can download and install malware using the
APIs installDownloadTask or addDownloadTaskStraight. Regular
miniapps cannot download or install APK files on a mobile
device because they have limited capabilities and can only
download certain file types from specific servers. However, by using
these APIs, a miniapp can download and install harmful APKs, which can
cause significant damage to the user’s mobile security and privacy.
This attack works on both WeChat and WeCom. Finally, although
APKs cannot be installed without the user’s consent, a miniapp is
running inside the super app, and as long as the super app has the
install permission (which most users will grant because they trust
super apps), the malicious miniapp can install arbitrary APKs.
(A3) Screenshot-based Information Theft. We made a malicious
miniapp that uses captureScreen to secretly take screenshots
and store them without the user’s permission. This could be used
by attackers to steal sensitive information like passwords and credit
card numbers from the user’s screen. The consequences of this kind
of attack are serious. For example, the attacker could use the
screenshots to steal the victim’s identity and open fake accounts or
make illegal purchases. They could also use the screenshots to commit
financial fraud by stealing the victim’s credit card.
(A4) Phone Number Theft. The malicious miniapps may use
getLocalPhoneNumber to illicitly obtain the user’s phone numbers.
The hidden API is implemented by getLine1Number, which is a
built-in feature of the Android SDK intended to provide the phone
number associated with the SIM card currently inserted in the
device. Nevertheless, access to phone number information from the
SIM card may be blocked or restricted by some carriers or
manufacturers, thereby rendering this attack unsuccessful in certain
cases.
(A5) Contact Information Theft. A miniapp can potentially
access sensitive information, such as the friend list (including the
usernames and WeChat IDs) using searchContacts. Our experiments
were conducted primarily in 2021, during which we found that this
hidden API was still functional based on our raw results. Upon
reporting the issue to WeChat, we were informed that another group
had already reported the problem to them (CVE-2021-40180 [32]),
and that the exploit no longer works on the new version of WeChat.

8 DISCUSSION
Limitations and Future Work. Although effective, APIScope can
still be improved in various ways. It is possible for the tool to have
false positives and negatives, although none have been encountered
through dynamic validation and manual verification. Also, while
currently tested on Android, additional work is needed to support
other platforms. However, our findings are representative across
different platforms, as miniapp codebases are similar. Note that
APIScope is limited to super-apps that use the V8 engine and is not
suitable for those that do not (e.g., Alipay).
In our study, we discovered some hidden APIs that may be
vulnerable, such as the installDownloadTask and
addDownloadTaskStraight APIs, which are susceptible to SQL injection
attacks. Attackers can compromise super app file download tasks by
replacing the download URL of the WeChat update package with
a malicious one. We also noticed that there are two APIs called
dumpHeapSnapshot and HeapProfiler that also have vulnerabilities.
These APIs are designed to save data from the V8 engine to
a file, but our miniapp misuses them to write to any file it wants.
While Android tries to prevent this, important files like chat
histories are still at risk. This could lead to serious problems
because our miniapp could overwrite important files of other miniapps
and their host apps, which breaks the security measures put in place by
super apps. Our experiment proved that we could overwrite a file called
EnMicroMsg.db, which stores chat history on WeChat. Attackers
might want to make these miniapps because chat history can be
used as evidence in court. We plan to develop a tool that can identify
hidden API vulnerabilities (e.g., SQL injection and buffer overflow).
Ethics and Responsible Disclosure. Since this is attack work by
nature, we must carefully address the ethical concerns. To this
end, we have followed the community practice when exploiting
the vulnerabilities and demonstrating our attacks. First, for proof of
concept, we developed quite a number of malicious miniapps and
launched attacks against our own accounts and devices. We have
never uploaded our malicious miniapps onto the markets to harm
other users. Second, we disclosed the vulnerabilities and our
attacks against WeChat to Tencent in September 2021, and to the other
four super apps in November 2021. They have all acknowledged and
confirmed our findings, and so far among them Tencent (the biggest
super app vendor with 1.2 billion monthly users) has confirmed
4 vulnerabilities, ranked 1 low, 2 medium, and 1 high,
awarded us a bug bounty, and fixed them. TikTok has been
patched too, but not Baidu at the time of writing.

9 RELATED WORK
Super Apps Security. More and more super apps have started to
support the miniapp paradigm. Correspondingly, its security has
received increasing attention. For instance, Lu et al. [25]
identified multiple flaws in WeChat, and demonstrated how an attacker
would be able to launch phishing attacks against mobile users and
collect sensitive data from the host apps. Zhang et al. [38]
developed a crawler, and understood the super apps by measuring the
program practices of the provided miniapps, including how often
the miniapp code will be obfuscated. Most recently, Zhang et al. [37]
studied the identity confusion in WebView-based super apps, and
identified that multiple super apps contain this vulnerability. A new
attack named cross-miniapp request forgery (CMRF) [36] was also
recently discovered, which exploits the missing checks of miniapp
IDs for various attacks. Differently from those works, our study
uncovers the undocumented APIs provided by the super apps and
demonstrates how they can be exploited. In a broader scope, there
is a large body of research studying the security of other super apps
including web browsers and their lightweight apps, such as Google
Instant apps [11]. In particular, Aonzo et al. [11] and Tang et al.
[31] point out that Google Instant Apps can be abused to mount
password-stealing attacks.
Undocumented API Detection and Exploitation. APIScope is
the first system to detect and exploit undocumented APIs in mobile
super apps like WeChat. Previous work has focused on detecting
undocumented APIs in other platforms, such as Android and iOS, or on
identifying missing security checks (e.g., [10, 15, 19, 24, 28, 29, 39]).
For example, PScout analyzed undocumented APIs in Android [12],
and Li et al. showed that there are 17 undocumented Android
APIs that are widely accessed by 3rd-party apps [20]. Zeinab and
Yousra studied access control vulnerabilities caused by residual
APIs [22]. In addition, there are ways to invoke undocumented
APIs in iOS [17, 34] and detect their abuses [14]. Yang et al. [35]
proposed BridgeScope to identify sensitive JavaScript bridge APIs
in hybrid apps. Undocumented APIs have also been found in the
Java language and exploited by attackers [18, 26]. APIScope builds
This is a preprint of our CCS 2023 paper. Submission to ACM CCS 2023, 2023
on this previous work to specifically focus on mobile super-apps. Finding hidden APIs in super apps using traditional techniques is difficult due to the combination of web views, host native apps, and mini app execution environments, along with code scattering and obfuscation. Our new approach monitors parameter propagation to detect API usage, using robust signatures based on super classnames and public methods. We have also created a method for automatic test case generation and API classification.
10 CONCLUSION
In this paper, we have revealed that super apps often contain undocumented and unchecked APIs for their 1st-party mini-apps, which can grant elevated privileges such as APK downloading, arbitrary web view accessing, and sensitive information querying. Unfortunately, these undocumented APIs can be exploited by malicious 3rd-party mini-apps, as they lack security checks. To address this issue, we have designed and implemented APIScope, a tool that can statically identify these undocumented APIs and dynamically verify their exploitability. Through our testing on five popular super apps such as WeChat and TikTok, we have found that all of them contain these types of APIs. Our findings suggest that super app vendors must thoroughly examine and take caution with their privileged APIs to prevent them from becoming potential exploit points.
REFERENCES
[1] “6 powerful wechat statistics you need to know in 2022,” https://brewinteractive.com/wechat-statistics/, (Accessed on 12/30/2022).
[2] “Google play store: number of apps 2022 | statista,” https://www.statista.com/statistics/266210/number-of-available-applications-in-the-google-play-store/, (Accessed on 12/27/2022).
[3] “Soot:a framework for analyzing and transforming java and android applications,” http://soot-oss.github.io/soot/, (Accessed on 12/30/2022).
[4] “Tencent app,” https://www.nbd.com.cn/articles/2022-12-01/2576229.html.
[5] “Tiktok - make your day,” https://www.tiktok.com/, (Accessed on 12/30/2022).
[6] “Wechat mini programs showcases new capabilities to celebrate its third anniversary,” https://www.tencent.com/en-us/articles/2200946.html.
[7] “What are wechat mini-programs? a simple introduction - walkthechat,” https://walkthechat.com/wechat-mini-programs-simple-introduction/, (Accessed on 12/30/2022).
[8] “WeChat Chinese Documentation,” https://developers.weixin.qq.com/miniprogram/en/dev/api/, 04 2022, (Accessed on 12/21/2022).
[9] “WeChat English Documentation,” https://developers.weixin.qq.com/miniprogram/en/dev/api/, 04 2022, (Accessed on 12/30/2022).
[10] M. Alhanahnah, Q. Yan, H. Bagheri, H. Zhou, Y. Tsutano, W. Srisa-An, and X. Luo, “Dina: Detecting hidden android inter-app communication in dynamic loaded code,” IEEE Transactions on Information Forensics and Security, vol. 15, pp. 2782–2797, 2020.
[11] S. Aonzo, A. Merlo, G. Tavella, and Y. Fratantonio, “Phishing attacks on modern android,” in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, 2018, pp. 1788–1801.
[12] K. W. Y. Au, Y. F. Zhou, Z. Huang, and D. Lie, “Pscout: analyzing the android permission specification,” in Proceedings of the 2012 ACM conference on Computer and communications security, 2012, pp. 217–228.
[13] A. Bartel, J. Klein, Y. Le Traon, and M. Monperrus, “Dexpler: converting android dalvik bytecode to jimple for static analysis with soot,” in Proceedings of the ACM SIGPLAN International Workshop on State of the Art in Java Program analysis, 2012, pp. 27–38.
[14] Z. Deng, B. Saltaformaggio, X. Zhang, and D. Xu, “iris: Vetting private api abuse in ios applications,” in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, 2015, pp. 44–56.
[15] K. Drakonakis, S. Ioannidis, and J. Polakis, “The cookie hunter: Automated black-box auditing for web authentication and authorization flaws,” in Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, 2020, pp. 1953–1970.
[16] A. Druffel and K. Heid, “Davinci: Android app analysis beyond frida via dynamic system call instrumentation,” in International Conference on Applied Cryptography and Network Security. Springer, 2020, pp. 473–489.
[17] J. Han, S. M. Kywe, Q. Yan, F. Bao, R. Deng, D. Gao, Y. Li, and J. Zhou, “Launching generic attacks on ios with approved third-party applications,” in International Conference on Applied Cryptography and Network Security. Springer, 2013, pp. 272–289.
[18] S. Huang, J. Guo, S. Li, X. Li, Y. Qi, K. Chow, and J. Huang, “Safecheck: safety enhancement of java unsafe api,” in 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). IEEE, 2019, pp. 889–899.
[19] S. M. Kywe, Y. Li, K. Petal, and M. Grace, “Attacking android smartphone systems without permissions,” in 2016 14th Annual Conference on Privacy, Security and Trust (PST). IEEE, 2016, pp. 147–156.
[20] L. Li, T. F. Bissyandé, Y. Le Traon, and J. Klein, “Accessing inaccessible android apis: An empirical study,” in 2016 IEEE International Conference on Software Maintenance and Evolution (ICSME). IEEE, 2016, pp. 411–422.
[21] S. Liang, The Java native interface: programmer’s guide and specification. Addison-Wesley Professional, 1999.
[22] Z. Ling, R. Liu, Y. Zhang, K. Jia, B. Pearson, X. Fu, and L. Junzhou, “Prison break of android reflection restriction and defense,” in IEEE INFOCOM 2021-IEEE Conference on Computer Communications. IEEE, 2021, pp. 1–10.
[23] Listen, “How to use “openUrl”?” https://developers.weixin.qq.com/community/develop/article/doc/00000efea1c4785424fc1dd4e51c13.
[24] B. Livshits and J. Jung, “Automatic mediation of Privacy-Sensitive resource access in smartphone applications,” in 22nd USENIX Security Symposium (USENIX Security 13), 2013, pp. 113–130.
[25] H. Lu, L. Xing, Y. Xiao, Y. Zhang, X. Liao, X. Wang, and X. Wang, “Demystifying resource management risks in emerging mobile app-in-app ecosystems,” in Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, 2020, pp. 569–585.
[26] L. Mastrangelo, L. Ponzanelli, A. Mocci, M. Lanza, M. Hauswirth, and N. Nystrom, “Use at your own risk: the java unsafe api in the wild,” ACM Sigplan Notices, vol. 50, no. 10, pp. 695–710, 2015.
[27] MayBG, “How to use “requestFacetoFacePayment”?” https://developers.weixin.qq.com/community/develop/doc/000cce1ebd80006b1e8f5185b56800.
[28] X. Pan, X. Wang, Y. Duan, X. Wang, and H. Yin, “Dark hazard: Learning-based, large-scale discovery of hidden sensitive operations in android apps.” in NDSS, vol. 17, 2017, pp. 10–14 722.
[29] J. Samhi, L. Li, T. F. Bissyandé, and J. Klein, “Difuzer: Uncovering suspicious hidden sensitive operations in android apps,” in Proceedings of the 44th International Conference on Software Engineering, 2022, pp. 723–735.
[30] K. Sen, S. Kalasapur, T. Brutch, and S. Gibbs, “Jalangi: A selective record-replay and dynamic analysis framework for javascript,” in Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, 2013, pp. 488–498.
[31] Y. Tang, Y. Sui, H. Wang, X. Luo, H. Zhou, and Z. Xu, “All your app links are belong to us: understanding the threats of instant apps based attacks,” in Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2020, pp. 914–926.
[32] vuldb, “Cve-2021-40180,” https://vuldb.com/?id.205138.
[33] W3C, “Miniapp standardization white paper,” https://w3c.github.io/miniapp/white-paper/, 2020.
[34] T. Wang, K. Lu, L. Lu, S. Chung, and W. Lee, “Jekyll on ios: When benign apps become evil,” in 22nd USENIX Security Symposium (USENIX Security 13), 2013, pp. 559–572.
[35] G. Yang, A. Mendoza, J. Zhang, and G. Gu, “Precisely and scalably vetting javascript bridge in android hybrid apps,” in International Symposium on Research in Attacks, Intrusions, and Defenses. Springer, 2017, pp. 143–166.
[36] Y. Yang, Y. Zhang, and Z. Lin, “Cross miniapp request forgery: Root causes, attacks, and vulnerability detection,” in Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, 2022, pp. 3079–3092.
[37] L. Zhang, Z. Zhang, A. Liu, Y. Cao, X. Zhang, Y. Chen, Y. Zhang, G. Yang, and M. Yang, “Identity confusion in webview-based mobile app-in-app ecosystems,” in 31st USENIX Security Symposium (USENIX Security 22), 2022.
[38] Y. Zhang, B. Turkistani, A. Y. Yang, C. Zuo, and Z. Lin, “A measurement study of wechat mini-apps,” in Abstract Proceedings of the 2021 ACM SIGMETRICS/International Conference on Measurement and Modeling of Computer Systems, 2021, pp. 19–20.
[39] Q. Zhao, C. Zuo, B. Dolan-Gavitt, G. Pellegrino, and Z. Lin, “Automatic uncovering of hidden behaviors from input validation in mobile apps,” in 2020 IEEE Symposium on Security and Privacy (SP). IEEE, 2020, pp. 1106–1120.
This is a preprint of our CCS 2023 paper. Chao Wang, Yue Zhang, and Zhiqiang Lin
19 invariant ← getSuperClass(capi_i);
20 foreach capi_j ∈ PAPI do
21     if invariant != getSuperClass(capi_j) then
22         isInvariant ← FALSE;
23         BREAK;
24 if isInvariant == TRUE then
25     if invariant ∉ I then
26         I.add(invariant);
27 isInvariant ← TRUE;
28 invariant ← getSuperPackage(capi_i);
29 foreach capi_j ∈ PAPI do
30     if invariant != getSuperPackage(capi_j) then
31         isInvariant ← FALSE;
32         BREAK;
33 if isInvariant == TRUE then
34     if invariant ∉ I then
35         I.add(invariant);
36 isInvariant ← TRUE;
37 invariant ← getCaller(capi_i);
38 foreach capi_j ∈ PAPI do
39     if invariant != getCaller(capi_j) then
40         isInvariant ← FALSE;
41         BREAK;
42 if isInvariant == TRUE then
43     if invariant ∉ I then
44         I.add(invariant);
Figure 9: Time cost of APIScope in its static and dynamic analysis. The dynamic analysis only includes the time consumed for identifying API invocation points.
the string “getLocation”, as shown in the 5th line of Figure 2; if it matches, we add the whole implementation body to the set PAPI (line 6). Next, we iterate over the API implementations in PAPI to extract the invariants (lines 7-44). For each candidate invariant, e.g., the superclass (lines 19-26), only when it holds for all APIs do we consider it an invariant and add it to the invariant set I (line 26); otherwise, we break the iteration and skip this invariant (line 22). After these iterations, our invariant set will contain the method signature, super class, super packages, and callers, if they are shared by the corresponding public API implementations.
With the extracted API invariants, it then becomes straightforward to identify the undocumented APIs, as shown in lines 45-51. Specifically, we iterate over the function implementations and match them against the collected invariants (line 48). If an implementation matches all the invariants of the public APIs (and it has not yet been added to the undocumented set), it is added as an undocumented API (line 50).
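The invariant extraction and matching steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not APIScope's implementation: the `Impl` record, its fields, and all class, package, and caller names in the sample data are hypothetical stand-ins for what a static analyzer would report about the host app.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Impl:
    """Hypothetical summary of one method implementation in the host app."""
    name: str
    signature: str
    super_class: str
    super_package: str
    caller: str


# One extractor per candidate invariant kind named in the text.
EXTRACTORS: Dict[str, Callable[[Impl], str]] = {
    "signature": lambda m: m.signature,
    "super_class": lambda m: m.super_class,
    "super_package": lambda m: m.super_package,
    "caller": lambda m: m.caller,
}


def extract_invariants(public_impls: List[Impl]) -> Dict[str, str]:
    """Keep a property only if every public API implementation shares its value."""
    invariants: Dict[str, str] = {}
    if not public_impls:
        return invariants
    first = public_impls[0]
    for kind, get in EXTRACTORS.items():
        value = get(first)
        if all(get(m) == value for m in public_impls):
            invariants[kind] = value
    return invariants


def find_undocumented(all_impls: List[Impl], public_impls: List[Impl]) -> List[str]:
    """Return names of non-public implementations that match every invariant."""
    invariants = extract_invariants(public_impls)
    if not invariants:
        return []  # nothing to match against, so report nothing
    public_names = {m.name for m in public_impls}
    found: List[str] = []
    for m in all_impls:
        if m.name in public_names or m.name in found:
            continue
        if all(EXTRACTORS[k](m) == v for k, v in invariants.items()):
            found.append(m.name)
    return found


# Illustrative data only: every class, package, and caller name is made up.
SIG = "invoke(Context, JSONObject, int)"
PUBLIC = [
    Impl("getLocation", SIG, "BaseJsApi", "com.example.jsapi", "ApiRegistry.register"),
    Impl("request", SIG, "BaseJsApi", "com.example.jsapi", "ApiRegistry.register"),
]
ALL_IMPLS = PUBLIC + [
    Impl("openUrl", SIG, "BaseJsApi", "com.example.jsapi", "ApiRegistry.register"),
    Impl("formatDate", "format(long)", "Object", "com.example.util", "Ui.render"),
]

print(find_undocumented(ALL_IMPLS, PUBLIC))
```

In this sketch a candidate is reported only when it matches all extracted invariants, which mirrors the conjunction described in the text; a looser match would trade precision for recall.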
[Figure 10: bar chart. Y-axis: Number of APIs; x-axis: Versions.]
Figure 10: # of Uncovered APIs in WeChat. The blue bar is the # of the APIs, and the red bar is the # of public APIs.
[Figure: bar-chart panels “APIs of Baidu” and “APIs of QQ” plotted per version; x-axis label: Versions of Baidu.]
cannot be precisely measured. For example, when our tool invokes getLocation, the host app will pop up a dialog and ask the user to grant permission. The user’s reaction time, including the time taken to press the button, will also be included in the results. As such, we can only provide approximate results, and we found that none of the dynamic API executions took hours to complete, even though there may be thousands of test cases to execute (as shown in Table 2). In fact, most of them took only several minutes to complete, which is acceptable since APIScope is a one-time program analysis tool for a specific super app.
the first two versions of WeChat, and all of them contain a significant number of undocumented APIs. Meanwhile, through our manual investigation of the historical versions, we also obtained two interesting findings: (i) APIs documented in earlier versions may later become undocumented while remaining available. For example, the API captureScreen, which is used to capture a screenshot, has been removed from the documentation and has become an undocumented one; (ii) undocumented APIs can be released to the public. For example, an API named “chooseContact” was an undocumented API, and since 7.0.12, it has become a documented API.
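The two findings above amount to comparing, for consecutive versions, the set of available APIs against the set of documented ones. A minimal sketch of such a comparison follows; the function name, the snapshot layout, and the version labels "A" and "B" are assumptions made for illustration (only 7.0.12 for chooseContact comes from the text), and the sample sets are not measured data.

```python
from typing import Dict, List, Set, Tuple

Snapshot = Tuple[str, Set[str], Set[str]]  # (version, available APIs, documented APIs)


def classify_transitions(history: List[Snapshot]) -> Dict[str, List[Tuple[str, str]]]:
    """Report APIs whose documentation status flips while they stay available."""
    result: Dict[str, List[Tuple[str, str]]] = {
        "became_undocumented": [],
        "became_documented": [],
    }
    for (_, prev_avail, prev_doc), (ver, avail, doc) in zip(history, history[1:]):
        # Only APIs present in both versions can change documentation status.
        for api in sorted(avail & prev_avail):
            was_doc, is_doc = api in prev_doc, api in doc
            if was_doc and not is_doc:
                result["became_undocumented"].append((api, ver))
            elif not was_doc and is_doc:
                result["became_documented"].append((api, ver))
    return result


# Illustrative snapshots: "A" and "B" are placeholder labels, not real versions.
APIS = {"captureScreen", "chooseContact", "getLocation"}
HISTORY: List[Snapshot] = [
    ("A", APIS, {"captureScreen", "getLocation"}),
    ("B", APIS, {"getLocation"}),
    ("7.0.12", APIS, {"chooseContact", "getLocation"}),
]

print(classify_transitions(HISTORY))
```

Restricting the comparison to APIs available in both versions keeps newly added or removed APIs out of the two categories, which concern documentation changes only.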