Vulnerability Detection on Android Apps - Inspired by Case Study on Vulnerability Related With Web Functions
ABSTRACT Nowadays, people's lifestyles depend more and more on mobile applications (Apps) for activities such as shopping, financial management, and surfing the Internet. However, developers mainly focus on implementing Apps and improving user experience while ignoring security issues. In this paper, we perform a comprehensive study of vulnerabilities caused by the misuse of APIs and form a methodology for analyzing this type of vulnerability. We investigate the security of three types of Android Apps that are closely related to daily life: finance, shopping, and browser Apps. We analyze four vulnerabilities: improper certificate validation (CWE-295: ICV), WebView bypass certificate validation vulnerability (CVE-2014-5531: WBCVV), WebView remote code execution vulnerability (CVE-2014-1939: WRCEV), and Alibaba Cloud OSS credential disclosure vulnerability (CNVD-2017-09774: ACOCDV). To verify the effectiveness of our analysis method on large-scale Apps on the Internet, we propose a novel scalable tool, VulArcher, which is based on a heuristic method and detects whether the above vulnerabilities exist in Apps. We download a total of 6114 samples of the above three types from App stores and use VulArcher to detect the above vulnerabilities in each App. We manually verify 100 randomly selected samples for each vulnerability. We find that the accuracy rate reaches 100% for ACOCDV, 95% for WBCVV, and 87% for the other two vulnerabilities. One of the vulnerabilities detected by VulArcher has been included in the China National Vulnerability Database (CNVD) as CNVD-2017-23282. Experiments show that our tool is feasible and effective. For the convenience of researchers in related communities, we make our data and tool available at https://buptnsrclab.github.io/blog/2020/01/03/vularcher-site-launched.
INDEX TERMS Static analysis, vulnerability, mobile agents, security, application software, detection
algorithms.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 8, 2020 106437
J. Qin et al.: Vulnerability Detection on Android Apps–Inspired by Case Study on Vulnerability
reproduce the vulnerabilities by combining the vulnerability reports with widely crowdsourced information. In 2019, Diao et al. [8] analyzed one vulnerability caused by the usage of the accessibility API. Amin et al. [9] focused on proposing a detection tool for Android vulnerabilities but did not give a methodology for analyzing them. Luo et al. [18] were concerned with the vulnerabilities of the Android Framework and adopted a symbolic execution method to detect them. Wu et al. [19] conducted a systematic study of Android system vulnerabilities.

The aforementioned vulnerability research is related to web vulnerabilities [20]. Although these vulnerabilities have been reported for a long time, they have received little attention. Moreover, researchers did not consider how to analyze the vulnerabilities of these Apps when they are packed [21]–[23]. To the best of our knowledge, there is little effective analysis of the aforementioned vulnerabilities. A very serious problem is that many vulnerabilities reported long ago still exist in the latest Apps, even in the same App. The main reason is that there are no feasible and complete vulnerability verification workflows for developers. In addition, more and more web functions are used in Apps [24], and these functions may introduce new vulnerabilities. Users send their information over the network while using Apps, which makes it particularly important for Apps to ensure their security.

In this paper, we target vulnerabilities related to web functions. By manually analyzing 400 Apps, we find that vulnerabilities caused by the misuse of APIs account for the majority, and that most of these are WRCEV, WBCVV, ICV, and ACOCDV. We therefore chose these four vulnerabilities as our study targets. We also divide them into three categories based on their behaviors: Overridden method (Cat1: OM), Use unsafe settings (Cat2: USS), and Data leakage of sensitive information (Cat3: DLSI).

Contributions. Based on the above analysis, we explore the severity of these vulnerabilities, perform a comprehensive study of the four vulnerabilities caused by the misuse of APIs, and form an analysis methodology for each category of vulnerabilities. To mitigate them, we propose corresponding improvement recommendations. Based on the methodology, we propose a vulnerability verification workflow for each vulnerability. To verify the effectiveness of our analysis methodology on large-scale Apps on the Internet, and based on our findings, we propose a vulnerability detection tool named VulArcher. It can detect the above four vulnerabilities in packed or unpacked Android Apps, and its average accuracy rate is 91%. The main contributions are as follows:
(1) For the four vulnerabilities caused by the misuse of APIs, we divide them into three categories based on their behaviors. We perform a comprehensive study of them and form an analysis methodology for each vulnerability, which can help researchers discover other vulnerabilities in the three categories. Based on the analysis, we provide improvement recommendations to mitigate the vulnerability threat.
(2) Based on the methodology, we propose a vulnerability verification workflow for each vulnerability; researchers can follow these workflows to verify the corresponding vulnerabilities.
(3) To verify our methodologies on large-scale Apps, we propose VulArcher, a static and scalable vulnerability detection tool based on our findings from the vulnerability analysis. We download 6,177 Android Apps containing transactions from the Internet, all of which are the latest versions in 2018. Random manual verification of VulArcher's results shows that the average accuracy rate is 91%. In addition, VulArcher can also detect vulnerabilities in packed Apps. By adding rules to a configuration file, VulArcher can detect other categories of vulnerabilities. We will share VulArcher at https://buptnsrclab.github.io/blog/2020/01/03/vularcher-site-launched.

II. BACKGROUND
A. ANDROID APP OVERVIEW
An Android App file is a compressed file that contains the AndroidManifest.xml file, certificate files, a dex file, static resource files, and so on.
1) AndroidManifest.xml, which is defined and included in every App, describes the name, version, permissions, component names, and other information of the App.
2) Certificate files hold the signature information that ensures the integrity of the App.
3) The dex file is an executable file that contains all the operating instructions and runtime data of the App. Its structure is as follows:
(1) Header: contains metadata; it is followed by a body that contains the majority of the data.
(2) String_ids: records the offset of each string in the data area.
(3) Type_ids: records the string index for each type.
(4) Proto_ids: records the declaration strings of methods, return type strings, and parameter lists.
(5) Field_ids: records, for each field, the class it belongs to, its type, and its name.
(6) Method_ids: records the class name, declaration, and name of each method, and other information.
(7) Class_defs: contains the type, inheritance hierarchy, access metadata, and other class metadata.
(8) Data: contains the data of each class.
4) Static resource files in the res folder mainly contain the pictures, audio, and layout files of the App.

B. ANDROID API AND ANDROID SDK
An Android API is an interface provided to developers to invoke system services, making App development more efficient and convenient. Here are a few Android APIs related to this paper.
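To make the dex layout listed in Section II-A more concrete, the following minimal Python sketch reads the fixed-size header that precedes the String_ids, Type_ids, Proto_ids, Field_ids, Method_ids, Class_defs, and Data sections. It is an illustration only and not part of VulArcher: the field order follows the public Dalvik executable format, while the helper names `parse_dex_header` and `make_demo_header` and the synthetic demo values are assumptions made for this example.

```python
import struct

# Little-endian layout of the 112-byte dex header: magic, Adler-32 checksum,
# SHA-1 signature, then twenty unsigned 32-bit size/offset fields.
HEADER_FORMAT = "<8sI20s20I"
FIELD_NAMES = [
    "file_size", "header_size", "endian_tag", "link_size", "link_off",
    "map_off", "string_ids_size", "string_ids_off", "type_ids_size",
    "type_ids_off", "proto_ids_size", "proto_ids_off", "field_ids_size",
    "field_ids_off", "method_ids_size", "method_ids_off",
    "class_defs_size", "class_defs_off", "data_size", "data_off",
]


def parse_dex_header(blob):
    """Return the header fields of a dex file as a dict (hypothetical helper)."""
    if len(blob) < struct.calcsize(HEADER_FORMAT):
        raise ValueError("input is shorter than a dex header")
    magic, checksum, signature, *values = struct.unpack_from(HEADER_FORMAT, blob)
    if not magic.startswith(b"dex\n"):
        raise ValueError("missing dex magic")
    header = dict(zip(FIELD_NAMES, values))
    header["magic"] = magic
    header["checksum"] = checksum
    return header


def make_demo_header():
    """Build a synthetic header (not a valid App) for demonstration only."""
    values = dict.fromkeys(FIELD_NAMES, 0)
    values.update(file_size=112, header_size=112, endian_tag=0x12345678,
                  string_ids_size=3, string_ids_off=112, method_ids_size=2)
    return struct.pack(HEADER_FORMAT, b"dex\n035\x00", 0, bytes(20),
                       *(values[name] for name in FIELD_NAMES))


header = parse_dex_header(make_demo_header())
print(header["string_ids_size"], header["method_ids_size"])
```

On a real App, the same function could be applied to the bytes of classes.dex; the size and offset pairs then tell a parser where each of the listed sections starts and how many entries it holds.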
TABLE 1. WebView APIs' names and their functions.
TABLE 4. Statistics of vulnerabilities in the 100 Apps ('ICV' refers to 'Improper certificate validation', 'WBCVV' refers to 'WebView bypass certificate validation vulnerability', 'WRCEV' refers to 'WebView remote code execution vulnerability' and 'ACOCDV' refers to 'Alibaba Cloud OSS credential disclosure vulnerability').
webpage. It also uses the addJavascriptInterface(Object object, String name) API so that JavaScript code in the loaded HTML page can call the object b through the string name a. However, this App does not restrict this exposure with any protection method, so the App has the vulnerability.

We install the App on an Android emulator and open a URL, such as https://www.yahoo.com, in its browser. As shown in Fig 6a, the App correctly displays the page at that URL. Via a MITM attack, we inject malicious JavaScript code into the webpage. The malicious JavaScript code executes the command "ls -al /mnt/sdcard/" to read file information on the sdcard. As shown in Fig 6b, the malicious JavaScript runs normally in the vulnerable App and reads the file information from the sdcard directory of the Android device. This indicates that the vulnerability exists and can be exploited in the App.

3) INSIGHT
When developers use the addJavascriptInterface(Object object, String name) API to call a web page, the functions that remove the risky interfaces, summarized in Table 5, should be added to limit the permissions of JavaScript code. They can eliminate the WRCEV in Apps.

C. CASE STUDY 2-ALIBABA CLOUD OSS CREDENTIAL DISCLOSURE VULNERABILITY (ACOCDV)
1) VULNERABILITY TO EXPLAIN
Alibaba Cloud Object Storage Service (OSS) is a secure and reliable cloud storage service provided by Alibaba. When the service is used in an Android App, the APIs provided by the SDK of the service need to be called in the App. However, developers often simply copy the official sample code without considering its security weaknesses. Fig 7 is the demo code provided by Alibaba Cloud. Attackers can obtain this set of credentials and then use the OSS management tool to easily get the data in the service. This results in information leakage and constitutes a serious security weakness.

FIGURE 7. The demo code provided by the documentation of Alibaba Cloud OSS.

2) VULNERABILITY POC
(1) File MD5: fea5226fdf7a1a1ed95d6c24a6e30f06
(2) Version: 4.37
(3) Exploit: To facilitate manual analysis and reading of the App's source code, we use the JEB [26] tool to reverse the App when manually verifying the vulnerability. Because the App is packed, we use our unpacking tool [27] to get the source code of the App. Fig 8 shows the source code obtained after unpacking the App.
Fig 9 shows the code in the App that contains sensitive information. In the onCreate method, we find that the App constructs and initializes the Alibaba Cloud storage service object. When constructing the object, this method directly passes the accessKeyId and accessKeySecret, and both of these sensitive values are hard-coded by the developer in the program, which exposes the sensitive information.
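The hard-coded credential pattern described in this case study can be illustrated with a small Python sketch that scans decompiled Java source text. It is a simplified stand-in and not the authors' detection logic: the regular expression, the function name `find_hardcoded_credentials`, and the sample snippets are assumptions for illustration, and only the class name OSSPlainTextAKSKCredentialProvider comes from the paper.

```python
import re
from typing import List, Tuple

# Matches a constructor call whose two arguments are both string literals.
HARDCODED_CALL = re.compile(
    r'new\s+OSSPlainTextAKSKCredentialProvider\s*\(\s*'
    r'"([^"]*)"\s*,\s*"([^"]*)"\s*\)'
)


def find_hardcoded_credentials(java_source: str) -> List[Tuple[str, str]]:
    """Return (keyId, keySecret) literal pairs found in the source text."""
    return HARDCODED_CALL.findall(java_source)


# Hypothetical decompiled snippets; the key values are placeholders.
vulnerable = (
    'OSSCredentialProvider p = '
    'new OSSPlainTextAKSKCredentialProvider("demoKeyId", "demoKeySecret");'
)
not_flagged = 'p = new OSSPlainTextAKSKCredentialProvider(keyId, keySecret);'

print(find_hardcoded_credentials(vulnerable))
print(find_hardcoded_credentials(not_flagged))
```

A literal match like this only covers the most direct form. When the arguments are variables, as in the second snippet, the values may still be hard-coded elsewhere, which is why the analysis in this paper traces the object that carries the sensitive information instead of relying on the call site alone.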
TABLE 6. Vulnerability characteristics with case study ('OM' refers to 'Overridden method', 'USS' refers to 'Use unsafe settings', 'DLSI' refers to 'Data leakage of sensitive information'; 'ICV' refers to 'Improper certificate validation', 'WBCVV' refers to 'WebView bypass certificate validation vulnerability', 'WRCEV' refers to 'WebView remote code execution vulnerability' and 'ACOCDV' refers to 'Alibaba Cloud OSS credential disclosure vulnerability').

IV. VulArcher-A TOOL TO DETECT VULNERABILITIES
A. SYSTEM OVERVIEW
As shown in Fig 10, the working process of VulArcher can be divided into the following steps:
1) DECOMPILATION
For an App, VulArcher decompresses it to obtain the classes.dex file, which contains the code of the App; the AndroidManifest.xml file, which contains component information and permissions; and the resource files. VulArcher reverses classes.dex based on Androguard [28]. Through this process, it obtains the classes and methods in the code of the App. It also decodes the AndroidManifest.xml file to obtain registered component information and configuration information.
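The decompression part of this step can be sketched with Python's standard zipfile module, since an App file is a ZIP archive. The sketch below only sorts archive entries into the groups named above; decoding the binary AndroidManifest.xml and disassembling classes.dex is left to a tool such as Androguard and is not shown. The function names and the in-memory demo archive are assumptions for illustration.

```python
import io
import zipfile


def split_apk_entries(apk_file):
    """Group the entries of an App archive by the role described in the text."""
    parts = {"manifest": [], "dex": [], "resources": [], "other": []}
    with zipfile.ZipFile(apk_file) as apk:
        for name in apk.namelist():
            if name == "AndroidManifest.xml":
                parts["manifest"].append(name)
            elif name.endswith(".dex") and "/" not in name:
                parts["dex"].append(name)
            elif name.startswith("res/"):
                parts["resources"].append(name)
            else:
                parts["other"].append(name)
    return parts


def make_demo_apk():
    """Create a tiny in-memory archive with placeholder contents."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as apk:
        apk.writestr("AndroidManifest.xml", b"placeholder")
        apk.writestr("classes.dex", b"placeholder")
        apk.writestr("res/layout/main.xml", b"placeholder")
        apk.writestr("META-INF/CERT.RSA", b"placeholder")
    buffer.seek(0)
    return buffer


print(split_apk_entries(make_demo_apk()))
```

The same function accepts a path to a real App file instead of the in-memory buffer.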
2) PACKER IDENTIFICATION
VulArcher recognizes whether an App is packed according to formulas (1) and (2). AC indicates the number of all components registered by the App, and CC indicates the number of those registered components whose classes can be found in classes.dex. Because a packed App hides almost all components and other logic code, classes.dex does not contain this information. Some packers leave signature files, e.g., libexe*.so and libexecma**.so. To further identify the packing method, VulArcher uses fingerprints to identify the detailed packer type and version of the App.

$$F = \frac{CC}{AC} \tag{1}$$

$$App = \begin{cases} \text{unpacked}, & \text{if } F \geq 0.8 \\ \text{packed}, & \text{otherwise} \end{cases} \tag{2}$$
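Formulas (1) and (2) can be sketched in a few lines of Python. The sketch assumes that the registered component names have already been read from AndroidManifest.xml and the class names from classes.dex; the function names and the sample class names are hypothetical, while the 0.8 threshold is the one given in formula (2).

```python
def packing_ratio(registered_components, dex_classes):
    """F = CC / AC: share of registered components present in classes.dex."""
    components = set(registered_components)
    if not components:
        raise ValueError("the App registers no components")
    found = components & set(dex_classes)
    return len(found) / len(components)


def is_packed(registered_components, dex_classes, threshold=0.8):
    """Formula (2): the App is treated as packed when F is below the threshold."""
    return packing_ratio(registered_components, dex_classes) < threshold


# Hypothetical component and class names.
components = ["a.Main", "a.Login", "a.Pay", "a.Sync", "a.Push"]
plain_dex = ["a.Main", "a.Login", "a.Pay", "a.Sync", "a.Util"]
stub_dex = ["com.packer.StubApplication", "a.Main"]

print(is_packed(components, plain_dex))
print(is_packed(components, stub_dex))
```

In the first case four of five components are found (F = 0.8), so the App counts as unpacked; in the second only one is found (F = 0.2), so it counts as packed.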
calling link and the context of it. Because some vulnerabilities are related to the Android environment, the API level required by an App is also one of the elements for identifying the vulnerability. We can combine the above elements to determine whether the vulnerability exists.

4) ALIBABA CLOUD OSS CREDENTIAL DISCLOSURE VULNERABILITY (ACOCDV)
For an App that uses the OSSPlainTextAKSKCredentialProvider API provided by the Alibaba OSS SDK to create credential information for the server, the secretKey needed by the method is hard-coded in the program. This causes an information disclosure vulnerability.
Data leakage of sensitive information (DLSI). The analysis and characteristics of this category of vulnerability are shown in TABLE 6. An App has an invoke method that uses sensitive information. The object holding the sensitive information is traced to determine whether it is hard-coded in the App. If it is hard-coded, the vulnerability exists.

3) UNPACKING
At present, more and more Apps are packed. Static analysis cannot directly get the real code of packed Apps, which causes static detection to fail on such Apps. Therefore, packed Apps should be analyzed after unpacking, which extracts the original code of the Apps. The unpacking method used in this system, DexX [27], is a result of our previous research. It can handle six packers: Ali [29], Baidu [30], Bangcle [31], Tencent [32], Qihoo 360 Mobile [33], and ijiami [34]. This paper does not elaborate further on the method.

4) BUILDING TAINT PATH
VulArcher creates a taint control flow graph (TCFG) of a classes.dex file; the algorithm is shown in Algorithm 1. VulArcher first creates a control flow graph (CFG) of the classes.dex. The set $V$ of $p$ nodes in the CFG is shown in formula (3). Each $v_k$ contains the package name ($pkg_k$), class name ($c_k$), and method ($m_k$). As shown in formula (4), the set $IA$ of $m$ interested APIs is a subset of $V$. For each $iA_j$, we follow the heuristic method to find its taint path $tp_j$, as shown in formula (5). $V'$ is a subset of $V$, and the points in $V'$ are related to the control flow of $iA_j$. If $v_k$ has a control flow to $v_w$,
FIGURE 10. Overview of VulArcher, showing the modules of the overall tool and the workflow ("Y" indicates that the App is packed and "N" indicates that it is unpacked).
we consider that there is a taint control flow relationship from $v_k$ to $v_w$, written $\langle v_k, v_w \rangle$, and we denote the set of all such relationships as $E'$. As shown in formula (6), the TCFG is composed of all $tp_j$ ($j = 1, \ldots, m$).

$$V = \{ v_k = (pkg_k, c_k, m_k) \mid k = 1, \ldots, p \} \tag{3}$$

$$IA = \{ iA_j = (pkg_j, c_j, m_j) \mid j = 1, \ldots, m, \text{ and } m \leq p \} \tag{4}$$

$$tp_j = \{ (V'_j, E'_j) \mid \langle v_{jk}, v_{jw} \rangle \in E'_j,\ V'_j = \{ v_{jk} \mid k = 1, \ldots, q \} \} \tag{5}$$

$$TCFG = \{ tp_j \mid j = 1, \ldots, m \} \tag{6}$$

Algorithm 1 The Algorithm of Building TCFG
1: Input: IA, CFG {IA is the set of interested APIs.}
2: Output: TCFG
3: INITIALIZE TCFG = ∅
4: for each iA_j ∈ IA do
5: tp_j ⇐ HeuPath(CFG, iA_j) {A heuristic method to find the taint path tp_j of iA_j.}
6: BuildG(TCFG, tp_j) {Building the TCFG with tp_j.}
7: end for
8: return TCFG

5) DETECTION
VulArcher uses the rules of the above vulnerabilities and the TCFG of an App generated in the Building Taint Path step to judge vulnerabilities. To detect vulnerabilities faster, we propose a heuristic vulnerability search algorithm, described in Algorithm 2, so that we avoid matching vulnerability rules against all App code. For each $tp_j$, VulArcher extracts its context slice ($cs_j$). $R$ is a set of vulnerability rules. The inputs to the function $f(\cdot)$ are $R$ and $CS$. If a rule $vr_i$ is matched in $CS$, the App is marked with the vulnerability that $vr_i$ represents.

$$CS = \{ cs_j \mid j = 1, \ldots, r \} \tag{7}$$

$$R = \{ vr_1, vr_2, \ldots, vr_n \} \tag{8}$$

$$f(R, CS) = \begin{cases} \text{Vul}, & \text{if } vr_i \text{ is matched in } CS \\ \text{notVul}, & \text{otherwise} \end{cases} \tag{9}$$

We only need to find interesting points directly in the paths of the vulnerabilities. In this way, the entire vulnerability search process is fast and accurate.

B. EXTRACT SUSPICIOUS CODE SEGMENT
An App contains many classes and methods; iterating through every method to search for vulnerabilities and weaknesses would certainly reduce detection efficiency. We therefore first build the TCFG of the App, recording the classes and methods that contain suspicious interesting paths. When looking for vulnerabilities, we use the heuristic-based approach described in Algorithm 2 to find out whether a vulnerability exists. The interestSet in the algorithm represents all sensitive APIs and methods in the App that may cause vulnerabilities. FR represents the set of rules for vulnerability fixes, and TR represents the set of rules that trigger the vulnerability. The algorithm first obtains a suspected vulnerable point in interestSet; then, in the already constructed TCFG, it constructs the relevant slice content of all objects and
variables in the point, that is, contextSlice. It uses a heuristic search algorithm to search in contextSlice. If a rule for repairing the vulnerability is found in the slice, the App does not have the vulnerability. Otherwise, if the search algorithm finds a rule for triggering the vulnerability in the slice, it records the complete path information and slice of the vulnerability of the App.

17: end if
18: end if
19: if TR ≠ ∅ then
20: if HeuSearch(contextSlice,TR) == True then
21: output(path,contextSlice)
22: else
23: continue
24: end if
25: end if
26: end for
27: return output(path,contextSlice)

C. VERIFY METHOD AND WORKFLOW
Nowadays there are more and more vulnerability detection tools. To a certain extent, they can detect some vulnerabilities and weaknesses, but they do not provide developers with a way to reproduce the vulnerabilities. This is why more and more vulnerabilities are ignored by developers; as a result, many Apps still have vulnerabilities that have been reported for a long time. In this paper, we provide semi-automated verification methods and workflows for the vulnerabilities and weaknesses we analyzed. This work can verify the detection results of VulArcher and provides readers with clear methods for verifying vulnerabilities.

1) A VERIFICATION METHOD FOR MAN-IN-MIDDLE ATTACK VULNERABILITY CAUSED BY WEAK CERTIFICATE VERIFICATION
Fig 11 depicts the attack architecture for this weakness. First, we set up the environment so that Burp Suite [35] intercepts all network traffic from the mobile device. We write a script based on Burp Suite that is mainly responsible for analyzing the traffic of the mobile device, extracting the Host and body, and sending them to the server. The web service module is mainly responsible for automatically installing Apps, collecting the results of the network traffic analysis, and automatically determining whether the App is vulnerable to MITM attacks.

2) A VERIFICATION METHOD FOR WRCEV
As shown in Fig 12, our verification method mainly injects malicious JavaScript code into the App's network traffic by MITM. We use Burp Suite to intercept all traffic sent by the mobile device and inject the prepared malicious JavaScript code into the response. If the JavaScript successfully executes the malicious code, the code sends a record to the web service to record that the App has the vulnerability.
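Looking back at the detection pipeline of this section, Algorithms 1 and 2 can be sketched together in Python. This is a simplified illustration under stated assumptions, not VulArcher's implementation: HeuPath is not specified in detail in the text, so it is replaced here by a plain backward reachability search over a call graph; rules are modeled as substrings; and all function names, the toy call graph, and the method bodies are hypothetical.

```python
from collections import deque


def heu_path(cfg, api):
    """Collect call edges leading to `api` (assumed stand-in for HeuPath)."""
    callers = {}
    for src, dsts in cfg.items():
        for dst in dsts:
            callers.setdefault(dst, set()).add(src)
    edges, seen, queue = set(), {api}, deque([api])
    while queue:
        node = queue.popleft()
        for src in callers.get(node, ()):
            edges.add((src, node))
            if src not in seen:
                seen.add(src)
                queue.append(src)
    return edges


def build_tcfg(cfg, interested_apis):
    """Algorithm 1: one taint path (edge set) per interested API."""
    return {api: heu_path(cfg, api) for api in interested_apis}


def detect(tcfg, method_bodies, fix_rules, trigger_rules):
    """Algorithm 2 idea: a fix rule clears a path, a trigger rule flags it."""
    findings = []
    for api, edges in tcfg.items():
        methods = sorted({src for src, _ in edges})
        context_slice = "\n".join(method_bodies.get(m, "") for m in methods)
        if any(rule in context_slice for rule in fix_rules):
            continue  # a repair pattern is present on the path
        if any(rule in context_slice for rule in trigger_rules):
            findings.append((api, methods))
    return findings


# Toy call graph: caller -> callees.
cfg = {
    "MainActivity.onCreate": ["WebHelper.setup"],
    "WebHelper.setup": ["WebView.addJavascriptInterface"],
    "Other.run": ["Log.d"],
}
bodies = {"WebHelper.setup": 'webView.addJavascriptInterface(bridge, "app");'}
tcfg = build_tcfg(cfg, ["WebView.addJavascriptInterface"])
print(detect(tcfg, bodies, ["removeJavascriptInterface"], ["addJavascriptInterface"]))
```

Only the methods on a taint path are inspected, which mirrors the motivation given above: rules are matched against the context slices of interesting paths instead of against all App code.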
TABLE 10. Summary table of all sample detection results ('P' refers to 'Probability of vulnerability (%)'; 'ICV' refers to 'Improper certificate validation', 'WBCVV' refers to 'WebView bypass certificate validation vulnerability', 'WRCEV' refers to 'WebView remote code execution vulnerability' and 'ACOCDV' refers to 'Alibaba Cloud OSS credential disclosure vulnerability').
FIGURE 14. Distribution of vulnerabilities in different categories of Apps ('ACOCDV' refers to 'Alibaba Cloud OSS credential disclosure vulnerability', 'WRCEV' refers to 'WebView remote code execution vulnerability', 'WBCVV' refers to 'WebView bypass certificate validation vulnerability' and 'ICV' refers to 'Improper certificate validation').
FIGURE 15. Distribution of vulnerabilities per month in 2018 ('ACOCDV' refers to 'Alibaba Cloud OSS credential disclosure vulnerability', 'WRCEV' refers to 'WebView remote code execution vulnerability', 'WBCVV' refers to 'WebView bypass certificate validation vulnerability' and 'ICV' refers to 'Improper certificate validation').
vulnerabilities have almost the same probability distribution in Apps of different sizes. Both WBCVV and ICV have a high percentage in Apps of different sizes.
Finding 5: The number and the type of vulnerabilities in Apps are independent of App size.
The relationship between vulnerabilities and categories. Fig 14 shows the probability of vulnerabilities for each category of Apps. ACOCDV does not exist in the browser Apps, because they do not need to use this service to store a large number of images. The percentage of WRCEV in these Apps is slightly higher than in the other two types of Apps because most of these Apps use the WebView component and many of them support the Android 4.4 version.
Finding 6: More and more Apps face the MITM vulnerability, because developers skip certificate validation for convenience. Certificate validation must therefore be done strictly during development.
Finding 7: Although the Alibaba OSS SDK has released new instructions to circumvent this credential information leak [40], the results in Fig 14 show that many Apps still have this vulnerability. This means that Apps should update the SDK version in a timely manner to fix the vulnerability.
The trend of the vulnerability. Fig 15 shows the trend of the probability of vulnerabilities for Apps in each month from January to August in 2018. It intuitively shows that there are vulnerabilities and weaknesses in the Apps released each month, and that the number of vulnerabilities does not decline over time.
Finding 8: As shown in Fig 15, the trend of vulnerabilities remains stable. This shows that various vulnerabilities still exist in Apps and that developers do not mitigate them.

C. VulArcher PERFORMANCE
Efficiency. VulArcher takes an average of one minute to analyze an App. However, if the analyzed App is packed, unpacking it takes an average of 50 seconds.
Computing cost. For an App, the memory consumption is 400 MB and the CPU utilization is 40%.
Scalability. VulArcher detects vulnerabilities based on vulnerability rules. Hence, for a new vulnerability, VulArcher only needs to know the vulnerability rule to detect it.

VI. DISCUSSIONS AND LIMITATIONS
We downloaded the Apps to analyze the four types of vulnerabilities, but the Apps are from two years ago. Ideally, we would use the latest data for analysis. However, compared with the previous Apps, the latest Apps support largely the same Android system versions, and the APIs used in them have not changed. Our findings are therefore not biased.
TABLE 11. Details of 98 Apps with WebView bypass certificate validation risk.

The vulnerability results detected by VulArcher cover both third-party components and the App itself. At present, we count both as vulnerabilities of the App. To distinguish between them, we can organize a third-party component library that contains information such as the package name of each component. The vulnerability results detected by VulArcher contain the package name and class name of each vulnerability, so we can look up the package name and class name in the component library to check whether the vulnerability belongs to the App itself or to a third-party component.

VII. RELATED WORK
The study of Android vulnerabilities can be divided into two aspects: one concerns Apps and the other concerns the Android OS.
Li et al. [14] conducted a detailed analysis of the vulnerabilities that occur in music Apps. The analyzed music Apps are mainly exposed to man-in-the-middle hijacking attacks when communicating with servers. Qian et al. [23] proposed a structure called the app property graph (APG) for Android vulnerabilities. Based on this feature representation method, they proposed a detection tool for five common vulnerabilities, but they did not give workflows for the verification of the vulnerabilities. Chen et al. [41] manually analyzed several OAuth providers for Apps and determined how differences between the App and browser environments lead to the OAuth vulnerability. Wu et al. [1] proposed a detection method for the confused deputy vulnerability in Android applications based on features of the AndroidManifest.xml file and the control flow graph (CFG). Fang et al. [5] studied an input validation vulnerability in Android inter-component communication. This kind of vulnerability is caused by incomplete security verification mechanisms implemented by App developers. Yang et al. [10] used a combination of static and dynamic methods to analyze the dynamic loading vulnerability of Apps. This kind of vulnerability is caused by insecure verification of the loaded executable file. Yang et al. [20] studied security issues caused by the hybrid postMessage Origin Stripping Vulnerability (OSV) in Apps. Ranganath and Mitra [42] proposed a detection tool supporting large-scale Android vulnerabilities, but they did not provide an analysis method for vulnerabilities. Farhang et al. [43] focused on how differently vendors deal with Android vulnerabilities.
Other studies explored vulnerabilities different from those we studied. Feng and Shin [44] analyzed some vulnerabilities related to Binder in Android, and proposed an automated tool for detecting this type of vulnerability from the analysis of
the causes of the vulnerabilities. Linares-Vásquez et al. [45] analyzed Android system vulnerabilities by using the Bulletin resource. Zhang et al. [21] mainly studied Android system-level vulnerabilities. Through analysis rules and results, they proposed a detection tool for this type of vulnerability.

TABLE 13. Details of 20 Apps with Alibaba Cloud OSS certificate information leakage vulnerability.
TABLE 14. Details of 28 Apps with WebView remote code execution vulnerability.
VIII. CONCLUSIONS
We deeply analyzed four kinds of vulnerabilities on a data set of more than 6000 Android Apps: ACOCDV, WRCEV, WBCVV, and ICV. We found that WebView and HTTPS are used in most finance and shopping Apps. Because of this, attackers can easily perform MITM attacks. As for WRCEV, which has been disclosed for a long time, it still exists in browser Apps, because developers lack methods for vulnerability identification and awareness of secure development. For Apps that use Alibaba Cloud OSS services, the information leakage vulnerability of Alibaba Cloud OSS credentials is caused by the weak security awareness of developers. We developed a vulnerability detection tool, VulArcher. The experiments show that it can automatically detect the above vulnerabilities and has good scalability. One of the vulnerabilities that VulArcher detected has been included in the China National Vulnerability Database (CNVD) as CNVD-2017-23282. VulArcher analyzed more than 6000 Apps and found nearly 3000 Apps with the above vulnerabilities, and we have manually verified the accuracy of the results on nearly 300 Apps. The source code of VulArcher and the sample data presented in this paper can be used by other researchers. We encourage this by making our data publicly available [46].

APPENDIX
Table 13 gives the attack results for the "Alibaba Cloud OSS credential disclosure vulnerability". Table 14 gives the attack results for the "WebView remote code execution vulnerability". Since there are no more than 100 Apps for each of the above two types of vulnerabilities, we analyzed all of them. Table 11 gives a detailed description of 98 randomly selected Apps that were detected as having the "WebView bypass certificate validation vulnerability". As Table 12 shows, we selected 99 Apps that were tagged as "Improper certificate validation", and we recorded for each App whether or not it can be attacked.

REFERENCES
[1] J. Wu, T. Cui, T. Ban, S. Guo, and L. Cui, "Paddyfrog: Systematically detecting confused deputy vulnerability in Android applications," Secur. Commun. Netw., vol. 8, no. 13, pp. 2338–2349, 2015.
[2] A. Hovsepyan, R. Scandariato, W. Joosen, and J. Walden, "Software vulnerability prediction using text analysis techniques," in Proc. 4th Int. Workshop Secur. Meas. Metrics, 2012, pp. 7–10.
[3] R. Scandariato, J. Walden, A. Hovsepyan, and W. Joosen, "Predicting vulnerable software components via text mining," IEEE Trans. Softw. Eng., vol. 40, no. 10, pp. 993–1006, Oct. 2014.
[4] S. Ma, F. Thung, D. Lo, C. Sun, and R. H. Deng, "VuRLE: Automatic vulnerability detection and repair by learning from examples," in ESORICS. Springer, 2017, pp. 229–246.
[5] Z. Fang, Q. Liu, Y. Zhang, K. Wang, and Z. Wang, "IVDroid: Static detec-
[8] W. Diao, Y. Zhang, L. Zhang, Z. Li, F. Xu, X. Pan, X. Liu, J. Weng, K. Zhang, and X. Wang, "Kindness is a risky business: On the usage of the accessibility APIs in Android," in Proc. 22nd Int. Symp. Res. Attacks, Intrusions Defenses, 2019, pp. 261–275.
[9] A. Amin, A. Eldessouki, M. T. Magdy, N. Abdeen, H. Hindy, and I. Hegazy, "Androshield: Automated Android applications vulnerability detection, a hybrid static and dynamic analysis approach," Information, vol. 10, no. 10, p. 326, 2019.
[10] T. Yang, H. Cui, and S. Niu, "Dynamic loading vulnerability detection for Android applications through ensemble learning," Chin. J. Electron., vol. 26, no. 5, pp. 960–965, 2017.
[11] S. Wei, Q. Huang, and J. Huang, "Understanding javascript vulnerabilities in large real-world Android applications," IEEE Trans. Dependable Secure Comput., early access, Jun. 11, 2018, doi: 10.1109/TDSC.2018.2845851.
[12] P. Mutchler, A. Doupé, J. Mitchell, C. Kruegel, and G. Vigna, "A large-scale study of mobile Web app security," in Proc. Mobile Secur. Technol. Workshop (MoST), 2015, pp. 1–8.
[13] E. Chin and D. Wagner, "Bifocals: Analyzing Web view vulnerabilities in Android applications," in Information Security Applications, Y. Kim, H. Lee, and A. Perrig, Eds. Cham, Switzerland: Springer, 2014, pp. 138–159.
[14] H. Li, L. Qian, S. Zhang, H. Zhang, and J. Liu, "Data leakage between C/A communication: A case study on Android music app," in Proc. Int. Conf. Wireless Commun. Signal Process., 2017, pp. 1–7.
[15] Z. Li, D. Zou, S. Xu, H. Jin, H. Qi, and J. Hu, "Vulpecker: An automated vulnerability detection system based on code similarity analysis," in Proc. 32nd Annu. Conf. Comput. Secur. Appl., 2016, pp. 201–213.
[16] Z. Li, D. Zou, S. Xu, X. Ou, H. Jin, S. Wang, Z. Deng, and Y. Zhong, "Vuldeepecker: A deep learning-based system for vulnerability detection," in Proc. NDSS, 2018, pp. 1–8.
[17] D. Mu, A. Cuevas, L. Yang, H. Hu, X. Xing, B. Mao, and G. Wang, "Understanding the reproducibility of crowd-reported security vulnerabilities," in Proc. 27th USENIX Secur. Symp., Baltimore, MD, USA, 2018, pp. 919–936.
[18] L. Luo, Q. Zeng, C. Cao, K. Chen, J. Liu, L. Liu, N. Gao, M. Yang, X. Xing, and P. Liu, "Tainting-assisted and context-migrated symbolic execution of Android framework for vulnerability discovery and exploit generation," IEEE Trans. Mobile Comput., early access, Aug. 20, 2019, doi: 10.1109/TMC.2019.2936561.
[19] D. Wu, D. Gao, K. T. Eric Cheng, Y. Cao, J. Jiang, and H. R. Deng, "Towards understanding Android system vulnerabilities: Techniques and insights," in Proc. ACM Asia Conf. Comput. Commun. Secur., New York, NY, USA, 2019, pp. 295–306.
[20] G. Yang, J. Huang, G. Gu, and A. Mendoza, "Study and mitigation of origin stripping vulnerabilities in hybrid-postmessage enabled mobile applications," in Proc. IEEE Symp. Secur. Privacy (SP), Oct. 2018, pp. 742–755.
[21] J. Zhang, Y. Yao, X. Li, J. Xie, and G. Wu, "An Android vulnerability detection system," in Network and System Security, Z. Yan, R. Molva, W. Mazurczyk, R. Kantola, Eds. Cham, Switzerland: Springer, 2017, pp. 169–183.
[22] F. Yamaguchi, N. Golde, D. Arp, and K. Rieck, "Modeling and discovering vulnerabilities with code property graphs," in Proc. IEEE Symp. Secur. Privacy, May 2014, pp. 590–604.
[23] C. Qian, X. Luo, Y. Le, and G. Gu, "VulHunter: Toward discovering vulnerabilities in Android applications," IEEE Micro, vol. 35, no. 1, pp. 44–53, Jan. 2015.
[24] S. M. Kerner. Mobile Internet Traffic Growing Fast. Accessed: May 10, 2019. [Online]. Available: http://www.enterprisenetworkingplanet.com/netsp/mobile-internet-traffic-growing-fast.html
[25] B. Popper. Google Announces. Accessed: Jun. 4, 2018. [Online]. Available: https://www.theverge.com/2017/5/17/15654454/android-reaches2-billion-monthly-active-users
[26] PNF. Accessed: Jul. 8, 2018. [Online]. Available: https://www.
tion for input validation vulnerability in Android inter-component com- pnfsoftware.com/
munication,’’ in Information Security Practice and Experience. Springer, [27] C. Sun, H. Zhang, S. Qin, N. He, J. Qin, and H. Pan, ‘‘Dexx: A double layer
2015, pp. 378–392. unpacking framework for android,’’ IEEE Access, vol. 6, pp. 61267–61276,
[6] F. Wu, J. Wang, J. Liu, and W. Wang, ‘‘Vulnerability detection with 2018.
deep learning,’’ in Proc. 3rd IEEE Int. Conf. Comput. Commun. (ICCC), [28] Androgurd. Accessed: May 6, 2018. [Online]. Available: https://github.
Feb. 2017, pp. 1298–1302. com/androguard/androguard
[7] G. Lin, J. Zhang, W. Luo, L. Pan, Y. Xiang, O. De Vel, and P. Montague, [29] Alibaba. Accessed: Jun. 8, 2018. [Online]. Available: http://jaq.
‘‘Cross-project transfer representation learning for vulnerable function alibaba.com/
discovery,’’ IEEE Trans. Ind. Informat., vol. 14, no. 7, pp. 3289–3297, [30] Baidu. Accessed: Jun. 8, 2018. [Online]. Available: http://apkprotect.
Jul. 2018. baidu.com/
[31] Bangcle. Accessed: Jun. 8, 2018. [Online]. Available: http://www.bangcle.com
[32] Tencent. Accessed: Jun. 8, 2018. [Online]. Available: http://www.qcloud.com/product/product.php?item=appup
[33] Qihoo360. Accessed: Jun. 8, 2018. [Online]. Available: http://dev.360.cn/protect/welcome
[34] Ijiami. Accessed: Jun. 8, 2018. [Online]. Available: http://www.ijiami.cn
[35] Burpsuite. Accessed: Jul. 18, 2018. [Online]. Available: https://portswigger.net/burp/
[36] Baidu. Accessed: Apr. 10, 2018. [Online]. Available: http://pcappstore.baidu.com/en/index.php
[37] Wandoujia. Accessed: Apr. 10, 2018. [Online]. Available: https://www.wandoujia.com/
[38] Qihoo360. Accessed: Apr. 10, 2018. [Online]. Available: http://zhushou.360.cn/
[39] Huawei. Accessed: Apr. 10, 2018. [Online]. Available: http://app.hicloud.com/
[40] Alibaba. Accessed: Feb. 2, 2020. [Online]. Available: https://help.aliyun.com/knowledge_detail/66168.html
[41] E. Chen, Y. Pei, S. Chen, Y. Tian, R. Kotcher, and P. Tague, ‘‘OAuth demystified for mobile application developers,’’ in Proc. ACM Conf. Comput. Commun. Secur. (CCS), Nov. 2014, pp. 892–903.
[42] V.-P. Ranganath and J. Mitra, ‘‘Are free Android app security analysis tools effective in detecting known vulnerabilities?’’ Empirical Softw. Eng., vol. 25, no. 1, pp. 178–219, 2020.
[43] S. Farhang, M. Bahadir Kirdan, A. Laszka, and J. Grossklags, ‘‘An empirical study of Android security bulletins in different vendors,’’ 2020, arXiv:2002.09629. [Online]. Available: http://arxiv.org/abs/2002.09629
[44] H. Feng and G. Kang Shin, ‘‘Understanding and defending the binder attack surface in Android,’’ in Proc. 32nd Annu. Conf. Comput. Secur. Appl., New York, NY, USA, 2016, pp. 398–409.
[45] M. Linares-Vásquez, G. Bavota, and C. Escobar-Velásquez, ‘‘An empirical study on Android-related vulnerabilities,’’ in Proc. 14th Int. Conf. Mining Softw. Repositories, Oct. 2017, pp. 2–13.
[46] H. Zhang and J. Qin. (2019). Experimental Material. [Online]. Available: https://buptnsrclab.github.io/blog/2020/01/03/vularcher-site-launched

JIAWEI QIN received the B.S. degree in computer science from Shenyang Aerospace University, China, in 2015. He is currently pursuing the Ph.D. degree in computer science with the Beijing University of Posts and Telecommunications, China. His research interests include malware analysis, and security of mobile systems and Apps.

JING GUO received the B.S. degree from the Beijing University of Posts and Telecommunications (BUPT), in 2004, and the Ph.D. degree in information and communication system from Tsinghua University, in 2011. She is currently a Senior Engineer with National Internet Emergency Center. Her research interests include network security situation awareness, protection of critical information infrastructure, and the IoT security.

SENMIAO WANG received the B.Eng. degree in software engineering from the Dalian University of Technology, in 2017. She is currently pursuing the Ph.D. degree with the State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, China. Her current research interests include mobile security and machine learning.

QIAOYAN WEN received the B.S. and M.S. degrees in mathematics from Shaanxi Normal University, Xi’an, China, in 1981 and 1984, respectively, and the Ph.D. degree in cryptography from Xidian University, Xi’an, in 1997. She is currently a Professor with the Beijing University of Posts and Telecommunications. Her current research interests include coding theory, cryptography, information security, internet security, and applied mathematics.