Tracking Protection for Android’s WebView
Posted on May 10, 2017 by Andrzej Hunt

Unlike iOS (really just Safari), Android has no content blocking API. Tracking protection is
available in some browsers, e.g. Firefox in combination with addons (and also in Firefox’s
private browsing which includes tracking protection enabled by default). For fun, we
decided to look into whether it’s possible to provide Tracking Protection when using
Android’s default WebView implementation. This blog post describes how that was done,
and explores some of the implementation details of our URL matching algorithm.

It turns out that Firefox Focus on iOS also had to build their own URL matching
implementation: iOS content blocking is currently only available in Safari, and not in the iOS
WebView equivalent. That implementation was influenced by the design of iOS’s content
blocking APIs and file formats, but when you’re not subject to that restriction it’s possible to
build a faster approach, so my ignorance of that version wasn’t necessarily a bad thing, as
I’ll describe later in this post.

Why would you want to do this? One reason is that browser engines are large – and we
wanted to see whether it’s possible to build a privacy focused browser whose size
measures in megabytes instead of tens of megabytes – which would require reusing
whatever engine the platform provides (in the case of iOS you actually have no choice in
the matter; fortunately Android is a little more free). There are actually some drawbacks to
using platform-provided browser engines – which will be the topic of a future post – but it’s
certainly possible to implement tracking protection on top of Android’s WebView.

Tracking Protection Lists

Firefox and Focus use the Disconnect tracking protection lists: these are lists of domains
hosting trackers that should be blocked, categorised by tracker type, e.g. Social trackers,
Analytics Trackers, Advertising Trackers, etc. Further to this there’s an override “entity” list,
which unblocks domains that are owned by a given company whenever you are browsing
a site owned by that company. (E.g. if FooBar Tracker Corp owns both foo.com and
bar.com, we would allow loading of resources from bar.com while browsing foo.com,
even though we’d block all other sites from loading resources from foo.com and
bar.com.) You can read more about these lists at the repo where the Mozilla copies of
these lists are maintained.

As such, tracking protection is fairly simple: every time a given webpage requests a
resource, we match the resource URL’s host against the blocklist. If it’s blocked, we check
the entitylist to verify whether there’s an override in place for the current site. Android’s
WebView provides a callback that is called every time it wants to load a resource, allowing
you to override resource loading.
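
The check described above can be sketched in plain Java. This is only an illustration: the class name, the method name and the use of hash sets are assumptions made for this example, and it matches hosts exactly rather than handling subdomains. On Android, a method like this would presumably be called from `WebViewClient.shouldInterceptRequest`, which is left out here because it needs the Android SDK.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical helper: names and data layout are illustrative, not taken from Focus.
final class BlockDecision {
    private final Set<String> blockedHosts;                  // hosts from the blocklist
    private final Map<String, Set<String>> entityOverrides;  // page host -> hosts allowed on that page

    BlockDecision(Set<String> blockedHosts, Map<String, Set<String>> entityOverrides) {
        this.blockedHosts = blockedHosts;
        this.entityOverrides = entityOverrides;
    }

    /** Returns true if a resource from resourceHost should be blocked while browsing pageHost. */
    boolean shouldBlock(String pageHost, String resourceHost) {
        if (!blockedHosts.contains(resourceHost)) {
            return false; // not a known tracker, load as usual
        }
        Set<String> allowed = entityOverrides.get(pageHost);
        // Blocked unless the entity list has an override for the current site.
        return allowed == null || !allowed.contains(resourceHost);
    }
}
```

With foo.com and bar.com on the blocklist and an override entry for foo.com that allows both, `shouldBlock("foo.com", "bar.com")` is false while `shouldBlock("example.org", "bar.com")` is true.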

The iOS content blocking API actually allows for regex-based matching on the entire
resource URL, which is more complex than what we needed for basic tracking protection.
The Disconnect lists only work using domains/hosts, which simplifies the implementation
somewhat. Focus on iOS originally only supported the content blocking API, and added
the browser later – the browser implementation therefore simply reused the same bundled
list format. The content blocking lists aren’t used for iOS’s WebView equivalent, although
that is apparently changing.

Implementing URL matching

The simple (but not particularly efficient) method would be to iterate over the list of hosts
every time a resource is fetched. In fact, we could just iterate over the regexes in the iOS
content blocking lists, and check those directly to avoid implementing our own matching.

The original Android implementation was actually a rushed afternoon (or two) hacky proof
of concept from our December All Hands – it turned out to be robust and fast enough, so it
was kept beyond that time. It might be possible to build an even faster implementation, but
this one hasn’t provoked any user complaints yet.

As mentioned, iterating over the list of blocked hosts is expensive: O(nh), where n is the
number of blocked hosts (very large) and h is the host length (small). Fortunately at
some point or another I had learned about Tries (contrary to what some might assume, an
Information and Computer Engineering degree at my alma mater doesn’t actually involve
any Data Structures and Algorithms – but that’s nothing a little independent study can’t
quickly fix).
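
To make the O(nh) cost concrete, here is a minimal sketch of the linear scan. The class name is made up for this example, and it only does exact host comparison.

```java
import java.util.List;

// Illustrative baseline only; not the Focus implementation.
final class LinearHostMatcher {
    private final List<String> blockedHosts;

    LinearHostMatcher(List<String> blockedHosts) {
        this.blockedHosts = blockedHosts;
    }

    boolean isBlocked(String host) {
        for (String blocked : blockedHosts) {   // up to n iterations
            if (host.equals(blocked)) {         // up to h character comparisons each
                return true;
            }
        }
        return false;
    }
}
```

Every resource load pays for a walk over the whole list in the common case where the host is not blocked at all.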

Tries offer much smaller memory consumption (not that memory consumption is
particularly significant compared to what a web engine will need), and much faster lookup,
O(h).

[Figure: a trie containing multiple domains.]

(In reality, the Trie possibly consumes more memory because of the overhead of each
node being an object. More efficient representations are available in order to avoid one
node per character, but that didn’t seem worthwhile given that this implementation is
already performant enough.)
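
A minimal sketch of such a trie, assuming one node per character and hosts inserted from their last character to their first. The class and method names are hypothetical, and the lookup already includes the label-boundary check needed for subdomain matching.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; class and method names are assumptions, not the Focus source.
final class HostTrie {
    private final Map<Character, HostTrie> children = new HashMap<>();
    private boolean terminal; // true if a blocked host ends at this node

    /** Inserts a host, walking it from the last character to the first. */
    void insert(String host) {
        HostTrie node = this;
        for (int i = host.length() - 1; i >= 0; i--) {
            node = node.children.computeIfAbsent(host.charAt(i), c -> new HostTrie());
        }
        node.terminal = true;
    }

    /** Returns true if host, or a parent domain of host, was inserted. O(h) in the host length. */
    boolean matches(String host) {
        HostTrie node = this;
        for (int i = host.length() - 1; i >= 0; i--) {
            node = node.children.get(host.charAt(i));
            if (node == null) {
                return false;
            }
            // A blocked host ends here: accept only at the start of the string or on a '.' boundary.
            if (node.terminal && (i == 0 || host.charAt(i - 1) == '.')) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        HostTrie trie = new HostTrie();
        trie.insert("bar.com");
        System.out.println(trie.matches("foo.bar.com")); // prints "true"
        System.out.println(trie.matches("foobar.com"));  // prints "false"
    }
}
```

The lookup cost depends only on the length of the host being checked, not on how many hosts were inserted.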

There’s still a bunch of overhead in various places: we’re using the Android/Java URL
classes to extract the hostname from the resource URL, which could well be more costly
than the actual act of searching the tree. I haven’t measured in detail yet.

(Building this completed the bi-yearly cycle of proper Data Structures and
Algorithms construction – I’d last been able to build some trees for a bookmarks folder UI
the preceding summer.)

As mentioned above, there’s also the entitylist: this consists of sets of hosts (A), for which
another set of hosts (B) is whitelisted (usually those sets would be the same, but that isn’t
guaranteed or necessary). This is simply an extension of the same tree: the set of
whitelisted domains (B) is another Trie. That Trie is then attached to every node
representing one of the hosts in set (A) – we simply extend the default Node to
have a WhitelistNode, which has a reference to the whitelisted-domains Trie.
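
The following sketch shows one way this could look. It is an assumption-laden illustration: a nullable `whitelist` field stands in for the WhitelistNode subclass, all names are invented for the example, and a host that appears in two entities would simply keep the whitelist of the last one added.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a nullable field replaces the post's WhitelistNode subclass.
final class EntityNode {
    final Map<Character, EntityNode> children = new HashMap<>();
    boolean terminal;     // a host ends at this node
    EntityNode whitelist; // trie of hosts (B) allowed while browsing this host (A); null if none

    /** Inserts a host back to front and returns the node where it ends. */
    EntityNode insert(String host) {
        EntityNode node = this;
        for (int i = host.length() - 1; i >= 0; i--) {
            node = node.children.computeIfAbsent(host.charAt(i), c -> new EntityNode());
        }
        node.terminal = true;
        return node;
    }

    /** Returns the deepest terminal node matching host on a label boundary, or null. */
    EntityNode find(String host) {
        EntityNode node = this;
        EntityNode found = null;
        for (int i = host.length() - 1; i >= 0; i--) {
            node = node.children.get(host.charAt(i));
            if (node == null) {
                break;
            }
            if (node.terminal && (i == 0 || host.charAt(i - 1) == '.')) {
                found = node;
            }
        }
        return found;
    }
}

final class EntityList {
    private final EntityNode sites = new EntityNode();

    /** Registers one entity: while browsing any of siteHosts, all of allowedHosts are permitted. */
    void addEntity(String[] siteHosts, String[] allowedHosts) {
        EntityNode allowed = new EntityNode();
        for (String h : allowedHosts) {
            allowed.insert(h);
        }
        for (String h : siteHosts) {
            sites.insert(h).whitelist = allowed; // one whitelist trie shared by every site host
        }
    }

    boolean isWhitelisted(String pageHost, String resourceHost) {
        EntityNode site = sites.find(pageHost);
        return site != null && site.whitelist != null && site.whitelist.find(resourceHost) != null;
    }
}
```

Sharing a single whitelist trie between all hosts in set (A) keeps the extra memory per entity small.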

Every real project needs its own String implementation

Searching and inserting into our hostname tries involves walking strings backwards. That
would require either some annoying index arithmetic, or reversing the String before
insertion/search (i.e. creating a copy of the String). Neither of those sounded like fun, so I
decided to add a String wrapper. This is arguably completely unnecessary, but made
things a little simpler (and perhaps more efficient). The String wrapper also meant that the
Trie implementation didn’t need to have much knowledge about subdomains either: we
can just start at the start of our reversed String. (Because we need to correctly match
subdomains, but not other domains, the Trie still needs to be aware of full stop being used
for domain separation, so it isn’t completely domain agnostic.)

We only need to access the String character by character, which is why we can avoid a
complete string copy/reversal – if this weren’t the case, there would be little value in a
wrapper.

The wrapper takes care of index arithmetic for reversed strings – and implements support
for getChar(int) and substring(int). That’s pretty much all there was to FocusString. (I no
longer need to miss the amazing days of many C++ string classes…)
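
As a rough sketch of the idea (not the actual FocusString, whose code is not shown in this post), a wrapper can expose a reversed view through index arithmetic alone. The class name and the `charAt` method name are assumptions for this example.

```java
// Illustrative reversed view over a String; names are hypothetical.
final class ReversedString {
    private final String s;

    ReversedString(String s) {
        this.s = s;
    }

    int length() {
        return s.length();
    }

    /** Character at position i of the reversed view; no reversed copy is created. */
    char charAt(int i) {
        return s.charAt(s.length() - 1 - i);
    }

    /** Reversed view without its first `start` characters (the original's last `start`). */
    ReversedString substring(int start) {
        // This version relies on String.substring for the remaining prefix.
        return new ReversedString(s.substring(0, s.length() - start));
    }
}
```

For "foo.com" the view reads m, o, c, ., o, o, f, so a trie walk can consume the host from its top-level domain inwards.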

substring() copies…
Somewhat naively, I’d assumed that our Java implementation doesn’t create a copy when
calling String.substring() – in other words that it would just adjust internal indexes while
reusing the same String buffer and/or equivalent behaviour. Without that assumption, there
would be little point in avoiding a String copy on reversal, since – thanks to our recursive
Trie traversal – we’d be creating copies when traversing that Trie.

It turns out that assumption was wrong: it was true for Java 6, and also for earlier versions
of Java 7 – before changing in Java 7u6. I don’t really know where Android’s
implementation originates, but it also creates copies. Thus, FocusString was expanded to
include offsets, and FocusString.substring() merely fiddles those offsets.
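
An offset-based variant might look like the sketch below. Again this is an assumption about the shape of such a class, with invented names: the underlying String is shared and `substring` only moves an end offset.

```java
// Illustrative only: reversed view whose substring() shares the underlying String.
final class OffsetReversedString {
    private final String s;
    private final int end; // exclusive end in s; characters [0, end) are visible

    OffsetReversedString(String s) {
        this(s, s.length());
    }

    private OffsetReversedString(String s, int end) {
        this.s = s;
        this.end = end;
    }

    int length() {
        return end;
    }

    /** Character at position i of the reversed view. */
    char charAt(int i) {
        if (i < 0 || i >= end) {
            throw new StringIndexOutOfBoundsException(i);
        }
        return s.charAt(end - 1 - i);
    }

    /** Drops the first `start` characters of the reversed view without copying any characters. */
    OffsetReversedString substring(int start) {
        if (start < 0 || start > end) {
            throw new StringIndexOutOfBoundsException(start);
        }
        return new OffsetReversedString(s, end - start);
    }
}
```

Each `substring` call still allocates a small wrapper object, but no character data is copied, which is what matters for a recursive traversal that takes a substring at every step.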

It was hard to predict what the impact of this change might be in advance, since I didn’t
have much experience in this area – I discovered that it was actually a noticeable
improvement: on my fairly modern Nexus 6P, average URL matching time dropped by
about 20% – from approximately 1.2ms to 1.0ms. (These numbers are for debug builds with
code coverage enabled; for coverage-free debug builds the time drops from 0.42ms to 0.26ms,
which is even more significant.) We already had tests in place which helped verify that
things wouldn’t break, so this was a fairly low risk change (I did use this as an opportunity
to extend those tests though).

Results
As mentioned above, the iOS equivalent implementation is a lot simpler. It iterates over the
lists of hosts, and does regex matching for each host. I decided to port that implementation
to Android, primarily to check for consistency of results. Fortunately the Trie based
implementation was mostly correct, except for our subdomain matching. Both bar.com
and foo.bar.com should be blocked if bar.com is in the blocklist. My Trie based
implementation also blocked foobar.com. Ooops. That was a quick fix, albeit one which
required making the Trie search implementation hostname aware. Other than that, results
have been the same in our testing.
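
The subdomain bug is easiest to see on plain strings. The two hypothetical helpers below contrast a bare suffix match with a hostname-aware one. They only illustrate the rule, not the Trie fix itself.

```java
// Illustrative helpers; names are made up for this example.
final class SuffixMatch {
    private SuffixMatch() {}

    /** Buggy rule: any host ending in the blocked host matches, including "foobar.com" for "bar.com". */
    static boolean naiveMatches(String host, String blocked) {
        return host.endsWith(blocked);
    }

    /** Hostname-aware rule: exact match, or the blocked host preceded by a '.' label separator. */
    static boolean hostAwareMatches(String host, String blocked) {
        return host.equals(blocked) || host.endsWith("." + blocked);
    }
}
```

With `bar.com` on the blocklist, both helpers accept `bar.com` and `foo.bar.com`, but only the naive one accepts `foobar.com`.
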
These parallel implementations allowed for performance comparisons. (Note: the
underlying regex and other library implementations on each platform might be different, so
the results could be very different if both algorithms were running on an
iPhone.) On my N6P, the Trie based implementation took an average of 0.3ms per
resource URL check, whereas the ported iterative/regex approach took 42ms. Some pages like to
load a lot of resources – so that’s a difference you’d notice quickly. It’s possible that my
ported implementation was suboptimal, but it’s certainly clear that the Trie based approach
was worth it from a performance perspective.

To be fair, this implementation did take more work – and you have to remember that the
iOS implementation was influenced by the blocklist file format that iOS uses for its tracking
protection API, whereas the Android version was a clean-sheet design.

Edits:

Trie Diagram corrected on 10th May 2017, thank you to Gervase Markham for spotting the
mistake.

Tagged with: android, firefox, focus, mozilla
Posted in Firefox for Android, Mozilla
2 comments on “Tracking Protection for Android’s WebView”

Gervase Markham says:

May 11, 2017 at 09:24

Tiny bug in diagram – u and k should be reversed.

Andrzej Hunt says:

May 12, 2017 at 05:56

Well spotted! Ooops… and Thanks!
