Third-party cookies alternatives | by Rafał Rybnik | DataDrivenInvestor
PART ONE
Third-party cookies alternatives
Sep 23, 2020
Cookie bakery
The modern Web is a cookie bakery. Advertisers use cookies to serve creepy ads.
Website owners use cookies to measure their audience. Developers use cookies to
store user settings.
Gosh, even those cookie information pop-ups are cookie-based (how else would a
website know you have already seen one?).
Video fragment of asdfmovie5
Since the invention of the mighty cookie, the whole industry has relied on the ability to
track website users. Third-party cookies are the most popular mechanism used
by advertisers to identify a user across domains.
So it is no wonder that this mechanism is under discussion because of privacy concerns.
Google cookie monster
In August 2019, Chrome's team announced that it would drop third-party
cookies from the browser within 2 years. Chrome is the last major browser to start
restricting third-party cookies.
Image by Author
Although Safari and Firefox already have built-in solutions for reducing cross-site
tracking, Chrome holds the majority of the browser market, so its decision has a
correspondingly bigger impact on the industry.
Quest for alternative
However, there are still alternatives that comply with web standards. To find them,
let's search web standards documentation for “privacy concerns”.
In this part, I focus on storage-based mechanisms and web caches.
Storage-based
* Local Storage
* Session Storage
* IndexedDB
Web caches
* Embedding identifiers in cached documents
* Loading performance tests
* ETag & Last-Modified
In the next parts, I will survey more advanced methods, like fingerprinting and
clickjacking.
Table of possible tracking mechanisms (based on https://arxiv.org/pdf/1507.07872.pdf, modified by
Author)
Storage for the ui
Let's consider this use case.
You are a publisher who owns three websites on different domains. You need to
implement a mechanism for identifying users across them (for example, to
determine how many users have visited each site at least once).
As none of your websites has sign-in functionality, you have to go for some kind of
anonymous identifier.
For example, when a user visits one of your websites for the first time, you'll
generate a UUIDv4 and save it on the user’s computer using all available methods.
Image by Author
Then, each of your websites should be able to retrieve this identifier each time the
user visits it.
The recovered ID should be the same, regardless of the website that fetched it.
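The use case above can be sketched in a few lines of JavaScript. This is only a minimal illustration, not a production implementation: the function names, the `visitor_id` key, and the idea of passing in a list of stores are all assumptions made for the example. Any object exposing `getItem`/`setItem` works as a store, which in a browser could be `window.localStorage` and `window.sessionStorage`.

```javascript
// Generate an RFC 4122 version-4 UUID. Uses the built-in
// crypto.randomUUID() when the runtime provides it.
function generateUuidV4() {
  if (typeof crypto !== 'undefined' && typeof crypto.randomUUID === 'function') {
    return crypto.randomUUID();
  }
  // Fallback (assumption: Math.random() is acceptable for an anonymous ID).
  const bytes = Array.from({ length: 16 }, () => Math.floor(Math.random() * 256));
  bytes[6] = (bytes[6] & 0x0f) | 0x40; // set the version field to 4
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // set the RFC 4122 variant bits
  const hex = bytes.map((b) => b.toString(16).padStart(2, '0')).join('');
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join('-');
}

// Look for an existing ID in every store; create one if none is found.
// The ID is then written to all stores so each of them holds a copy.
function getOrCreateVisitorId(stores, key = 'visitor_id') {
  let id = null;
  for (const store of stores) {
    id = store.getItem(key);
    if (id) break;
  }
  if (!id) id = generateUuidV4();
  for (const store of stores) store.setItem(key, id);
  return id;
}
```

In a browser, a call could look like `getOrCreateVisitorId([localStorage, sessionStorage])`. Because the ID is copied into every store on each call, a store that was cleared is refilled the next time the function runs.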
Storage-based mechanisms
The HTML5 standard supports a range of structured data storage mechanisms on the
client side. These are the most widely known generation of tracking techniques,
all based on persistent storage on a user's computer.
Cookies fall into this category, which also includes localStorage, IndexedDB and
the File API.
These mechanisms pose the biggest threat to users' privacy. That is why none of
them works in a cross-domain context by default, and most browsers clear
these storages at the user's request.
Local Storage
Local Storage is a mechanism, similar to cookies, for storing objects on the client-
side.
Local storage, a part of the web storage API, is a type of persistent storage built into the browser. (source:
Everything You Need To Know About Local Storage)
The objects (key-value pairs) are stored permanently and persist until the user or
website removes them. Storage is typically limited to about 5 MB per origin,
compared with roughly 4 KB per cookie, which gives a significant
advantage over cookies.
As I mentioned before, standard Web Storage (a.k.a. Local Storage) doesn't allow
cross-domain data sharing. Sharing Local Storage is even harder than sharing cookies
because you can't even specify subdomains that have access to the data. They are
treated as completely different domains.
This can be problematic when someone has many (sub)domains and wants to share
data between them.
But there is a solution to this problem: postMessage.
The postMessage method safely enables cross-origin communication between a
page and an iframe embedded within it. The post-messaging functionality is
designed to allow data sharing between documents from different domains while
still being secure.
Simply embed an iframe within all of your domains, use it to save the data in
localStorage, and then all domains could access the same storage through this
iframe.
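The iframe approach described above could be sketched as follows. This is a hedged illustration rather than a reference implementation: the `.example` origins, the message shape (`type`, `key`, `value`) and the function name `handleHubMessage` are assumptions invented for the example. The code is meant to run inside the iframe, which is served from one shared "hub" origin and embedded on every publisher site.

```javascript
// Origins allowed to talk to the hub (hypothetical publisher domains).
const ALLOWED_ORIGINS = [
  'https://site-a.example',
  'https://site-b.example',
  'https://site-c.example',
];

// Handle one incoming message. Returns the reply to post back,
// or null when the message is rejected.
function handleHubMessage(event, storage, allowedOrigins = ALLOWED_ORIGINS) {
  if (!allowedOrigins.includes(event.origin)) return null; // unknown sender
  const msg = event.data;
  if (!msg || typeof msg.key !== 'string') return null; // malformed message
  if (msg.type === 'set') {
    storage.setItem(msg.key, String(msg.value));
    return { type: 'ack', key: msg.key };
  }
  if (msg.type === 'get') {
    return { type: 'value', key: msg.key, value: storage.getItem(msg.key) };
  }
  return null;
}

// Browser-only wiring; skipped when there is no window object.
if (typeof window !== 'undefined' && typeof window.addEventListener === 'function') {
  window.addEventListener('message', (event) => {
    const reply = handleHubMessage(event, window.localStorage);
    // Reply only to the window that sent the message, at its own origin.
    if (reply && event.source) event.source.postMessage(reply, event.origin);
  });
}
```

On the embedding page, the counterpart would be a call such as `iframe.contentWindow.postMessage({ type: 'get', key: 'visitor_id' }, hubOrigin)` plus a `message` listener that checks `event.origin` before trusting the reply. Checking the origin on both sides is what keeps the exchange safe.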
Unfortunately, there is another obstacle.
Some browsers (yes, Safari, I'm talking about you!) block third parties from setting
and reading storage (regardless of whether it is a cookie, Local Storage or anything else).
This "first-party only" limitation can be bypassed by redirecting the user to
the third-party website. This intermediate site can set or read the cookies because
its content appears in the first-party context.
Next, the user is redirected back to the site they originally visited.
Session Storage
HTML5 Session Storage is analogous to Local Storage, but stored objects are
available only to the current browser tab or window and are deleted when it is
closed.
(source: Having fun with HTML5 - Local Storage and Session Storage)
Local Storage and Session Storage are part of the same standard.
Although Session Storage is temporary, it can be used to restore the user ID when the
user clears other storages while the website is still open.
IndexedDB
IndexedDB is a NoSQL database that is built into the browser. It is much more
powerful than Local Storage. The downside is that IndexedDB is slightly more
complicated to use than cookies or Local Storage.
IndexedDB is a large-scale object store built into the browser. (source: JavaScript IndexedDB)
But you can work around this problem with libraries such as localForage (https://github.com/localForage/localForage).
Of course, different domains can’t access databases of each other. But the solution
used for sharing Local Storage can be used to share data from IndexedDB as well.
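To give a feel for the extra ceremony IndexedDB needs compared with Local Storage, here is a minimal key-value wrapper built on the standard `indexedDB.open`, `createObjectStore`, `put` and `get` calls. The database name `tracking-demo`, the store name `kv` and the helper names are assumptions chosen for the example. It is a sketch only, and the success paths run only in a browser.

```javascript
// Open (or create) a database with a single object store used as a
// key-value map. `factory` defaults to the browser's indexedDB object.
function openDb(factory = globalThis.indexedDB) {
  return new Promise((resolve, reject) => {
    if (!factory) {
      reject(new Error('IndexedDB is not available in this environment'));
      return;
    }
    const request = factory.open('tracking-demo', 1);
    // Runs only when the database is created or its version is bumped.
    request.onupgradeneeded = () => request.result.createObjectStore('kv');
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// Store `value` under `key`; resolves once the transaction has completed.
async function idbSet(key, value, factory) {
  const db = await openDb(factory);
  return new Promise((resolve, reject) => {
    const tx = db.transaction('kv', 'readwrite');
    tx.objectStore('kv').put(value, key);
    tx.oncomplete = () => { db.close(); resolve(); };
    tx.onerror = () => { db.close(); reject(tx.error); };
  });
}

// Read the value stored under `key` (undefined when it is missing).
async function idbGet(key, factory) {
  const db = await openDb(factory);
  return new Promise((resolve, reject) => {
    const request = db.transaction('kv', 'readonly').objectStore('kv').get(key);
    request.onsuccess = () => { db.close(); resolve(request.result); };
    request.onerror = () => { db.close(); reject(request.error); };
  });
}
```

In a browser, `await idbSet('visitor_id', id)` followed by `await idbGet('visitor_id')` would round-trip the identifier. Everything is asynchronous, which is the main practical difference from the synchronous Local Storage API.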
File API
I wondered whether to put this mechanism on the list.
The problem is that the API needs explicit action from the user (choosing a file from
disk). This is a big limitation because most of the users would not want to share
their data.
However, I think File API could be some kind of backup for publishers, used as a
last chance to exchange data across websites. With some effort, it could even add
new functionalities for the user. For example, the user could get a file with keys to
their account as two-factor authentication.
But at this point, you should treat it as a more esoteric solution.
ImmortalDB
To take your storage game to a new level, you can use ImmortalDB.
As the author of this library says:
ImmortalDB is the best way to store persistent key-value data in the browser. Data saved
to ImmortalDB is redundantly stored in Cookies, IndexedDB, and LocalStorage, and
relentlessly self heals if any data therein is deleted or corrupted.
For example, clearing cookies is a common user action, even for non-technical users. And
browsers unceremoniously delete IndexedDB, LocalStorage, and/or SessionStorage without
warning under storage pressure.
ImmortalDB is resilient in the face of such events.
More about ImmortalDB: https://github.com/gruns/ImmortalDB
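The "self-healing" idea from the quote can be illustrated without the library. The sketch below is not ImmortalDB's code or API; it is an assumed, simplified version of the concept: read the same key from several stores, trust the value that most stores agree on, and rewrite it wherever it is missing or different. The function name `readAndHeal` is hypothetical.

```javascript
// Read `key` from every store, pick the most common non-empty value,
// and write it back to any store that lost it or disagrees with it.
function readAndHeal(stores, key) {
  const counts = new Map();
  for (const store of stores) {
    const value = store.getItem(key);
    if (value !== null && value !== undefined && value !== '') {
      counts.set(value, (counts.get(value) || 0) + 1);
    }
  }
  let best = null;
  let bestCount = 0;
  for (const [value, count] of counts) {
    if (count > bestCount) {
      best = value;
      bestCount = count;
    }
  }
  if (best !== null) {
    for (const store of stores) {
      if (store.getItem(key) !== best) store.setItem(key, best);
    }
  }
  return best; // null when no store holds a value
}
```

With three stores, one deleted or corrupted copy is outvoted by the other two and then repaired on the next read.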
Cache-based mechanisms
Another group of tracking methods also uses client-side storage. By design, the
cache is used to store data that rarely changes.
This functionality limits unnecessary data transfer through the network and is
crucial for infrastructure bandwidth maintenance. So, browsers will have
difficulties limiting cache-based tracking mechanisms without disturbing user
experience and internet performance.
Mechanisms needed for caching can be used as storage for user identification data
as well.
Embedding identifiers in cached documents
In our basic scenario, a server may return a JavaScript document with a unique
identifier embedded in its body (e.g. as variable value or even in comment).
This JS file may be included on all three domains. The ID is generated server-side
whenever the file is requested from the server.
To make a generated identifier stick, the server sets Expires to a date in the distant
future (or max-age to a correspondingly large value), so the browser keeps reusing its cached copy.
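As a rough sketch of the server side, the function below builds such a response. The function name, the `window.__visitorId` variable and the one-year lifetime are assumptions made for the example; the `Cache-Control` directives themselves (`private`, `max-age`, `immutable`) are standard HTTP.

```javascript
// Build the response for a hypothetical /id.js endpoint.
// `id` is generated server-side each time the file is actually requested.
function buildIdentifierScript(id, maxAgeSeconds = 60 * 60 * 24 * 365) {
  return {
    status: 200,
    headers: {
      'Content-Type': 'application/javascript; charset=utf-8',
      // Long-lived and private: the browser keeps reusing this exact body,
      // and shared caches must not hand it to other users.
      'Cache-Control': `private, max-age=${maxAgeSeconds}, immutable`,
    },
    // JSON.stringify yields a quoted string literal for the identifier.
    body: `window.__visitorId = ${JSON.stringify(id)};\n`,
  };
}
```

As long as the cached copy is fresh, the browser never asks the server again, so every page that includes the script sees the same `window.__visitorId`.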
Loading performance tests
Let’s temporarily abandon the basic scenario and assume that we just want to know
that a user of a given site has been on one of our other sites.
Websites can use JavaScript to detect the time of loading any object (e.g. an image)
from any URL. The measured loading time can be reported to the server, which can
evaluate if the object is present in the cache (and therefore, a user visited a website
previously).
Keep user experience in mind, though, as this kind of "knocking"
may quickly slow down the loading of your website.
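The measurement itself can be sketched like this. The helper names and the 20 ms threshold are assumptions for illustration only; a realistic threshold would have to be calibrated against the network and the resource size. `timeImageLoad` relies on the browser's `Image` and `performance.now()`, so it only runs in a page.

```javascript
// Browser-only: resolve with the number of milliseconds an image took to load.
function timeImageLoad(url) {
  return new Promise((resolve, reject) => {
    const img = new Image();
    const start = performance.now();
    img.onload = () => resolve(performance.now() - start);
    img.onerror = () => reject(new Error(`Failed to load ${url}`));
    img.src = url; // assigning src starts the request
  });
}

// Classify one measurement: very short load times suggest a cache hit.
// The default threshold is an illustrative assumption, not a measured value.
function looksCached(loadTimeMs, thresholdMs = 20) {
  return loadTimeMs < thresholdMs;
}
```

The classification could just as well run on the server, as described above, with the page only reporting the raw timing.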
ETag & Last-Modified
ETag and Last-Modified are HTTP response headers. Their purpose is to optimize
performance and enhance the client-server communication process.
The ETag field is an identifier for a specific version of a resource (e.g. a hash of a
document's content). It can contain about 10 KB of data, which is enough to store our
UUID.
The Last-Modified header should, in theory, contain a DateTime, but in practice it
accepts any string. I don't recommend using it this way, though.
It is better to build a custom hash function whose output is a valid DateTime value.
How can we use ETag and Last-Modified in our case?
When a user requests some resource from the server for the first time, the response
headers contain the ETag and Last-Modified fields. On the next visit to the website, the
browser sends If-Modified-Since and If-None-Match in the request headers.
These headers contain the values of the Last-Modified and ETag fields from the
previously cached resource.
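The exchange described above can be sketched as a pure function on the server side. The function name and the shape of the returned object are assumptions for the example, and the sketch deliberately ignores details such as multiple ETags in one `If-None-Match` header. Request header names are lower-case, which is how Node's `http` module exposes them.

```javascript
// Decide how to answer a request for a tracked resource.
// `newId` is a function returning a fresh identifier (e.g. a UUIDv4).
function respondWithEtag(requestHeaders, newId) {
  const presented = requestHeaders['if-none-match'];
  if (presented) {
    // Returning visitor: the browser echoed the ETag it cached earlier.
    // Strip an optional weak-validator prefix and the surrounding quotes.
    const visitorId = presented.replace(/^W\//, '').replace(/^"|"$/g, '');
    return { status: 304, headers: { ETag: presented }, visitorId };
  }
  // First visit: issue a new identifier as the (quoted) ETag value.
  const visitorId = newId();
  return {
    status: 200,
    headers: {
      ETag: `"${visitorId}"`,
      // no-cache: the browser may store the response but must revalidate it,
      // which is what makes it send If-None-Match on every later request.
      'Cache-Control': 'private, no-cache',
    },
    body: '/* tracked resource */',
    visitorId,
  };
}
```

The same pattern applies to Last-Modified and If-Modified-Since, with the identifier encoded as a date instead of a quoted string.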
ETag diagram (Image by Author): the first HTTP response carries an ETag header holding the UUID; the next HTTP request returns it in If-None-Match; the server replies 304 Not Modified with the same ETag.
Last-Modified diagram (Image by Author): the first HTTP response carries Last-Modified: Wed, 23 Sep 2020 11:02:43 GMT; the next HTTP request returns it in If-Modified-Since: Wed, 23 Sep 2020 11:02:43 GMT; the server replies 304 Not Modified with the same Last-Modified value.
Did I mention it works cross-domain?
Interestingly, tracking is possible even during a single private browsing session, as
the cache is kept until the last browser window is closed.
The only challenge is developing a mechanism that passes the ID, which is visible only
server-side in the ETag/Last-Modified headers, into client-side JavaScript code.
Takeaways
In this article, we focused on tracking methods based strictly on storing data on the
user's device.
                    Cookies       Local storage  Session storage  IndexedDB          ETag         Last-Modified
Capacity            4kb           10mb           5mb              User's disk space  10kb         ~42b
Accessibility       Any window    Any window     Same tab         Any window         Server-only  Server-only
Expires             Manually set  Never          On tab close     Never              Never        Never
Sent with requests  Yes           No             No               No
Comparison of tracking methods (Image by Author)
In the context of the cookie apocalypse, the most promising are cache-based HTTP
headers and post-messaging between iframes.
However, there will always be a discussion about user privacy, regardless of the
methods used by website owners.
In the next part, I will focus on more sophisticated methods. Fingerprinting combined
with a statistical approach can provide results nearly as good as cookie tracking.
Thank you for reading. I hope you enjoyed reading as much as I enjoyed writing this
for you.
References
Fingerprinting
Cross-domain tracking without using third-party cookies (medium.datadriveninvestor.com)
Tracking mechanisms
https://arxiv.org/pdf/1507.07872.pdf
http://www.chromium.org/Home/chromium-security/client-identification-
https://github.com/gruns/ImmortalDB
Local Storage
https://www.w3.org/TR/webstorage/
https://caniuse.com/mdn-api_window_localstorage
https://levelup.gitconnected.com/share-localstorage-sessionstorage-between-different-domains-eb07581e9384
IndexedDB
https://www.w3.org/TR/IndexedDB-2/
https://caniuse.com/indexeddb
https://github.com/localForage/localForage
File API
https://www.w3.org/TR/2019/WD-FileAPI-20190911/
https://caniuse.com/mdn-api_file
ETag & Last-Modified
https://caniuse.com/mdn-http_headers_etag
https://caniuse.com/mdn-http_headers_last-modified
Written by Rafał Rybnik, writer for DataDrivenInvestor