
csmp.week5.slides

The document discusses various aspects of online tracking, including how cookies and SDKs are used to collect user data for targeted advertising. It highlights the significant revenue generated by companies like Google and Meta through behavioral targeting and the ethical concerns surrounding sensitive data collection, particularly in relation to health and personal information. Additionally, it mentions specific examples of tracking practices and their implications for privacy, including the tracking of abortion clinic visitors and the sale of location data from mobile apps.

Uploaded by

johnmwass1999
Copyright
© © All Rights Reserved

week 5 - intro

cookies
recipe for

cookies...

see cookies!
• in Chrome, open DevTools (Windows = Ctrl + Shift + C, Mac = Cmd + Option + C)
• go to Application > Storage > Cookies and select a site
• view, add, edit, and delete cookies
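
To make the fields in the DevTools cookie table more concrete, here is a minimal sketch that parses a Set-Cookie header with Python's standard-library http.cookies module. The cookie name, value and domain are made up for illustration; a real tracker's cookie will look different.

```python
from http.cookies import SimpleCookie

# A made-up Set-Cookie header: the name "uid", its value and the domain are hypothetical.
raw_header = "uid=abc123; Domain=.tracker.example; Path=/; Max-Age=31536000; Secure; HttpOnly"

cookie = SimpleCookie()
cookie.load(raw_header)


def describe(cookie_jar):
    """Return one dict per cookie with the kind of fields DevTools lists."""
    rows = []
    for name, morsel in cookie_jar.items():
        rows.append({
            "name": name,
            "value": morsel.value,
            "domain": morsel["domain"],
            "path": morsel["path"],
            "max_age": morsel["max-age"],  # lifetime in seconds, kept as a string
        })
    return rows


rows = describe(cookie)
for row in rows:
    print(row)
```

A long Max-Age on a domain that differs from the site being visited is one sign that a cookie may be there for third-party tracking rather than for the site itself.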

tracking via the facebook sdk


35C3 - How Facebook tracks you on Android 1:45 - 9.55
I asked an online tracking company for all of my data and here’s what I found
3.04 “baby products and nappies”
• browsing -> cookies
• inferred data -> machine learning
• offline data from credit card purchases
Discrimination in Online Ad Delivery - Latanya Sweeney
3.40 “people who DO NOT HAVE a facebook account”
4.31 “conscious decision not to be on facebook”
• period tracker!
• flashlight!
5.34 list of apps
• any you recognise as having installed?
6.09 “MITM proxy”
• man-in-the-middle - intercepts all traffic
6.37 “the moment they open the app”!!
7.32 “SDK”
• who knows what that is?
• so the point is that this tracking is a byproduct
7.50 incredibly granular data
8.45 “baby plus app”
the original ‘tracking’ story from 2012: How Target Figured Out A Teen Girl Was Pregnant Before Her Father Did

the tracking ecosystem

In 2021, Google’s ad revenues totalled US$209.49 billion worldwide, followed by Meta at US$114.93 billion and Amazon at US$31.16 billion. According to IAB Europe, already in 2016 behavioural targeting accounted for 66% of all digital advertising and contributed to 90% of growth in digital advertising.
Unpacking ‘commercial surveillance’: The state of tracking - decent overview
What is Bidstream Data?
What is Cookie Syncing and How Does it Work?

what happens with tracking data? xandr audience segments


From “Heavy Purchasers” of Pregnancy Tests to the Depression-Prone: We Found 650,000 Ways Advertisers Label You
• On Xandr’s platform, advertisers can pay for the ability to target people through the segments.
• Try searching the ad terms
How Are These Segments Used?
1. You are scrolling through a news website.
2. You tap on a link to read an article about a new study looking at people diagnosed with
depression, as you have a close friend suffering from the condition.
3. As the page starts to load, a signal goes out from an advertising platform used by the website
publisher that says there is an available ad slot up for auction. This signal includes information
about the website, information about the page you requested, the ad size, your device or
mobile ad ID, your IP address, and often your approximate location.
4. Another ad platform receives the signal and opens a bidding process for advertisers who wish
to show you an ad.
5. Ad platforms working on behalf of the advertisers analyze the data in the bid request to see if it
aligns with the advertisers’ current campaigns.
6. One of the bidders recognizes your IP address and ad ID and finds that you are in the “Health
& Fitness::Depression (audience interest)” segment. This bidder is an ad agency working on
behalf of its client, a pharmaceutical company that sells drugs to treat depression and is
willing to pay enough in the real-time auction to win the ad placement.
7. The ad agency submits its bid through the ad platform and wins the auction.
8. An ad for an anti-depression drug made by the pharmaceutical company loads on your page.
9. The whole process of auctioning your attention unfolded in the blink of an eye, mere
milliseconds.
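
The nine steps above can be sketched as a toy auction. This is a simplified illustration, not how any real ad exchange is implemented: the Bidder class, the SEGMENTS_BY_AD_ID lookup, the ad ID, the second bidder and the prices are all hypothetical, and a simple highest-bid-wins rule is assumed.

```python
from dataclasses import dataclass


@dataclass
class Bidder:
    name: str
    target_segment: str  # audience segment the bidder's campaign is aimed at
    max_bid: float       # most the bidder will pay for this impression (arbitrary units)


# Hypothetical profile store: maps an ad ID to the segments a broker has attached to it.
SEGMENTS_BY_AD_ID = {
    "ad-id-123": {"Health & Fitness::Depression (audience interest)"},
}


def run_auction(bid_request, bidders):
    """Return (winner_name, price) for the highest matching bid, or None if nobody bids."""
    segments = SEGMENTS_BY_AD_ID.get(bid_request["ad_id"], set())
    # Steps 5-6: each bidder checks whether the person matches its campaign.
    bids = [(b.max_bid, b.name) for b in bidders if b.target_segment in segments]
    if not bids:
        return None
    # Step 7: the highest bid wins (assumed first-price rule).
    price, winner = max(bids)
    return winner, price


# Step 3: the signal sent out when the page starts to load (all values invented).
bid_request = {
    "page": "https://news.example/depression-study",
    "ad_size": "300x250",
    "ad_id": "ad-id-123",
    "ip": "203.0.113.7",
}

bidders = [
    Bidder("pharma_agency", "Health & Fitness::Depression (audience interest)", 4.50),
    Bidder("car_dealer", "Auto::SUV intenders", 6.00),  # made-up segment name
]

result = run_auction(bid_request, bidders)
print(result)
```

Note that the car dealer would have paid more but never bids, because the person is not in its segment: the segment label, not the page content alone, decides who competes for the slot.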
• The trove of data indicates that advertisers could also target people based on sensitive information like being “heavy purchasers” of pregnancy test kits, having an interest in brain tumors, being prone to depression, visiting places of worship, or feeling “easily deflated” or that they “get a raw deal out of life.”
• Christl said he thinks the large number of companies named in the file shows that Xandr was
(at least in 2021) reselling large amounts of sensitive data from a wide range of data brokers
from around the world.
• Regarding the large amounts of segments related to sensitive topics, Christl said, “I think the
file suggests that Xandr did not take even the slightest measures to exclude at least the most
sensitive data from its marketplace.”
• Demographics (Ex: “Life Events > Newly Engaged”)
• Grocery (Ex: “Intent > Heavy Purchaser - Meat Pies - Refrigeration”)
• Financial (Ex: “Highest Risk > Poorer Unemployed Neighbourhoods”)
• Travel (Ex: “Vacation Travel Attitudes > Not a Sightseer”)
• Health (Ex: “Healthcare > Medications > Depression Medications”)
Many medical- and health-related segments mentioned specific conditions consumers may be diagnosed with, medicine they may be taking, or conditions they may develop. This category included several segments relating to reproductive health, including some involving pregnancy tests, contraceptives, and infertility.
Race and ethnicity showed up frequently among the demographic data targeted by the segments.
Some of the most colorfully described audience segments came from consumer credit agencies
Equifax and Experian. Segments are branded with alliterative names like “Silver Sophisticates” and
“Progressive Potpourri” that reflect the political and socioeconomic makeup of the household.
Some of these brand-name segments promise a package of economically stressed individuals to
target with names like “Struggling Elders” and “Tight Money.”
Consumers are packaged according to their location history and movements. Advertisers were
offered segments that appeared to target people based on where they shop, work, and visit,
including those who go to state capitol buildings, congressional offices, federal agency offices, and
locations like defense contractor and gun manufacturer headquarters.

tracking examples
Q. In pairs: can you think of examples of unethical tracking (e.g. in terms of profiling,
advert targeting etc.)?

med trackers
Study: Online trackers follow health site visitors
Unaccounted Privacy Violation: A Comparative Analysis of Persistent Identification of Users Across Social Contexts

bbc story
UK councils’ benefits pages push credit card adverts
FOR SHARING: Council cookies
FOR SHARING: UK councils breakdown.xlsx

payday loans
Your Social Networking Credit Score. “Big data” can help determine who really deserves a loan. But
there are dangers.
• Wonga, an extremely ambitious online payday-lending company based in London, even considers the time of the day and the way a candidate clicks around the site in determining whether to grant a loan.
Wonga: What makes money lender tick?
• People borrow money from Wonga by applying on its website. This offers a swift decision and then transfers the money into a bank account within 15 minutes.
• Its key feature is that it combines information about potential customers in a massive in-house credit scoring operation. Errol Damelin said his computers use artificial-intelligence software to collect and digest up to 8,000 different pieces of information about applicants to decide if they should be offered a loan.
Wonga data breach ‘affects 245,000 UK customers’
Wonga goes into administration: The payday lender has been crippled by compensation claims
from customers as a result of irresponsible lending

Brightbeam
N.B. you might need to temporarily switch off adblockers if you have them installed
Brightbeam is a Firefox extension which allows you to visualise the 3rd party trackers and to export
the data. It was originally developed by Mozilla and has been adapted by the Digital Methods
Initiative.
Here’s the link to install the Firefox Brightbeam add-on (it will only install in Firefox).
There’s a useful animated gif of how it works on the Lightbeam GitHub (although note that this is the old version of Lightbeam, not Brightbeam). Basically it visualises the trackers as a network. In this case, the circles are the websites and the triangles are the trackers.
• play around, visit a few websites - you should quickly be able to see which sites have which
trackers (triangles) in common. You may be surprised by how many trackers there are on
some sites.
• look at a specific set of sites e.g. ones you regularly visit, or sites to do with a theme like
healthcare, or some other grouping
• see what you can establish in terms of the tracking that’s going on
You can export the network for Gephi using Save Data (GDF)
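
If you want to look at the exported GDF file outside Gephi, a rough sketch like the one below may help. It assumes the usual GDF layout (a "nodedef>" header followed by node rows, then an "edgedef>" header followed by edge rows) and that the first two edge columns are called node1 and node2. The column names in a real Brightbeam export may differ, so check the header lines of your own file. The sample data is invented.

```python
import csv
from collections import defaultdict


def parse_gdf(text):
    """Split GDF text into node rows and edge rows, keyed by the header's column names."""
    nodes, edges, columns, current = [], [], [], None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("nodedef>") or line.startswith("edgedef>"):
            current = nodes if line.startswith("nodedef>") else edges
            header = line.split(">", 1)[1]
            # Each header entry looks like "name VARCHAR"; keep only the column name.
            columns = [col.strip().split(" ")[0] for col in header.split(",")]
            continue
        if current is None:
            continue  # ignore anything before the first header
        values = next(csv.reader([line]))
        current.append(dict(zip(columns, values)))
    return nodes, edges


def sites_per_tracker(edges, source_col="node1", target_col="node2"):
    """Return {target: [sources]} for targets reached from more than one source."""
    seen = defaultdict(set)
    for edge in edges:
        seen[edge[target_col]].add(edge[source_col])
    return {t: sorted(s) for t, s in seen.items() if len(s) > 1}


# Invented sample in GDF layout: two sites that both load the same third party.
SAMPLE_GDF = """nodedef>name VARCHAR,label VARCHAR
news.example,news.example
health.example,health.example
tracker.example,tracker.example
cdn.example,cdn.example
edgedef>node1 VARCHAR,node2 VARCHAR
news.example,tracker.example
health.example,tracker.example
news.example,cdn.example
"""

nodes, edges = parse_gdf(SAMPLE_GDF)
shared = sites_per_tracker(edges)
print(shared)
```

To run it on a real export, replace SAMPLE_GDF with the contents of your saved file, e.g. open("export.gdf", encoding="utf-8").read() (the file name is a placeholder).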

Projects That Have Used Blacklight to Hold Tech Accountable


10 Million Blacklight Scans Later, Here’s What You Found
Websites Selling Abortion Pills Are Sharing Sensitive Data With Google
“How Dare They Peep into My Private Life?” Children’s Rights Violations by Governments that
Endorsed Online Learning During the Covid-19 Pandemic
• Most online learning platforms installed tracking technologies that trailed children outside of
their virtual classrooms and across the internet, over time.
• Some invisibly tagged and fingerprinted children in ways that were impossible to avoid or get
rid of—even if children, their parents, and teachers had been aware and had the desire and
digital literacy to do so—without throwing the device away in the trash

themarkup / blacklight
Blacklight: A Real-Time Website Privacy Inspector
The High Privacy Cost of a “Free” Website
nested trackers examples -> good journalistic stories!
• canvas fingerprinting
• key loggers

to download blacklight data


• click on ‘visited [site] on [date]’
• click on ‘download the archive’
• unzip the file
• the ‘raw’ directory contains json files with the inspection results
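
Once the archive is unzipped, a first step might be to see what each JSON file in the ‘raw’ directory contains. The sketch below only lists the top-level structure of each file and makes no assumption about Blacklight's actual schema; the demo at the bottom uses a stand-in file with invented keys so the code runs without a real download.

```python
import json
import tempfile
from pathlib import Path


def summarise_raw_dir(raw_dir):
    """Map each JSON file name in raw_dir to a short summary of its top-level structure."""
    summary = {}
    for path in sorted(Path(raw_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
        if isinstance(data, dict):
            summary[path.name] = sorted(data.keys())
        elif isinstance(data, list):
            summary[path.name] = f"list of {len(data)} items"
        else:
            summary[path.name] = type(data).__name__
    return summary


# Demo with a stand-in file; "alpha" and "beta" are invented keys, not Blacklight fields.
with tempfile.TemporaryDirectory() as tmp:
    stand_in = Path(tmp) / "example.json"
    stand_in.write_text(json.dumps({"beta": [1, 2], "alpha": 1}), encoding="utf-8")
    demo_summary = summarise_raw_dir(tmp)

print(demo_summary)
```

For a real scan, call summarise_raw_dir with the path to the unzipped ‘raw’ directory and use the printed keys to decide which files are worth opening in full.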

tracker control

TrackerControl for Android
tracker control github
TC Slim (Google Play Store)
• export
• traffic

exodus
• exodus db https://exodus-privacy.eu.org/en/

abortion data
• The Supreme Court’s decision last week overturning the nationwide right to an abortion in the
United States may have sent worried people flooding to Planned Parenthood’s website to learn
about nearby clinics or schedule services.
• But if they used the organization’s online scheduling tool, it appears Planned Parenthood could
share people’s location — and, in some cases, even the method of abortion they selected —
with big tech companies.
• An investigation by Lockdown Privacy, the maker of an app that blocks online tracking, found that Planned Parenthood’s web scheduler can share information with a variety of third parties, including Google, Facebook, TikTok and Hotjar, a tracking tool that says it helps companies understand how customers behave.
You scheduled an abortion. Planned Parenthood’s website could tell Facebook. The organization left marketing trackers running on its scheduling pages
• The company selling the data is SafeGraph. SafeGraph ultimately obtains location data from ordinary apps installed on people’s phones. Often app developers install code, called software development kits (SDKs), into their apps that sends users’ location data to companies in exchange for the developer receiving payment.
• Sometimes app users don’t know that their phone, be that via a prayer app or a weather app, is collecting and sending location data to third parties, let alone some of the more dangerous use cases that Motherboard has reported on, including transferring data to U.S. military contractors.
• Edwards said “SafeGraph is going to be the weapon of choice for anti-choice radicals attempting to target ‘out of state clinics’ providing medical care.” Missouri is considering a law to make it illegal to “aid or abet” abortions in other states.
Data Broker Is Selling Location Data of People Who Visit Abortion Clinics
• Google’s original promise, made in July 2022, came shortly after the Supreme Court’s decision to end federal abortion protections. The tech giant said it would delete entries for locations deemed “personal” or sensitive, including “medical facilities like counseling centers, domestic violence shelters, and abortion clinics”.
• In four out of eight of the tests, the route to the Planned Parenthood clinic was retained in the device’s location history, though the name of the clinic was scrubbed.
• Police and law enforcement agencies have also made increasing use of a novel category of
search warrant called “reverse search warrants”. In that category are geofence location
warrants, which police use to come up with a list of suspects by seeking out information on all
users whose devices have been detected in a certain place at a certain time.
• Google announced that it planned to change the way it stored location history data for all
users in a way that could render responding to geofence warrants effectively impossible.
Google promised to delete location data on abortion clinic visits. It didn’t, study says
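
To see why stored location history makes geofence warrants possible, here is a minimal sketch of the query they imply: given a table of location records, return every device seen within a radius of a point during a time window. The record layout, device IDs, coordinates and times are invented, and this is not a description of how Google or any police system actually runs such a search.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def devices_in_geofence(records, centre, radius_m, start, end):
    """Return the sorted device IDs seen inside the circle during [start, end]."""
    lat0, lon0 = centre
    return sorted({
        r["device_id"]
        for r in records
        if start <= r["time"] <= end
        and haversine_m(lat0, lon0, r["lat"], r["lon"]) <= radius_m
    })


# Invented location history for four devices.
records = [
    {"device_id": "device-A", "lat": 51.5000, "lon": -0.1000, "time": datetime(2023, 1, 5, 10, 5)},
    {"device_id": "device-B", "lat": 51.5005, "lon": -0.1002, "time": datetime(2023, 1, 5, 10, 30)},
    {"device_id": "device-C", "lat": 51.6000, "lon": -0.1000, "time": datetime(2023, 1, 5, 10, 10)},  # too far away
    {"device_id": "device-D", "lat": 51.5000, "lon": -0.1000, "time": datetime(2023, 1, 5, 14, 0)},   # outside the window
]

hits = devices_in_geofence(
    records,
    centre=(51.5, -0.1),
    radius_m=150,
    start=datetime(2023, 1, 5, 10, 0),
    end=datetime(2023, 1, 5, 11, 0),
)
print(hits)
```

Everyone who happened to be nearby ends up on the list, which is why these warrants are controversial, and why the query cannot be answered at all if the provider no longer holds the records centrally.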

muslim location tracking


How the U.S. Military Buys Location Data from Ordinary Apps

• A Muslim prayer app with over 98 million downloads is one of the apps connected to a wide-ranging supply chain that sends ordinary people’s personal data to brokers, contractors, and the military.
• Some companies obtain app location data through bidstream data, which is information gathered from the real-time bidding that occurs when advertisers pay to insert their adverts into people’s browsing sessions. Firms also often acquire the data from software development kits (SDKs).
• The SDK then collects the app users’ location data and sends it to X-Mode; in return, X-Mode
pays the app developers a fee based on how many users each app has. An app with 50,000
daily active users in the U.S., for example, will earn the developer $1,500 a month, according
to X-Mode’s website.
• Motherboard used network analysis software to observe both the Android and iOS versions of
the Muslim Pro app sending granular location data to the X-Mode endpoint multiple times.
• The data transfer also included the name of the Wi-Fi network the phone was currently connected to, a timestamp, and information about the phone such as its model, according to Motherboard’s tests.
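
As a quick worked calculation on the X-Mode figures quoted above (50,000 daily active users earning $1,500 a month), the snippet below derives the implied price per user. It assumes the fee scales linearly with user numbers, which the quoted example does not confirm.

```python
# Figures quoted above from X-Mode's website.
daily_active_users = 50_000
monthly_payment_usd = 1_500

# Implied rates (assumption: payment is proportional to daily active users).
per_user_month_usd = monthly_payment_usd / daily_active_users
per_user_year_usd = monthly_payment_usd * 12 / daily_active_users
developer_year_usd = monthly_payment_usd * 12

print(f"per user per month: ${per_user_month_usd:.2f}")
print(f"per user per year:  ${per_user_year_usd:.2f}")
print(f"developer per year: ${developer_year_usd:,}")
```

On these numbers a user's continuous location stream is worth about three cents a month to the developer.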
Leaked Location Data Shows Another Muslim Prayer App Tracking Users
• Salaat First (Prayer Times) is an app created to help Muslims with prayers; reminding them
when to pray, the position to take to face Mecca, and nearby mosques. For all that, the app
needs to access and identify users’ location.
Muslim prayer app Salaat First was tracking users
Muslim prayer app ‘sold users’ tracking data’ to contractor linked to US government agencies: report

strava fitness trackers


• Strava provides an app that uses a mobile phone’s GPS to track a subscriber’s exercise
activity.
• It uses the collected data, as well as that from fitness devices such as Fitbit and Jawbone, to
enable people to check their own performances and compare them with others.
• It says it has 27 million users around the world.
• Its heatmap shows one billion activities - some three trillion points of data, covering 27 billion
km (17bn miles) of distance run, jogged or swum.
• A 20-year-old Australian student pointed out that it seems to show the locations of Coalition military bases in Syria and Afghanistan https://twitter.com/Nrg8000/status/957318498102865920
strava global heatmap
• notes on implementation - Building the Global Heatmap
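
The basic idea of a heatmap can be sketched as counting GPS points per grid cell. This is a deliberately simple illustration and not Strava's method (their write-up describes a far more elaborate pipeline); the cell size and the coordinates are invented.

```python
from collections import Counter
from math import floor


def bin_points(points, cell_deg=0.001):
    """Count GPS points per grid cell; cells are cell_deg degrees on each side."""
    counts = Counter()
    for lat, lon in points:
        cell = (floor(lat / cell_deg), floor(lon / cell_deg))
        counts[cell] += 1
    return counts


# Invented GPS points: three fall in the same cell (a repeated jogging loop), one elsewhere.
points = [
    (10.50012, 20.20034),
    (10.50018, 20.20031),
    (10.50015, 20.20039),
    (10.61055, 20.30077),
]

counts = bin_points(points)
hottest = counts.most_common(1)
print(hottest)
```

In a sparsely populated area, a cell that lights up at all stands out, which is how regular exercise routes around a remote base can become visible even though no individual user is named.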
ANU student reveals location of US military bases
• video
Fitness app Strava lights up staff at military bases
Strava suggests military users ‘opt out’ of heatmap as row deepens
How to Use and Interpret Data from Strava’s Activity Map

tracking - ranking


Tracking has evolved because advertising is the main means of generating revenue on the web.
Tracking is what makes targeted advertising possible, and targeted advertising is what advertisers
like to pay for.
However, we now realise that tracking and targeting have major downsides, not just at an individual level but at a population level, as that information can also be used to manipulate democracies and to deny people services.
4CORNERS
• laws like GDPR, cookie law
• Privacy tools (e.g. anonymity)

• Breaking up big tech (Google, Facebook etc)
• whistleblower protection
real time bidding
example from OpenRTB
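
For reference, here is a rough sketch of what an OpenRTB-style bid request can look like, written as a Python dictionary. The field names (id, at, tmax, imp, banner, site, device, geo, ifa, user) follow the OpenRTB 2.x specification as best recalled; every value is invented, and real requests usually carry many more fields.

```python
import json

# Field names follow OpenRTB 2.x; all values are made up for illustration.
bid_request = {
    "id": "req-0001",
    "at": 1,          # auction type
    "tmax": 120,      # milliseconds bidders have to respond
    "imp": [{"id": "1", "banner": {"w": 300, "h": 250}, "bidfloor": 0.5}],
    "site": {"domain": "news.example", "page": "https://news.example/health/depression-study"},
    "device": {
        "ua": "Mozilla/5.0 (example)",
        "ip": "203.0.113.7",
        "ifa": "00000000-0000-0000-0000-000000000000",  # advertising ID
        "geo": {"lat": 51.5, "lon": -0.1},
    },
    "user": {"id": "user-abc"},
}


def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into {dotted.path: value}."""
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            items.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            items.update(flatten(value, f"{prefix}{index}."))
    else:
        items[prefix[:-1]] = obj
    return items


# Everything under "device" and "user" describes the person rather than the ad slot.
personal = {
    path: value
    for path, value in flatten(bid_request).items()
    if path.startswith(("device.", "user."))
}

print(json.dumps(personal, indent=2))
```

The page URL under "site" is not counted here, although combined with the device and user fields it also reveals what the person was reading.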

patternz and rtb


• advertisements in ordinary mobile apps can ultimately lead to surveillance by spy firms and
their government clients through the real time bidding data supply chain.
• Patternz monitors at scale, and in marketing material claims to analyze more than 90
terabytes of data every day, and to have profiles on more than 5 billion user IDs.
• Nuviad has a DSP (demand side platform), which allows companies to buy “large scale media”
using real time bidding, and also offers the ability to target ads with geofencing, which is
where ads are delivered to devices in a certain area.
Inside a Global Phone Spy Tool Monitoring Billions
• These data flow from Real-Time Bidding (RTB), an advertising technology that is active on
almost all websites and apps.
• RTB involves the broadcasting of sensitive data about people using those websites and apps
to large numbers of other entities, without security measures to protect the data. This occurs
billions of times a day.
• private surveillance companies in foreign countries deploy RTB data for surreptitious
surveillance.
• Our examination of RTB data reveals Cambridge Analytica style psychological profiling of
target individuals’ movements, financial problems, mental health problems and vulnerabilities,
including if they are likely survivors of sexual abuse.
see ad exchange<->DSP diagram, p5
• RTB data are broadcast without any security measures. After the broadcast there is no way to know or limit how receiving entities handle the RTB data. Nor is there any technical way to stop further distribution of RTB data.
• RTB industry codes indicate a target individual’s psychological condition. For example, a recent family bereavement is marked by IAB Audience Taxonomy code 1502 (which indicates intent to purchase funeral services).
• A heavy drinker is categorised by the combination of the IAB Audience Taxonomy code for
“frequent purchaser” (code PIPF3) with the code for “alcohol consumption” (code 369).
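
The way single codes and combinations of codes turn into sensitive labels can be sketched as a small lookup. Only the codes quoted in the bullets above are used, with the meanings the report gives them; the function name and the way the codes are passed in are hypothetical.

```python
def infer_labels(codes):
    """Return the sensitive labels implied by a set of audience taxonomy codes.

    Uses only the codes quoted above: 1502, PIPF3 and 369.
    """
    codes = set(codes)
    labels = []
    if "1502" in codes:
        labels.append("recent family bereavement")
    # A combination of two individually bland codes yields a sensitive label.
    if {"PIPF3", "369"} <= codes:
        labels.append("heavy drinker")
    return labels


print(infer_labels(["PIPF3", "369"]))
print(infer_labels(["369"]))
print(infer_labels(["1502", "369", "PIPF3"]))
```

Neither “frequent purchaser” nor “alcohol consumption” is alarming on its own; the inference comes from the receiver combining them.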
here’s a weird one: Catholic priest outed by sensitive RTB data
Europe’s hidden security crisis

reverse tracking ad codes


see slides: ‘How to Investigate Online Disinformation Networks - Bruno Nicola’
4-step Methodology:
1) Identifying a disinformation website
2) Who is behind a website?
3) Follow the money! (reverse IDs)
4) Social amplification
• List of fake news websites
• ‘Boston Leader’
• ctrl+U to see source: look for analytics or ad codes e.g. UA- or ca-pub-
• ctrl+F: ‘ca-pub’
• adsense – How AdSense works – Google AdSense: Find your publisher ID
• try BuiltWith pro - 30-day signup
• What the rollout of Google Analytics 4 means for website investigations
• try welovetrump
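
The manual ctrl+U / ctrl+F search can be automated with a couple of regular expressions. The sketch below looks for UA- and ca-pub- codes in page source and reports any code that appears on more than one site. The patterns are approximations of the usual ID shapes, and the site names, HTML snippets and IDs are invented.

```python
import re
from collections import defaultdict

# Approximate shapes: "UA-<account>-<property>" and "ca-pub-<digits>".
UA_PATTERN = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")
ADSENSE_PATTERN = re.compile(r"\bca-pub-\d{10,20}\b")


def extract_ids(html):
    """Return the set of analytics and ad codes found in a page's source."""
    return set(UA_PATTERN.findall(html)) | set(ADSENSE_PATTERN.findall(html))


def shared_ids(pages):
    """pages maps site -> HTML source. Return {code: [sites]} for codes on 2+ sites."""
    sites_by_id = defaultdict(set)
    for site, html in pages.items():
        for code in extract_ids(html):
            sites_by_id[code].add(site)
    return {code: sorted(sites) for code, sites in sites_by_id.items() if len(sites) > 1}


# Invented page sources; the IDs are placeholders, not real accounts.
pages = {
    "site-a.example": '<script>ga("create", "UA-12345678-1");</script>'
                      '<ins data-ad-client="ca-pub-1234567890123456"></ins>',
    "site-b.example": '<script>ga("create", "UA-12345678-2");</script>'
                      '<ins data-ad-client="ca-pub-1234567890123456"></ins>',
    "site-c.example": "<p>no codes here</p>",
}

shared = shared_ids(pages)
print(shared)
```

Two sites sharing a ca-pub- code are paid through the same AdSense account, which is the “follow the money” step; the exact-match comparison here would miss UA codes that share an account number but differ in the final property number.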

bellingcat wayback ad codes

Using the Wayback Machine and Google Analytics to Uncover Disinformation Networks

week 5 - close
