
Weaponizing Data Science For Social Engineering

This document discusses using machine learning techniques like Markov chains and neural networks to automate spear phishing attacks on Twitter. It describes profiling potential targets on Twitter, generating personalized phishing content tailored to each target using the ML models, and evaluating the effectiveness of the attacks. The authors aim to demonstrate an end-to-end automated spear phishing workflow on Twitter at a security conference.


Weaponizing Data Science

for Social Engineering:


Automated E2E Spear Phishing on Twitter

John Seymour | Philip Tully


1 #SNAP_R
You care about phishing on social media

2 #SNAP_R
TL;DR
#SNAP_R: Social Network Automated Phishing with Reconnaissance
[Diagram labels: Twitter Profiles, #SNAP_R, Phishing Offense]

3 #SNAP_R
ISO: Demo Volunteers

Tweet #SNAP_R before the demo to get an example tweet!

4 #SNAP_R
#whoami

John Seymour (@_delta_zero)
Data Scientist at ZeroFOX
Ph.D. student at UMBC
Researches Malware Datasets

Philip Tully (@phtully)
Senior Data Scientist at ZeroFOX
Ph.D. student at University of Edinburgh & Royal Institute of Technology
Brain Modeling & Artificial Neural Nets

5 #SNAP_R
A Novel Phishing Campaign Design

[Chart: Success Rate (Low to High) vs. Level of Effort (Low to High)]

Approach | Automation | Accuracy
Phishing | Mostly Automated | 5-14%
Our #SNAP_R | Fully Automated | >30%
Spear Phishing | Highly Manual | 45%
6 #SNAP_R
Fooling Humans for 50 Years

1966: ELIZA Chatbot
• Joseph Weizenbaum, MIT
• Parsing & keyword replacement

2016: @TayandYou
• Microsoft AI
• Deep Neural Network

7 #SNAP_R
InfoSec ML Historically Prioritizes Defense

8 #SNAP_R
Machine Learning on Offense
Automated Target Discovery
Automated Social Spear Phishing
Evaluation and Metrics
Results and Demo
Wrap Up

9 #SNAP_R
Machine Learning on Offense


10 #SNAP_R
Why Twitter?
• Bot-friendly API
• Colloquial syntax
• Shortened links
• Trusting culture
• Incentivized data disclosure

11 #SNAP_R
Shoutout

Where Do the Phishers Live? Collecting Phishers' Geographic Locations from Automated Honeypots
Robbie Gallagher

We’ve taken a novel approach to automating the determination of a phisher’s geographic location. With the help of Markov chains, we craft honeypot responses to phishers’ emails in an attempt to beat them at their own game. We’ll examine the underlying concepts, implementation of the system and reveal some results from our ongoing experiment.

12 #SNAP_R
Techniques, Tactics and Procedures
• Our ML Tool...
  • Shortens payload per unique user
  • Auto-tweets at irregular intervals
  • Triages users wrt value/engagement
  • Prepends tweets with @mention
  • Obeys rate limits
• We added...
  • Post non-phishing posts
  • Build believable profile

13 #SNAP_R
Design Flow
[Flow diagram: Twitter Profiles, #SNAP_R, Phishing Offense]
is_target(user)
get_timeline(depth)
gen_markov_tweet() / gen_nn_tweet()
schedule_tweet_and_sleep()
post_tweet_and_sleep()

14 #SNAP_R
Automated Target Discovery


15 #SNAP_R
Triage of High Value Targets on Twitter

• Accessible personal info
• Historical profile posts
• Heterogeneous data
  • Text, images, urls, stats, dates

16 #SNAP_R
Extracting Features from
GET users/lookup

• Engagement: following/followers
• #myFirstTweet
• Default settings
• Description content
• Account age

17 #SNAP_R
Clustering Predicts High Value Users

[Figure: clustering plots, each with a point labeled Eric Schmidt]

18 #SNAP_R
Selecting the Best Clustering Model
• Many algorithms
• Many hyperparameters
• Max avg. score [-1,..,1]
• 0.5-0.7 reasonable structure
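
The slide does not name the score. Its [-1, 1] range and the "0.5-0.7 reasonable structure" rule of thumb match the silhouette coefficient, so the sketch below assumes that is what is meant. It computes the average silhouette for two candidate labelings of toy 2-D points and keeps the higher-scoring one. The data, the candidate names and the helper function are all hypothetical; in practice the candidates would come from different algorithms and hyperparameters.

```python
from math import dist
from statistics import mean

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points; lies in [-1, 1]."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(mean(dist(p, q) for q in members)
                for other, members in clusters.items() if other != label)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)

# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Hypothetical outputs of two clustering runs over the same points.
candidates = {
    "run_a": [0, 0, 0, 1, 1, 1],   # matches the visible groups
    "run_b": [0, 1, 0, 1, 0, 1],   # mixes the groups
}

scores = {name: silhouette_score(points, labels)
          for name, labels in candidates.items()}
best = max(scores, key=scores.get)
```

Here `best` is the run with the highest average score, which is the selection rule the slide describes.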

19 #SNAP_R
Automated Social Spear Phishing


20 #SNAP_R
Recon and Footprinting for Profiling
• Compute histogram of tweet timings (bin size = 1 hour)
• Pick a random minute within the most active hour to tweet
• Bag of Words on timeline tweets
• Select most commonly occurring non-stopword
• We seed the neural network with topics that the user frequently posts about
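
The two text statistics named above, an hourly histogram and a bag-of-words count, can be sketched with the standard library. This is a minimal illustration on made-up posts, not the authors' code: the sample data, the tiny stopword list and the function names are assumptions.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample data: (ISO timestamp, text) pairs standing in for a timeline.
posts = [
    ("2016-08-01T09:15:00", "Reading about machine learning for infosec"),
    ("2016-08-01T21:40:00", "Great infosec talk today"),
    ("2016-08-02T21:05:00", "More infosec reading tonight"),
    ("2016-08-03T21:55:00", "The coffee here is great"),
]

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "is", "for", "about", "a", "an", "and", "to", "of", "here", "more"}

def hour_histogram(posts):
    """Count posts per hour of day (bin size = 1 hour)."""
    return Counter(datetime.fromisoformat(ts).hour for ts, _ in posts)

def top_topic(posts):
    """Most common non-stopword across all post texts (bag of words)."""
    words = Counter(
        w
        for _, text in posts
        for w in text.lower().split()
        if w.isalpha() and w not in STOPWORDS
    )
    return words.most_common(1)[0][0]

peak_hour = hour_histogram(posts).most_common(1)[0][0]
topic = top_topic(posts)
```

On this sample, `peak_hour` is the hour with the most posts and `topic` is the most frequent word that is not a stopword.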

21 #SNAP_R
Leveraging Markov Models
• Popular for text generation: see /r/SubredditSimulator, InfosecTalk TitleBot
• Calculates pairwise frequency of tokens and uses that to generate new ones
• Based on transition probabilities
• Trained using most recent posts on the user’s timeline

[Diagram: Markov chain over the tokens "I", "don’t", "like", "ML", "infosec" and "." with transition probabilities 0.38, 0.62, 0.54, 0.46 and 1]
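
A first-order (bigram) Markov text model of the kind described above can be sketched as follows. The toy corpus reuses the tokens from the slide's diagram, but the sentences, and therefore the resulting probabilities, are invented for illustration; the function names and the `<s>` / `</s>` boundary markers are likewise assumptions.

```python
import random
from collections import defaultdict

def train_bigram_model(sentences):
    """Map each token to the list of tokens observed right after it.

    Repeats are kept, so sampling uniformly from a list follows the
    observed pairwise (transition) frequencies.
    """
    model = defaultdict(list)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for current, following in zip(tokens, tokens[1:]):
            model[current].append(following)
    return model

def transition_probabilities(model, token):
    """Empirical probability of each token that followed `token` in training."""
    followers = model[token]
    return {t: followers.count(t) / len(followers) for t in set(followers)}

def generate(model, rng, max_tokens=20):
    """Walk the chain from the start marker until the end marker or the cap."""
    token, out = "<s>", []
    for _ in range(max_tokens):
        token = rng.choice(model[token])
        if token == "</s>":
            break
        out.append(token)
    return " ".join(out)

# Invented training sentences using the tokens shown in the diagram.
corpus = ["I like ML .", "I like infosec .", "I don't like ML ."]
model = train_bigram_model(corpus)
probs = transition_probabilities(model, "like")
sample = generate(model, random.Random(0))
```

With so little training text the model can only recombine what it has seen, which is consistent with the slide deck's later caveat that Markov chains perform poorly when there are few posts to train on.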

22 #SNAP_R
Training a Recurrent Neural Network
• Hosted on Amazon EC2
• Trained on g2.2xlarge instance (65¢ per hour)
• Ubuntu (ami-c79b7eac)
• Training set > 2M tweets
• Took 5.5 days to train
• 3 layers, ~500 units/layer

LSTM = Long Short-Term Memory
Illustration: Chris Olah (@ch402)
LSTMs: Hochreiter & Schmidhuber, 1997
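
As a rough worked figure, the quoted hourly price and training time imply a compute bill of well under $100. The calculation assumes the instance ran continuously for the full 5.5 days at the quoted rate and ignores storage and data transfer, none of which the slide states.

```python
HOURLY_RATE_CENTS = 65   # g2.2xlarge price quoted on the slide
TRAINING_DAYS = 5.5      # training time quoted on the slide

# Assumption: the instance is billed for every hour of the training run.
training_hours = TRAINING_DAYS * 24
cost_cents = round(training_hours * HOURLY_RATE_CENTS)

print(f"{training_hours:.0f} hours, about ${cost_cents / 100:.2f}")
```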
23 #SNAP_R
Tradeoffs and Caveats
Metric | LSTM | Markov Chain
Training Speed | Days | Seconds
Accuracy | High | Medium
Availability | Public | Public
Size | Large | Small

Caveats (LSTM)
• Deeper representation of natural language, generalizes well
• Retraining required for new languages

Caveats (Markov Chain)
• Overfits to each user, can create temporally irrelevant tweets
• Performs poorly on users with few tweets
24 #SNAP_R
Language and Social Network Agnosticism
• Markov models only use content on user’s timeline, which means they can automatically generate content in other languages
• For neural nets, you’d only need to scrape data from the target language and retrain
• Both of these methods can also be applied to other social networks

25 #SNAP_R
Evaluation and Metrics


26 #SNAP_R
Here’s a malicious URL...

27 #SNAP_R
And, apparently goo.gl lets us shorten it!

28 #SNAP_R
goo.gl also gives us analytics

29 #SNAP_R
Results and Demo


30 #SNAP_R
Wild Testing #SNAP_R

31 #SNAP_R
Pilot Experiment
• Via #SNAP_R we sent 90 “phishing” posts out to people using #cat
• After 2 hours, we had a 17% clickthrough rate
• After 2 days, we had a clickthrough rate between 30% and 66%
• Inside the Data
  • goo.gl showed 27 clickthroughs (30%) came from a t.co referrer
  • Unknown referrers might be caused by bots
  • With unique locations, clickthrough rate may be as high as 66%
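
The percentages above follow from the 90 posts sent. The short check below reproduces the 30% figure from the 27 reported clickthroughs and back-calculates approximate click counts for the 17% and 66% rates. Those two counts are inferred here for illustration and are not reported on the slide.

```python
POSTS_SENT = 90  # posts sent in the pilot, per the slide

def clickthrough_rate(clicks, posts=POSTS_SENT):
    """Fraction of posts whose link was clicked."""
    return clicks / posts

def implied_clicks(rate, posts=POSTS_SENT):
    """Approximate click count implied by a reported rate (an inference)."""
    return round(rate * posts)

tco_rate = clickthrough_rate(27)      # clicks with a t.co referrer
clicks_2h = implied_clicks(0.17)      # implied by the 2-hour rate
clicks_upper = implied_clicks(0.66)   # implied by the upper bound

print(f"t.co referrer rate: {tco_rate:.0%}")
print(f"implied clicks after 2 hours: ~{clicks_2h}")
print(f"implied clicks at the upper bound: ~{clicks_upper}")
```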

32 #SNAP_R
Man vs. Machine 2 Hour Bake Off

Metric | Person | SNAP_R
Total Targets | ~200 | 819
Tweets/minute | 1.67 | 6.85
Click-throughs | 49 | 275
Observations | Copy/pasting messages to different hashtags | Arbitrarily scalable with the number of machines
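
The table gives raw counts only. Dividing click-throughs by targets gives a per-target rate for each side, and the ratio of tweets per minute gives the throughput gap. The numbers are taken from the table; the person's target count is approximate ("~200"), so that rate is approximate too.

```python
# Figures from the bake-off table; the person's target count is approximate.
results = {
    "Person": {"targets": 200, "tweets_per_min": 1.67, "clicks": 49},
    "SNAP_R": {"targets": 819, "tweets_per_min": 6.85, "clicks": 275},
}

# Click-throughs per target for each side.
rates = {name: r["clicks"] / r["targets"] for name, r in results.items()}

# How many times faster the tool tweeted than the person.
throughput_ratio = (results["SNAP_R"]["tweets_per_min"]
                    / results["Person"]["tweets_per_min"])

for name, rate in rates.items():
    print(f"{name}: {rate:.1%} click-through per target")
print(f"throughput ratio: {throughput_ratio:.1f}x")
```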

33 #SNAP_R
DEMO of #SNAP_R

34 #SNAP_R
Wrap Up

35 #SNAP_R
Potential Use Cases
• Social media security awareness
• Social media security education
• Automated internal pentesting
• Social engagement
• Staff Recruiting

36 #SNAP_R
Mitigations
• Of course, we’re white hats here…
• But machine learning is rapidly becoming automated, so black hats would have this capability soon.
• Protected accounts are immune to timeline scraping, which defeats the tool
• Bots can be detected
• Standard mitigations apply:
  • Don’t click on links from people you don’t know
  • Report! Twitter is pretty good at flagging spam accounts
  • Maybe URL shorteners should be responsible for malware?

37 #SNAP_R
Black Hat Sound Bytes

• Machine learning can be used offensively to automate spear phishing
• Machine-generated grammar is bad, but Twitter users DGAF
• Abundant personal data is publicly accessible and effective for social engineering

38 #SNAP_R
?

John Seymour (@_delta_zero) | Philip Tully (@phtully)

We’ll also be at the booth immediately after the presentation!

39 #SNAP_R
