Weaponizing Data Science for Social Engineering: Automated E2E Spear Phishing on Twitter

DEF CON

Seymour, John; Tully, Philip

Formal Metadata

Title

Weaponizing Data Science for Social Engineering: Automated E2E Spear Phishing on Twitter

Title of Series

DEF CON 24

Number of Parts

Author

Seymour, John

Tully, Philip

License

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/36231 (DOI)

Publisher

DEF CON

Release Date

2016

Language

English

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

Historically, machine learning for information security has prioritized defense: think intrusion detection systems, malware classification and botnet traffic identification. Offense can benefit from data just as well. Social networks, especially Twitter with its access to extensive personal data, bot-friendly API, colloquial syntax and prevalence of shortened links, are the perfect venues for spreading machine-generated malicious content. We present a recurrent neural network that learns to tweet phishing posts targeting specific users. The model is trained using spear phishing pen-testing data, and in order to make a click-through more likely, it is dynamically seeded with topics extracted from timeline posts of both the target and the users they retweet or follow. We augment the model with clustering to identify high-value targets based on their level of social engagement, such as their number of followers and retweets, and measure success using click-through rates of IP-tracked links. Taken together, these techniques enable the world’s first automated end-to-end spear phishing campaign generator for Twitter.

Bios:

John Seymour is a Data Scientist at ZeroFOX, Inc. by day, and a Ph.D. student at the University of Maryland, Baltimore County by night. He researches the intersection of machine learning and InfoSec in both roles. He’s mostly interested in avoiding, and helping others avoid, some of the major pitfalls in machine learning, especially in dataset preparation (seriously, do people still use malware datasets from 1998?). He has spoken at both DEF CON and BSides, and aims to add BlackHat USA and SecTor to the list in the near future.

Philip Tully is a Senior Data Scientist at ZeroFOX, a social media security company based in Baltimore. He employs natural language processing and computer vision techniques in order to develop predictive models for combating threats emanating from social media. His pivot into the realm of infosec is recent, but his experience in machine learning and artificial neural networks is not. Rather than learning patterns within text and image data, his previous work focused on learning patterns of spikes in large-scale recurrently connected neural circuit models. He is an all-but-defended computer science Ph.D. student, in the final stages of completing a joint degree at the Royal Institute of Technology (KTH) and the University of Edinburgh.