
Google Dorks: Analysis, Creation, and New Defenses

This document discusses Google Dorks, which are search queries used by attackers to quickly find vulnerable systems. It aims to understand existing dork techniques, develop defenses against them, and evaluate if attackers could evolve new styles of dorks. The authors classify over 5000 dorks and find the most common rely on URL patterns, file extensions, and content like banners and errors. They propose defenses like randomizing URLs and removing identifying content. A new type of "word-based" dork is developed using common words from content management systems, and defenses to break up identifying words are explored.


Google Dorks: Analysis, Creation, and new Defenses


Flavio Toffalini, University of Verona, IT
Maurizio Abbà, LastLine, UK
Damiano Carra, University of Verona, IT
Davide Balzarotti, Eurecom, FR

GOOGLE DORKS

MOTIVATION

Attackers use Dorks to quickly locate targets

After a new vulnerability is disclosed, one Google query is sufficient to identify a large number of vulnerable installations

No time for sysadmins to apply patches!

If we could prevent dorks, attackers would need to resort to Internet scanning, which is several orders of magnitude slower

GOALS

Current practices

Understand which information is used by existing dorks

Design simple solutions to defeat those dorks

Future threats

Test if attackers could move towards new styles of dorks

Design simple solutions to prevent it

GOOGLE DORKS

TAXONOMY

The Exploit-DB database contains over 5143 dorks

Automated/manual analysis:
URL Patterns (44%)
File Extensions (6%)
Content-Based (74%)

Content-Based dorks (74%) break down into:
Banners (54%)
Misconfigurations (8%)
Error messages (1%)
Common words (11%)
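The automated part of such a classification could look roughly like the sketch below. It is only an illustration: the mapping from Google search operators (`inurl:`, `filetype:`, `intitle:`, and so on) to the three top-level categories is an assumption, not the authors' procedure, and `classify_dork` is a hypothetical helper name. A dork may receive several labels, which is consistent with the percentages above summing to more than 100%.

```python
import re

# Assumed mapping from Google search operators to taxonomy categories.
URL_OPS = {"inurl", "allinurl"}
EXT_OPS = {"filetype", "ext"}
CONTENT_OPS = {"intitle", "allintitle", "intext", "allintext"}


def classify_dork(dork: str) -> set:
    """Return the set of top-level categories a dork appears to rely on."""
    query = dork.lower()
    # Operator names are the words immediately followed by a colon.
    ops = set(re.findall(r"\b([a-z]+):", query))
    labels = set()
    if ops & URL_OPS:
        labels.add("URL Patterns")
    if ops & EXT_OPS:
        labels.add("File Extensions")
    # Whatever remains after removing operator terms is plain page content.
    bare_terms = re.sub(r'\b\w+:("[^"]*"|\S+)', "", query).strip()
    if ops & CONTENT_OPS or bare_terms:
        labels.add("Content-Based")
    return labels


print(classify_dork("inurl:wp-content filetype:php"))
print(classify_dork('"Powered by WordPress"'))
```

Splitting Content-Based dorks further into banners, misconfigurations, error messages, and common words would need manual review or additional heuristics that this sketch does not attempt.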

DORKS EVOLUTION BY CATEGORY

[Figure: evolution of dorks by category. Legend: URL Patterns, Misconfiguration, Banner, Common words]

KNOWN DEFENSES

URL Patterns
File Extensions
Content-Based
  Banners: remove banners
  Misconfigurations: improve system configuration
  Error messages: proper error handling
  Common words

CONTRIBUTION

URL Patterns: ??
File Extensions
Content-Based
  Banners: remove banners
  Misconfigurations: improve system configuration
  Error messages: proper error handling
  Common words: ??

URL-DORKS

Force search engines to index randomized URLs

Let the users navigate and share using cleartext URLs


http://www.web-site.com/wp-content/dimva.html

http://www.web-site.com/HD12DAF35TR/dimva.html

URL-DORKS

XOR (part of) the URL with a random seed kept on the server
a = resource a
O(a) = obfuscated resource a

Redirect 301 to inform the search engine that the page has moved

Canonical URL tag to remove plain URLs from the results

Intercept and replace the Sitemap
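A minimal sketch of the XOR step, under stated assumptions: only the first path segment is obfuscated, the XOR output is hex-encoded to keep it URL-safe, and the seed is a random byte string held in memory. The names `SEED`, `obfuscate`, and `deobfuscate` are hypothetical, and the actual encoding used by the authors may differ.

```python
import secrets
from itertools import cycle

# Hypothetical server-side secret; it never leaves the server.
SEED = secrets.token_bytes(16)


def _xor(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def obfuscate(path: str, seed: bytes = SEED) -> str:
    """O(a): XOR the first path segment with the seed and hex-encode it."""
    head, sep, tail = path.lstrip("/").partition("/")
    return "/" + _xor(head.encode(), seed).hex().upper() + sep + tail


def deobfuscate(path: str, seed: bytes = SEED) -> str:
    """Inverse of obfuscate(): recover the cleartext path a from O(a)."""
    head, sep, tail = path.lstrip("/").partition("/")
    return "/" + _xor(bytes.fromhex(head), seed).decode() + sep + tail


hidden = obfuscate("/wp-content/dimva.html")
print(hidden)
print(deobfuscate(hidden))
```

Because XOR is its own inverse, the server needs only the seed to map an indexed URL back to the real resource, while a dork such as `inurl:wp-content` no longer matches the indexed form.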

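OBFUSCATION PROTOCOL - CRAWLERS

Participants: Crawler, URL Obfuscator, Web Site

1. Crawler -> URL Obfuscator: a
2. URL Obfuscator -> Crawler: Redir. 301 to O(a)
3. Crawler -> URL Obfuscator: O(a)
4. URL Obfuscator -> Web Site: a
5. Web Site -> URL Obfuscator: resp. of a
6. URL Obfuscator -> Crawler: resp. of a + canonical tag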

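OBFUSCATION PROTOCOL - BROWSER

Participants: Browser, URL Obfuscator, Web Site

1. Browser -> URL Obfuscator: O(a)
2. URL Obfuscator -> Web Site: a
3. Web Site -> URL Obfuscator: resp. of a
4. URL Obfuscator -> Browser: resp. of a
5. URL Obfuscator -> Web Site: b (follow-up request for a cleartext URL)
6. Web Site -> URL Obfuscator: resp. of b
7. URL Obfuscator -> Browser: resp. of b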
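The crawler and browser exchanges can be sketched as a single request handler. This is an illustration under assumptions: the obfuscation mapping is a fixed table built from the two example URLs shown earlier (a real obfuscator would compute O(a)), crawler detection is reduced to a boolean flag supplied by the caller, and `Response`, `fetch`, and `handle` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Response(NamedTuple):
    status: int
    location: Optional[str] = None   # target of a 301 redirect
    body: Optional[str] = None       # page content
    canonical: Optional[str] = None  # URL advertised in the canonical tag


# Example mapping a -> O(a), taken from the two sample URLs in the slides.
O = {"/wp-content/dimva.html": "/HD12DAF35TR/dimva.html"}
O_INV = {obf: clear for clear, obf in O.items()}


def fetch(path: str) -> str:
    """Stand-in for the real web site behind the obfuscator."""
    return f"<content of {path}>"


def handle(path: str, is_crawler: bool) -> Response:
    """Decide how the obfuscator answers one request."""
    if path in O_INV:
        # Obfuscated URL: serve the underlying resource. Crawlers also get
        # a canonical tag pointing at the obfuscated URL.
        canonical = path if is_crawler else None
        return Response(200, body=fetch(O_INV[path]), canonical=canonical)
    if is_crawler and path in O:
        # Cleartext URL requested by a crawler: tell it the page has moved.
        return Response(301, location=O[path])
    # Cleartext URL requested by a browser: pass through unchanged.
    return Response(200, body=fetch(path))


print(handle("/wp-content/dimva.html", is_crawler=True))
print(handle("/HD12DAF35TR/dimva.html", is_crawler=False))
```

How a deployment tells crawlers from browsers (for example by User-Agent or by verifying crawler IP ranges) is left open here and is not specified in the slides.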

WORD-BASED DORKS

Goal

Use words left by CMSs to create a Google Dork

Greedy search algorithm that maximizes:

Hit rank: percentage of returned web sites built with the target technology

Coverage: number of entries returned by the dork
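As a rough illustration of the hit-rank metric, the sketch below computes it over a sample of search results. Both `hit_rank` and the `uses_target` predicate are hypothetical: in practice the predicate would fingerprint each site to check whether it runs the target CMS, and coverage would simply be the result count reported by the search engine for the dork.

```python
from typing import Callable, Sequence


def hit_rank(result_urls: Sequence[str], uses_target: Callable[[str], bool]) -> float:
    """Fraction of sampled results that are built with the target technology."""
    if not result_urls:
        return 0.0
    hits = sum(1 for url in result_urls if uses_target(url))
    return hits / len(result_urls)


# Toy sample with made-up hosts; three of the four "run" the target CMS.
sample = ["a.example", "b.example", "c.example", "d.example"]
print(hit_rank(sample, lambda url: url != "d.example"))
```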

WORD-BASED DORKS: CREATION

1. Start from a vanilla installation of the target CMS (e.g., Joomla!)
2. Extract the list of words it leaves in its pages: Categories, Buy, Recent, Register, Submit, Users, Contact, Registration, ...
3. Compute hit rank & coverage for candidate combinations: Category + Submit + ....

WORD-BASED DORKS: CREATION

Gradient Ascent algorithm

How to add a new word?

At each step, we add the word that provides the highest hit rank among those whose coverage is above the median of all candidate words
(more details in the paper)
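The selection rule above can be sketched as follows. This is not the authors' implementation: `build_dork` and its `score` callback are hypothetical, the stopping rule (a fixed `max_words`) is an assumption, and the coverage filter uses "at or above the median" so that at least one candidate always remains eligible.

```python
from statistics import median
from typing import Callable, List, Sequence, Tuple


def build_dork(
    candidates: Sequence[str],
    score: Callable[[List[str]], Tuple[float, float]],
    max_words: int = 3,
) -> List[str]:
    """Greedily grow a dork; score(words) returns (hit_rank, coverage)."""
    dork: List[str] = []
    remaining = list(candidates)
    while remaining and len(dork) < max_words:
        # Score the current dork extended by each remaining candidate word.
        scored = {w: score(dork + [w]) for w in remaining}
        cutoff = median(cov for _, cov in scored.values())
        # Keep candidates whose coverage reaches the median, then take the
        # one with the highest hit rank.
        eligible = [w for w in remaining if scored[w][1] >= cutoff]
        best = max(eligible, key=lambda w: scored[w][0])
        dork.append(best)
        remaining.remove(best)
    return dork


# Toy scores (made-up numbers) that depend only on the last word added.
STATS = {
    "Categories": (0.5, 100),
    "Submit": (0.9, 10),
    "Register": (0.7, 80),
    "Users": (0.6, 90),
}
print(build_dork(list(STATS), lambda words: STATS[words[-1]], max_words=2))
```

In a real run, `score` would issue the query to the search engine and measure hit rank on a sample of the results, which is why the number of evaluations per step matters.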

WORD-BASED DORKS

CMS         Metric      Common Words   Ground Truth
WordPress   Hit rank    938/1000       967/1000
            Coverage    47.1 M         83.6 M
Joomla!     Hit rank    878/1000       887/1000
            Coverage    7.24 M         3.73 M
Drupal      Hit rank    827/1000       997/1000
            Coverage    7.87 M         3.27 M
Magento     Hit rank    871/1000       852/1000
            Coverage    0.39 M         0.68 M
OpenCart    Hit rank    891/1000       998/1000
            Coverage    0.59 M         1.42 M

WORD-BASED DORKS: DEFENSES


Idea: add invisible characters to break words and prevent them from being indexed.

Powered by WordPress

Power⁣ed b⁣y Wor⁣dPress
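A minimal sketch of the word-breaking idea, assuming the invisible character is U+2063 (INVISIBLE SEPARATOR) and that it is inserted at fixed intervals; the slides do not state which character or which positions are used, and `break_words` is a hypothetical helper.

```python
# Assumption: U+2063 INVISIBLE SEPARATOR renders as nothing in browsers.
INVISIBLE = "\u2063"


def break_words(text: str, every: int = 3) -> str:
    """Insert an invisible character after every `every` letters of each word."""
    broken = []
    for word in text.split(" "):
        chunks = [word[i:i + every] for i in range(0, len(word), every)]
        broken.append(INVISIBLE.join(chunks))
    return " ".join(broken)


banner = break_words("Powered by WordPress")
print("WordPress" in banner)
print(banner.replace(INVISIBLE, "") == "Powered by WordPress")
```

The text looks unchanged to a visitor, but the literal token "WordPress" no longer appears in the page source. Whether a given search engine treats the separator as a word break is something to verify empirically.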


DORKS DEFENSES

URL Patterns
File Extensions
Content-Based
  Banners: remove banners
  Misconfigurations: improve system configuration
  Error messages: proper error handling
  Common words

CONCLUSION

1) Dork classification
2) URL Pattern Dork Defense
3) New type of Dork using common words
4) Defense against common word dorks
