
Character String Search

This document provides regular expressions or regex strings that can be used to search for sensitive personal information like Social Security Numbers (SSNs) and credit card numbers in various file formats. It explains that the strings were designed for Cornell's Spider search program but should also work for other tools with some minor syntax modifications. Users are advised to search all file types, including binary files, and to carefully examine any files found to ensure the detected data is actually of concern.

Uploaded by

Raghu Saravanan
Copyright
© Attribution Non-Commercial (BY-NC)

Information Technology Information and Systems Security/Compliance

Character Strings for Search Tools


The following strings will be useful for finding data using many of these search tools. The tools search for data by matching character strings using regular expressions, or regexes. You can find a tutorial on using regexes at http://en.wikipedia.org/wiki/Regex.

These strings have been prepared for use with Cornell's Spider program. They should work for other search programs as well, but may require slight syntax modifications, depending on the version of regex your program supports. You may combine any number of search strings into a single search by joining them with the | character. You may need to replace the "|" with "\|" for some versions of Unix, including Mac OS.

When conducting your search, you should look at all files, including binary files, and examine the contents of compressed files if possible. On a Unix system, grep -a will examine binary files as text.

Please examine the contents of any reported files carefully. A hit does not automatically mean the data is of concern.

In the labels below, "breaks" refers to the \b word-boundary anchor, which keeps a pattern from matching inside a longer run of letters or digits.

SSN w/ dashes: [0-7]\d{2}\-\d{2}\-\d{4}
SSN w/ dashes and breaks: \b[0-7]\d{2}\-\d{2}\-\d{4}\b
SSN 9 consecutive digits: [0-7]\d{8}
SSN 9 consecutive digits with breaks: \b[0-7]\d{8}\b
SSN w/ spaces: [0-7]\d{2}\s\d{2}\s\d{4}
SSN w/ spaces and breaks: \b[0-7]\d{2}\s\d{2}\s\d{4}\b
All SSN search options with breaks: \b[0-7]\d{2}\-\d{2}\-\d{4}\b|\b[0-7]\d{8}\b|\b[0-7]\d{2}\s\d{2}\s\d{4}\b
All SSN search options with no breaks: [0-7]\d{2}\-\d{2}\-\d{4}|[0-7]\d{8}|[0-7]\d{2}\s\d{2}\s\d{4}
Visa/Mastercard/Discover: \d{4}\-\d{4}\-\d{4}\-\d{4}|\d{4}\s\d{4}\s\d{4}\s\d{4}
Visa/Mastercard/Discover with breaks: \b\d{4}\-\d{4}\-\d{4}\-\d{4}\b|\b\d{4}\s\d{4}\s\d{4}\s\d{4}\b
American Express: \d{4}\-\d{6}\-\d{5}|\d{4}\s\d{6}\s\d{5}
American Express with breaks: \b\d{4}\-\d{6}\-\d{5}\b|\b\d{4}\s\d{6}\s\d{5}\b
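As an illustration only, the search strings can be tried outside Spider with Python's standard re module. The sketch below is not part of the Spider program. The names SSN_BREAKS, CARD16_BREAKS, AMEX_BREAKS, SSN_9_NO_BREAKS, find_hits and find_hits_in_bytes are hypothetical, and decoding bytes as Latin-1 is an assumed stand-in for the way grep -a treats binary data as text. The re.ASCII flag is an added assumption so that \b, \d and \s consider only ASCII characters.

```python
import re

# "With breaks" search strings, copied from the list in this document.
SSN_BREAKS = (
    r"\b[0-7]\d{2}\-\d{2}\-\d{4}\b"
    r"|\b[0-7]\d{8}\b"
    r"|\b[0-7]\d{2}\s\d{2}\s\d{4}\b"
)
CARD16_BREAKS = (
    r"\b\d{4}\-\d{4}\-\d{4}\-\d{4}\b"
    r"|\b\d{4}\s\d{4}\s\d{4}\s\d{4}\b"
)
AMEX_BREAKS = (
    r"\b\d{4}\-\d{6}\-\d{5}\b"
    r"|\b\d{4}\s\d{6}\s\d{5}\b"
)

# Several search strings joined into a single search with "|".
# re.ASCII (an assumption of this sketch) keeps \b, \d and \s ASCII-only,
# so stray high bytes next to a number are not treated as word characters.
COMBINED = re.compile("|".join([SSN_BREAKS, CARD16_BREAKS, AMEX_BREAKS]), re.ASCII)

# The 9-digit SSN string without breaks, kept for comparison.
SSN_9_NO_BREAKS = re.compile(r"[0-7]\d{8}", re.ASCII)


def find_hits(text):
    """Return every substring of text matched by the combined search."""
    return [m.group(0) for m in COMBINED.finditer(text)]


def find_hits_in_bytes(data):
    """Search raw file bytes as text (assumed analogue of grep -a).

    Latin-1 maps each byte to exactly one character, so decoding never fails.
    """
    return find_hits(data.decode("latin-1"))


if __name__ == "__main__":
    print(find_hits("SSN 123-45-6789, card 4111 1111 1111 1111"))
    # A 13-digit number: the no-breaks string reports its first nine digits,
    # while the version with breaks reports nothing.
    print(SSN_9_NO_BREAKS.findall("id 1234567890123"))
    print(find_hits("id 1234567890123"))
```

The comparison at the end shows why each reported file still needs a manual look: without breaks, any long digit run whose leading digit is 0 through 7 produces a hit.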
