Assignment 2 Vinay Kumar Chandra
2. (Optional) List the examples that you found you could not
match with the current regular expressions with two
extracted parts, ending in .edu. For each example or set of
examples that fit the same pattern, explain briefly why it won’t
work. If you can make expressions in RegExpal that match, but
don’t work in the program to extract the email or phone numbers,
list those here. If you had any false positives in Part 1, include a
discussion here of why that rule generated them.
Answer:
Out of all cases, I had 108 true positives (TP), 0 false positives (FP), and 9
false negatives (FN).
True Positives: 103 emails and phone numbers were correctly matched.
<script>
  // Build the address at runtime so it never appears whole in the page source.
  const user = 'name';
  const site = 'domain.com';
  const address = user + '@' + site;
  document.write('<a href="mailto:' + address + '">' + address + '</a>');
</script>
In this approach, the address is split into a user part and a site part
and is only assembled when the script runs, so the complete address never
appears as a literal string in the page source. This makes it difficult for
regular-expression-based scrapers, which scan the raw HTML, to identify the
email. Another effective method would be to use a CAPTCHA, which requires
user interaction to reveal the email address and which bots typically cannot
bypass. However, this reduces accessibility and can inconvenience users.
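To illustrate why the script-based approach defeats a simple scraper, here is a minimal sketch that runs one email regular expression over a plain mailto link and over the obfuscated script source. The pattern `EMAIL_PATTERN`, the helper `findEmails`, and both sample strings are hypothetical and chosen only for illustration; they are not the expressions used in the assignment program.

```javascript
// Hypothetical, simplified email pattern (an assumption for this sketch,
// not the regular expression used in the assignment).
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Return every substring of the raw page source that looks like an email.
function findEmails(source) {
  return source.match(EMAIL_PATTERN) || [];
}

// Page source with the address written out literally.
const plainSource = '<a href="mailto:name@domain.com">name@domain.com</a>';

// Page source where the address is only assembled when the script runs.
const obfuscatedSource = [
  "<script>",
  "  const user = 'name';",
  "  const site = 'domain.com';",
  "  document.write(user + '@' + site);",
  "</script>",
].join("\n");

console.log(findEmails(plainSource));      // two matches, both 'name@domain.com'
console.log(findEmails(obfuscatedSource)); // no matches
```

The scraper finds nothing in the second case because the '@' character is never adjacent to the user and site strings in the raw text. A scraper that actually executes the JavaScript and inspects the rendered page would still recover the address, so this technique only stops tools that match against the unrendered source.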