Assignment 4: Hashing For Strings: Goal

The document describes an assignment to find valid anagrams of input strings by hashing words from a vocabulary. Students are provided a vocabulary file of words and an input file of strings. They must hash the vocabulary, store it efficiently, and search for anagrams of the input strings in lexicographic order. The output must list the anagrams or "-1" to indicate moving to the next string. Efficiency is evaluated based on runtime. Students must also provide a humorous anagram of a friend's name.

Uploaded by

Anshul Agarwal

ASSIGNMENT 4: HASHING for STRINGS

Goal: The goal of this assignment is to get some practice with collision resolution and hash functions. On the side
you will also learn basic string manipulations. It’s a fun assignment where the task is to find valid anagrams of a given
input.

Problem Statement: You are given a vocabulary V of (lowercase) English words. It uses the letters of the English alphabet
[a-z], digits [0-9], and the apostrophe symbol [']. No other characters are used in the vocabulary V. Your goal is to
print out all valid anagrams of an input string. The input string will be a sequence of at most 12 characters.

Anagram: Two strings are anagrams of each other if by rearranging letters of one string you can obtain the other. For
example, “a bit” is an anagram of “bait”, and “super” is an anagram of “purse”. Note that we can add spaces at will,
i.e., we won’t count spaces when matching characters for checking anagrams.
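The definition above can be sketched in Java as follows. This is only an illustration, not a required approach: the class name AnagramCheck and the helper signature() are hypothetical names. The idea is that two strings are anagrams exactly when they contain the same characters once spaces are removed, so sorting the remaining characters gives a canonical form that can be compared directly.

```java
import java.util.Arrays;

final class AnagramCheck {
    // Canonical form of a string: drop spaces, then sort the remaining characters.
    static String signature(String s) {
        char[] chars = s.replace(" ", "").toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Two strings are anagrams when their canonical forms are equal.
    static boolean areAnagrams(String a, String b) {
        return signature(a).equals(signature(b));
    }

    public static void main(String[] args) {
        System.out.println(areAnagrams("a bit", "bait"));   // true
        System.out.println(areAnagrams("super", "purse"));  // true
        System.out.println(areAnagrams("bb", "bat"));       // false
    }
}
```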

In this assignment, you will load V from a text file and then be ready to compute anagrams. You will also be provided an
input file in text format. In both the vocabulary and input files there will be one string written per line. Your goal
will be to compute all valid anagrams (i.e., each word within your anagram must be present in V) of all input strings. After
computing all valid anagrams of one string you must output a ‘-1’ to indicate that you are done computing anagrams
of this string. For the purpose of this assignment, you only have to compute anagrams with a maximum of 2 spaces in
them (i.e., three words at most). However, each permutation of these words will also be a valid anagram.

This is the first assignment in the course where you will be evaluated not only on the correctness and complexity of
your code, but also on its runtime efficiency. You can measure the time your code takes to run using
Java's built-in System.currentTimeMillis() method.
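A minimal timing sketch is shown below. It assumes the standard Java call System.currentTimeMillis() is the intended timer; the class name TimingSketch and the helper timeMillis() are hypothetical. The elapsed time is written to stderr so that stdout stays free for the graded output.

```java
final class TimingSketch {
    // Runs a task and returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(Runnable task) {
        long start = System.currentTimeMillis();
        task.run();
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        long elapsed = timeMillis(() -> {
            // Stand-in workload; replace with the anagram computation.
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) sum += i;
            if (sum < 0) throw new IllegalStateException("unreachable");
        });
        // stderr, not stdout, so the measurement does not pollute the output.
        System.err.println("elapsed ms: " + elapsed);
    }
}
```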

Vocabulary File: The vocabulary file (vocabulary.txt) will be provided in the resources of the assignment. The first
line of the Vocabulary will indicate the number of words in the Vocabulary (V), followed by one word per line (all
lowercase and no spaces). A sample vocabulary.txt is given below:

bit

bat

tab
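One way to read a file in this format (a count on the first line, then one string per line) is sketched below. The names WordListReader and readList() are hypothetical, and the in-memory sample in main, including its count line of 3, is an assumption made for illustration rather than the contents of the provided vocabulary.txt.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class WordListReader {
    // Reads the count from the first line, then that many strings, one per line.
    static String[] readList(Reader source) {
        try {
            BufferedReader in = new BufferedReader(source);
            String first = in.readLine();
            if (first == null) throw new IllegalArgumentException("missing count line");
            int count = Integer.parseInt(first.trim());
            String[] words = new String[count];
            for (int i = 0; i < count; i++) {
                String line = in.readLine();
                if (line == null) {
                    throw new IllegalArgumentException("expected " + count + " lines, got " + i);
                }
                words[i] = line.trim();
            }
            return words;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical in-memory stand-in for a vocabulary file.
        String sample = "3\nbit\nbat\ntab\n";
        String[] words = readList(new StringReader(sample));
        System.out.println(words.length + " words, first = " + words[0]);
    }
}
```

For a real file, the same method could be called with new java.io.FileReader(args[0]); the same reader also fits input.txt, which uses the same layout.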

Input File: The input file (input.txt) will be an input to the code at runtime. The first line will have the number (K)
of input strings. This will be followed by K lines, with one string per line. Each string will contain only lowercase letters, digits, and
apostrophes. It will not contain a space. A sample input.txt is given below:

bait

bb

Output File: You will produce all valid anagrams of each input string and output -1 after finishing with one input
and moving on to the next. The output for a particular string should be in lexicographic order. Lexicographic ordering is
done based on ASCII codes, i.e., lowercase>digits>apostrophe>space, so a space sorts before every other character. For example, for the input file above you will
output:

a bit

bat i

bit a

i bat

i tab

tab i

-1

-1

Note that for the second input word there were no valid anagrams found. Also note that the number of ‘-1’s in the
output should be exactly the same as the number of input words in input.txt. Your output must be produced on stdout
(without any other extra information).

Further note that output anagrams should not have contiguous spaces. They should not start with a space, or end with
a space. These will be required for correctly autograding your assignment.
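The ordering and terminator rules above can be sketched as follows. The class name OutputOrder and the method format() are hypothetical. The sketch relies on String.compareTo, which compares character codes and therefore matches ASCII order for the characters allowed here; built-in sorting is permitted by the assignment rules.

```java
import java.util.Arrays;

final class OutputOrder {
    // Sorts the anagrams by character code and appends the -1 terminator line.
    static String format(String[] anagrams) {
        String[] sorted = anagrams.clone();
        Arrays.sort(sorted); // natural String order, i.e. String.compareTo
        StringBuilder out = new StringBuilder();
        for (String s : sorted) {
            out.append(s).append('\n');
        }
        out.append("-1\n");
        return out.toString();
    }

    public static void main(String[] args) {
        String[] found = {"tab i", "i bat", "bit a", "a bit", "i tab", "bat i"};
        System.out.print(format(found));         // six anagrams in order, then -1
        System.out.print(format(new String[0])); // no anagrams: just -1
    }
}
```

Building the whole output in a StringBuilder and printing it once is a design choice that tends to be faster than many small println calls, which may matter for the runtime evaluation.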

Hashing: The main purpose of the assignment is to have you store the vocabulary appropriately and check for
anagrams efficiently. There may be many ways to store the vocabulary, but in this assignment you must hash each
valid word. You will have to implement your own hash function and your own collision resolution. You may use
chaining, or open addressing with any probe sequence, as you see fit. The goal is that your anagram computation
should be as efficient as possible. You may use any function within Java's built-in String class, except hashCode() or any
other built-in hash functions.
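As one possible illustration of a hand-written hash function with chaining, the sketch below stores words under their sorted-letter signature, so that all words with the same letters land in the same chain node. Everything here is an assumption made for illustration: the class name SignatureTable, the polynomial hash with multiplier 31, the table size, and the choice of the sorted signature as the key. It is not the required or reference design, and it uses neither hashCode() nor any built-in hash container.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SignatureTable {
    // One chain node: a sorted-letter signature and every word that has it.
    private static final class Node {
        final String signature;
        final List<String> words = new ArrayList<>();
        Node next;
        Node(String signature, Node next) { this.signature = signature; this.next = next; }
    }

    private final Node[] buckets;

    SignatureTable(int capacity) { buckets = new Node[capacity]; }

    // Hand-written polynomial hash; the multiplier 31 is an arbitrary choice.
    static int hash(String key, int capacity) {
        long h = 0;
        for (int i = 0; i < key.length(); i++) {
            h = (h * 31 + key.charAt(i)) % capacity;
        }
        return (int) h;
    }

    static String signature(String word) {
        char[] c = word.toCharArray();
        Arrays.sort(c);
        return new String(c);
    }

    void add(String word) {
        String sig = signature(word);
        int idx = hash(sig, buckets.length);
        for (Node n = buckets[idx]; n != null; n = n.next) {
            if (n.signature.equals(sig)) { n.words.add(word); return; }
        }
        Node fresh = new Node(sig, buckets[idx]); // collision: prepend to the chain
        fresh.words.add(word);
        buckets[idx] = fresh;
    }

    // Returns every stored word that uses exactly the given letters.
    List<String> lookup(String letters) {
        String sig = signature(letters);
        for (Node n = buckets[hash(sig, buckets.length)]; n != null; n = n.next) {
            if (n.signature.equals(sig)) return n.words;
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        SignatureTable table = new SignatureTable(101);
        for (String w : new String[]{"bit", "bat", "tab"}) table.add(w);
        System.out.println(table.lookup("abt")); // [bat, tab]
        System.out.println(table.lookup("bb"));  // []
    }
}
```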

Tip: To achieve better time efficiency, not only will you have to implement a good hashing mechanism, you will also
have to create an optimized approach to search through the space of anagrams. This may take some trial and error,
so start early!
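To give a feel for the size of the search space, here is one hedged sketch; it is not a full solution and not necessarily the intended approach. Since an input has at most 12 characters, there are at most 2^12 = 4096 subsets of its letters, so every candidate first word can be enumerated as a bitmask over the sorted input. The names SplitEnumerator, subsetSignatures() and the minLength pruning parameter are hypothetical. A TreeSet is used only to remove duplicates; it is comparison-based, not hash-based.

```java
import java.util.Arrays;
import java.util.TreeSet;

final class SplitEnumerator {
    // Distinct sorted-letter signatures of all non-empty proper subsets of the
    // input's letters that contain at least minLength characters.
    static TreeSet<String> subsetSignatures(String input, int minLength) {
        char[] sorted = input.toCharArray();
        Arrays.sort(sorted);
        int n = sorted.length;
        TreeSet<String> result = new TreeSet<>();
        for (int mask = 1; mask < (1 << n) - 1; mask++) {
            if (Integer.bitCount(mask) < minLength) continue; // prune short candidates
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < n; i++) {
                // Picking positions in index order keeps the signature sorted.
                if ((mask & (1 << i)) != 0) sb.append(sorted[i]);
            }
            result.add(sb.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(subsetSignatures("bait", 1).size()); // 14
        System.out.println(subsetSignatures("bait", 3));        // [abi, abt, ait, bit]
    }
}
```

Each signature produced this way could be looked up in the vocabulary hash table, with the remaining letters handled the same way for the second and third words.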

For Humor: You must find a friend of yours (either in the class or otherwise) and output a funny anagram of
their name. Share this anagram with the TA at the time of the demo. There are no extra points for a more humorous anagram,
although there are points for completing this part of the task.

Code: Your code will be run using the following command:

javac Anagram.java

java Anagram vocabulary.txt input.txt

This implies that we can change the vocabulary.txt at the time of final evaluation. However, its size will be in the
range of the size of the vocabulary.txt we are providing with the assignment. Also, there will be no words in the
vocabulary of length 1 or 2. That is, all valid words will be at least three characters long.

What is being provided?
The folder contains the following files:

vocabulary.txt

input.txt (sample test case)

What to submit?
1. Submit your code in a .zip file named in the format <EntryNo>.zip. Make sure that when we run “unzip
yourfile.zip”, “writeup.txt” is produced in the working directory in addition to your code.

You will be penalized for any submissions that do not conform to this requirement.

2. The writeup.txt should have a line that lists names of all students you discussed/collaborated with (see
guidelines on collaboration vs. cheating on the course home page). If you never discussed the assignment
with anyone say None.

After this line, you are welcome to write something about your code, though this is not necessary.

Evaluation Criteria
The assignment is worth 6 points. Your code will be autograded at demo time against a series of tests. 2 points will
be awarded for correct output and 2 points for efficient code (based on how well it performs relative
to the rest of the students in the class). 2 points will be awarded for adequately answering demo questions, sharing
your friend’s anagram, and explaining your code, your choices, and any interesting approaches you tried.

What is allowed? What is not?

1. This is an individual assignment.

2. Your code must be your own. You are not to take guidance from any general purpose code or problem specific
code meant to solve these or related problems.

3. You are not allowed to use built-in (or anyone else’s) implementations of hash functions or hashing schemes.
A key aspect of the course is to have you learn how to implement hashing.

4. You are allowed to use built-in Java String functions. You are also allowed to use built-in sorting functions.

5. You should develop your algorithm using your own efforts. You should not Google search for direct solutions
to this assignment. However, you are welcome to Google search for generic Java-related syntax.

6. You must not discuss this assignment with anyone outside the class. Make sure you mention the names in
your write-up in case you discuss with anyone from within the class. Please read academic integrity
guidelines on the course home page and follow them carefully.

7. Your submitted code will be automatically evaluated against another set of benchmark problems. You will incur a
significant penalty if your output is not automatically parsable or does not follow the input guidelines.

8. We will run plagiarism detection software. Anyone found guilty will be awarded a suitable penalty as per IIT
rules.
