Assignment 2
Copyright Notice
This content is protected and may not be shared, uploaded, or distributed. Any unauthorized
distribution, reproduction or sharing of this material, including uploading to CourseHero,
Chegg, StuDocu or other websites, is strictly prohibited and may be subject to legal action.
Students are granted access to this material solely for their personal use and may not use it for
any other purpose without the express written consent of the instructor. Thank you for
respecting the intellectual property rights of the instructor.
Objective: The aim of this assignment is to implement components that can be used in the
final project. These include a web page dictionary, ranking and spell checking, built with hash
tables, heaps, search trees, etc. Please use Java with Eclipse or IntelliJ IDEA.
Rules: This is an individual assignment. You are allowed to discuss problems and ideas within
your group. However, please keep in mind that you are not allowed to share this assignment with
other students from your Section or other Sections.
Instructions:
1. Please combine all the CSV files your group received when completing Assignment 1. If your
file is in a different format, you can use this file instead.
2. Please complete one of the tasks listed below. Each student in a group must choose a different
task, ensuring no two students in the same group work on the same task. Submit your report
individually and specify which task you completed.
Exception: Section 1, Group 9
As this is the only group with six team members, two students are allowed to work on the same
task, for Tasks 2, 4 and 5 only. However, the two students must approach the task differently:
one must use self-balancing search trees (e.g., AVL trees, red-black trees), while the other must
use a hash table (e.g., alternative Task 2: use a hash table to store the vocabulary, where each
entry stores a word and its frequency of occurrence).
B. Use a hashing technique (e.g., cuckoo hashing) to store the vocabulary for efficient lookup.
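As an illustration of the hashing option, here is a minimal sketch of a cuckoo hash table that maps each word to its frequency. The class name CuckooVocabulary, the MAX_KICKS limit, and the choice of the two hash functions (String.hashCode() and an FNV-1a style hash over the characters) are assumptions made for this example, not requirements of the task.

```java
/** Illustrative cuckoo hash table: word -> frequency. Names and constants are hypothetical. */
class CuckooVocabulary {
    private static final int MAX_KICKS = 32; // evictions allowed before rehashing (assumed value)
    private String[][] keys;                 // keys[t][s]: word stored in table t, slot s
    private int[][] counts;                  // counts[t][s]: frequency of keys[t][s]
    private int capacity;                    // number of slots per table
    private int size;                        // number of distinct words

    CuckooVocabulary(int capacity) {
        this.capacity = Math.max(2, capacity);
        keys = new String[2][this.capacity];
        counts = new int[2][this.capacity];
    }

    // Table 0 uses String.hashCode(); table 1 uses an FNV-1a style hash of the characters.
    private int slot(int table, String word) {
        int h;
        if (table == 0) {
            h = word.hashCode();
        } else {
            h = 0x811c9dc5;
            for (int i = 0; i < word.length(); i++) {
                h ^= word.charAt(i);
                h *= 16777619;
            }
        }
        return Math.floorMod(h, capacity);
    }

    /** Returns the stored frequency, or 0 if the word is absent; probes at most two slots. */
    int frequency(String word) {
        for (int t = 0; t < 2; t++) {
            int s = slot(t, word);
            if (word.equals(keys[t][s])) return counts[t][s];
        }
        return 0;
    }

    /** Records one more occurrence of the word. */
    void add(String word) {
        for (int t = 0; t < 2; t++) {
            int s = slot(t, word);
            if (word.equals(keys[t][s])) { counts[t][s]++; return; }
        }
        insert(word, 1);
        size++;
    }

    // Places the entry, evicting any occupant to its slot in the other table.
    private void insert(String word, int count) {
        int t = 0;
        for (int kick = 0; kick < MAX_KICKS; kick++) {
            int s = slot(t, word);
            String evictedWord = keys[t][s];
            int evictedCount = counts[t][s];
            keys[t][s] = word;
            counts[t][s] = count;
            if (evictedWord == null) return; // slot was free, nothing left to place
            word = evictedWord;
            count = evictedCount;
            t = 1 - t;                       // the evicted entry moves to the other table
        }
        rehash();                            // too many evictions: grow, then retry
        insert(word, count);
    }

    // Doubles the capacity of both tables and reinserts every stored entry.
    private void rehash() {
        String[][] oldKeys = keys;
        int[][] oldCounts = counts;
        capacity *= 2;
        keys = new String[2][capacity];
        counts = new int[2][capacity];
        for (int t = 0; t < 2; t++)
            for (int s = 0; s < oldKeys[t].length; s++)
                if (oldKeys[t][s] != null) insert(oldKeys[t][s], oldCounts[t][s]);
    }

    int size() { return size; }

    public static void main(String[] args) {
        CuckooVocabulary vocabulary = new CuckooVocabulary(8);
        for (String w : "tree heap hash tree sort tree heap".split(" ")) vocabulary.add(w);
        System.out.println("tree -> " + vocabulary.frequency("tree"));
    }
}
```

A lookup inspects at most one slot in each table, which is what makes cuckoo hashing attractive for a read-heavy vocabulary. Note that this sketch would rehash indefinitely if three or more words collided under both hash functions at once; a production version would need a fallback for that case.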
Alternative Word Suggestions:
A. Implement an edit distance algorithm to suggest alternative words.
B. Use merge sort to sort the suggestions based on their edit distance.
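The two steps above can be sketched as follows: compute the Levenshtein edit distance between the query and each vocabulary word, keep the words within a threshold, and merge sort them by distance. The class name WordSuggester, the suggest method and its maxDistance parameter are hypothetical names chosen for this example.

```java
import java.util.Arrays;

/** Illustrative sketch: edit-distance suggestions ordered by merge sort. */
class WordSuggester {

    /** Levenshtein distance (insert, delete, substitute), computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    /** Stable top-down merge sort of words[lo, hi) by ascending dist; both arrays move together. */
    static void mergeSort(String[] words, int[] dist, int lo, int hi) {
        if (hi - lo < 2) return;
        int mid = (lo + hi) >>> 1;
        mergeSort(words, dist, lo, mid);
        mergeSort(words, dist, mid, hi);
        String[] w = new String[hi - lo];
        int[] d = new int[hi - lo];
        int i = lo, j = mid, k = 0;
        while (i < mid && j < hi) {
            if (dist[i] <= dist[j]) { w[k] = words[i]; d[k++] = dist[i++]; }
            else                    { w[k] = words[j]; d[k++] = dist[j++]; }
        }
        while (i < mid) { w[k] = words[i]; d[k++] = dist[i++]; }
        while (j < hi)  { w[k] = words[j]; d[k++] = dist[j++]; }
        System.arraycopy(w, 0, words, lo, w.length);
        System.arraycopy(d, 0, dist, lo, d.length);
    }

    /** Returns the vocabulary words within maxDistance of the query, closest first. */
    static String[] suggest(String query, String[] vocabulary, int maxDistance) {
        String[] words = new String[vocabulary.length];
        int[] dist = new int[vocabulary.length];
        int n = 0;
        for (String candidate : vocabulary) {
            int d = editDistance(query, candidate);
            if (d <= maxDistance) { words[n] = candidate; dist[n++] = d; }
        }
        mergeSort(words, dist, 0, n);
        return Arrays.copyOf(words, n);
    }

    public static void main(String[] args) {
        String[] vocabulary = {"cart", "dog", "cat", "bat", "cats"};
        System.out.println(Arrays.toString(suggest("cat", vocabulary, 1)));
    }
}
```

Because merge sort is stable, words with the same edit distance keep their vocabulary order. Comparing the query against every word is linear in the vocabulary size, which may be acceptable for a small dictionary but is a design choice of this sketch rather than a requirement.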
A. Each time a word is searched, update its count in the search tree.
B. Maintain a log of search queries and display top searches to the user.
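One possible shape for these two items is sketched below. It keeps the counts in java.util.TreeMap, which the Java documentation describes as a red-black tree, keeps the raw queries in a list, and extracts the top searches with a bounded min-heap. The class name SearchFrequencyTracker, its method names, and the lower-casing of queries are assumptions of this example; an assignment solution may be expected to implement the tree by hand.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

/** Illustrative sketch: search counts in a red-black tree plus a query log. */
class SearchFrequencyTracker {
    private final TreeMap<String, Integer> counts = new TreeMap<>(); // word -> times searched
    private final List<String> log = new ArrayList<>();              // queries in arrival order

    /** Appends the query to the log and increments its count in the tree. */
    void recordSearch(String query) {
        String key = query.trim().toLowerCase(Locale.ROOT);
        if (key.isEmpty()) return;
        log.add(key);
        counts.merge(key, 1, Integer::sum);
    }

    /** Returns how many times the query has been searched so far. */
    int count(String query) {
        return counts.getOrDefault(query.trim().toLowerCase(Locale.ROOT), 0);
    }

    List<String> searchLog() {
        return new ArrayList<>(log);
    }

    /** Returns up to k queries, most frequent first; ties are broken alphabetically. */
    List<String> topSearches(int k) {
        // Min-heap whose head is the weakest candidate: lowest count, then the later word.
        Comparator<Map.Entry<String, Integer>> weakestFirst = (a, b) -> {
            int c = Integer.compare(a.getValue(), b.getValue());
            return c != 0 ? c : b.getKey().compareTo(a.getKey());
        };
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(weakestFirst);
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            heap.add(entry);
            if (heap.size() > k) heap.poll(); // keep only the k strongest entries
        }
        LinkedList<String> result = new LinkedList<>();
        while (!heap.isEmpty()) result.addFirst(heap.poll().getKey());
        return result;
    }

    public static void main(String[] args) {
        SearchFrequencyTracker tracker = new SearchFrequencyTracker();
        for (String q : new String[]{"java", "heap", "java", "tree", "heap", "java", "avl"}) {
            tracker.recordSearch(q);
        }
        System.out.println("Top searches: " + tracker.topSearches(3));
    }
}
```

With n distinct queries, each update costs O(log n) in the tree, and reporting the top k costs O(n log k) because the heap never holds more than k + 1 entries.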
A. Parse the content of web pages to extract keywords. Use a self-balancing search tree (e.g.,
AVL tree, red-black tree) to store the frequency of each keyword on each page.
B. Use sorting algorithms (e.g., quick sort or heap sort) to sort keywords based on their
frequency within each page.
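A minimal sketch of this pipeline is shown below, assuming the page content is already available as plain text. It tokenizes on non-alphanumeric characters, stores per-page keyword counts in java.util.TreeMap (a red-black tree according to the Java documentation), and orders the keywords with an in-place heap sort. The class name PageKeywordIndex, the page identifiers, and the tokenization rule are hypothetical choices for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch: per-page keyword frequencies in red-black trees, ordered by heap sort. */
class PageKeywordIndex {
    // page identifier -> (keyword -> frequency on that page)
    private final Map<String, TreeMap<String, Integer>> pages = new TreeMap<>();

    /** Splits the text into lower-case alphanumeric tokens and counts each one. */
    void indexPage(String page, String text) {
        TreeMap<String, Integer> freq = pages.computeIfAbsent(page, p -> new TreeMap<>());
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) freq.merge(token, 1, Integer::sum);
        }
    }

    int frequency(String page, String keyword) {
        TreeMap<String, Integer> freq = pages.get(page);
        return freq == null ? 0 : freq.getOrDefault(keyword, 0);
    }

    /** Keywords of the page, most frequent first; ties are ordered alphabetically. */
    List<String> keywordsByFrequency(String page) {
        TreeMap<String, Integer> freq = pages.getOrDefault(page, new TreeMap<>());
        String[] words = freq.keySet().toArray(new String[0]);
        int[] counts = new int[words.length];
        for (int i = 0; i < words.length; i++) counts[i] = freq.get(words[i]);
        heapSortDescending(words, counts);
        return Arrays.asList(words);
    }

    // Entry i ranks below entry j if it is rarer, or equally frequent but alphabetically later.
    private static boolean less(String[] w, int[] c, int i, int j) {
        return c[i] != c[j] ? c[i] < c[j] : w[i].compareTo(w[j]) > 0;
    }

    private static void swap(String[] w, int[] c, int i, int j) {
        String tw = w[i]; w[i] = w[j]; w[j] = tw;
        int tc = c[i]; c[i] = c[j]; c[j] = tc;
    }

    // Restores the min-heap property for the subtree rooted at root within [0, end).
    private static void siftDown(String[] w, int[] c, int root, int end) {
        while (true) {
            int child = 2 * root + 1;
            if (child >= end) return;
            if (child + 1 < end && less(w, c, child + 1, child)) child++;
            if (!less(w, c, child, root)) return;
            swap(w, c, root, child);
            root = child;
        }
    }

    /** In-place heap sort: a min-heap pushes the smallest entries to the back of the arrays. */
    static void heapSortDescending(String[] w, int[] c) {
        int n = w.length;
        for (int i = n / 2 - 1; i >= 0; i--) siftDown(w, c, i, n);
        for (int end = n - 1; end > 0; end--) {
            swap(w, c, 0, end);      // move the current minimum behind the shrinking heap
            siftDown(w, c, 0, end);
        }
    }

    public static void main(String[] args) {
        PageKeywordIndex index = new PageKeywordIndex();
        index.indexPage("page1", "Java heap, java tree; Heap JAVA sort");
        System.out.println(index.keywordsByFrequency("page1"));
    }
}
```

Heap sort is not stable, so the comparison in less includes the keyword itself to keep the output deterministic when frequencies tie.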
Submission requirements:
1. You will earn a maximum of 100 points (accounts for 5.25%) for successfully completing
this assignment and submitting both your report and source code within the specified deadline.
2. You must submit:
I. A report (in PDF or Word format), in which you provide the following elements:
- Your task (e.g. Task 1 or Task 2 or Task 3 etc.).
- Explanations for the solution provided (explain how you solved your task).
- Outputs (screenshots) with comments and explanations. Each screenshot must be
numbered (e.g. Fig 1. Displaying the number of ….) and accompanied by an explanation
of what it shows.
II. All Java source code files and classes (in both *.java and *.txt format) needed to run
your programs. Your source code must be well-commented. Do not upload your
source code to Brightspace as a single zip file. Such submissions will not be
accepted.
4. This assignment is subject to a plagiarism check. The plagiarism check originality score
must not exceed 50%. No points will be awarded for assignments submitted via email,
Teams, or other platforms, for sending zip archives, or for failing to submit your code in
*.txt files.
5. Assignments submitted after the deadline will receive a penalty of 10% for the first 24 hrs,
and a further penalty for each additional 24 hrs, for up to three days. After three days, the mark will be zero.
6. Unlimited resubmissions are allowed, but only the last submission will be considered.
This means that if you resubmit after the deadline, a penalty will be applied,
even if you submitted an earlier version on time.
Academic Integrity.
Plagiarism is a serious offense and can result in severe consequences such as receiving a grade
of zero on the assignment/lab or even being asked to leave the program.
Copying or using someone else’s code is considered plagiarism. This includes using code from
online sources, previous labs/assignments, or other students. Even if you have modified the
code, our anti-plagiarism software can still flag it as plagiarism.
To avoid plagiarism, make sure to always use your own words and ideas when writing source
code. Additionally, always check with your instructor to make sure you understand what is allowed
and what is not in terms of using outside resources.
Remember that academic integrity is essential for your own personal and professional growth,
and it is your responsibility to uphold these principles. So please take it seriously and always
produce your own original work.