Proj 2
NOTE:
The instructor may ask the project group or individual members questions about their actual work and
contribution after submission. The questions may be very specific (e.g., which lines of code a member
wrote and what that code does). This may be done by email or in person after class, without
advance notice. Failing to answer the questions satisfactorily will result in a mark
deduction. Please ensure that all group members contribute roughly equally to the project,
especially the coding part, and fully understand the whole project.
Grouping:
You need to form groups of 4 to complete the project in this course. Sign your project group list on the
Wiki page in Blackboard.
One member of your group, the group leader/captain, is responsible for submitting the project.
Marks will be deducted if more than one member of your group submits a project, whether the same or a different version.
Project Description:
In this project you will create an FAQBot using ElasticSearch and embeddings.
Proj02.ipynb (or a zip file if you have any extra files): In this Jupyter notebook, use separate
cells (code/markdown) to clearly indicate and explain every step. Your notebook should include
Markdown text marking each step with the correct heading, Python code, comments/analysis, and
visualizations as stated in the following instructions. Note: You need to create an appropriate Markdown
heading for each section mentioned below. Code should include short comments describing each
statement. Adding a Markdown cell with explanatory text before specific actions is appreciated.
(Note: Restart the kernel and re-run all cells of the notebook before submission. Substantial marks will be
deducted for cell errors.)
Fall 2024
Files Required:
• https://www.douglascollege.ca/student-services/student-resources/covid19/international-faqs
• https://www.douglascollege.ca/future-students/apply-douglas/domestic-students/admissions-faq
• https://www.douglascollege.ca/student-services/student-resources/covid19/general-faqs
• https://www.douglascollege.ca/current-students/important-dates-information/grading-faq
• https://www.douglascollege.ca/current-students/enrolment-services/fees-related-information/fees-faq
• https://www.douglascollege.ca/current-students/advising-services/advising-services/advising-services-faq
• https://www.douglascollege.ca/student-services/student-support/counselling/counselling-faq
• https://www.douglascollege.ca/student-services/student-resources/covid19/academic-faqs
• (and more)
Part A. Planning
2. Planning
Create another Markdown cell. In this cell, discuss why a search engine is more suitable than chatbot
tools such as DialogFlow for creating an FAQBot.
Examine the dataset carefully. Create another Markdown cell. In this cell, describe your plan for
creating an FAQBot using ElasticSearch. Note that in an FAQBot, the user inputs a question to look for
the relevant answer.
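One way to picture such a plan is as two small steps: turn each scraped question/answer pair into a document to index, then turn the user's question into a search request. The sketch below only builds the request bodies as plain Python dictionaries, so it runs without a cluster. The index name `faq`, the field names `question`, `answer`, and `source_url`, both helper functions, and the sample data are assumptions made for illustration, not part of the assignment. The action layout follows the format accepted by `elasticsearch.helpers.bulk`, and the query is a standard `match` query.

```python
# Hypothetical helpers: the index name and field names are illustrative choices.

def build_bulk_actions(faqs, index_name="faq"):
    """Convert (question, answer, source_url) tuples into bulk-index actions."""
    actions = []
    for doc_id, (question, answer, source_url) in enumerate(faqs):
        actions.append({
            "_index": index_name,
            "_id": doc_id,
            "_source": {
                "question": question,
                "answer": answer,
                "source_url": source_url,
            },
        })
    return actions


def build_match_query(user_question, size=3):
    """Build a full-text query matching the user's input against stored questions."""
    return {
        "size": size,
        "query": {"match": {"question": {"query": user_question}}},
    }


# Placeholder data standing in for scraped FAQ pairs.
sample_faqs = [
    ("How do I pay my fees?", "Placeholder answer 1.", "https://example.com/fees-faq"),
    ("When are final grades posted?", "Placeholder answer 2.", "https://example.com/grading-faq"),
]

actions = build_bulk_actions(sample_faqs)
query = build_match_query("how can I pay tuition")
print(actions[0]["_source"]["question"])
print(query)
```

With a running cluster, the actions would typically be passed to `elasticsearch.helpers.bulk(client, actions)` and the query body to the client's `search` method; the exact keyword arguments depend on the client version.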
In the following parts, you need to write Python code, with appropriate comments or Markdown cells
to explain your work. Missing explanations will result in a mark deduction.
1. Discussion
Create a Markdown cell and discuss the problem with relying on keyword search when building an FAQBot
with ElasticSearch.
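A small example may help frame this discussion. The sketch below uses plain token overlap as a rough stand-in for keyword matching (ElasticSearch's default scoring is BM25 applied after text analysis, so this is a deliberate simplification). The stop-word list, the sample questions, and the function names are made up for illustration. It shows that a paraphrased question can share no content words with the stored FAQ question it ought to match.

```python
import re

# A tiny, hand-picked stop-word list (an assumption; real analyzers use longer lists).
STOP_WORDS = {"how", "do", "i", "my", "can", "what", "is", "the", "a", "to", "for", "are"}


def content_tokens(text):
    """Lowercase, split on non-letters, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t not in STOP_WORDS}


def keyword_overlap(query, candidate):
    """Jaccard overlap between the content words of two questions."""
    q, c = content_tokens(query), content_tokens(candidate)
    if not q or not c:
        return 0.0
    return len(q & c) / len(q | c)


stored = "How do I pay my tuition fees?"
same_words = "How can I pay my fees?"
paraphrase = "What payment methods are accepted for course costs?"

print(keyword_overlap(same_words, stored))   # shares "pay" and "fees"
print(keyword_overlap(paraphrase, stored))   # shares no content words
```

The second score is zero even though both questions ask about paying fees, which is the kind of gap the discussion can point to.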
2. Using embedding
i. Study the tutorial https://www.jpmorgan.com/technology/technology-blog/faq-bot
ii. Redo B(2), but use the pre-trained model en_core_web_lg to compute the embeddings of the
questions. Marks will be deducted if you use any other pre-trained model.
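As a rough sketch of the embedding step: in the notebook, each question's vector would come from spaCy, for example `nlp = spacy.load("en_core_web_lg")` followed by `nlp(text).vector`, which gives a 300-dimensional average of the token vectors. To stay runnable without the model installed, the code below uses short made-up vectors in place of real embeddings and ranks stored questions by cosine similarity. The function names, the field name `question_vector`, and the toy vectors are assumptions. The `script_score` body follows ElasticSearch's `cosineSimilarity` function for `dense_vector` fields.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(query_vector, question_vectors):
    """Return (question, score) pairs sorted from most to least similar."""
    scored = [(q, cosine_similarity(query_vector, vec)) for q, vec in question_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def build_vector_query(query_vector, size=3):
    """Request body that scores every document by cosine similarity to the query vector."""
    return {
        "size": size,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    # Adding 1.0 keeps scores non-negative, which ElasticSearch requires.
                    "source": "cosineSimilarity(params.query_vector, 'question_vector') + 1.0",
                    "params": {"query_vector": list(query_vector)},
                },
            }
        },
    }


# Toy 3-dimensional vectors standing in for 300-dimensional spaCy embeddings.
toy_vectors = {
    "How do I pay my fees?": [0.9, 0.1, 0.0],
    "When are grades posted?": [0.0, 0.2, 0.9],
}
ranking = rank_by_similarity([0.8, 0.2, 0.1], toy_vectors)
print(ranking[0][0])
```

For the `script_score` query to work, the hypothetical `question_vector` field would need a `dense_vector` mapping whose `dims` matches the embedding size (300 for en_core_web_lg).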
4. Conclusion
Create a Markdown cell. In this cell, compare the results in B and C and draw your conclusions.
Member Contribution
In addition to the proposal, each group needs to submit a peer evaluation matrix. Each cell should be
a number between 1 and 4 that reflects how one member rates another member's contribution.
The evaluation is open to all members of your group (i.e., everyone can see how others
graded them), so that each member knows how to improve their contribution to the project.
(Hint: You may refer to this link to see how to create a table in a Jupyter Markdown cell.)
Member 1
Member 2
Member 3
Member 4
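As one possible way to lay out the matrix, the sketch below generates the Markdown source for an empty evaluation table that can be pasted into a Markdown cell. The function name and the choice of one row per evaluator and one column per member being evaluated are assumptions; adjust the orientation to whatever the instructor expects.

```python
def peer_matrix_markdown(members):
    """Build Markdown table source with one row per evaluator and one column per member."""
    header = "| Evaluator | " + " | ".join(members) + " |"
    divider = "|" + " --- |" * (len(members) + 1)
    # Each row starts with the evaluator's name, followed by one blank cell per member.
    rows = ["| " + name + " |" + "   |" * len(members) for name in members]
    return "\n".join([header, divider] + rows)


table = peer_matrix_markdown(["Member 1", "Member 2", "Member 3", "Member 4"])
print(table)
```

Each blank cell would then be filled with a score between 1 and 4.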
Criteria                                                   Grading
Project submitted, named properly, with all files
included in their corresponding folders, to Blackboard.    1
Part   Detail
A      Planning for the analysis                           2
B.2    Search engine building                              4
B.3    Testing and evaluation                              4
B      Overall description or explanation                  2
C.1    Discussion                                          1
C.2    Using embedding                                     2