Information Retrieval - Lecture 1


Information Retrieval & Search Engines

Instructor: Prof. Shereen Taie

BIS216E

Course: Information Retrieval & Search Engines


Course References

• Textbook:
Essential Books:
– Adam Clarke, SEO 2018: Learn Search Engine Optimization with Smart Internet Marketing Strategies, Simple Effectiveness Publishing, 2018.
Recommended Books:
– Bruce Clay and Kristopher B. Jones, Search Engine Optimization All-in-One For Dummies, 4th Edition, For Dummies (Business & Personal Finance), 2022.


Assessment of Participants
Assessment will be based on the following deliverables:
• Week 7:
• Mid-Term Exam (15 grades)
• Assignments (15 grades)
• Week 12:
• Evaluation (20 grades), which includes practical assignments and quizzes
• Participation (10 grades)
• End-of-Term Exam (40 grades)

For success:
Achieve at least 50% of the total score and at least 12 out of 40 on the final exam.


Group project:

• The aim of this project is to help students develop a simple search engine.
• Groups of 3 to 5 students.
• The features will be described as separate tasks in the lab.
• Present your project in week 12.
• Presentations should be no longer than 15 minutes.


Introduction to Information Retrieval
Introducing Information Retrieval & Search Engines


Information Retrieval
• Information Retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored on computers).

– These days we frequently think first of web search, but there are many other cases:
• E-mail search
• Searching your laptop
• Corporate knowledge bases
• Legal information retrieval

The problem of IR
• Goal = find documents relevant to an information need from a large document set
[Diagram: Info. need → Query → IR system (Retrieval) ← Document collection; IR system → Answer list]
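
The flow from query to answer list can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `tokenize` and `retrieve`, the three sample documents, and the scoring rule (count how often query terms occur in each document) are assumptions made for this example, not a method stated in the lecture.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def retrieve(query, collection):
    """Return (doc_id, score) pairs, best first.

    The score is the number of query-term occurrences in the document
    (an assumed, deliberately simple scoring rule).
    """
    query_terms = set(tokenize(query))
    answers = []
    for doc_id, text in collection.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            answers.append((doc_id, score))
    # Highest score first; ties broken by document id for a stable order.
    return sorted(answers, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical document collection.
collection = {
    "d1": "How to trap mice alive without harming them",
    "d2": "Bank scandals reported in western Massachusetts",
    "d3": "Humane mice traps: catch and release",
}

answer_list = retrieve("trap mice alive", collection)
print(answer_list)  # [('d1', 3), ('d3', 1)]
```

Note that d3 matches only on "mice", because "traps" and "trap" are different strings to this sketch; real systems typically normalize such word variants.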

Example: Google (Web)

What is a Document?
• Examples:
– web pages, email, books, news stories, scholarly papers, text messages, Word, PowerPoint, PDF, forum postings, patents, IM sessions, etc.
• Common properties
– Significant text content
– Some structure (e.g., title, author, date for papers;
subject, sender, destination for email)


Documents vs. Database Records
• Database records (or tuples in relational databases) are typically
made up of well-defined fields (or attributes)
– e.g., bank records with account numbers,
balances, names, addresses, social security
numbers, dates of birth, etc.
• Easy to compare fields with well-defined semantics to queries in
order to find matches
• Text is more difficult


Documents vs. Records
• Example bank database query
– Find records with balance > $50,000 in branches
located in Amherst, MA.
– Matches easily found by comparison with field values
of records
• Example search engine query
– bank scandals in western mass
– This text must be compared to the text of entire news
stories
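
The contrast between the two query types can be made concrete with a small Python sketch. The records, field names (`balance`, `branch_city`, `branch_state`), news stories, and both helper functions are hypothetical and exist only for illustration.

```python
# Hypothetical bank records with well-defined fields.
records = [
    {"account": "A-1", "balance": 72000, "branch_city": "Amherst", "branch_state": "MA"},
    {"account": "A-2", "balance": 12000, "branch_city": "Amherst", "branch_state": "MA"},
    {"account": "A-3", "balance": 90000, "branch_city": "Boston", "branch_state": "MA"},
]

def structured_query(records):
    """Balance > $50,000 in branches located in Amherst, MA: plain field comparisons."""
    return [
        r["account"]
        for r in records
        if r["balance"] > 50000
        and r["branch_city"] == "Amherst"
        and r["branch_state"] == "MA"
    ]

# Hypothetical news stories (unstructured text).
news = {
    "n1": "A bank scandal has shaken western Massachusetts this week.",
    "n2": "Western Mass. farmers expect a strong harvest.",
}

def naive_text_match(query, stories):
    """Return ids of stories that contain every query word as an exact word."""
    terms = query.lower().split()
    return [
        story_id
        for story_id, text in stories.items()
        if all(term in text.lower().split() for term in terms)
    ]

print(structured_query(records))                                # ['A-1']
print(naive_text_match("bank scandals in western mass", news))  # []
```

The field comparison finds its match directly. The exact-word text match returns nothing, even though story n1 is clearly relevant: "scandals" does not equal "scandal", "mass" does not equal "massachusetts", and the story never contains the word "in". This is one way to see why text is more difficult than records.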


Unstructured (text) vs. structured (database) data in the mid-nineties

Unstructured (text) vs. structured (database) data today

Sec. 1.1

Basic assumptions of Information Retrieval
• Collection: A set of documents
– Assume it is a static collection for the moment

• Goal: Retrieve documents with information that is relevant to the user’s information need and helps the user complete a task

The classic search model
• User task: Get rid of mice in a politically correct way
– Misconception?
• Info need: Info about removing mice without killing them
– Misformulation?
• Query: how trap mice alive
• Search engine: runs the query against the collection
• Results: may lead to query refinement, which feeds back into the query
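
The query refinement loop in this model can be sketched as follows. This is a rough illustration under stated assumptions: the `search` function uses exact word matching, the two documents are invented, and the list of reformulated queries stands in for what a user might try next. None of this is prescribed by the lecture.

```python
def search(query, collection):
    """Return ids of documents containing every query word (exact word match)."""
    terms = query.lower().split()
    return sorted(
        doc_id
        for doc_id, text in collection.items()
        if all(term in text.lower().split() for term in terms)
    )

def search_with_refinement(queries, collection):
    """Try each query formulation in turn; stop at the first one that returns results."""
    for attempt, query in enumerate(queries, start=1):
        results = search(query, collection)
        if results:
            return query, results, attempt
    return None, [], len(queries)

# Hypothetical collection.
collection = {
    "d1": "humane traps let you catch mice alive and release them outdoors",
    "d2": "poison baits for rodent control",
}

# The user's first query, followed by assumed refinements.
formulations = ["how trap mice alive", "trap mice alive", "mice alive"]

print(search_with_refinement(formulations, collection))
# ('mice alive', ['d1'], 3)
```

The first two formulations fail because d1 contains neither "how" nor "trap" (only "traps"). The third succeeds, which mirrors how a misformulated query can hide a relevant document until the user refines it.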
