Multimedia Information Retrieval (CSC 545) : The Problem of IR
Textual Retrieval
By Dr. Nursuriati Jamil
The problem of IR
Goal = find documents relevant to an information need from a large document set
[Figure: an IR system matches a user's information need against a document collection and retrieves the relevant documents.]
Problem
Given N documents (D0, ..., DN-1) and a user query Q, return a ranked list of k documents Dj (0 <= j < N) that match the query sufficiently well; the ranking is with respect to the relevance of each document to the query.
- Feature extraction (words, phrases, n-grams, stemming, stop words, thesaurus, multimedia)
- Retrieval model (Boolean retrieval, vector space retrieval, LSI, signatures, probabilistic retrieval)
- Index structures (inverted list, signature files, relational database, multidimensional index structures)
- Freshness of data (real-time, update every day / week / month)
- Query transformation (AND/OR, expansion, stemming, thesaurus)
- Ranking of retrieved documents (RSV, link structure, phrases)
Forward index (built OFFLINE), docID = HBJ3N129:
  hukum -> word10, word25
  denda -> word2, word35, word100, word123
  kena -> word67, ...

Inverted file (queried ONLINE):
  dera -> HBJ3N129, HBM4N111
  budak -> HBJ2N19, HBJ3N129
  Malaysia -> HBJ3N129
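To make the two structures concrete, here is a minimal Python sketch of building both a forward index and an inverted file; the document IDs and terms are taken from the example above, while the whitespace tokenisation and the dictionary layout are simplifying assumptions:

```python
from collections import defaultdict

# Toy collection; the docIDs follow the example above.
docs = {
    "HBJ3N129": "hukum denda kena budak dera",
    "HBM4N111": "dera",
    "HBJ2N19":  "budak",
}

# Forward index (built OFFLINE): docID -> term -> word positions.
forward = {
    doc_id: {term: [i for i, w in enumerate(text.split()) if w == term]
             for term in set(text.split())}
    for doc_id, text in docs.items()
}

# Inverted file (queried ONLINE): term -> list of docIDs.
inverted = defaultdict(list)
for doc_id, text in docs.items():
    for term in sorted(set(text.split())):
        inverted[term].append(doc_id)

print(inverted["dera"])  # ['HBJ3N129', 'HBM4N111']
```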
A text retrieval system represents documents as sets of terms (e.g., words). Thereby, the originally structured document becomes an unstructured set of terms, potentially annotated with attributes to denote frequency and position in the text. The transformation comprises several steps:
1. Elimination of structure (i.e., formats)
2. Elimination of frequent/infrequent terms (i.e., stop words)
3. Mapping text to terms (without punctuation)
4. Reduction of terms to their stems (stemming, syllable division)
5. Mapping to index terms
(The order of the steps may vary; often a step is broken into several sub-steps, or several steps are combined into a single pass.)
[Figure: transformation pipeline from document to index: structure elimination, removal of frequent/infrequent terms, stemming, mapping of stems to the index.]
HTML contains special markup, so-called tags, which describe meta-information about the document and the layout/presentation of its content. An HTML document is split into two parts, a header section and a body section:
Header: contains meta-information about the document, including descriptions of all embedded elements such as images.
Body: encompasses the document content enriched with markup for layout. The structure of the document is not always obvious.
Meta data: HTML provides several possibilities to define meta-information (the <meta> tag). The most frequent ones are:
- URL of the page: http://www-dbs.ethz.ch/~mmir/
- Title of the document: <title>ETH Zurich - Homepage</title>
- Meta information in the header section: <meta name="keywords" content="ETHZ,ETH,swiss,...">, <meta name="description" content="This page is about...">
Raw text: the raw text subsumes all text pieces with tags stripped from the original <body> section. A few tags are useful to derive additional information on the importance of a text piece.
Special characters: meta data and text data may contain special characters (HTML entities) that have to be translated, e.g., &nbsp; -> space, &uuml; -> ü; transformation to Unicode, ASCII, or another character set.
Step 1: Eliminate structure
Question: what does the link text describe? The document itself or the embedded/referenced object?
Usually, the link text is associated with both the embedding document and the linked document. In most cases, the link text is a good summary of the linked document. In a few cases, the link text is meaningless ("click here").
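As an illustration of structure elimination, a minimal sketch using Python's standard html.parser; it only extracts the title and the raw body text, ignoring meta tags and link texts:

```python
from html.parser import HTMLParser

class StructureEliminator(HTMLParser):
    """Strip tags: collect the title and the raw text of the body."""
    def __init__(self):
        super().__init__()
        self.in_title = self.in_body = False
        self.title, self.text = "", []

    def handle_starttag(self, tag, attrs):
        if tag == "title": self.in_title = True
        if tag == "body":  self.in_body = True

    def handle_endtag(self, tag):
        if tag == "title": self.in_title = False
        if tag == "body":  self.in_body = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data
        elif self.in_body and data.strip():
            self.text.append(data.strip())

p = StructureEliminator()
p.feed("<html><head><title>ETH Zurich - Homepage</title></head>"
       "<body><h1>Welcome</h1>Some <b>bold</b> text.</body></html>")
print(p.title)           # ETH Zurich - Homepage
print(" ".join(p.text))  # Welcome Some bold text.
```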
Indexing is used to determine useful answers for user queries. Thus, it is not required to index frequent terms with little or no semantics (e.g., the, a, it) or terms that appear seldom. Theoretical solution: restrict indexing to terms that have proven to be useful or that appear interesting from past, practical experience with the system. However, this requires a feedback mechanism with the user to understand term importance. How to select important terms: Zipf's law states that, given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table.
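In formula form (a sketch; c is a collection-dependent constant and r is the rank of a term in the frequency table):

```latex
\mathrm{freq}(t_r) \approx \frac{c}{r}
\qquad\Longleftrightarrow\qquad
r \cdot \mathrm{freq}(t_r) \approx c
```

Terms above an upper cut-off rank (too frequent) and below a lower cut-off rank (too rare) are then excluded from the index.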
Insignificant terms
Stop words are terms with little or no semantic meaning and are thus often not indexed. Examples: English: the, a, is; Bahasa Melayu: ada, iaitu, mana, bersabda, wahai. Often, the rank of these terms lies on the left side of the upper cut-off line. Generally, stop words are responsible for 20% to 30% of the term occurrences in a text, so eliminating them considerably reduces the memory consumption of the index. Similarly, the most frequent terms in a collection of documents carry little information (their rank lies on the left side of the upper cut-off line).
Analogously, one can strip off words that are seldom used, assuming that users will not use them in their queries (their rank lies on the right side of the lower cut-off). However, the additional memory savings are rather small.
Example: the term "computer" is meaningless for indexing articles about computer science, as it occurs in almost every document of such a collection. The same term, however, is important to distinguish general articles, such as those about careers in computer science.
Step 2: Remove stop words
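A minimal sketch of stop-word elimination in Python; the stop-word list reuses the examples above, and the lowercase normalisation is an assumption:

```python
STOPWORDS = {
    "the", "a", "is",                             # English
    "ada", "iaitu", "mana", "bersabda", "wahai",  # Bahasa Melayu
}

def remove_stopwords(tokens):
    """Keep only tokens that are not stop words."""
    return [t for t in tokens if t.lower() not in STOPWORDS]

print(remove_stopwords("the budak is kena denda".split()))
# -> ['budak', 'kena', 'denda']
```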
To select appropriate features for documents, one typically uses linguistic or statistical approaches to define the features based on words, fragments of words, or phrases. Most search engines use words or phrases as features. Some engines use stemming, some differentiate between upper and lower case, and some support error correction. An interesting option is the use of fragments of words, so-called n-grams. Although not directly related to the semantics of the text, they are very useful to support fuzzy retrieval. Example (trigrams):
- street -> str, tre, ree, eet
- streets -> str, tre, ree, eet, ets
- strets (misspelled) -> str, tre, ret, ets
Benefits:
- Simple misspellings or bad recognition often lead to bad retrieval results; fragments significantly improve retrieval quality.
- Stemming and syllable division are no longer necessary.
- No language-specific processing is necessary; every language is processed equally.
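The trigram decomposition above can be reproduced in a few lines of Python (n = 3 as in the example):

```python
def ngrams(word, n=3):
    """Return the character n-grams of a word."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]

print(ngrams("street"))   # ['str', 'tre', 'ree', 'eet']
print(ngrams("streets"))  # ['str', 'tre', 'ree', 'eet', 'ets']
print(ngrams("strets"))   # ['str', 'tre', 'ret', 'ets']
```

Note how the misspelled "strets" still shares two of its four trigrams with "street", which is what makes n-grams robust against misspellings.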
Retrieval algorithms often use the number of term occurrences and the positions of terms within the document to identify and rank results.
- Term frequency ("feature frequency"): tf(Ti, Dj) is the number of occurrences of feature Ti in document Dj. Term frequency is important to rank documents.
- Term locations ("feature locations"): loc(Ti, Dj) -> P(N) (a set of positions). Term locations frequently influence the ranking, and whether a document appears in the result at all, e.g.:
  - Condition: Q = "shah NEAR alam" (explicit phrase matching): looking for documents with the terms shah and alam close to each other.
  - Ranking: Q = "shah alam" (implicit phrase matching): documents with the term shah next to alam should be at the top of the results.
tf = term frequency
The higher the tf, the higher the importance (weight) for the doc.
df = document frequency
The more evenly a term is distributed across documents, the less specific it is to any one document.
Example
Term   #docs   Postings (Dj, tfj)
Haji   3       (D7, 4), (D26, 10), (D40, 5)
Iman   ...     (D21, 2), ...

The term Haji occurs in three documents: 4 times in document 7, 10 times in document 26, and 5 times in document 40.
Common term-weighting variants:
- tf(t, D) = freq(t, D)
- tf(t, D) = log[freq(t, D)]
- tf(t, D) = log[freq(t, D)] + 1
- tf(t, D) = freq(t, D) / max over t' of freq(t', D)
- idf(t) = log(N/n), where n = #docs containing t and N = #docs in corpus
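A sketch of these weighting variants in Python; the function names are mine, the slides only give the formulas:

```python
import math

def idf(N, n):
    """idf(t) = log(N/n): N docs in the corpus, n docs containing t."""
    return math.log(N / n)

def tf_raw(freq):           return freq
def tf_log(freq):           return math.log(freq)
def tf_log_plus_one(freq):  return math.log(freq) + 1
def tf_max_normalized(freq, max_freq):
    return freq / max_freq

# Term "Haji" from the example: assume a corpus of N = 100 documents,
# n = 3 of which contain the term; in D7 it occurs 4 times.
print(round(tf_raw(4) * idf(100, 3), 3))  # 14.026
```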
[Table residue: an inverted list storing, per term, the positions (Pos), the number of documents (#Doc), and postings (Dj, tfj) such as (10, 1), (2, 3), (21, 2), (6, 5), (31, 2).]

Step 3: Text to term
Step 4: Stemming
How does word stemming work? Stemming broadens the result set to include both word roots and word derivations. It is commonly accepted that the removal of word endings (sometimes called suffix stripping) is a good idea; the removal of prefixes can be useful in some subject domains. Why do we need word stemming in the context of free-text searching? Free-text searching searches for exactly what is typed into the search box, without mapping it to a thesaurus term. Morphological variants of words have similar semantic interpretations, and a smaller dictionary results in savings of storage space and processing time.
Algorithms for Word Stemming
A stemming algorithm is an algorithm that converts a word to a related form. One of the simplest such transformations is the conversion of plurals to singulars. Common approaches: affix removal, successor variety, table lookup, n-grams. In most languages, words have various inflected (or sometimes derived) forms. The different forms should not carry different meanings and should be mapped to a single form. However, in many languages it is not simple to derive the linguistic stem without a dictionary. At least for English, there exist algorithms that produce good results without the need of a dictionary (Porter algorithm).
Pros & Cons
Stemmers are used to conflate terms, to improve retrieval effectiveness and/or to reduce the size of indexing files. Stemming increases recall at the cost of decreased precision. Over-stemming and under-stemming also create problems when retrieving documents.
Porter's Algorithm
The Porter stemmer is a conflation stemmer developed by Martin Porter at the University of Cambridge in 1980. The Porter stemming algorithm (or "Porter stemmer") is a process for removing the more common morphological and inflectional endings from English words. It is the most effective and most widely used stemmer. Porter's algorithm is based on the measure m of a stem: the number of vowel sequences that are followed by a consonant sequence; for many rules, m must be greater than one for the rule to be applied. A word can have any one of the forms C..C, C..V, V..V, V..C; these can be represented as [C](VC){m}[V].
The rules in the Porter algorithm are separated into five distinct steps, numbered 1 to 5. They are applied to the words of the text, starting from step 1 and moving on to step 5.
- Step 1 deals with plurals and past participles; the subsequent steps are much more straightforward. Ex.: plastered -> plaster, motoring -> motor
- Step 2 deals with pattern matching on some common suffixes. Ex.: happy -> happi, relational -> relate, callousness -> callous
- Step 3 deals with special word endings. Ex.: triplicate -> triplic, hopeful -> hope
- Step 4 checks the stripped word against more suffixes in case the word is compounded. Ex.: revival -> reviv, allowance -> allow, inference -> infer
- Step 5 checks whether the stripped word ends in a vowel and fixes it appropriately. Ex.: probate -> probat, cease -> ceas, controll -> control
The algorithm is careful not to remove a suffix when the stem is too short, the length of the stem being given by its measure m. There is no linguistic basis for this approach.
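Ready-made implementations of the Porter algorithm exist for most languages; a sketch using NLTK (assumes the nltk package is installed). Note that the printed stems are the output of all five steps, not of a single step:

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
for word in ["plastered", "motoring", "hopeful", "allowance"]:
    print(word, "->", stemmer.stem(word))
# plastered -> plaster, motoring -> motor, hopeful -> hope, allowance -> allow
```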
Dictionary-based stemming
A dictionary significantly improves the quality of stemming (note: the Porter algorithm does not derive a linguistically correct stem). It determines the correct linguistic stem for all words, but at the price of additional lookup costs and maintenance costs for the dictionary. The EuroWordNet initiative tries to develop a semantic dictionary for the European languages. Next to words, the dictionary shall contain inflected forms and relations between words (see next section). However, the usage of these dictionaries is not free of charge (with the exception of WordNet for English). Names remain a problem of their own... Examples of such dictionaries / ontologies:
- EuroWordNet: http://www.illc.uva.nl/EuroWordNet/
- GermaNet: http://www.sfs.uni-tuebingen.de/lsd/
- WordNet: http://wordnet.princeton.edu/
We look at dictionary-based stemming with the example of Morphy, the stemmer of WordNet. Morphy combines two approaches to stemming: a rule-based approach for regular inflections, much like the Porter algorithm but much simpler, and an exception list with strong or irregular inflections of terms, as sketched below.
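A sketch of calling Morphy through NLTK's WordNet interface (assumes nltk and its WordNet corpus are installed):

```python
from nltk.corpus import wordnet as wn

# Regular inflections are handled by simple suffix rules...
print(wn.morphy("churches"))       # church
# ...while strong/irregular forms come from the exception lists.
print(wn.morphy("geese"))          # goose
print(wn.morphy("ran", wn.VERB))   # run
```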
Stemming process
[Figure: stemming process. Unstemmed words are first checked against a stopword list ("Is it a stopword?"); the remaining words are passed to a stemming algorithm (Porter's algorithm, Fatimah's algorithm, or the WordNet dictionary), which applies morphological rules (e.g., ber-..-an, me+, +lah) to produce stemmed words.]
Term extraction must further deal with homonyms (equal terms but different semantics) and synonyms (different terms but equal semantics). There are further relations between terms that may be useful to consider. In the following, a list of the most common relationships:
- Homonyms (equal terms but different semantics): bank (shore vs. financial institute)
- Synonyms (different terms but equal semantics): walk, go, pace, run, sprint
- Hypernyms (umbrella term) / hyponyms (species): animal -> dog, cat, bird, ...
- Holonyms (is part of) / meronyms (has parts): door -> lock
The relationships above define a network (often denoted as an ontology) with terms as nodes and relations as edges. An occurrence of a term may also be interpreted as an occurrence of the nearby terms in this network (whereby "nearby" has to be defined appropriately). Example: a document contains the term dog; we may also interpret this as an occurrence of the term animal (with a smaller weight).
Step 5: Mapping to index terms (cont.)
Some search engines do not implement steps 4 and 5. Google only recently improved its search capabilities with stemming. If the collection contains documents in different languages, cross-lingual approaches (automatically) translate or relate terms across languages and make documents retrievable even for queries in a language different from the document's.
Term extraction for queries:
- Similar to the term extraction of documents.
- If the term extraction of the query implements step 5, omit step 5 in the term extraction of the documents in the collection.
- Extend the query terms with nearby terms (see the sketch below). Expansion with synonyms: Q = house -> Qnew = house, home, domicile, ...
- If a specialized search returns too few answers, exchange keywords with their hypernyms: e.g., Q = mare (female horse) -> Qnew = horse
- If a general search term returns too many results, let the user choose (i.e., relevance feedback) a more specialized term to reduce the result list: e.g., Q = horse -> Qnew = mare, pony, chestnut, pacer
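A sketch of both expansion strategies with WordNet via NLTK; the policy of taking all lemmas of all noun synsets is my assumption, a real system would disambiguate first:

```python
from nltk.corpus import wordnet as wn

def expand_with_synonyms(term):
    """Q = house -> Qnew = house, home, domicile, ..."""
    return sorted({lemma.name()
                   for synset in wn.synsets(term, pos=wn.NOUN)
                   for lemma in synset.lemmas()})

def expand_with_hypernyms(term):
    """Q = mare -> Qnew = horse, ... (generalize a too-specific query)."""
    return sorted({lemma.name()
                   for synset in wn.synsets(term, pos=wn.NOUN)
                   for hyper in synset.hypernyms()
                   for lemma in hyper.lemmas()})

print(expand_with_synonyms("house")[:5])
print(expand_with_hypernyms("mare"))  # includes 'horse'
```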
What is WordNet?
- A large lexical database, or "electronic dictionary", developed and maintained at Princeton University: http://wordnet.princeton.edu
- Includes most English nouns, verbs, adjectives, adverbs
- The electronic format makes it amenable to automatic manipulation
- Used in many Natural Language Processing applications (information retrieval, text mining, question answering, machine translation, AI/reasoning, ...)
- Wordnets are built for many languages.
- Traditional paper dictionaries are organized alphabetically: words that are found together (on the same page) are not related by meaning.
- WordNet is organized by meaning: words in close proximity are semantically similar.
- Human users and computers can browse WordNet and find words that are meaningfully related to their queries (somewhat like in a hyperdimensional thesaurus).
- Meaning similarity can be measured and quantified to support Natural Language Understanding.
A simple picture
animal (animate, breathes, has heart, ...)
  |
bird (has feathers, flies, ...)
  |
canary (yellow, sings nicely, ...)
Creates relationships among more/less general concepts, and thus creates hierarchies. Hierarchies can have up to 16 levels:

{vehicle}
  |- {car, automobile}
  |    |- {convertible}
  |    |- {SUV}
  |- {bicycle, bike}
       |- {mountain bike}
A car is a kind of vehicle <=> The class of vehicles includes cars, bikes
Hyponymy
Transitivity:
A car is a kind of vehicle. An SUV is a kind of car.
=> An SUV is a kind of vehicle.
Meronymy/Holonymy
Inheritance:
A finger is part of a hand. A hand is part of an arm. An arm is part of a body.
=> A finger is part of a body.
[Figure: hypernym and meronym relations, e.g., {vehicle} as hypernym; {doorlock} and {armrest} as meronyms.]
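The same relations can be traversed programmatically; a sketch with NLTK's WordNet interface (the synset name car.n.01 is WordNet's own identifier):

```python
from nltk.corpus import wordnet as wn

car = wn.synset("car.n.01")

# Hypernyms: more general concepts.
print(car.hypernyms())        # [Synset('motor_vehicle.n.01')]

# The transitive closure walks the hierarchy up towards {vehicle}.
print([s.name() for s in car.closure(lambda s: s.hypernyms())][:4])

# Part meronyms: parts of a car in WordNet's inventory.
print([s.name() for s in car.part_meronyms()][:5])
```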
Homework
Select the 5 most frequent noun terms, and find homonyms, synonyms, hypernyms and holonyms of the terms. You may use WordNet at http://wordnet.princeton.edu/ (select "Use Wordnet Online"). Create the noun ontology.
IR models
Overview:
- Boolean Retrieval
- Fuzzy Retrieval
- Vector Space Retrieval
- Probabilistic Retrieval (BIR Model)
- Latent Semantic Indexing
Boolean search
Boolean model
Historically, documents were stored on tapes or punched cards, and searching allowed only sequential access. Today, Boolean search is still very frequent but is not state-of-the-art. Google uses it for its simplicity but further improves it by additionally sorting/ranking the result sets.
Model: a document D is represented by a binary vector d with di = 1 if term ti occurs in D. A query q comes from the query space Q; let t be an arbitrary term, and q1 and q2 be queries from Q; Q is given by queries of the types: t, q1 ∧ q2, q1 ∨ q2, ¬q1.
Term-document matrix
Query: Brutus AND Caesar AND NOT Calpurnia Take the vectors for Brutus, Caesar and Calpurnia, complement the last, and then do a bitwise AND: 110100 AND 110111 AND 101111 = 100100
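The bitwise evaluation can be reproduced directly in Python (the incidence vectors are the ones from the example; six bits, one per document):

```python
brutus    = 0b110100
caesar    = 0b110111
calpurnia = 0b010000          # its complement is 101111

mask = (1 << 6) - 1           # keep only the six document bits
result = brutus & caesar & (~calpurnia & mask)
print(format(result, "06b"))  # 100100
```

The two set bits identify the two documents that satisfy Brutus AND Caesar AND NOT Calpurnia.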
Boolean retrieval
Fuzzy retrieval
Vector-space model
Since the Boolean model's binary weights are too limiting, the vector space model supports partial matching: non-binary weights are assigned to the index terms in queries and documents. These term weights are used to compute the degree of similarity between the documents in the database and the user's query.
[Figure: a document d as a vector in a three-dimensional term space with axes term1 = solat, term2 = ibadah, term3 = malam.]
The tf metric is considered an indication of how well a term characterizes the content of a document. The idf, in turn, reflects the number of documents in the collection in which the term occurs, irrespective of the number of times it occurs in those documents.
Document-Term-Matrix
Example
[Table: document-term matrix of the example collection, with columns for terms such as arrived, gold, silver, truck.]
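A minimal sketch of vector space ranking over the three-document "gold/silver/truck" collection that the terms above suggest as the classic example; the weighting choices (raw tf times idf, cosine similarity) are assumptions, not the only options:

```python
import math
from collections import Counter

docs = {
    "D1": "shipment of gold damaged in a fire".split(),
    "D2": "delivery of silver arrived in a silver truck".split(),
    "D3": "shipment of gold arrived in a truck".split(),
}
query = "gold silver truck".split()

N = len(docs)
df = Counter(t for d in docs.values() for t in set(d))
idf = {t: math.log10(N / n) for t, n in df.items()}

def weights(tokens):
    """tf-idf weight vector (raw tf times idf)."""
    tf = Counter(tokens)
    return {t: tf[t] * idf.get(t, 0.0) for t in tf}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

q = weights(query)
for doc_id, tokens in sorted(docs.items()):
    print(doc_id, round(cosine(q, weights(tokens)), 3))
# D2 ranks highest: 'silver' is rare in the collection and occurs twice in D2.
```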
Class exercises
Using the 10 most frequent terms of your story, create the term-document matrix for the Boolean model and for the vector model.
Remarks
There are many more methods to determine the vector representations and to compute retrieval status values.

Advantages:
- Simple model with efficient evaluation algorithms
- Partial-match queries are possible, i.e., the model returns documents that contain only some of the query terms (similar to the OR operator of Boolean retrieval)
- Very good retrieval quality, though not state-of-the-art
- Relevance feedback may further improve vector space retrieval

Disadvantages:
- Main assumption of vector space retrieval: terms occur independently of each other in documents. This is not true: if one writes about Mercedes, the term "car" is likely to co-occur in the document.
- Many heuristics and simplifications; no proof of the "correctness" of the result set
- HTML/Web: the occurrence of terms is not the most important criterion to rank documents (spamming)