Information Retrieval MCQs - SET 05
1. Which of the following is a disadvantage of the click relevance feedback method?
a) Easy availability
b) Less noisy
c) Very noisy
d) Very expensive to obtain
Ans : (c) Very noisy
Click relevance feedback (implicit feedback) is considered noisy for one or more of the following reasons:
A user clicks on a result based only on the snippet shown, without having read the document itself.
A user may click on the top k results simply because he trusts that the top results will satisfy his information need.
A user may get side-tracked and click a result simply because it looks interesting, even though it is not relevant to the current information need.
A user may skip possibly relevant results simply because their sources are not as authoritative as the ones he already knows.
  
 
2. Suppose that the number of documents in a corpus is M, the average length of a document (in number of words) is N, the size of the vocabulary (number of unique keywords in the corpus) is V, the average length of a query is Q, and the average number of documents in which a query word appears is D. What is the time complexity of query processing with an inverted index in vector space information retrieval?
a) V * M
b) V * D
c) Q * D
d) Q * N
Ans : (c) Q * D
An inverted index gives fast access to the documents that contain a given term. For each of the Q query keywords, we only need to look at the documents that contain that keyword, which is D documents on average. Hence, the time complexity is O(Q * D).
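The Q * D behaviour can be illustrated with a minimal sketch. The function names, the toy corpus, and the raw term-frequency scoring are assumptions made for illustration only; a real vector space system would use weighted scores and length normalization. The counter simply records how many postings are visited.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to {doc_id: term_frequency} (hypothetical helper)."""
    index = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index

def score_query(index, query):
    """Accumulate a raw term-frequency score per document.

    Returns (scores, postings_visited).
    """
    scores = defaultdict(int)
    visited = 0
    for term in query.lower().split():                  # Q query terms
        for doc_id, tf in index.get(term, {}).items():  # about D postings per term
            scores[doc_id] += tf
            visited += 1
    return dict(scores), visited

# Toy corpus, assumed for illustration.
docs = [
    "information retrieval with inverted index",
    "vector space retrieval model",
    "stemming and query expansion",
    "cooking recipes for pasta",
]
index = build_inverted_index(docs)
scores, visited = score_query(index, "retrieval index")
print(scores, visited)
```

Only the postings of the query terms are touched. Documents that contain none of the query terms, such as the last one here, are never examined, which is why the cost depends on Q and D rather than on M or V.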
  
 
3. Which of the following is true about stemming?
a) It increases the recall and reduces the precision
b) It increases the precision and reduces the recall
c) Recall and precision are equal if you use stemming
d) None of the above
Ans : (a) It increases the recall and reduces the precision
Stemming is a rule-based process of reducing inflected words to their root word/stem. It provides a way of finding morphological variants of search terms. If we apply stemming to a word in a user query, it may group words with different meanings together. For example, the words ‘clip’ (a metal holder) and ‘clipping’ (a small piece trimmed from something) will both be stemmed to ‘clip’. Hence, the query matches more documents than expected: documents that use ‘clip’ as a tool and documents that use ‘clipping’ as a technique are both included in the result. This tends to increase the recall and reduce the precision.
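The ‘clip’ / ‘clipping’ effect can be reproduced with a small sketch. The `toy_stem` function is a hypothetical suffix stripper written only for this example (it is not the Porter stemmer or any library API), and the three documents and their relevance labels are assumed.

```python
def toy_stem(word):
    """Hypothetical suffix stripper for illustration; not a real stemmer."""
    word = word.lower()
    if word.endswith("ing") and len(word) > 5:
        word = word[:-3]
        # Undo consonant doubling: "clipp" -> "clip".
        if len(word) >= 2 and word[-1] == word[-2]:
            word = word[:-1]
    elif word.endswith("s") and len(word) > 3:
        word = word[:-1]
    return word

def search(docs, query, stem=False):
    """Return ids of documents containing the (optionally stemmed) query term."""
    norm = toy_stem if stem else str.lower
    q = norm(query)
    return [i for i, d in enumerate(docs) if q in {norm(w) for w in d.split()}]

def precision_recall(retrieved, relevant):
    hits = len(set(retrieved) & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant)
    return precision, recall

docs = [
    "paper clip for documents",    # relevant to the query "clip" (the holder)
    "metal clips hold pages",      # relevant, but matches only after stemming
    "newspaper clipping archive",  # not relevant, but also stems to "clip"
]
relevant = {0, 1}  # assumed relevance judgments

exact = search(docs, "clip")
stemmed = search(docs, "clip", stem=True)
print(exact, precision_recall(exact, relevant))
print(stemmed, precision_recall(stemmed, relevant))
```

Without stemming, only the first document is retrieved (precision 1.0, recall 0.5). With stemming, all three are retrieved, so recall rises to 1.0 while precision falls to 2/3.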
  
 
4. The TF-IDF weight of a term t will be ______ when t occurs many times within a small number of documents.
a) Lowest
b) Highest
c) Cannot determine
d) Lower
 
Ans : (b) Highest
The TF-IDF of a term t is calculated by multiplying the Term Frequency and the Inverse Document Frequency as follows:
tf(t,d) * idf(t,D) = (frequency of term t in document d) * log((total number of documents in the corpus) / (number of documents containing the term t))
If the term t occurs many times in a document, its tf is high. If t appears in only a small number of documents of a large corpus, its idf is also high. Hence, the tf-idf weight is highest.
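As a rough check of the formula above, the following sketch computes the weight with raw counts for tf and the natural logarithm for idf. Both choices, the function name `tf_idf`, and the toy corpus are assumptions; many systems use smoothed or normalized variants.

```python
import math

def tf_idf(term, doc, corpus):
    """Raw-count tf times log(N / df), following the formula above."""
    tf = doc.count(term)
    df = sum(1 for d in corpus if term in d)
    if df == 0:
        return 0.0
    return tf * math.log(len(corpus) / df)

# Toy corpus of tokenized documents, assumed for illustration.
corpus = [
    ["stemming", "stemming", "stemming", "improves", "query"],
    ["query", "index"],
    ["recall", "query"],
    ["precision"],
]
doc = corpus[0]

w_stemming = tf_idf("stemming", doc, corpus)  # tf = 3, df = 1: frequent and rare
w_query = tf_idf("query", doc, corpus)        # tf = 1, df = 3: infrequent and common
print(w_stemming, w_query)
```

The term that occurs many times in one document and in few documents overall gets by far the larger weight, since both factors of the product are large.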
  
 
5. The TF-IDF weight of a term t will be ______ when t occurs in virtually all documents.
a) Lowest
b) Highest
c) Cannot determine
d) Lower
Ans : (a) Lowest
If the term is very common and appears in virtually all documents, the ratio (total number of documents) / (number of documents containing the term t) approaches 1, so the idf, and with it the tf-idf weight, approaches 0. Please refer to the answer of question 4.
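The drop towards 0 can be seen by sweeping the document frequency for a fixed corpus size. This is a minimal sketch that assumes the unsmoothed idf = log(N / df) with the natural logarithm; the corpus size of 1000 and the name `idf` are chosen only for illustration.

```python
import math

def idf(num_docs, doc_freq):
    """Unsmoothed inverse document frequency, log(N / df)."""
    return math.log(num_docs / doc_freq)

num_docs = 1000  # assumed corpus size
values = [(df, idf(num_docs, df)) for df in (1, 10, 100, 999, 1000)]
for df, value in values:
    print(df, round(value, 4))
```

The idf shrinks steadily as the term appears in more documents and is exactly 0 when it appears in all of them, so the tf-idf weight is 0 regardless of how large tf is.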
  
 