CS6200: Information Retrieval

Homework 5

Assigned: Wednesday, 3 December 2014
Due: Thursday, 11 December 2014, 11:59 p.m. (Note: Since grades must be in on Dec. 15, at most three slip days may be used.)

Instructions

If you collaborated with others, you must write down with whom you worked on the assignment. If this changes from problem to problem, then you should write down this information separately with each problem.
Submit the requested written answers, code, and instructions via subversion.

Problems

[25 points] Query transformation
1. Query transformation (or query rewriting) modifies the original query so that it better matches the query intent. For each of the following examples, name the transformation and give a brief description what it does:
  1. hybrid car $\rightarrow$ hybrid car cars
  2. xbox kinex $\rightarrow$ xbox kinect
  3. middle east turmoil $\rightarrow$ middle east turmoil syria iraq
  4. new york times subscription $\rightarrow$ "new york times" subscription
  5. tapas $\rightarrow$ tapas Cambridge Massachusetts
2. Explain how term association measures based on a document corpus could be used for query expansion. Why would the results be better if term associations were calculated using a query log?
3. A query log contains a record of all the queries that were submitted to a search engine and the documents that were clicked on in the result lists for those queries. Describe how a query log could be used to provide query suggestions for a search results page.
[25 points] Language modeling
1. What is the purpose of smoothing in the query likelihood retrieval model?
2. The likelihood of a query $Q$ with Jelinek-Mercer smoothing is: \[ p(Q \mid D) = \prod_{i=1}^n [(1 - \lambda) \frac{f_{q_i,D}}{|D|} + \lambda \frac{c_{q_i}}{|C|}] \] Explain what types of documents tend to be retrieved when the $\lambda$ parameter for the corpus model is close to 0. Also, how would the ranking change when the lambda parameter approaches 1?
3. Estimating a relevance model provides a formal justification for pseudo-relevance feedback. Explain pseudo-relevance feedback and what happens to the query when using this technique.
[20 points] Term dependence
1. Two popular retrieval models with term dependence are the sequential dependence model and the full dependence model. For the query “russian president vladimir putin”, what features does the full dependence model add that do not occur in the sequential dependence model?
2. In naïve Bayes classification models, as in BM25 and unigram retrieval models, terms are independent of each other given the documents classification. Say that we want to add bigram features to a naïve Bayes classifier for positive vs. negative movie reviews. In addition to features like, ‘martin’ and ‘charlie’ and ‘sheen’, we would also have features like ‘martin sheen’ and ‘charlie sheen’. How could these features help improve classification? If the terms were indeed independent given the classification, what can you say about the relationship between p(‘martin’ | ‘sheen’) and p(‘martin’) in the collection of, say, positive reviews?
[25 points] Evaluation
1. The web search industry uses so-called PEGFB relevance judgments—perfect, excellent, good, fair, and bad—together with the NDCG metric to evaluate search output. Why is this metric well suited to evaluating web search using this type of judgments?
2. What is a disadvantage in using mean average precision (MAP) to evaluate web searches? Explain how retrieving relevant documents at low ranks (i.e., a long way down the ranked list) will decrease MAP.
3. Assume that you are evaluating search engines A and B with MAP. Describe how you would use some particular significance test to describe the relative performance of A and B.