Information Retrieval

Paper Code: MCA 525A
Credits: 04
Periods/week: 04
Max. Marks: 100.00
Objective: 

 The course will enable the students to

  1. Learn the fundamentals of IR and the basic theories related to it.
  2. Study the standard models of IR (Boolean, Vector-space, Probabilistic and Logical models).
  3. Understand the difficulty of representing and retrieving documents, images, speech, etc.
  4. Become familiar with various algorithms and systems, and learn the concepts behind web search.
  5. Learn about the role of semantics in IR.

 

Course Learning Outcomes (CLOs):

 

Learning Outcome (at course level)

Students will be able to:

  1. Understand various IR models and the suitability of each for a given application.
  2. Develop algorithms based on various IR concepts.
  3. Develop applications based on text classification.
  4. Understand retrieval performance evaluation.
  5. Understand diverse concepts and algorithms of web search and the role of semantics in IR.

Learning and teaching strategies

Approach in teaching:

Interactive lectures, modeling, discussions, enquiry-based learning, a student-centered approach, and audio-visual aids

Learning activities for the students:

Experiential learning, presentations, discussions, quizzes and assignments

Assessment Strategies

  • Assignments
  • Written test in classroom
  • Classroom Activity
  • Continuous Assessment
  • Semester End Examination

 

8.00
Unit I: 
Introduction to IR

Motivation, Basic Concepts, IR Applications and Scope, The nature of unstructured and semi-structured text, Basic structure of a search engine, Past and Future, The Retrieval Process, Web search and IR, Information Retrieval vs. Data Retrieval, IR vs. IE, Concept of relevance.

13.00
Unit II: 
Indexing and Query & Document Processing

Index Construction, Indexing techniques for textual information items, such as inverted indices, Latent Semantic Indexing, Index Compression.

Document Preprocessing: tokenization, stemming and stop words, Pattern Matching.
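As an illustration of how the preprocessing steps in this unit feed an inverted index, here is a minimal Python sketch. The stop-word list, the `crude_stem` suffix stripper (a toy stand-in for a real stemmer such as Porter's), the sample documents and all function names are assumptions made for the example, not part of the prescribed material.

```python
import re
from collections import defaultdict

# Illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "is", "to", "are"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def crude_stem(token):
    """Strip a few common suffixes; a toy stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Tokenize, drop stop words, then stem what remains."""
    return [crude_stem(t) for t in tokenize(text) if t not in STOP_WORDS]

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in preprocess(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def boolean_and(index, query):
    """Return ids of documents containing every (preprocessed) query term."""
    terms = preprocess(query)
    if not terms:
        return []
    postings = [set(index.get(t, [])) for t in terms]
    return sorted(set.intersection(*postings))

docs = {
    1: "Indexing the documents of a collection",
    2: "Compressed indexes and query processing",
    3: "Stemming and stop words in document preprocessing",
}
index = build_inverted_index(docs)
print(index["index"])                           # postings list for the stem "index"
print(boolean_and(index, "indexed documents"))  # conjunctive Boolean query
```

The query passes through the same preprocessing as the documents, so that "indexed" and "indexing" map to the same index term.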

13.00
Unit III: 

Study Popular Retrieval Models

Taxonomy of Information Retrieval Models, A Formal Characterization of IR Models.

Classic Information Retrieval: Basic Concepts, Boolean Model, Vector Space Model - TF-IDF weighting, Probabilistic Model.
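The Vector Space Model with TF-IDF weighting can be sketched in a few lines of Python. This sketch assumes one common weighting variant (raw term frequency multiplied by log(N/df)); many other variants exist, and the function names and the tiny pre-tokenized collection are made up for illustration.

```python
import math
from collections import Counter

def tf_idf_vectors(tokenized_docs):
    """Build sparse TF-IDF vectors using raw tf and idf = log(N / df)."""
    n = len(tokenized_docs)
    df = Counter()
    for tokens in tokenized_docs:
        df.update(set(tokens))          # count each term once per document
    idf = {t: math.log(n / d) for t, d in df.items()}
    vectors = []
    for tokens in tokenized_docs:
        tf = Counter(tokens)
        vectors.append({t: c * idf[t] for t, c in tf.items()})
    return vectors, idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank(query_tokens, vectors, idf):
    """Return indices of documents with non-zero score, best first."""
    q_tf = Counter(query_tokens)
    q_vec = {t: c * idf[t] for t, c in q_tf.items() if t in idf}
    scores = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    ordered = sorted(scores, key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in ordered if score > 0]

# Hypothetical, already-tokenized collection.
tokenized_docs = [
    ["boolean", "model", "retrieval"],
    ["vector", "space", "model"],
    ["probabilistic", "retrieval", "model"],
]
vectors, idf = tf_idf_vectors(tokenized_docs)
print(rank(["vector", "model"], vectors, idf))
```

Because "model" occurs in every document, its idf is zero under this variant and it contributes nothing to the ranking; only the rarer term "vector" discriminates between documents.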

Language modeling, Probability ranking principle, relevance feedback, pseudo-relevance feedback, query expansion and its techniques.

13.00
Unit IV: 
Retrieval Performance Evaluation and Document Text Mining

Measures to compute similarity (Cosine, Jaccard), Retrieval performance evaluation: Recall and Precision, F-Measure, NDCG.
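The measures listed above can be illustrated with a short Python sketch. It assumes set-based Jaccard similarity, the balanced F-measure (F1), and the DCG variant that divides a linear gain by log2(rank + 1); other NDCG formulations (for example with gain 2^rel - 1) are also in common use. All function names and sample values are illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and balanced F-measure."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    p = hits / len(retrieved) if retrieved else 0.0
    r = hits / len(relevant) if relevant else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

def dcg(gains):
    """Discounted cumulative gain of graded relevance values in rank order."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg(gains):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0

p, r, f1 = precision_recall_f1(retrieved=[1, 2, 3, 4], relevant=[2, 4, 6])
print(p, r, f1)
print(jaccard({"a", "b", "c"}, {"b", "c", "d"}))
print(ndcg([1, 3, 2]))
```

In the example call, two of the four retrieved documents are relevant (precision 1/2) and two of the three relevant documents are retrieved (recall 2/3).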

Document Text Mining: An overview of Text Classification, Document Clustering.

An introduction to Personalized Search and Cross-Lingual Information Retrieval.

13.00
Unit V: 
An Introduction to Web Search Basics

Web Search Basics, Web structure & Characteristics, Web Crawling and Web Indexes, Meta Crawlers, Focused Crawling, Link Analysis: Hubs and Authorities, PageRank & HITS algorithms, Query Log Analysis, Searching & Ranking, Introduction to Semantics-based IR and Semantic Web.
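As one example of the link-analysis algorithms in this unit, PageRank can be approximated by power iteration. The sketch below is a simplified version under stated assumptions: a damping factor of 0.85, a fixed number of iterations instead of a convergence test, and rank from pages without out-links spread uniformly over all pages. The four-page link graph and the function name are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages with no out-links is shared among all pages.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        # Each page passes its damped rank equally to the pages it links to.
        for src, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank

# Hypothetical link graph: page D has no incoming links.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

The ranks form a probability distribution over pages; page C, which receives links from three other pages, ends up with the highest score in this graph.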

ESSENTIAL READINGS: 
  • C.D. Manning, P. Raghavan, H. Schütze, “Introduction to Information Retrieval”, Cambridge UP, 2009.
  • D.A. Grossman, O. Frieder, “Information Retrieval: Algorithms and Heuristics”, Springer, 2004.
REFERENCES: 
  • G. Kowalski, M.T. Maybury, “Information Storage and Retrieval Systems”, Springer, 2005.
  • C.J. van Rijsbergen, “The Geometry of Information Retrieval”, Cambridge UP, 2004.
  • B. Croft, D. Metzler, T. Strohman, “Search Engines: Information Retrieval in Practice”, Pearson Education, 2009.
  • R. Baeza-Yates, B. Ribeiro-Neto, “Modern Information Retrieval”, 2nd edition, Addison-Wesley, 2011.