WEB MINING AND ANALYTICS

Paper Code: 24MCA325D
Credits: 04
Periods/week: 04
Max. Marks: 100.00
Objective: 

This course enables the students to

  1. Understand the basic concepts and techniques of Information Retrieval, Web Search, Data Mining, and Machine Learning for extracting knowledge from the web.
  2. Describe complex data types with respect to spatial and web mining.
  3. Appreciate the use of machine learning approaches for Web Content Mining.
  4. Describe the various aspects of web usage mining.
  5. Develop skills in using recent data mining software to solve practical problems of Web Mining.
  6. Interpret emergent features such as the structure and evolution of the Web graph, its traffic patterns, and the spread of information.

 

Course Outcomes: 

Course Code: 24MCA 325D
Course Title: Web Mining and Analytics (Theory)

Learning Outcomes (at course level):

  1. Examine the importance of Web Mining algorithms and Information Retrieval concepts in Web search.
  2. Compare different components of a web page that can be used for mining.
  3. Examine the basic concepts of web content mining.
  4. Implement the Page Ranking algorithm and modify it for mining information.
  5. Modify an existing search engine to make it personalized using web analytics.
  6. Contribute effectively in course-specific interaction.

Learning and teaching strategies:

Approach in teaching: Interactive Lectures, Modeling, Discussions, implementing enquiry-based learning.

Learning activities for the students: Experiential Learning, Presentations, Case-based learning, Discussions, Quizzes and Assignments

Assessment Strategies:

  • Assignments
  • Written test in classroom
  • Classroom activity
  • Continuous Assessment
  • Semester End Examination

 

12.00
Unit I: Introduction

Introduction – Web Mining – Theoretical background – Algorithms and techniques – Association rule mining – Sequential Pattern Mining – Information retrieval and Web search – Information retrieval models – Relevance Feedback – Text and Web page pre-processing

 

14.00
Unit II: Web Content Mining

Web Content Mining – Supervised Learning – Decision tree – Naive Bayesian Text Classification – Support Vector Machines – Ensemble of Classifiers. Unsupervised Learning – K-means Clustering – Hierarchical Clustering – Partially Supervised Learning

 

14.00
Unit III: Web Structure and Web Usage Mining

Hyperlink-based Ranking – Introduction – Social Network Analysis – Co-Citation and Bibliographic Coupling – Page Rank – Authorities – Enhanced Techniques for Page Ranking – Community Discovery – Web Crawling – A Basic Crawler Algorithm – Implementation Issues
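
Illustrative example for the Page Rank topic: a minimal sketch of ranking by power iteration, assuming the web graph is given as a Python dictionary that maps each page to the pages it links to. The function name `pagerank`, the damping factor of 0.85, the convergence tolerance, and the four-page graph are assumptions chosen for illustration; they are not prescribed by the course or the essential reading.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=100):
    """Power-iteration PageRank on a dict {page: [pages it links to]}."""
    # Collect every page that appears as a source or a target of a link.
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(max_iter):
        # Rank held by dangling pages (no out-links) is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n
                    for p in pages}
        # Each page shares its current rank equally among its out-links.
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new_rank[q] += damping * rank[p] / len(outs)
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:  # stop once the scores have stabilised
            break
    return rank


# Hypothetical four-page web graph, used only for illustration.
toy_graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(toy_graph)
for page, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.4f}")
```

In this toy graph page C collects links from A, B and D and therefore ranks highest, while D, which no page links to, keeps only the baseline share (1 - damping) / n. Personalised or topic-sensitive variants typically replace the uniform baseline with a non-uniform preference vector.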

Web Usage Mining – Sources of data – Applications – Clickstream Analysis – Web Server Log Files – Data Collection and Pre-processing – Cleaning and Filtering – Data Modeling for Web Usage Mining – Issues – Discovery and Analysis of Web Usage Patterns – Tools used in Web Usage Mining.

 

10.00
Unit IV: Introduction to Web Analytics

Motivation and historical perspective on the development of web analytics, Display and search advertising, Knowledge discovery from web data, Major computing paradigms, Typical problem formulations

 

10.00
Unit V: Web Analytics at e-Business Scale

Framework for mapping business needs to web analytics tasks, Data collection architecture, Introduction to OLAP, Web data exploration and reporting, Introduction to Splunk

 

ESSENTIAL READINGS: 
  1. Bing Liu, “Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data (Data-Centric Systems and Applications)”, Springer; 2nd Edition, 2009

 

REFERENCES: 

Suggested Readings:

  1. Guandong Xu, Yanchun Zhang, Lin Li, “Web Mining and Social Networking: Techniques and Applications”, Springer; 1st Edition, 2010
  2. Zdravko Markov, Daniel T. Larose, “Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage”, John Wiley & Sons, Inc., 2007

 

E-resources:

  1. Data Mining and Analysis (https://online.stanford.edu/)
  2. Text Mining and Analytics (https://www.coursera.org/)
  3. Text Retrieval and Search Engines (https://www.coursera.org/)
  4. Data Visualization (https://www.coursera.org/)

Journals (International / National):

  1. International Journal of Information Technology and Decision Making
  2. International Journal of Mining Science and Technology
  3. Social Network Analysis and Mining
  4. International Journal of Engineering Research & Technology (IJERT)


                                                                                                                              

 

Academic Year: