Data Warehousing and Data Mining

Paper Code: MCA 424
Credits: 04
Periods/week: 04
Max. Marks: 100.00
Objective: 
  • To understand the need for data warehouses over databases
  • To get a clear idea of data mining techniques, the need for them, and the scenarios and scope of their applicability
10.00
Unit I: 

Introduction to Data Warehousing: Architecture of Data Warehouse, OLAP and Data Cubes, Dimensional Data Modeling: star and snowflake schemas, Data Preprocessing: Need, Data Cleaning, Data Integration & Transformation, Data Reduction, Machine Learning, Pattern Matching

 

10.00
Unit II: 

Introduction to Data Mining: Basic Data Mining Tasks, Data Mining versus Knowledge Discovery in Databases, Data Mining Issues, Data Mining Metrics, Social Implications of Data Mining, Data Mining Query Language, Overview of Applications of Data Mining

 

12.00
Unit III: 

Data Mining Techniques: Frequent item-sets and Association rule mining: Apriori algorithm, Use of sampling for frequent item-set, FP tree algorithm, Graph Mining, Frequent sub-graph mining, Tree mining, Sequence Mining

 

 

13.00
Unit IV: 

Classification & Prediction: Decision tree learning: Construction, performance, attribute selection; Issues: Over-fitting, tree pruning methods, missing values, continuous classes; Classification and Regression Trees (CART). Bayesian Classification: Bayes Theorem, Naïve Bayes classifier; Bayesian Networks: Inference, parameter and structure learning. Linear classifiers: Least squares, logistic, perceptron and SVM classifiers. Prediction: Linear regression, Non-linear regression

 

15.00
Unit V: 

Accuracy Measures: Precision, recall, F-measure, confusion matrix, cross-validation, bootstrap. Clustering: k-means, Expectation Maximization (EM) algorithm, Hierarchical clustering, Correlation clustering, DBSCAN. Brief overview of advanced techniques: Active learning, Reinforcement learning, Text mining, Graphical models, Web Mining

 

ESSENTIAL READINGS: 
  • Jiawei Han & Micheline Kamber, “Data Mining: Concepts & Techniques”, Morgan Kaufmann Publishers, 2002
  • Mohanty, Soumendra, “Data Warehousing: Design, Development and Best Practices”, Tata McGraw Hill, 2006
  • W. H. Inmon, “Building the Data Warehouse”, Wiley Dreamtech India Pvt. Ltd., 4th Edition, 2005

 

REFERENCES: 
  • Pieter Adriaans & Dolf Zantinge, “Data Mining”, Addison-Wesley, Pearson, 2000.
  • Daniel T. Larose, “Data Mining Methods & Models”, Wiley-India, 2007.
  • Vikram Pudi & P. Radha Krishna, “Data Mining”, Oxford University Press, 2009.
  • Alex Berson & Stephen J. Smith, “Data Warehousing, Data Mining & OLAP”, Tata McGraw-Hill, 2004.
  • Michael J. A. Berry & Gordon S. Linoff, “Data Mining Techniques”, Wiley-India, 2008.
  • Richard J. Roiger & Michael W. Geatz, “Data Mining: A Tutorial-Based Primer”, Pearson Education, 2005.
  • Margaret H. Dunham & S. Sridhar, “Data Mining: Introductory and Advanced Topics”, Pearson Education, 2008.
  • G. K. Gupta, “Introduction to Data Mining with Case Studies”, EEE, PHI, 2006.

 

Academic Year: