Data Warehousing and Data Mining

Paper Code: MCA 424
Credits: 04
Periods/week: 04
Max. Marks: 100.00
Objective: 

 The course will enable the students to

  1. Define the scope and necessity of data warehousing and data mining.
  2. Understand the need for data warehouses over databases.
  3. Describe data; choose relevant models and algorithms for respective applications.
  4. Analyze data, identify problems, and choose relevant models and algorithms to apply.
  5. Develop research interest in advances in data mining.
  6. Convey a clear idea of data mining techniques, the need for them, and the scenarios and scope of their applicability to real-world problems.

 Course Learning Outcomes (CLOs):

Learning Outcome (at course level)

Students will be able to:

  1. State data warehouse fundamentals and data mining principles.
  2. Describe a data warehouse with dimensional modeling and apply OLAP operations.
  3. Apply data mining algorithms to solve real-world problems.
  4. Compare and evaluate different data mining techniques such as classification, prediction, clustering and association rule mining.
  5. Apply data mining techniques to real-world problems using a tool.
  6. Benefit user experiences towards research innovation and integration.

Learning and teaching strategies

Approach in teaching:

Interactive lectures, discussion, demonstration with real-world examples, role plays, tool-based experiments

Learning activities for the students:

Self-learning assignments, quiz activity, effective questions, case-study-based learning approach, presentation, flipped classroom

Assessment Strategies

  • Assignments
  • Written test in classroom
  • Classroom Activity
  • Continuous Assessment
  • Semester End Examination

10.00
Unit I: Introduction to Data Warehousing

Architecture of Data Warehouse, OLAP and Data Cubes, Dimensional Data Modeling: star and snowflake schemas, Data Preprocessing: Need, Data Cleaning, Data Integration & Transformation, Data Reduction

10.00
Unit II: Introduction to Data Mining

Basic Data Mining Tasks, Data Mining versus Knowledge Discovery in Databases, Data Mining Issues, Data Mining Metrics, Social Implications of Data Mining, Overview of Applications of Data Mining

12.00
Unit III: Data Mining Techniques

Frequent item-sets and Association rule mining: Apriori algorithm, Use of sampling for frequent item-sets, FP-tree algorithm, Graph Mining, Frequent sub-graph mining, Tree mining, Sequence Mining

13.00
Unit IV: Classification & Prediction

Decision tree learning: Construction, performance, attribute selection. Issues: Over-fitting, tree pruning methods, missing values, continuous classes. Classification and Regression Trees (CART). Bayesian Classification: Bayes Theorem, Naïve Bayes classifier. Bayesian Networks: Inference, Parameter and structure learning. Linear classifiers: Least squares, logistic, perceptron and SVM classifiers. Prediction: Linear regression, Non-linear regression

15.00
Unit V: Accuracy Measures

Precision, recall, F-measure, confusion matrix, cross-validation, bootstrap. Clustering: k-means, k-medoids, Expectation Maximization (EM) algorithm, Hierarchical clustering, Correlation clustering. Brief overview of advanced techniques: Active learning, Reinforcement learning, Text mining, Graphical models, Web Mining. Data mining tool: Orange 3.8 or 3.9

ESSENTIAL READINGS: 
  • Jiawei Han & Micheline Kamber, “Data Mining: Concepts & Techniques”, Morgan Kaufmann Publishers, 3rd Edition, 2011
  • Mohanty, Soumendra, “Data Warehousing: Design, Development and Best Practices”, Tata McGraw Hill, 2006
  • W. H. Inmon, “Building the Data Warehouse”, Wiley Dreamtech India Pvt. Ltd., 4th Edition, 2005
REFERENCES: 
  • Pieter Adriaans & Dolf Zantinge, “Data Mining”, Addison-Wesley, Pearson, 2000.
  • Daniel T. Larose, “Data Mining Methods & Models”, Wiley-India, 2007.
  • Vikram Pudi & P. Radhakrishnan, “Data Mining”, Oxford University Press, 2009.
  • Alex Berson & Stephen J. Smith, “Data Warehousing, Data Mining & OLAP”, Tata McGraw-Hill, 2004.
  • Michael J. A. Berry & Gordon S. Linoff, “Data Mining Techniques”, Wiley-India, 2008.
  • Richard J. Roiger & Michael W. Geatz, “Data Mining: A Tutorial-Based Primer”, Pearson Education, 2005.
  • Margaret H. Dunham & S. Sridhar, “Data Mining: Introductory and Advanced Topics”, Pearson Education, 2008.
  • G. K. Gupta, “Introduction to Data Mining with Case Studies”, EEE, PHI, 2006.
Academic Year: