Architecture of Data Warehouse, OLAP and Data Cubes, Dimensional Data Modeling: star and snowflake schemas, Data Preprocessing: Need, Data Cleaning, Data Integration & Transformation, Data Reduction, Machine Learning, Pattern Matching
Basic Data Mining Tasks, Data Mining versus Knowledge Discovery in Databases, Data Mining Issues, Data Mining Metrics, Social Implications of Data Mining, Data Mining Query Language, Overview of Applications of Data Mining
Frequent item-sets and Association rule mining: Apriori algorithm, Use of sampling for frequent item-set, FP tree algorithm, Graph Mining, Frequent sub-graph mining, Tree mining, Sequence Mining
Decision tree learning: Construction, performance, attribute selection. Issues: Over-fitting, tree pruning methods, missing values, continuous classes. Classification and Regression Trees (CART). Bayesian Classification: Bayes Theorem, Naïve Bayes classifier. Bayesian Networks: Inference, parameter and structure learning. Linear classifiers: Least squares, logistic, perceptron and SVM classifiers. Prediction: Linear regression, Non-linear regression
Precision, recall, F-measure, confusion matrix, cross-validation, bootstrap. Clustering: k-means, Expectation Maximization (EM) algorithm, Hierarchical clustering, Correlation clustering, DBSCAN. Brief overview of advanced techniques: Active learning, Reinforcement learning, Text mining, Graphical models, Web Mining