Course Objectives:
The course will enable the students to
Course Code: 24DCAI 703A
Course Title: DATA WAREHOUSE & DATA MINING (Theory)
Learning outcomes (at course level):
CO115. Analyse the significance of data warehouses and data mining in information management.
CO116. Elaborate the concepts and architecture of data warehouses.
CO117. Determine data mining and machine learning fundamentals with evaluation metrics.
CO118. Inspect unbalanced data in unsupervised learning using association rules and clustering algorithms.
CO119. Construct models, evaluate performance, and address attribute selection in classification and prediction.
CO120. Contribute effectively in course-specific interaction.
Learning and teaching strategies:
Approach in teaching: Interactive Lectures, Discussion, PowerPoint Presentations, Informative videos.
Learning activities for the students: Self-learning assignments, Effective questions, presentations.
Assessment strategies: Assessment tasks will include Class Test on the topics, Semester end examinations, Quiz, Student presentations and assignments.
Decision support systems, Operational versus Decision-Support Systems, Data Warehousing - the only solution, definitions of Data warehousing and data mining, features of a Data warehouse, Data Marts, Metadata. Planning a Data warehouse, project team, project management considerations, information packages & requirements gathering methods, and Requirements definition: Scope and Content.
Objectives, Data Warehouse Architecture, Distinguishing Characteristics, Architectural Framework. Infrastructure: Operational & Physical. Implementation of a Data warehouse, ETL (Extract, Transform and Load) in a Data warehouse. Physical design: steps, considerations, physical storage, indexing. Data lake vs. Data warehouse.
Basic Data Mining Tasks, Data Mining versus Knowledge Discovery in Databases, Applications of Machine Learning, Machine Learning vs. AI, Types of Machine Learning, Metrics, Accuracy Measures: Precision, recall, F-measure, confusion matrix, cross-validation.
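The accuracy measures listed above can be illustrated with a minimal Python sketch. It assumes a binary problem with labels 0 and 1; the function names and the sample labels are made up for illustration and are not part of the course material.

```python
def confusion_matrix(y_true, y_pred, positive=1):
    """Count (TP, FP, FN, TN) for a binary problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1          # correctly predicted positive
        elif predicted == positive:
            fp += 1          # predicted positive, actually negative
        elif actual == positive:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1          # correctly predicted negative
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F-measure (F1); 0.0 when a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels, chosen only to make the counts easy to check by hand.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
accuracy = (tp + tn) / len(y_true)
print(tp, fp, fn, tn, precision, recall, f1, accuracy)
```

In practice a library such as scikit-learn would normally be used for these metrics; the hand-written version is only meant to show how each measure follows from the four confusion-matrix counts.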
Unbalanced data, Unsupervised Learning: Association rules, Apriori algorithm, FP-tree algorithm, Market Basket Analysis and Association Analysis. Clustering: k-means and implementation of k-means using Python. Concept of other clustering algorithms: Hierarchical clustering and DBSCAN.
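As one possible illustration of the k-means topic above, the following sketch implements the standard assign-and-update loop (Lloyd's algorithm) using only the Python standard library. The function name, the sample points and the choice of starting centroids are assumptions made for this example; it requires Python 3.8 or later for `math.dist`.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Basic k-means: `centroids` holds the starting cluster centres."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep the centre of an empty cluster
        if new_centroids == centroids:     # converged: nothing moved
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical 2-D data with two well-separated groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, labels = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

The starting centroids are passed in explicitly so the result is reproducible; real implementations usually pick them randomly or with a scheme such as k-means++, and the outcome can depend on that choice.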
Model construction, performance, attribute selection. Issues: underfitting and overfitting, cross-validation, tree pruning methods, missing values, Information Gain, Gain Ratio, Gini Index, continuous classes. Classification and Regression Trees (CART) and C5.0. Linear Regression, Multiple Linear Regression, Logistic Regression, Naïve Bayes, Support Vector Machines (SVM) and Simple neural network.
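The attribute selection measures named above (Information Gain and the Gini Index) can be sketched in a few lines of Python. The helper names and the toy "yes"/"no" split are hypothetical, and the functions assume non-empty label lists.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical split of 8 records on some attribute into two branches.
parent = ["yes"] * 4 + ["no"] * 4
left = ["yes", "yes", "yes", "no"]
right = ["yes", "no", "no", "no"]

print(entropy(parent))                          # balanced two-class node
print(gini(parent))
print(information_gain(parent, [left, right]))
```

A decision tree learner evaluates such a measure for every candidate attribute and splits on the one with the best score; which measure is used varies by algorithm (CART is usually described with the Gini index, while the C4.5/C5.0 family is usually described with gain ratio).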
SUGGESTED TEXT BOOKS:
SUGGESTED REFERENCE BOOKS:
REFERENCE JOURNALS:
e-RESOURCES INCLUDING LINKS: