Data Analytics

Paper Code: MCA 524A
Credits: 04
Periods/week: 04
Max. Marks: 100.00
Objective: 
  • To explore the fundamental concepts of Data Analytics
  • To understand the various search methods and statistical techniques
  • To understand forecasting methods and apply them in various applications
  • To understand how to mine data and to learn about stream computing
  • To know about research that requires the integration of large amounts of data

10.00
Unit I: Introduction to Big Data

Analytics – Nuances of Big Data – Value – Issues – Case for Big Data – Big Data options – Team challenge – Big Data sources – Acquisition – Nuts and bolts of Big Data. Features of Big Data – Security, compliance, auditing and protection – Evolution of Big Data – Best practices for Big Data analytics – Big Data characteristics: Volume, Veracity, Velocity, Variety – Data appliance and integration tools – Greenplum – Informatica

12.00
Unit II: Data Analysis

Evolution of analytic scalability – Convergence – Parallel processing systems – Cloud computing – Grid computing – MapReduce – Enterprise analytic sandbox – Analytic data sets – Analytic methods – Analytic tools – Cognos – MicroStrategy – Pentaho. Analysis approaches – Statistical significance – Business approaches – Analytic innovation – Traditional approaches – Iterative

12.00
Unit III: Stream Computing

Introduction to stream concepts – Stream data model and architecture – Stream computing – Sampling data in a stream – Filtering streams – Counting distinct elements in a stream – Estimating moments – Counting ones in a window – Decaying window – Real-Time Analytics Platform (RTAP) applications – IBM InfoSphere – Big data at rest – InfoSphere Streams – DataStage – Statistical analysis – Intelligent scheduler
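
Illustration for "Sampling data in a stream": the minimal Python sketch below shows reservoir sampling, one common way to keep a fixed-size uniform sample from a stream whose length is not known in advance. It is only an example under stated assumptions; the function name `reservoir_sample` and the `seed` parameter are hypothetical choices made for this sketch and are not prescribed by the course.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: int = 0) -> List[T]:
    """Keep a uniform random sample of k items from a stream of unknown length.

    `seed` is an assumption of this sketch, used only to make runs repeatable.
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item survives with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Sample 10 readings from a simulated stream of 1000 readings.
    print(reservoir_sample(range(1000), 10))
```

The sample is held in memory of size k regardless of how long the stream runs, which is the property that makes this approach suitable for stream processing.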

14.00
Unit IV: Predictive Analytics and Visualization

Predictive analytics – Supervised – Unsupervised learning – Neural networks – Kohonen models – Normal – Deviations from normal patterns – Normal behaviours – Expert options – Variable entry – Mining frequent itemsets – Market basket model – Apriori algorithm – Handling large data sets in main memory – Limited-pass algorithms – Counting frequent itemsets in a stream – Clustering techniques – Hierarchical – K-Means – Clustering high-dimensional data. Visualizations – Visual data analysis techniques, interaction techniques; systems and applications
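
Illustration for "Mining frequent itemsets" and the Apriori algorithm: the short Python sketch below counts itemsets level by level and keeps only those that appear in at least `min_support` baskets. It is a simplified teaching example, not an optimized implementation; the function name `apriori`, its parameters, and the sample baskets are hypothetical and chosen only for this sketch.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List


def apriori(transactions: Iterable[Iterable[str]], min_support: int) -> Dict[FrozenSet[str], int]:
    """Return every itemset that occurs in at least `min_support` baskets, with its count."""
    baskets: List[FrozenSet[str]] = [frozenset(t) for t in transactions]

    # Pass 1: count single items.
    counts: Dict[FrozenSet[str], int] = {}
    for basket in baskets:
        for item in basket:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Candidate k-itemsets: unions of frequent (k-1)-itemsets whose
        # (k-1)-subsets are all frequent (the Apriori pruning rule).
        candidates = set()
        for a in prev:
            for b in prev:
                union = a | b
                if len(union) == k and all(
                    frozenset(sub) in prev for sub in combinations(union, k - 1)
                ):
                    candidates.add(union)
        # Pass k: count how many baskets contain each candidate.
        counts = {c: 0 for c in candidates}
        for basket in baskets:
            for c in candidates:
                if c <= basket:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


if __name__ == "__main__":
    sample = [["milk", "bread"], ["milk", "bread", "eggs"], ["bread", "eggs"], ["milk", "eggs"]]
    for itemset, count in apriori(sample, min_support=2).items():
        print(sorted(itemset), count)
```

The pruning step relies on the fact that an itemset can be frequent only if all of its subsets are frequent, which keeps the number of candidates counted in each pass small.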

12.00
Unit V: Using R with Large Database

Basics of R, concepts before starting, working of R – Creating, listing and deleting the objects in memory – The on-line help. Data with R: objects, R data frames and matrices, reading data in a file, saving data, generating data, manipulating objects. Graphics with R: managing graphics, graphical functions – Low-level plotting commands, graphical parameters, a practical example – The grid and lattice packages

ESSENTIAL READINGS: 
  • Michael Minelli, Michelle Chambers, and Ambiga Dhiraj, “Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses”, Wiley, 2013.
  • Chris Eaton, Dirk deRoos, Tom Deutsch et al., “Understanding Big Data”, McGraw-Hill, 2012.
  • Alberto Cordoba, “Understanding the Predictive Analytics Lifecycle”, Wiley, 2014.
  • Eric Siegel, Thomas H. Davenport, “Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die”, Wiley, 2013.
  • W.N. Venables, D.M. Smith, “An Introduction to R”; “R in a Nutshell”, O'Reilly.
  • Frank J Ohlhorst, “Big Data Analytics: Turning Big Data into Big Money”, Wiley and SAS Business Series, 2012
REFERENCES: 
  • Michael Berthold, David J. Hand, “Intelligent Data Analysis”, Springer, 2007.
  • Chris Eaton, Dirk deRoos, Tom Deutsch, George Lapis, Paul Zikopoulos, “Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data”, McGraw-Hill Publishing, 2012.
  • Colleen McCue, “Data Mining and Predictive Analysis: Intelligence Gathering and Crime Analysis”, Elsevier, 2007.
  • Anand Rajaraman and Jeffrey David Ullman, Mining of Massive Datasets, Cambridge University Press, 2012.
  • Bill Franks, “Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics”, Wiley and SAS Business Series, 2012.
  • Paul Zikopoulos, Chris Eaton, “Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data”, McGraw-Hill, 2011.
Academic Year: