Predictive Analysis using R

Paper Code: DAC 332
Credits: 04
Periods/week: 02
Max. Marks: 100.00
Objective: 

Learning outcomes (at course level)

Students will be able to:

  1. Explain the basic applications, concepts, and techniques of predictive analysis.
  2. Apply the analysis tool R to solve practical problems in a variety of disciplines.
  3. Apply basic and advanced statistical techniques used in business research.
  4. Analyze large data sets to gain useful business understanding.
  5. Describe and demonstrate R methods for data visualisation.

Learning and teaching strategies

Approach in teaching:

Interactive Lectures, Discussion, Demonstrations, Group activities, Teaching using advanced IT audio-video tools 

 

Learning activities for the students:

Effective assignments, Giving tasks.

 

Assessment Strategies

Class test, Semester end examinations, Quiz, Practical Assignments, Individual and group projects

 

 

12.00
Unit I: Introduction to R Programming

R and RStudio, Logical Arguments, Missing Values, Characters, Factors and Numeric, Help in R, Vector to Matrix, Matrix Access, Data Frames, Data Frame Access, Basic Data Manipulation Techniques, Usage of the apply family of functions (apply, lapply, sapply and tapply), Outlier treatment.
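As an illustration of the apply family named in this unit, the following minimal sketch uses base R only. The matrix, list, and vectors are made-up example data, not course material.

```r
# Hypothetical 3 x 2 matrix of scores (filled column by column)
scores <- matrix(c(10, 20, 30, 40, 50, 60), nrow = 3)

# apply: run a function over rows (MARGIN = 1) or columns (MARGIN = 2)
row_totals <- apply(scores, 1, sum)
col_means  <- apply(scores, 2, mean)

# lapply always returns a list; sapply simplifies to a vector when it can
values       <- list(a = 1:3, b = 4:6)
list_means   <- lapply(values, mean)
vector_means <- sapply(values, mean)

# tapply: apply a function to a vector split by a grouping factor
sales          <- c(100, 150, 200, 250)
region         <- factor(c("north", "south", "north", "south"))
mean_by_region <- tapply(sales, region, mean)

print(row_totals)
print(col_means)
print(vector_means)
print(mean_by_region)
```

The practical difference between lapply and sapply is the return type: lapply keeps a list, while sapply tries to simplify the result to a vector or matrix.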

 

12.00
Unit II: Descriptive Statistics

Measures of Central Tendency (Mean, Mode and Median), Charts (Bar, Pie, Box Plot, Histogram, Stem and Leaf Diagram), Measures of Dispersion (Range, Interquartile Range, Standard Deviation, Skewness and Kurtosis), Standard Error of Mean and Confidence Intervals.
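Several of the measures listed above can be computed in base R, as in the sketch below. The sample `x` is invented for illustration, and `stat_mode` is a hypothetical helper written here because base R has no built-in function for the statistical mode. The confidence interval assumes a t-based interval for the mean.

```r
x <- c(12, 15, 15, 18, 20, 22, 25)   # hypothetical sample
n <- length(x)

# Central tendency
mean_x   <- mean(x)
median_x <- median(x)

# Hypothetical helper: most frequent value(s) in a numeric vector
stat_mode <- function(v) {
  counts <- table(v)
  as.numeric(names(counts)[counts == max(counts)])
}
mode_x <- stat_mode(x)

# Dispersion
range_x <- diff(range(x))   # maximum minus minimum
iqr_x   <- IQR(x)
sd_x    <- sd(x)

# Standard error of the mean and a 95% t-based confidence interval
se_mean <- sd_x / sqrt(n)
t_crit  <- qt(0.975, df = n - 1)
ci_95   <- mean_x + c(-1, 1) * t_crit * se_mean

print(c(mean = mean_x, median = median_x, mode = mode_x))
print(c(range = range_x, iqr = iqr_x, sd = sd_x, se = se_mean))
print(ci_95)
```

The interval computed by hand here should agree with the one reported by `t.test(x)$conf.int`.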

Discrete Probability Distributions (Binomial, Poisson), Continuous Probability Distributions (Normal Distribution and t-distribution), Sampling Distribution and Central Limit Theorem.

 

12.00
Unit III: Statistical Inference and Hypothesis Testing

Parametric and non-parametric tests (one sample, independent samples, paired samples, and two and more than two samples).
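The test families named in this unit map onto base R functions roughly as sketched below. All data are simulated with a fixed seed purely for illustration, and the choice of tests is an assumption about what the unit covers, not a list taken from the syllabus.

```r
set.seed(1)  # simulated data, for illustration only
group_a <- rnorm(20, mean = 50, sd = 5)
group_b <- rnorm(20, mean = 53, sd = 5)
before  <- rnorm(20, mean = 50, sd = 5)
after   <- before + rnorm(20, mean = 2, sd = 3)
score   <- rnorm(30, mean = 50, sd = 5)
group   <- factor(rep(c("A", "B", "C"), each = 10))

# Parametric tests
one_sample  <- t.test(group_a, mu = 50)               # one sample
independent <- t.test(group_a, group_b)               # two independent samples (Welch by default)
paired      <- t.test(before, after, paired = TRUE)   # paired samples
anova_fit   <- aov(score ~ group)                     # more than two samples

# Non-parametric counterparts
wilcox_independent <- wilcox.test(group_a, group_b)              # Mann-Whitney / rank-sum
wilcox_paired      <- wilcox.test(before, after, paired = TRUE)  # signed-rank
kruskal            <- kruskal.test(score ~ group)                # more than two samples

print(one_sample$p.value)
print(independent$p.value)
print(paired$p.value)
print(summary(anova_fit))
print(kruskal$p.value)
```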

 

12.00
Unit IV: Correlation and Regression

Analysis of Relationship, Positive and Negative Correlation, Perfect Correlation, Correlation Matrix, Scatter Plots, Simple Linear Regression, R Square, Adjusted R Square, Testing of Slope, Standard Error of Estimate, Overall Model Fitness, Assumptions of Linear Regression, Multiple Regression, Coefficients of Partial Determination, Durbin-Watson Statistic, Variance Inflation Factor.
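A minimal sketch of several of these quantities follows, using the `mtcars` data set that ships with R. The choice of variables is arbitrary. The variance inflation factor and the Durbin-Watson statistic are computed by hand from their definitions to avoid assuming any add-on package; since `mtcars` is not a time series, the Durbin-Watson value here only demonstrates the calculation.

```r
data(mtcars)

# Correlation matrix for three variables
cor_matrix <- cor(mtcars[, c("mpg", "wt", "hp")])

# Simple linear regression: mpg explained by wt
simple_fit     <- lm(mpg ~ wt, data = mtcars)
simple_summary <- summary(simple_fit)
r_sq       <- simple_summary$r.squared
adj_r_sq   <- simple_summary$adj.r.squared
slope_test <- coef(simple_summary)["wt", ]   # estimate, std. error, t value, p-value
see        <- simple_summary$sigma           # standard error of estimate
f_stat     <- simple_summary$fstatistic      # overall model fitness

# Multiple regression with two predictors
multi_fit <- lm(mpg ~ wt + hp, data = mtcars)

# Variance inflation factor for wt: 1 / (1 - R^2) of wt regressed on the other predictors
vif_wt <- 1 / (1 - summary(lm(wt ~ hp, data = mtcars))$r.squared)

# Durbin-Watson statistic: squared successive residual differences over squared residuals
e  <- residuals(multi_fit)
dw <- sum(diff(e)^2) / sum(e^2)

print(round(cor_matrix, 3))
print(c(r_sq = r_sq, adj_r_sq = adj_r_sq, see = see))
print(slope_test)
print(c(vif_wt = vif_wt, dw = dw))
```

In a simple regression with one predictor, R Square equals the squared correlation between the two variables, which gives a quick consistency check on the output.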

 

12.00
Unit V: Logistic Regression

Binary Classification versus Point Estimation, Odds versus Probability, Logit Function, Classification Matrix, Individual Group Classification Efficiency, Overall Classification Efficiency, Nagelkerke R Square, Receiver Operating Characteristic Curve, Sensitivity, Specificity, Area Under ROC Curve, Cut-Offs, True Positive Rate and False Positive Rate.
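The relationship between probability, odds, and the logit, together with the classification measures listed above, can be sketched in base R as follows. The model (`am` predicted from `wt` in the built-in `mtcars` data) and the 0.5 cut-off are arbitrary choices for illustration. Nagelkerke R Square and the area under the ROC curve are computed by hand from standard formulas rather than from an add-on package.

```r
data(mtcars)

# Binary outcome: am (0 = automatic, 1 = manual) predicted from weight
fit  <- glm(am ~ wt, data = mtcars, family = binomial)
prob <- fitted(fit)              # predicted probabilities
odds <- prob / (1 - prob)        # odds versus probability
logit_p <- log(odds)             # logit: equals the model's linear predictor

# Classification matrix at an assumed cut-off of 0.5
cutoff    <- 0.5
actual    <- factor(mtcars$am, levels = c(0, 1))
predicted <- factor(as.integer(prob >= cutoff), levels = c(0, 1))
conf_mat  <- table(actual, predicted)

sensitivity <- conf_mat["1", "1"] / sum(conf_mat["1", ])   # true positive rate
specificity <- conf_mat["0", "0"] / sum(conf_mat["0", ])   # 1 - false positive rate
overall     <- sum(diag(conf_mat)) / sum(conf_mat)         # overall classification efficiency

# Nagelkerke R Square from the null and residual deviances (ungrouped binary data)
n <- nrow(mtcars)
cox_snell  <- 1 - exp((fit$deviance - fit$null.deviance) / n)
nagelkerke <- cox_snell / (1 - exp(-fit$null.deviance / n))

# Area under the ROC curve via the rank-sum (Mann-Whitney) identity
y   <- mtcars$am
n1  <- sum(y == 1)
n0  <- sum(y == 0)
auc <- (sum(rank(prob)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

print(conf_mat)
print(c(sensitivity = sensitivity, specificity = specificity, overall = overall))
print(c(nagelkerke = nagelkerke, auc = auc))
```

Changing `cutoff` trades sensitivity against specificity; the ROC curve traces that trade-off over all possible cut-offs.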

 

 

ESSENTIAL READINGS: 
  1. Maindonald, John and Braun, John, "Data Analysis and Graphics Using R", Cambridge University Press, 2007.
  2. Gardener, Mark, "Beginning R: The Statistical Programming Language", Wiley India Pvt. Ltd., 2015.
  3. Srinivasa, K.G., Siddesh, G.M. and Shetty, "Statistical Programming in R", Oxford University Press, 2017.
  4. Bajpai, Naval, "Business Statistics", Pearson.
  5. Menard, S., "Applied Logistic Regression Analysis", Sage, Thousand Oaks, CA, 2002.

 

Academic Year: