Introduction to Data Science

Paper Code: MCA 324A
Credits: 04
Periods/week: 04
Max. Marks: 100.00
Course Objectives:

This course enables the students to

  1. Define the concepts of data science.
  2. Understand the concepts of big data in data science.
  3. Demonstrate the data science process.
  4. Differentiate between business intelligence and data science.
  5. Evaluate data using different statistical methods.
  6. Construct cases and new ideas where the knowledge of data science can be implemented.

Course Outcomes (COs):

Learning outcomes (at course level):

  CO136. Define basic concepts of Python programming.
  CO137. Describe basic Python file operations.
  CO138. Illustrate how to use OOP concepts in Python.
  CO139. Compare and analyze different packages used in Python.
  CO140. Evaluate, analyze and handle exceptions in Python programming.
  CO141. Create new ideas where the knowledge of Python can be implemented.

Learning and teaching strategies:

Approach in teaching: Interactive lectures, modeling, discussions, enquiry-based learning, student-centered approach, audio-visual aids

Learning activities for the students: Experiential learning, presentations, case-based learning, discussions, quizzes and assignments

Assessment strategies:

  • Assignments
  • Written tests in classroom
  • Classroom activity
  • Objective quiz
  • Semester end exam

12.00
Unit I: Introduction

What is Data Science, Need for Data Science, Components of Data Science, Big data, Facets of data: Structured data, Unstructured data, Natural Language, Machine-generated data, Graph-based or network data, Audio, image and video, Streaming data, The need for Business Analytics, Data Science Life Cycle, Applications of data science

12.00
Unit II: Introduction to Big Data

Classification of Digital Data, Big Data and its importance, Four Vs, Drivers for Big Data, Big Data analytics, Classification of Analytics, Top Challenges Facing Big Data, Responsibilities of data scientists, Big Data applications in healthcare, medicine, advertising

12.00
Unit III: Data Science Process

Overview of data science process, setting the research goal, Retrieving data, Cleansing, integrating and transforming data, Exploratory data analysis, Data Modeling, Presentation and automation, Types of Analytics: Descriptive analytics, Diagnostic analytics, Predictive analytics, Prescriptive analytics

12.00
Unit IV: Statistics

Basic terminologies, Population, Sample, Parameter, Estimate, Estimator, Sampling distribution, Standard Error, Properties of Good Estimator, Measures of Center, Measures of Spread, Probability, Normal Distribution, Binary Distribution, Hypothesis Testing, Chi-Square Test, ANOVA
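
As an illustration of how several Unit IV terms relate (sample, estimate, measure of spread, standard error), here is a minimal Python sketch. The sample values are made up for the example and the variable names are illustrative; it is not taken from the course material.

```python
import math
import statistics

# Hypothetical sample drawn from a larger population (values are made up).
sample = [12.0, 15.0, 11.0, 14.0, 13.0]

n = len(sample)
mean = statistics.mean(sample)          # point estimate of the population mean
spread = statistics.stdev(sample)       # sample standard deviation (n - 1 denominator)
standard_error = spread / math.sqrt(n)  # standard error of the sample mean

print(f"n={n}, mean={mean:.2f}, stdev={spread:.3f}, SE={standard_error:.3f}")
```

The standard error shrinks as the sample size grows, which is why larger samples tend to give more precise estimates of the population mean.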

12.00
Unit V: Data Science Tools and Algorithms

Basic data science languages: R, Python, Knowledge of Excel, SQL Database, Introduction to Weka, Regression Algorithms: How Regression Algorithms Work, Linear Regression, Logistic Regression, K-Nearest Neighbors Algorithm, K-Means Algorithm.
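
To give a sense of how a regression algorithm works, the following is a minimal sketch of simple linear regression fitted by ordinary least squares in plain Python. The function name `fit_line` and the data points are hypothetical, chosen only for illustration; in practice a library such as scikit-learn or R's `lm` would normally be used.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data that lies exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predicted y for a new x value of 5

print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={prediction:.2f}")
```

The fitted line minimizes the sum of squared vertical distances between the data points and the line, which is the idea that the other regression topics in this unit build on.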

ESSENTIAL READINGS: 
  • Samuel Burns, “Fundamentals of Data Science: Take the First Step to Become a Data Scientist”, Amazon KDP Printing and Publishing, First Edition, 2019.
  • Davy Cielen, Arno D. B. Meysman, Mohamed Ali, “Introducing Data Science”, Manning Publications, 2016.


 

REFERENCES: 
  • Cathy O’Neil and Rachel Schutt, “Doing Data Science: Straight Talk from the Frontline”, O’Reilly, 2014.