Big Data Analytics (Associate Analytics – II)

Unit I: Data Management

Data architecture is composed of the models, policies, rules, and standards that govern which data is collected and how it is stored, arranged, integrated, and put to use in data systems and in organizations. We usually export our data to the cloud for purposes such as safety, multi-user access, and real-time simultaneous analysis. By the end of this session, students will be able to:

1. Design Data Architecture

2. Understand various Data Sources

3. Export Data to Amazon S3


Unit II: Big Data Tools

Introduction to Big Data tools like Hadoop, Spark, Impala, etc.; the data ETL process; identifying gaps in the data and following up for decision making.

There are thousands of Big Data tools out there, all of them promising to save you time and money and to help you uncover never-before-seen business insights. By the end of this session, students will be able to:

1. Know the basics of Big Data Tools.

2. Understand gaps in data.


Unit III: Big Data Analytics

Big data analytics is the process of examining large datasets to uncover hidden patterns, unknown correlations, market trends, customer preferences, and other useful business information. By the end of this session, students will be able to:

1. Execute Descriptive analytics on Big Data tools.

2. Detect outliers and eliminate them.

3. Prepare data for analysis.
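
As an illustration of outcome 2 above, the following is a minimal sketch of outlier removal using the interquartile range (IQR) rule. The syllabus does not name a specific method, so the IQR rule, the function name remove_outliers_iqr, the multiplier k = 1.5, and the sample readings are all assumptions chosen for illustration.

```python
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR].

    k = 1.5 is a common convention, not a value fixed by the course.
    """
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical sample: nine plausible readings and one obvious outlier (95).
readings = [10, 12, 11, 13, 12, 11, 95, 12, 10, 13]
cleaned = remove_outliers_iqr(readings)
print(cleaned)
```

On large datasets the same rule would normally be applied with a Big Data tool rather than in plain Python; the logic stays the same.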


Unit IV: Machine Learning Algorithms

Machine learning is the subfield of computer science that "gives computers the ability to learn without being explicitly programmed". Machine learning is sometimes conflated with data mining, where the latter subfield focuses more on exploratory data analysis. By the end of this session, you will be able to:

1. Do Hypothesis Testing

2. Determine multiple analytical methodologies.

3. Train a model on 2/3 of the sample data.

4. Predict Sample.

5. Explore chosen algorithms for accuracy
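
Outcomes 3 to 5 above can be sketched as follows: hold out one third of the data, train on the remaining two thirds, predict the held-out samples, and measure accuracy. This is only a sketch under stated assumptions. The synthetic data, the helper names (train_test_split, fit_threshold, predict, accuracy), and the simple threshold classifier are hypothetical and stand in for whichever algorithms the course actually covers.

```python
import random


def train_test_split(rows, train_fraction=2 / 3, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_threshold(train):
    """'Train' a one-feature classifier: midpoint between the class means."""
    zeros = [x for x, label in train if label == 0]
    ones = [x for x, label in train if label == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def predict(threshold, x):
    """Predict class 1 when the feature exceeds the learned threshold."""
    return 1 if x > threshold else 0


def accuracy(threshold, rows):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(predict(threshold, x) == label for x, label in rows)
    return hits / len(rows)


# Synthetic, well-separated data (an assumption made for illustration only).
data = [(x, 0) for x in range(0, 15)] + [(x, 1) for x in range(100, 115)]

train, test = train_test_split(data)
threshold = fit_threshold(train)
print(len(train), len(test))
print(accuracy(threshold, test))
```

Evaluating on the held-out third rather than on the training data gives a fairer estimate of how the model behaves on samples it has not seen.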


Unit V: Data Visualization

Data visualization is the presentation of data in a pictorial or graphical format. It enables decision makers to see analytics presented visually, so they can grasp difficult concepts or identify new patterns. With interactive visualization, you can take the concept a step further by using technology to drill down into charts and graphs for more detail, interactively changing what data you see and how it’s processed.

By the end of this session, you will be able to:

1. Prepare Data for visualization.

2. Draw insights out of visualization tools


Text Books & References

Text Books:

T1: Student’s Handbook for Associate Analytics-2.

Reference Books:

T2: Introduction to Data Mining, Tan, Steinbach and Kumar, Addison Wesley, 2006

T3: Data Mining and Analysis: Fundamental Concepts and Algorithms, M. Zaki and W. Meira (the authors have kindly made an online version available): http://www.dataminingbook.info/uploads/book.pdf

T4: Mining of Massive Datasets, Jure Leskovec (Stanford Univ.), Anand Rajaraman (Milliway Labs), Jeffrey D. Ullman (Stanford Univ.)

(http://www.vistrails.org/index.php/Course:_Big_Data_Analysis)