In the first section of our Data Science bootcamp, you’ll get a comprehensive overview of the field. We chart the common career paths, cover the essential technical and practical skills, and showcase real-world applications across industries. We conclude with a detailed look at the Data Science interview process and content, equipping you with strategies to confidently tackle interviews in Statistics, Machine Learning, A/B Testing, Data Analysis, NLP, and Programming. By the end of this section, you will know exactly what you need to learn and practice to become a job-ready Data Scientist.

- Data Science Overview
- Data Science Career Path
- Must-have Skills Required
- Data Science Usage
- Data Science Interview Process
- Statistics Interviews
- Machine Learning Interviews
- A/B Testing Interviews
- Data Analysis Interviews
- NLP Interviews
- Programming Interviews

Overview of Data Science

Data Science Skills

Data Science Interviews

In the second section of our Data Science bootcamp, we dive into essential statistical concepts. Starting with Random Variables, we cover core measures like Mean, Variance, Standard Deviation, and explore the relationship between variables using Covariance and Correlation.

We demystify Probability Distribution Functions and Conditional Probability, including an introduction to Bayes’ Theorem. We then introduce Econometrics, Causal Analysis, Hypothesis Testing, and Statistical Significance.

We conclude with a variety of basic to advanced Statistical Tests and Inferential Statistics, cementing your understanding of the Central Limit Theorem and the Law of Large Numbers.

This section will reinforce your statistical foundation, equipping you with the statistical skills to analyze, model and interpret complex data.

- Random Variables
- Sample Space
- Probability
- Mean
- Variance
- Standard Deviation
- Covariance
- Correlation
- Probability Density Functions (PDFs)
- Conditional Probability
- Bayes’ Theorem
- Linear Regression
- Ordinary Least Squares (OLS)
- Hypothesis Testing
- Significance Level
- P-Values
- Type I & II Errors
- Confidence Intervals
- Statistical Tests
- Inferential Statistics
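
To give a flavor of the measures listed above, here is a minimal sketch that computes the mean, sample variance, sample covariance, and correlation by hand. The data (`hours_studied`, `exam_score`) and the function names are made up for illustration and are not taken from the course material.

```python
import math


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def sample_covariance(xs, ys):
    """Unbiased sample covariance between two equally long lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return sample_covariance(xs, ys) / math.sqrt(
        sample_variance(xs) * sample_variance(ys)
    )


# Hypothetical data: hours studied and the resulting exam score.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 55, 61, 64, 68]

avg_score = mean(exam_score)
std_score = math.sqrt(sample_variance(exam_score))
cov = sample_covariance(hours_studied, exam_score)
corr = correlation(hours_studied, exam_score)

print(avg_score, std_score, cov, corr)
```

Covariance depends on the units of both variables, while correlation is rescaled to lie between -1 and 1, which is why the two are usually reported together.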

Statistics for Data Analysis

Statistics for Machine Learning

Statistics for A/B Testing

Statistics for Causal Analysis
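
As a small worked example of Conditional Probability and Bayes’ Theorem from this section, the sketch below computes the probability of having a condition given a positive test. The numbers (1% prevalence, 95% sensitivity, 5% false positive rate) and the function name `posterior` are illustrative assumptions only.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) using Bayes' Theorem.

    prior:               P(condition)
    sensitivity:         P(positive | condition)
    false_positive_rate: P(positive | no condition)
    """
    # Total probability of a positive test (the "evidence").
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence


# Hypothetical screening test for a rare condition.
p_condition_given_positive = posterior(
    prior=0.01, sensitivity=0.95, false_positive_rate=0.05
)
print(round(p_condition_given_positive, 3))
```

Even with an accurate test, the posterior stays well below 50% here because the condition is rare, which is a classic interview talking point.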

In the ‘Fundamentals to Machine Learning’ section of our bootcamp, you’ll start by understanding the essential elements of machine learning, including a deep dive into supervised and unsupervised learning.

We guide you on how to strategically select the best machine learning model for your data science project and meticulously walk you through the entire process of training an ML model.

We tackle essential concepts such as the Bias-Variance Trade-off, Overfitting, and Regularization. You’ll delve into the intricacies of both linear and non-linear modeling using a wide variety of popular classification and regression algorithms.

Additionally, we cover an extensive list of clustering algorithms to help you handle unlabeled data. We also shed light on Dimensionality Reduction, Feature Selection, Resampling Techniques, and Optimization Techniques.

By the end of this section, you will be well-versed in implementing, evaluating, and improving various Machine Learning models in real-world scenarios.

- Supervised vs Unsupervised
- RSS, MSE, RMSE, Gini Index, Entropy
- Linear Regression
- Logistic Regression
- LDA
- KNN
- Decision Trees
- Bagging
- Random Forest
- AdaBoost
- GBM
- XGBoost
- K-Means
- Hierarchical Clustering
- DBSCAN
- PCA
- Cross Validation
- Bootstrapping
- Grid Search
- GD
- SGD
- Adam Optimizer
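
To make the optimization topics above more concrete, here is a minimal sketch of batch Gradient Descent fitting a one-feature Linear Regression by minimizing MSE. The toy data, the learning rate, and the number of steps are assumptions chosen so that the example converges; they are not recommendations for real projects.

```python
def mse(xs, ys, slope, intercept):
    """Mean squared error of the line y = slope * x + intercept."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_gradient_descent(xs, ys, learning_rate=0.05, n_steps=5000):
    """Fit slope and intercept with full-batch gradient descent on MSE."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        # Prediction errors under the current parameters.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of MSE with respect to each parameter.
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        # Step against the gradient.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Toy data generated from y = 2x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_gradient_descent(xs, ys)
final_mse = mse(xs, ys, slope, intercept)
print(slope, intercept, final_mse)
```

SGD follows the same update rule but estimates the gradient from one observation (or a small batch) per step instead of the full dataset.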

Popular Machine Learning Algorithms

ML Model Training

ML Model Evaluation

ML Model Optimization

In this industry-level training section, we provide a complete guide to A/B testing, discussing its definition, uses, and the process involved. We go in-depth into business and statistical hypotheses and primary metrics.

Next, we focus on designing an A/B test, where you will learn about power analysis, minimum sample size calculation and test duration, along with an understanding of novelty and maturation effects.

When it comes to running the A/B test, we provide guidance on key considerations to ensure its success. The part on result analysis helps you understand how to choose the right statistical test for your A/B test, how to calculate and interpret p-values, and how to assess both statistical and practical significance.

Lastly, we shine a light on common pitfalls in A/B testing, and how to avoid these pitfalls to ensure the reliability of your A/B tests.

- Primary Metrics
- Business Hypothesis
- Statistical Hypothesis
- Statistical Tests
- Conversion Rate
- Click Through Rate
- A/B Test Design
- Power Analysis
- Minimum Detectable Effect
- Significance Levels
- Minimum Sample Size
- Test Duration
- Statistical Significance
- Practical Significance
- Common Pitfalls
- Novelty Effects
- Maturation Effects

Complete Guide to A/B Testing

A/B Test Design

Power Analysis

A/B Test Results Analysis
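
For the results analysis step, here is a minimal sketch of a two-sided two-proportion z-test, one of several tests that can fit a conversion-rate experiment. The visitor counts, the 1-point practical threshold, and the function name are assumptions made up for this example.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Return (absolute lift, z statistic, two-sided p-value) for B versus A."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    standard_error = math.sqrt(
        pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b)
    )
    z = (rate_b - rate_a) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return rate_b - rate_a, z, p_value


# Hypothetical results: control converts 400/4000, treatment 480/4000.
lift, z, p_value = two_proportion_z_test(400, 4000, 480, 4000)

statistically_significant = p_value < 0.05  # significance level alpha = 5%
practically_significant = lift >= 0.01      # assumed business threshold

print(lift, z, p_value, statistically_significant, practically_significant)
```

Keeping the two checks separate makes it explicit that a result can be statistically significant yet too small to matter for the business.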

The “Introduction to Natural Language Processing” section begins with an overview of text preprocessing in NLP, highlighting the process and examples of cleaning text step-by-step.

We examine basic NLP techniques such as tokenization, bag-of-words, word embeddings, and semantic analysis.

We also cover Term Frequency-Inverse Document Frequency (Tf-Idf), explaining its definition, idea, and the step-by-step process for calculating Term Frequency (Tf) and Inverse Document Frequency (Idf), along with examples.

Lastly, we leap into the future with the latest innovations in Natural Language Processing (NLP), exploring transformer models like BERT and GPT-3. Comparisons between these models are also highlighted.

- Text Preprocessing
- Tokenization
- Bag-of-Words Representation
- Word Embeddings
- Semantic Analysis
- Tf-Idf
- Cutting-Edge NLP Developments
- Transformers
- BERT
- GPT-3
- ChatGPT
- Machine Learning
- Artificial Intelligence
- Information Retrieval
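
The Tf-Idf calculation described in this section can be sketched in a few lines. This version uses one common variant (term count divided by document length, and the natural log of total documents over document frequency); libraries often apply smoothing or normalization, so their numbers will differ. The three documents are made up.

```python
import math
from collections import Counter

# Hypothetical mini-corpus.
documents = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]

# Step 1: tokenize (lowercase and split on whitespace).
tokenized = [doc.lower().split() for doc in documents]


def term_frequency(tokens):
    """Step 2: share of the document taken up by each term."""
    counts = Counter(tokens)
    return {term: count / len(tokens) for term, count in counts.items()}


def inverse_document_frequency(tokenized_docs):
    """Step 3: log of (number of documents / documents containing the term)."""
    n_docs = len(tokenized_docs)
    document_frequency = Counter(
        term for tokens in tokenized_docs for term in set(tokens)
    )
    return {term: math.log(n_docs / df) for term, df in document_frequency.items()}


idf = inverse_document_frequency(tokenized)

# Step 4: multiply Tf by Idf for every term in every document.
tfidf = [
    {term: tf * idf[term] for term, tf in term_frequency(tokens).items()}
    for tokens in tokenized
]

print(tfidf[0])
```

In the first document, "the" appears twice but also occurs in another document, so its score ends up lower than that of the rarer word "cat".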

Text Preprocessing Guide

Basic NLP Techniques

Recent Developments in NLP, LLMs, and AI

This industry-level section starts with best coding practices and the use of the PyCharm environment. It introduces data types, variables, data structures such as lists, dictionaries, and matrices, and fundamental constructs such as for-loops and if-else statements.

The section also explores essential Python libraries for data science and demonstrates data loading, exploration, preprocessing, and random generation.

We further delve into data filtering, sorting, and grouping, along with methods for calculating descriptive statistics.

This includes handling tasks related to merging datasets, creating User Defined Functions (UDFs), text cleaning for NLP, and a range of data visualization techniques.

Finally, we examine various data sampling methods, and we provide a comprehensive and step-by-step walkthrough of A/B Test results analysis in Python.

- Best Coding Practices
- PyCharm IDE
- Data Types
- Data Structures
- For-loops, If-Else Statements
- Python Libraries
- Data Loading
- Data Preprocessing
- Random Data Generation
- Data Aggregation
- Descriptive Statistics
- Merging Datasets
- User Defined Functions (UDFs)
- NLP Text Preparation
- Data Analysis
- Data Visualization
- Data Sampling
- A/B Test Analysis
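
As a small taste of the filtering, sorting, grouping, and descriptive statistics covered here, the sketch below uses only the Python standard library so that it runs without extra installs. The `orders` records are invented for illustration; a data analysis library such as pandas offers the same operations on tabular data.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical order records.
orders = [
    {"customer": "A", "country": "US", "amount": 120.0},
    {"customer": "B", "country": "DE", "amount": 60.0},
    {"customer": "C", "country": "US", "amount": 80.0},
    {"customer": "D", "country": "DE", "amount": 30.0},
    {"customer": "E", "country": "US", "amount": 40.0},
]

# Filtering: keep orders of at least 50.
large_orders = [order for order in orders if order["amount"] >= 50]

# Sorting: largest order first.
sorted_orders = sorted(orders, key=lambda order: order["amount"], reverse=True)

# Grouping: collect order amounts per country.
amounts_by_country = defaultdict(list)
for order in orders:
    amounts_by_country[order["country"]].append(order["amount"])

# Descriptive statistics per group.
summary = {
    country: {"count": len(amounts), "mean": mean(amounts), "median": median(amounts)}
    for country, amounts in amounts_by_country.items()
}

print(len(large_orders), sorted_orders[0]["customer"], summary)
```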

Python Configuration

Data Analysis in Python

Data Visualization in Python

Text Preparation in Python

A/B Test Results Analysis in Python

**WHAT MAKES A PLAYLIST SUCCESSFUL**

Case study that uses Exploratory Data Analysis (EDA) to identify features of successful music playlists and correlate them with success metrics. It then applies Econometrics, using Linear Regression for Causal Analysis, to identify the features that define a playlist’s success.

- Product Data Science
- Descriptive Statistics
- Success Metrics
- Feature Engineering
- Business Hypothesis
- Data Preparation
- Hypothesis Testing
- Exploratory Data Analysis (EDA)
- Correlation vs Causation
- Causal Analysis
- Linear Regression Interpretation

Predicting Salaries of Job Postings

Case Study that utilizes Machine Learning to estimate salaries based on job postings. It involves statistical analysis to identify key features and outliers in the data. Multiple Machine Learning models are trained and their performances are compared using Cross-Validation to select the best ML model.

- Machine Learning
- Predictive Analytics
- Regression
- Business Goal
- Descriptive Statistics
- Exploratory Data Analysis (EDA)
- ML Model Selection
- ML Model Training
- Linear Regression
- Bagging
- Random Forest
- Gradient Boosting Machine (GBM)
- Extreme Gradient Boosting (XGBoost)
- ML Model Comparison
- K-Fold Cross Validation
- Feature Importance
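
The model comparison step in this case study can be sketched with a hand-written K-Fold Cross Validation loop. To stay dependency-free, the example compares only two simple stand-in models (a mean baseline and a one-feature Linear Regression) on made-up salary data; the actual case study compares richer models such as Random Forest and XGBoost.

```python
from statistics import mean


def fit_mean_model(xs, ys):
    """Baseline: always predict the average training salary."""
    average = mean(ys)
    return lambda x: average


def fit_linear_model(xs, ys):
    """One-feature linear regression fitted with the OLS closed form."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def k_fold_rmse(fit, xs, ys, k=5):
    """Average RMSE over k folds; fold membership is index modulo k."""
    fold_scores = []
    for fold in range(k):
        test_idx = [i for i in range(len(xs)) if i % k == fold]
        train_idx = [i for i in range(len(xs)) if i % k != fold]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        squared_errors = [(model(xs[i]) - ys[i]) ** 2 for i in test_idx]
        fold_scores.append(mean(squared_errors) ** 0.5)
    return mean(fold_scores)


# Hypothetical data: years of experience and salary in thousands.
years_experience = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
salary_k = [42.0, 48.0, 51.0, 58.0, 61.0, 67.0, 70.0, 77.0, 79.0, 86.0]

scores = {
    "mean baseline": k_fold_rmse(fit_mean_model, years_experience, salary_k),
    "linear regression": k_fold_rmse(fit_linear_model, years_experience, salary_k),
}
best_model = min(scores, key=scores.get)

print(scores, best_model)
```

Because every model is scored on data it was not trained on, the comparison rewards generalization rather than memorization of the training set.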

Building Top-K Job Recommender System

Case Study that develops a top-K Job Recommender System using Natural Language Processing (NLP) and Machine Learning. It uses CountVectorizer to transform the text data and the KNN Algorithm to build a Collaborative Filtering algorithm that generates tailored job recommendations.

- Natural Language Processing (NLP)
- Unstructured Data
- Text Lemmatization
- NLTK
- Text Cleaning
- CountVectorizer
- Recommender Systems
- Collaborative Filtering
- Similarity Measures
- Machine Learning
- KNN Algorithm
- Artificial Intelligence (AI)

TBD
