- The first concept is the scales of measurement (nominal, ordinal, interval and ratio). Understanding them is essential for deciding which analysis is appropriate for each type of scale.
- Degrees of Freedom - the number of values that are free to vary once any restrictions are applied
- Z-score - how many standard deviations a data point is from the mean
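- Central Limit Theorem - why the distribution of sample means approaches a normal distribution as the sample size grows
- Standard Deviation vs Standard Error - variability in the data versus variability of a sample statistic, two measures that are often confused
- Confidence Interval - useful for interpreting how precise an estimate is
- Confusion matrix: a useful tool for measuring the accuracy of a classification model.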
- Occam's Razor, Bias-Variance Tradeoff, No Free Lunch Theorem and The Curse of Dimensionality - to understand the limitations of machine learning
- Train-Test split and Cross-validation: for building an optimum model which neither underfits nor overfits the dataset.
- Components of Time Series (TCSI): this is the fundamental concept for time series analysis.
Data Science Simplified
Learning the Machine Learning, in a Human-friendly Way
Timeless Statistical Concepts Every Data Scientist Must Master - With Links to Visual Illustrations & Examples
Non-linear Relationships: When a 0 Pearson Correlation Coefficient Can Be Surprisingly Meaningful
We know that the Pearson correlation coefficient (r) ranges from -1 to +1. A Pearson correlation coefficient of zero means that there is no linear relationship between the variables.
Here the word linear is crucial. Why? Let's find out using an example where the Pearson correlation coefficient is 0.
Consider a case where Y = X².
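A minimal sketch of this case, assuming a small set of X values placed symmetrically around zero (the data and the helper name `pearson_r` are illustrative, not taken from the post). Y is fully determined by X, yet the linear correlation comes out as zero:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the sums of squared deviations.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Assumed illustrative data: X values symmetric around zero.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]  # Y depends on X exactly, but not linearly

r = pearson_r(xs, ys)
print(f"r = {r:.3f}")
```

The positive and negative halves of X cancel each other in the numerator, which is why r is zero here even though the relationship is perfect.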
Understanding Confidence Intervals with an Intuitive Example
The concept of confidence intervals (CI) is commonly used in data science. Hence, using an intuitive example, let us learn it with confidence!
Imagine you are waiting for the bus at a bus stop. Usually, the bus arrives at 9.30 am. But the arrival time varies.
Another person arrives at the bus stop to catch the same bus and asks you, "Based on your experience, what percentage of the time does the bus arrive between 9.25 am and 9.35 am?"
You think and answer, "90% of the time".
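The answer in this dialogue can be thought of as a simple count over past experience. The sketch below uses made-up arrival times (the list `arrival_offsets` and the helper `share_within` are hypothetical) and only illustrates the intuition of "an interval plus a percentage", not a formal confidence interval calculation:

```python
# Hypothetical arrival times, in minutes relative to the usual 9.30 am
# (negative = early, positive = late). These values are made up.
arrival_offsets = [-4, -3, -2, -1, 0, 0, 1, 2, 3, 8]

def share_within(offsets, low, high):
    """Fraction of observations that fall inside [low, high]."""
    inside = sum(1 for t in offsets if low <= t <= high)
    return inside / len(offsets)

# 9.25 am to 9.35 am corresponds to offsets of -5 to +5 minutes.
share = share_within(arrival_offsets, -5, 5)
print(f"{share:.0%} of arrivals fell between 9.25 am and 9.35 am")
```

Widening the window raises the percentage and narrowing it lowers the percentage, which is the same trade-off a confidence interval makes between width and confidence level.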
Standard Deviation vs Standard Error: Clearing up the Confusion with Visual Examples
Standard deviation and standard error are two statistical concepts that are often confused with each other. Though these two measures are related to variability in the data, they are different.
Standard deviation measures the variability in the dataset. The formula for standard deviation is given below.
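As a rough sketch of the difference, the snippet below computes both measures for a small made-up sample. It assumes the sample standard deviation (dividing by n-1) and the usual standard error of the mean, SD divided by the square root of n; the post's own formula may use a different convention.

```python
import math
import statistics

# Hypothetical sample; the values are made up for illustration.
sample = [4, 8, 6, 5, 3, 7, 9, 6]

# Standard deviation: spread of the individual values around their mean.
# statistics.stdev uses the sample formula (divides by n - 1).
sd = statistics.stdev(sample)

# Standard error of the mean: how much the sample mean itself would vary
# from sample to sample, estimated as SD / sqrt(n).
se = sd / math.sqrt(len(sample))

print(f"standard deviation = {sd:.3f}")
print(f"standard error     = {se:.3f}")
```

With more observations the standard deviation tends to settle around a fixed value, while the standard error keeps shrinking because of the division by the square root of n.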
Mastering Central Limit Theorem (CLT) with Intuitive Examples
To understand the Central Limit Theorem (CLT), let's use the example of rolling two dice, repeatedly (say 30 times). Then calculate the sample mean (mean of two dice values) and plot its distribution.
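The experiment can be simulated in a few lines. This is a sketch under assumptions: the function name `two_dice_means`, the fixed seed and the text histogram are illustrative choices, and a real plot would need a plotting library.

```python
import random
from collections import Counter

def two_dice_means(n_rolls, seed=0):
    """Roll two fair dice n_rolls times and return the mean of each pair."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [(rng.randint(1, 6) + rng.randint(1, 6)) / 2 for _ in range(n_rolls)]

means = two_dice_means(30)

# Text histogram: one '#' per occurrence of each mean value.
counts = Counter(means)
for value in sorted(counts):
    print(f"{value:>3}: {'#' * counts[value]}")
```

With only two dice the distribution of the mean is triangular, peaking at 3.5. Increasing `n_rolls` makes that shape clearer, and averaging more dice per roll moves it closer to a bell curve.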
Demystifying Degrees of Freedom with Visual Examples: A Beginner's Guide
The concept of degrees of freedom (df) is fundamental and commonly used in statistical analysis. In this blog post, let us thoroughly understand this concept with examples.
You have probably seen the degrees of freedom for a particular test given as, say, (n-1) or (n-2).
A) Without any restriction
Suppose I give you three boxes as shown below. You are free to fill all these three boxes with any values of your choice. Hence, in this case, the degrees of freedom are 3. In other words, df=n here.
The Chi-Square Test Explained with Examples: A Beginner's Guide
Imagine that, at a large gathering, people were given the option to take either one of two products for free. You want to test whether there is any relationship between gender and buying patterns.
When variables are independent
You take a random sample of 40 persons, in which there were 10 men and 10 women, and ask each of them what they chose: a pen or a pencil?
The cross-tabulated data is shown in the following table:
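To make the mechanics of the test concrete, here is a minimal sketch of the chi-square statistic for a 2x2 table. The counts below are hypothetical and are not the post's data; the function name `chi_square_statistic` is also an assumption made for illustration.

```python
def chi_square_statistic(table):
    """Chi-square statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# Hypothetical counts. Rows: men, women. Columns: pen, pencil.
independent = [[10, 10], [10, 10]]  # same split in both groups
dependent = [[15, 5], [5, 15]]      # the split differs by group

chi2_independent = chi_square_statistic(independent)
chi2_dependent = chi_square_statistic(dependent)
print(f"independent table: chi-square = {chi2_independent:.2f}")
print(f"dependent table:   chi-square = {chi2_dependent:.2f}")
```

A 2x2 table has (2-1) x (2-1) = 1 degree of freedom, for which the critical value at the 5% significance level is about 3.84. A statistic above that value suggests the two variables are related.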
Popular Posts
- ARIMA/SARIMA with Python: Understand with Real-life Example, Illustrations and Step-by-step Descriptions
- Demystifying Principal Component Analysis (PCA): A Beginner's Guide with Intuitive Examples & Illustrations
- Feature Selection: Filter method, Wrapper method and Embedded method
- Confusion Matrix, Accuracy, Precision, Recall, F score Explained with Intuitive Visual Examples
- Components of Time Series: A Beginner's Visual Guide
- Train-Test split and Cross-validation: Visual Illustrations & Examples
- Handling Missing Values in Python: Different Methods Explained with Visual Examples
- Time series Cross-validation and Forecasting Accuracy: Understand with Illustrations & Examples
- Scales of Measurement - Data types: Nominal, Ordinal, Interval and Ratio scale
- Support Vector Machines (SVM) Explained with Visual Illustrations