Data Science Fundamentals
Probability, distributions, hypothesis testing, and more, explained for engineers, not academics.
10 posts · 83 min total read
Probability
Data Science. Probability
Knowing probability and its applications is important for working effectively on data science problems. This post will remind you what probability actually is.
Data Science. Bayes Theorem
Bayes' theorem is one of the most important rules of probability theory used in Data Science. It gives us a way to update our beliefs as new evidence arrives.
Data Science. Probability Distributions
There are many distributions, but here we will talk about the most common and widely used ones.
Measurement
Data Science. Measures
To assess and describe the distribution of characteristics, we need to know a couple of things: which values of these characteristics are typical for the distribution under study, and how typical they are.
Data Science. Correlation
Lemme tell you some stories about what correlation is and how it is used in Data Science. Btw, did you know that Nicolas Cage "causes" pool deaths?
Data Science. The Central Limit Theorem and Sampling
In this post, I'll explain what the CLT means, why sampling is important, the different types of sampling you might come across, and an alternative to sampling.
Analysis
Data Science. Demystifying Hypothesis Testing
Hypothesis testing is an essential procedure in statistics. A hypothesis test evaluates two mutually exclusive statements about a population to determine which statement is better supported by the sample data.
Data Science. Data Types
Identifying the type of variable you are working with is always the first step in data analysis, because it determines the most appropriate type of analysis.
Data Science. Descriptive and Inferential Statistics
Descriptive statistics will teach you the basic concepts used to describe a data sample.
Data Science. Exploratory Data Analysis
Data science relies on data exploration, a practice known as exploratory data analysis (EDA).