Statistical inference and prediction

Up to this point, we’ve been exploring datasets and learning about the variables and relationships in them. To take our analysis beyond datasets to knowledge about the world, we need statistical inference. Statistical inference enables us to learn about entire populations based on what we find in a sample, i.e. to make inferences from data. It is the first step towards asking “why?” questions of our data and understanding our world better. Prediction, on the other hand, lets us use previously observed data to forecast future observations, which can help us make better decisions or automate processes.

We will start by introducing hypothesis testing, the backbone of statistical inference, and then move on to linear and logistic regression, which can be used both for statistical inference and for prediction. Once you are familiar with these tools, you will be able to answer questions about populations based on a sample, examine relationships between continuous and categorical variables, and make predictions based on observed variables.

In this course we aim to introduce these concepts and show what is possible with these methods without going into too much detail. This will allow you to understand when to use which tool, but to be sure you are using a tool correctly you will have to go deeper into the theory behind it or consult an expert. You can think of it as learning about the function of a car, a train, or an airplane: you can know when to use which mode of transportation even if you don’t know how to operate each of them and may have to rely on experts to do that.