Welcome!
Do you wish to know how to analyze and solve business and economic questions with data analysis tools? Then Econometrics by Erasmus University Rotterdam is the right course for you, as you learn how to translate data into models to make forecasts and to support decision making.
* What do I learn?
When you know econometrics, you are able to translate data into models to make forecasts and to support decision making in a wide variety of fields, ranging from macroeconomics to finance and marketing. Our course starts with introductory lectures on simple and multiple regression, followed by topics of special interest that deal with model specification, endogenous variables, binary choice data, and time series data. You learn these key topics in econometrics by watching the videos with in-video quizzes and by doing post-video training exercises.
* Do I need prior knowledge?
The course is suitable for (advanced undergraduate) students in economics, finance, business, engineering, and data analysis, as well as for those who work in these fields. The course requires some basics of matrices, probability, and statistics, which are reviewed in the Building Blocks module.
* What literature can I consult to support my studies?
You can follow the MOOC without studying additional sources. Further reading of the discussed topics (including the Building Blocks) is provided in the textbook that we wrote and on which the MOOC is based: Econometric Methods with Applications in Business and Economics, Oxford University Press. The connection between the MOOC modules and the book chapters is shown in the Course Guide – Further Information – How can I continue my studies.
* Will there be teaching assistants active to guide me through the course?
Staff and PhD students of our Econometric Institute will provide guidance in January and February of each year. In other periods, we provide only elementary guidance. We always advise you to connect with fellow learners of this course to discuss topics and exercises.
* How will I get a certificate?
To gain the certificate of this course, you are asked to complete six Test Exercises (one per module) and a Case Project. Further, you peer-review the work of three of your fellow learners of this MOOC. You gain the certificate if you pass all seven assignments.
Have a nice journey into the world of Econometrics!
The Econometrics team

Taught By

Philip Hans Franses, Prof. Dr.
Christiaan Heij, Dr.
Michel van der Wel, Dr.
Dennis Fok, Prof. Dr.
Richard Paap, Prof. Dr.
Dick van Dijk, Prof. Dr.
Erik Kole, Dr.
Francine Gresnigt, PhD candidate
Myrthe van Dieijen, PhD candidate

Transcript

[SOUND] Welcome! The previous lectures have shown that ordinary least squares is a great tool to uncover relationships in economics and business. In this lecture, I'm going to make you aware that this tool does not always work. There are circumstances where OLS breaks down. These circumstances relate to the difference between correlation and causality. Luckily, econometrics also has the solution. But before we discuss this, let's consider a motivating example. Suppose we want to explain the monthly number of departing flights at an airport using the number of travel insurances sold in the month before. What kind of relationship would you expect if you regress flights as the variable y on a constant, and insurances as the variable x? Most likely we will obtain a positive relationship. For example, like this. Now, what do these estimates really mean? I invite you to think about this by answering a test question. It is correct to use the estimates to make predictions. The positive coefficient indicates that many sold insurances go together with many flights. Note that this statement merely relies on a correlation. The positive correlation that we found can be used to make adequate predictions. It is incorrect to interpret the coefficient in a causal way. Selling additional insurances does not cause an increase in flights. There is another variable that causes both the insurances and the flights. This variable is simply the demand for travel. This example shows that we cannot always interpret least squares estimation results as causal effects. However, identifying causal effects is one of the main goals of econometrics. Ordinary least squares requires some assumptions for it to correctly estimate causal effects. One important assumption is that explanatory variables are exogenous. The violation of this assumption is called endogeneity. In this lecture, and the upcoming ones, you will learn to understand and recognize endogeneity.
You will get to know the consequences of this and you will learn how to come up with an alternative estimator. You will learn how this new estimator works and the conditions that are necessary for it to work properly. Finally, you will learn how to test these assumptions. Let us start by studying the source of endogeneity. The formal assumption that we violate is the assumption that explanatory variables X in the linear model are non-stochastic. So what does non-stochastic really imply? Literally speaking, non-stochastic means that if you were to obtain new data, only the y values would be different and the values for X would stay the same. This is like a controlled experiment where the researcher determines the experimental conditions coded in X. This assumption is crucial for the OLS estimator to be consistent. Consistent means that the estimator b converges to the true coefficient beta when the data set grows larger and larger. In economics, however, controlled experiments are rare. X variables are often the consequence of an economic process, or of individual decision making. In our example, the travelers together determine the number of insurances sold. From the researcher's point of view, the X variables should therefore be seen as stochastic. Once we allow X to be stochastic, we acknowledge that we would get different X values in a new data set. And if variables are stochastic, they can also be correlated with other variables, even with variables that are not included in the model! In the context of our example, the number of insurances will be correlated with the travel demand. Although travel demand is difficult to observe and not included in the model, it does influence the number of flights. In the model, travel demand is therefore part of the error term epsilon. As a consequence, the X variable, insurances sold, is correlated with epsilon. If an explanatory variable X is correlated with epsilon, we say that X is endogenous.
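The travel example can be checked numerically. The sketch below is a simulation under assumed values that are not from the lecture: the true causal effect of insurances on flights is set to zero, travel demand and all noise terms are standard normal, and the function name simulate_slope is hypothetical. It compares the OLS slope when insurances are unrelated to travel demand with the slope when demand drives both variables.

```python
import numpy as np

def simulate_slope(n, endogenous, seed=0):
    """OLS slope from regressing flights on a constant and insurances (simulated data)."""
    rng = np.random.default_rng(seed)
    demand = rng.normal(size=n)            # unobserved travel demand (assumed standard normal)
    insurances = rng.normal(size=n)
    if endogenous:
        insurances = insurances + demand   # demand also drives insurances sold
    beta = 0.0                             # assumed true causal effect of insurances on flights
    eps = demand + rng.normal(size=n)      # demand is omitted, so it sits in the error term
    flights = 1.0 + beta * insurances + eps
    X = np.column_stack([np.ones(n), insurances])
    b, *_ = np.linalg.lstsq(X, flights, rcond=None)
    return b[1]

if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, simulate_slope(n, endogenous=False), simulate_slope(n, endogenous=True))
```

Under these assumptions the exogenous slope should approach the true value 0 as n grows, while the endogenous slope should settle near cov(x, epsilon) / var(x) = 0.5. The estimator converges in both cases, but in the endogenous case it converges to the wrong value.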
Usually, this correlation is due to an omitted factor. We will later see that this leads to inconsistency of the OLS estimator. That is, OLS does not properly estimate beta. If X is uncorrelated with epsilon, X is called exogenous and OLS is consistent. Now let's consider three possible sources of endogeneity in more detail. Endogeneity is often due to an omitted variable. In our example, the omitted variable was travel demand. Let's consider this situation formally. Suppose that the true model for a variable y contains two blocks of explanatory variables, X1 and X2, and that in this true model, all assumptions are satisfied. However, when we estimate beta, we omit X2. That is, we regress y only on X1. The error term epsilon in this second model now contains the original error, eta, as well as the omitted effect of X2. From this relationship we can see that in the second model X1 will be correlated with epsilon if X1 and X2 are correlated and beta2 does not equal 0. The derivation at the bottom of this slide proves this. When thinking about whether certain variables in a model are endogenous, it is good to think about potential omitted variables. If you can think of an omitted variable that is related to both the included variables and the dependent variable, you will have endogeneity. Let's practice this a bit in a test question. Suppose we run a regression to explain a student's grade using only the number of attended lectures. What omitted variable leads to endogeneity here? Neither the difficulty of the exam nor the introduction of compulsory attendance will lead to endogeneity. The first variable cannot affect attendance, while the second does not affect the grade. The omission of the motivation of students does lead to endogeneity. Highly motivated students are likely to attend many lectures and obtain high grades. So a regression of grades on attendance will not show the true impact of attendance. It will partly capture the unobserved motivation as well.
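The slide derivation for the omitted-variable case is not part of this transcript. A sketch that is consistent with the description given in the lecture, assuming that the original error eta is uncorrelated with X1 (written here as E[X1'eta] = 0) and reading "correlated" as E[X1'X2] being non-zero, could look as follows:

```latex
\begin{align*}
y &= X_1\beta_1 + X_2\beta_2 + \eta && \text{(true model, with } E[X_1'\eta] = 0\text{)} \\
y &= X_1\beta_1 + \varepsilon, \quad \varepsilon = X_2\beta_2 + \eta && \text{(estimated model, } X_2 \text{ omitted)} \\
E[X_1'\varepsilon] &= E[X_1'X_2]\,\beta_2 + E[X_1'\eta] = E[X_1'X_2]\,\beta_2
\end{align*}
```

The last expression is zero when X1 and X2 are unrelated or when beta2 equals 0. When it is non-zero, X1 is endogenous in the estimated model.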
A second cause of endogeneity is strategic behavior. Consider a model in which you explain the demand for a product using only its price. If the salesperson strategically sets high prices when a high demand is expected, high demand will often go together with high prices! A simple regression may then yield a positive price coefficient. This is of course not the true impact of price. Price is endogenous in this regression as it correlates with the market information, which in turn determines demand. A third reason for endogeneity is measurement error. Suppose that we have a variable y, say, salary, that depends on a factor that is difficult to measure. For example, intelligence. Let's denote the intelligence by x*. We can obtain a noisy measurement of intelligence, for example through an IQ test. The test score is called x and is equal to the true intelligence plus the measurement error. In the training exercise, you will be asked to show that such measurement error leads to endogeneity in a model that explains y using the test score x. To summarize, endogeneity is a common and serious challenge in econometrics as OLS is not useful under endogeneity. In the next lectures, we will consider solutions and tests for endogeneity. Now I invite you to do the training exercise, to train yourself with the topics of this lecture. You can find this exercise on the website. This concludes this lecture.
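The measurement-error case can also be explored by simulation before working through the training exercise. The sketch below rests on assumptions that are not from the lecture: the true coefficient beta equals 2, the intercept equals 1, and the true factor x*, the measurement error, and the model error are independent standard normal draws. The function name measurement_error_demo is hypothetical.

```python
import numpy as np

def measurement_error_demo(n=200_000, beta=2.0, seed=1):
    """Return cov(x, eps) and the OLS slope when x is a noisy measurement of x*."""
    rng = np.random.default_rng(seed)
    x_star = rng.normal(size=n)        # true, unobserved factor (e.g. intelligence)
    v = rng.normal(size=n)             # measurement error, independent of x_star
    x = x_star + v                     # observed test score
    eta = rng.normal(size=n)           # error in the true model
    y = 1.0 + beta * x_star + eta      # y depends on x_star, not on x

    # Error term of the model written in terms of the observed x:
    # y = 1 + beta * x + eps, so eps = eta - beta * v.
    eps = y - 1.0 - beta * x
    cov_x_eps = np.cov(x, eps)[0, 1]
    slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # OLS slope of y on x
    return cov_x_eps, slope

if __name__ == "__main__":
    cov_x_eps, slope = measurement_error_demo()
    print("cov(x, eps):", cov_x_eps)
    print("OLS slope:  ", slope)
```

Under these assumptions the covariance between x and the error term should be close to -beta * var(v) = -2, and the OLS slope should be close to beta * var(x*) / (var(x*) + var(v)) = 1 rather than the true value 2.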
