1. Installing Julia + Juno IDE, as well as useful packages

Perhaps the greatest obstacle to using Julia in the past has been the absence of an easy-to-install IDE. There used to be an IDE called Julia Studio which was as easy to use as the popular RStudio for R. Back then, you could install and run Julia + Julia Studio in 5mins, compared to the hours it could take to install Python and its basic packages and IDE. When Julia version 0.3.X was released, Julia Studio no longer worked, and I recommended the IJulia Notebook, which requires the installation of Python and IPython just to use Julia, so any argument that Julia is more convenient to install than Python was lost.

Now, with Julia version 0.4.X, Juno has provided an excellent IDE that comes pre-bundled with Julia for convenience, and you can install Julia + Juno IDE in 5mins. Here are some instructions to help you through the installation process:

After the brief download (the entire Julia language + Juno IDE is less than 1GB), open the file and click through the installation instructions.

Open Juno, and try to run a very simple line of code. For example, type 2+2, highlight this text, right-click, and choose the option Evaluate. A bubble should display 4 next to the line of code.

Trouble-shooting: On my Mac running OS X Mavericks, 2+2 failed and an unhelpful error was produced. After some searching, I found that the solution was to install the Jewel package. To install Jewel from within Juno, just type Pkg.add(“Jewel”), highlight this text, and Evaluate. After this, 2+2 was successful.

You have successfully installed Julia + Juno IDE. Now, you will want to run the following codes to install some other packages used in the econometric exercises below:

2. Defining a structural econometric challenge

To motivate our application, we consider a very simple economic model, which I have taught previously in the mathematical economics course for undergraduates at the University of Chicago. Although the model is analytically simple, the econometrics become sufficiently complicated to warrant the Method of Simulated Moments, so this serves us well as a teachable case.

Let denote consumption and denote leisure. Consider an agent who wishes to maximize Cobb-Douglas utility over consumption and leisure, that is,

.

where is the relative preference for consumption. The budget constraint is given by,

,

where is the wage observed in the data, is other income that is not observed in the data, and is the tax rate.

The agent’s problem is to maximize subject to the budget constraint. We assume that non-labor income is uncorrelated with the wage offer, so that . Although this assumption is a bit unrealistic, as we expect high-wage agents to also tend to have higher non-labor income, it helps keep the example simple. The model is also a bit contrived in that we treat the tax rate as unobservable, but this only makes our job more difficult.

The goal of the econometrician is to identify the model parameters and from the data and the assumed structure. In particular, the econometrician is interested in the policy-relevant parameter , where,

,

and denotes the demand for consumption. is the marginal propensity for an agent with wage to consume in response to the tax rate. is the population average marginal propensity to consume in response to the tax rate. Of course, we can solve the model analytically to find that and , where is the average wage, but we will show that the numerical methods achieve the correct answer even when we cannot solve the model.

3. Data generation, management, and regression visualization

To generate data that follows the above model, we first solve analytically for the demand functions for consumption and leisure. In particular, they are,

Thus, we need only draw values of and , as well as choose parameter values for and , in order to generate the values of and that agents in this model would choose. We implement this in Julia as follows:

This code is relatively self-explanatory. Our parameter choices are , , , and . We draw the wage to have distribution , but this is arbitrary.

We combine the variables into a DataFrame, and export the data as a CSV file. In order to better understand the data, we also non-parametrically regress on , and plot the result with Gadfly. The Julia code is as follows:

4. Numerical simulation of optimal agent behavior under constraints

We now use constrained numerical optimization to generate optimal consumption and leisure data without analytically solving for the demand function. We begin by importing the data and the necessary packages:

This syntax is especially convenient, as it allows us to define vectors of parameters, each satisfying the natural inequality constraints. Next, we define the budget constraint, which also follows this convenient syntax:

Notice that we can optimize one objective function instead of optimizing objective functions because the individual constrained maximization problems are independent across individuals, so the maximum of the sum is the sum of the maxima. Finally, we can apply the solver to this model and extract optimal consumption and leisure as follows:

To make sure it worked, we compare the consumption extracted from this numerical approach to the consumption we generated previously using the true demand functions:

cor(c_opt,array(df[:consump])) 0.9999999998435865

Thus, we consumption values produced by the numerically optimizer’s approximation to the demand for consumption are almost identical to those produced by the true demand for consumption. Putting it all together, we create a function that can solve for optimal consumption and leisure given any particular values of , , and :

5. Parallelized estimation by the Method of Simulated Moments

We saw in the previous section that, for a given set of model parameters and and a given draw of for each , we have enough information to simulation and , for each . Denote these simulated values by and . With these, we can define the moments,

which is equal to zero under the model assumptions. A method of simulated moments (MSM) approach to estimate and is then,

where is a weighting matrix, which is only relevant when the number of moments is greater than the number of parameters, which is not true in our case, so can be ignored and the method of simulated moments simplifies to,

Assuming we know the distribution of , we can simply draw many values of for each , and average the moments together across all of the draws of . This is Monte Carlo numerical integration. In Julia, we can create this objective function with a random draw of as follows:

In order to estimate , we need to run sim_moments(params) many times and take the unweighted average across them to achieve the expectation across . Because each calculation is computer-intensive, it makes sense to compute the contribution of for each draw of on a different processor and then average across them.

In order to implement the parallelized method of simulated moments, the function hh_constrained_opt() and sim_moments() are stored in a file called est_msm_inner.jl. The following code defines the parallelized MSM and then minimizes the MSM objective using the optimize command set to use the Nelder-Mead algorithm from the Optim package:

Parallelization is performed by the @parallel macro, and the results are horizontally concatenated from the various processors by the hcat command. The key tuning parameter here is numReps, which is the number of draws of to use in the Monte Carlo numerical integration. Because this example is so simple, a small number of repetitions is sufficient, while a larger number would be needed if entered the model in a more complicated manner. The process is run as follows and requires 268 seconds to run on my Macbook Air:

Selection bias arises when a data sample is not a random draw from the population that it is intended to represent. This is especially problematic when the probability that a particular individual appears in the sample depends on variables that also affect the relationships we wish to study. Selection bias corrections based on models of economic behavior were pioneered by the economist James J. Heckman in his seminal 1979 paper.

For an example of selection bias, suppose we wish to study the effectiveness of a treatment (a new medicine for patients with a particular disease, a preschool curriculum for children facing particular disadvantages, etc.). A random sample is drawn from the population of interest, and the treatment is randomly assigned to a subset of this sample, with the remaining subset serving as the untreated (“control”) group. If the subsets followed instructions, then the resulting data would serve as a random draw from the data generating process that we wish to study.

However, suppose the treatment and control groups do not comply with their assignments. In particular, if only some of the treated benefit from treatment while the others are in fact harmed by treatment, then we might expect the harmed individuals to leave the study. If we accepted the resulting data as a random draw from the data generating process, it would appear that the treatment was much more successful than it actually was; an individual who benefits is more likely to be present in the data than one who does not benefit.

Conversely, if treatment were very beneficial, then some individuals in the control group may find a way to obtain treatment, possibly without our knowledge. The benefits received by the control group would make it appear that the treatment was less beneficial than it actually was; the receipt of treatment is no longer random.

In this tutorial, I present some parameterized examples of selection bias. Then, I present examples of parametric selection bias corrections, evaluating their effectiveness in recovering the data generating processes. Along the way, I demonstrate the use of the GLM package in Julia. A future tutorial demonstrates non-parametric selection bias corrections.

Example 1: Selection on a Normally-Distributed Unobservable

Suppose that we wish to study of the effect of the observable variable on . The data generating process is given by,

,

where and are independent in the population. Because of this independence condition, the ordinary least squares estimator would be unbiased if the data were drawn randomly from the population. However, suppose that the probability that individual were in the data set were a function of and . For example, suppose that,

if ,

and the probability is zero otherwise. This selection rule ensures that, among the individuals in the data (the in ), the covariance between and will be positive, even though the covariance is zero in the population. When covaries positively with , then the OLS estimate of is biased upward, i.e., the OLS estimator converges to a value that is greater than .

To see the problem, consider the following simulation of the above process in which and are drawn as independent standard normal random variables:

where, in Julia 0.3.0, the command array() replaces the old command matrix() in converting a DataFrame into a numerical Array. This simulation demonstrates severe selection bias associated with using the sample data to estimate the data generating process instead of the population data, as the true parameters, , are not recovered by the sample estimator.

Correction 1: Heckman (1979)

The key insight of Heckman (1979) is that the correlation between and can be represented as an omitted variable in the OLS moment condition,

,

Furthermore, using the conditional density of the standard Normally distributed ,

.

which is called the inverse Mills ratio, where and are the probability and cumulative density functions of the standard normal distribution. As a result, the moment condition that holds in the sample is,

Returning to our simulation, the inverse Mills ratio is added to the sample data as,

We see that the estimate for is now 1.056, which is close to the true value of 1, compared to the non-corrected estimate of 1.452 above. Similarly, the estimate for has improved from -0.859 to -0.167, when the true value is 0. To see that the Heckman (1979) correction is consistent, we can increase the sample size to , which yields the estimates,

Note that this analysis generalizes to the case in which contains variables and the selection rule is,

,

which is the case considered by Heckman (1979). The only difference is that the coefficients must first be estimated by regressing an indicator for on , then using the fitted equation within the inverse Mills ratio. This requires that we observe for . Probit regression is covered in a slightly different context below.

As a matter of terminology, the process of estimating is called the “first stage”, and estimating conditional on the estimates of is called the “second stage”. When the coefficient on the inverse Mills ratio is positive, it is said that “positive selection” has occurred, with “negative selection” otherwise. Positive selection means that, without the correction, the estimate of would have been upward-biased, while negative selection results in a downward-biased estimate. Finally, because the selection rule is driven by an unobservable variables , this is a case of “selection on unobservables”. In the next section we consider a case of “selection on observables”.

Example 2: Probit Selection on Observables

Suppose that we wish to know the mean and variance of in the population. However, our sample of suffers from selection bias. In particular, there is some such that the probability of observing depends on according to,

,

where is some function with range . Notice that, if and were independent, then the resulting sample distribution of would be a random draw from the population (marginal distribution) of . Instead, we suppose . For example,

In this simulated population, the estimated mean and variance of are -0.022 and 1.002, and the covariance between and is 0.501. Now, suppose the probability that is observed is a probit regression of ,

,

where is the CDF of the standard normal distribution. Letting indicate that , we can generate the sample selection rule as,

The sample data has missing values in place of if . The estimated mean and variance of in the sample data are 0.275 (which is too large) and 0.862 (which is too small).

Correction 2: Inverse Probability Weighting

The reason for the biased estimates of the mean and variance of in Example 2 is sample selection on the observable . In particular, certain values of are over-represented due to their relationship with . Inverse probability weighting is a way to correct for the over-representation of certain types of individuals, where the “type” is captured by the probability of being included in the sample.

In the above simulation, conditional on appearing in the population, the probability that an individual of type is included in the sample is 0.841. By contrast, the probability that an individual of type is included in the sample is 0.5, so type is over-represented by a factor of 0.841/0.5 = 1.682. If we could reduce the impact that type has in the computation of the mean and variance of by a factor of 1.682, we would alter the balance of types in the sample to match the balance of types in the population. Inverse probability weighting generalizes this logic by weighting each individual’s impact by the inverse of the probability that this individual appears in the sample.

Before we can make the correct, we must first estimate the probability of sample inclusion. This can be done by fitting the probit regression above by least-squares. For this, we use the GLM package in Julia, which can be installed the usual way with the command Pkg.add(“GLM”).

which are very close to the population values of -0.022319 and 1.00195. The logic here extends to the case of multivariate , as more coefficients are added to the Probit regression. The logic also extends to other functional forms of , for example, switching from Probit to Logit is achieved by replacing the ProbitLink() with LogitLink() in the glm() estimation above.

Example 3: Generalized Roy Model

For the final example of this tutorial, we consider a model which allows for rich, realistic economic behavior. In words, the Generalized Roy Model is the economic representation of a world in which each individual must choose between two options, where each option has its own benefits, and one of the options costs more than the other. In math notation, the first alternative, denoted , relates the outcomes to the individual’s observable characteristics, , by,

.

Similarly, the second alternative, denoted , relates to , by,

.

The value of that appears in our sample is thus given by,

.

Finally, the value of is chosen by individual according to,

if ,

where is the cost of choosing the alternative and is given by,

,

where contains additional characteristics of that are not included in .

We assume that the data only contains ; it does not contain the variables or the functions . Assuming that the three functions follow the linear form and that the unobservables are independent and Normally distributed, we can simulate the data generating process as,

In this simulation, about 38% of individuals choose the alternative . About 10% of individuals choose even though they receive greater benefits under due to the high cost associated with .

Solution 3: Heckman Correction for Generalized Roy Model

The identification of this model is attributable to Heckman and Honore (1990). Estimation proceeds in steps. The first step is to notice that the left- and right-hand terms in the following moment equation motivate a Probit regression:

,

where is the negative of the total error term arising in the equation that determines above, , and,

,

In the simulation above, . We can estimate from the Probit regression of on and .

In summary, we can consistently estimate the benefits associated with each of two alternative choices, even though we only observe each individual in one of the alternatives, subject to heavy selection bias, by extending the logic introduced by Heckman (1979).

The Julia developers have officially released version 0.3.0, and it is a good idea to upgrade. The key packages in Julia that we use for economics — especially DataFrames — are being rapidly developed for 0.3.0, so you will need to upgrade to take advantage of new and improved features.

It’s relatively easy to upgrade Julia. However, it appears that Julia Studio does not and will not support 0.3.0, so the bigger problem users will face is abandoning Julia Studio in favor of another environment for writing and testing codes.

There are a few candidates from which you may choose to replace Julia Studio. My recommendation is IJulia. IJulia is somewhat easy to install, easy to use, and excellent for working with graphics. A drawback of using IJulia is that it loads in your browser and, at least in my experience, it often crashes the browser (I am using Safari on a Mac), so you will need to save your work frequently.

The greater drawback is that IJulia is much more difficult to install than Julia Studio, as it requires the installation of IPython. It used to be true that a complete novice could install and use both Julia and a coding environment with a few clicks. Now, it is more involved.

How to Install Julia 0.3.0 and IJulia

Here is a (somewhat) quick guide to upgrading Julia and replacing Julia Studio with IJulia:

Delete your existing Julia and Julia Studio. On a Mac, these are located within the Applications folder. These can only cause problems by leaving them on your computer. You may need to check for hidden files relating to Julia and delete those as well (this is definitely true on a Mac, as the hidden ~.julia must be found and deleted). You can find instructions specific to your operating system here under Platform Specific Instructions.

Go here and download Julia version 0.3.0. Once it finishes downloading, open the file and allow it to install. On a Mac, you can find Julia-0.3.0 in Applications once it has finished installing. Open Julia by double-clicking the application. It will open a command prompt/terminal that indicates you are using Julia version 0.3.0. You have successfully upgraded Julia.

To install IJulia, first you will need to install IPython here. If you are lucky, pip install works for you and you can install IPython with one command. If not, you should consider installing a Python distribution that includes IPython; the easiest option may be to install Anaconda, although this gives you much more than you need.

If you have made it this far, the rest is easy. Simply open Julia, use the Julia commands Pkg.add(“IJulia”), then Pkg.build(“IJulia”), then Pkg.update().

You are finished. To open IJulia, you can either use the commands using IJulia then notebook() within Julia, or within the command prompt/terminal use the command ipython notebook –profile julia.

Going Forward

Compare these moderately complicated instructions to my previous promise that anyone could have Julia and a Julia coding environment ready to use in under 5 minutes. 5 minute installation was one of my favorite aspects of Julia, and this is an unfortunate casualty of the upgrade to 0.3.0.

At the moment, it is easier to install R and R Studio than to install Julia and IJulia, unless you already have IPython on your computer. Hopefully, the Julia developer community, which is actively engaged and dedicated to the performance of this language, create a new one-click approach to installing both Julia and a Julia coding environment, which used to be available from Julia Studio. In the mean time, IJulia appears to be the best option for new users.

* The script to reproduce the results of this tutorial in Julia is located here.

The Kalman Filter is notoriously difficult to estimate. I think this can be attributed to the following issues:

It necessitates a large number of parameters that are difficult to manage.

It requires the sequential inversion of matrices, so we must be careful to keep the matrices invertible and monitor numerical error.

Most papers and Internet resources about the Kalman Filter assume that we have time-series rather than panel data, so they do not show how to adapt the parameters for panel data estimation

Similarly, this literature makes “macro” assumptions like initializing at the steady state rather than the “micro” approach of estimating the initial state.

Professional software is not well-suited to parameter estimation (more on this below).

In this tutorial, I will show how to write your own program that estimates the MLE of the most basic type of panel data Kalman Filter: time-invariant parameters and dedicated measures. It will explicitly impose assumptions to ensure identification, so that the estimates returned are uniquely determined from the data, unlike what would happen if we maximized the likelihood with the unrestricted Kalman Filter. In Part 2 of the tutorial, I generalize the estimation to permit a time-varying parameter space as well as time-varying parameter values, show how to identify the initial state, permit non-dedicated measures, and show how to “anchor” the model so that the units of the state space are determined to scale.

for time periods . We assume that , referred to as “the measures” or “the data”, is the only observed term in this entire system. The first equation determines the evolution of the unobserved state, , and we will refer to it as the “transition equation”. The second equation determines the relationship between the state and measures or signals of the state, , and we will refer to it as the “measurement equation”. The shocks, and , are assumed to be independent across , across , and even across components so that and are diagonal.

For our purposes, the parameters of interest are typically , although we generally need to identify as well so that we can separate them from . It is no trouble to add other observables to these equations, e.g., in optimal control theory, there is usually an observed “input”, call it , on the right-hand side the transition equation, so that the equation takes the form,

,

but this is a detail that could be added later with little difficulty, so we omit it until Part 2 of this tutorial. Also, we will assume for now that the initial distribution of the state is characterized by and , but relax this assumption in Part 2.

Failure of Identification and Necessary Assumptions

Before simulating the DGP, notice that it is indifferent to the scale or even the sign of . In particular, consider dividing by a constant , and multiplying by . Then,

,

so cancels out and we are left with the same expression.

Since our only information about comes from , this means that we will never be able to know and apart from one another, that is, we will never know the scale of . Similarly, we cannot distinguish and . By choosing , we see that neither the sign nor the magnitude of is determined by knowledge of . Let me emphasize this point: we can change the sign of every entry in as we wish and get the same data. Since is one of the parameters of interest — in fact, our desire to learn is often the motivation for using the Kalman Filter — this is a major issue. Since can take on a continuum of values, we see that the DGP results in the same for a continuum of parameters .

If we need to know the unique value of each parameter, we must make additional assumptions. The following assumptions, adapted from Cunha and Heckman (2008), are sufficient to ensure identification:

Each of the measures (i.e., the components of ) is a measure of only one of the factors (i.e., the components of ). This is the “dedicated measures” assumption, and it is equivalent to requiring each row of to have exactly one entry that is not assumed to be zero.

There are at least 3 dedicated measures for each factor. This means that each column in must have at least three entries that are not assumed to be zero.

The first assumption can be relaxed a bit if there are more than 3 measures per factor, but we can never relax the assumption that there is at least one dedicated measure for one of the factors.

In practice, are these assumptions valid? Possibly, it depends on your faith in the measures available in your data. You must be able to confidently claim that certain measures in your data are measuring only one factor. The alternative is to abandon estimating the model. In other words, the Kalman Filter is not all-powerful — certain models simply cannot be estimated with certain data, and trying to force it to estimate a model that is not identified from your data will result in incorrect estimates.

Notice that professional Kalman Filter software does not allow one to impose these assumptions. This is one of the main motivations for this tutorial: at present, you must program the Kalman Filter yourself if you want valid estimates of the model parameters, and estimating the model parameters rather than projecting the state from given model parameters is the main purpose of the Kalman Filter in microeconomic applications.

Simulation of the Kalman Filter DGP

Before simulating the Kalman Filter DGP, we must first deal with the issue I mentioned at the beginning of this tutorial: the Kalman Filter DGP necessitates a large number of parameters that are difficult to manage. We need a function that can convert an array of parameters into the matrices , including exponentiating variance terms so that when we use an optimizer later, the optimizer can try out different parameters without error. The following function, unpackParams, is my solution to this problem:

where stateDim is the length of and obsDim is the number of measures on each component of . This structure assumes that all measures are dedicated to a single factor, and the same number of measures is available for each factor. The function can be modified as needed to account for other structures. Here is the output from using this function on a set of parameters when there are two components of with 3 measures on each:

Now that we understand the Kalman Filter model, we can consider the Kalman Filter itself. The Kalman Filter is the estimator of what I have been calling the Kalman Filter DGP. The aim of the Kalman Filter for panel data is to estimate the mean and variance of for each and each .

Estimation can be divided into two steps. Denoting as the set of all data available at or before time , the projection step is trivial to derive and given by:

The update step is:

where,

is the error from projecting at time , and,

is called the optimal Kalman gain and is derived as the solution to a least-squares problem.

The individual likelihood of the Kalman Filter is given by,

The initial conditions enter through . In this tutorial, we assume , where is the multivariate normal PDF. We permit individual-specific initial conditions and unknown parameters in Part 2 of the tutorial. Furthermore, .

Finally, since individuals are assumed to be independent,

Now that we have a simple expression for the likelihood, we need only convert it into a Julia code.

Estimation of the Kalman Filter in Julia

To simplify computation, we write a single function that simultaneously predicts and updates (in that order) from any time period to the next time period . It begins with the “posterior” estimates of the expected state and variance of the state at time , that is, , uses these to predict the “prior” estimates at time , that is, , uses the priors to obtain the likelihood of the new observation at time , and finally uses the priors as well as the new observation at time to update to the posterior estimates at time . The function, incrementKF (“a time increment of the Kalman Filter”, is as follows:

We see that the code is quite simple, as it closely mimics the mathematical formulas above in syntax. The part of the code that does not mimic the usual mathematical syntax — preparing parameters — is handled elsewhere by unpackParams.

Now that we have the code that predicts and updates the Kalman Filter for any individual while outputting the contribution to the log-likelihood, we can iterate over time and collect all of the log-likelihood contributions from an individual’s time path:

This initializes the individual’s contribution to the likelihood using the assumed initial distribution of , then proceeds to predict and update the Kalman Filter over time until all log-likelihood contributions have been collected. The only confusing part is obsDict, which is a dictionary containing the column names of the observations, which are assumed to be stored in a DataFrame called data. This strategy isn’t necessary, but it avoids a lot of complications, and also makes our example better resemble a real-world application to a data set.

Finally, we loop through each individual, summing all of their lifetime log-likelihood contributions:

where I like to include a print statement so that I can monitor the convergence of the likelihood. Thus, we have the panel data Kalman Filter likelihood, under the appropriate assumptions to ensure identification. Finally, we use a wrapper to prepare it for optimization:

where I have included a print statement so that I can monitor the parameters that the optimizer is testing and know in advance if the estimates are diverging (if it were diverging, I would know my code has a bug and could stop the optimizer before it wastes too much time).

Everything here is the same as in the DGP code above, except that we are converting the data matrix into a DataFrame. The columns are named using the syntax time_number, so that three_4 means the 4th observation in time period 3. It creates obsDict such that, for example, data[obsDict[3]] would return the variables from three_1 to three_6 — all of the observations at time 3 in the correct order.

Now, we are ready to estimate the Kalman Filter as follows:

MLE = optimize(wrapLoglike,params0,method=:cg,ftol=1e-8)

When it is finished (and we have watched it converge using the print statements), we can view the parameters in the appropriate context by:

On a single processor of my personal laptop, optimization requires 446 seconds. In response to a reader comment, I also added this version of the code that achieves optimization in 393 seconds by removing all global variables.

For those in social sciences at the University of Chicago, the manager of our Acropolis cluster has completed my request to install Julia.

I have responded to all of the comments made on this website. Please provide additional comments to let me know how the tutorials are working for you, how I can make them better, and any tutorials you would like added.

I originally switched to Julia because Julia was estimating a complicated MLE about 100-times faster than Python. Yesterday, I demonstrated how to bootstrap the OLS MLE in parallel using Julia. I presented the amount of time required on my laptop to bootstrap 1,000 times: about 21.3 seconds on a single processor, 8.7 seconds using four processors.

For comparison, I translated this code into Python, using only NumPy and SciPy for the calculations, and Multiprocessing for the parallelization. The Python script is available here. For this relatively simple script, I find that Python requires 110.9 seconds on a single processor, 66.0 seconds on four processors.

Thus, Julia performed more than 5-times faster than Python on a single processor, and about 7.5-times faster on four processors.

I also considered increasing the number of bootstrap samples from 1,000 to 10,000. Julia requires 211 seconds on a single processor and 90 seconds on four processors. Python requires 1135 seconds on a single processor and 598 seconds on four processors. Thus, even as the size of the task became greater, Julia remained more than 5-times faster on one processor and around 7-times faster on four processors.

In this simple case, Julia is between 5- and 7.5-times faster than Python, depending on configuration.