I am currently using the following three methods to solve differential equations:

- 4th-order Runge-Kutta method (RK4)
- Euler method
- SciPy's internal routines, i.e. `scipy.integrate.odeint` (a wrapper around ODEPACK)

I am aware that Euler's method is first order, with a total (global) error after N steps of O(h) (I think).

And that RK4 is 4th order (hence its name), i.e. has a global error of O(h^4). However, I am unsure of this, as I have also been told that when using this method there is no way to estimate errors.

Also, I was unaware of the method used by the internal 'integrator'. I have found a site that explains the algorithms behind it, but not their errors, hence why I am asking here.
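To make those orders concrete, here is a quick numerical check (my own sketch, not part of the question) that halving the step size cuts the Euler error by about 2x and the RK4 error by about 16x on a simple test problem:

```python
import math

def euler(f, y0, t0, t1, n):
    """Fixed-step forward Euler."""
    h = (t1 - t0) / n
    t, y = t0, y0
    for _ in range(n):
        y = y + h * f(t, y)
        t += h
    return y

def rk4(f, y0, t0, t1, n):
    """Fixed-step classic 4th-order Runge-Kutta."""
    h = (t1 - t0) / n
    t, y = t0, y0
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return y

# Test problem y' = y, y(0) = 1, whose exact solution at t = 1 is e.
f = lambda t, y: y
for n in (50, 100):
    e_err = abs(euler(f, 1.0, 0.0, 1.0, n) - math.e)
    r_err = abs(rk4(f, 1.0, 0.0, 1.0, n) - math.e)
    print(n, e_err, r_err)
# Doubling n (halving h) shrinks the Euler error ~2x and the RK4 error ~16x.
```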

$\begingroup$Yes, you got the order of error correct for the Euler and RK4. But order is essentially meaningless in this field because the constants matter and adaptivity is king, so I completely left this out of my post.$\endgroup$
– Chris Rackauckas, Jan 15 '18 at 15:27

First, let me describe the algorithm. A common family of multistep methods for non-stiff equations is the Adams-Moulton family. These are implicit; their explicit counterparts, the Adams-Bashforth methods, are generally too unstable to be useful on their own. To solve the implicit equation, a simple Picard (fixed-point) iteration is used (or, if you want to get fancy and make it a little better, Anderson acceleration, but LSODA uses Picard) since it's cheap. Using Picard iteration here shrinks the stability region (as discussed in Hairer II), but this is considered okay since the algorithm is for non-stiff problems. This is the Adams integrator. As I describe in a blog post comparing the different ODE suites developed over the years, pretty much all of the multistep implementations (except a few from Shampine) descend from the same code base starting from GEAR, so lsode, lsoda, vode, and CVODE (SUNDIALS) all have pretty much the same Adams implementation for non-stiff problems.
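As a concrete sketch of what one Adams-Moulton step with Picard iteration looks like, here is the 2nd-order member (the trapezoidal rule) on a scalar problem. This is my own minimal illustration; LSODA's actual implementation is variable-order and far more elaborate:

```python
import math

def am2_picard_step(f, t, y, h, tol=1e-12, maxiter=50):
    """One step of the 2nd-order Adams-Moulton method (the trapezoidal rule),
    solving the implicit equation by plain Picard (fixed-point) iteration."""
    fn = f(t, y)
    y_new = y + h * fn                    # explicit Euler predictor as a guess
    for _ in range(maxiter):
        y_next = y + 0.5 * h * (fn + f(t + h, y_new))
        if abs(y_next - y_new) < tol:     # fixed point reached
            return y_next
        y_new = y_next
    return y_new

# y' = -y, y(0) = 1. Picard converges here because |h/2 * df/dy| = 0.05 < 1;
# for a stiff problem that contraction condition forces tiny steps, which is
# exactly why the Adams/Picard combination is reserved for non-stiff problems.
f = lambda t, y: -y
y, t, h = 1.0, 0.0, 0.1
for _ in range(10):
    y = am2_picard_step(f, t, y, h)
    t += h
print(y, math.exp(-1.0))   # numerical vs. exact solution at t = 1
```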

For stiff problems, the go-to multistep methods are the Backward Differentiation Formulas (BDF). The Wikipedia page is quite good and shows how the stability degrades above order 5, so generally no one uses the 6th-order method. Every step, these have to solve an implicit equation, and they use quasi-Newton methods to do so because, unlike fixed-point iteration, Newton-type iterations do not shrink the theoretical stability region. To keep the costs down, it's quasi-Newton instead of full Newton: in many cases the Jacobian changes little between steps, so re-using the same Jacobian still lets the iteration converge to the correct implicit solution without having to factorize $(I - \Delta t J)$ again.
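A minimal sketch of a BDF2 step with a quasi-Newton iteration that reuses a frozen iteration matrix (function and variable names here are my own; real codes like LSODA add error control, order adaptation, and heuristics for when to refactorize):

```python
import numpy as np

def bdf2_step(f, M, t, h, y_nm1, y_n, tol=1e-10, maxiter=10):
    """One BDF2 step: y_{n+1} = (4 y_n - y_{n-1})/3 + (2h/3) f(t+h, y_{n+1}).
    The implicit equation is solved by quasi-Newton iteration with a frozen
    iteration matrix M ~ I - (2h/3) J, which a production solver reuses
    across steps for as long as the iteration keeps converging."""
    c = (4.0 * y_n - y_nm1) / 3.0
    y = y_n.copy()                                  # initial guess
    for _ in range(maxiter):
        resid = y - (2.0 * h / 3.0) * f(t + h, y) - c
        dy = np.linalg.solve(M, resid)              # no fresh Jacobian here
        y = y - dy
        if np.linalg.norm(dy) < tol:
            break
    return y

# Stiff linear test system y' = A y with eigenvalues -1 and -1000.
A = np.array([[-1.0, 0.0], [0.0, -1000.0]])
f = lambda t, y: A @ y
h = 0.01
M = np.eye(2) - (2.0 * h / 3.0) * A                 # built once, reused

y0 = np.array([1.0, 1.0])
y1 = np.linalg.solve(np.eye(2) - h * A, y0)         # backward Euler starter
t, y_nm1, y_n = h, y0, y1
for _ in range(100):
    y_nm1, y_n = y_n, bdf2_step(f, M, t, h, y_nm1, y_n)
    t += h
print(t, y_n)   # compare y_n[0] with exp(-t); the stiff component decays away
```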

Multistep methods require previous time points in order to calculate their next value (look at the formulas). Because of this, they need some other solution values to get started. A very simple way to handle this is to make the method adaptive-order: start with a 1st-order method (Euler for Adams, backward Euler for BDF), and after calculating one value move up to second order (based on some criteria), and so forth. As the order increases, the error per step decreases, and thus the stepsize can grow. Thus the standard non-stiff multistep solver is an adaptive-order 1-12 Adams method with adaptive stepsize, and the standard stiff multistep solver is an adaptive-order 1-5 BDF method with adaptive stepsize. I won't even describe how the choice of the next order or time step occurs here, because that gets quite complicated quite fast.
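The startup ramp can be sketched with explicit Adams-Bashforth coefficients for brevity (my own toy code; production Adams codes use the implicit Adams-Moulton forms and adapt both order and stepsize by error estimates):

```python
# Adams-Bashforth coefficients for orders 1-3 (order 1 is forward Euler).
AB_COEFFS = {1: [1.0],
             2: [1.5, -0.5],
             3: [23.0 / 12.0, -16.0 / 12.0, 5.0 / 12.0]}

def adams_variable_order(f, y0, t0, t1, n, max_order=3):
    """Fixed-step Adams integration that starts at order 1 (Euler) and
    raises the order as more past derivative values become available."""
    h = (t1 - t0) / n
    t, y = t0, y0
    history = []                      # stored f-values, most recent first
    for _ in range(n):
        history.insert(0, f(t, y))
        order = min(len(history), max_order)   # ramp: 1, 2, 3, 3, ...
        y = y + h * sum(c * fk for c, fk in zip(AB_COEFFS[order], history))
        history = history[:max_order - 1]      # keep only what the next step needs
        t += h
    return y

print(adams_variable_order(lambda t, y: y, 1.0, 0.0, 1.0, 100))  # approximates e
```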

Now, back in the 80s, people like Shampine and Petzold had the idea that these kinds of methods could be combined. Using ideas from Shampine, Petzold created LSODA, which is like LSODE except it has a way to adaptively switch from the Adams method to the BDF method and vice versa. Basically, it does eigenvalue estimates to gauge how stiff the equation currently is, and if it's too stiff (or no longer stiff) it switches behaviors.

So LSODA is both an Adams method and a BDF method, which changes its stepsize and its order according to internal error estimates, and in many cases it has to factorize Jacobians for BDF, but it doesn't factorize Jacobians every step if it realizes that the implicit equations will solve fast enough. It is completely and utterly impossible to ascribe any asymptotic algorithmic complexity to this!
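You can watch this switching behavior through SciPy's own LSODA interface: the Jacobian-evaluation counter `njev` stays at or near zero while the solver remains in the cheap Adams mode and climbs once it has to move to BDF. A small experiment on the Van der Pol oscillator (my own choice of test problem):

```python
from scipy.integrate import solve_ivp

# Van der Pol oscillator: non-stiff for small mu, very stiff for large mu.
def vdp(mu):
    return lambda t, y: [y[1], mu * (1 - y[0] ** 2) * y[1] - y[0]]

results = {}
for mu in (1.0, 1000.0):
    sol = solve_ivp(vdp(mu), (0.0, 10.0), [2.0, 0.0], method="LSODA",
                    rtol=1e-6, atol=1e-9)
    results[mu] = sol
    # nfev counts derivative calls; njev counts Jacobian evaluations, which
    # should be positive in the stiff case where LSODA switches to BDF.
    print(mu, sol.nfev, sol.njev)
```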

One tangent though:

as I have also been told that when using this method there is no way to estimate errors

No, ODE solvers do have internal ways to estimate the (local) error, and this is exactly what drives adaptive time stepping in methods like LSODA. Most Runge-Kutta codes use an embedded method, i.e. a separate lower-order method that is a different linear combination of the same stages. Because it shares the stages, it's almost free to compute. However, no embedded method for RK4 itself is in common use. There is one really good error estimator for RK4, but it's very rare in practice; in fact, it's essentially only used in state-dependent delay solvers, and Shampine describes it in his paper on ddesd. The DifferentialEquations.jl implementation has this error estimator implemented (mostly for its use in delay differential equations), so if you're curious, there is open-source code you can look at to see how it's done (including a more efficient version for mutable arrays). That's niche stuff though.
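The embedded-pair idea is easiest to see in the simplest case: Heun's 2nd-order method with forward Euler embedded. This is my own toy sketch, not the RK4 estimator mentioned above:

```python
def heun_euler_step(f, t, y, h):
    """One step of the simplest embedded pair: Heun's 2nd-order method with
    forward Euler embedded. Both answers are linear combinations of the SAME
    two stages, so the local error estimate comes essentially for free."""
    k1 = f(t, y)
    k2 = f(t + h, y + h * k1)
    y2 = y + 0.5 * h * (k1 + k2)       # 2nd-order solution (the one we keep)
    y1 = y + h * k1                    # 1st-order embedded solution
    return y2, abs(y2 - y1)            # step result, local error estimate

# Adaptive loop: accept a step when the estimate meets the tolerance, and
# rescale h with the standard controller h <- 0.9 h (tol/err)^(1/2).
f = lambda t, y: -2.0 * t * y          # y' = -2ty, y(0) = 1, exact exp(-t^2)
t, y, h, tol = 0.0, 1.0, 0.1, 1e-6
while t < 2.0:
    h = min(h, 2.0 - t)                # don't step past the end point
    y_new, err = heun_euler_step(f, t, y, h)
    if err <= tol:
        t, y = t + h, y_new            # accept; otherwise retry with smaller h
    h *= 0.9 * (tol / max(err, 1e-16)) ** 0.5
print(t, y)                            # y should be close to exp(-4)
```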

So okay, how do you in practice measure something like the "algorithmic complexity" of ODE solvers? The best and most common tool is the work-precision diagram. You take some test equations, solve them at very high accuracy to get a reference solution, then solve them again with a range of adaptivity tolerances and stepsizes to find the relation between runtime and error. That's crucial: time doesn't mean anything without its relation to error.
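A bare-bones version of that procedure, using SciPy's LSODA (the test problem and tolerance sweep here are my own choices):

```python
import time

import numpy as np
from scipy.integrate import solve_ivp

# Lotka-Volterra as a small non-stiff test problem.
f = lambda t, y: [1.5 * y[0] - y[0] * y[1], -3.0 * y[1] + y[0] * y[1]]
y0, t_span = [1.0, 1.0], (0.0, 10.0)

# Step 1: a high-accuracy reference solution.
ref = solve_ivp(f, t_span, y0, method="LSODA", rtol=1e-11, atol=1e-11)
y_ref = ref.y[:, -1]

# Step 2: sweep the tolerance and record (error, runtime) pairs;
# plotted against each other, these form the work-precision diagram.
results = {}
for rtol in (1e-3, 1e-6, 1e-9):
    start = time.perf_counter()
    sol = solve_ivp(f, t_span, y0, method="LSODA", rtol=rtol, atol=rtol)
    elapsed = time.perf_counter() - start
    err = float(np.linalg.norm(sol.y[:, -1] - y_ref))
    results[rtol] = (err, elapsed, sol.nfev)
    print(f"rtol={rtol:g}  error={err:.2e}  time={elapsed:.4f}s  nfev={sol.nfev}")
```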

There are a few big benchmark sets. The first is Hairer I and Hairer II, which are both very good references for the solver algorithms. The second is the IVP test suite. These are pretty outdated, though, and only cover very old algorithms. They don't tend to use methods that solve the linear systems in a multithreaded fashion either, which is quite common these days and pretty necessary for stiff-solver efficiency. They also don't include LSODA.

A point of comparison is in order. The lsoda used here is a wrapper over a C translation of the same Fortran code that SciPy uses, so it will have similar characteristics. The SciPy usage will be quite a bit slower, though, since the derivative function f is a Python callback rather than a compiled function linked in the way Julia does it; other than that, there's nothing different about the algorithm's behavior.

Summary

So let me end by summing up a few characteristics of the algorithm. It's very multi-paradigm. It does fine in almost all of the standard benchmark cases, though it's never really the best. It can switch modes to handle non-stiff problems okay and stiff problems well (though it has issues with stiff problems as the tolerance gets lower). But if all you ever wanted was one algorithm to use in every case, this would be a great pick (which is most likely why it's the SciPy default).

Notice, though, that I said "standard benchmark cases". There are many cases where you would not want to use this algorithm. For example, I describe in a blog post how the cost of multistep methods increases tremendously if you have lots of events (because they can no longer use past time points and need to "go back to Euler"). Also, BDF methods are only L-stable up to second order, so they have trouble with very stiff ODEs (which is why, for example, MATLAB recommends something other than ode15s if the problem is too stiff). Multistep methods in general are more advantageous for large systems than for small ones, due to their single function evaluation per time step, while other methods like Runge-Kutta can take much larger steps for the same error due to optimization of the constants in the error expansion (described here).

That's probably the simplest way to comprehensively describe the algorithmic complexity (or efficiency in solving ODEs) of LSODA, the default choice for odeint in SciPy 1.0.

Edit

For future reference, this is the original paper on lsoda. lsoda is a modification of the lsode Adams and BDF methods, so it's not referred to by name there; it's instead called "a modified version of the LSODE package" (which makes it extremely hard to search for, haha).

$\begingroup$Thank you very much, your answer is extremely helpful. Just one quick question... am I correct in saying that for a larger step size, the Euler method would be the most accurate?$\endgroup$
– George, Jan 15 '18 at 21:39

$\begingroup$Asymptotically yes, but in practice that's essentially never going to happen. The reason is that the higher-order methods' error coefficients are orders of magnitude smaller and are multiplied by higher-order derivatives. Higher-order derivatives tend to zero unless the solution is exploding, so yes, if you make dt large enough that's true, but you never really make it large enough for that to matter. If you want more details, just ask it as another question.$\endgroup$
– Chris Rackauckas, Jan 16 '18 at 1:08