$$\Gamma^\mu_{\mu\lambda}=\frac{1}{2}g^{\mu\rho}(\partial_\mu g_{\lambda\rho}+\partial_\lambda g_{\mu\rho} -\partial_\rho g_{\lambda\mu})$$
we can cancel the first and third term on the right hand side, yielding
\begin{equation}
\Gamma^\mu_{\mu\lambda}=\frac{1}{2}g^{\mu\rho}\partial_\lambda g_{\mu\rho}
\end{equation}

The idea is to show that this equals $\frac{1}{\sqrt{-g}}\partial_\lambda\sqrt{-g}$:

\begin{align*}
\partial_\lambda\sqrt{-g}&=-\frac{1}{2\sqrt{-g}}\partial_\lambda g=-\frac{1}{2\sqrt{-g}}\frac{|g_{\mu\nu}+\delta g_{\mu\nu}|-|g_{\mu\nu}|}{\delta x^\lambda}\\
&=-\frac{1}{2\sqrt{-g}}\frac{|g_{\mu\nu}||\mathbb{I}+(g_{\mu\nu})^{-1}\partial_\lambda g_{\mu\nu}\delta x^\lambda|-|g_{\mu\nu}|}{\delta x^\lambda}\\
&=-\frac{1}{2\sqrt{-g}}\frac{|g_{\mu\nu}|\bigr(1+\text{Tr}((g_{\mu\nu})^{-1}\partial_\lambda g_{\mu\nu}\delta x^\lambda)\bigr)-|g_{\mu\nu}|}{\delta x^\lambda}\\
&=\sqrt{-g}\frac{1}{2}(g^{\mu\nu}\partial_\lambda g_{\mu\nu})
\end{align*}
Here, I used $|g_{\mu\nu}|$ to denote the determinant of $g_{\mu\nu}$. Multiplying by $\frac{1}{\sqrt{-g}}$ and comparing shows that
$$\Gamma^\mu_{\mu\lambda}=\frac{1}{\sqrt{-g}}\partial_\lambda\sqrt{-g}=\partial_\lambda \ln\sqrt{-g}$$
It follows that
$$\nabla_\mu V^\mu =\frac{1}{\sqrt{-g}}\partial_\mu(\sqrt{-g}V^\mu) $$
Thus, you see that your formula of the d'Alembertian holds only for scalars: the first covariant derivative reduces to a partial derivative, and for $\partial^\mu\phi$ we appeal to the formula I derived. Note, however, that we never used the equations of motion. This is all just a much longer and more explicit version of what Qmechanic already posted while I was typing this up.

There is a really nice derivation of this identity using differential forms, and it completely avoids all the messiness of the Christoffel symbols.

The nice thing about differential forms is that the exterior derivative can be computed using any derivative operator, so it allows us to compare the expressions we get using the covariant derivative to the expression you would get using partial derivatives.

Treating $\phi$ as a 0-form, we are going to compute $*d*d\phi$. Doing it with covariant derivatives will reproduce $\nabla_a \nabla^a \phi$, while with partial derivatives we will get the formula involving $\sqrt{-g}$:

We used that the covariant derivative of the $\epsilon$ tensor is zero, and also the fact that when you contract indices of two $\epsilon$ tensors you get a generalized Kronecker delta. The minus sign comes from the fact that $\epsilon$ is normalized with a factor of $\sqrt{-g}$, whereas when you raise all the indices of the epsilon tensor with an inverse metric you get an overall factor of $g^{-1}$, so $\dfrac{\sqrt{-g}}{g} = -\dfrac{1}{\sqrt{-g}}$.

Now we do the same thing using partial derivatives, and using the normalization of the $\epsilon$ tensor, $\epsilon_{abcd} = \sqrt{-g}\tilde{\epsilon}_{abcd}$, where $\tilde{\epsilon}_{abcd}$ is the alternating symbol, with values of $+1$, $-1$ or $0$ (and hence has vanishing partial derivatives).

\begin{align}
*d*d\phi &=\frac{1}{4!}\epsilon^{abcd}\cdot4\partial_{[a}(\epsilon_{|e|bcd]}\partial^e\phi)\\
&=\frac{4}{4!}\frac{-1}{\sqrt{-g}}\tilde{\epsilon}^{abcd}\partial_a(\sqrt{-g}\tilde{\epsilon}_{ebcd}g^{ef}\partial_f\phi)\\
&=-\frac{4\cdot3!}{4!}\delta^e_a\frac{1}{\sqrt{-g}}\partial_a(\sqrt{-g})g^{ef}\partial_f\phi)\\
&=-\frac{1}{\sqrt{-g}}\partial_a(\sqrt{-g}g^{af}\partial_f\phi)
\end{align}
Hence, we see why the two expressions are equal: it is intimately connected with the differential forms description of this operator.

For differential forms of rank $1$ or greater, the operator $\nabla_a\nabla^a$ is related to the Laplace-Beltrami operator $\triangle = *d*d+d*d*$, but is not exactly equal to this; the difference of the two operators is proportional to the curvature tensor. However, the analog of the derivation above can be derived for a differential $p$-form $\alpha$, with the operator $*d*$, i.e.
$$\nabla_{a_1} \alpha^{a_1\ldots a_p}= -*d*\alpha = \frac{1}{\sqrt{-g}}\partial_{a_1}\left(\sqrt{-g} \alpha^{a_1\ldots a_p}\right)$$

And, as others have mentioned, the derivation does not rely on the equations of motion, and hence holds off-shell.

The first (right-most) covariant derivative $\nabla_{\mu}$ in the formula
$$\tag{1} \Box ~=~ g^{\nu\mu}\nabla_\nu\nabla_\mu$$ acts on a scalar $\phi$ and can hence be replace by a partial derivative $\partial_{\mu}$. (This reduction step would not have been true for a non-scalar.)

Assuming $\phi$ is a scalar, then the second (left-most) covariant derivative $\nabla_{\nu}$ in (1) acts on a co-vector $\partial_{\mu}\phi$, which implies that we also get contributions from the Christoffel symbols.