I would have thought that if the system is truly linear, it does not matter how fast you sweep the sine. The problem is only that most systems are only approximately linear.

As long as the input overall has enough bandwidth and enough energy to let you measure the response at a sufficiently good signal-to-noise ratio, you're done. Think about it this way: if you make the FM sweep effectively infinitely fast, you get a kind of click, and you can recover the impulse response of the system by deconvolving the click from the recording. In theory that's great, but in practice it can be a problem because you 'pack a lot of bandwidth' into a short time, so you may need to output a lot of power to get a decent S/N, and your stimulus delivery might start distorting. A slow FM sweep is one way to deliver substantial stimulus energy at potentially modest stimulus power: you simply spread it out in time, hence less distortion and, 'cumulatively', potentially a good S/N.
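
To make the deconvolution idea concrete, here is a minimal sketch in Python with NumPy. It assumes an ideal, noise-free linear system, which I simulate with a short made-up FIR filter (`h_true`); the sample rate, sweep range, duration and the regularisation constant `eps` are all illustrative choices of mine, not recommendations. The point is only that dividing the spectrum of the recording by the spectrum of the sweep gives back the impulse response, whatever the sweep duration, as long as the sweep covers the band.

```python
import numpy as np

def linear_sweep(f0, f1, duration, fs):
    """Sine whose frequency rises linearly from f0 to f1 (Hz) over `duration` seconds."""
    t = np.arange(int(duration * fs)) / fs
    phase = 2.0 * np.pi * (f0 * t + 0.5 * (f1 - f0) / duration * t ** 2)
    return np.sin(phase)

def deconvolve(recorded, stimulus, n_ir, eps=1e-8):
    """Estimate an impulse response by regularised spectral division."""
    # Zero-pad to a power of two that holds the whole recording,
    # so the circular convolution implied by the FFT equals the linear one.
    n_fft = 1 << (len(recorded) - 1).bit_length()
    S = np.fft.rfft(stimulus, n_fft)
    R = np.fft.rfft(recorded, n_fft)
    power = np.abs(S) ** 2
    # eps guards against dividing by near-zero bins where the sweep has little energy.
    H = R * np.conj(S) / (power + eps * power.max())
    return np.fft.irfft(H, n_fft)[:n_ir]

fs = 8000                                   # assumed sample rate (Hz)
h_true = np.array([1.0, 0.5, -0.3, 0.1])    # made-up "system" for illustration
sweep = linear_sweep(0.0, fs / 2, 1.0, fs)  # 1 s sweep across the full band
recorded = np.convolve(sweep, h_true)       # what an ideal linear system would return
h_est = deconvolve(recorded, sweep, len(h_true))
```

With real recordings the noise floor and any distortion in the playback chain would limit how well this works, which is where the choice of sweep duration and level comes in.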

This is not strictly an auditory question, but it could be
useful for people doing acoustic measurements. If you use a
swept sine wave to measure the frequency response of a linear
system, what is the limitation on the speed of the sweep in
terms of how accurate the result would be? I imagine it has
something to do with how smooth the actual frequency response
is. If it has some pronounced bumps, they could be smoothed
out if the sweep is too fast.

In practice, you could sweep at some arbitrary rate, and then
slow it by a factor of two, and if the result is the same
(within an acceptable tolerance) you could say that you've
converged on the solution.
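
The halve-the-rate check described above could be sketched roughly as follows. Everything here is hypothetical: `measure` stands for whatever routine runs one sweep of a given duration and returns the magnitude response in dB on a fixed frequency grid, and `toy_measure` is a purely synthetic stand-in (a made-up response plus a bias that shrinks as the sweep slows), not a model of any real system.

```python
import numpy as np

def sweep_until_converged(measure, duration, tol_db=0.5, max_doublings=8):
    """Double the sweep duration until two successive responses agree.

    `measure(duration)` is a hypothetical callable: it runs one sweep lasting
    `duration` seconds and returns the magnitude response in dB on a fixed grid.
    Returns the duration of the slower sweep of the agreeing pair, and its response.
    """
    previous = np.asarray(measure(duration), dtype=float)
    for _ in range(max_doublings):
        duration *= 2.0  # same band in twice the time, i.e. half the sweep rate
        current = np.asarray(measure(duration), dtype=float)
        worst = np.max(np.abs(current - previous))  # largest change at any frequency
        if worst <= tol_db:
            return duration, current
        previous = current
    raise RuntimeError("responses did not converge within max_doublings")

# Synthetic stand-in for a real measurement (illustration only).
freqs = np.linspace(100.0, 1000.0, 10)
true_db = -20.0 * np.log10(1.0 + (freqs / 500.0) ** 2)

def toy_measure(duration):
    # Assumed error model: a bias in dB that falls off as 1 / duration.
    return true_db + 1.0 / duration

duration, response_db = sweep_until_converged(toy_measure, 1.0, tol_db=0.1)
```

Agreement between two successive sweeps only shows that slowing down further changes little; it does not by itself rule out errors that are independent of sweep rate.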