Description

This is a continuation of #7490, which became too "big" to be usable. Please keep that other ticket closed.

A lot of possible reasons for bad performance have been identified, some of them fixed, most being due to installation and configuration issues. Please refer to the TracPerformance checklist, and help maintain that wiki page with your findings.

This ticket should be used to help identify the remaining performance bottlenecks, apart from those which are already known (see TracDev/Performance).

(and if someone has a version of pystack that terminates or can be interrupted, I'll take it!)

So next_rev is slow (we knew it), but it can be extremely slow in the direct-svnfs case. Maybe we should consider bailing out and returning None there after a while, or even use a smarter implementation, such as a dichotomic (binary) search.
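For illustration, a dichotomic search over the revision range could look roughly like the sketch below. `first_existing_rev` and `path_exists` are hypothetical names, and the sketch assumes the predicate is monotone over the range (false up to some revision, then true forever after), which is only an approximation of what next_rev actually needs:

```python
def first_existing_rev(path_exists, lo, hi):
    """Binary search for the earliest revision in [lo, hi] where
    path_exists(rev) is true. Assumes the predicate is monotone:
    false up to some revision, true from then on. Returns None if
    the path exists nowhere in the range.
    """
    if not path_exists(hi):
        return None
    while lo < hi:
        mid = (lo + hi) // 2
        if path_exists(mid):
            hi = mid        # mid works, try earlier revisions
        else:
            lo = mid + 1    # mid doesn't, look later
    return lo
```

The win is that only O(log n) revisions need to be paged in, instead of walking every revision between the two endpoints.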

The general topic "Trac 0.11 really slow, high CPU usage (continued)", and in particular most of the remaining unsolved cases from #7490, have now been addressed as part of #9111 by r10002. Condition.wait(timeout) is a performance killer in Python, as it does a kind of "active" wait. This seems to have been fixed in Python 3.2, which uses the appropriate library call (see PythonBug:3155). Never assume the underlying system does the "right thing", but look at the code to be sure…
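To illustrate the "active" wait, here is a simplified sketch of the polling loop that CPython 2.x's threading.Condition.wait(timeout) performs (the real code is in threading.py; the exact delay constants below are from memory, and the `acquired` callback is a stand-in for the non-blocking lock acquisition):

```python
import time

def polling_wait(timeout, acquired=lambda: False):
    """Simplified model of CPython 2.x Condition.wait(timeout):
    instead of blocking in the OS with a timed wait, it polls with
    exponentially growing sleeps capped at 50 ms, waking up again
    and again until the timeout expires -- hence the "active" wait."""
    wakeups = 0
    endtime = time.time() + timeout
    delay = 0.0005
    while not acquired():
        remaining = endtime - time.time()
        if remaining <= 0:
            break
        delay = min(delay * 2, remaining, 0.05)
        time.sleep(delay)
        wakeups += 1
    return wakeups
```

Even a short timeout costs a series of wakeups; Python 3.2 instead blocks once in a timed lock acquisition at the OS level.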

Could this be causing something? I noticed that it says Trac 0.12, which I didn't think I was running on this system; as I said, I thought my version was 0.11.

This is obviously an InstallationIssue… nothing wrong with Trac itself and nothing related to this ticket…

It seems that this TicketExtPlugin will only work with Trac < 0.11.6 (see r8609), and obviously you have 0.12.2 installed (also verify in the /about page as TRAC_ADMIN or in Admin / Plugins). So you can either try to fix that plugin (recommended) or downgrade (uninstall all Trac versions in site-packages, then reinstall 0.11.5 - not recommended).

If you can get anything more specific about the slow down (does it also happen without that plugin?), then please create a new ticket.

Ok, and on a side note let me bump the priority of this ticket, as I think the next_rev thing is causing the slowdown I see in my own production Trac… I haven't yet had time to do an in-depth debugging session, but there seems to be a problem with the source:tags/trac-0.12.2/trac/versioncontrol/cache.py@:365-366#L364 part, which doesn't seem to use the indexes (well, indeed there are no indexes on the paths…).
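The scan-vs-index difference is easy to see with SQLite's EXPLAIN QUERY PLAN. The table below is a simplified stand-in for Trac's real node_change schema, just to show how adding a path index changes the plan for this kind of lookup:

```python
import sqlite3

# Simplified stand-in for Trac's node_change table (not the real schema),
# to show the query plan with and without an index on path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node_change (rev INTEGER, path TEXT)")

query = "SELECT rev FROM node_change WHERE path = ?"

# Without an index: SQLite falls back to a full table scan.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query,
                           ("trunk/README",)).fetchall()

conn.execute("CREATE INDEX node_change_path_idx ON node_change (path)")

# With the index: an index search instead of a scan.
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query,
                          ("trunk/README",)).fetchall()

print(plan_before[0][-1])  # e.g. "SCAN node_change"
print(plan_after[0][-1])   # e.g. "SEARCH node_change USING INDEX ..."
```

The last column of each plan row is the human-readable detail; "SCAN" means every row is visited, "SEARCH … USING INDEX" means the lookup is logarithmic.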

I ran a single call of fs.check_path under strace and it looks like libsvn_fs_fs reads the revision root, then follows to the leaf node, *and also* looks up the previous rev of the node. We don't need anything from the previous rev, the node type is already in the most recent rev.

Ah! The only reason revision_root *seems* faster is that check_path fills the dag cache, and node_history etc. get lots of dag cache hits. Actually I believe the *first* call to check_path is slow, all other calls to check_path and other APIs are probably fast, because they work with a warm cache.

I took check_path out of next_rev; this is what the profile looked like:

30% fewer calls to check_path, but only 10% less time. So the check_path calls I removed must have been the "fast" ones. The slow one(s) that fill the cache must still be running, probably the one in SubversionNode.__init__.

This shows how misleading these profiles can be. I guess it's because the C call structure inside libsvn_fs_fs is opaque to the Python profiler - otherwise I would see a lot of tottime spent in some leaf svn function like open_path and the rest would be light.
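The effect is easy to reproduce with any plain C-level call: the Python profiler charges all the time to the single Python-visible entry point, with no visibility into the call tree inside it. A minimal sketch, where time.sleep stands in for an opaque libsvn call:

```python
import cProfile
import io
import pstats
import time

def render():
    # Stands in for a SWIG-wrapped libsvn call: a single opaque C
    # entry point; whatever call structure exists inside it is
    # invisible to the Python profiler.
    time.sleep(0.05)

profiler = cProfile.Profile()
profiler.enable()
render()
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).print_stats()
report = out.getvalue()
# All the time shows up as tottime of the built-in sleep itself;
# nothing inside the C call is broken down further.
```

In the libsvn case this means the SWIG wrapper functions absorb all the tottime, rather than leaf functions like open_path.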

I don't think there's much optimization left here. In the context of Trac specifically, we eventually need to page in the dag nodes, for rendering the "previous revision" links, "last change" header, etc. It doesn't matter when it happens. Though the binary search will certainly continue to help, because it just avoids paging in a lot of nodes.

So far I've been benchmarking by rendering the same page 10 times. On the second and later renders, this will get a very good hit rate in the dag cache. This is unlikely to happen under real traffic, so this is probably an unfair benchmark. At the other extreme, the most pessimistic benchmark would probably be to render random paths. That way the cache will be warm within the same request, but cold across requests. With small repos it probably doesn't matter, because you'll keep a lot in cache at various levels anyway, even across requests. I'm working with a 40000 revision repo.

It makes a huge difference for me. This is with trunk/setup_wininst.bmp@364, from a repo svnsync'ed from http://svn.edgewall.org/repos/trac with 12623 revs, packed (svnadmin pack), and repository_type = direct-svnfs. Wall time of dispatch_request('/browser/trunk/setup_wininst.bmp?rev=364'), averaged over 10 runs:
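For reference, the averaging itself can be done with a small generic helper; dispatch_request is Trac-internal, so in the usage assertion below a sleeping callable is a purely hypothetical stand-in for the page render:

```python
import time

def average_wall_time(fn, runs=10):
    """Average wall-clock time of fn() over `runs` calls.
    Note that the first (cold-cache) call is included in the
    average, so warm-cache runs pull the number down."""
    timings = []
    for _ in range(runs):
        start = time.time()
        fn()
        timings.append(time.time() - start)
    return sum(timings) / runs
```

Timing each run separately (rather than one start/stop around the whole loop) also lets you inspect the cold-versus-warm spread across runs, not just the mean.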

If we have some information about the fact that there's a deletion, that would be better than just having a disabled Next Revision link, as the latter could also mean we're at the latest revision. Of course, getting an error when following such a link is not nice; rendering the changeset box with the file content replaced by "(the file was deleted in that revision)" would be better!

I have been trying to fix the deleted revision bug, but it's difficult because of many edge cases. Fix one case, break another. There is indeed also a problem with scoped repos. This is shown in the svn_tests test case when generating the SVN repository cache - some changesets "disappear" and SvnCachedRepo.get_changeset fails. This is going to take a while longer…