Hi,
I have recently been trying to add thread offloading of the CS ioctl
to r600g in order to reduce the impact of kernel overhead on fps.
That, unfortunately, requires the whole of winsys/radeon to be used, because
even the buffer management (bo_map, bo_wait, bo_busy) must take into
account that a CS ioctl may be in progress. Besides that, there are
several possible race conditions in r600g, so instead of rewriting
r600g and trying to do what winsys/radeon is doing, I decided to
simply use winsys/radeon.
What's new in r600g:
- Thread offloading of the DRM CS ioctl. I expect a 0-15% increase in
performance from that in CPU-bound apps.
- The new GEM_WAIT ioctl is used to avoid waiting for a buffer when
possible (e.g. Mesa may map an index buffer to compute index bounds,
which shouldn't cause unnecessary waiting now). I have sent the DRM
patches that add the ioctl to dri-devel.
- Thread-safety: There are several possible races in r600g. I
especially don't like radeon_bo::reloc, which may cause pretty ugly
races if a resource is shared and relocated in multiple contexts.
winsys/radeon doesn't have that race and also fixes a couple more.
Hopefully this thread safety won't cause performance regressions.
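
To illustrate the thread offloading item above, here is a minimal
sketch using pthreads. It is not the winsys/radeon code: every sketch_*
name is hypothetical, and sketch_submit() merely stands in for the DRM
CS ioctl. The point it shows is that anything touching buffers (the
bo_map/bo_wait/bo_busy paths) has to call something like sketch_sync()
first, because a submission may still be in flight on the worker.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_cs {
    unsigned ndw;                 /* dwords recorded in this command stream */
};

struct sketch_winsys {
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    struct sketch_cs *pending;    /* CS owned by the worker, NULL when idle */
    bool quit;
    unsigned submitted_dw;        /* total dwords "submitted" so far */
};

/* Stand-in for the CS ioctl (assumption: real code would call into DRM). */
static void sketch_submit(struct sketch_winsys *ws, struct sketch_cs *cs)
{
    ws->submitted_dw += cs->ndw;
    cs->ndw = 0;
}

static void *sketch_worker(void *data)
{
    struct sketch_winsys *ws = data;

    pthread_mutex_lock(&ws->lock);
    for (;;) {
        while (!ws->pending && !ws->quit)
            pthread_cond_wait(&ws->cond, &ws->lock);
        if (!ws->pending)
            break;                /* quit requested and nothing left to do */

        struct sketch_cs *cs = ws->pending;
        pthread_mutex_unlock(&ws->lock);
        sketch_submit(ws, cs);    /* the slow part runs without the lock */
        pthread_mutex_lock(&ws->lock);

        ws->pending = NULL;
        pthread_cond_broadcast(&ws->cond);
    }
    pthread_mutex_unlock(&ws->lock);
    return NULL;
}

/* Block until no submission is in flight. */
static void sketch_sync(struct sketch_winsys *ws)
{
    pthread_mutex_lock(&ws->lock);
    while (ws->pending)
        pthread_cond_wait(&ws->cond, &ws->lock);
    pthread_mutex_unlock(&ws->lock);
}

/* Hand a CS to the worker; waits only if the previous one is still busy. */
static void sketch_flush_async(struct sketch_winsys *ws, struct sketch_cs *cs)
{
    assert(cs != NULL);
    pthread_mutex_lock(&ws->lock);
    while (ws->pending)
        pthread_cond_wait(&ws->cond, &ws->lock);
    ws->pending = cs;
    pthread_cond_broadcast(&ws->cond);
    pthread_mutex_unlock(&ws->lock);
}

static int sketch_init(struct sketch_winsys *ws)
{
    ws->pending = NULL;
    ws->quit = false;
    ws->submitted_dw = 0;
    pthread_mutex_init(&ws->lock, NULL);
    pthread_cond_init(&ws->cond, NULL);
    return pthread_create(&ws->thread, NULL, sketch_worker, ws);
}

static void sketch_destroy(struct sketch_winsys *ws)
{
    pthread_mutex_lock(&ws->lock);
    ws->quit = true;
    pthread_cond_broadcast(&ws->cond);
    pthread_mutex_unlock(&ws->lock);
    pthread_join(ws->thread, NULL);
    pthread_cond_destroy(&ws->cond);
    pthread_mutex_destroy(&ws->lock);
}
```

In a scheme like this the driver would keep two command streams and
fill one while the other is being submitted, so the main thread only
stalls when it flushes faster than the kernel can consume.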
winsys/radeon can do space checking as well, but we don't use that in r600g yet.
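
On the GEM_WAIT item above: the exact rule is not spelled out here, so
the following is only a sketch of one plausible reading, namely that a
CPU read needs to wait only for pending GPU writes, while a CPU write
needs to wait for any pending GPU access. The enum and the function
name are hypothetical and are not the real DRM or winsys flags.

```c
#include <stdbool.h>

/* Hypothetical usage bits for illustration only. */
enum sketch_usage {
    SKETCH_USAGE_NONE  = 0,
    SKETCH_USAGE_READ  = 1 << 0,
    SKETCH_USAGE_WRITE = 1 << 1
};

/*
 * Returns true if mapping a buffer for cpu_usage has to wait for GPU work
 * that is still using the buffer as described by gpu_usage.
 */
static bool sketch_map_must_wait(unsigned gpu_usage, unsigned cpu_usage)
{
    if (gpu_usage == SKETCH_USAGE_NONE)
        return false;             /* the GPU is not using the buffer */

    if (cpu_usage & SKETCH_USAGE_WRITE)
        return true;              /* CPU writes conflict with any GPU use */

    /* CPU reads conflict only with pending GPU writes. */
    return (gpu_usage & SKETCH_USAGE_WRITE) != 0;
}
```

Under this assumption, mapping an index buffer for reading while the
GPU is only reading it (the index bounds case) returns false, so no
wait is needed.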
Performance improvements: these are the apps where I have been able to
measure a difference:
Unigine Heaven
Before: 7.3 fps
After: 7.6 fps (about +4%)
Torcs
Before: 29 fps
After: 34 fps (about +17%)
Note that no commit in the r600winsys2 branch introduces piglit
regressions, so we can bisect through it if needed. The net loss is a
little over 900 lines of code in r600g. The fenced cache buffer manager
in r600g has turned out to be no better than pb_cache_bufmgr with the
is_buffer_busy hook set, so I removed the former too, as the latter is
much simpler.
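
For readers unfamiliar with the idea of a buffer cache with a busy
hook, here is a minimal sketch. It is not the pb_cache_bufmgr
implementation or its API: all sketch_* names, the fixed slot count,
and the gpu_busy field are assumptions made for illustration. A freed
buffer goes into the cache, and it is handed out again only if it is
large enough and the hook reports that the GPU is done with it.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_CACHE_SLOTS 8

struct sketch_buf {
    size_t size;
    bool gpu_busy;                /* stand-in for a real busy query */
};

typedef bool (*sketch_is_busy_fn)(const struct sketch_buf *buf);

struct sketch_cache {
    struct sketch_buf *slots[SKETCH_CACHE_SLOTS];
    size_t count;
    sketch_is_busy_fn is_buffer_busy;   /* optional hook, may be NULL */
};

static bool sketch_buf_is_busy(const struct sketch_buf *buf)
{
    return buf->gpu_busy;
}

/* Put a released buffer into the cache; fails if the cache is full. */
static bool sketch_cache_put(struct sketch_cache *c, struct sketch_buf *buf)
{
    if (c->count == SKETCH_CACHE_SLOTS)
        return false;
    c->slots[c->count++] = buf;
    return true;
}

/* Reuse a cached buffer of at least `size` bytes that is not busy. */
static struct sketch_buf *sketch_cache_get(struct sketch_cache *c, size_t size)
{
    for (size_t i = 0; i < c->count; i++) {
        struct sketch_buf *buf = c->slots[i];

        if (buf->size < size)
            continue;             /* too small */
        if (c->is_buffer_busy && c->is_buffer_busy(buf))
            continue;             /* still in use by the GPU, skip it */

        c->slots[i] = c->slots[--c->count];   /* remove by swapping in last */
        return buf;
    }
    return NULL;                  /* caller allocates a fresh buffer */
}
```

With the hook set, no separate fence tracking is needed in the cache
itself: a busy buffer simply stays cached until a later lookup finds it
idle.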
This new work has been pushed to a new branch, r600winsys2, in the
main Mesa repository. Please review and test.
Marek