It depends on what you call 'good performance' and what you want to achieve. In 1337, 1/3 of the total frame time was spent on the double buffer. I can't remember exactly, but drawing all the lines dominated the rest of the CPU time.

If you can live with drawing/erasing with EOR and a smaller image area, you can get an idea of what is achievable....

I even made some tests with those routines for 3D shapes. They are really fast, though slower than wireframe (if I recall correctly). The BIG trick of the demo is the use of alternate lines to draw/erase, which makes it really impressive!

I thought about a technique to do really fast horizontal line drawing, but it would (probably) only be good where you are not doing adjacent or overlapping shapes (so I'm thinking of an x-rotator effect, or with precalculated slices a chessboard wobbler for example).

The idea is that you draw the first six pixels as normal (from a table), then in the next byte put a control code that sets the background to the foreground colour, and in the byte after that another control code that sets the foreground to the background colour. Then at the end of the line you draw the last byte as normal (but inverted), and the byte after that sets the background back to normal.

Obviously, for lines that span fewer than 4 bytes, you draw as normal.

But for any line longer than that, it takes a constant (small) time to draw. Also, to clear the line you just redraw those few bytes, so that becomes faster too.
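
To make the byte counts concrete, here is a rough Python model of which screen bytes get written for one horizontal line. It is only a sketch of the idea above, not tested on real hardware. The assumptions are mine: six pixels per byte with the leftmost pixel in the highest bit, control-code cells showing up in the background colour, and the three control codes represented by placeholder strings rather than real values. The names `hline_writes`, `LEFT_EDGE` and `RIGHT_EDGE` are made up for illustration.

```python
PIXELS_PER_BYTE = 6                      # six pixels per screen byte, as in the post
FULL = (1 << PIXELS_PER_BYTE) - 1        # all six pixels set

# Hypothetical placeholders for the control codes (NOT real hardware values).
PAPER_TO_INK = "background := foreground colour"
INK_TO_PAPER = "foreground := background colour"
PAPER_RESTORE = "background := normal"

# Assumption: bit 5 is the leftmost pixel of the byte.
# LEFT_EDGE[p]: pixels p..5 set. RIGHT_EDGE[p]: pixels 0..p set.
LEFT_EDGE = [(1 << (PIXELS_PER_BYTE - p)) - 1 for p in range(PIXELS_PER_BYTE)]
RIGHT_EDGE = [FULL & ~((1 << (PIXELS_PER_BYTE - 1 - p)) - 1)
              for p in range(PIXELS_PER_BYTE)]


def hline_writes(x0, x1):
    """Return the (byte_column, value) writes for pixels x0..x1 inclusive."""
    b0, p0 = divmod(x0, PIXELS_PER_BYTE)
    b1, p1 = divmod(x1, PIXELS_PER_BYTE)
    span = b1 - b0 + 1

    if span < 4:
        # Short line: draw as normal, one bitmap byte per column.
        if span == 1:
            return [(b0, LEFT_EDGE[p0] & RIGHT_EDGE[p1])]
        writes = [(b0, LEFT_EDGE[p0])]
        writes += [(b, FULL) for b in range(b0 + 1, b1)]
        writes.append((b1, RIGHT_EDGE[p1]))
        return writes

    # Long line: five writes, whatever the length.
    return [
        (b0, LEFT_EDGE[p0]),              # first byte as normal, from a table
        (b0 + 1, PAPER_TO_INK),           # background becomes the line colour
        (b0 + 2, INK_TO_PAPER),           # foreground becomes the old background
        (b1, FULL & ~RIGHT_EDGE[p1]),     # last byte, inverted (colours are swapped)
        (b1 + 1, PAPER_RESTORE),          # background back to normal
    ]


if __name__ == "__main__":
    for x0, x1 in [(2, 3), (0, 17), (3, 100), (0, 233)]:
        print(x0, x1, len(hline_writes(x0, x1)), "writes")
```

In this model any line spanning 4 bytes or more costs five writes, and erasing it means rewriting those same five columns. Note that the restore byte lands one column past the end of the line, so the line has to end at least one byte before the right edge of the screen.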