Page 393: Figure 13.10 contains some extra fans. Numbering the columns 0 to 15 (left to right), and the rows 0 to 4 (top to bottom), the fans originating at locations (3, 1), (7, 1) and (11, 1) should not be present. Thanks to Peter Longhurst for pointing this out!

Page 417: Line 147 of streamCompact_odd.cuh declares an int instead of T (thanks to Louise Knight for pointing this out!):

It seems odd that the code uses “1-stagingIndex” in “CUDART_CHECK( cudaEventRecord( g_events[1-stagingIndex], NULL ) );” rather than “stagingIndex”; this causes the copies to run serially.
After I changed “1-stagingIndex” to “stagingIndex”, I still got the correct result, and the copy ran about twice as fast.
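
To make the difference between the two indices concrete, here is a minimal host-only sketch that models just the event bookkeeping and makes no CUDA calls. It assumes, and this is not stated above, that the surrounding loop alternates between two staging buffers and calls cudaEventSynchronize on g_events[stagingIndex] before refilling a buffer. The function name gatingCopy and its parameters are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Host-only model of two staging buffers guarded by two events (assumed layout).
// lastRecorded[e] is the iteration whose copy was followed by a record into
// event e, or -1 if event e was never recorded.
// Returns, for each iteration, which earlier copy the host waits for before
// refilling its staging buffer (-1 means no wait).
std::vector<int> gatingCopy(int iterations, bool recordIntoOtherSlot)
{
    std::array<int, 2> lastRecorded = { -1, -1 };
    std::vector<int> waitsFor;
    int stagingIndex = 0;
    for (int k = 0; k < iterations; ++k) {
        // Models cudaEventSynchronize( g_events[stagingIndex] ) (assumed).
        waitsFor.push_back(lastRecorded[stagingIndex]);
        // Models cudaEventRecord issued after the async copy of iteration k.
        int slot = recordIntoOtherSlot ? 1 - stagingIndex : stagingIndex;
        lastRecorded[slot] = k;
        // Switch to the other staging buffer for the next iteration.
        stagingIndex = 1 - stagingIndex;
    }
    return waitsFor;
}
```

Under that assumption, gatingCopy(4, true) models “1-stagingIndex” and yields {-1, 0, 1, 2}: each iteration waits for the copy issued immediately before it, which would be consistent with the serial behavior described above. gatingCopy(4, false) models “stagingIndex” and yields {-1, -1, 0, 1}: each iteration waits only for the copy that last used the same buffer, so the host could fill one buffer while the other is still being copied.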