We develop GPU adaptations of the Aho-Corasick string matching algorithm for the the case when all data reside initially in the GPU memory and the results are to be left in this memory. We consider several refinements to a base GPU implementation and measure the performance gain from each refinement. Experiments conducted on an NVIDIA Tesla GT200 GPU that has 240 cores running off of a Xeon 2.8GHz quad-core host CPU show that our Aho-Corasick GPU adaptation achieves a speedup between 8.5 and 9.5 relative to a single-thread CPU implementation and between 2.4 and 3.2 relative to the best multithreaded implementation.