This paper proposes a fast and robust method for searching a large video collection for given short clips. Unlike existing video search methods, which use visual features only, our scheme performs two-phase hierarchical matching that uses visual and audio features in succession. Because the video sampling rate (25 or 30 fps) is much lower than the audio sampling rate (8 to 48 kHz), a coarse search is first performed on sub-sampled video frames; potential matches are then verified and accurately located using fine audio features. Both features are extracted directly from MPEG compressed video for computational efficiency. Experiments were conducted on more than 10.5 hours of video to search for re-occurrences of 83 TV commercials and one news lead-out clip. All 220 instances were correctly detected with no false alarms. The experiments also show that the proposed method is robust to variations in video bit rate, frame rate, and frame size, and to color shifting.
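The coarse-to-fine idea described above can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the experiments: the per-frame scalar features, the distance measure, the thresholds, the ratio of four audio features per video frame, and all function names (`mean_abs_diff`, `coarse_video_candidates`, `fine_audio_verify`, `search_clip`) are assumptions made for illustration, and no MPEG compressed-domain feature extraction is attempted.

```python
from typing import List, Optional, Sequence


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Average absolute difference between two equal-length feature sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def coarse_video_candidates(stream_v: Sequence[float], clip_v: Sequence[float],
                            step: int, threshold: float) -> List[int]:
    """Phase 1 (assumed form): compare only every `step`-th clip frame."""
    sub_clip = clip_v[::step]
    candidates = []
    for start in range(len(stream_v) - len(clip_v) + 1):
        sub_stream = stream_v[start:start + len(clip_v):step]
        if mean_abs_diff(sub_stream, sub_clip) <= threshold:
            candidates.append(start)  # start is a video-frame index
    return candidates


def fine_audio_verify(stream_a: Sequence[float], clip_a: Sequence[float],
                      video_start: int, audio_per_frame: int,
                      threshold: float) -> Optional[int]:
    """Phase 2 (assumed form): verify a candidate with audio features and
    refine its position within one video frame on either side."""
    center = video_start * audio_per_frame
    best = None  # (distance, audio offset) of the best match so far
    for off in range(max(0, center - audio_per_frame),
                     center + audio_per_frame + 1):
        window = stream_a[off:off + len(clip_a)]
        if len(window) < len(clip_a):
            break  # ran past the end of the stream
        d = mean_abs_diff(window, clip_a)
        if d <= threshold and (best is None or d < best[0]):
            best = (d, off)
    return None if best is None else best[1]


def search_clip(stream_v, stream_a, clip_v, clip_a, audio_per_frame=4,
                step=2, video_threshold=0.1, audio_threshold=0.05) -> List[int]:
    """Run the coarse video search, then keep only audio-verified matches.
    Returns match positions as audio-feature offsets."""
    matches = []
    for start in coarse_video_candidates(stream_v, clip_v, step, video_threshold):
        off = fine_audio_verify(stream_a, clip_a, start,
                                audio_per_frame, audio_threshold)
        if off is not None and off not in matches:
            matches.append(off)
    return matches


# Toy data, made up for illustration: the clip's visuals occur twice in the
# stream, but only the first occurrence also carries the clip's audio.
clip_v = [0.2, 0.9, 0.4, 0.7, 0.1, 0.8]
clip_a = [((i * 7) % 10) / 10 for i in range(24)]
stream_v = [0.0] * 10 + clip_v + [0.0] * 10 + clip_v + [0.0] * 10
stream_a = [0.0] * (len(stream_v) * 4)
stream_a[40:64] = clip_a

print(coarse_video_candidates(stream_v, clip_v, step=2, threshold=0.1))  # [10, 26]
print(search_clip(stream_v, stream_a, clip_v, clip_a))  # [40]
```

In this toy run the coarse visual phase flags both occurrences (video frames 10 and 26), and the audio phase rejects the second, whose audio differs, while locating the first at audio offset 40. This mirrors, under the stated assumptions, how audio verification can remove visually plausible false alarms.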