Although video summarization has been studied extensively, existing schemes are neither lightweight nor generalizable to all types of video content. To generate accurate abstractions across video types, we propose a framework called Click2SMRY, which leverages the wisdom of the crowd to generate video summaries while keeping the workload of each worker low. The framework is lightweight because workers only need to press a dedicated key when they feel that the video being played is reaching a highlight. A unique feature of the framework is that it can generate video summaries at different abstraction levels in real time, according to viewers' preferences. Experimental results show that the framework generates satisfactory summaries for different types of video clips.
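The abstract does not specify how worker clicks are turned into a summary, so the following is only a minimal sketch of one plausible aggregation, not the Click2SMRY algorithm itself. It assumes that each worker's key presses are recorded as timestamps in seconds, that a press also marks the few seconds before it (to allow for reaction delay), and that the abstraction level is expressed as a target fraction of the video length. The names `click_histogram`, `summarize`, `window`, and `ratio` are hypothetical.

```python
from collections import Counter


def click_histogram(worker_clicks, duration, window=2):
    """For each second of the video, count how many distinct workers marked it."""
    votes = Counter()
    for clicks in worker_clicks:
        covered = set()
        for t in clicks:
            # Assumption: a press at time t also marks the preceding
            # `window` seconds, to compensate for reaction delay.
            start = max(0, int(t) - window)
            end = min(duration - 1, int(t))
            covered.update(range(start, end + 1))
        # Each worker contributes at most one vote per second.
        votes.update(covered)
    return [votes[s] for s in range(duration)]


def summarize(worker_clicks, duration, ratio, window=2):
    """Return (start, end) segments, end exclusive, covering up to `ratio` of the video."""
    hist = click_histogram(worker_clicks, duration, window)
    budget = max(1, round(duration * ratio))
    # Rank the seconds that received votes: most votes first, earlier time on ties.
    ranked = sorted(
        (s for s in range(duration) if hist[s] > 0),
        key=lambda s: (-hist[s], s),
    )
    chosen = sorted(ranked[:budget])
    # Merge consecutive seconds into contiguous segments.
    segments = []
    for s in chosen:
        if segments and segments[-1][1] == s:
            segments[-1][1] = s + 1
        else:
            segments.append([s, s + 1])
    return [tuple(seg) for seg in segments]


# Three hypothetical workers watching a 60-second clip.
worker_clicks = [[10.2, 40.5], [11.0], [10.8, 41.1]]
print(summarize(worker_clicks, duration=60, ratio=0.05))  # [(8, 11)]
print(summarize(worker_clicks, duration=60, ratio=0.10))  # [(8, 12), (39, 41)]
```

Because the vote histogram is computed once and only the budget changes, a shorter or longer summary can be produced on request without re-collecting clicks, which is one way the real-time choice of abstraction level could be supported.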