Before anyone can suggest a proper solution: what does your data look like, and what are the criteria for a record to count as a dupe? You may have hundreds of calls in your log file to a specific graphic, but if they all come from different IP addresses at different times, they are not dupes. Also, how big is your log file? A hash-based dupe system may not work over millions of records.
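
To make the question concrete, here is a rough Python sketch of what a hash-based dupe check could look like once the criteria are pinned down. Everything in it is an assumption on my part, since I have not seen your data: the log lines are taken to be whitespace-separated with IP, time, and path in fields 0, 1, and 2, and the names `dedupe` and `key_fields` are made up for illustration.

```python
import hashlib


def dedupe(lines, key_fields=(0, 1, 2)):
    """Yield each line whose key fields have not been seen before.

    key_fields picks which whitespace-separated columns define a dupe.
    The default (IP, time, path) is an assumed layout, not your real one.
    """
    seen = set()  # one digest per distinct key
    for line in lines:
        parts = line.split()
        # Build the key from the chosen columns, skipping missing ones.
        key = " ".join(parts[i] for i in key_fields if i < len(parts))
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        if digest not in seen:
            seen.add(digest)
            yield line


if __name__ == "__main__":
    sample = [
        "1.1.1.1 12:00 /a.png",
        "2.2.2.2 12:00 /a.png",  # same graphic, different IP: not a dupe
        "1.1.1.1 12:00 /a.png",  # same IP, time, and path: dupe
    ]
    for kept in dedupe(sample):
        print(kept)
```

Changing `key_fields` changes what counts as a dupe, which is why the criteria matter. The `seen` set also holds one digest per distinct key, so its memory use grows with the number of distinct records, and that is where the size of the log file comes in.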