Crawler G retrieves data elements from attacker page A and commits those contents to persisted storage as G[A] (e.g., a database row).

End user visits application T. Application T's persisted storage is the set {G} of crawled records, which includes G[A].

End user's interaction with application T results in invocation of JavaScript code that retrieves G[A]. Because the content of G[A] was not neutralized, either prior to its persisted storage or during JavaScript execution from the DOM, G[A] is executed as active code instead of being interpolated as a scalar-like primitive data value or as closure-guarded object data.
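The flow above can be sketched in a few lines. This is a minimal illustration, not any real application's code: the in-memory `store`, the function names, the `attacker.example` URL, and the payload are all hypothetical, and server-side string interpolation stands in for the JavaScript that would insert G[A] into the DOM.

```python
import html

# Hypothetical stand-in for application T's persisted storage: URL -> G[url].
store: dict[str, str] = {}


def crawl(url: str, fetched_content: str) -> None:
    """Crawler G: persist the fetched content verbatim as G[url]."""
    store[url] = fetched_content


def render_unsafe(url: str) -> str:
    """Interpolate G[url] directly into markup, so a browser parses it as active content."""
    return f'<div class="snippet">{store[url]}</div>'


def render_escaped(url: str) -> str:
    """Escape G[url] at output time, so a browser renders it as inert text."""
    return f'<div class="snippet">{html.escape(store[url])}</div>'


# Attacker page A serves a payload; the crawler stores it unchanged.
crawl("http://attacker.example/A", "<img src=x onerror=alert(1)>")
unsafe_markup = render_unsafe("http://attacker.example/A")
escaped_markup = render_escaped("http://attacker.example/A")
```

In `unsafe_markup` the stored payload survives as a live `<img>` element with an event handler, whereas in `escaped_markup` the angle brackets are replaced by character references and the payload is displayed rather than executed.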

Maturely programmed crawlers often attempt to strip malicious data from crawled resources prior to persistent storage. Additionally, maturely programmed applications often utilize output escaping or JavaScript sandboxing to prevent crawled data from being executed instead of rendered. However, obfuscation of data on a crawled resource may sidestep detection. Relying strictly on crawler-side sanitization may therefore allow a stored cross-site script to execute if the target JavaScript context does not actively defend against all non-scalar data.
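As a hedged illustration of why crawler-side stripping alone is fragile, the sketch below pairs a deliberately naive filter with one well-known obfuscation, a nested tag. The `naive_strip` function and the payload are hypothetical; real crawlers and real obfuscation techniques vary widely.

```python
import html
import re


def naive_strip(content: str) -> str:
    """Hypothetical crawler-side filter: delete <script> and </script> tags in one pass."""
    return re.sub(r"</?script[^>]*>", "", content, flags=re.IGNORECASE)


# Obfuscated payload: removing the inner tags reassembles the outer ones.
obfuscated = "<scr<script>ipt>alert(1)</scr</script>ipt>"

stored = naive_strip(obfuscated)   # what the crawler persists as G[A]
rendered = html.escape(stored)     # what an output-escaping application emits
```

Here `stored` ends up as a complete script element even though the filter ran, while `rendered` is inert text because escaping at output time does not depend on recognizing the payload's shape.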

In his presentation "Next Generation Cross Site Scripting Worms" at OWASP NYC AppSec 2008, Arshan Dabirsiaghi surmised that vulnerability to this attack would eventually surface in popular search engines. Daniel Chechik and Anat Davidi confirmed Dabirsiaghi's conjecture in August 2013 at the DEF CON 21 security conference, where their presentation "Utilizing Popular Websites for Malicious Purposes Using RDI" demonstrated such vulnerability in the Google Translate web application and in Yahoo! cached page results.