Measuring the Perpetrators and Funders of Typosquatting

Abstract

We describe a method for identifying "typosquatting", the intentional registration of misspellings of popular website addresses. We estimate that at least 938,000 typosquatting domains target the top 3,264 .com sites, and we crawl more than 285,000 of these domains to analyze their revenue sources. We find that 80% are supported by pay-per-click ads, often advertising the correctly spelled domain and its competitors. Another 20% include static redirection to other sites. We present an automated technique that uncovered 75 otherwise legitimate websites which benefited from direct links from thousands of misspellings of competing websites. Using regression analysis, we find that websites in categories with higher pay-per-click ad prices face more typosquatting registrations, indicating that ad platforms such as Google AdWords exacerbate
typosquatting. However, our investigations also confirm the feasibility of significantly reducing typosquatting. We find that typosquatting is highly concentrated: of typo domains showing Google ads, 63% use one of five advertising IDs, and some large name servers host typosquatting domains as much as four times as often as the web as a whole.

This note provides an overview of the Harvard Business School course "The Online Economy: Strategy and Entrepreneurship." It covers the framework for the course, key principles within each course module, and a synopsis of each case, along with the lessons the case is meant to convey.