Exploring the Geography of WorldBank.org

Molly Norris

“Once we become critical of the assumption that the Web is a neutral repository of information, the structure of the Web becomes much more interesting.” – M.H. Jackson, 1997

Absences speak volumes, and yet interpreting information gaps online has produced only muffled truths. Studies on the geographical origin of Internet content have shown that old divides between rich and poor countries repeat themselves online. For example, Google’s user-generated content, academic journal citations and the authorship of Wikipedia entries all tilt heavily toward the wealthier global north. Admittedly, exploring digital landscapes is far less adventurous than the globetrotting variety. However, these journeys allow us to peer not at the contours of physical unknowns, but at the shapes molded by the collective human brain: its associations, biases and desires, codified in the traces we leave behind when transacting in cyberspace.

And so I set forth on a small exploratory study of my own. I wanted to make a first pass at charting the informational disparities within one of global citizenry’s largest projects geared towards egalitarianism: the international aid and economic development industry. I chose worldbank.org as the single best available database, given its worldwide content coverage, its peak traffic relative to its competitors and its twin missions: not only to create a world free of poverty, but also to maintain a website that serves as the number one source of information on this cause. An enterprise website is more than the face of an organization, as some in public relations suggest. It is the cognitive expression of the organization’s values and intent. What image of the world would the self-described “knowledge bank” reveal, as mediated by its staff online?

Instead of examining knowledge output per region, I used hyperlink counts as a measure of volume and, more relevantly, as an indicator of visibility. Halavais describes hyperlink networks, a basic organizational element of the Internet, as increasingly “meaningful, malleable and powerful.” Hyperlink networks are meaningful as mirrors of the human associative brain at work in a digital environment. Once aggregated, hyperlinks record the relational connections that Internet users make between pieces of information, at both the unconscious and the intentional level. Hyperlink networks are highly malleable in that they are susceptible to being reordered and manipulated by actors within the network once network position is observed and made valuable. (Restricting the crawl to a single network neutralizes much of this risk, since gaming the system through search engine optimization occurs between websites, rarely within them.) Finally, hyperlink networks are powerful enforcers of structural relationships between information. As Manuel Castells explained in “The Rise of the Network Society,” information is the raw material for much of the work that goes on in the global economy’s new technological paradigm. The current domination of digital material by wealthier countries lends some support to this claim.

I built the hyperlink network with Thelwall’s SocSciBot. The crawl began at worldbank.org and cast a net over all pages within two clicks of the homepage. This resulted in 1,180 valid HTML pages featuring 283,787 links, of which 751 pointed to specific regional content once duplicates, machine-generated pages and corporate pages had been removed. The regional links were discovered through manual filtration based on URL structure: URLs that included regional keywords were selected and then evaluated for content. I ran a quick regression to make sure link counts weren’t merely a function of variables related to the organization’s work. I specifically controlled for a region’s population, poverty rate and the dollar amount of World Bank-financed activities. None of these variables were statistically significant predictors of the number of links to regional content. Seemingly, the study would be able to capture the raw connections between data, free from the obvious distortions brought about by different regional characteristics.
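For readers who want to picture the procedure, here is a minimal Python sketch of the two steps described above: collecting every page within two clicks of a start page, then counting links whose URLs contain a regional keyword. It is not SocSciBot’s code and not the manual filtration used in the study. The function names, the example.org URLs and the keyword lists are all hypothetical, and the link graph is a small in-memory dictionary rather than a live crawl.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical keyword lists; the study filtered URLs by hand.
REGION_KEYWORDS = {
    "Africa": ("africa",),
    "East Asia Pacific": ("east-asia", "eastasia"),
    "South Asia": ("south-asia", "southasia"),
}


def pages_within(link_graph, start, max_clicks=2):
    """Breadth-first walk: map each reachable page to its click distance."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if depth[page] == max_clicks:
            continue  # do not follow links beyond the click limit
        for target in link_graph.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth


def region_of(url):
    """Return the first region whose keyword appears in the URL path."""
    path = urlparse(url).path.lower()
    for region, keywords in REGION_KEYWORDS.items():
        if any(keyword in path for keyword in keywords):
            return region
    return None


def count_regional_links(link_graph, start, max_clicks=2):
    """Count de-duplicated links to regional content on reachable pages."""
    counts = {region: 0 for region in REGION_KEYWORDS}
    seen = set()
    for page in pages_within(link_graph, start, max_clicks):
        for target in link_graph.get(page, []):
            if (page, target) in seen:
                continue  # the same link repeated on one page counts once
            seen.add((page, target))
            region = region_of(target)
            if region is not None:
                counts[region] += 1
    return counts


# Toy link graph with made-up URLs, standing in for crawled pages.
TOY_GRAPH = {
    "https://example.org/": [
        "https://example.org/news",
        "https://example.org/en/region/africa",
    ],
    "https://example.org/news": [
        "https://example.org/en/region/south-asia",
        "https://example.org/en/region/africa",
        "https://example.org/en/region/africa",
    ],
    "https://example.org/en/region/africa": ["https://example.org/deep"],
    "https://example.org/deep": ["https://example.org/too-deep"],
    "https://example.org/too-deep": [
        "https://example.org/en/region/east-asia-pacific",
    ],
}

regional_counts = count_regional_links(TOY_GRAPH, "https://example.org/")
```

In this toy graph the East Asia Pacific link sits on a page three clicks from the start, so it is never counted. That is the kind of blind spot a two-click limit builds into any tally of this sort.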

Hyperlinks by Region on WorldBank.org

The findings reveal that Africa’s link count of 148 is closer to the mean (125) than that of any other region; Africa is not marginalized within worldbank.org. This is surprising, considering that other information geographies have shown Africa pushed to the extreme periphery. In a mapping of academic knowledge based on the JCR database, Switzerland is represented as three times the size of the entire continent of Africa. In geo-tagged Wikipedia articles, there is more than twice as much content about France as there is about Africa (according to one study). The World Bank’s website provides a counterexample to the picture of a digitally excluded Africa.

Elsewhere, the comparison between the East Asia Pacific region and South Asia presents a notable contradiction. East Asia is underrepresented relative to the sample, especially when compared to South Asia. Each region hosts a behemoth, China and India respectively, and the two have similarly high populations, poverty counts and amounts of financed activities. And yet World Bank staff appear more willing to link to South Asian than to East Asian content. This disparity brings up awkward questions about preferred narratives and politicization within the development industry. Is the questionable status of China and other Asian Tigers as recipients of development financing echoed online? Or does South Asia just prove more salient in development knowledge? It is hard to know what this absence is saying. But as I said, this is just the initial sketch of a new territory: the informational landscape of international development online.