The Link

Issues with the object of study

There are at least three dominant approaches to studying hyperlinks: hypertext theory (Landow, 1994), small world and path theory (Watts, 1999), and associational sociology (Park and Thelwall, 2003). To literary theorists of hypertext, sets of hyperlinks form a multitude of distinct pathways through text. The surfer, or clicking text navigator, may be said to author a story by choosing routes (multiple clicks) through the text (Elmer, 2001). Thus the story told through link navigation is of interest. For small world theorists, the links that form paths show distance between actors. Social network analysts draw on this thinking about pathways, and zoom in on how the ties, uni-directional or bi-directional, position actors (Krebs, 2004). A special vocabulary has been developed to characterize an actor's position, especially an actor's centrality, within a network. For example, an actor is 'highly between' if there is a high probability that other actors must pass through that actor to reach each other. To associational sociologists, at least as the approach is described below, links matter for a different reason. As with social network analysis, the interest is in actor positioning, but not necessarily in terms of distance from one another, or the means by which an actor may be reached through 'networking.' Rather, ties are reputational indicators. Ties, in both their quantities and their types, may be said to define actor standing. Additionally, the approach does not assume that the ties between actors are friendly, or otherwise have utility, in the sense of providing empowering pathways, or clues for successful networking.

Approach

Here we seek to remain in the tradition of associational sociology, and adapt it to the specificities of the Web, looking into how an actor may be characterized by the types of hyperlinks given and received. In actor profiling, we show, initially, whether particular domains (.gov, .com, .org, etc.) are adequate in characterizing actors' interlinkings (cf. Beaulieu, 2005). In previous research we found linking tendencies among domain types: governments tend to link to other governmental sites only, while non-governmental sites tend to link to a variety of sites, occasionally including critics. Corporate websites tend not to link. Academic and educational sites show links with partners and initiatives. In some sense, hyperlinks show an everyday politics of association. (See Figure One.)

When characterizing an actor according to inlinks and outlinks, one notices whether there is some divergence from the norms and, more generally, whether particular links that are received may say something about an actor's reputation. A non-governmental organization receiving a link from a governmental site would be telling, for example. (See Figure Two.)
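The profiling idea can be illustrated with a minimal sketch that tallies an actor's inlinks and outlinks by domain type. This is not the actor profiler tool itself; the function names and the example URLs are hypothetical, and the top-level domain is used as a rough stand-in for the domain types discussed above.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_type(url: str) -> str:
    """Return the top-level domain of a URL, e.g. 'gov' or 'org'."""
    host = urlparse(url).hostname or ""
    return host.rsplit(".", 1)[-1] if "." in host else "unknown"


def profile(inlinks: list[str], outlinks: list[str]) -> dict[str, Counter]:
    """Count inlinks and outlinks per domain type for one actor."""
    return {
        "inlinks": Counter(domain_type(u) for u in inlinks),
        "outlinks": Counter(domain_type(u) for u in outlinks),
    }


# Hypothetical links for an illustrative non-governmental organization.
ngo_inlinks = [
    "https://agency.example.gov/partners",
    "https://news.example.com/story",
    "https://other.example.org/links",
]
ngo_outlinks = [
    "https://ally.example.org/",
    "https://critic.example.org/report",
    "https://ministry.example.gov/",
]

ngo_profile = profile(ngo_inlinks, ngo_outlinks)
print("inlinks by domain type:", dict(ngo_profile["inlinks"]))
print("outlinks by domain type:", dict(ngo_profile["outlinks"]))
```

In this made-up profile, the single inlink from a .gov site is the kind of received link that would stand out against the tendency of governmental sites to link only to one another.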

Method

The Issue Crawler software, with particular allied tools, has been developed specifically to perform hyperlink analysis, albeit with a technique borrowed from citation analysis. (See the Issue Crawler instructions of use as well as the scenarios of use.) Its built-in analytical approach is co-link analysis, whereby a site must receive at least two inlinks from the other sites in the network to be included. (The Issue Crawler also has an advanced feature that allows for snowball analysis instead of co-link analysis, a more traditional means for demarcating a social network (Garrido and Halavais, 2003).)
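The co-link criterion can be sketched in a few lines. The following is a simplified, single-pass illustration, not the Issue Crawler's actual implementation: crawl iterations, depth and other settings are not modelled, and the function name, the `min_inlinks` parameter and the example sites are assumptions made for the sketch.

```python
from collections import defaultdict


def colinked_sites(outlinks: dict[str, set[str]], min_inlinks: int = 2) -> set[str]:
    """Keep sites that receive links from at least `min_inlinks` distinct other sites."""
    linkers = defaultdict(set)
    for source, targets in outlinks.items():
        for target in targets:
            if target != source:  # ignore self-links
                linkers[target].add(source)
    return {site for site, sources in linkers.items() if len(sources) >= min_inlinks}


# Hypothetical outlinks found when crawling three starting points.
crawl_outlinks = {
    "a.example.org": {"b.example.org", "c.example.org"},
    "b.example.org": {"c.example.org", "d.example.org"},
    "c.example.org": {"b.example.org"},
}

network = colinked_sites(crawl_outlinks)
print(sorted(network))
```

Here d.example.org is dropped because only one site links to it, while b.example.org and c.example.org are each linked by two distinct sites and are retained.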

Once a network is located with the Issue Crawler, individual actors may be profiled, using the actor profiler tool. The actor profiler shows, in a graphic, the inlinks and outlinks of the top ten network actors [1]. It also shows each actor's Google ranking for a particular query, normally the issue in which the network actors are engaged, or more generally the substance that organizes the network. The researcher defines the issue.

For researchers who are not well-versed in the issue area of the network, there are means available to get the gist of the significant keywords used by the network actors. One Issue Crawler allied tool, called Issue Discovery, takes as its input the URL of an Issue Crawler XML file and, with the help of the Yahoo term extraction service as well as other weighting metrics, supplies a list of keywords. As the tool's name indicates, it seeks to 'discover' the terms, and its output should be used as an indication of the network substance, rather than as the definitive keywords. Forms of content analysis and other qualitative techniques are more traditional means to research substance.

[1] For raw inlink data generation, the Yahoo! inlink scraper may be used to gather a list of inlinks to one or more sites.

Sample project

Mapping the Global Human Rights Network

Mapping the global human rights network is a substantial undertaking. The following provides a method to define the seed URLs for use in the Issue Crawler, and also a means by which to analyse the network. In the analysis, we research the overall 'issue commitments' of the global human rights network. (The actor profiler, discussed above, is not used.) We are interested in which issues are most significant to the network, and which issues are receiving relatively scant attention. The analysis therefore concludes with ranked lists (as well as tag clouds) of the network's issue agenda, if you will.

To begin, how does one determine the starting points, or seed URLs, for locating the global human rights network with the Issue Crawler? In the project, researchers identified three authoritative lists of key human rights actors worldwide, and harvested the organization lists (with their links) from the sites. The effort was to find actor lists that together covered the field most broadly, also in terms of regional coverage, mainly from a north-south point of view. The three authoritative sources are the United Nations Universal Declaration of Human Rights 60th anniversary Website (international view), Amnesty International (western view) and Choike, the southern civil society portal (southern view). Each website (UDHR 60, Amnesty and Choike) has a list of key human rights organizations, but the Websites are organized differently, with particular database structures, making the harvesting of links rather difficult. Once the lists of key actors are located, use the link ripper to harvest, or collect, all the URLs. This may be time-consuming.

The three sets of URLs - one from the UDHR 60, one from Amnesty, and one from Choike - are the starting points for three separate crawls. Thus enter the UDHR 60 list into the Issue Crawler, then name and launch the crawl. Do the same for the other lists. Once the crawls are completed, retrieve the results of each crawl, i.e., the network actors. (Network actor URLs are available as lists in the Issue Crawler network details pages, or via the extract URLs sub-tool.) Triangulate the three lists, meaning determine which sites are in at least two of the networks. This triangulation is performed using the analyse or compare lists tool, or the more general triangulation tool. Once the lists are triangulated, the researcher has a list of the top human rights actors worldwide, according to the co-link analysis of the Issue Crawler and the triangulation method or threshold analysis, whereby an actor makes the final list if it appears in at least two networks. This is the 'core list' of human rights actors.
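The triangulation step amounts to a threshold count across the crawl results. The following is a minimal sketch of that logic, not the triangulation tool itself; the function name, the `threshold` parameter and the placeholder actor lists are assumptions for illustration.

```python
from collections import Counter


def triangulate(networks: dict[str, list[str]], threshold: int = 2) -> list[str]:
    """Return the sites that appear in at least `threshold` of the networks."""
    counts = Counter()
    for actors in networks.values():
        counts.update(set(actors))  # count each site once per network
    return sorted(site for site, n in counts.items() if n >= threshold)


# Placeholder crawl results; real lists would come from the three crawls.
crawl_results = {
    "udhr60": ["one.example.org", "two.example.org", "three.example.org"],
    "amnesty": ["two.example.org", "three.example.org", "four.example.org"],
    "choike": ["three.example.org", "five.example.org"],
}

core_list = triangulate(crawl_results)
print(core_list)
```

In practice the URLs would likely need to be normalized (for example, reduced to host names) before comparison, so that the same actor is not counted as different sites across the lists.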

See one rendition of the global human rights issue network, whereby the triangulated results (of all three lists) were re-entered into the Issue Crawler for a global view of the overall network: humanrights_udhr_choike_ai.pdf

Issue Commitment Research

What are the issues that are most significant to the network? What is the network's global human rights issue 'agenda'? Here the researchers build an issue dictionary. Human rights actors, like many NGOs and international organizations, often have issue lists on their sites, be it 'key issues', 'issues on the U.N. agenda' or campaigns with issue language. Browse each of the sites in the original actor lists -- all three lists -- and make a list of key issue language, that is, the listed issues or the campaigns. Allow the organizations to define the issues, i.e., take over their language. This is manual work. If in haste, one may wish to gain only a sense of the issue language per actor, using the Issue Discovery tool.

Once there is a list of issue language, employ the Google Scraper, an Issue Crawler allied tool. Query each site from the 'core list' for each issue from the issue dictionary.
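The querying step pairs every site on the core list with every issue in the dictionary, and the returned counts can then be summed into a ranked list. The sketch below illustrates that bookkeeping only and does not contact any search engine. The query format (a `site:` restriction with the issue as a quoted phrase) is an assumption rather than a documented feature of the Google Scraper, and the sites, issues and hit counts are made-up placeholders, not results.

```python
from collections import Counter
from itertools import product


def build_queries(core_list: list[str], issue_dictionary: list[str]) -> list[tuple[str, str, str]]:
    """Pair every site with every issue and format one query per pair."""
    return [
        (site, issue, f'site:{site} "{issue}"')  # assumed query format
        for site, issue in product(core_list, issue_dictionary)
    ]


def rank_issues(hit_counts: dict[tuple[str, str], int]) -> list[tuple[str, int]]:
    """Sum hits per issue across all sites and rank issues by total."""
    totals = Counter()
    for (_site, issue), hits in hit_counts.items():
        totals[issue] += hits
    return totals.most_common()


# Placeholder inputs for illustration only.
sites = ["one.example.org", "two.example.org"]
issues = ["torture", "death penalty"]

queries = build_queries(sites, issues)
for _site, _issue, query in queries:
    print(query)

# Made-up hit counts standing in for scraper output.
placeholder_hits = {
    ("one.example.org", "torture"): 12,
    ("one.example.org", "death penalty"): 3,
    ("two.example.org", "torture"): 5,
    ("two.example.org", "death penalty"): 9,
}
ranked = rank_issues(placeholder_hits)
print(ranked)
```

Summing raw counts favours large sites; depending on the research question, one might instead count the number of actors mentioning each issue at all.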