A codebook view of history: reflections on working with Seshat

I am a graduate student in social anthropology at the University of Oxford and have been involved with Seshat: Global History Databank as a research assistant since the summer of 2015.

Assistants conduct literature reviews, working with university collections and external databases such as the Human Relations Area Files compendium of ethnographic texts. We extract information relevant to the project and code variables for the different polities that have governed a particular geographic area over time, tracking changes in sociopolitical organization, means of communication, the built environment, warfare, ritual, and the means of production. We also add the contextual information needed to interpret a code.

In social statistics, we obviously worry about which questions to ask and how to formulate hypotheses. But on the level of data collection, there are other problems: without good data, even the most elegant analytical models are of little value to an empirical science. The first problem is: what to count? How should we operationalize abstract concepts such as social complexity or dysphoric rituals? We break them down into smaller entities or indicators that can be measured directly, i.e. we write a codebook. The second one is: how to count? We have to reconcile the coding conventions that we have chosen with the messiness of real-world data.
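The two questions can be made concrete with a small sketch. It is purely illustrative and does not reproduce Seshat's actual codebook or software: the names `CODEBOOK`, `Code`, and `check`, the value set, and the polity label are all hypothetical, and the indicators are simply examples of the kind a codebook might list.

```python
from dataclasses import dataclass

# Hypothetical value set, assumed for illustration only.
ALLOWED_VALUES = {"present", "absent", "unknown"}

# "What to count?": abstract concepts broken down into concrete indicators.
CODEBOOK = {
    "means of communication": ["postal system"],
    "modes of exchange": ["paper currency"],
}


@dataclass(frozen=True)
class Code:
    polity: str     # the polity being coded (placeholder label below)
    variable: str   # a concrete indicator from the codebook
    value: str      # one of ALLOWED_VALUES
    note: str = ""  # contextual information needed to interpret the code


def check(code: Code, concept: str) -> Code:
    """'How to count?': accept a code only if it follows the conventions."""
    if code.variable not in CODEBOOK[concept]:
        raise ValueError(f"{code.variable!r} is not an indicator of {concept!r}")
    if code.value not in ALLOWED_VALUES:
        raise ValueError(f"{code.value!r} is not an allowed value")
    return code


example = check(
    Code("Polity A", "postal system", "absent",
         note="No couriers are mentioned in the sources consulted."),
    "means of communication",
)
```

A code that does not fit the conventions, such as filing "paper currency" under "means of communication", is rejected rather than silently stored. In practice that rejection is where the messiness of the sources meets the codebook.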

Social anthropologists are usually quick to point out the reductionism inherent in statistical approaches and to dismiss them for it. Such approaches may be imperfect, but they are necessary to identify and understand recurring patterns and general trends, especially on the level of cross-cultural comparison. The codebook filters the concrete settings and situations described in the literature, omitting information considered irrelevant to the project, though it might be interesting in other ways. It requires coders to think through aspects of social organization in their own right and to think about them through the sets of variables employed in the project.

One common problem has to do with the way information is organized in the ethnographic and historical records. Given the need to operationalize more abstract concepts (such as means of communication or modes of exchange), coders deal with a long list of more concrete entities (such as the existence of a postal system or the use of paper currency). Some variables are surprisingly hard to code due to a lack of detail in the sources. This is often the case for quantifiable information and for material culture as expressed in the tools and techniques used in warfare, construction, and agriculture. For example, an ethnographer may be more concerned with the general characteristics of swidden cultivation than with the specific tools used and the materials they were made of. Similarly, an author might take an interest in the contribution of relatives to sowing and harvesting, but tell us little about the ‘nuts and bolts’ of cultivation itself. Or a researcher mentions the use of plows, but does not describe which kind of plow was used in a community, whether it had a moldboard, and whether plow animals were present. Assistants often have to infer such details from more general descriptions. Despite their positivistic outlook, even traditional village ethnographies can be vague when it comes to those details: the authors did not have statisticians in mind when writing their books.

Then there is the codebook itself and the way its variables correspond to real-world data. Some ways to govern, fight, and communicate do not easily fit into the categories provided in the format, necessitating revisions and the introduction of additional variables. Even the best-defined variables can be counted and coded in multiple ways, given the considerable plasticity of human social life, and the evidence might well fit a code in one way but not another. For example, imperial bureaucracies often relied on a degree of legal pluralism, especially under the conditions of indirect rule. While elders often continued to resolve disputes between the inhabitants of ‘native’ villages, plaintiffs sometimes took cases that could not be resolved locally to colonial courts. Furthermore, colonial governments often installed officials recruited from the local population. As intermediaries they played a complex role, enforcing colonial regulations on the village level, but also contributing to the formalization of local norms and procedures. The resulting systems were neither fully bureaucratic nor entirely ‘customary’. These complexities invite discussion, modification, and refinement of the codebook.

Another problem has to do with social science more generally. Coders have to navigate the familiar problem of bias in the ethnographic record. This goes beyond ethnocentrism. The type of settlement a researcher worked in and the informants they talked to could bias their work towards more or less hierarchy, bureaucracy, or formal law than found in other places in the same region. Then there is the very real possibility of misunderstandings between the original researcher and their collaborators or between the author and the coder. Preconceptions about the material may intrude through banal coincidences as well, as the sources read first inform the way the reader approaches additional information. The ‘imagined audience’ that an author addresses in their writing exacerbates the problem and may well result in contradictions between sources. For example, an ethnographer working on ‘native’ legal procedure may feel a need to stress the regular, formal, and standardized aspects of dispute resolution in order to counteract stereotypes about the irrationality of ‘primitives’. Another might stress flexibility in the application of norms and the ability of elders to adapt to new circumstances in order to challenge the notion of ‘custom’ as stagnant, unchanging tradition. This is why expert feedback is so important. Experts are historians and anthropologists with extensive knowledge of a geographical area, time period, or society that we code for. They answer our questions and verify our codes, helping us to fill in remaining gaps in the database and to deal with contradictions in the literature.

‘Chiefs of the Six Nations at Brantford, Canada, explaining their wampum belts to Horatio Hale September 14, 1871’ Source: Creative Commons

Working with Seshat has been an interesting challenge so far. The process invites assistants to think between the disciplines and about the different ways of doing anthropology. Mediating between the qualitative data contained in the ethnographic and historical records on the one hand and the requirements of quantitative research on the other can be an exercise in restraint and can teach sensitivity to incongruence, even though scholars more inclined towards the humanities may suppose otherwise. At the same time, one has to guard against losing sight of the major themes and interests of the project and against destructive second-guessing. As there are rarely definitive answers to the day-to-day challenges along the way, it has been and continues to be an exercise in searching for the proper balance between the two.

Notes for Editors:

For further information on Seshat: Global History Databank or to get involved with the project contact Jill Levine (jlevine@evolution-institute.org)

Cite this page: “A codebook view of history: reflections on working with Seshat.” http://seshatdatabank.info/reflections-on-working-with-seshat.
