Category: Intro. to DH

While mapping is one viable way to represent your data visually, there are other visualization methods that can show different kinds of relationships between points in a dataset. One such method is network analysis. A network analysis shows a specific relationship between two data points and represents it visually in a way that a basic map cannot. If a connection exists between two points, as I will show later using a dataset of interpersonal aid during the Holocaust, the visualization draws a link between them. With enough data and enough connections, the network analysis resembles a web, showing the differing degrees of connection between many data points.

Like a map, a network analysis can be created using both Google Fusion Tables and Palladio. Each has its own strengths and weaknesses, but both offer the ability to show your data in this way. For my example, I will be using a dataset that shows who gave aid, who received aid, and what type of aid it was during a specific set of years during the Holocaust, which can be found here.
The image on the left is a network analysis of this data using basic parameters: those who received the aid, represented by the bolded points, and those who gave the aid, the lighter points. The visualization shows a very intricate, connected web of people all giving aid within a relatively small group.
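To make the idea concrete, here is a minimal sketch of what a tool like Palladio does when it builds a network from a table: each row becomes a link between two people. The names and pairs below are invented for illustration and are not taken from the actual dataset.

```python
# Sketch: turning giver/receiver rows into a network (invented data).
from collections import defaultdict

rows = [
    ("Anna", "Miriam"),   # (giver, receiver)
    ("Anna", "Jakob"),
    ("Piotr", "Miriam"),
]

# Build an adjacency list: every aid relationship links two points.
graph = defaultdict(set)
for giver, receiver in rows:
    graph[giver].add(receiver)
    graph[receiver].add(giver)

# The degree of each node shows how connected that person is in the web.
degrees = {person: len(links) for person, links in graph.items()}
print(degrees)  # Anna and Miriam each connect to two others
```

With a real dataset of hundreds of rows, the same adjacency structure is what produces the web-like picture described above.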

If you want to show a different view of the same data, for example the gender of those who gave aid, you simply change the target data point to the field you want to show. The network on the right reflects this target: the bolded points represent the gender of the aid providers, and the clustering around the “1” point tells us that the majority of the providers were male.

Another option for making a network analysis like this is Google Fusion Tables. As the example below shows, the basic visualization looks largely the same as Palladio’s, the main difference being the ability to color the data points rather than bold them. However, Palladio offers one very important option that Google Fusion Tables does not: the ability to import a second sheet of data into the current project.

Because Google Fusion Tables does not allow this, you cannot, at least for this specific dataset, accurately scale the point sizes based on the amount of aid received, since Fusion Tables does not have access to all of the information. A potential workaround is to keep all of the necessary data on a single sheet, avoiding the need for two sheets when the tool only allows a single upload.
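The single-sheet workaround amounts to joining the second sheet's column onto the first before uploading. A hypothetical sketch of that join, with invented column names and values, looks like this:

```python
# Hypothetical sketch: attaching a second sheet's column to the first,
# like a spreadsheet VLOOKUP, so one file carries every field.
aid_rows = [
    {"recipient": "Miriam", "giver": "Anna"},
    {"recipient": "Jakob", "giver": "Anna"},
]
amount_by_recipient = {"Miriam": 3, "Jakob": 1}  # from the second sheet

# Copy each row and add the matching amount from the lookup table.
combined = [
    {**row, "amount_of_aid": amount_by_recipient.get(row["recipient"], 0)}
    for row in aid_rows
]
print(combined[0])  # {'recipient': 'Miriam', 'giver': 'Anna', 'amount_of_aid': 3}
```

The resulting single table has every field a tool like Fusion Tables would need to scale point sizes.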

One argument against this type of visualization is that its ease of use lets almost any dataset be mapped this way. In an article titled Demystifying Networks, Scott Weingart makes this point, noting that almost any dataset can be visualized as a network; that does not, in his opinion, mean it should be. So while in some cases a network analysis can tell you a lot about your dataset, in other cases, if the user is not careful, the choice of methodologies can skew the data and create relationships that are not really there.

Today’s world is driven by data. From big corporations like Google collecting user data to make browsing the internet more user-friendly (and to target its users), to small-scale data collection done by a college student for a project, data is more than a spreadsheet filled with numbers and letters. Often, for a dataset to have an effect on the viewer, or to provide the viewer additional context, it needs some form of visual representation. There are many types of visual representations, from graphs to charts, that can be employed to better convey the data to the end viewer. However, for datasets that focus on specific locations and how those locations relate to each other, another visual tool can also be used.

If you have a dataset that is based on location, such as the one I will use in my example, a map may be a better choice of visual representation than a simple chart or graph, because it gives the project a sense of place. For example, if you have data organized by location, it is more useful for the viewer to see the distribution of that data on a map of the area than to read a list of locations that is neither precise nor visually interesting. In addition, data mapped in this way can contribute to a better understanding of what is now called spatial history. As Zephyr Frank explains in his article, the study of spatial history is assisted by visualizations, as they allow for the observation of patterns and trends. Both abilities, viewing a trend and interpreting a form of spatial history, will be apparent in my upcoming visualization example.

Visualization can be done in a variety of ways. One such tool is Palladio, which maps the location of each item and can even provide a timeline if your dataset has the necessary information. For example, using the dataset of the Cushman Collection, a collection of photographs taken around the contiguous United States between 1938 and 1955, you can map the geolocations of where the photographs were taken, giving the viewer a sense of the collection’s scale and of where the popular locations were. An example of a Palladio mapping of this dataset can be seen in the photograph above. In this visualization, the viewer can clearly see that many of the photographs were taken on both coasts and in the central and southern United States, with only a small portion taken in the northern states. Seeing where the photos were taken, and how concentrated their locations are across the United States, contributes to the spatial history of those particular areas.

Another option for visualization software is Google Fusion Tables. As seen here on the right, the basic data modeling of the same collection in Google Fusion Tables looks very similar to Palladio’s. However, unlike Palladio, Google Fusion Tables offers a few alternate visualizations that are very interesting to experiment with. For example, it allows you to create a heat map based on the locations of the data, as shown below. This is the exact same information as shown on the other two maps, only presented in a different way. However, as interesting as the heat map is, there is a clear problem with using it as the only visualization of the data. Because it maps the data based on the concentration of points in certain locations, areas with fewer data points are sometimes left out. In this particular example, both the Pacific Northwest and Texas have no data points in the heat map view, while the other views show points in those areas. It is important to take this into consideration when creating a visualization: picking software that does not accurately model your data can produce a visual representation that misrepresents it, rendering the visualization unreliable and inaccurate.
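A toy example can illustrate why a density-based heat map hides sparse areas. The coordinates and the density threshold below are made up; the point is only that any rendering rule based on concentration will drop isolated points, just as the lone points in the Pacific Northwest and Texas disappear.

```python
# Toy illustration: density binning drops isolated points (invented data).
from collections import Counter

points = [(1, 1), (1, 1), (1, 2), (9, 9)]  # three clustered, one isolated

# Bin points into integer grid cells and count how many fall in each.
cells = Counter((round(x), round(y)) for x, y in points)

# A heat map that only renders cells above a density threshold
# silently omits the lone point at (9, 9).
threshold = 2
visible = {cell for cell, n in cells.items() if n >= threshold}
print(visible)  # the isolated point's cell is missing
```

The same data plotted as individual markers would show all four points, which is exactly the discrepancy between the heat map and the other two views.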

The use of digital tools and internet technologies is not a foreign concept to me. As I have stated in the past, technology is one of my biggest hobbies. Still, the recent project for our class using Omeka was an experience I must say was very interesting and thought-provoking regarding what I could do with this program, or one like it, in the future. While creating our collection, which can be found at http://perpetua-felicitas.carrieschroeder.org/, I was surprised by how much work went into even a small-scale project like this one. Searching for relevant items that not only related to our topic but also enhanced it was a challenge, especially when you consider that each item must come from a source that can verify its authenticity, meaning no uncredited Google Image search photos of the kind that are so common on the internet today.

To speak for a moment about Omeka as a tool before I go into detail about our project: I think it allows you to do more in considerably less time than building a similar webpage from scratch using HTML and CSS. You could accomplish virtually the same thing by writing a basic clone of it, but having some experience writing HTML and CSS myself, I can say it would be a significant amount of work, especially if you wanted the same functionality that Omeka provides out of the box. That is not to say it would be impossible to build something like the exhibit pages fairly quickly in HTML and CSS, as they could be laid out fairly easily with a framework such as Bootstrap. That being said, if given the option I would use Omeka, as it would be up and running faster, with more functionality. Another reason I would choose Omeka is that it is open-source software, which means you can modify your Omeka site to add functionality or edit the design, provided you have an existing knowledge of HTML, CSS, and another language such as JavaScript.

Overall, I enjoyed using Omeka and making my contribution to the class exhibit; however, that is not to say there were no challenges in creating it. The one that surprised me most was the collection and classification of the metadata for each item. When you think of metadata in its most basic form, simply “data about data,” as Anne Gilliland describes it in her article titled Setting the Stage, it sounds simple enough. But all of the metadata must follow the same format site-wide if you want your collection to be searchable by keyword or date, and it must also give correct credit to all creators and clearly communicate each item’s copyright status. It is clear that this is not as simple as entering the date and title of a work into a blank field.

Perhaps the most enjoyable part of the whole process for me was creating the exhibit. It was very interesting to see how we could take a unified theme about a particular story and create so many exhibits that gave ample background not only on the story itself, but also on how its details related to the time in which it happened, providing context for how martyrdom was viewed in Roman society. I think that is the big draw of this program: it is not just a simple collection of images and text relating to a bigger picture; rather, the ability to tell these stories about the collection through exhibits effectively paints the picture for the viewer.

Omeka is clearly designed to display collections, meaning it is only effective if you have objects (images, photographs, media, videos, audio, etc.) to present in an organized fashion. I would describe Omeka as the iPhone of websites, in the sense that iPhones are user-friendly with standard designs, while Android phones require more prior knowledge to customize. Omeka is not your site if you want complete control over how it looks; though this is restrictive, it is very user-friendly, because the layouts and fields are provided for you without your having to learn something like HTML.

Searching for items to upload was definitely frustrating when I hit a copyright wall. For example, Triumph of Faith from Bridgeman Images is under license, and it would have been a good addition to the 19th-21st Century Popular Culture Collection. Like Terras, I agree that it is difficult to reuse or remix digitized content because of copyright or unclear rights. The solution she insists on is to declutter the vast number of images and assemble a good amount of quality content (in resolution and material) under the public domain, because it’s all about sharing, right, Mark Sample?

Tags can be helpful when searching for key terms, but the search engine provides a broader search.

Metadata is essential when it comes to searching. When metadata looks like this (not to call anyone out), it is certainly incomplete and ineffective. Omeka is extra helpful because it outlines which metadata fields to fill in, but that only works when the fields are complete. Moreover, the data must be uniform for searches to work.
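The completeness problem can be sketched in a few lines: before an item goes up, flag any required field left blank. The field list below follows the Dublin Core elements Omeka uses, but the specific set of "required" fields and the sample item are my own invention for illustration.

```python
# Hypothetical sketch: flagging incomplete item metadata before upload.
# Which fields count as "required" is an assumption for this example.
REQUIRED = ("Title", "Creator", "Date", "Rights")

def missing_fields(item):
    """Return the required fields an item leaves blank or omits."""
    return [f for f in REQUIRED if not item.get(f, "").strip()]

item = {"Title": "Triumph of Faith", "Creator": "", "Date": "1891"}
print(missing_fields(item))  # ['Creator', 'Rights']
```

A check like this, run over every item, is essentially what makes a collection uniformly searchable: any item that slips through with blank fields simply won't surface in keyword or date searches.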

Creating these collections and exhibits for Perpetua and Felicitas, from my perspective, embraces both sides of the practice-theory divide that Fitzpatrick describes. We are using digital technology (Omeka) to study traditional humanities objects (digitized artwork), and we are asking contemporary humanities questions (in response to Perpetua and Felicitas) to interpret digital objects (digitized artwork). Though we were building collections and exhibits by compiling and classifying information, in essence we are sharing information to further our understanding of Perpetua and Felicitas, and that, according to Sample, is the true spirit of Digital Humanities.

Let’s talk about information: digitized information. When the World Wide Web emerged, it offered a plethora of information, “unpoliced and unregulated,” open to all regardless of race or class. It was a possible channel for “those who had been silenced to have a voice.” You couldn’t prevent someone from accessing information on the web, and that was the great promise of the Internet; however, you could exclude diverse, cultural information from the web, and a clear trend has emerged. Earhart says that digital humanists skew toward traditional texts, thus excluding crucial work by women, people of color, and the GLBTQ community. She therefore poses the question in her article’s title: “Can Information [truly] be Unfettered?” (unfettered meaning free from restraint). Tellingly, the National Endowment for the Humanities awarded 141 DH grants in three years, with only 29 focused on diverse communities. Clearly, there has been an underwhelming spotlight on the preservation and recovery of diverse community texts. So the solution is obvious, but it must be said plainly: we must adopt a mindfulness toward cultural constructions in technology, or we will continue to exclude vital materials from digitization.

http://invisibleaustralians.org/faces/

Earhart’s persuasive argument relates to archives such as the one created by the Invisible Australians project. This online archive was created to reveal the “real face of White Australia.” Australia defined itself as a white man’s country, but reality was different, and the archive proves just that: the portraits are easily browsable, and you can click on any one of them and see one of the documents denying that person their place in Australia. Without researchers, and ourselves, being exposed to diversity and digging deeper into cultures, we would digitize an incomplete, false world.

This post mainly encompasses Earhart’s article and the archive because, frankly, after reading McGann’s piece, I still don’t know what he is referring to in Radiant Textuality. However, I do understand that he acknowledges the accessibility and flexibility information gains once it is computerized; but he asks us to give “serious, collective thought” to how we live and manage our lives and knowledge within these networks.