A GIS Based Approach to Historical Understanding

How, as historians, can we facilitate a physical understanding of the world in which we live, and of how it has been formed throughout its existence as well as our own? All human endeavors exist on a dual plane of time and space, and at each moment we are wrapped up in the cosmological game of survival that bleeds forth from the essence of a world in action. Historians, who are interested in the study of man’s experience in time and space, must also, as W. Gordon East reveals, be engaged in discussing the “problems of location.” (East, 1967, p.10) One way to understand these problems is to incorporate Geographic Information Systems (GIS) as a tool for answering the questions posed by the researcher. A GIS, as J. Scurry writes, is

“a computerized data management system used to capture, store, manage, retrieve, analyze, and display spatial information. […] GIS differs from other graphics systems in several respects. […] data are georeferenced to the coordinates of a particular projection system. This allows precise placement of features on the earth’s surface and maintains the spatial relationships between mapped features. As a result, commonly referenced data can be overlaid to determine relationships between data elements.” (What is GIS?)


In this post, I will show how historical questions can be answered by adopting GIS as a methodological tool for the advancement of historical understanding. To do this, I will begin by setting up the world in history in a spatial, temporal, and humanitarian context. I will then briefly examine two approaches to history, quantitative history and the work of Fernand Braudel of the Annales school, that have helped address the need for a geographical understanding of history, and consider how these approaches can benefit from the use of GIS. In the last section, which I intend to be the focus of this post, I will show how GIS can be incorporated as a historical approach. To accomplish this I will provide the reader with an understanding of GIS, give an overview of GIS scholarship, discuss the importance of data to GIS, and close with a sampling of scholarly projects that use GIS as a tool for enhancing historical work. So, why learn about GIS as a historical tool? Because as humanists, “we are drawn to issues of meaning, and space offers a way to understand fundamentally how we order our world.” (Bodenhamer, 2010, p.14)


The World In History

There are three major avenues for research in historical GIS work: the spatial, the temporal, and the humanitarian. Though each of these avenues is distinctive, none of them can be analyzed in isolation, and the reader will find many overlapping themes among them. We will also come back to these themes throughout the post because each of them enhances our understanding of the historical world and the realities therein.


The spatial world is the material construct of the world around us. It is made up of shapes, colors, depth, and myriad objects[1]. (Couclelis, 1992) Throughout history there have been numerous ontological frameworks for how we view the world around us. These frameworks help to explain man’s conception of the world he lives in. Predominating outlooks have changed throughout history, but the world in which these outlooks have taken place is metaphysically the same. The physical world provides us with obvious natural boundaries which have had a significant impact on how we have interacted with it. W. Gordon East, writing in 1967, observes that

“The best frontiers of separation, especially in the past, were afforded by the oceans, the deserts, mountain systems, marshy tracts, and forests, for the good reasons that such areas set obstacles to human movement and could not support a dense population.” (East, p.100)

Even though these frontiers offer protection, “the physical environment remains a veritable Pandora’s box, ever ready to burst open and to scatter its noxious contents.” (East, p.1) For all of man’s attempts to subjugate the earth, the earth has a way of bringing us back into awareness of our own existence.

Two major distinctions in worldview are between the Ptolemaic view and the Copernican view, which battled over the location of the earth in the universe. Speaking about the Medieval understanding of the Ptolemaic view of the world, Lovejoy writes “The world had a clear intelligible unity of structure, and not only definite shape, but what was deemed at once the simplest and most perfect shape, as had all the bodies composing it. It had no loose ends, no irregularities of outline.” (Lovejoy, 1960, p.101) Later on Lovejoy states that “It is sufficiently evident from such passages that the geocentric cosmography served rather for man’s humiliation than for his exaltation, and that Copernicanism was opposed partly on the ground that it assigned too dignified and lofty a position to his dwelling place.” (Lovejoy, p.102)


Another model of viewing the world involves land ownership, and the definitional arguments over property. Couclelis, speaking on the predominating view of space in relation to the Western tradition, informs us that

“Western culture is apparently unique in its treatment of land as property, as commodity capable of being bought, subdivided, exchanged, and sold at the market place. It is at this lowest level of real estate (from the Latin res, meaning thing), that we find the cultural grounding of the notion of space as object. Further up the hierarchy, at the level of countries, states, or nations, precise boundaries are needed again to determine what belongs to whom, who controls whom and what, and for what purpose.” (Couclelis, p.67)

Couclelis then goes on to give her readers six reasons why human territories often elude spatial analysis: they require a constant effort to establish and maintain; they are defined by a nexus of social relations rather than by intrinsic object properties; their internal structure changes not through the movement of anything physical, but through changes in social rules and ideas; they do not partition space, although they may share it; their intensity at any time varies from place to place; and finally, they are context and place specific. (Couclelis, p.68)

Geographical space, in many respects, is fluid. The arbitrary lines that man draws upon it have little meaning for the land as an entity in itself. Waldo Tobler, in 1970, invoked the “First Law of Geography” by stating that “everything is related to everything else, but near things are more related than distant things.” (Tobler, 1970) What this means for the geographical conception of space is that it is easier to relate events that take place in close proximity to one another than events that take place far away from one another. Couclelis tells us how we should conceptualize our sense of space as being in territorial relationships by stating that

“It is as if landmarks, places, and other geographic entities were defined in neither an absolute nor a relative, but in a relational space, where object identity itself is at least in part a function of the nexus of contextual relations with other objects.” (Couclelis, p.74)

By looking at Tobler’s Law and Couclelis’ “relational space” we see the need for having a conception of transitional changes of movement in our ontological framework.
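To make Tobler’s Law a little more concrete, here is a minimal sketch in Python. It measures the great-circle distance between places with the standard haversine formula and then turns that distance into a “relatedness” weight that shrinks as distance grows. The `relatedness` function, its `scale_km` parameter, and the rounded coordinates are my own assumptions for illustration; they do not come from Tobler or Couclelis.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def relatedness(distance_km, scale_km=100.0):
    """Hypothetical distance-decay weight: 1.0 at zero distance, falling toward 0."""
    return 1.0 / (1.0 + distance_km / scale_km)

# Approximate (latitude, longitude) pairs, rounded for illustration only
rome = (41.9, 12.5)
naples = (40.85, 14.27)
london = (51.5, -0.13)

near = relatedness(haversine_km(*rome, *naples))
far = relatedness(haversine_km(*rome, *london))
print(near > far)  # the nearer pair receives the larger weight
```

Any decreasing function of distance would serve here; the point is only that proximity can be turned into a number that a GIS can compare across many pairs of places.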

Though it changes shape, moves around, acts and is acted upon, the land is always with us, and, as such, it “provides a common denominator to all historical periods.[2]” (East, p.8) While we are able to stand upon the space we inhabit, we are not able to remain there indefinitely[3]. In traditional Western thought we conceptualize time along a linear model always moving on towards infinity, but we can only think of “the Earth’s surface [as] fundamentally finite.[4]” (Gregory, 2010, p.61)


On the other side of the space token is the notion of time; the two cannot be separated.[5] (Gregory, p.61) Ian Gregory informs us of six ways of conceptualizing time (Gregory, p.60): as linear, as calendar time, as cyclical, as a container, as branching, and as seen from multiple perspectives. In a GIS environment, it would seem that the easiest way to view historical spatial changes over time is to use a linear model. Corrigan writes that

“The display of change over time that is observable in such a GIS, especially when presented as an animation, can disclose much about the way in which people and culture move through space, appear and disappear, and exist in relation to natural environments.” (Corrigan, 2010, p.81)

Many of the historical GIS applications that I have seen, and some I will touch on in a later section, have such a linear time tool as a way to view the historical analysis.

History is the subject that “locates us in time,[6]” (Ayers, 2010, p.5-6) and if we revisit Tobler’s law about geography and relate it to time, we will see that nearer periods of history are more similar than distant periods of history. As historians we are often prone to grouping history into large periods, such as “medieval” or “Roman,” but when we dig down into the transitional periods it is difficult to see where one period ends and another begins.



The human experience in, and upon, the world is the primary concern of the field of history, which looks at how we have been formed by our past. In this context, humanity has been forced to understand nature and make adjustments. (East, p.2) There is a plethora of articles that discuss how man has had a strong influence upon the world in which we live. It is man that provides meaning to place by establishing it as an object. Robert David Sack writes that

“Space and time are fundamental components of human experience. They are not merely naively given facets of geographic reality, but are transformed by, and affect, people and their relationships to one another. Territoriality, as the basic geographic expression of influence and power, provides an essential link between society, space, and time. Territoriality is the backcloth of geographical context- it is the device through which people construct and maintain spatial organizations. For humans, territoriality is not an instinct or drive, but rather a complex strategy to affect, influence, and control access to people, things, and relationships.” (Sack, 1986, p.216)

This is how places, as objects of study, are created. The human conception of place also has a reliance on the area around it.[7] (East, p.28)

There is a famous adage from Shakespeare’s As You Like It which states “All the world’s a stage, And all the men and women merely players; They have their exits and their entrances, And one man in his time plays many parts,” (Act 2, Scene 7, p.83) but, unlike in a theatrical performance, “history…is not rehearsed before enactment, and so different and so changeful are its manifestations that it certainly lacks all unity of place, time, and action.” (East, p.2) So how do we make sense of humanity’s entrances and exits, and relations with one another and the world itself? Couclelis informs us that “the key is to be sought in human cognition, in learning from how people actually experience and deal with the geographic world.” (Couclelis, p.66)

Quantitative History & One Annales Approach

There are two separate approaches to history that are relevant to a GIS approach and that I would like to take a moment to discuss. The first is quantitative history, which relies on statistical data to examine historical questions. The second is the work of Fernand Braudel of the Annales school.

Quantitative History

The value of a GIS methodological approach to answering a quantitative historical question comes precisely from the fact that both work with structured, countable data. The significance of quantitative history to this post is that it was one of the first methodologies to place a “reliance on numerical data.” (Green & Troup, 1999, p.141) Though numerical data in and of itself is not useful for a GIS project, if it can be georeferenced, then the historian can begin to gain a greater appreciation for the data and its relation to the human experience in the world. For instance, U.S. Census data conveys a lot of information, and all of this data comes from particular regions, territories, and places. This information, collected decennially, provides many historians with what they need to track historical trends and patterns. This type of data has allowed historians to focus their “minds on specific historical problems and on ways in which [they] as historians construct [their] material.[8]” (Green & Troup, 1999, p.148)
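As a rough illustration of what “georeferencing” numerical data can mean in practice, the following Python sketch joins a table of counts to a small gazetteer of coordinates by place name. Everything here is hypothetical: the place names, the figures (which are not real census numbers), the coordinates, and the `georeference` helper are assumptions made only to show the shape of the join.

```python
# Hypothetical county-level counts keyed by place name (not real census figures)
counts = {"County A": 1200, "County B": 860, "County C": 430}

# Hypothetical gazetteer: place name -> (latitude, longitude)
gazetteer = {"County A": (38.1, -79.1), "County B": (39.9, -77.7)}

def georeference(counts, gazetteer):
    """Attach coordinates to each count; report names the gazetteer cannot place."""
    located, unlocated = [], []
    for name, value in counts.items():
        if name in gazetteer:
            lat, lon = gazetteer[name]
            located.append({"name": name, "value": value, "lat": lat, "lon": lon})
        else:
            unlocated.append(name)
    return located, unlocated

located, unlocated = georeference(counts, gazetteer)
print(len(located), "records placed;", "could not place:", unlocated)
```

Keeping the unplaced names in their own list matters for historical data, where place names change or disappear and a silent drop would quietly bias the map.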

Fernand Braudel from the Annales

In his 1966 book The Mediterranean and the Mediterranean World in the Age of Philip II, Fernand Braudel goes to great lengths to convey the historical significance of geography for cultural history. The first two hundred and thirty pages of his book set the geographical scene for the history he goes on to develop. The first three chapters are: The Peninsulas: Mountains, Plateaux, and Plains; The Heart of the Mediterranean: Seas and Coasts; & Boundaries: The Greater Mediterranean. The last of these explores the effects of the Mediterranean on the world with which it interacted and upon which it acted. “Braudel introduced a multi-layered historical chronology, and initiated a strong focus on quantitative history among the historians influenced by him.” (Green & Troup, 1999, p.87) Green and Troup go on to describe how “Braudel conceived time in a new way. For example, his famous phrase ‘the Mediterranean was 99 days long’ vividly evoked the effect of sea and horseback travel upon early modern communications. His spatial approach to the sea was equally novel; for Braudel, the Mediterranean extended as far north as the Baltic and eastward to India. Land and sea were inextricably connected….” (Green & Troup, 1999, p.89) The beautiful expressions provided in Braudel’s work remind us just how connected we are to the geography we inhabit and the sense of space we construct around ourselves.

Incorporating GIS as a Historical Methodology

This method is not an all-encompassing approach to answering historical questions[9]; the tools available for research can only answer questions that have some form of geographical component. With GIS we can see how humans organize their space, what they do with their environment, and how their minds work in this organizational process. This works by delineating “space as a set of Cartesian coordinates with attributes attached to the identified location, a cartographic concept, rather than as relational space that maps interdependencies, a social concept.” (Bodenhamer, 2010, p.20) The Cartesian coordinate approach differs from our own relational experience with the world around us, and how we interpret and convey that experience.[10] When historians participate in this new digital form of geospatial mapping they contribute to a more shared understanding of reality[11].
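Bodenhamer’s description of “Cartesian coordinates with attributes attached” maps fairly directly onto a simple data structure. The sketch below, in Python, is one possible way to picture it: each feature is a coordinate pair plus a dictionary of attributes, and a spatial question becomes a filter over coordinates. The `Feature` class, the `within_bbox` helper, and the sample places are hypothetical and are not drawn from any particular GIS package.

```python
from dataclasses import dataclass, field

@dataclass
class Feature:
    """A location (Cartesian coordinates) with arbitrary attributes attached."""
    lon: float
    lat: float
    attributes: dict = field(default_factory=dict)

def within_bbox(features, min_lon, min_lat, max_lon, max_lat):
    """Return the features whose coordinates fall inside a bounding box."""
    return [
        f for f in features
        if min_lon <= f.lon <= max_lon and min_lat <= f.lat <= max_lat
    ]

# Hypothetical sample data for illustration
features = [
    Feature(12.5, 41.9, {"name": "Place A", "year": 1100}),
    Feature(2.35, 48.85, {"name": "Place B", "year": 1215}),
    Feature(-0.13, 51.5, {"name": "Place C", "year": 1300}),
]

hits = within_bbox(features, 0.0, 40.0, 15.0, 50.0)
names = [f.attributes["name"] for f in hits]
print(names)
```

Notice what the structure leaves out: nothing in a `Feature` records how one place depends on another, which is exactly the gap between cartographic and relational space that the quotation points to.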

GIS & Historical Scholarship

Ian Gregory and Paul Ell, in their book Historical GIS: Technologies, Methodologies and Scholarship, remain excitedly confident that “GIS has the potential to reinvigorate almost all aspects of historical geography, and indeed bring many historians who would not previously have regarded themselves as geographers into the fold.” (Gregory & Ell, 2007, p.1) If historians wish to engage with GIS tools then they will need to “not only learn the technical skills of GIS, but must also learn the academic skills of a geographer.” (Gregory & Ell, 2007, p.1) As mentioned previously, GIS allows users to shift the focus from a strictly historical narrative approach to a geospatial narrative in which questions can be asked in new, more computationally based, ways. Couclelis informs us that

“in applied geography- the geography that GIS is supposed to serve- the question of whether an object or field view is more correct, is neither a philosophical nor a theoretical issue, but largely an empirical one: how is the geographic world understood, categorized, and acted in by humans.” (Couclelis, p.70)

Scholarship leads the student down a field of study in an attempt to answer a question. This approach helps bring those answers to a visual life, but it requires carefully formed questions in conjunction with well structured, and accurate, data.


This entire system of historical analysis requires two things: data, and the knowledge of how to turn it into something useful. All of the data that the historian collects will require some structuring for it to be made into a historical work, and how the historian approaches the structuring “begs the philosophical question of the most appropriate conceptualization of the geographic world.” (Couclelis, p.65) How the data is structured[12] will help determine how the historical questions can be answered.

Ian Gregory has an interesting notion on how, in the future, historians may be able to address and visualize the dream of Lovejoy who was searching for The Great Chain of Being that linked ideas in their historical context and showed how they were viewed in different ways at different times. Gregory writes,

“in an ideal world, we would use spatial & temporal data to explore how a phenomenon has evolved over time, not by comparing two snapshots but by looking at continuous change. In doing so the aim is not to identify the story of how the process evolved but to use different places to explore the different ways in which the phenomenon could occur differently.” (Gregory, 2010, p.66)

Examples of GIS as historical methodology

The means of using GIS to further a historical understanding has been taking shape for some time now, and in this section I would like to take a look at some digital humanities endeavors that have taken form with GIS and that have a focus on history.

The first project is Pleiades. This project is great for those studying classical history, particularly the Greek and Roman periods, though it is expanding its gazetteer[13] into “Ancient Near Eastern, Byzantine, Celtic, and Early Medieval geography.” (Ancient World Mapping Center) Pleiades gives researchers the ability “to use, create, and share historical geographic information about the ancient world in digital form.” (Ancient World Mapping Center) As of mid-December 2015, it has just shy of 35,000 place names in its gazetteer. This project was created with the help of the Ancient World Mapping Center at the University of North Carolina, the Stoa Consortium, and the Institute for the Study of the Ancient World at New York University.

The second project comes from the University of Virginia and takes a look at “two American communities, one Northern and one Southern, from the time of John Brown’s Raid through the era of Reconstruction.” (Ayers, The Valley of the Shadow) It contains “original letters and diaries, newspapers and speeches, census and church records, left by men and women in Augusta County, Virginia, and Franklin County, Pennsylvania.” (Ayers, The Valley of the Shadow) The project seeks to give “voice to hundreds of individual people” by telling the “forgotten stories of life during the era of the [American] Civil War.” (Ayers, The Valley of the Shadow) Within this larger project is a section on battle maps that seeks to tell the stories of regiments from Augusta County, Va. (Confederate) and Franklin County, Pa. (Union). The map allows for the display of additional layers, including grid and scale, boundary lines around Augusta and Franklin counties, modern cities, historic towns, historic roads, historic railroads, and major rivers. After selecting the layers to be added, the researcher can press play and watch the animated battles that ensue.

One project that I helped to create was the History of Canon Law map for the Religious Studies, Philosophy and Canon Law Library at the Catholic University of America. This was done in the online ArcGIS tool from ESRI. To build the content I created .csv files based on information found in the New Catholic Encyclopedia article “Canon Law, History of.” (Vogel et al., 2003) The value of the map is that it helps provide a geographical context for various councils and documents related to the history of canon law. The map breaks down the history of Canon Law into seven periods: Early Church, Carolingian Era, False Decretals To Gratian, Classical Period, The Corpus Iuris Canonici to the Council of Trent, The Council Of Trent To The Code Of Canon Law, and The 1917 Code Of Canon Law & The 1983 Code of Canon Law to the Present. Though the map is far from complete, it does give the student of canon law a different type of historical understanding than a traditional reading and analysis; it is also a nice overview for novices to the topic.
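For readers curious what such a .csv file might look like and how it becomes map data, here is a minimal Python sketch that reads a CSV with latitude and longitude columns and converts each row into a GeoJSON point feature. The column names, the two sample rows, and the `csv_to_geojson` helper are assumptions for illustration; they are not the actual files used for the Canon Law map, and the online ArcGIS tool handles this kind of conversion itself when a CSV is uploaded.

```python
import csv
import io
import json

# Hypothetical sample rows; the column names are an assumption
csv_text = """name,period,latitude,longitude
Example Council A,Early Church,40.43,29.72
Example Council B,Classical Period,41.89,12.51
"""

def csv_to_geojson(text):
    """Convert CSV rows with latitude/longitude columns into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        lat = float(row.pop("latitude"))
        lon = float(row.pop("longitude"))
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": row,  # remaining columns become attributes
        })
    return {"type": "FeatureCollection", "features": features}

collection = csv_to_geojson(csv_text)
print(json.dumps(collection, indent=2))
```

The one detail worth remembering is coordinate order: spreadsheets usually list latitude first, while GeoJSON expects longitude first, and swapping them silently moves every point to the wrong part of the world.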

Canon Law GIS project

The final project is The History Engine, which was created in the Digital Scholarship Lab at the University of Richmond. I list this one last because its emphasis is more pedagogical: it is a tool for teaching undergraduate students of American History at the University of Richmond about the research and publication process, though it does appear that the creators are making their project available for use by faculty at other institutions. What is interesting is that while it is teaching students about research and publication, it displays their results in a geospatial and temporal context. The project itself was built using an API from Google Maps and the Timeline widget produced at the Massachusetts Institute of Technology (MIT). I would like to encourage the reader to look at some of the other interesting history projects done by the Digital Scholarship Lab to see more great ideas.


Throughout this post it has been my goal to provide the reader with justification for the value of incorporating GIS tools into their arsenal of historical research methodologies. We have looked at the world in history through a spatial, temporal, and humanitarian lens. We have seen how a couple of approaches to history have paved the way for GIS as an approach to understanding historical questions. Lastly, we have seen the value that GIS can hold for historical scholarship, and a few examples thereof. When using this approach historians will need to adapt some of their traditional practices, but the effort will be worth the investment given the historian’s keen ability for breadth of research and analysis.


A Persuasive Argument for GIS as a Service Offered by Libraries

In this short essay, I wish to impart upon the reader the value that Geographic Information Systems (GIS) software can hold for academic libraries in respect to the students they serve, particularly addressing how GIS can be used to help students understand and appreciate their subjects more fully.

To help the reader understand why GIS is of value to libraries, we must first have a mutually understood definition. One of the best synopses available for what GIS is comes from the National Estuarine Research Reserve System, a program under the National Oceanic and Atmospheric Administration stating that

“GIS, is a computerized data management system used to capture, store, manage, retrieve, analyze, and display spatial information. … GIS differs from other graphics systems in several respects. … data are georeferenced to the coordinates of a particular projection system. This allows precise placement of features on the earth’s surface and maintains the spatial relationships between mapped features. As a result, commonly referenced data can be overlaid to determine relationships between data elements.”

Image of how GIS adds layer upon layer to convey a larger whole of information.

It should be mentioned early on that there are various providers of GIS software. One of the most expansive and well managed is ArcGIS by ESRI, which recently put up a free online version; though limited in what it can do in comparison to the full version, it provides a rather reliable service for the needs of many students. ArcGIS also has the most expansive training resources on how to use the product, as well as a highly engaged user community. Another GIS package worth looking into for libraries is GRASS, which was produced by the U.S. Army – Construction Engineering Research Laboratory in 1982 and has been expanded on ever since. Other options for library adoption include QGIS, OpenJump, and uDig. ArcGIS is proprietary software, whereas GRASS, QGIS, OpenJump, and uDig are free and open source. (Donnelly, 133) Buchanan, writing in 2006, concluded that the advantages of GRASS (performance, statistical analysis, image classification, and cost) make it a viable alternative to ArcGIS (Buchanan, 40), but both systems have gone through significant updates since his research was conducted.

The value of GIS for students is that they can take the information they are learning in their courses and apply it in a geographical context; data always needs context. For instance, the map Odysseus’ Journey, produced on the ArcGIS platform, takes the user through fourteen points on Odysseus’ voyage, providing for each point images, a synopsis of important events, and its geographical location. Though this story has a nearly three thousand year history, it can be told with fresh eyes when laid out geospatially. Another example of GIS in action is Google Lit Trips, which “mark the journeys of characters from famous literature on the surface of Google Earth.” (Google Lit Trips: Getting Started) Students can learn about different countries and cultures through GIS maps such as Le Tour de France en 50 citations, or learn about linguistics and endangered languages through maps such as One World, Many Voices: Endangered Languages and Cultural Heritage. Richard White, an American historian at Stanford, uses GIS to reveal new information about the expansion of the railroads in the American West using information from “letters, freight tables, books, newspapers, accident reports, ledgers, and so on” that is traditionally harder to make sense of. (Zax, Visualizing) Once the data is entered into the system, the presentation of the information through the GIS tools reveals new and interesting context for students to learn and engage with the material.

So why, you may ask, should the library be the place for this on campus?

Image of Stanford Geospatial Center.

Libraries, as is their nature, store, manage, organize, provide access to, and help users retrieve information. Most academic libraries provide access to computers and have various software to meet the needs of their students. As GIS becomes more cross-disciplinary, installing the software on library computers provides an easy and central location for users to work on their projects. Libraries, in this respect, can also serve as the spot on campus where various schools and departments come together to use the GIS tools for larger, more collaborative, projects. Rory Elliott makes the case that

“In providing GIS services, libraries are expanding the patron’s ability to use information, such as data and statistics, which libraries already provide in some form. … For academic libraries, offering GIS services helps ensure that departments, regardless of individual funds, have the technical abilities to conduct spatial analysis for projects.” (Elliott, 9)

When considering whether or not to integrate GIS tools into the library’s repertoire, libraries should consider issues such as “service, personnel, technical, financial, and coordination.” (Suha et al., 129) It takes a lot to start up a GIS program, but as people become familiar with it, less maintenance should be needed. Libraries will also need to consider how users will access the software, whether on all computers or only a select few. Libraries should also consider the needs of their users before finalizing their system, making sure that what they are setting up will meet user expectations and needs.

GIS tools can also help librarians serve their users by revealing information about those users, as reported at one Kansas library in the article Targeting Local Library Patrons: Tapestry weaves common characteristics into community profiles by Jim Baumann. The library was able to use the data to see how it might improve services to its community by looking at its users and at how different areas of the community use the library.

GIS provides value to students by contextualizing information into a geographic context, and as the information centers at most universities the library serves as the prime location to help facilitate these pursuits. Libraries, by offering GIS tools, can allow the exploration of new and unique data visualization important to the research of their respective user groups.


Using Swirl in R

If you are new to using R and R Studio, but are interested in learning the rich value of the statistical computing and graphics software R, then swirl will serve you as a wonderful guide to the great power that lies within the R software. Swirl stands for statistics with interactive R learning.

R Programming Language

“swirl is a software package for the R statistical programming language. Its purpose is to teach users statistics and R simultaneously and interactively.” (

Once everything is loaded on your computer (see Students page on swirl website for instructions), you can open up the swirl package in R Studio and begin learning the program.  Once the program is installed, it is very simple to use.  Swirl will prompt you to create a name and will teach you how to save your progress (when ready to leave type: bye()). Swirl will also keep track of your progress on the right side of the console by measuring the percent completed.

Step by step, swirl will teach you how to operate the programming language. At every step where you are taught something, swirl reaffirms and encourages you (e.g. “You nailed it! Good job!”). Also, after teaching you something, swirl highlights what the action does and how it is used.

swirl encouragement and affirmation


The lesson I was doing as I was writing this post went by pretty quickly. I would encourage everyone interested in learning R to play around with swirl. From my brief experience so far it is fun and easy to use.



Playing around in Bamboo DiRT

For a recent course project in Digital Humanities, the professor had us examine various tools available on Bamboo DiRT. In this post I will be examining three of those tools. I would like to encourage readers to explore the site for themselves and see what else is available.

“Bamboo DiRT is a registry of digital research tools for scholarly use. Developed by Project Bamboo, Bamboo DiRT makes it easy for digital humanists and others conducting digital research to find and compare resources ranging from content management systems to music OCR, statistical analysis packages to mindmapping software.” (


Tool 1

The first tool I would like to look at is exploratree. Exploratree is listed under the “Brainstorm/generate ideas” category on Bamboo DiRT. “Exploratree is a free web resource where you can access a library of ready-made interactive thinking guides, print them, edit them or make your own. You can share them and work on them in groups too.” (

Exploratree allows users to create projects independently or within the context of a group.

Exploratree my groups

There are numerous ready-made guides available to use; many of these are also available in Welsh. The number and variety of tools available also make it very versatile for creating purely original content.

Exploratree ready made templates

When creating content, Exploratree offers a zoomable grid so that the user can always bear in mind the part being worked on in relation to the whole. The box in the center of the picture below can be grabbed and moved over any section of the larger box.

Exploratree Grid

This tool would be wonderful for preparing supplemental material for lectures or in-class discussions. With its ease of use, accessibility, and functionality, this would be a good tool to add to any repertoire.


Tool 2

The second tool I will be examining is Collex. “Collex allows users to collect, annotate, and tag online objects and to repurpose them in illustrated, interlinked essays or exhibits.”

Bethany Nowviskie has written a wonderful White Paper on this tool.  I would like to highlight a few passages below:

  • “Collex uses a Dublin Core flavor of RDF, the resource description framework of the semantic web, to define collectible ‘objects’ without limiting them to their expression as web pages.” (Nowviskie  pg. 8)
  • “Where other social bookmarking tools (like or Connotea) are designed to allow collection and annotation of whole web pages, Collex allows contributors of resources to make finer-grained distinctions, and users of the system to build collections and exhibits more attuned to the patterns of attention in humanities scholarship.” (Nowviskie pg. 8)
  • “Because this content  can be expressed as subscription based RSS feeds, a web service, or an API through Collex’s underlying Nutch, Lucene, and Kowari RDF systems, it is possible for the maintainers of scholarly resources to patch into Collex directly from their individual web or listserv interfaces, offering information about user annotations and re-mediations for any given object without reference to Collex at all.” (Nowviskie pg. 9)

Though I found numerous mentions that this tool is open source, I was unable to locate the source code online. That said, there is a contact page for Collex. Perhaps the software must be requested from NINES (Nineteenth Century Scholarship Online). A few other scholarly projects appear to be using this software as well: 18th Century Connect and MESA: Medieval Electronic Scholarly Alliance.


Tool 3

The last tool I would like to look at here is Pipes. “Yahoo Pipes allows users to combine, filter, translate, and geocode data from RSS feeds, JSON, KML, or other similar formats, and power widgets/badges using that data.”

Yahoo Pipes

You will need to have a Yahoo account for this.  Once inside Pipes, there are numerous user input options to choose from.

Yahoo Pipes Sources

Creating a series of pipes can be an exciting endeavor.  All that is required is to drag the input box desired into the workstation and then add the material that is desired.  For every user input box, there are useful descriptions of how the boxes are used.


As a side note:

Bamboo DiRT is currently seeking people to serve on its editorial board. They anticipate the workload to be only a couple of hours a month, so this might be a great opportunity to gain experience in the DH realm.

Reflection on Text Analyzing Tools

For a recent class project I was instructed to compare different text analyzing tools and give a description of the pros and cons of each, along with my thoughts on using them. The different tools I will be comparing are Juxta Commons, Voyant, and TAPoRware.

I chose to visualize a text I worked on in my undergraduate philosophy days: Immanuel Kant’s Critique of Pure Reason. As the original text was written in German, a language in which I only know a couple of words, I have had to rely on translations when reading it. So what would be interesting for me to see is how various people have translated the work, and how close to or different from each other those translations are.

As the original text approaches 500 pages, I narrowed down the scope of this project by focusing on the introduction to the work. The introduction is a little over twenty pages of text.  This still provides me with plenty of material to examine for the project.

Kant Knowledge

Selection for Comparison:

  1. Critique of Pure Reason by Immanuel Kant Translated by J. M. D. Meiklejohn
  2. Critique of Pure Reason by Immanuel Kant Translated by Norman Kemp Smith

Before I could begin examining the works, I needed to manipulate the texts and make them as similar as I could in structure without changing anything of substance. I copied the texts from their respective sources and saved them as .txt files. For the Meiklejohn translation I needed to rework the line length to make it more comparable with the Smith translation. I also deleted the footnotes and added spacing between the paragraphs for the Meiklejohn translation. For the Smith translation I needed to remove the page numbering and the various commentary, notes, and explanations that were in the online text.

Results of Juxtacommons:

Juxta Commons is a “tool that allows you to compare and collate versions of the same textual work.” Within Juxta there are multiple useful tools. These include: Heat Maps, Side by Side Views, Histograms, Parallel Segmentation, Edition Starter, and Versioning Machine. The last two of these, at the time of this writing, are experimental. I will try to explain more about this tool as I go through it, but for a better explanation see A User Guide to Juxta Commons.

A special note before using these tools: as some of them take a while to visualize, it would be advisable to have something else to work on while you wait. Also, you will need to create an account to use the website, but don’t worry, it is free.

Heat Maps:

Intro Critique of Pure Reason Meiklejohn Trans Juxtacommons pic1

Intro Critique of Pure Reason NORMAN KEMP SMITH Trans Juxtacommons pic1

What we can see here is the degree of variance between the text selected and the other texts that we have in our comparison “witness” list. “The text is color-coded to indicate the degree of variance from the base witness evident at any particular area of the text.” (“User Guide to Juxta Commons”) In the two screenshots above, we can see that there are numerous differences between the two translations in the sections shown.

Side By Side View:

juxtacommons side by side

The Side by Side tool takes the results of the Heat Map and places the two texts next to each other for easy comparison. By hovering over the shapes in the middle of the tool, the highlighted sections of the text become clearer. This allows for easy focusing on a certain passage or line of the text. What I keep seeing in the results is that one of the translators would focus on one part of what Kant was saying, while the other would sometimes focus on another aspect of what Kant was saying within the same sentence. (One of the jokes my undergrad philosophy professor would tell about Kant was that even Germans preferred reading him in translation because of how confusing his use of language was.)

Versioning Machine:

juxtacommons Versioning Machine

With this tool it becomes clear from the start how differently the two translators looked at the text. Though they are both still expressing Kant’s idea, how they choose to express it differs. For instance, Meiklejohn translates the heading to this introduction as “Of the Difference between Pure and Empirical Knowledge” while Smith translates the same heading as “The Distinction between Pure and Empirical Knowledge.” Apart from the preposition at the start of Meiklejohn’s heading and the article at the start of Smith’s, the only difference between the two is the noun choice between “Distinction” and “Difference.” Do you think there is any significant difference in meaning between the two titles?

If we continue, we can see a clearer distinction in the way the second sentence is translated.  Meiklejohn translates it as:

“For how is it possible that the faculty of cognition should be awakened into exercise otherwise than by means of objects which affect our senses, and partly of themselves produce representations, partly rouse our powers of understanding into activity, to compare, to connect, or to separate these, and so to convert the raw material of our sensuous impressions into a knowledge of objects, which is called experience?”

while Smith translates the same sentence as:

“For how should our faculty of knowledge be awakened into action did not objects affecting our senses partly of themselves produce representations, partly arouse the activity of our understanding to compare these representations, and, by combining or separating them, work up the raw material of the sensible impressions into that knowledge of objects which is entitled experience?”

Besides the fact that Kant is long-winded, can we pull anything out of the text through this side by side comparison? Because I read things visually, the way each translator expresses this idea forms pictures in my head (e.g. the phrases “awakened into action” vs. “awakened into exercise”, or “to convert the raw material of our sensuous impressions into a knowledge of objects” vs. “work up the raw material of the sensible impressions into that knowledge of objects”). So when I look at these two translations right next to one another, I can see the differences in the verb and adjective selections and try to discern something about what they mean.
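For readers who want to try this kind of comparison outside of Juxta Commons, here is a minimal sketch in Python using the standard library's difflib module. It is not how Juxta Commons works internally; the function name word_diff is my own, and the two strings are simply the headings quoted earlier. It lists only the spans of words where the two translations disagree.

```python
import difflib

def word_diff(base, witness):
    """Return (operation, base_words, witness_words) for each differing span."""
    a, b = base.split(), witness.split()
    matcher = difflib.SequenceMatcher(None, a, b)
    changes = []
    for op, a1, a2, b1, b2 in matcher.get_opcodes():
        if op != "equal":  # skip the stretches where both texts agree
            changes.append((op, " ".join(a[a1:a2]), " ".join(b[b1:b2])))
    return changes

# The two headings quoted in this post
meiklejohn = "Of the Difference between Pure and Empirical Knowledge"
smith = "The Distinction between Pure and Empirical Knowledge"

for change in word_diff(meiklejohn, smith):
    print(change)
```

Note that the comparison is case-sensitive, so “the” and “The” count as different words. Lowercasing both texts first would be a reasonable alternative, depending on what you want to measure.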

What I like about using Juxta Commons is that if one of the tools is not understandable, looking at one or two of the other tools in the toolbox makes the text more understandable. It is then possible to go back and look at the tool that was incomprehensible the first time around.

Now that we have looked at Juxta Commons, let’s see what Voyant has to offer.

Results of Voyant:

Unlike Juxta Commons, Voyant only allows for one text to be analyzed at a time.

One of the first things I did in Voyant, after uploading my first file, was go to the settings for Cirrus and add a list of stop words to keep them from appearing in the word cloud. The words that I chose to exclude were: the, of, a, do, which, as, b, any, not, has, so, such, and, to, in, be, it, its, by, an, at, h, or, with, and from.
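The idea behind a stop-word list can be sketched in a few lines of Python. This is only an illustration of the technique, not Voyant's own code; the function name content_word_counts is hypothetical, and the sample sentence is made up for the demonstration rather than taken from either translation.

```python
import re
from collections import Counter

# The stop words listed in this post (duplicates removed)
STOP_WORDS = {
    "the", "of", "a", "do", "which", "as", "b", "any", "not", "has",
    "so", "such", "and", "to", "in", "be", "it", "its", "by", "an",
    "at", "h", "or", "with", "from",
}

def content_word_counts(text, stop_words=STOP_WORDS):
    """Count words in text, ignoring case and skipping stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in stop_words)

# A made-up sentence, used only to show the filtering
sample = "The knowledge of reason is not the reason of knowledge as such."
counts = content_word_counts(sample)
print(counts.most_common(3))
```

A word cloud is then just a drawing of these counts, with the size of each word scaled by its frequency.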

Voyant Meiklejohn trans

Let’s take a closer look at the Cirrus, the word cloud for this text:

Voyant Cirrus Meiklejohn trans

Even though I pulled out most of the superfluous words from the Meiklejohn text, as described earlier, some less insightful words remained. Cirrus limits the stop-word list to 75 characters. The frequency of the remaining words is still valuable. Just from this word cloud we can discern the major topics within this text, or at least within this section of it. Some of the more important words that stick out include: knowledge, reason, conception, a priori, and experience.

If you take a look at the summary located at the bottom left in the first picture for this section on Voyant, you will see that according to Voyant’s analysis there are 6,776 words and 1,181 unique words. Opening the Corpus box located at the bottom right of the screen, we are given a density ((the number of unique words / the number of words) × 1000). The density for this text is 174.3, which is pretty high.
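The density figure is easy to check by hand. Below is a small Python sketch, assuming the density is the number of unique words divided by the total number of words, scaled by 1,000 (which reproduces the 174.3 that Voyant reported). The function name vocabulary_density is my own, not Voyant's.

```python
def vocabulary_density(unique_words, total_words, per=1000):
    """Unique words per `per` words of running text."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return unique_words / total_words * per

# Figures reported by Voyant for the Meiklejohn introduction
density = vocabulary_density(1181, 6776)
print(round(density, 1))  # 174.3
```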

In this next picture we can see two more aspects of Voyant’s toolkit.  By selecting terms in the Words in Entire Corpus tool, we can obtain a graph of word trends and how they are dispersed throughout the text.

Voyant Words in Entire Corpus and Trends

What I can see with this tool is where the author really begins to explore various aspects of his work. If we look at the use of conception (i.e. the green line of the graph), we can see that the term is used more in the middle of the text we are examining, whereas the term reason (i.e. the pink line of the graph) appears more towards the end.

Knowing this allows us to better ascertain where in a text we might find a particular section that is relevant to our work.  For instance, if we were looking at writing a paper on Kant’s use of the term reason in the introduction to the Critique of Pure Reason we would know where in the text we could go in and find it.

We do not even need to go back to the original text, because Voyant will do that for us with its Keywords in Context tool.

Voyant keywords in context

In the above picture we can look at how the term reason is applied throughout the text and which words it is associated with. If you can make it out, two terms that jump out at me are often situated in front of the term reason: pure and human.
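A keywords-in-context listing is simple to sketch in Python. The version below is only an illustration of the technique and not Voyant's implementation; the function name keyword_in_context and the window parameter are my own, and the sample sentence is made up rather than quoted from Kant.

```python
import re

def keyword_in_context(text, keyword, window=3):
    """Return (left context, keyword, right context) for every occurrence."""
    words = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            hits.append((left, word, right))
    return hits

# A made-up sentence, used only to show the output format
sample = "Pure reason is distinct from experience, yet human reason asks questions."
hits = keyword_in_context(sample, "reason", window=2)
for left, word, right in hits:
    print(f"{left:>12} | {word} | {right}")
```

Looking down the left-hand column is what lets words such as “pure” and “human” jump out.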

Let’s go ahead and take a look at our last text analysis tool: TAPoR.

Results of TAPoRware:

Before we get started with this tool, let’s remember that TAPoR is just a portal for text analysis tools. That said, let’s take a quick look at what we can find on their site. Unlike Voyant, where you only need to upload your text once, TAPoR requires that you upload your text every time you want to use a different tool.

TAPoRware main screen

Since I focused on the Meiklejohn translation when using Voyant, I’m going to focus on the Smith translation when using TAPoR.

The first tool I decided to use was Pattern Distribution. What I liked about this tool was that it showed me the count of a particular word within each successive percentage segment of the work. Again, by looking at this data and seeing where the word is most likely to appear in the text, I can then focus on a particular facet of my research.
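The same kind of distribution can be sketched in Python by cutting the text into equal segments and counting the word in each. This is my own approximation of the idea, not TAPoRware's code; the function name pattern_distribution and the segments parameter are hypothetical, and the sample text is artificial.

```python
import re

def pattern_distribution(text, word, segments=10):
    """Count occurrences of `word` in each of `segments` equal slices of the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = [0] * segments
    for i, w in enumerate(words):
        if w == word.lower():
            # Map the word's position to the segment it falls in
            counts[i * segments // len(words)] += 1
    return counts

# Artificial ten-word text: "reason" at both ends, "knowledge" in the middle
sample = "reason " * 2 + "knowledge " * 6 + "reason " * 2
distribution = pattern_distribution(sample, "reason", segments=5)
print(distribution)  # [2, 0, 0, 0, 2]
```

With ten segments, each entry corresponds to ten percent of the work, which is roughly what the tool's bar chart shows.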

TAPoRware pattern distribution

The next tool from TAPoR we will look at is Speech Tagger. Once the file is uploaded, we get to pick which words we want to focus on, or highlight, within the text.

TAPoR Speech Tagger

Though this tool sounds like it would be beneficial from a grammarian viewpoint, I was unable to figure out how to use it.  I tried selecting many different options and colors, but every time the results were the same.  If someone else has had more success with this and can provide a better description, please leave a comment.

The last tool we’ll be looking at from TAPoR is List Words. Once a text is uploaded, we are given options on how to proceed. For convenience I just selected the modified Glasgow Stop Words rather than creating my own set. That said, let’s see the results:

TAPoR Listwords

According to the results the most common word found in the text, excluding those from the stop list, is knowledge.  I would have thought that “reason” would have been higher than fifth in rank. One of the benefits I see of using this tool would be in analyzing Kant’s vocabulary, or his word choices.  I could also see it being beneficial to those studying philology.

Kant Meme

So What? or How can these tools be relevant for our library?

That is a good question. I think these tools could be of use in the library field by assisting researchers in understanding a text. All of these tools could help bring a text to a greater understanding, but the life force behind the text will remain inherent in the text. These tools can only assist in chiseling away at a text until its true form is found.

There is a story about the famous Renaissance artist Michelangelo in which he was asked how he was able to carve such a wonderful statue. It was as if he had carved a real person out of marble. His response to the question was “I saw the angel in the marble, and carved until I set him free.”

That is what libraries can offer: the ability to set texts, or the knowledge they contain, free. Since we are a service-oriented institution, we can assist users with these tools to help them understand all of the information they are receiving from texts.


Word Clouding

There are various word cloud programs available online (e.g. Voyant, Wordle, Tagul, and Tagxedo). In this post, I am going to give a brief overview of what these are and how they can be effective tools in the library.

So what is a word cloud?

Oxford Dictionaries defines word cloud as “an image composed of words used in a particular text or subject, in which the size of each word indicates its frequency or importance.” (“Word Cloud”)

The first Word Cloud tool we will examine is made by Voyant.  As you can see in the picture below, it just looks like a jumble of words that are all different sizes and colors. At a closer look it becomes evident that these different sizes are based on word frequency.  Without scrolling down, can you tell what the text is?

Green Eggs Ham Voyant WordCloud

I’ll give you a hint: It’s one of the most beloved children’s books of all time, or at least since the mid-20th century.

Yep, it’s Green Eggs and Ham by Dr. Seuss.

There are some interesting results that we can gather from the data (i.e. the text) being analyzed. There are 783 total words and 51 unique words used by Dr. Seuss in Green Eggs and Ham. The most frequent words in the corpus are not (84), i (71), them (61), a (59), and like (45). Rather than having to count all of this by hand, I let Voyant do it for me. All I had to do was copy and paste the text (which I found available online) into a box provided by Voyant. One of the features that Voyant provides is called “Words in Document,” which shows the word frequency and provides a graph and a “relative” frequency of a selected word per 10,000 words in the document.
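To make the “relative” figure concrete, here is a short Python sketch that scales the raw counts above to a rate per 10,000 words. It assumes the relative value is simply the count divided by the total number of words, times 10,000; the function name relative_frequency is my own and not something taken from Voyant.

```python
def relative_frequency(count, total_words, per=10_000):
    """Scale a raw word count to a rate per `per` words."""
    return count / total_words * per

# Raw counts reported by Voyant for Green Eggs and Ham
TOTAL_WORDS = 783
counts = {"not": 84, "i": 71, "them": 61, "a": 59, "like": 45}

for word, count in counts.items():
    rate = relative_frequency(count, TOTAL_WORDS)
    print(f"{word}: {rate:.1f} per 10,000 words")
```

Scaling like this is what makes it possible to compare word use between texts of very different lengths, such as a children's book and Kant's introduction.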

Some of the tools are not very clear as to what they are and how to use them, but Hermeneutica, the parent organization of Voyant, has a useful guide to help users understand how it operates.

Now let’s take a look at Wordle.  Sticking to the Dr. Seuss theme from earlier, let’s do Oh, the Places You’ll Go!

Oh, the places you'll go wordcloud

So, what do we notice? Besides the different words, the structure of the cloud is nearly the same.

There are numerous other word clouding tools available, such as Tagul and Tagxedo, as mentioned earlier. Tagul and Tagxedo are a little more complicated to install, but can be well worth the effort, as their design options far exceed those in Voyant and Wordle. If, on the other hand, you are looking for information analysis of your text, I would recommend Voyant.

So What?  Or, How can these be incorporated into library functions and teachings?

Mrs. Lodge’s Library, in her blog post Dewey Word Clouds, describes a project that she came up with for her fourth grade students to help understand the Dewey Decimal System.  The project involved teaching the students basic note taking skills, grouping the students into various Dewey ranges (000-900), and teaching the students how to put the information into a word cloud.  The students could then see the results of their labor by the signage that was created for the library.

Duke University’s Library Blog, in their blog post Moving Beyond the Word Cloud, speaks of using them in their “post-library instruction assessment by compiling student comments and creating a word cloud to share with the group prior to leaving the session.” (Amber 2011) This is helpful in making it easy for students to remember what was discussed in their library instruction.  Amber Welch also spoke of how Duke was moving beyond Word Clouds by using programs such as TAPoR.

Terry, writing for the Skokie Public Library Blog, wrote an article called What’s a word cloud and what’s it good for?, in which she describes making a list of titles that she has read and creating a word cloud to see if there is a common theme in her reading habits. This is very similar to the project done by Mrs. Lodge’s Library, but has a different goal in mind. If a library were able to pull its list of titles from the catalog record, it could easily create a word cloud showing some of its more popular themes. Another approach would be to use subject headings extracted from the catalog record.


Blogging Tips from a Professional Blogger

I was at a conference last week and one of the speakers, Jill Stanek, gave advice on blogging.  I wanted to share her ideas because she has a successful blog that is recognized nationally.  Though her blog does not focus on digital humanities, its success can be used as a template when creating content.

12 Points to Successful Blogging According to Jill

(Note: the above link contains only ten points, as the post was written a couple of years ago.)

  1. Strive for Excellence
  2. Find Your Niche and Become an Expert
  3. Think of Your Blog as a Vocation
  4. Develop a Mission Statement
  5. Blog Strategically
  6. Blog Often
  7. Keep it Pithy (Edit, Edit, Edit)
  8. Give Photo Credit
    1. Fair Use Doctrine
    2. Never Crop Out Copyright or Contributor
  9. Write Original Content
  10. Cross-Post to Other Social Media
  11. Be Accurate- Check Sources
  12. Develop a Thick Skin

So what to get out of this:

When blogging, bear in mind what it is you are saying and strive to create a unique voice that will draw people in. Make sure that you know what you are talking about and have evidence, or citations, to back up your claims. And lastly, fair use is fair use. If an object is in the public domain feel free to use it, but always cite your sources.