Roundtable on Structured Data

DH Roundtable Poster 2.png

Text of Poster Follows: Graduate Student Digital Humanities Roundtable: Structured Data. Come join us to discuss how information from humanistic sources can be structured, for effective analysis with digital tools! Monday, October 24, 1:00-2:00 pm, Dealy 102. Bring your own sources, projects, or project ideas to discuss! Coffee and donuts will be provided.

 

 

 

 

DH Annotation Tools: Out of the Margins

Come join us to learn how digital annotation tools could be applied to your research!

DH Annotation Tools.jpg

Text of poster follows: A Practical DH Workshop Sponsored by Fordham’s Digital Humanities Working Group and Graduate Student Digital Humanities. Thursday, 20 October, 2016. 1-2 pm Walsh Library 047. DH Annotation Tools: Out of the Margins. An introduction to Annotation Studio and Lacuna Stories, web-based research and learning tools. Led by Shawn Hill, Instructional Technologist for Digital Scholarship, Fordham University. Practical DH is a series of short, one-hour workshops meant to offer concrete, specific information and hands-on introductions to a variety of digital tools and approaches for research and pedagogy. The next workshop will be in November on mapping (likely an introduction to CartoDB). For more information, contact kowaleski@fordham.edu.

Digital Day 2016

For the second year, GSAS futures and the Center for Medieval Studies will be presenting a Digital Day, with sessions on Photoshop and WordPress. If you weren’t able to attend the Digital Day last year, come join the mighty throng! If you were, join the throng anyway, for a refresher!

Digital Day 2016

Monday, August 29, 11:30 am – 3:30 pm

Faber Hall, Room 445

Phone: 718 817 4656

Email Registration: hafner@fordham.edu

digital day

 

Apply today! NYCDH Digital Humanities Graduate Student Project Award

NYCDH  is pleased to announce its third annual cross-institutional NYCDH digital humanities graduate student project award. We invite all graduate students attending an institution in New York City and the metropolitan area to apply by Monday, August 15, 2016.

First prize winner(s) will receive a cash prize of $1000. Two runners-up will receive $500 each. All three winning proposals will have the opportunity to receive support from one or more of the many centers affiliated with NYCDH. Winners will also receive exposure on our site and through our social media outlets.

Project proposals can be submitted by individuals or teams. In the case a team wins, the prize is to be divided among the team members equally. We are accepting proposals for projects in early or mid stages of development.

All applications should include a clear description of your project, how it falls into the realm of the digital humanities, a timeline for the project work, and a transparent, itemized explanation of your funding requirements. For more details, see the Graduate Student DH Project Award page on our website.

We encourage prospective applicants to contact us to talk about your proposal before you submit. To set up an appointment send us an email at nycdigitalhumanities@gmail.com.

Proposals will be judged by a committee selected from the NYCDH Steering Committee. The winners will be chosen based on their intellectual contribution, innovative use of technology, and the clarity of their work plan.

To learn more visit our award information page: http://nycdh.org/nycdh-student-project-award.

TEI and XML Markup for Absolute Beginners

You have likely already come across the term TEI in digital humanities circles. Perhaps you have (like me) downloaded a TEI edition of a text before, but have not known how to read it or what to do with it. You may have (also like me) even deployed the term in casual conversation without completely understanding what TEI is or what it does.

This post aims to provide you with a concise and basic introduction to TEI and XML markup. I hope to provide definitions for all technical terms and to point you to useful resources to reference either when using TEI-encoded texts or when you eventually create your own. I have also included some step-by-step instructions on how to create an XML markup of a text in the last section of this post.

  1. What is TEI?

TEI is an acronym for the Text Encoding Initiative. The “text” here is clear enough for the time being. But what about “encoding?” All text that we view on a website is encoded with information that affects the way we see the text represented, or the way in which your computer sorts and interprets the text. There are many different languages in which one can encode text.

Because there was no single agreed-upon format or standard for encoding text, a consortium of humanists, social scientists and linguists founded the TEI in 1987 to standardize the way in which we encode text; the first full version of its Guidelines (known as P3) followed in 1994. It has since become a hugely popular standard for rendering text digitally and is still widely used today.

  2. What is XML?

Most versions of the TEI standards (of which there are a few) make use of XML: Extensible Markup Language. XML is a text-based coding metalanguage similar to HTML (in fact, both are markup languages, hence the “ML” at the end of each acronym) that, like the standards of TEI, has undergone several changes and updates over the years.

XML documents contain information and meta-information that are expressed through tags. The tags are similar to those used in HTML, if you are familiar with these. Below is a brief example of XML markup:

<p>the quick brown fox</p>

<!-- this is a test -->

<p>jumps over the lazy dog</p>

The letters bounded by these symbols “< >” above are tags. In this case, the tag being used is “<p>” which is used to separate paragraphs. All tags must be both opened and closed, or your code will not work. “<p>” is the opening tag and “</p>” is the closing; all tags are closed in the same way. The text in between the opening and closing tags is what will be encoded.

To insert comments into your XML document that will not have any bearing on the function of your code, follow the format found on the middle line of the document above, i.e. <!-- YOUR TEXT HERE --> (note that a comment opens and closes with two plain hyphens).
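
The open-and-close rule can also be checked mechanically. What follows is a minimal sketch, assuming Python 3 and its standard-library xml.etree.ElementTree parser, neither of which this post otherwise relies on. The helper name is_well_formed is hypothetical, and the enclosing <body> element is added here only because a parser expects a single outermost element.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The two paragraphs from the example above, wrapped in one enclosing element.
good = """<body>
<p>the quick brown fox</p>
<!-- this is a test -->
<p>jumps over the lazy dog</p>
</body>"""

# The same text with the first closing </p> forgotten.
bad = """<body>
<p>the quick brown fox
<p>jumps over the lazy dog</p>
</body>"""

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False

# The default parser skips comments, so only the two <p> elements remain.
paragraphs = [p.text for p in ET.fromstring(good).findall("p")]
print(paragraphs)
```

A check like this only confirms that tags are properly nested and closed; it says nothing about whether the tags themselves are ones that TEI allows.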

  3. Why know or use TEI and XML?

Before we delve into the actual process of creating a simple XML markup of a text, you might already be wondering precisely how learning a bit of XML and TEI will be beneficial to your scholarship.

There are a few ways in which we can envision the uses of a working knowledge of XML and TEI. First and foremost among these is the possibility of you creating and disseminating your own TEI editions of texts – perhaps a transcription of a manuscript few others have seen, or handwritten notes you have uncovered in your archival research. In this case, you can let your more tech-savvy colleagues add your editions to their corpus – to query and explore the documents in any way they see fit.

As humanities scholars in the twenty-first century, we often find ourselves asking broad and provocative questions about our disciplines and our work. One particularly captivating question has always been about the nature of what we call “the text.” The digitization of texts has provided this line of questioning renewed energy, and a basic understanding of the encoding process would also arm you with the vocabulary to take part in this conversation.

A final use of TEI and XML would involve you querying documents created by others. There are an enormous number of TEI editions of texts available freely online, and the more comfortable you are with code, the more you can do with them. One deceptively simple way of visualizing your XML code is turning it into a spreadsheet using Microsoft Excel. This is especially useful if you wish to add data from a TEI edition into a database for your thesis or dissertation. To do this in Excel, simply select the “data” tab on the top utility bar and specify “from other sources” before selecting your XML file.
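
For readers who would rather script this step than use Excel's import dialog, the same idea can be sketched in a few lines. This is an illustration only, assuming Python 3 and its standard library. The sample text, the function name elements_to_csv, and the choice of <placeName> are all hypothetical; the namespace http://www.tei-c.org/ns/1.0 is the one that TEI P5 documents declare.

```python
import csv
import io
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample invented for illustration. It has no TEI header, so it is
# not a complete TEI document; a real edition would be read from a file.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <p>The road runs from <placeName>Acre</placeName> to <placeName>Jerusalem</placeName>.</p>
      <p>Ships sailed for <placeName>Damietta</placeName>.</p>
    </body>
  </text>
</TEI>"""


def elements_to_csv(xml_text: str, tag: str) -> str:
    """List every <tag> element as a CSV row: paragraph number, element text."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["paragraph", tag])
    for number, para in enumerate(root.iterfind(".//tei:p", TEI_NS), start=1):
        for element in para.iterfind(f".//tei:{tag}", TEI_NS):
            writer.writerow([number, "".join(element.itertext()).strip()])
    return out.getvalue()


table = elements_to_csv(SAMPLE, "placeName")
print(table)
```

The resulting text can be saved with a .csv extension and opened in any spreadsheet program, one row per tagged place name.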

  4. Finally, actually doing TEI and XML.

In this section, I will quickly walk you through the process of creating a simple XML document in accordance with the standards of TEI.

-What you will need

A plain-text editor is the most important tool you are going to need. Atom is a wonderful free editor that works well on Macs, as is Sublime Text 2. Once your file is saved with an .xml extension, your tags should appear in color, making the markup process easier and making it more noticeable when you have forgotten to close a tag. You will also need a plain text copy of the text you are working with.

-Tags

We have already gone over the basics of tags above. Another key issue here is deciding which tags to use, i.e. which tags are useful for the goals of your project or edition. These tags and their correct formatting can be found online (see http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-tag.html for a list of element tags), but a cursory list of common ones can be found below:

<p>; <body> ; <l> ; <name> ; <placeName> ; <abbr> ; <head> ; <quote> ; <date> ; <time>

Once you have tagged all the elements you wish to tag in your text (making sure to close all of your tags!) you have to add a TEI header in order to ensure that your document adheres to the TEI standards. Examples of TEI headers can be found here: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/HD.html
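
To make the shape of a finished file concrete, here is a hedged sketch of a very small TEI document, held in a Python string and checked for well-formedness with Python's standard library. The title and sentences are placeholders invented for illustration. The header follows the minimal pattern described in the TEI Guidelines: a fileDesc containing a titleStmt, a publicationStmt, and a sourceDesc. Parsing confirms only that the tags are properly nested and closed; it does not check the file against the TEI schema.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Placeholder content throughout; only the element structure is the point here.
MINIMAL_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Placeholder title</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished draft.</p>
      </publicationStmt>
      <sourceDesc>
        <p>Transcribed from a placeholder source.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <head>A sample heading</head>
      <p>On <date when="1250-01-01">the first of January</date> the scribe
        reached <placeName>Acre</placeName>.</p>
    </body>
  </text>
</TEI>"""

root = ET.fromstring(MINIMAL_TEI)  # raises ParseError if a tag is left unclosed

# Element names come back qualified with the namespace, e.g. "{...}teiHeader".
top_level = [child.tag.replace("{" + TEI_NS + "}", "") for child in root]
print(top_level)

title = root.find(f".//{{{TEI_NS}}}title")
print(title.text)
```

The header sits beside the <text> element rather than inside it, so the tagging you have already done in the body stays as it is when you add a header.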

-Get Validated!

A useful resource for checking your code can be found here: http://teibyexample.org/xquery/TBEvalidator.xq

This validator will let you know whether your code follows the TEI standards, and if not, where your mistakes can be found.

  5. Further Reading and Resources

There are lots of online resources for TEI and XML, but here are a few of my personal favorites:

http://teibyexample.org/

http://www.tei-c.org/index.xml

https://wiki.dlib.indiana.edu/display/ETDC/TEI+Tutorial

ftp://ftp.uic.edu/pub/tei/teiteach/slides/louslide.html

 Alexander Profaci is an outgoing MA student in Medieval Studies at Fordham University. He will be entering the PhD program in History at Johns Hopkins University in the fall. Follow him on Twitter @icaforp.

Digital Humanities and the Lure of Parchment (An After-Action Report of Fordham Medieval Studies’ Oxford Outremer Map Colloquium)

A post by HASTAC Scholar Tobias Hrynick

Outremer Map.png

On April ninth, I had the opportunity to present at the Oxford Outremer Map Colloquium (see the previous blog entry for the lovely poster!). The Colloquium was organized around a project by Fordham’s Medieval Studies program to digitize a map drawn in the mid-thirteenth century by the English chronicler, artist, and cartographer, Matthew Paris, depicting the eastern Mediterranean, stretching from Armenia to Egypt. The map demonstrated a number of interesting features which were not typical of maps of this period (north-orientation, relatively close adherence to the physical shape of the region depicted, and the absence of an obvious symbolic program). Several features – such as the close integration of pictorial and verbal elements (hard to depict in traditional print editions), the roughness of the writing, and the bleed-through of ink from the opposite side of the sheet – have long served to discourage study of the map, but could be ameliorated with digital photo-editing and annotation, and made the map a good candidate for a digital edition. The project containing this edition can be viewed on the Medieval Studies website.

The colloquium was divided into four sections. During the first two, three prominent scholars of medieval mapping – Evelyn Edson, P. D. A. Harvey, and Asa Mittman – discussed the place of the Oxford Map in the scholarship, and numerous issues surrounding its creation: its purpose, its time of composition, its sources, its place in the manuscript tradition, and the best way of understanding the map in the context of the Bible with which it came to be bound. In the third section, Asa Mittman delivered a paper discussing the ways in which manuscripts are affected by being placed in digital contexts, with myself and Abigail Sergeant (who both worked on the Oxford Outremer project as Master’s students) responding. In the final section, David Pedersen, a Ph.D. candidate in Fordham’s English Department, introduced a discussion of the Oxford Outremer map and website as pedagogical tools.

I regret that I cannot justify a more detailed discussion of the first, second and fourth sessions here – all were productive and engaging. Given the purpose of this blog, however, I would like to zero in on several topics which arose in the third section, pertaining particularly to manuscripts in digital contexts, which resonated with other remarks I have heard expressed recently about the possibilities and limitations of digital tools in manuscript studies.

Dr. Mittman, during his talk, emphasized the losses implicit in presenting manuscripts digitally when compared to experiencing them as physical objects. Manuscript digitization, in his view, though useful if the manuscript was unavailable, or in parallel with the manuscript itself, risked disrupting the immediate, visceral connection to the past which real manuscripts offered. This central problem – balancing the benefits of easier access against those of physical contact – was a recurring theme in the subsequent discussion.

This question, in one form or another, is common throughout the digital humanities, and indeed in society’s engagement with text even outside academic contexts: consider the vigor with which some people oppose the very concept of e-book readers. However, such concerns are felt more acutely in certain DH sub-disciplines than in others – though digital humanists pride themselves on their methodology’s ability to foster interdisciplinary approaches, the old disciplinary divisions do still have some force. History has yet to develop a theoretical framework which clearly articulates the role of digital scholarship as against more conventional approaches, as structures like Franco Moretti’s conception of “distant reading” have done for literary analysis. Though medievalists have, by and large, been relatively welcoming of digital approaches to analysis, attempts to digitize manuscript source materials have been treated much more ambivalently.

Dr. Mittman’s talk was the second talk I have attended in the last two months which focused on the potential losses of approaching manuscripts through digital imagery. During Fordham Medieval Studies’ Manuscript as Medium conference, Dr. Martha Easton delivered the provocatively titled talk “‘If Everyone is Special Then No One Is:’ Manuscripts for the Masses.” Easton, like Mittman, emphasized the experiential and physical aspects of medieval manuscripts, but Easton went still further, including the very action of travelling to archives as an integral part of the experience of text, rendering scholarship into a special kind of pilgrimage (though she too admitted that digitization projects were often useful, when the real thing was unavailable).

Certainly, the experience of the manuscript – the thrill of holding something with a message from a thousand years ago – cannot be entirely replaced by manuscript digitization. This comparison between digital versions and manuscripts, however, is somewhat misleading, and can cause us to be unduly pessimistic about the possibilities. What digitization projects have the potential to replace are not the manuscripts themselves, but print editions and microfilms – it is these tools which represent the traditional recourse of scholars who cannot practically consult all the necessary sources in manuscript. People able to visit manuscripts will continue to do so, but those who cannot will be given much stronger alternatives. The very fact that digital versions are being widely compared to manuscripts, however negative that comparison, is a mark of how strikingly true to life they can be.

Returning to a more optimistic view of manuscript digitization is important not only as a morale boost for those, like myself, who have been involved in this kind of project. It also has important implications for the kinds of digitization projects which take place. Pessimism could too easily lead to extremely conservative digital editions, which do in fact simply try to accomplish the impossible, in replicating the experience of the manuscript, and end up producing merely photographs. As Dr. Mittman pointed out at the colloquium, however, the best digitization projects do not merely post manuscript images (though even this can be extraordinarily valuable) – rather they offer some scholarly intervention, serving either to contextualize a manuscript, or to make it more easily usable to the chosen audience. It is vitally important that we accept manuscript digitization as a way of talking about manuscripts, rather than replacing them, so that we can fully exploit our new tools.