On Translating Voyant Tools into Arabic

@DJWrisley and @NajlaJarkas1
Department of English
American University of Beirut

Voyant Tools is a “web-based reading and analysis environment for digital texts.”  The developers of Voyant, Stéfan Sinclair and Geoffrey Rockwell, reached out to the international digital humanities community this summer to ask for volunteers to translate it into languages other than English.  My colleague in the Department of English, Najla Jarkas, and I set out to translate it into Arabic.  Our draft of the translation of the version 2.1 interface can be found here.

Both of us have worked in the domain of Arabic-English-Arabic translation, but neither of us has translated within the specific domain of computing.  The language of Voyant Tools posed a challenge for us, since it blends the lexical fields of interfaces, data analysis and visualization as well as computational textual analysis.  We imagine that it is a new blend of terminology in many languages; it certainly is in Arabic.

We went to the library to check out English-Arabic dictionaries in computational linguistics and computing.  To some extent these were helpful, but other issues concerning the specific meanings used in Voyant arose.  After reading some portions of the co-authors’ new book Hermeneutica: Computer-Assisted Text Analysis for the Humanities, we were in a better position to understand the blended language of code, tools and explanatory text that makes up the Voyant endeavor.  The blogosphere about data analysis and multilingual Wikipedia were invaluable sources of inspiration.  Microsoft’s language portal was very useful, and yet on some more basic words we disagreed totally with its doxa.  Take two of its translations of visualization (الرسوم المرئية, مرئيات), literally “graphical drawings” and “visuals,” which we replaced with the Tuftian equivalent of “visual display/presentation” (العرض المرئي).

The language of computing is not fixed across the Arabic-speaking world, but varies according to region and individual usage.  Some terms like interface (واجهة), tools (أدوات) and corpus (مكنز) might provoke very little debate, and yet others like Scatterplot (مخطط التشتت), type/token ratio (نسبة الرموز للانماط), StreamGraph (عرض انسيابي), limited access (وصول محدود) or even skin (غلاف) might be found in print with a considerable amount of variance.

We believe that we have even made some creative interventions in cases where either Voyant Tools has perhaps coined new expressions, such as conceptual visualization (عرض مرئي مفهومي), or very recent trends in the digital humanities have done so, with notions such as non-consumptive (لا استهلاكي) usage.

How to distinguish between documents and files?  Between lines and rows?  Between terms and words?  We wondered sometimes if the base English of Voyant was even consistent.  Other, more basic issues arose, such as the key concept of the query.  Sometimes synonymous with the action of making a term search (عملية البحث), query also meant the contents of that search itself (كلمة البحث).  In the end, our goal was to make the interface as understandable as possible to an Arabic speaker for whom this emergent idiom of digital textual analysis is probably very new.

Theories of translation from English into Arabic are sometimes divided along national and regional lines between adopting equivalence faithful to both the structure and deeper meanings of the Arabic language and more calque-like expressions taken from foreign languages.  Since the theory behind Voyant lies in the creation of functions that can be reused as widgets in different web-based environments, we decided, when it came to the names of these tools (called Titles in the code), to be as ecumenical as possible.  For example, for the tool Bubbles we transliterated (بوبلز), but also gave an expression more faithful to Arabic (فقاعات).  We followed this strategy throughout, providing both an equivalence that translates the function of the tool and a transliteration: Workset Builder (ورك سيت بيلدر/ إنشاء المكنز الجزئي), TermsRadio (ترمز راديو /عرض زمني), even the name Voyant Tools itself (فواينت تولز / ادوات فواينت).  The user will discover these binomials positioned prominently in the Arabic interface.  The idea was to offer the diverse users of Voyant in Arabic both styles, translation and transliteration, for these iconic titles.

We are aware that our translation is a kind of translingual digital humanities essai, and we hope others will jump in to comment and build on our work.  We have no doubt made some errors in judgment.  Embedding right-to-left language in HTML was a big challenge and will need to be fixed by others more adept at that process than we are.  We welcome the input of the growing community of regional digital humanists, as well as anyone else who uses the Arabic interface, so that we can make it better.  We were humbled by the exercise, which we hope will inspire others to begin to forge a language allowing a broader public to embrace such forms of web-based reading and analysis in the Arabic language.


MAA 2016 Toponymic Strata in a Large Corpus of Medieval French

David Joseph Wrisley
“Place in Corpora” panel
Medieval Academy of America
Boston, 26 February 2016

Computational models, “however finely perfected,” are “temporary states in a process of coming to know, rather than fixed structures of knowledge” (McCarty, 26).

A view from down under










Visualization 1:  Peripleo.  A geographic view of many different digital objects related to the places of Herodotus.  Click here to explore the same query live.




Visualization 2: The places of Joinville’s Vie de saint Louis, data by @DJWrisley



Visualization 3: Top 50 place names in the medieval French corpus.

Visualization 4: The literary geographies of the full corpus of Christine de Pizan.  My own open geodata set (about 1000 place names) is available for download.



Visualization 5: Full dataset with a Time Slider (almost 10000 place names, 60% geocoded). 





Visualization 6: A Faceted Browser for Placenames in Medieval (French) Literature (with Stefan Jaenicke, DH 2013) (almost 3000 place names).



Visualization 7: Medieval French corpus place names overlaid on high population areas c. 1300. (base map: Richard Hoffmann)

Visualization 8: Medieval French corpus place names overlaid on agricultural systems c. 1300. (base map: Richard Hoffmann)

Visualization 9: Comparative Cross-Language Literary Geographies of Marian poetry: Gautier de Coincy, Gonzalo de Berceo, Alfonso el Sabio (Old French, Castilian, Galician) (608 points) (my own map data)

Visualization 10: Comparative Arabic-French late Medieval Historiography (al-Nuwairi Al-Iskandarani vs. Guillaume de Machaut) (my own map data, accessible color palette)



Non-Embedded Works Mentioned:

“Australia on top down under!” Nucolorvue Productions PTY Ltd.

Center for Medieval Studies / Fordham University (2016). Exploring Place in the French of Italy.

Doueihi, M. (2011). Pour un humanisme numérique. Paris: Seuil.

Elliott, T. and S. Gillies (2009). “Digital Geography and Classics” DHQ 3.1.

Hoffmann, R. (2014).  An Environmental History of Medieval Europe. Cambridge: CUP.

Jessop, M. (2008). “The Inhibition of Geographic Information in Digital Humanities Scholarship” LLC 23.1: 39-50.

Mostern, R. et al. (2016, forthcoming). Placing Names: Enriching and Integrating Gazetteers. Bloomington: Indiana UP.

Presner, T. and D. Shepard (2016). “Mapping the Geospatial Turn” A New Companion to Digital Humanities. Malden, MA/Oxford: Wiley Blackwell.

Simon, R. et al. (2016).  “Peripleo: a Tool for Exploring Heterogeneous Data through the Dimensions of Space and Time”  Code4Lib 31.

Stoa Consortium (2016). Pleiades.

Suard, F. (2011). Guide de la chanson de geste et sa postérité littéraire. Paris: Champion.

Turnator, E. (2015). Summary of the Proceedings of the “Linking the Middle Ages” Workshop.

Wrisley, D. (2016). Visualizing Medieval Places.

MLA S649 The Visual Display of Literary Information

MLA16 special session –  Austin, TX

Saturday, 9 January, 3:30–4:45 p.m.

Organizer: David Joseph Wrisley, Amer U of Beirut
Presiding: Roopika Risam, Salem State Univ.

1. “Visualizing Anti-Information: New Critical Diagrams of Attitude,” Andrew Hines, Vanderbilt Univ.
2. “What Did Jane Austen See at the Shakespeare Gallery in 1796?” Janine G. Barchas, Univ. of Texas, Austin
3. “How to Do Spaces with Words: Revisualizing Literary Geographies,” Matthew Price, Penn State Univ., University Park
4. “Visualizing Modernists’ Cities,” Kathryn Tanigawa, Univ. of Victoria

keywords: digital humanities, visualization, literary history, literary geography, 3D modeling


This panel’s title playfully evokes the landmark book in statistics and visual graphics by Edward Tufte, The Visual Display of Quantitative Information (1983, 2nd ed. 2001). Tufte’s work foregrounded visual design as more than just secondary to the understanding of data. Whereas his arguments were largely destined for an audience of graphic designers, today’s digital humanists will benefit from the lessons of his scholarship as they puzzle through how best to display or, to use more contemporary parlance, to visualize their objects of study. The data visualization community, arguably only decades old, has developed best practices for precision in screen-based representation of certain phenomena, and yet in collaboration with humanists its members encounter problems like uncertainty, complex temporality and nuance that we ourselves struggle to express clearly in prose argument, let alone in diagrams, maps or 3D. The replacement of Tufte’s adjective “quantitative” with “literary” in the title of this panel is also meant to provoke discussion about the nature, even the possibility, of literary information.

The critical background against which our panelists will speak only begins with Tufte, and extends to Moretti (Atlas of the European Novel; Graphs, Maps and Trees), Drucker (Graphesis), Presner/Shepard/Kawano (HyperCities), Bender/Marrinan (The Culture of the Diagram), Travis (Abstract Machine) and others. At the heart of those theoretical discussions is the question of how far the visualization of literary information goes beyond pure display and participates in the re-creation of literary objects. What is the nature of such visualization? Presentational? Critical? Epistemological? To what extent is visualization an end product, or part of the research process?

The panel is composed of literary scholars: three advanced doctoral students and one full professor, with an associate professor organizing and an assistant professor presiding. All have been involved in the practice of the humanities in distinctly digital environments: in 3D visualization labs, in a variety of interactive 2D mapping environments as well as in the humanities lab and the classroom. The panel looks both backwards to ways that we have sought to render our research problems visual, and forwards to new horizons in visualization afforded to us by digital culture. Far from imagining visualization as simplification or as the pure output of data, the presenters will reflect on the very idea of the literary transformed into data. Each of these environments has its own visual rhetorics, drawing largely on the needs of non-literary communities. The panelists will also address the generative, interpretative quality of their visualizing practices, stopping to ask what the complex issues of literary studies have to offer the community of visualization and 3D modeling in turn.

In his paper “Visualizing Anti-Information: New Critical Diagrams of Attitude,” Andy Hines reminds us that visual display of data is not only a late 20th century digital phenomenon, but finds diagrammatic expression in a number of prominent works of mid-century New Criticism. He argues that in their haphazard representation of the communicative attitude of literature, key critics’ diagrams anticipate a response to Johanna Drucker’s imperative “to find graphical conventions to show uncertainty and ambiguity in digital models.”

In her paper “What did Jane Austen see at the Shakespeare Gallery in 1796?,” Janine Barchas will present a digital reconstruction of the Shakespeare Gallery. It is a twice-removed visualization, a modern attempt to recreate a Georgian experiment in the visualization of literature. Her data originates in archival payment records in the Folger Shakespeare Library, surviving fragments in many museums, and John Boydell’s own engravings (for aspect ratio and visual stand-ins).

In his paper “How to do Spaces with Words: Re-Visualizing Literary Geographies,” Matthew Price engages with the practice of visualizing literary geographies, foregrounding the networks of “major” and “minor” spaces as his central organizing principle. Through the example of Charles Dickens’s Little Dorrit, he looks at the difficulties posed by the project of mapping an entire diegetic geography, and at the potential benefits offered by this “distant reading” of a single text.

Finally, in her paper “Visualizing Modernists’ Cities,” Katie Tanigawa outlines how she goes about creating three-dimensional maps (known as z-axis research) to carry out both close and distant readings of modernist novels such as Jean Rhys’ Quartet and Virginia Woolf’s Mrs Dalloway. In so doing, she is able to show which areas of the city are privileged by the texts and to analyze the socio-political significance of such privileged zones.

As you can tell, what links the papers on this panel is not a common time period, or even a common linguistic tradition, but rather the means of visual representation of different forms of data extracted from the literary text and/or context. Such visual forms, based on abstractions, must be interrogated for what they add to literary studies, how they provoke additional questions about the literary object at hand, as well as the forms of loss generated from formal representation.

Since the visual drives our discussion, I have suggested that each speaker prepare a presentation of at most 12-13 minutes (48-52 minutes in total), with a small set of key visuals integrated into the body of the talk. I hope that this format (a kind of extended lightning talk) will leave ample time for discussion (20+ minutes) to focus on the promises and challenges of visualization in literary studies.


Janine Barchas (panelist) is Professor of English at the University of Texas at Austin. Her first book, Graphic Design, Print Culture and the Eighteenth-Century Novel (CUP, 2003), won the SHARP DeLong prize in the history of the book. Her most recent book is Matters of Fact in Jane Austen: History, Location, and Celebrity (Johns Hopkins UP, 2013). She is the creator of the digital heritage project “What Jane Saw,” which reconstructs two Georgian museum exhibitions as witnessed by Jane Austen (www.whatjanesaw.org). In addition, she is co-curator of an upcoming brick-and-mortar exhibition, entitled “Will & Jane: Shakespeare, Austen, and the Cult of Celebrity,” at the Folger Shakespeare Library in fall 2016.

Andy Hines (panelist) is a PhD candidate in the Department of English at Vanderbilt University, where he will defend his dissertation in May 2015. His dissertation, Understanding Criticism: An Institutional Ecology of U.S. Literary Criticism and Theory, attends to the various media of literary criticism in the mid-twentieth century to theorize a new mode of disciplinary history. His articles appear or are forthcoming from English Language Notes and Criticism. Hines participated in the Mellon Institute in Digital and Public Humanities for Early Career Scholars in May 2014. He has blogged about the impact of digital interfaces on the writing process and the use of digital visualization tools in the classroom for the Vanderbilt Writing Studio and the Vanderbilt Institute for Digital Learning. (See, for example, http://vanderbilt.edu/writing/2013/12/the-blank-screen-and-the-blinking-cursor-the-design-of-writing/ and https://my.vanderbilt.edu/vidl/2014/08/guest-post-by-andy-hines-the-%E2%80%9Cdigital%E2%80%9D-classroom/) In addition, he has contributed the entry for “New Criticism” to the Routledge Encyclopedia of Modernism.

Matthew Burroughs Price (panelist) is a PhD candidate in English at the Pennsylvania State University, where he will serve as a 2015-16 Predoctoral Fellow in the newly formed Center for Humanities and Information. His research and teaching interests focus on 19th- and 20th-century British fiction, with particular emphasis on narrative theory, queer studies, and literary geography. Drawing on both older and newer methods of spatial analysis, his dissertation demonstrates how literary spaces can be read using narratological methods commonly reserved for literary characters—that is, as oscillating hybrids of real and fictional, structural and referential significance, in a qualitatively and quantitatively differential system of major and minor nodes. As a whole, the project demonstrates how “major” and “minor” novelistic spaces signify in diverse, dynamic, but nevertheless visually representable ways. Work from this project has appeared in ELN: English Language Notes, and his article “A Genealogy of Queer Detachment” is forthcoming in PMLA.

Roopika Risam (presider) is an assistant professor of English and English Education at Salem State University. She is currently finishing her monograph, Postcolonial Digital Humanities, which is under contract with Northwestern University Press. Her work has recently appeared in Digital Humanities Quarterly and First Monday and she has an article forthcoming in Left History.

Katie Tanigawa (panelist) is a Ph.D. candidate at the University of Victoria. She is also a research assistant for the Modernist Versions Project (MVP) and Implementing New Knowledge Environments (INKE) where she works on computational analyses and three-dimensional modelling of maps for modernist novels. In 2014 Tanigawa participated in a roundtable titled “Problems and Solutions for Modernist Digital Humanities” at the Modernist Studies Association and co-presented the MVP’s work on digital mapping and modelling at the MMLA. Tanigawa also co-wrote a paper about digital humanities mapping practices for the 2014 Digital Humanities conference and presented work at the 2013 Canadian Society for Digital Humanities conference. In 2013, she also ran a Digital Humanities Summer Institute sponsored Hello World workshop on the Mandala Browser, a rich-prospect browsing interface that visualizes relationships between data points in marked up texts.

David Joseph Wrisley (organizer) is Associate Professor in the Department of English at the American University of Beirut. His digital interests include the relations between visualization and literary history, distant reading in a corpus of medieval French, as well as different aspects of the spatial humanities including literary GIS, historical gazetteers and the social creation of spatial data. He is the author of a recent article on the Spatial Humanities (Porphyra 22, Dec 2014, 96-107). He blogs about his digital project on medieval space-time at visualizingmedievalplaces.wordpress.com and can be found on Twitter as @DJWrisley. He is directing an undergraduate spatial humanities project that aims at mapping languages in contact in Beirut (hel.djwrisley.com/index.php/mapf15/). He is the organizer of the Digital Humanities Institute – Beirut (dhibeirut.org), a member of the DHSI international digital humanities training network (dhsi.org).