Italian Trulli

Analysing Visual Communication in Theory and Practice with Archetype

Peter Anthony Stokes (peter.stokes@ephe.psl.eu), École Pratique des Hautes Études – Université PSL and Debora Marques de Matos (debora.matos@uni-muenster.de), University of Münster and Neil Jakeman (neil.jakeman@kcl.ac.uk), King's College London

1. Brief description

The goal of this workshop is to teach participants how to use Archetype while introducing them to the theoretical and practical issues around describing and categorising handwriting and other visual communication in a digital context. This will be done primarily through practical application of the software to a range of different cases including not only left-to-right alphabets but also hieroglyphs, ideographs and others as well as right-to-left and top-to-bottom scripts. Attention will also be paid to decoration and other forms of graphic communication, insofar as it comprises repeated visual elements that can be compared by Art Historians or others much as components of letters are compared by palaeographers.

A screenshot of Archetype as used in the DigiPal project
Figure 1. A screenshot of Archetype as used in the DigiPal project

Archetype (formerly known as the DigiPal Framework) is a generalised system for the online presentation of images with structured annotations and data which allows users to analyse, describe, search for, view, and organise detailed characteristics of handwriting or other material in both verbal and visual form. Designed primarily for the palaeographical analysis of historical handwriting, it was first developed at King’s College London for the Digital Resource and Database for Palaeography, Manuscript Studies and Diplomatic (DigiPal) project, funded by the European Research Council, and has since been extended by the King’s Digital Lab particularly through the Models of Authority and Exon Domesday projects. Available as FOSS, it now used in around two dozen research projects, most of which are small private studies but some of which are the result of large funded research projects. Further details are available from the URLs and bibliography below. Applications to date range from medieval Latin charters to modern draft manuscripts; parchment to inscriptions on stone and coins; writing in Hebrew and Greek to Chinese or Cuneiform; and decoration in manuscripts, tapestries and paintings. Results can be interrogated via a web API or through XML exports with fully customisable templates.

For this workshop, participants will be introduced to the principles and practice of the software and will discuss some of the underlying theoretical issues. They will learn to understand and use the Archetype model for writing and other graphic communication, including how to customise the software for their own purposes. They will see different ways in which the software can be and is being used, not only for small and large research projects but also for teaching and public engagement. Example materials will be provided, but participants are strongly encouraged to bring their own materials and research questions to experiment with. Indeed, the workshop will end with an opportunity for participants to work on their own materials, individually or in small groups, while sharing the challenges, problems and solutions that they face with the leaders and other participants.

Screenshot of Archetype applied to decoration
Figure 2. Screenshot of Archetype applied to decoration

Two key themes will underlie the workshop. The first is that Archetype is designed primarily to enable manual annotations of images. This emphasis on manual annotations is in deliberate contrast to word spotting, character segmentation and OCR/HTR: the goal is not to annotate every occurrence of every symbol, but rather to enable the researchers to make their own decisions and to communicate those decisions to others, a principle very much in the spirit of ‘algorithmic accountability’ which is often difficult if not impossible to achieve with machine vision and deep learning.

The second key theme of the workshop is that Archetype involves structured annotations, and this distinguishes it from other image annotation systems. To use Archetype requires creating a model of the handwriting or other visual communication, reflecting on which visual elements are significant to the research questions at hand. This then enables users to carry out requests such as ‘show me examples of letters with ascenders written by a given scribe’; ‘show me examples of the hands of different people drawn in a given set of manuscripts’; or ‘show me images of personal names in this document’. However, defining an appropriate model is a significant challenge that we will discuss and experiment with in practice. Comparison and connections will also be made to some other relevant ontologies and descriptive models such as IconClass, the Biblissima and CRMtex adaptations/extensions of CIDOC-CRM, and the IDIOM ontology for Mayan script.

Screenshot of Archetype as used in the Exon Domesday project
Figure 3. Screenshot of Archetype as used in the Exon Domesday project

Participants can expect to end the day with a local installation of Archetype which they will have customised during the workshop, and the skills to extend their Archetype installations in a structured manner that will lend itself to effective research. The workshop will focus on how to use Archetype on participants’ own computers for their private use, but the results could later be installed on a server for team work and/or publication.

The default home page of Archetype before customisation
Figure 4. The default home page of Archetype before customisation

2. Workshop leaders

Peter A. Stokes, École Pratique des Hautes Études, Université PSL
peter.stokes@ephe.psl.eu

Peter has around fifteen years of experience in applying digital methods to palaeography and manuscript studies, including leading the team that developed Archetype. He has a degree in Computer Engineering and a joint degree in Classics and medieval English, and a PhD in palaeography (University of Cambridge). He has led or co-led several large projects in digital humanities, including an European Research Council Starting Grant (DigiPal) and two Arts and Humanities Research Council grants (Models of Authority and Exon Domesday), as well as a Leverhulme Early Career Fellowship in digital palaeography. He has published on palaeography and digital humanities, as well as name-studies, lexicography, and Anglo-Saxon charters.

Debora Marques de Matos, University of Münster
debora.matos@uni-muenster.de

Debora is a researcher at the University of Münster, Germany. With a background in art history, Hebrew palaeography and digital humanities, she currently leads 'The Other Within,’ a project that crosses research areas such as iconography, big data, and computer vision. Before that, she conducted research on the material transition from manuscript to print, and her PhD thesis was dedicated to the production of illuminated Hebrew books in Portugal. She has extensive experience with Archetype applied to Hebrew script and book iconography.

Neil Jakeman, King’s Digital Lab, King’s College London
neil.jakeman@kcl.ac.uk

Neil has been working on DH projects at King’s since 2011, having contributed both as a Developer and more recently as a Research Software Analyst. He has an MSc in Geographical Information Systems and Environmental Management (University of Brighton) and a BSc in Geology (University of Cardiff). Neil’s experience with geospatial systems complements the technological approaches used in the annotation of images and he is now acting as an ambassador for Archetype to push the framework into new domains and applications such as Numismatics, Iconography, and Art History.

3. Target audience and expected numbers

The target audience is people interested in structured annotations applied to visual communication, including interest in problems of modelling and communicating such content. The core application is palaeography and epigraphy, including not just alphabetic languages but also hieroglyphs, ideographs, and so on, but the tool is also useful for those interested in manuscripts, inscriptions and other forms of writing more generally, as well as for art history. It is also relevant to those interested in modelling writing and models for deeply structured annotation. Previous experience has shown considerable interest from PhD students and librarians as well as from those in traditional positions of teaching and research. Archetype and its immediate predecessors have been presented several times at DH and have received significant interest. Expected numbers for this event are therefore around ten and potentially up to twenty participants.

4. Length and format

Length: One day

Format: The workshop will be very hands-on, with participants required to work actively on their laptops throughout the day. The general format will be that the workshop leaders present a new element of Archetype, both in terms of its theoretical rational and its practical usage, and then attendees test this out on the example material that will be provided to them. This pattern will largely be repeated throughout the day. At the end of the day, however, participants will be given time to start developing their own project with the help of the workshop leaders.

A tentative outline is provided. Please note that this will be adjusted according to time and in order to accommodate participants’ interests and existing skill-level.

  1. Theoretical introduction: Principles and models for describing handwriting and written communication.
  2. Getting started: setup, annotating images, adding texts.
  3. Adding Images and Hands.
  4. Adding Manuscripts (Items and Item Parts).
  5. Describing Graphs; adding new Characters, Allographs, Components and Features.
  6. Textual markup; linking text and image.
  7. Customising the framework: adding static content, changing menu structure, customising the home page.
  8. Advanced customisations: customising the search pages (basic Python required); using the Web API for custom searching (basic XSLT required); introduction to the customisation framework.
  9. Participants working time: an opportunity for participants to work on Archetype according to their own interests, with the support of workshop leaders and each other.
  10. Wrap-up discussion and further steps.

5. Requirements for attendees

Attendees are required to bring their own laptops with administrator rights to install new software. Archetype depends on Docker which works with Linux, MacOS or Windows, but it is somewhat limited on Windows and so Linux or MacOS are strongly recommended if possible (see the installation instructions on DockerHub in the references below). Those running Windows must also ensure in advance that their system supports virtualisation, as Docker and therefore Archetype will not function without it.

Attendees will be expected to install the Docker and Archetype software in advance of the workshop. Instructions are available at https://hub.docker.com/r/kingsdigitallab/archetype/, and attendees should contact the workshop leaders directly if they encounter any problems in the installation process.

Attendees will be provided with sample materials to work with; however, they are encouraged to bring their own materials to use with the system. To do this, they should come with one or more other documents in mind, including digital images of the documents and ideally some research questions to investigate.

Most of the day will be working with a web-based GUI interface, but the ‘advanced customisation’ session will involve working with Python, XSLT and command-line interfaces. Participants who are not already familiar with these are still extremely welcome and will receive the necessary support from the workshop leaders, but at least a basic knowledge of these will allow participants to gain more from this session.

Appendix A

Bibliography
  1. Anon. (2014) Digital Resource and Database of Palaeography, Manuscript Studies and Diplomatic. London: King’s College. http://www.digipal.eu (accessed 8 January 2019).
  2. Anon. (2017a). Archetype [Website]. London: King’s College. https://archetype.ink (accessed 8 January 2019).
  3. Anon. (2017b). Models of Authority: Scottish Charters and the Emergence of Government 1100–1250. London: King’s College. https://www.modelsofauthority.ac.uk (accessed 8 January 2019).
  4. Brookes, S., Stokes, P. A., Watson, M. and Marques de Matos, D. (2015). The DigiPal Project for European scripts and decorations. In Conti, A., Da Rold, O. & Shaw, P. A. (eds), Writing Europe 500–1450: Texts and Contexts. D.S. Brewer, pp. 25–58.
  5. Castro, A. (2017). VisigothicPal : Dating and Placing Visigothic Script. London: King’s College. http://visigothicpal.com (accessed 8 January 2019).
  6. Felicetti, A. and Murano, F. (2017). Scripta manent: a CIDOC CRM semiotic reading of ancient texts. International Journal on Digital Libraries, 18(4): 263–70 doi: 10.1007/s00799-016-0189-z.
  7. Gettatelli, V. (2018). Using Archetype: reflections on my Erasmus traineeship. King’s Digital Lab Blog. London: King’s College https://www.kdl.kcl.ac.uk/blog/using-archetype/ (accessed 9 January 2019).
  8. Gronemeyer, S. and Diehr, F. (2018). Organising the unknown: a concept for the sign classification of not yet (fully) deciphered writing systems exemplified by a digital sign catalogue for Maya hieroglyphs. Digital Humanities 2018 Book of Abstracts. Mexico City https://dh2018.adho.org/en/organising-the-unknown-a-concept-for-the-sign-classification-of-not-yet-fully-deciphered-writing-systems-exemplified-by-a-digital-sign-catalogue-for-maya-hieroglyphs/ (accessed 8 January 2019).
  9. Murano, F. and Felicetti, A. (2017). Definition of the CRMtex: An Extension of CIDOC-CRM to Model Ancient Textual Entities, version 0.8 (CIDOC-CRM) http://www.cidoc-crm.org/crmtex/sites/default/files/CRMtex_v0.8.pdf (accessed 8 January 2019).
  10. Noël, G. et al. (2019b). Archetype [Source Code]. London: King’s College. https://github.com/kcl-ddh/digipal (accessed 8 January 2019).
  11. Noël, G. et al. (2019c). Archetype [Docker Image]. London: King’s College https://hub.docker.com/r/kingsdigitallab/archetype/ (accessed 8 January 2019).
  12. Stokes, P. A. (2012). Modelling medieval handwriting: a new approach to digital palaeography. DH2012 Book of Abstracts. Hamburg, pp. 382–85 http://www.dh2012.uni-hamburg.de/conference/programme/abstracts/modeling-medieval-handwriting-a-new-approach-to-digital-palaeography (accessed 8 January 2019).
  13. Stokes, P. A. (2017). Scribal attribution across multiple scripts: a digitally aided approach. Speculum, 92(S1): S65–85 doi: 10.1086/693968.
  14. Stokes, P. A. (2018). Modelling multigraphism: the digital representation of multiple scripts and alphabets. Digital Humanities 2018 Book of Abstracts. Mexico City, pp. 292–96.
  15. Stokes, P.A. et al. (2018). Exon: The Domesday Survey of South-West England. London: King’s College. http://www.exondomesday.ac.uk (accessed 8 January 2019).
  16. Stokes, P.A. et al. (2019a). Archetype [Wiki]. London: King’s College. https://github.com/kcl-ddh/digipal/wiki (accessed 8 January 2019).
  17. Stokes, P.A., Brookes, S., Noël, G., Buomprisco, G., Marques de Matos, D., and Watson, M. (2014). The DigiPal Framework for script and image. Digital Humanities 2014 Book of Abstracts. Lausanne: pp. 512–14. http://dharchive.org/paper/DH2014/Poster-193.xml (accessed 8 January 2019). Poster available at http://www.digipal.eu/media/uploads/uploads/images/blog_posts/2014/DH2014%20Poster.pdf (accessed 8 January 2019)
  18. Stokes, P.A., Brookes, S., Noël, G., Davies, J.R., Webber, T., Broun, D., Taylor, A., and Tucker, J. (2016). The Models of Authority project: extending the DigiPal Framework for script and decoration. Digital Humanities 2016: Conference Abstracts. Krakow: pp. 896-99, http://dh2016.adho.org/abstracts/387 (accessed 8 January 2019)