Introducing the Tattoos Project


Sharon Howard


3 November 2018

The project

This website accompanies the British Academy-funded project Criminal Tattoos.

The remarkable increase in tattooing among convicts in the nineteenth century is poorly understood. It is unclear why convicts marked their bodies in ways which facilitated official surveillance, nor do we understand the complex mixture of sentiments expressed. Building on existing, but limited, case studies, this project will analyse tattoos on 60,700 convicts in Britain and Australia from 1788 to 1925, examining descriptions of tattoos alongside evidence of convicts’ personal backgrounds. While this evidence is available within the Digital Panopticon web resource, this topic is currently impossible to research because 1) information about tattoos is often cryptic and enmeshed within broader descriptive fields, and 2) existing search and visualisation facilities do not enable the data to be usefully interrogated and synthesised. This project will develop new methods of data extraction and visualisation in order to better understand the meanings embedded within this and other rich bodies of fragmentary textual evidence.

The project has two main components:

Data extraction and classification

The physical description fields in the datasets consist of undifferentiated and fragmentary language which can include various physical attributes (e.g. heights, scars, disabilities) in addition to tattoos. Descriptions of tattoos are themselves varied, and include names, initials, dates, emotions, objects and symbols. These scraps of descriptive language need to be identified, delineated and classified, but most of the datasets are too large for manual mark-up to be feasible. Instead, the project will build on and extend automated techniques developed for the Digital Panopticon. The Digital Humanities Institute, Sheffield, will be primarily responsible for the development work.

Data analysis and visualisation

Which is where this blog comes in! I’ll be using R to experiment with data analysis and visualisation techniques in order to explore some of the project’s research questions. In addition to tattoos, we’ll have significant amounts of biographical information for many individuals, including gender, age, occupation, religion, type of crime, previous convictions and punishment. We’ll explore ways to use visualisation to summarise and analyse this evidence, in order to identify the specific contexts in which tattooing, and particular types of tattoos, were used.

In the process I hope to offer methods that can be transferred to the analysis of other types of complex and fragmentary textual data found in many historical (and contemporary) sources.

Research questions include:

  • Why, over the course of the nineteenth century, did convicts increasingly use their bodies as sites for recording their life events and expressing their identity and sentiments, despite the fact that the state collected evidence of their tattoos for surveillance purposes?
  • Which convicts, and from which social contexts, were most likely to have tattoos? How does tattooing vary by gender, age, occupation, religion and place of origin? How did these patterns change over the course of the nineteenth century?
  • Are there any significant differences between the tattoos of convicts who were transported to Australia and those who were imprisoned in Britain? Were recidivists more likely to have tattoos, and how did convicts chart their penal experiences?
  • What light does the practice of tattooing shed on changing attitudes towards the body in the nineteenth century?

The datasets

The project focuses on evidence of tattooing found in six major datasets in the Digital Panopticon. The broad chronological and geographical range allows the project to analyse change over time and to compare tattoos on transported and imprisoned convicts.

The counts of individuals with tattoos are estimates. The data is based on textual descriptions of tattoos, not images. (Very occasionally officials added rough sketches of tattoos, but this seems to be uncommon.)

dataset                          location           start  end   individuals
Criminal Registers               London             1791   1801  25
Founders and Survivors           Tasmania           1803   1853  9000
Millbank Prison Register         London             1816   1826  120
Convict Indents                  Western Australia  1850   1868  1000
Convict Licences                 London             1853   1887  550
Registers of Habitual Criminals  England and Wales  1881   1925  50000

Some of the datasets focus on London criminals: the Criminal Registers listed only London/Middlesex prisoners, and the Digital Panopticon transcribed in detail only the London prisoners in the Millbank Prison Register and Convict Licences. The other datasets have national coverage (Founders and Survivors, Western Australia Convict Indents, Registers of Habitual Criminals).
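As a quick sanity check on the figures in the table above, the summary can be held in a small data frame and totalled. This is only a sketch in base R: the column names and the `share_pct` column are illustrative choices, and the counts are the same estimates given in the table, not new data.

```r
# Dataset summary transcribed from the table above; counts are estimates.
datasets <- data.frame(
  dataset = c("Criminal Registers", "Founders and Survivors",
              "Millbank Prison Register", "Convict Indents",
              "Convict Licences", "Registers of Habitual Criminals"),
  location = c("London", "Tasmania", "London", "Western Australia",
               "London", "England and Wales"),
  start = c(1791, 1803, 1816, 1850, 1853, 1881),
  end = c(1801, 1853, 1826, 1868, 1887, 1925),
  individuals = c(25, 9000, 120, 1000, 550, 50000),
  stringsAsFactors = FALSE
)

# Total estimated number of tattooed individuals across all six datasets
total_individuals <- sum(datasets$individuals)

# Share of the total contributed by each dataset, as a percentage
# (share_pct is an illustrative column name, not part of the project data)
datasets$share_pct <- round(100 * datasets$individuals / total_individuals, 1)

# Show the datasets from largest to smallest
print(datasets[order(-datasets$individuals),
               c("dataset", "individuals", "share_pct")])
print(total_individuals)
```

The estimates sum to 60,695, which is consistent with the figure of roughly 60,700 convicts quoted for the project, and the Registers of Habitual Criminals account for the large majority of them.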

The descriptions

I’ll take a closer look at the data in later posts, but here are a few examples of descriptions containing tattoos, to show why the data extraction part of the project is such a challenge.

Criminal Registers

The number of tattooed individuals in this dataset is very small and probably not suitable for quantitative analysis, but 18th-century evidence of tattoos in Britain is rare so this may still provide useful context.

Aged 19. 5F/4I Sallow brown hair grey eyes Monaghar Ireland a Marine has an Anchor marked on left arm his right
25 Yrs. 5F/8I. Fresh Complex. brown hair grey eyes Kelse Scotland a Mariner served his apprenticeship from Shields in Ye. So Sea & Greenland. Trade & in the Shunderer was at has a woman & a Square & Compass marked on his left arm & on or cast .on the Cross
Aged 20. 5F. 5I. Fair Complex brown hair dark eyes Ireland in his right arm is marked emblems of Musenry, on is left arm Arist or certified, with the Mitials of new name Marner
Aged 21. 5F. 5I. Fair Complex brown hair dark eyes Cork Ireland on his left arm is marked Mermaid wish W. I. & W. M. I. M. on to wright arm taken in the Manly Captn. Adams of Pool.

Founders and Survivors

These were recorded after arrival in Tasmania and at least some were acquired after leaving Britain, so it’ll be interesting to compare them with the data for prisoners in Britain.

JR’ ‘MAH’ above elbow joint right arm. Ring on middle finger right hand. Large scar right hand near little finger. ‘SH’ on right arm.
T.F.R.H. rt arm S.H left arm - Stout made -
Lost some front Teeth Lower Jaw 6 Blue dots on left Hand
<[X: H]> TT. A.N. H. WC LL on Rt Shoulder
3 blue dots on left Hand

Registers of Habitual Criminals

Individuals could appear on numerous occasions in the registers. This makes the data more complex, but offers the potential to trace evolving personal biographies rather than snapshots at a single moment in time.

Sailor and flag right arm, mole right shoulder
Rose, thistle and shamrock on left wrist, ship on right arm
Blue dot right arm, mole right breast, birth-mark top of left foot, lost all front upper teeth
Ring third right finger, dot back of each hand, cross-flags left arm, sailor right arm, moles on back and neck
T. and three indistinct marks right arm, scars on chest, left eye, left wrist, back, back of head and left of head
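To make the extraction problem concrete, here is a deliberately naive first pass over the five Habitual Criminals examples above. It is a sketch only and not the project's extraction method: it assumes that commas separate individual marks, and it uses a hypothetical keyword list (`non_tattoo_pattern`) to rule out marks that are clearly not tattoos. Both assumptions are mine, and both turn out to be shaky.

```r
# Example descriptions copied from the Registers of Habitual Criminals above
descriptions <- c(
  "Sailor and flag right arm, mole right shoulder",
  "Rose, thistle and shamrock on left wrist, ship on right arm",
  "Blue dot right arm, mole right breast, birth-mark top of left foot, lost all front upper teeth",
  "Ring third right finger, dot back of each hand, cross-flags left arm, sailor right arm, moles on back and neck",
  "T. and three indistinct marks right arm, scars on chest, left eye, left wrist, back, back of head and left of head"
)

# Hypothetical keyword list for marks that are NOT tattoos (an assumption,
# chosen by eye from these five examples only)
non_tattoo_pattern <- "\\b(moles?|scars?|birth-mark|teeth)\\b"

# Assumption: a comma separates one mark from the next
segments <- strsplit(descriptions, ",", fixed = TRUE)

# One row per segment, keeping track of which description it came from
result <- data.frame(
  description_id = rep(seq_along(segments), lengths(segments)),
  segment = trimws(unlist(segments)),
  stringsAsFactors = FALSE
)

# Anything that does not match the non-tattoo keywords is flagged as a
# possible tattoo
result$possible_tattoo <- !grepl(non_tattoo_pattern, result$segment,
                                 ignore.case = TRUE, perl = TRUE)

print(result)
```

On these five examples the comma split separates "Rose" from "thistle and shamrock on left wrist", and it flags "left eye" and "left wrist" as possible tattoos when they are really the locations of scars. A comma can separate marks, items within one tattoo, or body locations, which is why simple splitting is not enough.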

Code and data

I’ll be using R, a programming language particularly geared towards statistical analysis and visualisation, throughout this blogging project.

More specifically, my R workflow has these key components:

  • RStudio, an “integrated development environment” (IDE) for R (free open source software, multi-platform)
  • The Tidyverse, “an opinionated collection of R packages designed for data science” (personally speaking, this was the single most important thing that transformed R into a language I could actually understand and use)
  • Markdown and RMarkdown for writing
  • Quarto and GitHub Pages for publishing

This enables me to do most of what I need for the project - data wrangling and tidying, exploratory analysis, data modelling and visualisation, blogging - in one place, so that (hopefully…) everything I do will be easily re-usable, either to test and reproduce my own results or as examples that can be adapted for other data and research.

There will be a fair bit of R code in most blog posts; this will be familiar to any readers who follow data science blogs, but less so to many historians. One of the goals of this blog is to show how R can be of practical use for working with large, complex historical datasets, right through the research process from data management to writing and publication. So, with this in mind, I’m also compiling a list of resources (tutorials, guides, packages, etc.) that I’ve actually used and found helpful, which will be updated regularly during the project.

(It doesn’t have to be R; Python, for example, can be used with the same kind of workflow. I think R is stronger specifically on statistical analysis and visualisation options, especially for data that doesn’t need a lot of work to make it ready for use, whereas Python would be better if I needed to do more heavyweight data processing and development.)

All R code and data will be made available for re-use at the project’s GitHub repository. Some of the data being used by the project is already publicly available from the University of Sheffield data repository, and other datasets will be added there or on GitHub during the course of this project. Datasets will be Creative Commons-licensed (or similar) for re-use unless otherwise stated, but the exact licence terms are likely to vary.

What’s next?

The initial phase of the project will focus on data cleaning and preparation for the process of extraction and classification. My first posts here will take a closer look at the descriptions data. I’ll be comparing datasets and using some basic text mining to explore how the descriptions were originally recorded, the range of information included, and the variability of the language used.
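As a rough idea of what "basic text mining" might look like at its simplest, the sketch below counts word frequencies in four of the example descriptions quoted earlier in this post. It uses base R only, and the tokenising rule (lower-case everything, then split on anything that is not a letter, digit or hyphen) is an assumption made for illustration rather than a settled choice.

```r
# Four example descriptions quoted earlier in this post
descriptions <- c(
  "Sailor and flag right arm, mole right shoulder",
  "Rose, thistle and shamrock on left wrist, ship on right arm",
  "3 blue dots on left Hand",
  "Lost some front Teeth Lower Jaw 6 Blue dots on left Hand"
)

# Assumed tokenising rule: lower-case, then split on runs of characters
# that are not letters, digits or hyphens
tokens <- unlist(strsplit(tolower(descriptions), "[^a-z0-9-]+"))

# Drop any empty strings left over from the split
tokens <- tokens[nchar(tokens) > 0]

# Count how often each token occurs, most frequent first
word_counts <- sort(table(tokens), decreasing = TRUE)

print(head(word_counts, 10))
```

Even on this tiny sample the most frequent token is "on", followed by "left" and "right": words about where a mark is on the body outnumber the words describing the mark itself.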

In later months, I’ll experiment with ways of mining and visualising the tattoos corpus, and using the linked life archives data to analyse relationships between tattooing and convicts’ biographical and social characteristics.