
Introduction: Wartime letters and the ‘Postbus NIOD’ campaign week

Published on 19 April 2023
Since 2020, in the project ‘First-hand Accounts of War: War Letters (1935-1950) from NIOD Digitised’, NIOD has been digitising its special collection of handwritten letters dating from before, during and after the German occupation of the Netherlands and the Indonesian War of Independence. The project involves more than 160,000 documents, many of them personal, which NIOD has been gathering since the liberation in 1945. During the ‘Postbus NIOD’ campaign week (31 March-7 April 2023), NIOD is appealing to anyone who has personal war correspondence languishing in their attic to donate it to NIOD.

The importance of these documents was stressed back in 1944 by the Dutch Minister of Education, the Arts and Sciences, Gerrit Bolkestein, then based in London: ‘If future generations are to be fully aware of what we as a people have endured in these years [...], it is precisely the simple documents that we need: a diary, letters from a worker in Germany, [...].’ NIOD never stopped collecting these documents – and continues to do so to this day.

Personal documents, also known as ‘ego documents’, form a key part of the NIOD collection and are important to historians and other interested parties for various reasons. Wartime letters offer a fairly direct reflection of the personal communication between writer and recipient. They show how contemporaries captured their emotions, experiences and expectations in text at the time, not how these were later shaped into a story once the course of history was known. Eyewitness accounts from interviews in the 1980s and 1990s allow us to hear the voices of people who were relatively young during the war. Letter-writers, by contrast, also include older people and people who were unable to recount their war experiences later. The importance of historical wartime letters was further highlighted in late 2022, when NIOD’s collection of wartime letters, together with its diary collection, was included in the Dutch UNESCO ‘Memory of the World’ register.

[Image: UNESCO’s visit to NIOD]

The importance of digitisation

NIOD not only collects and preserves the letters; we also digitise them. In ‘First-hand Accounts of War’, we are working hard to conserve, scan and transcribe the collection and make it digitally available. The first important reason for doing this is that digitisation makes the wartime letters easier to access. This means that, where legislation allows, the letters can be searched and read online as well as consulted in the NIOD reading room. This will soon be possible on the NIOD website, via Archieven.nl, or via Netwerk Oorlogsbronnen. Thanks to this online access, anyone who wishes will be able to search the letter collection. Moreover, the digitised and transcribed wartime letters are also being made available as a dataset for scholarly research. This creates new opportunities for historical research: the sources can, for example, be systematically searched and analysed using a computer.

Digitisation is also important because it helps to preserve the original historical documents, many of which are fragile. In times when paper was scarce, many letters were written on flimsy scraps of paper or strips of old tea towel. As it will soon be possible to consult the collection from a screen, the original documents will be exposed to less wear and tear.

[Image: NIOD’s paper archives]

Getting started on digitisation

Over the past two years, the wartime letters in NIOD’s current Collection 247: Correspondence were first conserved, restored where necessary, and then scanned. The scanning was done by a professional scanning company. Once we had received the scans, we started to prepare for the automatic transcription of the handwritten letters. This process is known as Handwritten Text Recognition (HTR), and is related to Optical Character Recognition (OCR), the use of computers to transcribe typed or printed text. To transcribe the handwritten text, we use the software program Transkribus, developed by the Austrian READ-COOP in consultation with archivists, scientists, historians and IT specialists.

[Image: screenshot of Transkribus]

First, we manually transcribed a relatively small number of the digitised scans: about 1,000 in all. We were assisted by enthusiastic volunteers, who actively contributed ideas about developing the project as well as transcribing the letters. In addition to making transcriptions, these volunteers are also identifying and annotating significant elements in the letters, such as the names of the sender and the receiver, or the place and time of writing. This will be covered in more detail in a future blog post during the campaign week. We used the virtually error-free manual transcriptions, known as Ground Truth, to train a computer model in the Transkribus software program. ‘Training’, in this sense, refers to the process in which the computer learns to recognise handwritten text and transcribe it automatically. These computer models are trained using Artificial Intelligence (AI) technology.

The computer at work

Producing acceptable results proved to be a process of trial and error. The quality of the automatic transcriptions is expressed as the Character Error Rate (CER): the percentage of characters in the text that are recognised incorrectly, which should ideally be as low as possible. Our experience with training and testing the computer models suggested that, up to a certain point, more letters and a greater variety of handwriting yielded a more useful model for automatic transcription. With around 1,000 Ground Truth transcripts as training material, we now seem to have reached the point where adding more material and greater variation no longer produces significantly better results (in other words, a lower CER).
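To make the CER more concrete, here is a minimal Python sketch of how such a figure can be computed: the number of character edits (substitutions, deletions and insertions) needed to turn the automatic transcription into the Ground Truth, divided by the length of the Ground Truth. This is an illustration only. The function names and the example sentences are made up for this post, and the exact calculation inside Transkribus may differ, for instance in how it treats spaces or punctuation.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character edits turning hypothesis into reference."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the number of characters in the reference."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Invented example: a manual (Ground Truth) line and an imperfect automatic one.
ground_truth = "Lieve moeder, wij zijn veilig aangekomen."
automatic = "Lieve moedor, wij zyn veilig aangekomen."

cer = character_error_rate(ground_truth, automatic)
print(f"CER: {cer:.1%}")
```

In this invented example the automatic line differs from the manual one by three character edits on a line of 41 characters, so the CER comes out at roughly 7%.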

The final computer model that we have trained on the Ground Truth set has become rather good, in our humble opinion, at automatically transcribing scans of handwritten Dutch. Whilst the transcriptions made with the computer model are not perfect, they are readable and can be searched perfectly well with a computer. We have now achieved an error rate (CER) of 4.7%. That is to say, 95.3% of the characters in a handwritten text, on average, are correctly recognised and transcribed by the computer. The model can effectively read, recognise and automatically transcribe Dutch manuscripts from the period 1935-1950. We will thus shortly be using the model to automatically transcribe all of the wartime letters that have already been scanned.
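As a rough sketch of how imperfect transcriptions can still be searched, the Python snippet below compares a search term with each word in a transcription and accepts near matches. It relies on `SequenceMatcher` from Python's standard `difflib` module. The function name `fuzzy_find`, the similarity threshold of 0.8 and the example sentence are assumptions made for this illustration; this is not how the NIOD website, Archieven.nl or Netwerk Oorlogsbronnen actually implement search.

```python
from difflib import SequenceMatcher


def fuzzy_find(query: str, text: str, threshold: float = 0.8) -> list[str]:
    """Return the words in text whose similarity to query is at least threshold."""
    wanted = query.lower()
    hits = []
    for word in text.split():
        # Ignore surrounding punctuation and case when comparing.
        token = word.strip(".,;:!?()'\"").lower()
        if SequenceMatcher(None, wanted, token).ratio() >= threshold:
            hits.append(word)
    return hits


# Invented example: the computer has read "Amsterdam" as "Amsterdarn".
transcription = "Wij zijn gisteren in Amsterdarn aangekomen."
print(fuzzy_find("Amsterdam", transcription))
```

A lower threshold finds more misread words but also returns more false hits, so the value would need tuning on real transcriptions.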

The computer model that we have trained is also being made available to interested parties, researchers and other archive and heritage institutions, so it can be used to automatically transcribe handwritten Dutch documents from the mid-twentieth century. A pilot version of the model is available online and can be tested by anyone using their own scans. 
