International Corpus Linguistics Research Unit (ICLRU)

The International Corpus Linguistics Research Unit focuses on the empirical investigation of linguistic data collected in naturalistic situations in the field. It supports researchers working on a number of languages in a range of domains.

Corpus material gathered by researchers attached to the Unit includes:

  1. The Beeching Corpus of Spoken French (155,000 words). Contact Dr Kate Beeching.
  2. Parallel (translation) corpora (zip file), including aligned translations between English and French, German, Spanish and Vietnamese in specialist areas such as Accounting, Cancer and Oncology, Legal, Official documentation, Physiotherapy and Renewable Energy.
  3. Small corpora of Mosetén and Pirahã (spoken in the Bolivian Andes and northwestern Brazil, respectively). Contact Dr Jeanette Sakel.
  4. Corpus of British and American song lyrics (500,000 words). Contact Professor Jonathan Charteris-Black.
  5. Corpus of British and American Political Speeches (1 million words). Contact Professor Jonathan Charteris-Black.
  6. The Bristol Corpus of Role-play dialogues (36,000 words). Contact Dr Kate Beeching.
  7. The Bristol Corpus of Learner Language, co-ordinated by Jeanine Treffers-Daller, now Director of the Centre for Literacy and Multilingualism, University of Reading. Contact Dr Kate Beeching.
  8. The John Turlik Corpus of Learner Language: Introduction to data and John Turlik Corpus (zip file).

Much of the work of researchers allied to the Unit is devoted to the study of language variation and change, language acquisition, intercultural communication, metaphor, stylometry and translation.
