The International Corpus of English (ICE) began in 1990 with the primary aim of collecting material for comparative studies of English worldwide. Twenty-six research teams around the world are preparing electronic corpora of their own national or regional variety of English. Each ICE corpus consists of one million words of spoken and written English produced after 1989. For most participating countries, the ICE project is stimulating the first systematic investigation of the national variety. To ensure compatibility among the component corpora, each team is following a common corpus design, as well as a common scheme for grammatical annotation.
Contact information: Professor Gerald Nelson, Department of English, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR. Email: firstname.lastname@example.org Fax: +852 2603 5270
Contact information for individual ICE teams may be found here.
News June 2015
Voices of the International Corpus of English (VOICE) CANADA
Voice Canada is a compilation of 70 sound recordings of speakers of Canadian English, based on recordings made as part of the data collection required for creating the Canadian component of the International Corpus of English (ICE-CANADA). The recordings are free and available for download, along with transcripts, through the University of Alberta Dataverse repository (requires registration in Dataverse and acceptance of terms and conditions).
ICE Workshop at ICAME 2015
A workshop entitled 'The Future of the International Corpus of English (ICE) project: New challenges, new developments' was held on May 27 this year as a pre-conference workshop of ICAME 2015 in Trier, Germany. Details of the workshop are available here.
I am grateful to Ulrike Gut and Robert Fuchs for organising this workshop.
Downloading ICE corpora: Note to teachers
Release of ICE NIGERIA
The full version of the ICE Nigeria corpus is now available. The download includes audio files for the spoken part. You can download the corpus here.
A POS-tagged version of the written part is also available, and a tagged version of the spoken part will be available as soon as possible.
I am very pleased to welcome a new team to the ICE project. ICE Gibraltar will be coordinated by researchers at the University of Vigo and the University of the Balearic Islands, Spain. For more details, please visit the ICE Gibraltar page on this site.
I am very pleased to announce the launch of the ICE-Scotland project. The project is based at Westfälische-Wilhelms-Universität and at Otto-Friedrich-Universität, Germany. Further details are available here.
ICE IRELAND and SPICE IRELAND
The ICE Ireland and SPICE Ireland corpora are now available to download from this site. SPICE Ireland consists of the spoken component of ICE Ireland, with prosodic and pragmatic annotation. For more information, see the ICE Ireland page.
The following recent publications have made extensive use of ICE corpus materials:
Biermeier, Thomas (2008) Word-Formation in New Englishes: A Corpus-based Analysis. Reihe: Anglistik/Amerikanistik.
Deuber, Dagmar (2014) English in the Caribbean: Variation, Style and Standards in Jamaica and Trinidad. Studies in English Language. Cambridge: Cambridge University Press.
Lange, Claudia (2012) The Syntax of Spoken Indian English. VEAW G45, Amsterdam: Benjamins.
Hundt, Marianne and Ulrike Gut (eds) (2012) Mapping Unity and Diversity Worldwide: Corpus-based Studies of New Englishes. VEAW G43, Amsterdam: Benjamins.
Aarts, Bas (2011) Oxford Modern English Grammar. Oxford: OUP.
Hasselgård, Hilde (2010) Adjunct Adverbials in English. Cambridge: CUP.
ICAME Journal No 34, April 2010, dedicated to 'new' ICE corpora.
Tagged ICE corpora now available
The tagging of all currently available ICE corpora with CLAWS7 and the USAS semantic tagger is now complete, and the corpora are available for non-profit, academic research. If you wish to download any of the tagged corpora, please send an email to email@example.com with the subject line "Tagged ICE Corpora". Your email should also indicate your academic affiliation. I will then get back to you with details of how to proceed.
Thanks to Dr Paul Rayson, Director of the UCREL research centre at Lancaster University, for his generous cooperation in this initiative.
Release of ICE Sri Lanka (written)
I am very pleased to announce the release of the written component of the ICE Sri Lanka (ICE-SL) corpus. The corpus is available in standard SGML format and in a POS-tagged version, using the CLAWS C7 tagset. To obtain a copy of the corpus and Manual, please email firstname.lastname@example.org.
Release of ICE USA (written)
Launch of ICE Uganda project
I am very pleased to announce the launch of ICE Uganda. The project is directed by Prof. Dr. Christiane Meierkord at Ruhr-University of Bochum, Germany.
Last updated: 29 June 2015 © The ICE Project
ICE corpora are available for non-commercial,