Available ICE Corpora



The following corpora are available to download free of charge (under licence) from this site:

CANADA (ICE-CAN - 1m words, lexical)
JAMAICA (ICE-JA - 1m words, lexical)
HONG KONG (ICE-HK - 1m words, lexical)
EAST AFRICA (ICE-EA - Kenya & Tanzania, v 2, 1m words, lexical, including a version for Wordsmith)
INDIA (ICE-IND - 1m words, lexical)
SINGAPORE (ICE-SIN - 1m words, lexical)
PHILIPPINES (ICE-PHI - 1m words, lexical)
USA (ICE-USA - written component, c.400,000 words, lexical)
IRELAND (ICE-IRL - 1m words, lexical)
SPICE-IRELAND (SPICE-IRL - c.600,000 words with prosodic and pragmatic annotation)

The following corpora are also available, from the addresses shown:

GREAT BRITAIN (ICE-GB - 1m words, POS-tagged and parsed, distributed with ICECUP retrieval software)
Available from:
Survey of English Usage, University College London, Gower St, London WC1E 6BT, UK. Order form

NEW ZEALAND (ICE-NZ - 1m words, lexical)
Available from:
School of Linguistics & Applied Language Studies, Victoria University of Wellington, PO Box 600, Wellington, New Zealand. Order form

SRI LANKA (ICE-SL - written component; lexical and POS-tagged with CLAWS C7 tagset)
Available from the Department of English, the University of Giessen, Germany.
To obtain a copy of the corpus, please email

NIGERIA (ICE-NG - written component). The corpus is available here.

Please note that texts in all ICE corpora are protected by copyright law. They are available for non-profit academic research purposes only.



© 2013 The ICE Project