Corpus Resources
Curated links to corpora, concordancers, and tools for corpus-informed teaching and research. Adapted from the Corpus-Aided Platform (CAP) at EdUHK.
Name | Introduction | Availability |
---|---|---|
Online Corpora | |||
BYU Corpora (now English-Corpora.org) | Widely used suite of corpora and teaching resources. | free | |
British National Corpus (BNC) | 100M words of British English (written + spoken). | free (registration) | |
COCA | Large, balanced American English corpus. | free (registration) | |
Compleat Lexical Tutor | Online tools for lexical analysis and pedagogy. | free | |
MICASE (spoken) | Academic spoken English transcripts (U-Michigan). | free | |
MICUSP (upper-level papers) | Upper-level student papers (written academic English). | free | |
IntelliText (Leeds CTS) | Search builder and POS editor interface. | free | |
BACKBONE (multilingual pedagogic spoken corpus) | Pedagogic spoken corpus with video; online search. | free | |
ICNALE (International Corpus Network of Asian Learners of English) | Learner corpus with texts/audio from Asian EFL learners. | free | |
MOECS (Multilingual Opinion Essays by College Students) | Multilingual student essay corpus. | free | |
TECCL (Ten-thousand English Compositions of Chinese Learners) | Large Chinese learner corpus (raw and POS-tagged). | free | |
TED Corpus Search Engine | Search TED talk transcripts with links to videos. | free | |
WebParaNews (EN–JA parallel news) | Japanese–English parallel newspaper corpus. | free | |
SCoRE (Sentence Corpus of Remedial English) | Sentence corpus & pattern browser for lower-proficiency learners. | free | |
Online Corpora by EdUHK LML | |||
LML Corpus Linguistics Projects | Department projects and showcases. | free | |
EAP Corpora | English for Academic Purposes collections. | free | |
Corpus-based Pronunciation Learning | Pronunciation learning with corpus support. | free | |
English–Chinese Parallel Concordancer | Parallel concordancer for EN–ZH. | free | |
Concordance & Collocation Tools | |||
Word and Phrase | Collocations and full-text analysis (COCA-based). | free (registration) | |
Just The Word | BNC-based collocation finder. | free | |
Linggle | Web-scale collocation/phrase search. | free | |
Corpus Concordance English (Lextutor) | Simple online concordancer with sub-corpus options. | free | |
WebCorp Live | Build ad-hoc corpora from the web. | free | |
SkELL (Sketch Engine for Language Learning) | Simple concordances and word sketches; learner-friendly. | free | |
Skylight (browser concordancer) | In-browser concordancer for quick classroom materials. | free | |
Netspeak | Collocation/phrase suggestion engine from web data. | free | |
Corpus Tools | |||
AntConc | Desktop toolkit for concordancing and text analysis. | free | |
WordSmith Tools | Windows suite for word pattern analysis. | paid (trial) | |
VersaText | Explore the language of a single text. | free | |
Sketch Engine | 400+ ready-to-use corpora in 90+ languages. | paid (free trial) | |
Verbal Stratagems | Communicative function phrase lists (agreeing, clarifying, etc.). | free | |
Cambridge English Vocabulary Profile | CEFR-level vocabulary profiles (A1–C2). | free (registration) | |
LancsBox | Toolkit with collocation network visualiser (GraphColl). | free (registration optional) | |
NoSketch Engine | Open-source variant of Sketch Engine for your own corpora. | free (open source) | |
TextSTAT | Lightweight concordancer/keyword counter; great for DIY corpora. | free | |
AntCorGen | Compile a research-article corpus from PLOS ONE. | free | |
ICEweb | DIY corpus-building interface for the ICE project. | free | |
Corpus-informed / Derived Resources | |||
English Profile | CEFR-coded sense inventory & resources (broader than EVP). | free (registration) | |
FLAX | Corpus-derived OERs, games, and apps for learning. | free | |
New General Service List (NGSL) | High-frequency word lists (incl. Academic lists) with resources. | free | |
WriteAway | Corpus-grounded phrase/chunk suggestions for writing. | free | |
Learner Dictionaries & Bilingual Resources | |||
Cambridge Learner’s Dictionary | CEFR-tagged senses, abundant examples. | free | |
Oxford Learner’s Dictionaries | Clear interface, rich learner examples. | free | |
Longman Dictionary of Contemporary English | Collocations & example-rich learner dictionary. | free | |
Merriam-Webster Learner’s Dictionary | US-oriented learner dictionary. | free | |
Linguee | Parallel examples from real-world translations. | free | |
PHaVE Dictionary (Phrasal Verbs) | Frequent phrasal verbs with sense distinctions & examples. | free | |
General Guides / Getting Started | |||
Intro to Text Analysis (PSU Libraries) | Orientation to text analysis concepts & tools. | free | |
Text Mining – Web-based Resources (PSU Libraries) | Curated links to web tools & readings for text mining. | free |

Notes are brief summaries; please see each site for full details. Last updated: 2025-10-03.