OpenDNA Projects
February 16th, 2012
Wikipedia & Project Gutenberg Ngram databases
Files: Geek Out, Public, SFU, Uncategorized | Tags: ngram, programming, project gutenberg, wf.sh, wikipedia, word frequency

A follow-up to the ngram.sh post, listing data sources for Wikipedia and Project Gutenberg ngrams. With minor changes, the ngram.sh script can extract keywords from these databases and from many others.
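
As a rough illustration of what pulling keywords out of such a database might look like, here is a minimal sketch. It is not the original ngram.sh. It assumes a hypothetical plain-text frequency list with one `COUNT<TAB>NGRAM` entry per line, and the function name `top_ngrams` is made up for this example. The real Wikipedia and Project Gutenberg files may be laid out differently, so the field handling would need adjusting.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original ngram.sh.
# Assumed input format: one "COUNT<TAB>NGRAM" entry per line.
# Usage: top_ngrams WORD FILE [LIMIT]
# Prints the LIMIT most frequent entries (default 10) whose ngram
# contains WORD as a whole space-separated token.
top_ngrams() {
    word=$1
    file=$2
    limit=${3:-10}
    tab=$(printf '\t')

    awk -F '\t' -v w="$word" '
        {
            # Split the ngram column into tokens and look for an exact match.
            n = split($2, tok, " ")
            for (i = 1; i <= n; i++) {
                if (tok[i] == w) { print; break }
            }
        }' "$file" |
        sort -t "$tab" -k1,1nr |   # highest count first
        head -n "$limit"
}
```

Calling `top_ngrams cat ngrams.tsv 5` would then list the five most frequent ngrams containing the word "cat", where `ngrams.tsv` stands in for whichever frequency list has been downloaded.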