A follow-up to the ngram.sh post with data sources for Wikipedia and Project Gutenberg ngrams. The ngram.sh script can easily be modified to extract keywords from these databases, and many others.
The Google Ngram Viewer is a database browser used to chart the relative frequency of words or phrases. The data source is the Google Books database and the graphic engine is Google Charts. It’s cool. It’s pretty. It doesn’t easily give up the raw data. This script helps.
Drawing inspiration from General Inquirer (1966) and KWIC, this post proposes an iterative hybrid of available methods in a quest for a more flexible and robust machine-assisted content analysis system.
A short bibliography of the famous wf.sh from 1986:
“Given a text file and an integer K, you are to print the K most common words in the file (and the number of their occurrences) in decreasing frequency.”
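The task quoted above fits in a single Unix pipeline. What follows is a minimal sketch in that spirit, not necessarily the wf.sh the post refers to. The function name `top_words` is a hypothetical one chosen for illustration, and the sketch assumes a "word" is a run of ASCII letters, compared case-insensitively, with ties broken alphabetically.

```shell
#!/bin/sh
# top_words K: read text on stdin and print the K most common words,
# one per line as "count word", in decreasing order of frequency.
# Assumption: a word is a run of ASCII letters; case is ignored.
top_words() {
    k="$1"
    tr -cs 'A-Za-z' '\n' |             # put every run of letters on its own line
        tr 'A-Z' 'a-z' |               # fold to lowercase
        sed '/^$/d' |                  # drop the empty line a leading non-letter leaves
        LC_ALL=C sort |                # group identical words together
        uniq -c |                      # count each group
        awk '{print $1, $2}' |         # strip the padding uniq -c adds
        LC_ALL=C sort -k1,1nr -k2,2 |  # highest count first, then alphabetical
        awk -v k="$k" 'NR <= k'        # keep only the first K lines
}
```

For example, `printf 'The cat and the dog and the bird.\n' | top_words 2` prints `3 the` followed by `2 and`. To run it on a file, redirect the file into the function: `top_words 10 < some-file.txt`.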
A longitudinal study of keyword frequencies in The New York Times between 2001 and 2008 supported the hypothesized typologies of catastrophic myths. Patterns of occurrence are consistent between natural and man-made disasters.
Cooper, Mendel (April 29, 2007) Advanced Bash-Scripting Guide: An in-depth exploration of the art of shell scripting, Version 4.3. Retrieved from jamsb.austms.org.au. This tutorial assumes no previous knowledge of scripting or programming, but progresses rapidly toward an intermediate/advanced level of instruction . . . all the while sneaking in little […]
This site has useful documentation on the use of chmod and permissions on Unix servers. http://www.web-hosting.com/support/unix/cgi/chmod.html
I just tried to install MoinMoin wiki. No dice. I’ll try WikiRootry and may come back to MoinMoin if there’s no luck there either. The Installation Directions for MoinMoin just SUCK. Not for the rookie, IMHO. Useful commands I’ve dug up in the process: gunzip = for .gz files (1) […]
It looks like I’ve got Blosxom installed on sdf.lonestar.org (AKA freeshell.net). Blosxom is set up, running, and reading the style sheets. The layout’s been converted, though it’ll probably be redesigned later. I’ve begun the conversion of posts from Blogger to Blosxom and old posts will start appearing here as that conversion progresses. […]
Some people develop an emotional attachment to technology. For some guys it’s their trucks; when I was 12 it was the BBS. I was looking around for some old software from the pre-Netscape era when I pulled up a May 2002 Salon.com article titled “When 300 baud was the bomb.” Yeah. That’s what […]