Classic Shell Scripting: the history of wf.sh

A short bibliography of the famous wf.sh word-frequency puzzle from 1986.

wf.sh is frequently used in introductory CompSci courses to get students to rethink how they imagine data. The classic puzzle is expressed as:

“Given a text file and an integer K, you are to print the K most common words in the file (and the number of their occurrences) in decreasing frequency” (Bentley & Knuth, May 1986).

McIlroy’s solution (June 1986) forms the core of my Catastrophic Frequency script, which can process an arbitrarily large corpus at a rate of approximately 100 megabytes per minute (on a single 2 GHz processor). Below the fold: the sources.
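
For reference, McIlroy's pipeline is usually quoted in roughly the following form. This is a minimal sketch, not the Catastrophic Frequency script itself: the function name wf and the default of K=10 are assumptions added for illustration, and "word" here means a run of ASCII letters only.

```shell
# wf: print the K most common words on standard input, most frequent first.
# Sketch of the widely quoted 1986 pipeline; the name wf and the default
# K=10 are assumptions, not part of the original.
wf() {
    k="${1:-10}"
    tr -cs 'A-Za-z' '\n' |   # turn every run of non-letters into one newline
    tr 'A-Z' 'a-z' |         # fold everything to lower case
    sort |                   # bring identical words together
    uniq -c |                # collapse each run into "count word"
    sort -rn |               # order by count, descending
    sed "${k}q"              # quit after the first K lines
}
```

A call such as `wf 10 < corpus.txt` (corpus.txt being a hypothetical input file) prints ten lines of count and word. Because every stage streams and sort can spill to disk, the pipeline does not need to hold the whole corpus in memory. One known quirk: input that starts with a non-letter produces an empty first "word".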
