Word frequency statistics for a file
From Vim Tips Wiki
Tip 1531 Created: November 13, 2007 Complexity: basic Author: vale.smth Version: 7.0
To generate a table of the occurrence frequency for every word in a file, just enter these commands.
:%s/\_A\+/\t1\r/g
:sort i
:g/^\c\(.\+\)\n\1$/norm $yiwj@"^Akdd
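In the third command, the ^A before kdd represents a literal CTRL-A character; enter it by pressing CTRL-V then CTRL-A rather than by typing a caret followed by an A. On Windows, you will probably need to press CTRL-Q then CTRL-A instead (when mswin.vim is loaded, CTRL-V is mapped to paste and CTRL-A to select all), and you may first have to issue the command :unmap <C-A>.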
Note that these commands change the buffer, so work on a copy of your text, or be prepared to undo the changes afterwards.
Comments
TO DO
Incorporate the following explanation (briefly expanded) into the tip.
The 1st command puts each word on its own line and appends a tab and "1" to the end of each line.
The 2nd command sorts the lines, ignoring case.
The 3rd command finds every pair of adjacent lines that are the same (ignoring case), adds the number on the 1st line to the number on the 2nd line, then deletes the 1st line.
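For readers who want to check what the three commands compute, the same steps can be sketched outside Vim. The following is a minimal Python sketch, not part of the original tip: the function name word_frequencies is made up for illustration, and the sketch drops empty pieces where the Vim substitution may leave a line with no word (for example, when the file starts with punctuation).

```python
import re


def word_frequencies(text):
    """Return (word, count) pairs, mimicking the three Vim commands."""
    # Step 1: split on runs of non-letters, as \_A\+ does, giving one word
    # per entry with an initial count of 1. Empty pieces are dropped here
    # (an assumption of this sketch, not something the Vim command does).
    rows = [[word, 1] for word in re.split(r"[^A-Za-z]+", text) if word]

    # Step 2: sort ignoring case, like :sort i.
    rows.sort(key=lambda row: row[0].lower())

    # Step 3: when two neighbouring rows hold the same word (ignoring case),
    # add the first row's count to the second and drop the first row.
    merged = []
    for word, count in rows:
        if merged and merged[-1][0].lower() == word.lower():
            count += merged[-1][1]
            merged.pop()
        merged.append([word, count])
    return [(word, count) for word, count in merged]


if __name__ == "__main__":
    # Print one "word<Tab>count" line per distinct word.
    for word, count in word_frequencies("The cat saw the other cat."):
        print(f"{word}\t{count}")
```

As in the Vim version, the spelling that survives for each word is the one on the last line of its group after sorting, because the earlier line of each pair is the one that gets deleted.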
On the command line, ^A can be typed directly (that is, CTRL-V CTRL-A is not needed).
- Doesn't seem to work for me under Windows with gvim 7.1. I need to use CTRL-Q CTRL-A and cannot input ^A directly.
