[LeetCode]Word Frequency

Problem description:

Write a bash script to calculate the frequency of each word in a text file words.txt.

For simplicity's sake, you may assume:

words.txt contains only lowercase characters and space ' ' characters.
Each word must consist of lowercase characters only.
Words are separated by one or more whitespace characters.

For example, assume that words.txt has the following content:

the day is sunny the the
the sunny is is

Your script should output the following, sorted by descending frequency:

the 4
is 3
sunny 2
day 1

Note:

Don't worry about handling ties; it is guaranteed that each word's frequency count is unique.

Problem summary:

Count how often each word occurs in words.txt and print one "word count" pair per line, in descending order of count. The file contains only lowercase letters and spaces, words are separated by one or more spaces, and the test cases guarantee that no two words share the same count, so ties need no handling.

Bash script:

See: https://leetcode.com/discuss/29049/my-simple-solution-one-line-with-pipe

# Read from the file words.txt and output the word frequency list to stdout.
cat words.txt | tr -s ' ' '\n' | sort | uniq -c | sort -rn | awk '{print $2" "$1}'

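tr -s ' ' '\n': translates every space into a newline, then squeezes each run of consecutive newlines into a single one, so that one or more spaces between words leave exactly one word per line.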
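The effect of -s is easiest to see on a line with several spaces in a row. The snippet below is a small sketch; the sample string and the variable names are made up for illustration and are not part of the original solution.

```shell
# Made-up sample: three spaces between "the" and "day".
sample='the   day is'

# Each space becomes a newline; -s then squeezes the run of
# newlines into one, so no empty lines are left between words.
split=$(printf '%s\n' "$sample" | tr -s ' ' '\n')

printf '%s\n' "$split"
# prints:
# the
# day
# is
```

Without -s, the same input would produce two empty lines between "the" and "day", and those empty lines would later be counted as if they were a word.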

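sort: sorts the words in ascending order, which places identical words on adjacent lines.

uniq -c: uniq collapses adjacent duplicate lines into one; the -c option prefixes each remaining line with the number of times it occurred. Because uniq only compares neighbouring lines, the input has to be sorted first.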
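The reason sort has to come before uniq -c can be checked with a tiny experiment. This is only a sketch: the three-word input and the variable names are assumptions chosen for the demonstration.

```shell
# Made-up input in which the two occurrences of "the" are not adjacent.
words=$(printf 'the\nis\nthe')

# Without sort, uniq sees no adjacent duplicates and keeps all 3 lines.
unsorted_lines=$(printf '%s\n' "$words" | uniq -c | awk 'END { print NR }')

# With sort, the two "the" lines become adjacent and are merged: 2 lines.
sorted_lines=$(printf '%s\n' "$words" | sort | uniq -c | awk 'END { print NR }')

printf 'without sort: %s lines, with sort: %s lines\n' "$unsorted_lines" "$sorted_lines"
# prints: without sort: 3 lines, with sort: 2 lines
```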

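sort -rn: -r sorts in reverse (descending) order, and -n compares by numeric value rather than as text (thanks to reader 长弓1990 for the correction).

awk '{print $2" "$1}': formats the output. awk splits each line into whitespace-separated fields, where $i is the i-th field; after uniq -c the count is $1 and the word is $2, so the two are swapped to print the word first. Writing awk '{ print $2, $1 }' gives the same result.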
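To try the whole pipeline on the example from the problem statement, it can be wrapped in a small function and run against a temporary file. This is a sketch under a few assumptions: word_freq is a hypothetical helper name, it reads the file given as its first argument instead of the fixed words.txt, and it feeds the file to tr by redirection rather than through cat.

```shell
# Hypothetical helper: the same stages as the one-liner, reading file "$1".
word_freq() {
    tr -s ' ' '\n' < "$1" | sort | uniq -c | sort -rn | awk '{print $2" "$1}'
}

# Recreate the sample input from the problem statement in a temp file.
tmp=$(mktemp)
printf 'the day is sunny the the\nthe sunny is is\n' > "$tmp"

result=$(word_freq "$tmp")
rm -f "$tmp"

printf '%s\n' "$result"
# prints:
# the 4
# is 3
# sunny 2
# day 1
```

One caveat under inputs outside the stated constraints: if the file begins with a space, tr emits an empty first line, which would then be counted like a word.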

Permalink: http://bookshadow.com/weblog/2015/03/24/leetcode-word-frequency/
Please respect the author's work and credit the source when reposting. 书影博客 reserves all rights to this article.
