Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


scripting - Best way to simulate "group by" from bash?

Suppose you have a file that contains IP addresses, one address per line:

10.0.10.1
10.0.10.1
10.0.10.3
10.0.10.2
10.0.10.1

You need a shell script that counts how many times each IP address appears in the file. For the input above, the expected output is:

10.0.10.1 3
10.0.10.2 1
10.0.10.3 1

One way to do this is:

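# For each distinct address, count the lines that match it exactly.
# sort -u is used because uniq only collapses adjacent duplicates.
sort -u ip_addresses | while read -r ip
do
    printf '%s ' "$ip"
    # -F: fixed string (dots are literal), -x: whole-line match only
    grep -cFx -- "$ip" ip_addresses
done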

However, this is far from efficient: it rescans the whole file once for every address the loop reads.

How would you solve this problem more efficiently using bash?

(One thing to add: I know it can be solved with perl or awk. I'm interested in a better solution in bash, not in those languages.)

ADDITIONAL INFO:

Suppose that the source file is 5GB and the machine running the algorithm has 4GB of memory. So sort is not an efficient solution, and neither is reading the file more than once.

I liked the hashtable-like solution. Can anybody suggest improvements to it?
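
For reference, a hashtable-like approach could look roughly like the sketch below. It is not necessarily the solution referred to above, just a minimal illustration, and it assumes bash 4 or newer (for associative arrays). The function name count_ips is made up for this example. The file is read once, and only the distinct addresses are kept in memory, so it should cope with a file larger than RAM as long as the number of distinct addresses is modest. A bash read loop is slow, though, so expect it to take a while on 5GB.

```bash
#!/usr/bin/env bash
# Sketch only: count_ips is a hypothetical helper name, not from the original post.
# Requires bash 4+ (associative arrays). Reads the file named in $1 exactly once.
count_ips() {
    local -A counts=()   # key: address, value: number of times seen so far
    local ip
    # "|| [[ -n $ip ]]" also handles a last line without a trailing newline.
    while IFS= read -r ip || [[ -n $ip ]]; do
        [[ -n $ip ]] || continue                      # skip blank lines
        counts[$ip]=$(( ${counts[$ip]:-0} + 1 ))      # increment, starting from 0
    done < "$1"
    # The iteration order of an associative array is unspecified.
    for ip in "${!counts[@]}"; do
        printf '%s %d\n' "$ip" "${counts[$ip]}"
    done
}
```

Usage would be count_ips ip_addresses. The output lines come in no particular order. Since the result is small (one line per distinct address), piping it through sort afterwards is cheap.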

ADDITIONAL INFO #2:

Some people asked why I would bother doing this in bash when it is much easier in, e.g., perl. The reason is that perl wasn't available on the machine where I had to do this. It was a custom-built Linux machine without most of the tools I'm used to. And I think it was an interesting problem.

So please, don't blame the question, just ignore it if you don't like it. :-)


1 Reply

sort ip_addresses | uniq -c

This will print the count first, but other than that it should be exactly what you want.
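
If the address has to come first, as in the question, the two columns can be swapped with a plain shell read loop, so no awk or perl is needed. This is only a sketch. The function name group_count is made up for this example, and it assumes the lines contain no leading whitespace of their own.

```bash
#!/usr/bin/env bash
# Sketch only: group_count is a hypothetical helper name.
# Prints "<line> <count>" for every distinct line of the file named in $1.
group_count() {
    # uniq -c prints "<count> <line>" with leading padding; read strips the
    # padding, puts the first field in count and the rest of the line in ip.
    sort -- "$1" | uniq -c | while read -r count ip; do
        printf '%s %s\n' "$ip" "$count"
    done
}
```

Calling group_count ip_addresses on the sample file from the question prints:

10.0.10.1 3
10.0.10.2 1
10.0.10.3 1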

