Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

linux - calculate and print the average value of strings in a column

I have a .txt file with 2 columns of values. They are 2D coordinates, so the first column represents the x value and the second the z value. Unfortunately, some lines share the same x value but have different z values. I'd like to average those z values so that each x is associated with a single z. A sample of what I have:

 435.212 108.894
 435.212 108.897
 435.212 108.9
 435.212 108.903

As you can see, the x value 435.212 is associated with 4 different z values. What I'd like to have is:

435.212 108.8985

where 108.8985 is the result of (108.894+108.897+108.9+108.903)/4. Of course I don't want to modify the other x and z values, so the result would be something like this:

BEFORE:

 435.238 108.9
 435.25 108.9
 435.262 108.9
 435.275 108.9
 435.212 108.894 <---
 435.212 108.897 <---
 435.212 108.9 <---
 435.212 108.903 <---

AFTER:

 435.238 108.9
 435.25 108.9
 435.262 108.9
 435.275 108.9
 435.212 108.8985 <---average

The number of z values associated with a single x may vary.

I am using the Linux command line and I thought of using awk for the job, although any other program or utility available on a Linux command line would be fine.

Battle with dragons too long, and you become a dragon yourself; gaze too long into the abyss, and the abyss gazes back…

1 Reply

This is one way with awk:

$ awk '{a[$1]+=$2; ++b[$1]} END {for (i in a) print i, a[i]/b[i]}' file
435.212 108.899
435.25 108.9
435.238 108.9
435.262 108.9
435.275 108.9

Explanation

{a[$1]+=$2; ++b[$1]}

  • Add each z value (2nd column) to a running sum in the array a, keyed by the x value (1st column).
  • Count the number of rows seen for each x value in the array b.
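
To see how the two arrays evolve, the same pattern block can print its running state after every row. This is only an illustrative trace on the four duplicate rows from the question; the extra print statement is not part of the original answer.

```shell
# Feed the four rows that share x = 435.212 and show the running
# sum (array a) and row count (array b) after each input line.
printf '%s\n' '435.212 108.894' '435.212 108.897' '435.212 108.9' '435.212 108.903' |
  awk '{a[$1]+=$2; ++b[$1]; print "sum=" a[$1], "count=" b[$1]}'
# sum=108.894 count=1
# sum=217.791 count=2
# sum=326.691 count=3
# sum=435.594 count=4
```

Dividing the final sum by the final count, 435.594/4, gives the 108.8985 the question asks for.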

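END {for (i in a) print i, a[i]/b[i]}

  • After the last line has been read, loop over the x values stored as keys of a and print each one with its average, a[i]/b[i]. The iteration order of for (i in a) is unspecified, so the lines may come out in a different order than the input.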
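
If the output should follow the order in which each x first appears, as in the question's AFTER listing, one option is to record the keys in a second, numerically indexed array. A minimal sketch, assuming whitespace-separated input; the function name avg_keep_order and the array names sum, cnt and order are made up for this example and are not part of the original answer.

```shell
# avg_keep_order is a hypothetical helper name chosen for this sketch.
avg_keep_order() {
  awk '
    NF < 2 { next }                    # skip blank or incomplete lines
    !($1 in sum) { order[++n] = $1 }   # remember each x the first time it is seen
    { sum[$1] += $2; cnt[$1]++ }       # accumulate z and count rows per x
    END {
      for (k = 1; k <= n; k++) {       # walk the keys in first-seen order
        x = order[k]
        print x, sum[x] / cnt[x]
      }
    }
  ' "$@"
}

# Usage on the data from the question (arrows removed).
printf '%s\n' '435.238 108.9' '435.25 108.9' '435.262 108.9' '435.275 108.9' \
  '435.212 108.894' '435.212 108.897' '435.212 108.9' '435.212 108.903' |
  avg_keep_order
```

With this input the 435.212 line is printed last, because that x value first appears on the fifth row.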

To use a different number format (4 decimal places, for example), replace print with printf:

printf "%s %.4f\n", i, a[i]/b[i]

Use %s rather than %d for the x value; %d would truncate 435.212 to 435.

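Put together as a full command, a sketch on the four duplicate rows from the question might look like this. The file name sample.txt is an assumption for the example.

```shell
# Write the four rows that share x = 435.212 to a scratch file.
printf '%s\n' '435.212 108.894' '435.212 108.897' '435.212 108.9' '435.212 108.903' > sample.txt

# %s prints the x key exactly as it appeared in the input;
# %.4f rounds the average to four decimal places.
awk '{a[$1]+=$2; ++b[$1]} END {for (i in a) printf "%s %.4f\n", i, a[i]/b[i]}' sample.txt
# 435.212 108.8985
```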