Averaging text files with bash and awk

A small script that might be useful for analysing data. It takes multiple files (the file mask is passed as $1), each composed of data columns, and produces the average of each data point across the files. Leaving it here for future reference.

Usage: $ average.sh "filename.*" > filename.avg

sum=`ls -l $1 | wc -l`
tf=`ls $1 | tail -n 1`
fld=`tail -n 1 $tf | wc -w`

count=1
while [ $count -lt $(($fld + 1)) ]; do
paste -d" " $1 | nawk -v s="$sum" -v fld="$fld" -v f="$count" '{
tta=0
for(i=0;i<=s-1;i++) { ta=f+i*fld; tta=tta+$ta }
print tta/s }' >> tmp$count
count=$(($count + 1))
done
paste -d" " tmp*
rm tmp*
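To make the column layout concrete, here is what the intermediate paste step produces for two small hypothetical input files (data.1 and data.2 are made-up names): each output line holds the corresponding lines of all files side by side, so field f of file i ends up at position f + i*fld.

```shell
printf '1 2\n3 4\n' > data.1
printf '3 6\n5 8\n' > data.2
paste -d" " data.1 data.2
# prints:
# 1 2 3 6
# 3 4 5 8
```

The awk pass then averages fields f, f+fld, f+2*fld, ... of each pasted line, which are the same column across all files.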

Bonus points for anyone who can make the above code simpler.
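One candidate simplification (a sketch under the same quoted-glob convention for $1, not tested against every edge case): let a single awk pass average all columns at once, which removes the per-column while loop and the temp files entirely.

```shell
# Sketch: average all columns in one awk pass, no temp files.
# Assumes $1 is the quoted file mask, as in the original script.
files=$(ls $1)
s=$(ls $1 | wc -l)
fld=$(tail -n 1 $(ls $1 | tail -n 1) | wc -w)

paste -d" " $files | awk -v s="$s" -v fld="$fld" '{
    for (f = 1; f <= fld; f++) {
        tta = 0
        # column f of file i sits at pasted field f + i*fld
        for (i = 0; i < s; i++) tta += $(f + i*fld)
        printf "%s%s", tta/s, (f < fld ? " " : "\n")
    }
}'
```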

UPDATE: Thanks Pat for the bugfix. Whups.

2 thoughts on "Averaging text files with bash and awk"

  1. Does finding a bug get bonus points? You actually want to pass fld to awk as well (say -v fld="$fld") and then do

    ta=f+i*fld

    instead of ta=f. Right now tta is just adding the 'f'th entry s times and dividing it by s, giving back the 'f'th entry.

  2. Oh my. Thanks for the comment Pat! Good thing I did not publish these results yet :-). That's what I get for hammering together voodoo recipes without fully understanding them.

    There is another error in the above code. Calling "$ ./average.sh filename.*" lets the shell expand the glob before the script runs, so $1 is only the first matching file name and the script averages a single file. You need to quote "filename.*" to prevent it from expanding before it enters the script.

    Fixing now.
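The glob-quoting behaviour described in the second comment can be demonstrated directly with `set --`, which assigns positional parameters the same way a script invocation would (data.1 and data.2 are hypothetical file names):

```shell
touch data.1 data.2
set -- data.*     # unquoted: the shell expands the glob first
echo "$#"         # 2 -- the script would see only $1, i.e. data.1
set -- "data.*"   # quoted: the pattern reaches the script intact
echo "$#"         # 1
echo "$1"         # data.*
```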
