Averaging text files with bash and awk

A small script that might be useful for analysing data. It takes multiple files (the file mask is passed as $1), each composed of data columns, and produces the average of each data point across the files. Leaving it here for future reference.

Usage: $ average.sh "filename.*" > filename.avg

sum=`ls -l $1 | wc -l`
tf=`ls $1 | tail -n 1`
fld=`tail -n 1 $tf | wc -w`

count=1
while [ $count -lt $(($fld + 1)) ]; do
paste -d" " $1 | nawk -v s="$sum" -v fld="$fld" -v f="$count" '{
tta=0
for(i=0;i<=s-1;i++) { ta=f+i*fld; tta=tta+$ta }
print tta/s }' >> tmp$count
count=$(($count + 1))
done
paste -d" " tmp*
rm tmp*
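To make the column layout concrete, here is what the intermediate paste step produces for two small hypothetical input files (data.1 and data.2 are made-up names): each output line holds the corresponding lines of all files side by side, so field f of file i ends up at position f + i*fld.

```shell
printf '1 2\n3 4\n' > data.1
printf '3 6\n5 8\n' > data.2
paste -d" " data.1 data.2
# prints:
# 1 2 3 6
# 3 4 5 8
```

The awk pass then averages fields f, f+fld, f+2*fld, ... of each pasted line, which are the same column across all files.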

Bonus points for anyone who can make the above code simpler.
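One candidate simplification (a sketch under the same quoted-glob convention for $1, not tested against every edge case): let a single awk pass average all columns at once, which removes the per-column while loop and the temp files entirely.

```shell
# Sketch: average all columns in one awk pass, no temp files.
# Assumes $1 is the quoted file mask, as in the original script.
files=$(ls $1)
s=$(ls $1 | wc -l)
fld=$(tail -n 1 $(ls $1 | tail -n 1) | wc -w)

paste -d" " $files | awk -v s="$s" -v fld="$fld" '{
    for (f = 1; f <= fld; f++) {
        tta = 0
        # column f of file i sits at pasted field f + i*fld
        for (i = 0; i < s; i++) tta += $(f + i*fld)
        printf "%s%s", tta/s, (f < fld ? " " : "\n")
    }
}'
```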

UPDATE: Thanks Pat for the bugfix. Whups.

2 thoughts on "Averaging text files with bash and awk"

  1. Does finding a bug get bonus points? You actually want to pass fld to awk as well (say -v fld="$fld") and then do

    ta=f+i*fld

    instead of ta=f. Right now tta is just adding the 'f'th entry s times and dividing it by s, giving back the 'f'th entry.

  2. Oh my. Thanks for the comment Pat! Good thing I did not publish these results yet :-). That's what I get for hammering together voodoo recipes without fully understanding them.

    There is another error in the above code. Calling "$ ./average.sh filename.*" lets the shell expand the glob before the script runs, so $1 is only the first matching file name and the script averages a single file. You need to quote "filename.*" to prevent it from expanding before it enters the script.

    Fixing now.
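The glob-quoting behaviour described in the second comment can be demonstrated directly with `set --`, which assigns positional parameters the same way a script invocation would (data.1 and data.2 are hypothetical file names):

```shell
touch data.1 data.2
set -- data.*     # unquoted: the shell expands the glob first
echo "$#"         # 2 -- the script would see only $1, i.e. data.1
set -- "data.*"   # quoted: the pattern reaches the script intact
echo "$#"         # 1
echo "$1"         # data.*
```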
