Averaging text files with bash and awk

A small script that might be useful for analysing data. It takes multiple files (the file mask is passed as $1), each composed of whitespace-separated data columns in the same layout, and produces the average of each data point across the files. Leaving it here for future reference.

Usage: $ average.sh "filename.*" > filename.avg

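# Number of files matching the mask ($1 stays unquoted so it expands here)
sum=$(ls $1 | wc -l)
# Count the columns using the last line of the last matching file
tf=$(ls $1 | tail -n 1)
fld=$(tail -n 1 "$tf" | wc -w)

count=1
while [ $count -lt $(($fld + 1)) ]; do
    # paste puts the files side by side, so column f of file i is field f + i*fld
    paste -d" " $1 | nawk -v s="$sum" \
        -v fld="$fld" -v f="$count" '{
        for (i = 0; i <= s - 1; i++)
        {
            ta = f + i*fld
            tta = tta + $ta
        }
        print tta/s
        tta = 0
    }' >> tmp$count
    count=$(($count + 1))
done
# Join the per-column averages back into rows, then clean up
paste -d" " tmp*
rm tmp*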
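
To make the field arithmetic in the loop above easier to follow, here is a minimal sketch that isolates it. The function name pasted_column_average is hypothetical and not part of the original script. It assumes every file has the same number of columns (fld) and that the rows arriving on stdin were produced by paste, so column f of the i-th file (counting i from 0) sits at field f + i*fld.

```shell
#!/usr/bin/env bash
# pasted_column_average: hypothetical helper for illustration only.
# Arguments: s (number of files), fld (columns per file), f (column to average).
# Reads rows produced by `paste` on stdin, prints one average per row.
pasted_column_average() {
    awk -v s="$1" -v fld="$2" -v f="$3" '{
        total = 0
        for (i = 0; i < s; i++)
            total += $(f + i * fld)   # column f of the (i+1)-th file
        print total / s
    }'
}
```

For two files with two columns each, something like paste -d" " a.1 a.2 | pasted_column_average 2 2 1 would average the first column of both files, row by row.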

Bonus points for anyone who can make the above code simpler.
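
One possible simplification, offered as a sketch rather than a drop-in replacement: a single awk pass can accumulate sums per (row, column) cell and divide by the number of files at the end, which avoids paste and the temporary files. The function name average_files is hypothetical, and the sketch assumes that all files contain numeric, whitespace-separated data with the same rows and columns.

```shell
#!/usr/bin/env bash
# average_files: hypothetical helper, not the original script.
# Averages matching cells across all files given as arguments.
average_files() {
    awk '
        # FNR restarts at 1 for every input file, so this counts the files
        FNR == 1 { nfiles++ }
        {
            if (FNR > rows) rows = FNR
            if (NF > cols[FNR]) cols[FNR] = NF
            for (i = 1; i <= NF; i++)
                sum[FNR, i] += $i
        }
        END {
            for (r = 1; r <= rows; r++) {
                for (c = 1; c <= cols[r]; c++)
                    printf "%s%s", (c > 1 ? " " : ""), sum[r, c] / nfiles
                print ""
            }
        }
    ' "$@"
}
```

Because the function receives its files through "$@", it would be called with an unquoted mask, for example average_files filename.* > filename.avg, unlike the script above, which expects the mask in quotes.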

UPDATE: Thanks Pat for the bugfix. Whups.

2 Responses to “Averaging text files with bash and awk”

  1. Pat Says:

    Does finding a bug get bonus points? You actually want to pass fld to awk as well (say -v fld="$fld") and then do

    ta=f+i*fld

    instead of ta=f. Right now tta is just adding the ‘f’th entry s times and dividing it by s, giving back the ‘f’th entry.

  2. Claus Says:

    Oh my. Thanks for the comment Pat! Good thing I did not publish these results yet :-). That’s what I get for trying to hammer out voodoo recipes without fully understanding them.

    There is another error in the above code. Calling “$ ./average.sh filename.*” lets the shell expand the mask before the script runs, so $1 holds only the first file name and the script averages a single file. You need to put “filename.*” in double quotes in order to prevent it from expanding until it enters the script.

    Fixing now.
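
The quoting issue described in the comment above can be seen with a small experiment. This is only an illustrative sketch: count_args and first_arg are hypothetical helpers, and the three empty files are created in a temporary directory purely for the demonstration.

```shell
#!/usr/bin/env bash
# count_args and first_arg are hypothetical helpers for illustration only.
count_args() { echo "$#"; }
first_arg() { printf '%s\n' "$1"; }

# Throwaway files to match against
demo_dir=$(mktemp -d)
touch "$demo_dir/filename.1" "$demo_dir/filename.2" "$demo_dir/filename.3"

count_args "$demo_dir"/filename.*     # unquoted mask: prints 3
count_args "$demo_dir/filename.*"     # quoted mask: prints 1
first_arg "$demo_dir"/filename.*      # prints the path ending in filename.1
first_arg "$demo_dir/filename.*"      # prints the literal mask, unexpanded
```

With the unquoted mask the shell hands over three separate arguments, so $1 is only the first file. With the quoted mask $1 is the mask itself, which the script then expands when it uses $1 without quotes.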
