An Introduction To AWK

The awk language is a small, C-style language designed for processing
regularly formatted text, such as database dumps and system log files.
The AWK utility is a data extraction and reporting tool: its data-driven
scripting language consists of a set of actions to be taken against
textual data, either in files or in data streams, for the purpose of
producing formatted reports. Awk's funny name comes from the surname
initials of its original authors, Alfred V. Aho, Peter J. Weinberger and
Brian W. Kernighan.

Basic structure
AWK is a language for processing files of text. A file is
treated as a sequence of records, and by default each line is a record. Each 
line is broken up into a sequence of fields, so we can think of the first word
in a line as the first field, the second word as the second field, and so on.
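
As a minimal sketch of this record-and-field view, the command below feeds
awk two sample lines (made up here for illustration) and prints, for each
record, its number, its field count, and its first field. NR and NF are
awk's built-in variables for the current record number and the number of
fields in the current record.

```shell
# Two made-up input lines; each line is one record.
result=$(printf 'nail hammer wood\npedal foot\n' |
  awk '{ print NR, NF, $1 }')   # NR = record number, NF = field count
echo "$result"
# 1 3 nail
# 2 2 pedal
```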

An AWK program is a sequence of pattern-action statements. AWK reads the
input a line at a time. Each line is scanned for each pattern in the
program, and for each pattern that matches, the associated action is
executed. Each pattern-action pair has the form:

condition {action}

Here the condition is typically an expression and action is a series of 
commands. The input is split into records, where by default records are 
separated by newline characters so that the input is split into lines. 
The program tests each record against each of the conditions in turn, and
executes the action for each condition that is true. Either the condition
or the action may be omitted. The condition defaults to matching every record.
The default action is to print the record.
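
These two defaults can be sketched with a pair of one-liners. The sample
scores and the shell variable names (data, high, names) are assumptions
made for this illustration only.

```shell
# Sample scores, made up for this illustration.
data='Rogers 87
Barney 12'

# Condition only: the default action prints each matching record.
high=$(printf '%s\n' "$data" | awk '$2 > 50')

# Action only: with no condition, the action runs for every record.
names=$(printf '%s\n' "$data" | awk '{ print $1 }')

echo "$high"    # Rogers 87
echo "$names"   # Rogers, then Barney, one per line
```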

Two special patterns are specified by the keywords "BEGIN" and "END".
They specify actions to be taken before any lines are read and after the
last line is read, respectively. Consider the AWK program below:

BEGIN { print "START" }
      { print         }
END   { print "STOP"  }

This adds one line before and one line after the contents of the input file.
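
BEGIN and END are also handy for setup and summaries. As a rough sketch,
the program below counts its input lines; the counter name n and the
three-line input are assumptions chosen for this example.

```shell
# Count the input lines: BEGIN sets up a counter, the middle rule
# runs once per line, and END reports the total.
result=$(printf 'a\nb\nc\n' |
  awk 'BEGIN { n = 0 }
             { n++ }
       END   { print n, "lines" }')
echo "$result"   # 3 lines
```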

The general form of the awk command is

awk '<pattern> {print <stuff>}' <file>

In this case, stuff is going to be some combination of text, special 
variables that represent each word in the input line, and perhaps a 
mathematical operator or two. As awk processes each line of the input
file, each word on the line is assigned to variables named $1 (the first
word), $2 (the second word), and so on. The variable $0 contains the
entire line.

Let's start with a file that contains these lines:

nail hammer wood
pedal foot car
clown pie circus

Now we'll use the print statement in awk like this:

awk '{print "Hit the",$1,"with your",$2}'
Hit the nail with your hammer
Hit the pedal with your foot
Hit the clown with your pie

We can also put some numeric data in the input file, as in the following:

Rogers 87 100 95
Lambchop 66 89 76
Barney 12 36 27

Then we can perform some calculations like this:

awk '{print "Avg for",$1,"is",($2+$3+$4)/3}'
Avg for Rogers is 94
Avg for Lambchop is 77
Avg for Barney is 25
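
The same arithmetic can accumulate across lines. As a sketch, assuming the
three score lines above arrive on standard input, the program below adds up
each per-line average and prints the overall mean in an END block. The
variable name total and the printf format are choices made here, not part
of the original example.

```shell
result=$(printf 'Rogers 87 100 95\nLambchop 66 89 76\nBarney 12 36 27\n' |
  awk '{ total += ($2+$3+$4)/3 }          # add the average for this line
       END { printf "Class avg is %.2f\n", total/NR }')
echo "$result"   # Class avg is 65.33
```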

Also, if we want to exclude lines from being processed, we can enter
something like this:

awk '/^clown/ {print "See the",$1,"at the",$3}'
See the clown at the circus

Here, we told awk to consider only the input lines that start with clown.
Note that the pattern goes inside the single quotes, in front of the
action, so that the shell passes the whole program to awk as one argument.
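
A pattern can also test a single field rather than the whole line. The
sketch below uses awk's ~ match operator on the three sample lines from
earlier, keeping only the lines whose third word starts with the letter c.

```shell
# The three sample lines from above, fed to awk on standard input.
result=$(printf 'nail hammer wood\npedal foot car\nclown pie circus\n' |
  awk '$3 ~ /^c/ { print $1, $3 }')   # keep lines whose third field starts with c
echo "$result"
# pedal car
# clown circus
```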

About ramyabkrishna

I am Ramya B Krishna doing MCA at GEC Thrissur. I am now in the process of taking a Leap into the world of Linux and hoping to do a few projects as well.

Posted on August 7, 2011, in Uncategorized.
