GAWK(1)                                               Utility Commands                                               GAWK(1)

NAME
       gawk - pattern scanning and processing language

SYNOPSIS
       gawk [ POSIX or GNU style options ] -f program-file [ -- ] file ...
       gawk [ POSIX or GNU style options ] [ -- ] program-text file ...

OPTIONS
       -f program-file
       --file program-file
              Read the AWK program source from the file program-file, instead of from the first command line argument.
              Multiple -f (or --file) options may be used.  Files read with -f are treated as if they begin with an
              implicit @namespace "awk" statement.

       -F fs
       --field-separator fs
              Use fs for the input field separator (the value of the FS predefined variable).

       -v var=val
       --assign var=val
              Assign the value val to the variable var, before execution of the program begins.  Such variable  values  are    
              available to the BEGIN rule of an AWK program.
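
As a minimal sketch of the -F and -v options described above: the sample data, the variable name min, and the threshold value are illustrative assumptions, not part of the manual. The point is that a -v assignment is already visible in the BEGIN rule.

```shell
# Hypothetical sample data: colon-separated name:score pairs.
data='alice:90
bob:72'

# -F: sets FS to ":"; -v assigns min before BEGIN runs, so BEGIN can read it.
result=$(printf '%s\n' "$data" |
  awk -F: -v min=80 'BEGIN { print "threshold=" min } $2 >= min { print $1 }')
printf '%s\n' "$result"
```

Only alice passes the numeric comparison, so the output is the threshold line followed by her name.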

   Fields
       The variable NF is set to the total number of fields in the input record.
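
A small illustration of NF, assuming a hypothetical one-line input split on the default whitespace separator. Because NF holds the field count, $NF refers to the last field.

```shell
# Hypothetical input line with four whitespace-separated fields.
line='one two three four'

# Print the field count and the last field.
summary=$(printf '%s\n' "$line" | awk '{ print NF, $NF }')
printf '%s\n' "$summary"
```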

   Built-in Variables

       FILENAME    The name of the current input file.  If no files are specified on the command line, the value of
                   FILENAME is “-”.  However, FILENAME is undefined inside the BEGIN rule (unless set by getline).

       FS          The input field separator, a space by default.  See Fields, above.

       NF          The number of fields in the current input record.

       NR          The total number of input records seen so far.

       OFMT        The output format for numbers, "%.6g", by default.

       OFS         The output field separator, a space by default.

       ORS         The output record separator, by default a newline.

       RS          The input record separator, by default a newline.

       RT          The  record terminator.  Gawk sets RT to the input text that matched the character or regular expression
                   specified by RS.
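
The following sketch shows how a few of these variables interact. The input and the chosen separator values are assumptions made for illustration. It uses only NR, NF, OFS, ORS and OFMT, which should behave the same in any POSIX awk; RT is gawk-specific and is not exercised here.

```shell
# Hypothetical input: two whitespace-separated records.
input='a b c
d e'

# OFS goes between the comma-separated arguments of print; ORS ends each print.
report=$(printf '%s\n' "$input" |
  awk 'BEGIN { OFS = "|"; ORS = ";\n" } { print NR, NF, $1 }')
printf '%s\n' "$report"

# OFMT controls how print renders non-integer numbers.
rounded=$(awk 'BEGIN { OFMT = "%.2f"; print 3.14159 }')
printf '%s\n' "$rounded"
```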

   Octal and Hexadecimal Constants
       You may use C-style octal and hexadecimal constants in your AWK program source code.  For example, the  octal  value
       011 is equal to decimal 9, and the hexadecimal value 0x11 is equal to decimal 17.
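
A quick check of the two constants mentioned above. This sketch assumes GNU awk is installed as gawk and running in its default mode; other awk implementations may not treat 011 or 0x11 in program text as octal or hexadecimal, hence the guard.

```shell
# gawk-specific: octal and hexadecimal constants in program source.
if command -v gawk >/dev/null 2>&1; then
  gawk 'BEGIN { print 011, 0x11 }'
else
  echo 'gawk not found; this example needs GNU awk' >&2
fi
```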

   String Constants
       String  constants in AWK are sequences of characters enclosed between double quotes (like "value").  Within strings,
       certain escape sequences are recognized, as in C.  These are:

       \\   A literal backslash.

       \a   The “alert” character; usually the ASCII BEL character.

       \b   Backspace.

       \f   Form-feed.

       \n   Newline.

       \r   Carriage return.

       \t   Horizontal tab.

       \v   Vertical tab.

       \xhex digits
            The character represented by the string of hexadecimal digits following the \x.  Up to two following
            hexadecimal digits are considered part of the escape sequence.  E.g., "\x1B" is the ASCII ESC (escape) character.

       \ddd The  character  represented  by the 1-, 2-, or 3-digit sequence of octal digits.  E.g., "\033" is the ASCII ESC
            (escape) character.

       \c   The literal character c.

       In compatibility mode, the characters represented by octal and hexadecimal escape sequences  are  treated  literally
       when used in regular expression constants.  Thus, /a\52b/ is equivalent to /a\*b/.
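
A short sketch of several of the escape sequences listed above inside an AWK string constant. The string itself is an arbitrary example; only the octal, tab and backslash escapes are used.

```shell
# \101 is octal for "A", \t is a tab, and \\ yields a single backslash.
escaped=$(awk 'BEGIN { print "\101\tB\\C" }')
printf '%s\n' "$escaped"
```

The program is wrapped in single quotes so that the shell passes the backslashes through untouched and awk is the one interpreting them.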

AWK PROGRAM EXECUTION
       An AWK program consists of a sequence of optional directives, pattern-action statements, and optional function
       definitions.

              @include "filename"
              @load "filename"
              @namespace "name"
              pattern   { action statements }
              function name(parameter list) { statements }
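
To make the layout above concrete, here is a minimal sketch combining a function definition with BEGIN, pattern-action and END rules. The input data and the function name label are hypothetical. The @include, @load and @namespace directives are gawk extensions and are left out so the program runs in any POSIX awk.

```shell
# Hypothetical input: item name and quantity per line.
orders='apple 3
pear 0
plum 5'

totals=$(printf '%s\n' "$orders" | awk '
  # Function definition at the top level of the program.
  function label(n) { return (n > 3) ? "bulk" : "small" }

  BEGIN  { count = 0 }                       # runs before any input is read
  $2 > 0 { count++; print $1, label($2) }    # pattern { action }
  END    { print "orders:", count }          # runs after all input is read
')
printf '%s\n' "$totals"
```

The pear line is skipped because its quantity fails the $2 > 0 pattern, so only two orders are counted.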


awk '[pattern] { action }' file ...
# field separator: awk -F fs

awk -F, '{print $1,$2}' inputfile
awk 'BEGIN{FS=","} {print $1,$2}' inputfile
awk -F '[ ,]'  '{print $1,$2,$5}' inputfile
# variable: awk -v var=value

awk -v a=1 -v b=s '{print $1, $1+a, $1 b}' inputfile
# script file: awk -f scriptfile

awk -f scriptfile inputfile
# condition
awk '$1>2' log.txt
awk '$1==2 {print $1,$3}' log.txt
awk '$1>2 && $2=="Are" {print $1,$2,$3}' log.txt
# built-in variables

awk -F\' 'BEGIN{printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n", "FILENAME","ARGC","FNR","FS","NF","NR","OFS","ORS","RS";
                printf "---------------------------------------------\n"}
                 {printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n", FILENAME,ARGC,FNR,FS,NF,NR,OFS,ORS,RS}' log.txt

awk '{print NR,FNR,$1,$2,$3}' log.txt

awk '{print $1,$2,$5}' OFS=" $ "  log.txt
# regular expression

awk '$2 ~ /th/ {print $2,$4}' log.txt

awk '$2 !~ /th/ {print $2,$4}' log.txt
awk '!/th/ {print $2,$4}' log.txt
# ignore case (IGNORECASE is a gawk extension)

awk 'BEGIN{IGNORECASE=1} /this/' log.txt
# awk script

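# cat script.awk

BEGIN { }   # runs once, before any input is read
{ }         # runs for every input record
END { }     # runs once, after all input has been read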
# file size
ls -l *.txt | awk '{sum+=$5} END {print sum}'
# filter line length
awk 'length>80' log.txt
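# 9x9 multiplication table: sed 'H;g' emits growing blocks (1; 1 2; 1 2 3; ...) separated by
# blank lines, and RS='' makes awk read each block as one record whose fields are its lines
seq 9 | sed 'H;g' | awk -v RS='' '{for(i=1;i<=NF;i++)printf("%dx%d=%d%s", i, NR, i*NR, i==NR?"\n":"\t")}'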
