September 14, 2009

awk binary editing

Gawk is powerful. Learn it when you are young!

To use awk to edit a binary file, you can simply do this:

gawk -F "" 'BEGIN{RS="/nuclear power is good/"}{printf("%s",substr($0,1,5) "\x81" substr($0,7))}' 1.out > aa

-F "" tells gawk to treat every character as a separate field.
RS="/nuclear power is good/" tells gawk to use this string as the record (row) separator. Since the string does not occur in your binary file (if by some weird chance it does, change it to another random string), gawk treats the whole file as one single record.

Now you have the entire binary file as a string in $0, so you can use the ordinary string functions to make substitutions and changes. In the example above, I changed the sixth character (position 6, counting from 1 as substr does) to hex 0x81.

Isn't this wonderful?


  1. Hello,

    I often use AWK for this kind of thing. Your advice is great. In addition, I prefer to use printf/sprintf in the C style, which means the line below:
    {printf("%s%c%s", substr($0,1,5),"\x81", substr($0,7))}
    is more readable (for me) than the equivalent in your example.

  2. This seems to work when the bytes in the input file are <= 0x7F, but it stumbles on "high bytes". In that case (at least for me) gawk repeats the previous byte. I was using CentOS 4 (fully updated) for this test.

  3. If you want to process bytes >= 0x80, use "LC_ALL=C gawk ..."