I'd like to be able to work with a file in awk where records are separated by a blank line and each field consists of a name followed by a colon, some optional whitespace to be ignored/discarded, followed by a value. E.g.
Name: Smith, John
Age: 42

Name: Jones, Mary
Age: 38

Name: Mills, Pat
Age: 62
I understand that I can use RS="" to have awk understand the blank lines as record separators and FS="\n" to split the fields properly. However, I'd like to then create an array of name→value pairs that I can use for further processing of the form
if (a["Age"] > 40) {print a["Name"]}
The order is usually consistent, but since it would be dumped in an associative array, the incoming order shouldn't matter or be assumed consistent.
How can I transform the data into an awk associative array with the least fuss?
Method 1
We use split to split each field into two parts: the key and the value. From these, we build the associative array a.
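A minimal sketch of this approach, not necessarily the exact command originally posted. The temporary sample file and the names kv, i and result are assumptions made for illustration. Each line of a record is one field (RS="" and FS="\n"), and split breaks it at the colon plus any whitespace that follows:

```shell
# Hypothetical sample file holding the records from the question.
data=$(mktemp)
cat > "$data" <<'EOF'
Name: Smith, John
Age: 42

Name: Jones, Mary
Age: 38

Name: Mills, Pat
Age: 62
EOF

result=$(awk '
BEGIN { RS = ""; FS = "\n" }      # blank-line records, one line per field
{
    for (i = 1; i <= NF; i++) {
        # kv[1] is the name, kv[2] the value; a value that itself
        # contains a colon would be cut short by this split.
        split($i, kv, /:[ \t]*/)
        a[kv[1]] = kv[2]
    }
    # Adding 0 forces a numeric rather than a string comparison.
    if (a["Age"] + 0 > 40) print a["Name"]
}' "$data")

printf '%s\n' "$result"
rm -f "$data"
```

With the sample data this prints "Smith, John" and "Mills, Pat". Because the keys come from the data itself, the order of the lines within a record does not matter.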
Method 2
Here, we split fields at either a colon or a newline. Then, we know that the odd-numbered fields are keys and the even-numbered ones are the values:
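A sketch of the field-separator approach under the same assumptions as before (hypothetical temporary sample file, illustrative variable names). The regular expression used for FS here also swallows the optional whitespace after the colon, which is one possible choice rather than the only one:

```shell
# Hypothetical sample file holding the records from the question.
data=$(mktemp)
cat > "$data" <<'EOF'
Name: Smith, John
Age: 42

Name: Jones, Mary
Age: 38

Name: Mills, Pat
Age: 62
EOF

result=$(awk '
# Fields are separated by a colon plus optional blanks, or by a newline.
BEGIN { RS = ""; FS = ":[ \t]*|\n" }
{
    # Fields alternate: name, value, name, value, ...
    for (i = 1; i < NF; i += 2)
        a[$i] = $(i + 1)
    # Adding 0 forces a numeric rather than a string comparison.
    if (a["Age"] + 0 > 40) print a["Name"]
}' "$data")

printf '%s\n' "$result"
rm -f "$data"
```

This is shorter than the split version, but it relies on strict alternation: a value that contains a colon would shift every later key and value in that record.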
Improvement
Is there a chance that any record will be missing a value? If so, we should clear the array a between records; otherwise a value left over from the previous record would silently be reused. In GNU awk, this is easy: we just add a delete statement. For other awks, you may be required to delete the array one element at a time.
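The sketch below illustrates the clearing step with the element-by-element loop, which should work in any awk. The sample file is hypothetical, and its second record deliberately lacks an Age line. In GNU awk the loop can be replaced by the single statement `delete a`, and `split("", a)` is another widely used portable way to empty an array:

```shell
# Hypothetical sample file; the second record has no Age field.
data=$(mktemp)
cat > "$data" <<'EOF'
Name: Smith, John
Age: 42

Name: Jones, Mary

Name: Mills, Pat
Age: 62
EOF

result=$(awk '
BEGIN { RS = ""; FS = "\n" }
{
    # Clear the array before filling it for this record.
    # In GNU awk, "delete a" does the same in one statement.
    for (k in a) delete a[k]

    for (i = 1; i <= NF; i++) {
        split($i, kv, /:[ \t]*/)
        a[kv[1]] = kv[2]
    }
    if (a["Age"] + 0 > 40) print a["Name"]
}' "$data")

printf '%s\n' "$result"
rm -f "$data"
```

With the clearing loop in place only "Smith, John" and "Mills, Pat" are printed. Without it, "Jones, Mary" would inherit the age 42 from the previous record and be printed as well.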