I need this awk command to replace ss:Width="252" in the first XML tag in the text with ss:Width="140" and leave the rest of the tags alone:
cat <<- EOF > text
<ss:Column ss:AutoFitWidth="1" ss:Width="252"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="252"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="189"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="189"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="252"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="252"/>
EOF
awk '{c=++count[$0]} c==1 {sub(/ss:Width=\"[0-9]{1,4}\"/,"ss:Width=\"140\"")} {print}' text > newf
cat newf
Instead, it replaces the expression in the first instance of each of the three unique lines (three replacements in total, whereas I want only one):
<ss:Column ss:AutoFitWidth="1" ss:Width="140"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="140"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="252"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="140"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="189"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="252"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="252"/>
Why does it behave this way? How is the incrementer behaving in my awk command? I expected it to increment after the first qualifying match of /ss:Width=\".*\"/ but it seems like it's not incrementing until all unique matches are found, then ignoring subsequent non-unique matches only. Is that right? I tried to force the counter to increment at the end of the c == 1 block like this:
awk '{c=++count[$0]} c==1 {sub(/ss:Width=\".*\"/,"ss:Width=\"140\"");c++} {print}' text > newf
But I get the same output. I didn't have any luck trying this task in sed & I'd rather do it in awk anyway. I'm specifically interested in understanding this awk syntax.
Edit: I tested this theory by changing one of the width attributes to another random number. It does also replace that one with 140. So, it is limiting to the first instance of all matching expressions, not the first matching expression itself.
Edit: As Cody pointed out my regex is greedy. I changed .* to be [0-9]{1,4} however the behavior is the same - it still replaces only the first instance of every unique match. I also changed one of the XML tags' width attributes to a 3rd unique number and updated the output to illustrate the behavior I'm trying to fix.
This is AIX/ksh.
You might be able to shorten that a bit.
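One way to do that is sketched below: a single flag, rather than a per-line counter, records whether the replacement has already happened. This is a minimal sketch under assumptions, not a definitive solution: the file names sample.xml and sample.out, the flag name done, and the sample input (which includes lines without ss:Width and an attribute that is not at the end of its line) are made up for illustration. It uses [0-9]+ rather than an interval expression because not every awk supports {1,4}.

```shell
# Hypothetical sample input: some lines have no ss:Width at all,
# and the first width attribute is not at the end of its line.
cat > sample.xml <<'EOF'
<ss:Table>
<ss:Column ss:Width="252" ss:AutoFitWidth="1"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="126"/>
<ss:Column ss:AutoFitWidth="1" ss:Width="252"/>
</ss:Table>
EOF

# While done is still 0, try the substitution on each line.
# sub() returns the number of replacements made (0 or 1), so done
# is set on the first line that actually contains a width attribute.
# Once done is 1, !done is false and sub() is no longer called.
awk '!done && sub(/ss:Width="[0-9]+"/, "ss:Width=\"140\"") { done = 1 }
     { print }' sample.xml > sample.out
cat sample.out
```

Only the first ss:Width in the file is rewritten; every other line, matching or not, is printed unchanged.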
Your original approach keeps an array of counters indexed by the full text of each input line ($0), so c equals 1 the first time each distinct line appears. That's why you saw one replacement per unique line rather than one replacement overall.
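To watch the counter at work, the sketch below prints c next to each line of a made-up six-line input. The letters a, b, c and the file name counts.out are illustrative and not taken from the question.

```shell
# count[] is keyed by the whole line, so each distinct line
# gets its own independent counter starting from 1.
printf '%s\n' a b a c b a |
  awk '{ c = ++count[$0]; print c, $0 }' > counts.out
cat counts.out
# 1 a
# 1 b
# 2 a
# 1 c
# 2 b
# 3 a
```

With three distinct lines in the question's data, c==1 is true three times. The extra c++ at the end of the block changes nothing, because c is reassigned from count[$0] at the start of every record.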
Some of the other answers assume that all lines will match the /ss:Width/ regex and/or always find the width attribute at the end of a line. That's probably true in your case, but it's worth noting. I decided not to assume those things in the script above.