I have a converted dictionary that I want to use in Pure Data. Each entry consists of three things: the word, how to pronounce it, and a semicolon to finish the entry. In the converted dictionary some semicolons are missing, so I want AWK to find the places where one is missing and insert it for me. I have used delimiters before, but this case is difficult for me, so any help will be appreciated. See the text file below: the first three entries are good; the last three are wrong because the semicolon at the end is missing. I think the AWK delimiter should be the boundary between a line of non-capital letters and a line of capital letters, and the action is to insert a semicolon there if there is none already. How can I write this in AWK code?
ELFKIN
Elf
kin;
ELFLAND
Elf
land
;
ELFLOCK
Elf
lock
;
ELGIN
El
gin
ELICIT
E
lic
it
ELICIT
E
lic
it
I have used delimiters before, but I do not know how to specify "between" in AWK. So the delimiter is the boundary between non-capital letters and capital letters, and a semicolon should go there. I imagine the code would look something like awk 'length($0)>1 && line with all capitals: put semicolon before this line' or awk 'line with non-capitals, if next line is capitals: put semicolon after line'. I have tried this:
awk 'length($0>1) && /[:^, upper:]/{l=l";"}NR>1{print l}{l=$0}END{print l}' file2
This does not work well.
Or am I heading in the wrong direction?
I would harness GNU AWK for this task in the following way: let file.txt content be
then
gives output
Explanation: setting RS to an empty string engages paragraph mode; as file.txt has no blank lines, it is treated as a single record. Then I use the gensub string function to replace all (g as in globally) occurrences of a lowercase letter followed by a newline followed by an uppercase letter with the first of those letters, followed by a semicolon, followed by a newline, followed by the second letter. (tested in GNU Awk 5.1.0)
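A minimal sketch of the approach explained above, under some assumptions. It keeps paragraph mode (RS set to the empty string) but swaps gensub for a match()/substr() loop, so it should also run under awk implementations other than GNU AWK; the equivalent gensub call is given as a comment. The names add_semicolons and sample.txt are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: prints the file named in $1 with missing semicolons added.
add_semicolons() {
  awk 'BEGIN { RS = "" }   # paragraph mode: without blank lines the whole file is one record
  {
    rest = $0
    out = ""
    # Look for a lowercase letter directly followed by newline + uppercase letter.
    while (match(rest, /[a-z]\n[A-Z]/)) {
      out = out substr(rest, 1, RSTART) ";"   # text up to and including the lowercase letter, then ";"
      rest = substr(rest, RSTART + 1)         # continue scanning from the newline onward
    }
    print out rest
  }' "$1"
}

# Assumed GNU AWK equivalent of the loop, using gensub with two backreferences:
#   gawk 'BEGIN { RS = "" } { print gensub(/([a-z])\n([A-Z])/, "\\1;\n\\2", "g") }' sample.txt

# Small sample taken from the broken entries in the question.
printf 'ELGIN\nEl\ngin\nELICIT\nE\nlic\nit\n' > sample.txt
add_semicolons sample.txt
```

Entries that already end in a semicolon are left alone, because a semicolon is not a lowercase letter and so the pattern does not match there. The very last entry in the file has no headword after it, so this substitution does not touch it; a missing semicolon there would still need to be added separately.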