PowerShell: replacing Unicode characters like "çöşİğü"


Total coding noob here, trying to fix a script that helps me at my job.

Basically, I want to add "*" after the Turkish letters "ç, ö, İ, ş, ü, ğ, ı" in a text file.

A quick example: "Andaç" should be replaced with "Andaç*" (It can be "Andac*" too, doesn't matter. I just need to mark the letter with * or # or @, or even "XXX", whatever!)

The PowerShell command below, which I run from a .bat file, works for Latin characters:

powershell -Command "(gc test.txt -Raw) -creplace 'a', ('a*') | Out-File test2.txt"

It successfully replaces "a" with "a*".

But when I use "ç" instead of "a", the output is "ç" when I run it, or just "ç" with other encodings. The -creplace operator seems to ignore that special character and does nothing.

How can I achieve this?

Long story if this will help:

I use this to detect missing punctuation marks in a text file.

For example, test.txt contains this:

First sentence.

Second sentence

Third sentence.

To mark the missing dot in "Second sentence", I use the command below:

-creplace 'e\r\n', ('e*' + [Environment]::NewLine)

and the output (test2.txt) becomes this:

First sentence.

Second sentence*

Third sentence.

So I repeat this code for each letter from a to z. But it changes nothing when the letter is one of those Turkish letters.
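For reference, the letter-by-letter repetition described above can be sketched as a single pattern. This is only an illustration under stated assumptions: the function name `Add-MissingPunctuationMark` is hypothetical, and "missing punctuation" is taken to mean "the line ends in a letter". The Unicode class `\p{L}` matches any letter, including the Turkish ones, so nothing non-ASCII has to appear in the script itself.

```powershell
# Hypothetical helper: marks every line that ends in a letter with '*'.
function Add-MissingPunctuationMark {
    param([string]$Text)
    # \p{L} matches any Unicode letter (Latin or Turkish alike).
    # The lookahead requires a line break (CRLF or LF) or the end of the
    # text right after that letter, without consuming the line break.
    return $Text -replace '(\p{L})(?=\r?\n|\z)', '$1*'
}

# Sample text from the question, joined with the platform's line break.
$nl = [Environment]::NewLine
$sample = 'First sentence.' + $nl + 'Second sentence' + $nl + 'Third sentence.'
Add-MissingPunctuationMark -Text $sample
```

With the sample text, only the "Second sentence" line ends in a letter, so only that line gets the mark.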


There is 1 answer

Bryan van Rijn

A little late on this topic, but it might help others.

Apparently the PowerShell client on my development VDI behaved differently from my local PowerShell.

So I set the encoding explicitly when calling Get-Content, like this:

$json = Get-Content -Encoding "UTF8" -Path $jsonName | ConvertFrom-Json
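Applied to the question's text file, the same idea might look like the sketch below. It is a minimal illustration, not a tested fix for the asker's machine. It assumes the input file is saved as UTF-8 and that PowerShell 5.0 or later is available (for `-NoNewline`). The function name `Add-TurkishLetterMark`, the temp-folder paths, and the sample content are hypothetical. The Turkish letters are written as `\uXXXX` regex escapes so the script stays pure ASCII and does not depend on how the script or .bat file itself was saved.

```powershell
# Hypothetical helper: appends '*' to each Turkish letter listed in the question.
function Add-TurkishLetterMark {
    param([string]$Text)
    # Code points: c-cedilla, o-umlaut, s-cedilla, dotted capital I,
    # g-breve, u-umlaut, dotless i.
    $pattern = '[\u00E7\u00F6\u015F\u0130\u011F\u00FC\u0131]'
    # $0 is the whole match, so the letter is kept and '*' is added after it.
    return $Text -creplace $pattern, '$0*'
}

# Hypothetical paths in the temp folder so the sketch runs anywhere.
$inPath  = Join-Path ([System.IO.Path]::GetTempPath()) 'test.txt'
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'test2.txt'

# Sample input: "Anda" followed by c-cedilla, built from a code point.
Set-Content -Path $inPath -Value ('Anda' + [char]0x00E7) -Encoding UTF8

# Read and write with an explicit encoding instead of relying on defaults.
$text   = Get-Content -Path $inPath -Raw -Encoding UTF8
$marked = Add-TurkishLetterMark -Text $text
Set-Content -Path $outPath -Value $marked -Encoding UTF8 -NoNewline
```

Saving this as a .ps1 file (say, a hypothetical mark.ps1) and calling it from the .bat file with `powershell -File mark.ps1` keeps the non-ASCII handling out of the batch file and the console code page.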

Hope this helps