Extracting Substring from File Name

I have a list of files with the following file name format:

[some unknown amount of characters][_d][yyyymmdd][some unknown amount of characters]

I want to extract the substring that contains the date (yyyymmdd), which I know will always be preceded by "_d". So basically I want to extract the first 8 characters after the "_d".

What is the best way to go about doing this?

There is 1 answer
fedorqui (best answer)

I would use sed:

$ echo "asdfasd_d20150616asdasd" | sed -r 's/^.*_d(.{8}).*$/\1/'
20150616

This matches the whole line, captures the 8 characters that follow _d, and replaces the line with just those captured characters.

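  • sed -r enables extended regular expressions, so groups can be captured with plain () instead of \(\). GNU sed also accepts -E for this, which is the spelling BSD/macOS sed uses.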
  • ^.*_d(.{8}).*$
    • ^ beginning of line
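    • .* any number of characters (even 0 of them). It is greedy, so if _d appears more than once, the last occurrence that still has 8 characters after it is the one matched.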
    • _d literal _d you want to match
    • (.{8}) since . matches any character, .{8} matches exactly 8 characters. The () capture them so that they can be reused later on.
    • .*$ any number of characters up to the end of the line.
  • \1 prints back the captured group.
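
Since the question is about a list of files, here is a minimal sketch of how the same substitution could be wrapped in a function and applied to several names. The function name extract_date and the sample file names are made up for illustration; -E is used in place of -r on the assumption that your sed supports it (GNU and BSD sed both do).

```shell
#!/bin/sh

# extract_date is a hypothetical helper name, not part of the original answer.
# It prints the 8 characters that follow "_d" in the string passed as $1.
extract_date() {
    printf '%s\n' "$1" | sed -E 's/^.*_d(.{8}).*$/\1/'
}

# Sample file names, invented for this example.
for name in "report_d20150616_final.txt" "backup_d20140101.csv"; do
    printf '%s -> %s\n' "$name" "$(extract_date "$name")"
done
# Output:
# report_d20150616_final.txt -> 20150616
# backup_d20140101.csv -> 20140101
```

Note that if a name does not match the pattern at all, sed prints it unchanged, so you may want to check the result before using it as a date.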