Let's say that we have multiple .log files on the prod unix machine(Sunos) in a directory: For example:
ls -tlr
total 0
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 file2017-01.log
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 file2016-02.log
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 todo2015-01.log
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 fix20150223.log
The purpose here is that via nawk I extract specific info from the logs( parse logs ) and "transform" them to .csv files in order to load them to ORACLE tables afterwards. Although the nawk has been tested and works like a charm, how could I automate a bash script that does the following:
1) For a list of given files in this path
2) nawk (to do my extraction of specific data/info from the log file)
3) Output separately each file to a unique .csv to another directory
4) remove the .log files from this path
What does concern me is that the loadstamp/timestamp on each file ending that is different. I have implemented a script that works only for the latest date. (eg. last month). But I want to load all the historical data and I am bit stuck.
To visualize, my desired/target output is this:
bash-4.4$ ls -tlr
total 0
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 file2017-01.csv
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 file2016-02.csv
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 todo2015-01.csv
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 fix20150223.csv
How could this bash script please be achieved? The loading will only takes one time, it's historical as mentioned. Any help could be extremely useful.
An implementation written for readability rather than terseness might look like:
To understand how this works, see:
*.log
is replaced with a list of files with that extension.for
Loop -- Thefor infile in
syntax, used to iterate over the results of the glob above.${infile%.log}
syntax, used to expand the contents of theinfile
variable with any.log
suffix pruned.<"$infile"
and>"$outfile"
, opening stdin and stdout attached to the named files; or>&2
, redirecting logs to stderr. (Thus, when we runawk
, its stdin is connected to a.log
file, and its stdout is connected to a.csv
file).