bash - How can I iterate over .log files, process them through awk, and replace with output files with different extensions?

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

Let's say that we have multiple .log files in a directory on the production Unix machine (SunOS). For example:

ls -tlr
total 0
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 file2017-01.log
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 file2016-02.log
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 todo2015-01.log
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 fix20150223.log

The purpose here is to use nawk to extract specific information from the logs (that is, to parse them) and "transform" them into .csv files that can be loaded into Oracle tables afterwards. The nawk script has been tested and works like a charm. How could I write a bash script that automates the following?

1) Take a given list of files in this path

2) Run nawk on each file to extract the specific data from the log

3) Write the output for each file to its own .csv in another directory

4) Remove the .log files from this path

What concerns me is that the load stamp/timestamp at the end of each file name is different. I have implemented a script that works only for the latest date (e.g. last month), but I want to load all the historical data and I am a bit stuck.

To visualize, my desired/target output is this:

bash-4.4$ ls -tlr
total 0
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 file2017-01.csv
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 file2016-02.csv
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 todo2015-01.csv
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 fix20150223.csv

How could this bash script be written? The load will only happen once; as mentioned, the data is historical. Any help would be extremely useful.

1 Reply

深蓝 · Answer 1 · 2021-10-23T18:42:30+0000

An implementation written for readability rather than terseness might look like:

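#!/usr/bin/env bash
for infile in *.log; do
  # If nothing matches, the glob stays as the literal string "*.log"; skip it.
  [ -e "$infile" ] || continue
  outfile=${infile%.log}.csv   # same name, with .csv in place of .log
  # yourscript is a placeholder for the file holding the awk program.
  if awk -f yourscript <"$infile" >"$outfile"; then
    rm -f -- "$infile"         # remove the log only after awk succeeded
  else
    echo "Processing of $infile failed" >&2
    rm -f -- "$outfile"        # discard the partial .csv
  fi
done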
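
The question also asks for the .csv files to be written to another directory. A minimal sketch of one way to handle that is given here, under stated assumptions: convert_logs, its three arguments, and the AWK environment variable are hypothetical names introduced for illustration and are not part of the original answer. The body runs in a subshell so its variables do not leak into the caller, and it sticks to POSIX shell constructs, so it should also run under bash.

```bash
#!/usr/bin/env bash
# Sketch only: convert_logs, its arguments, and the AWK variable are
# hypothetical names chosen for this example.
# Usage: convert_logs AWK_SCRIPT INPUT_DIR OUTPUT_DIR
convert_logs() (
  script=$1
  indir=$2
  outdir=$3
  mkdir -p -- "$outdir" || exit 1
  status=0
  for infile in "$indir"/*.log; do
    # With no matches the glob stays literal ("INPUT_DIR/*.log"); skip it.
    [ -f "$infile" ] || continue
    name=${infile##*/}                 # strip the directory part
    outfile=$outdir/${name%.log}.csv   # swap .log for .csv
    # Use the program named in $AWK if set (e.g. nawk), otherwise awk.
    if "${AWK:-awk}" -f "$script" <"$infile" >"$outfile"; then
      rm -f -- "$infile"               # delete the log only on success
    else
      echo "Processing of $infile failed" >&2
      rm -f -- "$outfile"              # discard the partial .csv
      status=1
    fi
  done
  exit "$status"                       # non-zero if any file failed
)
```

On a machine where the tested program is run with nawk, a call could look like AWK=nawk convert_logs yourscript . ../csv, where ../csv stands in for whatever output directory is actually wanted.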

To understand how this works, see:

Globbing -- the mechanism by which *.log is replaced with a list of files with that extension.
The Classic for Loop -- the for infile in syntax, used to iterate over the results of the glob above.
Parameter expansion -- the ${infile%.log} syntax, used to expand the contents of the infile variable with any trailing .log suffix removed.
Redirection -- the syntax used in <"$infile" and >"$outfile", which opens stdin and stdout attached to the named files, and in >&2, which sends the error message to stderr. (Thus, when we run awk, its stdin is connected to a .log file and its stdout is connected to a .csv file.)
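
As a small illustration of the parameter expansion step, the sketch below applies the same ${...%.log} expansion to three of the file names from the question. The helper name to_csv_name is hypothetical and exists only to make the expansion easy to try on its own.

```bash
# to_csv_name is a hypothetical helper, used only to show the expansion.
to_csv_name() {
  # ${1%.log} removes a trailing ".log" from the first argument, if present.
  printf '%s\n' "${1%.log}.csv"
}

for name in file2017-01.log todo2015-01.log fix20150223.log; do
  to_csv_name "$name"
done
# prints:
#   file2017-01.csv
#   todo2015-01.csv
#   fix20150223.csv
```

Because % strips only a suffix, the differing date stamps in the middle of each name are carried over unchanged, which is why the loop needs no knowledge of the dates.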
