regex - Parsing timestamp in nginx logs

I need help as I am new to log parsing. I'm trying to extract all log lines that have a 200 status and a timestamp in hour 15, before 15:35. I can't figure out which regex to use.

Here is a sample of the log:

198.104.78.160 [26/Dec/2016:15:24:12 -0500] 200 190.50.175.65:8080 200 testtest.com GET /api/bid_request?feed=1&auth=qwerty&ip=85.194.119.3&ua=Mozilla%2F5.0+%28Windows+NT+6.1%3B+Win64%3B+x64%29+AppleWebKit%2F537.36+%28KHTML%2C+like+Gecko%29+Chrome%2F48.0.2564.97+Safari%2F537.36&lang=tr-TR%2Ctr%3Bq%3D0.8%2Cen-US%3Bq%3D0.6%2Cen%3Bq%3D0.4&ref=http%3A%2F%2Fserve.pop.net%2Fs HTTP/1.0 - - - 174.194.36.141 - 0.109-0.009 US /


1 Reply

You can do that with awk. The three-argument form of match() used below is a GNU awk (gawk) extension, so run it with gawk:

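awk -v status_code=200 -v ts_at_hour=15 -v ts_before_hour=15 -v ts_before_min=35 '

    {
        # Capture hour, minute, second and status code into items[1..4].
        # Literal dots, brackets and slashes are escaped; \s matches whitespace.
        match($0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\s+\[[0-9]{2}\/[a-zA-Z]{3}\/[0-9]{4}:([0-9]{2}):([0-9]{2}):([0-9]{2})\s+[+-][0-9]{4}\]\s+([0-9]{3})/, items)

        # Keep lines in the requested hour, before the cutoff minute,
        # with the requested status code.
        if (items[1] == ts_at_hour &&
            items[1] <= ts_before_hour &&
            items[2] < ts_before_min &&
            items[4] == status_code){
          print $0
        }
    }
' data.txt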

The -v options set the variables that hold your requirements: status_code, ts_at_hour, ts_before_hour and ts_before_min. You can also fill them from shell environment variables.
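As a rough sketch of feeding those values from the environment, the function below reads three environment variables and passes them to awk with -v. The names filter_log, STATUS_CODE, TS_AT_HOUR and TS_BEFORE_MIN are hypothetical, and the defaults mirror the values above. Note that it swaps the regex for plain field splitting, assuming the fields are whitespace-separated exactly as in the sample line, so that it also runs on awk implementations other than gawk.

```shell
# Hypothetical helper: thresholds come from environment variables,
# falling back to the values used in the answer.
filter_log() {
    awk -v status_code="${STATUS_CODE:-200}" \
        -v ts_at_hour="${TS_AT_HOUR:-15}" \
        -v ts_before_min="${TS_BEFORE_MIN:-35}" '
    {
        # $2 looks like "[26/Dec/2016:15:24:12"; splitting on ":" gives
        # t[2] = hour and t[3] = minute. $4 is the first status field.
        n = split($2, t, ":")
        if (n >= 3 &&
            t[2] + 0 == ts_at_hour + 0 &&
            t[3] + 0 <  ts_before_min + 0 &&
            $4 + 0   == status_code + 0) {
            print $0
        }
    }' "$@"
}
```

Called as `filter_log data.txt` it uses the defaults; `TS_BEFORE_MIN=45 filter_log data.txt` moves the cutoff to 15:45 for that one call.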

The regex captures 4 groups: the hour, minutes and seconds, each matched by ([0-9]{2}), and the status code at the end, matched by ([0-9]{3}).

The regex breaks down as follows:

  • the IP address [0-9]+\.[0-9]+\.[0-9]+\.[0-9]+ followed by one or more whitespace characters \s+
  • the date part, which includes the hour, minutes and seconds: \[[0-9]{2}\/[a-zA-Z]{3}\/[0-9]{4}:([0-9]{2}):([0-9]{2}):([0-9]{2})\s+[+-][0-9]{4}\] (notice the 3 groups between parentheses)
  • the status code, after more whitespace: \s+([0-9]{3})
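To check what the four groups capture on a given line, here is a minimal sketch that prints them. It uses sed -E rather than awk, a swapped-in tool chosen only because it needs no gawk extensions; the function name show_groups is hypothetical. The pattern follows the same three parts, with [[:space:]] standing in for \s and "#" as the delimiter so that "/" needs no escaping.

```shell
# Hypothetical helper: prints "hour minute second status" for every
# input line that matches the log layout, and nothing for other lines.
show_groups() {
    sed -nE 's#^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+[[:space:]]+\[[0-9]{2}/[a-zA-Z]{3}/[0-9]{4}:([0-9]{2}):([0-9]{2}):([0-9]{2})[[:space:]]+[+-][0-9]{4}\][[:space:]]+([0-9]{3}).*#\1 \2 \3 \4#p' "$@"
}
```

Running `show_groups data.txt` on the sample line from the question should print `15 24 12 200`, which are the values the awk script compares against its variables.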
