Category Archives: awk

Apache – Logs from the Last Hour

I am using a cPanel account and have an Apache 2.4 access log that stores its logs like:

66.249.93.30 - - [04/May/2018:21:26:39 +0200] "GET / HTTP/1.1" 302 207 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko; Google Page Speed Insights) Chrome/41.0.2272.118 Safari/537.36"
66.249.93.30 - - [05/May/2018:10:26:39 +0200] "GET / HTTP/1.1" 302 207 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko; Google Page Speed Insights) Chrome/41.0.2272.118 Safari/537.36"

The timestamp corresponds to the format date "+%d/%b/%Y:%H:%M:%S" (abbreviated month name, zero-padded 24-hour clock).

Using a bash script I would like to extract just the lines that were logged in the last hour, for example:

Full Log file:

66.249.93.30 - - [04/May/2018:21:26:39 +0200] First Line
66.249.93.30 - - [05/May/2018:11:00:21 +0200] Second Line
66.249.93.30 - - [05/May/2018:11:15:39 +0200] Third Line
66.249.93.30 - - [05/May/2018:12:00:11 +0200] Fourth Line

Current Time: 05/May/2018:12:01:06

Logs from: 5th of May between the time interval of 11:01 - 12:01

Filtered Output:

66.249.93.30 - - [05/May/2018:11:15:39 +0200] Third Line
66.249.93.30 - - [05/May/2018:12:00:11 +0200] Fourth Line

I have tried awk and several other suggestions, but I can't get it to work. Any help will be appreciated!
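One possible approach, as a minimal sketch: turn each log timestamp into a sortable key of the form YYYYmmddHHMMSS and compare it as a string against a cutoff computed once outside awk. The function name filter_since is hypothetical, the usage line assumes GNU date (for -d '1 hour ago'), and the sketch assumes the timezone of the log matches the server's local time, since the +0200 offset is ignored.

```shell
# filter_since CUTOFF: print the log lines on stdin whose timestamp is at or
# after CUTOFF, where CUTOFF is a sortable key of the form YYYYmmddHHMMSS.
filter_since() {
    awk -v cutoff="$1" '
        BEGIN {
            # Map month abbreviations to two-digit numbers: Jan -> 01, ...
            n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", name, " ")
            for (i = 1; i <= n; i++) month[name[i]] = sprintf("%02d", i)
        }
        {
            # $4 looks like "[05/May/2018:11:15:39"; drop the bracket and split
            split(substr($4, 2), t, "[/:]")
            key = t[3] month[t[2]] t[1] t[4] t[5] t[6]
            # Both sides are forced to strings, so this is a string comparison
            if ((key "") >= (cutoff "")) print
        }
    '
}

# Usage (assumes GNU date and a file named access.log):
#   filter_since "$(date -d '1 hour ago' '+%Y%m%d%H%M%S')" < access.log
```

Because the key is fixed-width and ordered from year down to second, a plain string comparison is enough and no epoch conversion is needed.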

Using awk to generate a report from Apache HTTP logs

I am hoping someone can help me with a bash script on Linux that generates a report from HTTP logs.

Log format:

domain.com 101.100.144.34 - r.c.bob [14/Feb/2017:11:31:20 +1100] "POST /webmail/json HTTP/1.1" 200 1883 "https://example.domain.com/webmail/index-rui.jsp?v=1479958955287" "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko" 1588 2566 "110.100.34.39" 9FC1CC8A6735D43EF75892667C08F9CE 84670 - - - -  

Required output:

time in epoch,host,Resp Code,count  

1485129842,101.100.144.34,200,4000  
1485129842,101.101.144.34,404,1889

What I have so far, which is nowhere near what I am trying to achieve:

tail -100 httpd_access_*.log | awk '{print  $5 " " $2 " " $10}' | sort | uniq
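A minimal sketch of one way to produce that report, assuming the count is wanted per (second, host, response code) and that the fields are laid out as in the sample line: host in $2, timestamp in $5 and $6, response code in $10. The function names access_report and days_from_civil are hypothetical. The epoch is computed with plain arithmetic so the sketch does not depend on gawk's mktime.

```shell
# access_report: read log lines on stdin, print "epoch,host,status,count".
access_report() {
    awk '
        # Days between 1970-01-01 and the given date (Gregorian calendar).
        function days_from_civil(y, m, d,    era, yoe, doy) {
            if (m <= 2) y--
            era = int(y / 400)
            yoe = y - era * 400
            doy = int((153 * (m > 2 ? m - 3 : m + 9) + 2) / 5) + d - 1
            return era * 146097 + yoe * 365 + int(yoe / 4) - int(yoe / 100) + doy - 719468
        }
        BEGIN {
            n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", name, " ")
            for (i = 1; i <= n; i++) month[name[i]] = i
        }
        {
            # $5 is "[14/Feb/2017:11:31:20" and $6 is "+1100]"
            split(substr($5, 2), t, "[/:]")
            sign = (substr($6, 1, 1) == "-") ? -1 : 1
            offset = sign * (substr($6, 2, 2) * 3600 + substr($6, 4, 2) * 60)
            epoch = days_from_civil(t[3], month[t[2]], t[1]) * 86400 \
                    + t[4] * 3600 + t[5] * 60 + t[6] - offset
            count[sprintf("%d,%s,%s", epoch, $2, $10)]++
        }
        END { for (k in count) print k "," count[k] }
    ' | sort
}

# Usage (file names as in the question):
#   cat httpd_access_*.log | access_report
```

To count per minute or per hour instead of per second, the seconds (or minutes and seconds) could be left out of the epoch calculation before building the key.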

Show IP Address and IP Address Count Per Hour from Apache Logs

How do I use awk to parse the Apache access log file to display information in the following format?

   Date     Time  Count   IP Address
2016-05-26  00:00  200    192.168.1.x
2016-05-26  00:00  152    172.17.100.x
2016-05-26  00:01   43    192.168.1.x

Let me be clear. I do not want to show total requests per hour. I do not want to show total requests per minute. I know how to write basic awk scripts to perform both of those tasks.

I want to see how many requests per minute each unique IP address is sending. I'm not savvy enough with awk to do this.

Basic Server Information:

OS: Ubuntu 12.04.5 LTS

Apache: Apache 2.2.22

Apache Log Format

LogFormat "%h %l %u %{%F %T %z}t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\""

Example 1:

grep '2016-05-26' access.log | awk '{print $1}' | sort | uniq -c | sort -n | tail -40 | awk '{print $2,$2,$1}' | logresolve | awk '{printf "%6d %s (%s)\n",$3,$1,$2}'

Produces the following output

307 135-23-174-138.cpe.pppoe.ca (135.23.174.138)
313 5265DCE5.cm-8.dynamic.ziggo.nl (82.101.220.229)
378 92-108-204-76.dynamic.upc.nl (92.108.204.76)
405 0191301456.0.fullrate.ninja (90.185.180.167)
632 ec2-52-58-151-132.eu-central-1.compute.amazonaws.com (52.58.151.132)
798 187.228.212.148 (187.228.212.148)
877 207.246.75.253 (207.246.75.253)
966 ec2-54-213-177-120.us-west-2.compute.amazonaws.com (54.213.177.120)
1116 ec2-54-186-148-0.us-west-2.compute.amazonaws.com (54.186.148.0)
1224 ppp121-44-247-209.bras2.syd2.internode.on.net (121.44.247.209)
1369 ec2-54-187-239-46.us-west-2.compute.amazonaws.com (54.187.239.46)
1584 45.55.189.64 (45.55.189.64)
2658 50-77-47-70-static.hfc.comcastbusiness.net (50.77.47.70)

Example 2:

grep "2016-05-26" access.log | awk '{ print $4, $5, $1}' | cut -f2 | awk -F: '{ print $1":"$2 }' | sort -nk1 -nk2 | uniq -c | awk '{ if ($1 > 10) print $0 }'

That gives the following output:

560 2016-05-26 00:00
534 2016-05-26 00:01
538 2016-05-26 00:02
554 2016-05-26 00:03
566 2016-05-26 00:04
534 2016-05-26 00:05
559 2016-05-26 00:06
531 2016-05-26 00:07
540 2016-05-26 00:08
435 2016-05-26 00:09
312 2016-05-26 00:10

Sample

I tailed the end of the log file. Here's a small sample of what it contains. (We have over 100K entries for today. It's not feasible to share them all here. If more lines are needed please ask.)

54.213.236.39 - - 2016-05-26 14:38:51 -0400 "GET /p1077921.html HTTP/1.0" 403 400 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"
54.213.236.39 - - 2016-05-26 14:38:51 -0400 "GET /p1060432.html HTTP/1.0" 403 398 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"
54.213.254.166 - - 2016-05-26 14:38:51 -0400 "GET /p819757.html HTTP/1.0" 403 400 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"
54.213.236.39 - - 2016-05-26 14:38:51 -0400 "GET /p1084269.html HTTP/1.0" 403 400 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"
107.23.252.229 - - 2016-05-26 14:38:51 -0400 "GET /p305987.html HTTP/1.0" 403 399 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_77)"

All help is much appreciated.
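A minimal sketch of the per-minute, per-IP count, assuming the LogFormat shown above so that the IP address is $1, the date is $4 and the time is $5. The function name per_minute_by_ip is hypothetical. The key joins the date, the HH:MM part of the time and the IP address, so each distinct combination gets its own counter.

```shell
# per_minute_by_ip: read log lines on stdin and print
# "date  HH:MM  count  ip" for every (minute, IP address) pair.
per_minute_by_ip() {
    awk '
        # substr($5, 1, 5) keeps HH:MM and drops the seconds
        { count[$4 " " substr($5, 1, 5) " " $1]++ }
        END {
            for (k in count) {
                split(k, f, " ")
                printf "%s  %s  %5d  %s\n", f[1], f[2], count[k], f[3]
            }
        }
    ' | sort
}

# Usage (file name as in the question):
#   grep '2016-05-26' access.log | per_minute_by_ip
```

Using substr($5, 1, 2) instead would give counts per hour rather than per minute.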

Magento 2 – Cron Jobs not indexing

I have searched Stack with no luck. I have come from Magento 1, where you could index from the admin, to Magento 2, where you have to run cron jobs.

I have a fresh install on CentOS 6.6 with cPanel, running PHP 7.

I have put up about 150 products so far, but I have reached a point where they no longer appear on my site. I believe this is because my cron jobs aren't running.

The jobs I have tried via cPanel are as follows.

========================================================================
This set of jobs gives no errors:

2016-05-21 15:14:03 Ran jobs by schedule.
2016-05-21 15:15:04 Ran jobs by schedule.
2016-05-21 15:16:03 Ran jobs by schedule.
2016-05-21 15:17:03 Ran jobs by schedule.
2016-05-21 15:18:03 Ran jobs by schedule.
2016-05-21 15:19:02 Ran jobs by schedule.
2016-05-21 15:20:03 Ran jobs by schedule.
2016-05-21 15:21:02 Ran jobs by schedule.
2016-05-21 15:22:03 Ran jobs by schedule.
2016-05-21 15:23:03 Ran jobs by schedule.
2016-05-21 15:24:03 Ran jobs by schedule.
2016-05-21 15:25:03 Ran jobs by schedule.

*/1 * * * * /usr/local/bin/php /home/downupne/public_html/bin/magento cron:run | awk '{ print strftime("\%Y-\%m-\%d \%H:\%M:\%S"), $0; fflush(); }' 2>&1 >> ~/magento_cron.log


*/1 * * * * /usr/local/bin/php /home/downupne/public_html/update/cron.php | awk '{ print strftime("\%Y-\%m-\%d \%H:\%M:\%S"), $0; fflush(); }' 2>&1 >> ~/magento_cron.log


*/1 * * * * /usr/local/bin/php /home/downupne/public_html/bin/magento setup:cron:run | awk '{ print strftime("\%Y-\%m-\%d \%H:\%M:\%S"), $0; fflush(); }' 2>&1 >> ~/magento_cron.log

=======================================================================
With this set of jobs, nothing appears in the log:

*/1 * * * * /usr/bin/php -c /etc/php5/apache2/php.ini /var/www/magento2/bin/magento cron:run | grep -v "Ran jobs by schedule" >> /var/www/magento2/var/log/magento_cron.log


*/1 * * * * /usr/bin/php -c /etc/php5/apache2/php.ini /var/www/magento2/update/cron.php >> /var/www/magento2/var/log/update_cron.log


*/1 * * * * /usr/bin/php -c /etc/php5/apache2/php.ini /var/www/magento2/bin/magento setup:cron:run >> /var/www/magento2/var/log/setup_cron.log

Any advice would be greatly appreciated.

Regards

Nathan
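One detail in the first set of jobs that may be worth checking: the shell applies redirections from left to right, so "2>&1 >> file" points stderr at wherever stdout was going before the append, and only stdout reaches the file. In those jobs the redirections also apply to awk alone, not to the php command on the left of the pipe, so error output from php would not reach the log either way. The following is a minimal sketch of the ordering effect only; emit, wrong_order and right_order are hypothetical stand-ins, not the real cron commands.

```shell
# emit writes one line to stdout and one line to stderr.
emit() {
    echo "out"
    echo "err" >&2
}

# Same order as in the cron jobs: stderr is duplicated onto the ORIGINAL
# stdout first, then stdout alone is appended to the file.
wrong_order() { emit 2>&1 >> "$1"; }

# stdout is appended to the file first, then stderr follows it there.
right_order() { emit >> "$1" 2>&1; }

# Usage:
#   wrong_order /tmp/demo.log   # "err" still shows up on the terminal
#   right_order /tmp/demo.log   # both lines end up in /tmp/demo.log
```

Whether this has any bearing on the indexing problem is a separate matter; it only affects which messages end up in magento_cron.log.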

Parsing Apache Error.Log in Ubuntu 14.04

I tried to parse the Apache error log using the following command:

sudo tail -f /var/log/apache2/error.log |  awk  '$8 ~ /(400|500)/ {print $6}'

I am trying to view only 400 or 500 errors. But when I run that command over SSH, it doesn't return anything. In other words, nothing is displayed on the screen.

For the record, I have purposely pointed my browser to a bogus URL so that it generates a 400 or 500 error.

Am I missing anything?
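Apache's error log normally records messages rather than HTTP status codes; the status code is a field of the access log. A minimal sketch of filtering on it, assuming the combined log format (request path in $7, status in $9) and the Ubuntu default path /var/log/apache2/access.log. The function name filter_errors is hypothetical, and the pattern matches any 4xx or 5xx status, since a bogus URL usually produces a 404 rather than a 400.

```shell
# filter_errors: read access-log lines on stdin and print
# "status path" for every request that ended in a 4xx or 5xx status.
filter_errors() {
    awk '$9 ~ /^[45][0-9][0-9]$/ { print $9, $7 }'
}

# Usage (path is the Ubuntu default, adjust as needed):
#   sudo tail -f /var/log/apache2/access.log | filter_errors
```

To match only the exact codes from the question, the pattern could be changed to /^(400|500)$/.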

Why does grep not filter correctly if I try to get the Apache server version

I use this command to get the apache server version:

apachectl -V | grep -i "Server version" | tr "/" " " | awk '{ print $4 }'

But this does not work on every system. Sometimes I get some other output before my server version output:

AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1. Set the 'ServerName' directive globally to suppress this message

The question is: why do I get this output even though grep should filter it out? I know that I can suppress it, but why does it show up at all when I use grep?
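A likely explanation is that the AH00558 warning is written to stderr, while a pipe only carries stdout, so grep never sees that line and it goes straight to the terminal. A minimal sketch of the effect follows. Here fake_apachectl is a hypothetical stand-in for "apachectl -V" so that the example runs without Apache installed, and the version string and build date in it are made-up sample values.

```shell
# Hypothetical stand-in for `apachectl -V`: warning on stderr, data on stdout.
fake_apachectl() {
    echo "AH00558: apache2: Could not reliably determine the server's fully qualified domain name" >&2
    echo "Server version: Apache/2.4.18 (Ubuntu)"
    echo "Server built:   2016-04-15T18:00:57"
}

# server_version prints only the version number.
server_version() {
    # 2>/dev/null discards stderr before the pipe; awk splits on "/" or " "
    fake_apachectl 2>/dev/null | awk -F'[/ ]' '/^Server version/ { print $4 }'
}

# Usage with the real command:
#   apachectl -V 2>/dev/null | awk -F'[/ ]' '/^Server version/ { print $4 }'
```

Redirecting with 2>&1 before the pipe instead would send the warning through grep, which would then filter it out like any other non-matching line.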

bash: get nextline if previous column is higher than ‘xxx’

I need to analyze the output of Apache server-status. By parsing the status page, I need to match every entry that has a high delay in "Sending Reply". The content looks like this:

11-1 24986 7/9/7288 K 0.08 3 1 77.5 0.08 23.17 IP-CLIENT
hostname:80 GET /static/img/securoty.png HTTP/1.1
12-1 23648 65/108/8176 K 5.74 2 51 90.6 0.16 24.50 IP-CLIENT
hostname:80 POST /php/toolbar_ajax.php HTTP/1.1
13-1 22887 95/118/7672 K 5.38 2 47 140.5 0.17 18.65 IP-CLIENT
hostname:80 POST /php/toolbar_ajax.php HTTP/1.1
14-1 24987 4/6/8016 K 0.09 4 379 288.5 0.28 22.42 IP-CLIENT
hostname:80 GET /static/img/bg_dealers.jpg HTTP/1.1
15-1 24518 7/43/8425 K 2.36 4 53 10.2 0.18 23.24 IP-CLIENT
hostname:80 POST /php/toolbar_ajax.php HTTP/1.1
40-3 12970 14/27/5335 W 10.37 0 0 26.7 0.05 18.44 IP-CLIENT
hostname:80 GET /php/r_fin_new3_std.php HTTP/1.1

Every odd line has this legend:

 __________________________________________________________________

Srv  Child Server number - generation
PID  OS process ID
Acc  Number of accesses this connection / this child / this slot
 M   Mode of operation
CPU  CPU usage, number of seconds
SS   Seconds since beginning of most recent request
Req  Milliseconds required to process most recent request
Conn  Kilobytes transferred this connection
Child Megabytes transferred this child
Slot  Total megabytes transferred this slot
 __________________________________________________________________

Every even line contains the URL requested by the client. I need to match every line where "Mode of operation" is "W" (Sending Reply) and "SS" (Seconds since beginning of most recent request) is greater than 10. After matching these lines I need to print the line and the line after it. In this case I would need to print:

40-3 12970 14/27/5335 W 10.37 0 0 26.7 0.05 18.44 IP-CLIENT
hostname:80 GET /php/r_fin_new3_std.php HTTP/1.1

First line, column 4 (Mode of operation) is "W" = TRUE

First line, column 5 (Seconds since beginning of most recent request) is 10.37 > 10 = TRUE

Then print the first line and the next one, which gives me the asked URL.

I have the server-status output appended to a logfile every 5 minutes. If I use this command I get all "Sending Reply" entries and the line after each, but I cannot filter by those greater than 10:

# grep " W " -A 1 /var/log/server-status.log

Any idea?

Thanks,

simon
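A minimal sketch of one way to do this in awk, applying the threshold to column 5 as in the worked example above. Note that by the legend, column 5 is CPU and SS would be column 6, so the field number may need adjusting. The function name slow_replies is hypothetical.

```shell
# slow_replies: read server-status lines on stdin; for every line with
# "W" in column 4 and a value above 10 in column 5, print that line
# and the line that follows it.
slow_replies() {
    awk '
        # The previous line matched, so print this one (the URL) and reset
        show { print; show = 0; next }
        $4 == "W" && $5 + 0 > 10 { print; show = 1 }
    '
}

# Usage (log path as in the question):
#   slow_replies < /var/log/server-status.log
```

The flag variable avoids getline, so a matching entry at the very end of the file is still printed even when no URL line follows it.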
