Search multiple log files at once - Alert / Show detected entries from each log

Monitoring multiple logs (on Linux, AIX, and SunOS systems)

Example 1 - (this shows the matching entries found in each log under /var/log):

./logrobot /var/log 'error_P_panic_P_fail_P_fault' -ndshow

Example 2 - (this shows the total count of each matching entry in each log found)

./logrobot /var/log 'error_P_panic_P_fail_P_fault' -ndfoundmul

NOTE:

The '_P_' token stands for the pipe "|" (OR) symbol. When this tool is used as a log monitoring alert system, specifying "_P_" instead of "|" prevents unnecessary errors.

The default log file age limit is 60 minutes. That means the above commands will only scan log files that were updated or created within the last 60 minutes.

To change the age limit, see the full syntax example below and replace the 60m with whichever age you prefer.
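
To preview which files fall inside the 60-minute window before running a scan, a plain find call can serve as a rough check. This is only a sketch: recent_logs is a hypothetical helper name, it assumes a find that supports -mmin (GNU and BSD find do, some older SunOS or AIX builds may not), and it recurses into subdirectories, which logrobot may or may not do.

```shell
#!/bin/sh
# Hypothetical helper, not part of logrobot.
# Lists regular files under a directory that were modified within the
# last N minutes (default 60), using find's -mmin test.
recent_logs() {
    # $1 = directory to inspect, $2 = age limit in minutes
    find "$1" -type f -mmin "-${2:-60}"
}

# Example call (left commented out so the sketch has no side effects):
# recent_logs /var/log 60
```

If a log you expected is missing from the list, it has probably not been written to within the window, and a larger age value may be needed.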

If no entries match the patterns you specified, but you believe some should, add a ".*" to the beginning and end of each pattern, i.e.:

'.*error.*_P_.*panic.*_P_.*fail.*_P_.*fault.*'
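
To sanity-check a pattern outside of logrobot, the "_P_" tokens can be expanded back into "|" and tried with grep -E. The sketch below assumes that logrobot treats the pattern as an extended regular expression once "_P_" is expanded. The helper names expand_pattern and preview_matches are hypothetical, and logrobot's case sensitivity may differ from grep's.

```shell
#!/bin/sh
# Hypothetical helpers, not part of logrobot.

# Expand logrobot-style "_P_" separators into the "|" alternation symbol.
expand_pattern() {
    printf '%s\n' "$1" | sed 's/_P_/|/g'
}

# Show the lines of a file that the expanded pattern matches under grep -E.
preview_matches() {
    # $1 = pattern written with _P_, $2 = file to search
    grep -E -- "$(expand_pattern "$1")" "$2"
}

# Example call (commented out):
# preview_matches 'error_P_panic_P_fail_P_fault' /var/log/messages
```

If grep finds lines that logrobot does not report, the difference is more likely due to the age limit than to the pattern.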

Example 1 - (this shows the matching entries found in each log):

Command:

	./logrobot localhost /var/tmp/logXray,tail=10 autonda /usr/WebSphere/AppServer_ast_/profiles/paposa_ast_AppServer_ast_/logs/rmcosCluster1-paposa_ast_-node_ast_-server_ast_/SystemOut.log 60m 'Total.*time.*taken' '.' 1 1 testing3 -ndshow

CRITICAL: [/usr/WebSphere/AppServer_ast_/profiles/paposa_ast_AppServer_ast_/logs/rmcosCluster1-paposa_ast_-node_ast_-server_ast_/SystemOut.log][4]
/usr/WebSphere/AppServer2/profiles/paposa01AppServer02/logs/rmcosCluster1-paposa01-node2-server1/SystemOut.log:P=(2)_F=(13s,1s)_R=(39232,39253=21)
/usr/WebSphere/AppServer1/profiles/paposa01AppServer01/logs/rmcosCluster1-paposa01-node1-server2/SystemOut.log:P=(2)_F=(13s,6s)_R=(75789,75811=22)
/usr/WebSphere/AppServer2/profiles/paposa01AppServer02/logs/rmcosCluster1-paposa01-node2-server2/SystemOut.log:P=(2)_F=(13s,0s)_R=(105911,105932=21)

usr_WebSphere_AppServer2_profiles_paposa01AppServer02_logs_rmcosCluster1-paposa01-node2-server2_SystemOut.log:::
[11/16/16 13:48:41:722 PST] 000004e3 SystemOut O TOK : Total time taken to De-Tokenize a number is [12] ms.
[11/16/16 13:48:53:265 PST] 000004b6 SystemOut O TOK : Total time taken to De-Tokenize a number is [15] ms. 2

usr_WebSphere_AppServer2_profiles_paposa01AppServer02_logs_rmcosCluster1-paposa01-node2-server1_SystemOut.log:::
[11/16/16 13:48:43:915 PST] 000004f6 SystemOut O TOK : Total time taken to De-Tokenize a number is [17] ms.
[11/16/16 13:48:52:317 PST] 000004f6 SystemOut O TOK : Total time taken to De-Tokenize a number is [17] ms. 2

usr_WebSphere_AppServer1_profiles_paposa01AppServer01_logs_rmcosCluster1-paposa01-node1-server2_SystemOut.log:::
[11/16/16 13:48:45:693 PST] 000002e3 SystemOut O TOK : Total time taken to De-Tokenize a number is [14] ms.
[11/16/16 13:48:47:873 PST] 000002b2 SystemOut O TOK : Total time taken to De-Tokenize a number is [26] ms. 2

usr_WebSphere_AppServer1_profiles_paposa01AppServer01_logs_rmcosCluster1-paposa01-node1-server1_SystemOut.log::: 0

Example 2 - (this shows the total count of each matching entry in each log)

Command: 

	./logrobot localhost /var/tmp/logXray,tail=10 autonda /usr/WebSphere/AppServer_ast_/profiles/paposa_ast_AppServer_ast_/logs/rmcosCluster1-paposa_ast_-node_ast_-server_ast_/SystemOut.log 60m 'Total.*time.*taken' '.' 1 1 testing3 -ndfoundmul

CRITICAL: [/usr/WebSphere/AppServer_ast_/profiles/paposa_ast_AppServer_ast_/logs/rmcosCluster1-paposa_ast_-node_ast_-server_ast_/SystemOut.log][4] 

/usr/WebSphere/AppServer1/profiles/paposa01AppServer01/logs/rmcosCluster1-paposa01-node1-server2/SystemOut.log:P=(Total__time__taken=8)_F=(25s)_R=(76970,77031=61)
/usr/WebSphere/AppServer2/profiles/paposa01AppServer02/logs/rmcosCluster1-paposa01-node2-server1/SystemOut.log:P=(Total__time__taken=4)_F=(25s)_R=(40355,40503=148)
/usr/WebSphere/AppServer1/profiles/paposa01AppServer01/logs/rmcosCluster1-paposa01-node1-server1/SystemOut.log:P=(Total__time__taken=3)_F=(25s)_R=(23434,23467=33)
/usr/WebSphere/AppServer2/profiles/paposa01AppServer02/logs/rmcosCluster1-paposa01-node2-server2/SystemOut.log:P=(Total__time__taken=9)_F=(25s)_R=(106908,106997=89)

Full syntax:

./logrobot localhost <default-dir> <feature> <logfile> <age> <str-1> <str-2> <WARNING> <CRITICAL> <tag> <option>

Example:

logrobot localhost /var/tmp/logXray autonda /var/log/kern.log 60m 'error' '.' 1 2 app_err_monitor -ndfoundn

Explanation of Parameters:

logrobot - This is the tool that does the work for you 

/var/tmp/logXray - This is the designated default directory where logrobot will process its data

autonda - This is the feature that allows logrobot to perform this particular auto-resolve task for you

/var/log/kern.log - This is the log file which is going to be scanned

To scan a directory, specify the directory path instead, e.g. /var/log

60m - This is the age limit: the log file must have been updated or created within this period for it to be scanned

'error' - This is where you specify the string/pattern to look for in the log

Make sure there are no spaces in the patterns you specify.

For instance, to search for the pattern "error found in data", you can specify it this way:

'error.*found.*in.*data'
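
Converting a phrase by hand is easy to get wrong, so the substitution can be scripted. This is a minimal sketch, with phrase_to_pattern as a hypothetical helper name. It replaces each run of whitespace with ".*" and does not escape any other regular-expression characters in the phrase.

```shell
#!/bin/sh
# Hypothetical helper, not part of logrobot.
# Replace every run of whitespace in a phrase with ".*" so the result
# contains no spaces.
phrase_to_pattern() {
    printf '%s\n' "$1" | sed 's/[[:space:]][[:space:]]*/.*/g'
}

phrase_to_pattern 'error found in data'
# prints: error.*found.*in.*data
```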

'.' - This is where you specify an additional pattern you wish to look for on the same line as the previous string

Useful if you want to filter out specific log entries

1 - This is the WARNING threshold: the number of matching entries that must be found in the log before a WARNING alert is generated.

2 - This is the CRITICAL threshold: the number of matching entries that must be found in the log before a CRITICAL alert is generated.
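
The way the two thresholds interact can be sketched as follows. This is an illustration, not logrobot's actual code. It assumes a "greater than or equal" comparison, which agrees with the sample output in this document (thresholds 1 and 2 with 25 matches report CRITICAL, and 0 matches report OK), and classify_count is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical illustration of the threshold logic, not logrobot source.
classify_count() {
    # $1 = number of matching entries
    # $2 = WARNING threshold, $3 = CRITICAL threshold
    if [ "$1" -ge "$3" ]; then
        echo CRITICAL
    elif [ "$1" -ge "$2" ]; then
        echo WARNING
    else
        echo OK
    fi
}

classify_count 25 1 2   # prints: CRITICAL
classify_count 1 1 2    # prints: WARNING
classify_count 0 1 2    # prints: OK
```

Setting both thresholds to the same value, as in the "1 1" examples earlier, skips the WARNING state under this assumption.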

app_err_monitor - This is the tag name given to this particular log check

The name should describe the application, database, or function that is writing to the log, so choose a descriptive name

-ndshow - When entries are found in the log, this option will show you those entries

-ndfoundn - When entries are found in the log, this option will NOT show them - It will tell you the total count of the newest entries found matching your criteria
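
Because every parameter is positional, it can help to name each value before assembling the command. The sketch below is a dry run that only prints the command in the documented order. The variable names and the logrobot_cmd function are hypothetical, and the values are the illustrative ones from the parameter list above.

```shell
#!/bin/sh
# Hypothetical wrapper: names each positional parameter, then prints the
# assembled command instead of executing it.
workdir=/var/tmp/logXray     # default directory where logrobot keeps its data
feature=autonda              # feature name
logfile=/var/log/kern.log    # log file (or directory) to scan
age=60m                      # age limit
str1='error'                 # first pattern
str2='.'                     # second pattern on the same line
warn=1                       # WARNING threshold
crit=2                       # CRITICAL threshold
tag=app_err_monitor          # tag name for this check
option=-ndfoundn             # output option

logrobot_cmd() {
    printf "./logrobot localhost %s %s %s %s '%s' '%s' %s %s %s %s\n" \
        "$workdir" "$feature" "$logfile" "$age" "$str1" "$str2" \
        "$warn" "$crit" "$tag" "$option"
}

logrobot_cmd
```

Printing first makes it easy to confirm the order of the arguments before the command is run for real.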

Sample runs:

[root@localhost jserver]# time ./logrobot localhost /var/tmp/logXray autonda /var/log 60m 'error' '.' 1 2 appmon -ndfoundn
CRITICAL: [/var/log] maillog:P=(25)_F=(107s)_R=(0,281=281) up2date:P=(5)_F=(51s)_R=(0,73=73), Xorg.0.log:P=(1)_F=(197s)_R=(0,659=659) 

real 0m1.571s
user 0m0.694s
sys 0m0.637s

[root@localhost jserver]# time ./logrobot localhost /var/tmp/logXray autonda /var/log 60m 'error' '.' 1 2 appmon -ndfoundn
OK: [/var/log] up2date:P=(0)_F=(5s)_R=(73,73=0) boot.log:P=(0)_F=(5s)_R=(58,58=0) cron:P=(0)_F=(5s)_R=(214,214=0) messages:P=(0)_F=(5s)_R=(643,643=0) dmesg:P=(0)_F=(5s)_R=(502,502=0) Xorg.0.log:P=(0)_F=(5s)_R=(659,659=0) maillog:P=(0)_F=(5s)_R=(281,281=0) pm-powersave.log:P=(0)_F=(5s)_R=(2,2=0) secure:P=(0)_F=(5s)_R=(13,13=0)

real 0m1.604s
user 0m0.674s
sys 0m0.634s

[root@localhost jserver]# time ./logrobot localhost /var/tmp/logXray autonda /var/log/messages 60m 'error' '.' 1 2 appmsg -ndfoundn
OK: [/var/log/messages] /var/log/messages:P=(0)_F=(383s)_R=(0,643=643) 

real 0m1.331s
user 0m0.734s
sys 0m0.622s
[root@localhost jserver]#
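
Each per-file entry in the output above has the shape name:P=(...)_F=(...)_R=(a,b=c). This document does not define these fields, so the sketch below only splits an entry into its parts and does not interpret them. In every sample shown here, the last number in R equals the difference between the first two. The parse_entry helper is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper, not part of logrobot.
# Split one "name:P=(..)_F=(..)_R=(a,b=c)" entry into:
#   name  P-value  R-first  R-second  R-last
parse_entry() {
    printf '%s\n' "$1" | sed -n \
        's/^\(.*\):P=(\([^)]*\))_F=(\([^)]*\))_R=(\([0-9]*\),\([0-9]*\)=\([0-9]*\))$/\1 \2 \4 \5 \6/p'
}

parse_entry 'maillog:P=(25)_F=(107s)_R=(0,281=281)'
# prints: maillog 25 0 281 281
```

An entry that does not fit this shape produces no output, which makes malformed lines easy to spot.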