Monitoring Logs with Timestamps
Monitor log files with timestamps similar to 'Aug 21 15:21:58 ...'
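Timestamps in this style carry no year, so any time-frame filter has to assume one. The sketch below is not how logxray works internally; it only illustrates one way to turn the leading 15 characters of such a line into a full date. The function name parse_syslog_timestamp and the rule "a timestamp in the future belongs to the previous year" are assumptions made for illustration.

```python
from datetime import datetime

def parse_syslog_timestamp(line, now):
    """Parse a leading 'Mon DD HH:MM:SS' stamp, borrowing the year from `now`."""
    # The first 15 characters hold the timestamp, e.g. 'Aug 21 15:21:58'.
    stamp = datetime.strptime(f"{now.year} {line[:15]}", "%Y %b %d %H:%M:%S")
    # Assumption: a stamp later than `now` was written in the previous year.
    if stamp > now:
        stamp = stamp.replace(year=now.year - 1)
    return stamp

def within_last_minutes(line, minutes, now):
    """Return True if the line's timestamp is inside the requested window."""
    age_seconds = (now - parse_syslog_timestamp(line, now)).total_seconds()
    return 0 <= age_seconds <= minutes * 60
```

Passing `now` explicitly keeps the check repeatable; a real monitor would normally pass datetime.now().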
Case Scenario: Within the last 30 minutes, find out how many lines in the log file [ /var/log/app.log ] contain both "ERROR" and "Client" (ERROR.*Client). From the list of lines found, remove any that also contain the keywords "error 404" OR "updateNumber", then show what is left. If the number of lines left is between 5 and 9, alert as WARNING. If it is 10 or more, alert as CRITICAL. If it is below 5, do not alert!
Command: logxray autofig /var/log/app.log 30m 'ERROR.*Client' '(error 404|updateNumber)' 5 10 -showexcl
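The decision logic in this scenario can be sketched in a few lines of Python. This is an illustration of the include / exclude / threshold steps only, not logxray's actual implementation; the function name classify_matches is hypothetical, and the lines passed in are assumed to have already been limited to the last 30 minutes.

```python
import re

def classify_matches(lines, include, exclude, warn, crit):
    """Keep lines matching `include` but not `exclude`, then apply thresholds."""
    include_re = re.compile(include)
    exclude_re = re.compile(exclude)
    kept = [ln for ln in lines
            if include_re.search(ln) and not exclude_re.search(ln)]
    count = len(kept)
    if count >= crit:
        status = "CRITICAL"   # 10 or more in the scenario above
    elif count >= warn:
        status = "WARNING"    # 5 to 9 in the scenario above
    else:
        status = "OK"         # below 5: do not alert
    return status, kept
```

Called with the patterns from the command above, classify_matches(lines, r"ERROR.*Client", r"(error 404|updateNumber)", 5, 10) returns the status together with the lines that survived the exclusion.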
Case Scenario: Within the last 30 minutes, if logxray does not find at least 2 lines in the log file that contain "SUCCESS" and "Client" together with "returned 200" OR "update:OK", it must alert. In other words, each line counted MUST contain both SUCCESS and Client (SUCCESS.*Client) AND one or both of the strings "returned 200" and "update:OK".
Command: logxray autofig /var/log/app.log 30 'SUCCESS.*Client' '(returned 200|update:OK)' 2 2 -notfoundn
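The "alert when not found" condition is the inverse of the usual threshold check. A minimal sketch, assuming the lines have already been limited to the last 30 minutes; the function name expected_entries_missing is hypothetical and this is not the tool's own code.

```python
import re

def expected_entries_missing(lines, first, second, minimum):
    """Return (alert, found): alert is True when fewer than `minimum`
    lines match BOTH patterns."""
    first_re = re.compile(first)
    second_re = re.compile(second)
    found = sum(1 for ln in lines
                if first_re.search(ln) and second_re.search(ln))
    return found < minimum, found
```

With the patterns from the command above, the call would be expected_entries_missing(lines, r"SUCCESS.*Client", r"(returned 200|update:OK)", 2).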
The -show option is particularly helpful in cases where you want to see the actual lines that contain the patterns you instructed the tool to search for.
Example: logxray autofig /var/log/app.log 30 'ERROR.*Client' '(error 404|updateNumber:OK)' 5 10 -show
Example: logxray autofig /var/log/app.log 30 'SUCCESS.*Client' '(returned 200|update:OK)' 5 10 -show
For instance, to pull out 2 days of information from within a large log file and to find out how many lines contain certain strings and patterns, you can run a command similar to this:
Example: logxray autofig /var/log/app.log 2d 'ERROR|error|panic|fail' '.' 5 10 -foundn
In this specific example, I'm telling logxray that I care about EVERY single line that contains any of the keywords I provided. The [ 2d ] of course means 2 days. See below for the different ways of specifying a preferred time frame:
- 5m = 5 minutes (changeable to any number of minutes)
- 10h = 10 hours (changeable to any number of hours)
- 2d = 2 days (changeable to any number of days)
- 2w = 2 weeks (changeable to any number of weeks)
- 3mo = 3 months (changeable to any number of months)
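To make the time-frame notation concrete, here is a small sketch that converts these specifications into minutes. It is an illustration only: the function name timeframe_to_minutes is hypothetical, a bare number is treated as minutes (as in the commands that pass 30 or 1440), and a month is assumed to be 30 days, which may not match how logxray counts months.

```python
import re

# Minutes per unit. ASSUMPTION: one month is counted as 30 days.
_UNIT_MINUTES = {"": 1, "m": 1, "h": 60, "d": 1440, "w": 10080, "mo": 43200}

def timeframe_to_minutes(spec):
    """Convert a spec such as '30', '5m', '10h', '2d', '2w' or '3mo' to minutes."""
    # 'mo' is listed before 'm' so that months are not read as minutes.
    match = re.fullmatch(r"(\d+)(mo|m|h|d|w)?", spec)
    if match is None:
        raise ValueError(f"unrecognised time frame: {spec!r}")
    number, unit = match.groups()
    return int(number) * _UNIT_MINUTES[unit or ""]
```

For example, timeframe_to_minutes("2d") gives 2880, the same window that would be written as 2880 in the minutes-only syntax.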
Syntax: ./logxray autofig (logfile) (timeframe-in-minutes) '(string1)' '(string2)' (warn) (critical) (-foundn)

Basic Usage:
[root@monitor jbowman]# logxray autofig /var/log/messages 1440 'ntpd' 'stratum' 5 10 -foundn
2---240---108---ATWFILF---(Apr/13)-(03:35)---(Apr/14)-(03:35:23)

So now let's break this down:
- logxray is the tool name.
- autofig is an option that is passed to logxray to tell it what to do. In this particular case, autofig instructs logxray to "automatically figure out" what type of log file /var/log/messages is and, if the format of the log file is supported, perform the remaining functions.
- /var/log/messages is the log file.
- 1440 is the number of previous minutes of the log file that you want to search. 1440 = last 24 hours.
- 'ntpd' is one of the strings in the log lines that you're interested in.
- 'stratum' is another string that you expect to find on the same line as 'ntpd'. Specifying these two strings (ntpd and stratum) isolates and processes the lines you want a lot quicker, particularly if you're dealing with a huge log file.
- 5 specifies Warning. By specifying 5, you're telling the program to alert as WARNING if there are at least 5 occurrences of the search strings you specified in the log file within the last 1440 minutes.
- 10 specifies Critical. By specifying 10, you're telling the program to alert as CRITICAL if there are at least 10 occurrences of the search strings you specified in the log file within the last 1440 minutes.
- -foundn specifies what type of response you'll get. By specifying -foundn, you're saying that anything found that matches the specified strings within the 1440 minute time frame should be regarded as a problem and reported.

Summarized Explanation: As you can see, the logxray tool is monitoring a log file. The arguments that are passed to the tool instruct it to do the following:
- Within the last 1440 minutes (24 hours), if the tool finds fewer than 5 occurrences of the specified strings in the log file, DO NOT alert.
- If the tool finds between 5 and 9 occurrences of the specified strings in the log, it'll alert with a WARNING.
- If the tool discovers 10 or more instances of the strings in the log within the last 1440 minutes, it'll alert with a CRITICAL.

Now, let us look at the result of the command:
2---240---108---ATWFILF---(Apr/13)-(03:35)---(Apr/14)-(03:35:23)
There are 6 columns, which are separated by 3 hyphens (---). The first column shows the exit code of the command you just ran. 0 means all is well. 1 means WARNING, which means logxray discovered conditions that fell under the WARNING specification you provided. 2 means CRITICAL, which means the worst case scenario has been reached.

In this particular example, here's what the output is telling us: You requested to have the /var/log/messages file scanned as far back as 24 hours ago (1440 minutes). The time frame that was scanned was from [ April 13, 03:35 ] to [ April 14, 03:35 ]. After scanning through the records that were written to the log in that time frame, logxray found 108 lines that contained both strings "ntpd" and "stratum". Also, as an FYI, the last date and time those specific strings were found in the log file was 240 seconds ago.
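If you want to consume that result line from a script, splitting on the three-hyphen separator is enough. The sketch below is an assumption-laden illustration, not part of the tool: the field names are invented here from the description above, and the fourth column (ATWFILF in the sample) is not explained in this text, so it is kept as an opaque string.

```python
from typing import NamedTuple

class ScanResult(NamedTuple):
    exit_code: int                  # 0 = OK, 1 = WARNING, 2 = CRITICAL
    seconds_since_last_match: int   # 240 in the sample output
    match_count: int                # 108 in the sample output
    tag: str                        # meaning not documented here; kept as-is
    window_start: str               # '(Apr/13)-(03:35)' in the sample
    window_end: str                 # '(Apr/14)-(03:35:23)' in the sample

STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL"}

def parse_result(line):
    """Split one result line into its six '---'-separated columns."""
    fields = line.strip().split("---")
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}")
    code, age, count, tag, start, end = fields
    return ScanResult(int(code), int(age), int(count), tag, start, end)
```

A caller could then map result.exit_code through STATUS_NAMES to decide whether to raise an alert.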
Other common log monitoring scenarios
- Show only the total count of each pattern found in log
- Apache/HTTP Log Monitoring - Frequency of status codes
- Expected Entries - Alert when not found in monitored log
- Pattern Exclusions - Specify a list of patterns to exclude
- Log Exclusions - Specify logs to exclude from monitoring
- Dynamic Logs - Monitoring dynamically named Log Files
- Tail Log files using Time Frames - Get precise log data
- Graph various log file metrics - Trend historical log data
- Hot Spot - Identify times with unusually high errors
- Alert based on values in specific columns in log entries
- Email Alerts - Configure log monitoring through Crontab
- Nagios Alerts - Configure log monitoring through Nagios
- Zabbix Alerts - Configure log monitoring through Zabbix
- Zenoss Alerts - Configure log monitoring through Zenoss
Log File Content
Scan content of log files for new occurrences (or lack thereof) of specific keywords, strings or patterns.
Log File Size
Monitor the sizes of single or multiple log files - alert if log size breaches predefined thresholds.
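As a rough illustration of a size check, the following sketch compares a file's size against two thresholds. The function name size_status and the byte-based thresholds are assumptions for this example and do not describe the tool's own options.

```python
import os

def size_status(path, warn_bytes, crit_bytes):
    """Return (status, size) for a log file, given warning and critical sizes."""
    size = os.path.getsize(path)
    if size >= crit_bytes:
        return "CRITICAL", size
    if size >= warn_bytes:
        return "WARNING", size
    return "OK", size
```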
Log File Growth
Monitor the growth of single or multiple log files - alert when the monitored logs stop receiving new data.
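Detecting a log that has stopped growing requires remembering its size from the previous check. A minimal sketch, assuming the caller stores that previous size between runs; the function name has_stalled is hypothetical, and a smaller size is treated as a rotation rather than a stall.

```python
import os

def has_stalled(path, previous_size):
    """Return (stalled, current_size).

    stalled is True only when the size is unchanged since the last check.
    A smaller size is assumed to mean the log was rotated or truncated.
    """
    current_size = os.path.getsize(path)
    return current_size == previous_size, current_size
```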
Log File Timestamp
Monitor the timestamp of single or multiple logs. Alert if logs are older than X minutes or hours.
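A file-age check of this kind can be sketched with the file's modification time. The function names age_minutes and is_stale are hypothetical, and the optional now argument exists only to make the example repeatable.

```python
import os
import time

def age_minutes(path, now=None):
    """Minutes elapsed since the file was last modified."""
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) / 60

def is_stale(path, max_age_minutes, now=None):
    """True when the file has not been modified within the allowed age."""
    return age_minutes(path, now) > max_age_minutes
```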