
Understand how monitoring takes place

Being familiar with the basic steps of taking a monitoring sample is helpful when viewing raw log files or diagnosing alerts.

Monitoring Interval

Your monitoring service runs at a set monitoring interval, which is specified in the Settings tab for each individual service. The service is scheduled to run once per interval, for example once every five minutes. See Adjust_the_monitoring_interval for more information on setting the monitoring interval.

Services on the same monitoring interval do not necessarily run at the same time. For example, one service might run every five minutes at :01, :06, :11, :16, etc. after the hour, while another also runs every five minutes but at :03, :08, :13, :18, etc. after the hour.
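The staggering described above can be pictured with a short Python sketch. It is an illustration only, not how the service computes its schedule: the function name run_minutes and the idea of a per-service "offset" are assumptions made for this example, and the sketch assumes an interval that divides evenly into 60 minutes.

```python
def run_minutes(interval: int, offset: int) -> list[int]:
    """Return the minutes past the hour at which a service would run.

    interval: monitoring interval in minutes (assumed to divide 60 evenly)
    offset:   hypothetical per-service start minute within the interval
    """
    return list(range(offset % interval, 60, interval))


# Two services on the same five-minute interval, staggered by offset.
print(run_minutes(5, 1)[:4])  # [1, 6, 11, 16]
print(run_minutes(5, 3)[:4])  # [3, 8, 13, 18]
```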

Agents

After determining when the sample should be collected, the service determines which agent the sample should be collected from:

Sequential Monitoring (default)

The service rotates through the list of user-selected agents in a round-robin fashion.

If baseline agents are defined, then the service alternates between the list of baseline agents and the list of additional agents, so approximately half of the samples come from baselines and the other half come from the additional agents. For example, if the following agents are defined:

  • Baseline Agents: San Diego, Los Angeles, Phoenix
  • Additional Agents: San Francisco, Chicago, New York, Boston, Paris, Sydney

Then the order in which the samples are taken might be: San Diego, San Francisco, Los Angeles, Chicago, Phoenix, New York, San Diego, Boston, Los Angeles, Paris, Phoenix, Sydney.
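The alternation between baseline and additional agents can be sketched in Python as follows. This is a minimal illustration rather than the service's actual scheduler: the function name sequential_order is hypothetical, and the sketch assumes a strict one-for-one alternation with each list rotated independently.

```python
from itertools import cycle, islice


def sequential_order(baseline, additional):
    """Yield agents one at a time, alternating baseline and additional.

    If only one of the two lists is populated, this sketch falls back to
    a plain round-robin over that list (an assumption for illustration).
    """
    if not baseline or not additional:
        yield from cycle(baseline or additional)
        return
    base, extra = cycle(baseline), cycle(additional)
    while True:
        yield next(base)   # roughly half the samples come from baselines
        yield next(extra)  # the other half from additional agents


baseline = ["San Diego", "Los Angeles", "Phoenix"]
additional = ["San Francisco", "Chicago", "New York",
              "Boston", "Paris", "Sydney"]

# First twelve samples, one per monitoring interval.
print(list(islice(sequential_order(baseline, additional), 12)))
```

With the agent lists from the example above, the first twelve agents produced match the order listed in the text.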

Simultaneous Monitoring

By default, all services are set to monitor sequentially, taking one sample from one agent at each monitoring interval as described above. However, you can also choose to monitor simultaneously, meaning that several samples from different agents are taken at the same time at each monitoring interval. The number of samples depends on whether you have configured baseline agents.

If you have not configured baseline agents, then a number of agents chosen from the Additional Agents list will perform the transaction simultaneously. The number of agents chosen is equal to the number of strikes you have selected for the service (the default is three). For example, if the monitoring interval is every 10 minutes, and the Additional Agents list contains:

  • Additional Agents: Atlanta, Chicago, New York, San Diego, Seattle

Then the order the samples are taken might be something like the following:

  1. The first set of samples are from Atlanta, Chicago, New York at 13:52.
  2. The second set of samples are from San Diego, Seattle, Atlanta at 14:02.
  3. The third set of samples are from Chicago, New York, San Diego at 14:12.
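The rotation in the example above can be reproduced with a short Python sketch. It is illustrative only: the function name simultaneous_sets is hypothetical, and the sketch assumes the service simply continues round-robin through the Additional Agents list, taking as many agents per interval as there are strikes.

```python
from itertools import cycle, islice


def simultaneous_sets(additional, strikes=3):
    """Yield one list of agents per monitoring interval.

    Each list holds `strikes` agents, taken by continuing a round-robin
    rotation through the additional agents (assumed behavior).
    """
    if not additional:
        return
    rotation = cycle(additional)
    while True:
        yield list(islice(rotation, strikes))


agents = ["Atlanta", "Chicago", "New York", "San Diego", "Seattle"]
for sample_set in islice(simultaneous_sets(agents), 3):
    print(sample_set)
```

With five agents and three strikes, the three sets printed match the three numbered sets in the example.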

If baseline agents have been configured, then the simultaneous samples will be taken from all baseline agents plus one additional agent. For example, if the monitoring interval is every 10 minutes, and the following agents are defined:

  • Baseline Agents: Atlanta, Chicago, New York
  • Additional Agents: San Diego, Seattle, San Francisco

Then the order the samples are taken might be something like the following:

  1. The first set of samples are from Atlanta, Chicago, New York, San Diego at 13:52.
  2. The second set of samples are from Atlanta, Chicago, New York, Seattle at 14:02.
  3. The third set of samples are from Atlanta, Chicago, New York, San Francisco at 14:12.
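The baseline case can be sketched the same way in Python. Again, this is an illustration under stated assumptions, not the service's implementation: the function name baseline_sets is hypothetical, and the sketch assumes the single additional agent is chosen by rotating through the Additional Agents list in order.

```python
from itertools import cycle, islice


def baseline_sets(baseline, additional):
    """Yield one list of agents per monitoring interval.

    Each list contains every baseline agent plus one additional agent,
    with the additional agent rotating round-robin (assumed behavior).
    """
    for extra in cycle(additional):
        yield list(baseline) + [extra]


baseline = ["Atlanta", "Chicago", "New York"]
additional = ["San Diego", "Seattle", "San Francisco"]
for sample_set in islice(baseline_sets(baseline, additional), 3):
    print(sample_set)
```

Each printed set contains the three baseline agents followed by San Diego, Seattle, and San Francisco in turn, matching the example.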

More information about defining baseline and additional agents can be found at Specify_monitoring_agents and Set_up_Baseline_Agents.

Collection and Analysis

The sample is then collected from the selected agent, using the URL, script, or other target configured for your monitoring service.

The results are then analyzed to see if an error or strike occurred for this sample, depending on your service's settings. If there was an error, the service then determines whether to send an alert and to which contacts, usually using the three-strike rule. More information about alerting options and alert contacts can be found at Configure_alerting_options and About_alerts.

Logging

Finally, the results are written to the service's log files. The Dashboard statistics are updated to include this latest sample and the log snapshot on the service's Logs tab is also updated. Please see Interpret_the_Dashboard_statistics and Access_log_files for more information about the Dashboard and viewing log files.

See also

About_alerts
Access_log_files
Adjust_the_monitoring_interval
Configure_alerting_options
Interpret_the_Dashboard_statistics
Set_up_Baseline_Agents
Specify_monitoring_agents