Monitoring

KDE System Guard

Our Monitoring Implementation

We use StatusCake to monitor the uptime of our websites. In our current configuration, StatusCake simply requests each of our URLs and sends an alert when it gets back an HTTP status code that indicates an error.

We've configured StatusCake to send emails to the project owners, as well as Slack messages to appropriate, project-specific channels.

There's definitely room for improvement here, but StatusCake is a good starting place. Here are some things that we might consider when improving on this foundation:

  • StatusCake is a black-box solution: it has no visibility into the internals of our program and only gives us data on how our website looks to users. It would be nice to have a monitoring solution that combines black-box reporting with logs and stack traces.
  • StatusCake, as it's configured right now, only checks HTTP status codes. However, it's possible for our web server to return a successful status code along with an empty or incorrect page. Ideally, we would test the page content, in addition to the status code, to catch this scenario.
  • We're not collecting our logs or runtime metrics in any meaningful way. Collecting them should be a crucial next step, since they would aid disaster response.
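
To make the content-checking idea concrete, here is a minimal sketch in Python of a check that looks at the response body as well as the status code. It is not how StatusCake works internally, and it is not part of our current setup. The function names, the `expected_text` parameter, and the 10-second timeout are all assumptions chosen for illustration.

```python
import urllib.error
import urllib.request
from typing import Tuple


def evaluate_response(status: int, body: str, expected_text: str) -> Tuple[bool, str]:
    """Decide whether a response looks healthy from its status code and body."""
    if not 200 <= status < 300:
        return False, f"unexpected status {status}"
    if not body.strip():
        # A 2xx status with nothing in the body is the "empty page" case.
        return False, "empty body"
    if expected_text not in body:
        # A 2xx status with the wrong page is the "incorrect page" case.
        return False, f"expected text {expected_text!r} not found"
    return True, "ok"


def check_url(url: str, expected_text: str, timeout: float = 10.0) -> Tuple[bool, str]:
    """Fetch url and evaluate it; any network failure counts as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return evaluate_response(response.status, body, expected_text)
    except urllib.error.HTTPError as exc:
        # urlopen raises HTTPError for 4xx and 5xx responses.
        return False, f"unexpected status {exc.code}"
    except OSError as exc:
        # Covers URLError (DNS failure, refused connection) and timeouts.
        return False, f"request failed: {exc}"
```

A scheduled job could call something like `check_url("https://example.com/", "Example Domain")` (a placeholder URL and marker string) and raise an alert whenever the first element of the result is `False`. Keeping `evaluate_response` separate from the network call makes the decision logic easy to test without touching a real server.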