CollectD or StatsD on the machines being monitored are 2 of the better options for getting stats (collectd) and aggregating or transforming them (e.g., use statsd to combine the system and user CPU usage stats from collectd to get a total CPU usage metric that's more useful). Graphite or InfluxDB (with something like Riemann in front of it to gather the metrics and then write to Influx) for metric storage, plus Grafana for the actual graphing, works pretty well strictly for graphing.

What doesn't currently exist in works-out-of-the-box form is a way to get alerting on metrics.

What would alerting on metrics look like? It's one thing if a server disk is at 90% right now, was at 90% yesterday and the day before, and was at 89% last week. It's something else entirely if it was at 50% an hour ago. Instead of just alerting me when a server's disk usage is at 90%, why not alert me when the usage has gone up by more than a certain rate in the past few hours? Having that extra bit of context is extremely useful.

The Etsy folks have something called Kale that's a combo of 2 tools, oculus and skyline.

If you just want graphs in your Nagios alerts, the Etsy folks also have nagios-herald. While it will put Graphite graphs in your Nagios alerts, it doesn't quite accomplish what I described above. You would still be getting alerted if your server disk is at 90%, but you'd be getting a Graphite graph in the alert email for context. What it doesn't do is analysis to determine whether to send the alert in the first place. You'll get alerted at 90%, whether it was at 89% a week ago or 50% an hour ago.

Seyren is another alerting-from-Graphite-metrics tool.

Riemann has the ability to calculate moving averages and derivatives of metrics. I believe that Heka can do something similar. Both Riemann and Heka can be used in many other ways, though.
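To make the rate-based idea concrete, here is a minimal Python sketch of alerting on how fast disk usage has grown rather than on a fixed 90% threshold. It is not taken from any of the tools mentioned here. The function names, the `(timestamp, percent)` sample format, the 3-hour window and the 20-point limit are all assumptions chosen for illustration, and the samples would have to come from whatever metric store you use.

```python
from typing import List, Tuple

# Hypothetical sample format: (unix timestamp in seconds, disk usage in percent).
Sample = Tuple[float, float]


def usage_increase(samples: List[Sample], window_seconds: float) -> float:
    """Return the change in usage (percentage points) over the trailing window.

    The window ends at the newest sample; older samples are ignored.
    """
    if not samples:
        return 0.0
    ordered = sorted(samples)  # sort by timestamp
    latest_ts, latest_value = ordered[-1]
    cutoff = latest_ts - window_seconds
    in_window = [s for s in ordered if s[0] >= cutoff]
    oldest_value = in_window[0][1]
    return latest_value - oldest_value


def should_alert(
    samples: List[Sample],
    window_seconds: float = 3 * 3600,  # assumed: look back 3 hours
    max_increase: float = 20.0,        # assumed: alert above 20 points of growth
) -> bool:
    """Alert only when usage grew by more than max_increase within the window."""
    return usage_increase(samples, window_seconds) > max_increase


# Disk that has sat near 90% for days: no rate-based alert.
steady = [(0, 89.0), (86400, 90.0), (172800, 90.0)]
# Disk that went from 50% to 90% in an hour: alert.
spike = [(0, 50.0), (3600, 90.0)]

print(should_alert(steady))  # False
print(should_alert(spike))   # True
```

A plain 90% threshold would fire in both cases. The rate check only fires for the second one, which is the case where the extra context matters.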