Cloud monitoring helps ensure that applications hosted on private, public, or hybrid cloud infrastructures remain available and perform well. The data collected and evaluated covers a variety of services related to:
- AWS, Azure, and Google Cloud
- Cloud-hosted websites
- Virtual Machine Instances
- IT Infrastructure
Because cloud-based environments rely on a complicated set of resources, readily identifying the availability and performance issues that most affect business services is challenging. IT teams need to monitor application health holistically, together with the accompanying cloud infrastructure components.
Cloud Monitoring Design
The following outline lists items to consider when implementing a cloud monitoring system:
What should you monitor?
- Monitor Cloud Resource Utilization - virtualization and storage bottlenecks
- Monitor Application Performance - application slowdowns
- Monitor End User Experience - page load time and availability
- Monitor Virtual Networks - resource utilization and network latency
- Monitor Cloud-Hosted Log Files - errors and audit detail
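As a rough illustration of the log-file item above, the following Python sketch counts how many log lines carry each severity level. The log line format, the sample lines, and the function name count_log_levels are assumptions made for this example; real cloud logs vary by provider and service.

```python
import re
from collections import Counter

# Assumed (hypothetical) log line format: "<timestamp> <LEVEL> <message>"
LEVEL_PATTERN = re.compile(r"\b(DEBUG|INFO|WARN|WARNING|ERROR|CRITICAL)\b")


def count_log_levels(lines):
    """Count how many lines carry each severity level."""
    counts = Counter()
    for line in lines:
        match = LEVEL_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample data, for illustration only.
sample_lines = [
    "2024-01-01T00:00:00Z INFO service started",
    "2024-01-01T00:00:05Z ERROR database connection refused",
    "2024-01-01T00:00:06Z WARN retrying connection",
    "2024-01-01T00:00:09Z ERROR database connection refused",
]

summary = count_log_levels(sample_lines)
print(summary["ERROR"], summary["WARN"])
```

A count like this could feed an error-rate KPI, while the matching lines themselves could be kept for audit detail.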
What constitutes a problem?
- KPIs that exceed threshold values
- Alarms generated by cloud infrastructure
- Poor application performance
- Inability to access services
- Problems as identified by a built-in knowledge base
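The first and fourth items in this list can be sketched as a simple threshold check. This is a minimal Python sketch, not a reference implementation: the metric names, the threshold values, and the choice to treat a missing reading as a possible access problem are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str
    warning: float
    critical: float


def evaluate(metrics, thresholds):
    """Return (metric, severity, value) for each KPI that exceeds a threshold."""
    problems = []
    for t in thresholds:
        value = metrics.get(t.metric)
        if value is None:
            # Assumption: a missing reading may mean the service is unreachable.
            problems.append((t.metric, "unknown", None))
        elif value > t.critical:
            problems.append((t.metric, "critical", value))
        elif value > t.warning:
            problems.append((t.metric, "warning", value))
    return problems


# Hypothetical KPIs and limits; tune these to the workload being monitored.
thresholds = [
    Threshold("cpu_percent", warning=75.0, critical=90.0),
    Threshold("page_load_seconds", warning=2.0, critical=5.0),
    Threshold("disk_percent", warning=80.0, critical=95.0),
]
metrics = {"cpu_percent": 93.5, "page_load_seconds": 2.4}

problems = evaluate(metrics, thresholds)
for metric, severity, value in problems:
    print(metric, severity, value)
```

Separating warning and critical levels gives the alerting step something to prioritize on, instead of treating every breach the same way.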
What should you do when a problem is identified?
- For recurring problems, build detailed repair notes into the alert to speed repair
- Prioritize and escalate high-severity alerts with text messages or email alerts
- Automate an OS command or script to fix the problem if possible
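The three responses above can be combined in one alert handler. The Python sketch below is a hedged illustration: the RUNBOOK table, the problem names, the handle_alert function, and the notify callback are hypothetical, and the fix command is a harmless stand-in that only prints a message so the example runs anywhere.

```python
import subprocess
import sys

# Hypothetical runbook: repair notes and an optional fix command per problem.
RUNBOOK = {
    "disk_full": {
        "notes": "Rotate logs, then clear the temp directory.",
        # Stand-in command for illustration; replace with a real fix script.
        "command": [sys.executable, "-c", "print('cleanup done')"],
    },
    "service_down": {
        "notes": "Check recent deployments before restarting.",
        "command": None,
    },
}


def handle_alert(problem, severity, notify):
    """Attach repair notes, escalate if severe, and try an automated fix."""
    entry = RUNBOOK.get(
        problem, {"notes": "No repair notes recorded.", "command": None}
    )
    alert = {
        "problem": problem,
        "severity": severity,
        "notes": entry["notes"],
        "fix_output": None,
    }
    if severity == "high":
        # notify stands in for a text message or email gateway.
        notify(f"[{severity.upper()}] {problem}: {entry['notes']}")
    if entry["command"]:
        result = subprocess.run(
            entry["command"], capture_output=True, text=True, timeout=30
        )
        if result.returncode == 0:
            alert["fix_output"] = result.stdout.strip()
    return alert


sent = []  # collects notifications in place of a real messaging service
alert = handle_alert("disk_full", "high", sent.append)
print(alert["notes"], alert["fix_output"], sent)
```

Passing notify in as a parameter keeps the sketch independent of any particular messaging service; in practice it would wrap whatever SMS or email integration is already in use.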
What are the benefits of monitoring and tracking?
- Proactively troubleshoot performance and availability problems before they reach end users
- Improve end-user performance
- Expose problem areas between on-premises and cloud infrastructure
- Right-size the cloud infrastructure to cost-effectively support application workloads