Module 1: Introduction to Google Cloud Monitoring Tools
- Understand the purpose and capabilities of Google Cloud operations-focused components: Logging, Monitoring, Error Reporting, and Service Monitoring.
- Understand the purpose and capabilities of Google Cloud application performance management (APM) components: Debugger, Trace, and Profiler.
Module 2: Avoiding Customer Pain
- Build a monitoring foundation on the four golden signals: latency, traffic, errors, and saturation.
- Measure customer pain with SLIs.
- Define critical performance measures.
- Create and use SLOs and SLAs.
- Achieve harmony between development and operations teams with error budgets.
Module 3: Alerting Policies
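The relationship between an SLI, an SLO, and an error budget in this module can be illustrated with a little arithmetic. The following is a minimal sketch, assuming a request-based availability SLI; the class name `ErrorBudget`, its fields, and the sample numbers are hypothetical and are not taken from the course material.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    """Hypothetical helper: request-based availability SLI vs. an SLO target."""

    slo_target: float      # e.g. 0.999 means 99.9% of requests should succeed
    total_requests: int    # requests observed in the SLO window
    failed_requests: int   # requests that counted as errors in that window

    @property
    def sli(self) -> float:
        """Measured availability: fraction of requests that succeeded."""
        if self.total_requests == 0:
            return 1.0  # no traffic, so nothing has failed
        good = self.total_requests - self.failed_requests
        return good / self.total_requests

    @property
    def allowed_failures(self) -> float:
        """Number of failures the SLO tolerates in this window (the budget)."""
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def budget_remaining(self) -> float:
        """Fraction of the error budget still unspent; negative if overspent."""
        allowed = self.allowed_failures
        if allowed == 0:
            # A 100% target (or no traffic) leaves no budget to spend.
            return 0.0 if self.failed_requests == 0 else float("-inf")
        return 1.0 - self.failed_requests / allowed


if __name__ == "__main__":
    budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
    print(f"SLI: {budget.sli:.5f}")
    print(f"Allowed failures: {budget.allowed_failures:.0f}")
    print(f"Budget remaining: {budget.budget_remaining:.0%}")
```

In this sketch, a team with budget remaining can keep shipping changes, while a negative value signals that reliability work should take priority. Real policies typically evaluate the budget over a rolling window rather than a single snapshot.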
- Develop alerting strategies.
- Define alerting policies.
- Add notification channels.
- Identify types of alerts and common uses for each.
- Construct and alert on resource groups.
- Manage alerting policies programmatically.
Module 4: Monitoring Critical Systems
- Choose best practice monitoring project architectures.
- Differentiate Cloud IAM roles for monitoring.
- Use the default dashboards appropriately.
- Build custom dashboards to show resource consumption and application load.
- Define uptime checks to track liveness and latency.
Module 5: Configuring Google Cloud Services for Observability
- Integrate logging and monitoring agents into Compute Engine VMs and images.
- Enable and use Kubernetes Monitoring.
- Extend and clarify Kubernetes monitoring with Prometheus.
- Expose custom metrics through code and with the help of OpenCensus.
Module 6: Advanced Logging and Analysis
- Identify and choose among resource tagging approaches.
- Define log sinks (inclusion filters) and exclusion filters.
- Create metrics based on logs.
- Define custom metrics.
- Use Error Reporting to link application errors to Logging.
- Export logs to BigQuery.
Module 7: Monitoring Network Security and Audit Logs
- Collect and analyze VPC Flow logs and Firewall Rules logs.
- Enable and monitor Packet Mirroring.
- Explain the capabilities of Network Intelligence Center.
- Use Admin Activity audit logs to track changes to the configuration or metadata of resources.
- Use Data Access audit logs to track accesses or changes to user-provided resource data.
- Use System Event audit logs to track Google Cloud administrative actions.
Module 8: Managing Incidents
- Define incident management roles and communication channels.
- Mitigate incident impact.
- Troubleshoot root causes.
- Resolve incidents.
- Document incidents in a post-mortem process.
Module 10: Optimizing Stackdriver Costs
- Understand Stackdriver billing.
- Analyze Stackdriver resource utilization.
- Implement best practices for Stackdriver cost control.