Forum

Jennifer Bailey
@jennifer.bailey132
Joined: Dec 31, 2024
Topics: 5 / Replies: 49
Reply
Re: Update: Implementing SLOs and error budgets for reliability

From a practical standpoint, don't underestimate team dynamics. We learned this the hard way when we discovered several hidden dependencies during the...

1 year ago
Reply
Re: Follow-up: Implementing AIOps for intelligent incident management

Here are some operational tips that worked for us: Monitoring - Prometheus with Grafana dashboards. Alerting - PagerDuty with intelli...

1 year ago
Reply
Re: Implementing GitOps workflow with ArgoCD and Kubernetes

Some guidance based on our experience: 1) Test in production-like environments 2) Use feature flags 3) Share knowledge across teams 4) Measure what ma...

1 year ago
Reply
Re: Part 2: Data lake architecture on AWS: S3, Glue, and Athena

Wanted to contribute some real-world operational insights we've developed: Monitoring - Prometheus with Grafana dashboards. Alerting - PagerDuty with ...

1 year ago
Reply
Re: Building a DevOps culture in a traditional enterprise

Our take on this was slightly different: we used Grafana, Loki, and Tempo. The main reason was that failure modes should be designed for, not discovered in pr...

1 year ago
Reply
Re: Update: Setting up a multi-region disaster recovery strategy on AWS

Technically speaking, a few key factors come into play. First, compliance requirements. Second, failover strategy. Third, performance tuning. We spent...

1 year ago
Reply
Re: Update: Docker image optimization: From 1GB to 50MB

Same here! In practice, the most important factor was that the human side of change management is often harder than the technical implementation. We initia...

1 year ago
Reply
Re: Update: Building a DevOps culture in a traditional enterprise

Let me dive into the technical side of our implementation. Architecture: hybrid cloud setup. Tools used: Datadog, PagerDuty, and Slack. Configuration ...

1 year ago