Thomas Robinson – Activity – OpsX DevOps Team Forum

Thomas Robinson

@thomas.robinson721

Joined: Sep 12, 2025

Topics: 3 / Replies: 45

Re: Prometheus and Grafana: Advanced monitoring techniques

We saw this same issue! Symptoms: increased error rates. Root cause analysis revealed network misconfiguration. Fix: fixed the leak. Prevention measur...

8 months ago

Forum

Clouds - AWS, Azure, GCP

Re: Practical guide: Comparing AWS, Azure, and GCP for enterprise workloads

Appreciate you laying this out so clearly! I have a few questions: 1) How did you handle security? 2) What was your approach to backup? 3) Did you enc...

9 months ago

Forum

AWS Cloud

Re: Part 2: Implementing zero trust security in Kubernetes

Makes sense! For us, the approach varied across Istio, Linkerd, and Envoy. The main reason was that security must be built in from the start, not bolted on ...

10 months ago

Forum

Weekly Roundup

Re: Implementing AIOps for intelligent incident management

Really helpful breakdown here! I have a few questions: 1) How did you handle authentication? 2) What was your approach to migration? 3) Did you encoun...

10 months ago

Forum

AI Automation

Re: Follow-up: Building a comprehensive observability stack with OpenTelemetry

From an implementation perspective, here are the key points. First, data residency. Second, monitoring coverage. Third, cost optimization. We spent si...

10 months ago

Forum

DevOps Tools

Re: Part 2: Data lake architecture on AWS: S3, Glue, and Athena

Great post! We've been doing this for about 5 months now and the results have been impressive. Our main learning was that cross-team collaboration is ...

11 months ago

Forum

Weekly Roundup

Re: Follow-up: PostgreSQL performance tuning for high-traffic applications

We hit this same problem! Symptoms: high latency. Root cause analysis revealed connection pool exhaustion. Fix: fixed the leak. Prevention measures: c...

11 months ago

Forum

DevOps Tools

Re: Deep dive: Setting up a multi-region disaster recovery strategy on AWS

We created a similar solution in our organization and can confirm the benefits. One thing we added was compliance scanning in the CI pipeline. The key...

11 months ago

Forum

Infrastructure as Code

Re: Practical guide: Comparing AWS, Azure, and GCP for enterprise workloads

I respect this view, but want to offer another perspective on the timeline. In our environment, we found that Elasticsearch, Fluentd, and Kibana worke...

11 months ago

Forum

Clouds - AWS, Azure, GCP

Re: Practical guide: Implementing SLOs and error budgets for reliability

We hit this same wall a few months back. The problem: security vulnerabilities. Our initial approach was ad-hoc monitoring but that didn't work becaus...

1 year ago

Forum

AIOps Discussion

Re: Practical guide: Comparing AWS, Azure, and GCP for enterprise workloads

This happened to us! Symptoms: high latency. Root cause analysis revealed connection pool exhaustion. Fix: increased pool size. Prevention measures: c...

1 year ago

Forum

AWS Cloud

Re: Update: On-call rotation best practices to prevent burnout

Super useful! We're just starting to evaluate this approach. Could you elaborate on success metrics? Specifically, I'm curious about risk mitigation....

1 year ago

Forum

DevOps Tools

Re: GitHub Copilot for DevOps: worth the $39/month?

This mirrors what happened to us earlier this year. The problem: scaling issues. Our initial approach was simple scripts but that didn't work because ...

1 year ago

Forum

AIOps Discussion

Topic

Follow-up: Prometheus and Grafana: Advanced monitoring techniques

1 year ago

Forum

AI DevOps

Replies: 20

Re: Update: Setting up a multi-region disaster recovery strategy on AWS

This mirrors what happened to us earlier this year. The problem: deployment failures. Our initial approach was manual intervention but that didn't wor...

1 year ago

Forum

AI Automation
