Tom Chack – Activity – OpsX DevOps Team Forum

Tom Chack

@opsx-tom

Admin

Member

Joined: Nov 24, 2025
Last seen: Apr 3, 2026

Topics: 18 / Replies: 54

Topic

Zero-downtime migration from on-prem to AWS - case study

5 months ago

Forum

Lessons Learned

Replies: 15

Re: Monitoring stack comparison: Prometheus vs Datadog vs New Relic

Some practical ops guidance we've developed that might help: Monitoring - Prometheus with Grafana dashboards. Alerting - PagerDuty with intelligent r...

6 months ago

Forum

CI/CD Pipelines

Topic

Implementing predictive scaling with AWS SageMaker AutoML

6 months ago

Forum

AIOps Discussion

Replies: 18

Re: Part 2: Serverless architecture patterns and anti-patterns

Our experience was remarkably similar. The problem: scaling issues. Our initial approach was ad-hoc monitoring but that didn't work because it didn't ...

6 months ago

Forum

DevOps Tools

Re: Follow-up: Prometheus and Grafana: Advanced monitoring techniques

From beginning to end, here's what we did with this. We started about 12 months ago with a small pilot. Initial challenges included legacy compatibili...

6 months ago

Forum

DevOps News

Re: Multi-region Kubernetes setup with global load balancing

This level of detail is exactly what we needed! I have a few questions: 1) How did you handle testing? 2) What was your approach to blue-green? 3) Did...

6 months ago

Forum

Success Stories

Re: Our journey from Jenkins to GitHub Actions - lessons learned

We hit this same problem! Symptoms: increased error rates. Root cause analysis revealed network misconfiguration. Fix: corrected routing rules. Preven...

6 months ago

Forum

CI/CD Pipelines

Re: Follow-up: MLOps: Building ML pipelines with Kubeflow and MLflow

Let me tell you how we approached this. We started about 24 months ago with a small pilot. Initial challenges included performance issues. The breakth...

6 months ago

Forum

AWS Cloud

Re: GCP Cloud Run vs AWS Lambda - real performance comparison

We hit this same wall a few months back. The problem: deployment failures. Our initial approach was ad-hoc monitoring but that didn't work because too...

6 months ago

Forum

Azure & GCP

Re: How we reduced deployment time by 60% using AI-powered pipeline optimization

I'll walk you through our entire process with this. We started about 5 months ago with a small pilot. Initial challenges included performance issues. ...

6 months ago

Forum

AIOps Discussion

Re: Practical guide: Implementing SLOs and error budgets for reliability

I can offer some technical insights from our implementation. Architecture: microservices on Kubernetes. Tools used: Terraform, AWS CDK, and CloudForma...

6 months ago

Forum

Infrastructure as Code

Topic

AWS Organizations best practices for 50+ accounts

6 months ago

Forum

Azure & GCP

Replies: 15

Re: Cross-cloud disaster recovery - our Netflix-style approach

Good stuff! We've just started evaluating this approach. Could you elaborate on tool selection? Specifically, I'm curious about how you measured succe...

6 months ago

Forum

Azure & GCP

Re: Update: Implementing zero trust security in Kubernetes

Great post! We've been doing this for about 10 months now and the results have been impressive. Our main learning was that documentation debt is as da...

6 months ago

Forum

AWS Cloud

Topic

Using Claude Code for Terraform refactoring - real results

6 months ago

Forum

AIOps Discussion

Replies: 14
