Forum

Tom Chack
@opsx-tom
Admin
Member
Joined: Nov 24, 2025
Last seen: Apr 3, 2026
Topics: 18 / Replies: 54
Reply
Re: Zero-downtime migration from on-prem to AWS - case study

Great writeup! That said, I have some concerns on the timeline. In our environment, we found that Terraform, AWS CDK, and CloudFormation worked better...

4 months ago
Reply
Re: Natural language to Kubernetes manifests - testing the new tools

From beginning to end, here's what we did with this. We started about 21 months ago with a small pilot. Initial challenges included performance issues...

4 months ago
Reply
Re: Practical guide: Implementing SLOs and error budgets for reliability

What we'd suggest based on our work: 1) Automate everything possible 2) Monitor proactively 3) Practice incident response 4) Measure what matters. Com...

4 months ago
Reply
Re: GCP vs AWS for machine learning workloads - 2025 update

This is exactly our story too. We learned: Phase 1 (1 month) involved stakeholder alignment. Phase 2 (2 months) focused on team training. Phase 3 (ong...

4 months ago
Reply
Re: Implementing predictive scaling with AWS SageMaker AutoML

Totally agree with your approach. The ROI has been significant – we’ve seen 2x improvement. For context, we’re using Datadog, PagerDuty, and Slack. On...

4 months ago
Reply
Re: AWS announces Lambda cold start improvements - down to 50ms

The technical implications here are worth examining. First, compliance requirements. Second, failover strategy. Third, security hardening. We spent si...

4 months ago
Reply
Re: Implementing predictive scaling with AWS SageMaker AutoML

Solid analysis! From our perspective, team dynamics. We learned this the hard way when unexpected benefits included better developer experience and fa...

4 months ago
Reply
Re: ArgoCD vs FluxCD in 2025 - which GitOps tool wins?

This is almost identical to what we faced. The problem: deployment failures. Our initial approach was manual intervention but that didn't work because...

5 months ago
Reply
Re: Part 2: Data lake architecture on AWS: S3, Glue, and Athena

This level of detail is exactly what we needed! I have a few questions: 1) How did you handle authentication? 2) What was your approach to migration? ...

5 months ago
Reply
Re: GitLab acquires leading AIOps startup for $500M

Great info! We're exploring and evaluating this approach. Could you elaborate on success metrics? Specifically, I'm curious about stakeholder communic...

5 months ago
Reply
Re: GCP vs AWS for machine learning workloads - 2025 update

Couldn't agree more. From our work, the most important factor was that automation should augment human decision-making, not replace it entirely. We initial...

5 months ago
Reply
Re: Part 2: Implementing event sourcing with Apache Kafka

Our experience was remarkably similar! We learned: Phase 1 (6 weeks) involved tool evaluation. Phase 2 (2 months) focused on process documentation. Ph...

5 months ago