We went down this path in our organization too and can confirm the benefits. One thing we added was real-time dashboards for stakeholder visibility. The key insight for us was that cross-team collaboration is essential, and we had to iterate several times before finding the right balance. Happy to share more details if anyone is interested.
For context, we're using Grafana, Loki, and Tempo.
Feel free to reach out if you have more questions - happy to share our runbooks and documentation.
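For anyone evaluating a similar stack, here's a rough sketch of what shipping a log line into Loki looks like. The endpoint path and payload shape below follow Loki's push API as I understand it, but treat the details (and the label names, which are made up for illustration) as assumptions to verify against your Loki version:

```python
import json
import time

def loki_push_payload(labels: dict, lines: list) -> str:
    """Build the JSON body for Loki's /loki/api/v1/push endpoint.

    Loki expects nanosecond-precision string timestamps; the labels
    become the stream selector you later query with LogQL in Grafana.
    """
    ts_ns = str(time.time_ns())
    return json.dumps({
        "streams": [{
            "stream": labels,  # e.g. {"app": "checkout", "env": "staging"}
            "values": [[ts_ns, line] for line in lines],
        }]
    })

# Hypothetical service labels, not from a real deployment.
payload = loki_push_payload(
    {"app": "checkout", "env": "staging"},
    ["request handled in 42ms"],
)
```

From there, a LogQL query like `{app="checkout"}` in Grafana surfaces the stream, which is what drives the dashboards mentioned above.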
Great post! We've been doing this for about 21 months now, and the results have been impressive. Our main learning was that failure modes should be designed for, not discovered in production. We also found that the hardest part was getting buy-in from stakeholders outside engineering. For anyone starting out, I'd recommend running chaos engineering tests in staging.
For context, we're using Vault, AWS KMS, and SOPS.
The end result was an 80% reduction in security vulnerabilities.
One thing I wish I'd known earlier: automation should augment human decision-making, not replace it entirely. That alone would have saved us a lot of time.
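To make the chaos-engineering suggestion concrete, here's the kind of minimal fault-injection test we could have started with in staging: wrap a dependency call, inject failures at a configurable rate, and assert the caller degrades to a fallback instead of crashing. All names here are hypothetical, not from any particular chaos framework:

```python
import random

class FlakyDependency:
    """Simulates a downstream service that fails at a configurable rate."""

    def __init__(self, failure_rate: float, seed: int = 0):
        self.failure_rate = failure_rate
        # Seeded RNG so staging runs are reproducible.
        self.rng = random.Random(seed)

    def fetch_price(self, sku: str) -> float:
        if self.rng.random() < self.failure_rate:
            raise ConnectionError(f"injected fault for {sku}")
        return 9.99  # stand-in for a real response

def price_with_fallback(dep: FlakyDependency, sku: str, cached: float) -> float:
    """The behavior under test: degrade to a cached price, never crash."""
    try:
        return dep.fetch_price(sku)
    except ConnectionError:
        return cached

# Chaos test: even at a 50% failure rate, callers always get *a* price.
dep = FlakyDependency(failure_rate=0.5)
prices = [price_with_fallback(dep, "sku-123", cached=8.99) for _ in range(100)]
assert all(p in (9.99, 8.99) for p in prices)
```

The point is designing the failure mode (serve the cached price) up front, then proving it holds under injected faults rather than discovering it in production.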
Great writeup! That said, I have some concerns about the timeline. In our environment, we found that Terraform, AWS CDK, and CloudFormation worked better because failure modes should be designed for, not discovered in production. Of course, context matters a lot: what works for us might not work for everyone. The key is to start small and iterate.
Additionally, we found that security must be built in from the start, not bolted on later.
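On the "built in from the start" point, one cheap way to enforce it with Terraform is a policy check over `terraform show -json` plan output in CI. A rough sketch (the plan-walking below assumes the documented plan JSON shape, and the encryption attribute shown applies to older AWS provider versions where it lived on the bucket resource itself, so adapt it to your provider):

```python
def find_unencrypted_buckets(plan: dict) -> list:
    """Scan Terraform plan JSON (terraform show -json output) for S3
    buckets being created without server-side encryption configured."""
    violations = []
    for res in plan.get("resource_changes", []):
        if res.get("type") != "aws_s3_bucket":
            continue
        if "create" not in res.get("change", {}).get("actions", []):
            continue
        after = res["change"].get("after") or {}
        if not after.get("server_side_encryption_configuration"):
            violations.append(res["address"])
    return violations

# Trimmed-down stand-in for real plan output.
sample_plan = {
    "resource_changes": [{
        "address": "aws_s3_bucket.logs",
        "type": "aws_s3_bucket",
        "change": {"actions": ["create"], "after": {"bucket": "logs"}},
    }]
}
assert find_unencrypted_buckets(sample_plan) == ["aws_s3_bucket.logs"]
```

Failing the pipeline on any violation is what "not bolted on later" looked like for us: the insecure resource never gets applied in the first place.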