This resonates with what we experienced last month. Our problem was scaling: we started with simple scripts, but they couldn't keep up as the environment grew. What actually worked was cost allocation tagging, which gave us accurate showback. A key insight along the way was that documentation debt is as dangerous as technical debt. We're now able to detect issues early.
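To make the tagging piece concrete, here's a minimal sketch of how cost allocation tags can be applied with AWS CDK (shown in the Python flavor; the tag keys and values are placeholders, and you still have to activate them as cost allocation tags in the AWS Billing console before they show up in Cost Explorer):

```python
from aws_cdk import App, Stack, Tags
from aws_cdk import aws_s3 as s3

app = App()
stack = Stack(app, "ShowbackDemoStack")

# Any taggable resource in the stack picks up the stack-level tags below.
s3.Bucket(stack, "ReportsBucket")

# Stack-level tags propagate to every resource, so showback reports stay consistent.
# The keys/values here are placeholders - use whatever your finance team reports on.
Tags.of(stack).add("CostCenter", "platform-eng")
Tags.of(stack).add("Team", "sre")
Tags.of(stack).add("Environment", "prod")

app.synth()
```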
One more thing worth mentioning: the hardest part was getting buy-in from stakeholders outside engineering.
The end result was a 90% decrease in manual toil.
For context, our observability stack is Grafana, Loki, and Tempo.
What we'd suggest based on our work: 1) document as you go, 2) monitor proactively, 3) practice incident response, and 4) build for failure. A common mistake to avoid is skipping documentation. A resource that helped us a lot was Accelerate, by the DORA researchers (Forsgren, Humble, and Kim). Above all, prioritize learning over blame.
On the infrastructure-as-code side, we're using Terraform, AWS CDK, and CloudFormation.
Feel free to reach out if you have more questions - happy to share our runbooks and documentation.
I'd recommend checking out the community forums for more details.
What a comprehensive overview! I have a few questions: 1) How did you handle scaling? 2) What was your approach to blue-green deployments? 3) Did you run into any issues with costs? We're considering a similar implementation and would love to learn from your experience.
The end result was 99.9% availability, up from 99.5%.
Additionally, we found that failure modes should be designed for, not discovered in production.
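To show what designing for failure can look like at the code level (a generic sketch, not our production code), the main habit is making timeouts, bounded retries, and backoff explicit at every call to a downstream dependency:

```python
import random
import time

import requests  # assumed HTTP client; any client with explicit timeouts works


def fetch_with_retries(url, attempts=3, timeout=2.0, base_delay=0.5):
    """Call a downstream service with an explicit timeout and bounded retries.

    The failure mode (a slow or flapping dependency) is handled by design:
    we never wait forever, and we back off with jitter between attempts.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            response = requests.get(url, timeout=timeout)
            response.raise_for_status()
            return response.json()
        except requests.RequestException as exc:
            last_error = exc
            # Exponential backoff with jitter to avoid hammering a struggling service.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    raise RuntimeError(f"giving up on {url} after {attempts} attempts") from last_error
```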
We also found that documentation debt is as dangerous as technical debt.
Some technical specifics from our implementation: the architecture is microservices on Kubernetes; the tooling is Grafana, Loki, and Tempo; configuration is managed through GitOps with ArgoCD apps; performance benchmarks showed a 50% latency reduction; and on the security side we run container scanning in CI. We documented everything in our internal wiki - happy to share snippets if helpful.
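For anyone curious what "GitOps with ArgoCD apps" looks like in practice, here's a rough sketch of a single Argo CD Application manifest, built as a Python dict and dumped to YAML just for illustration (in a real repo it would be a plain YAML file; the repo URL, paths, and app name are made up):

```python
import yaml  # PyYAML, used here only to render the manifest

# Hypothetical repo URL, paths, and names - substitute your own.
application = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "Application",
    "metadata": {"name": "payments-service", "namespace": "argocd"},
    "spec": {
        "project": "default",
        "source": {
            "repoURL": "https://github.com/example-org/gitops-config.git",
            "targetRevision": "HEAD",
            "path": "apps/payments-service/overlays/prod",
        },
        "destination": {
            "server": "https://kubernetes.default.svc",
            "namespace": "payments",
        },
        # Automated sync is what makes it GitOps: the cluster converges on
        # whatever is in the repo, pruning drift and self-healing manual edits.
        "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
    },
}

print(yaml.safe_dump(application, sort_keys=False))
```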
For service mesh and traffic management, we're using Istio, Linkerd, and Envoy.
The end result was a 50% reduction in deployment time.
One thing I wish I had known earlier: security must be built in from the start, not bolted on later. It would have saved us a lot of time.
I'll walk you through our entire process. We started about 17 months ago with a small pilot. The initial challenges included legacy compatibility, and the breakthrough came when we improved observability. Key metrics improved, including a 90% decrease in manual toil. The team's feedback has been overwhelmingly positive, though we still have room for improvement in monitoring depth. Lessons learned: measure everything. Next steps for us: expand to more teams.
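Since improving observability was the turning point, here's roughly what the tracing side can look like - a minimal sketch using OpenTelemetry's Python SDK exporting OTLP spans toward a Tempo-style backend like the one mentioned earlier in the thread (the endpoint and service name are placeholders, not our actual config):

```python
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Placeholder endpoint - Tempo accepts OTLP, so point this at your collector or Tempo.
exporter = OTLPSpanExporter(endpoint="http://tempo.observability:4317", insecure=True)

provider = TracerProvider(resource=Resource.create({"service.name": "checkout"}))
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span("process-order"):
    # ... business logic; child spans and attributes get correlated in Grafana.
    pass
```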
Allow me to present an alternative view on the tooling choice. In our environment, Grafana, Loki, and Tempo worked better, largely because they helped us design for failure modes up front instead of discovering them in production. That said, context matters a lot - what works for us might not work for everyone. The key is to focus on outcomes.
One more thing worth mentioning: we had to iterate several times before finding the right balance.
The end result was a 3x increase in deployment frequency.