Update: Comparing AWS, Azure, and GCP for enterprise workloads

18 Posts
16 Users
0 Reactions
444 Views
(@christina.gutierrez3)
Topic starter

Let me dive into the technical side of our implementation. Architecture: hybrid cloud. Tooling: Elasticsearch, Fluentd, and Kibana (EFK) for logging. Configuration highlights: GitOps with ArgoCD applications. Performance benchmarks showed 99.99% availability. Security: secrets management with Vault. We documented everything in our internal wiki and are happy to share snippets if helpful.

A couple of things worth mentioning: we discovered several hidden dependencies during the migration, and we underestimated the training time needed, but the investment was worth it.

One thing I wish I'd known earlier: automation should augment human decision-making, not replace it entirely. That lesson would have saved us a lot of time.
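Since the Vault piece tends to raise questions, here's a minimal sketch of how a service can read its Elasticsearch credentials at startup using the hvac client. It's illustrative only: the mount path, secret path, and field names are assumptions, not our actual layout.

```python
# Minimal sketch: pull credentials from Vault at startup, assuming a KV v2 engine
# mounted at the default "secret/" path and a token in VAULT_TOKEN.
# The secret path and field names below are placeholders for illustration.
import os
import hvac

client = hvac.Client(
    url=os.environ.get("VAULT_ADDR", "https://vault.internal:8200"),
    token=os.environ["VAULT_TOKEN"],
)

resp = client.secrets.kv.v2.read_secret_version(path="logging/elasticsearch")
creds = resp["data"]["data"]  # KV v2 nests the payload under data.data
es_user, es_pass = creds["username"], creds["password"]
```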


 
Posted : 23/12/2024 11:21 am
(@christina.gutierrez3)
Topic starter

The technical implications here are worth examining: first, network topology; second, monitoring coverage; third, security hardening. We spent significant time on automation and it paid off. Code samples are available on our GitHub if anyone wants to take a look. Performance testing showed a 50% latency reduction.

Feel free to reach out if you have more questions; happy to share our runbooks and documentation.

The end result was a 90% decrease in manual toil.

One thing I wish I'd known earlier: documentation debt is just as dangerous as technical debt. Tackling it sooner would have saved us a lot of time.
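On the monitoring side, this is roughly the kind of Prometheus check we script to track the p95 latency figure quoted above. Treat it as a sketch: the Prometheus URL and metric name are placeholders rather than our exact setup.

```python
# Rough sketch: query Prometheus for current p95 request latency so before/after
# comparisons can be scripted. URL and metric name are placeholders.
import requests

PROM = "http://prometheus.monitoring:9090"
QUERY = 'histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))'

resp = requests.get(f"{PROM}/api/v1/query", params={"query": QUERY}, timeout=10)
resp.raise_for_status()
result = resp.json()["data"]["result"]
p95_seconds = float(result[0]["value"][1]) if result else None
print(f"current p95 latency: {p95_seconds}s")
```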


 
Posted : 24/12/2024 6:29 pm
(@tyler.robinson235)

This mirrors what happened to us earlier this year. The problem: scaling issues. Our initial approach was manual intervention, but that didn't work because it was too error-prone. What actually worked: cost allocation tagging for accurate showback. The key insight was that the human side of change management is often harder than the technical implementation. Now we're able to detect issues early.

I'd recommend checking out conference talks on YouTube and the official documentation for more details.

Additionally, we found that automation should augment human decision-making, not replace it entirely.

One thing I wish I'd known earlier: cross-team collaboration is essential for success. Getting that right sooner would have saved us a lot of time.

The end result was an 80% reduction in security vulnerabilities.

Feel free to reach out if you have more questions; happy to share our runbooks and documentation.
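To make the cost-allocation tagging concrete, here's roughly what a showback pull looks like with boto3 and Cost Explorer. It's a sketch, not a drop-in script: the tag key, the date range, and the assumption that the tag is already activated in billing are all placeholders.

```python
# Hedged sketch of a showback report: monthly unblended cost grouped by a "team"
# cost-allocation tag. Assumes the tag is activated in AWS billing and the caller
# has Cost Explorer permissions; tag key and dates are placeholders.
import boto3

ce = boto3.client("ce", region_name="us-east-1")
resp = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-11-01", "End": "2024-12-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "team"}],
)

for group in resp["ResultsByTime"][0]["Groups"]:
    team = group["Keys"][0]  # comes back as "team$<value>"
    cost = group["Metrics"]["UnblendedCost"]["Amount"]
    print(f"{team}: ${float(cost):.2f}")
```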


 
Posted : 25/12/2024 8:16 pm
(@brandon.williams519)

Just dealt with this! Symptoms: high latency. Root cause analysis revealed a memory leak. Fix: patched the leak. Prevention: chaos engineering. Total time to resolve was about 30 minutes, and now we have runbooks and monitoring to catch this early.

One more thing worth mentioning: we had to iterate several times before finding the right balance, and unexpected benefits included a better developer experience and faster onboarding.

I'd recommend checking out relevant blog posts for more details.
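For anyone chasing a similar leak, a minimal sketch of the approach using Python's standard-library tracemalloc: snapshot before and after the suspect code path and diff allocations by source line. The workload function here is a stand-in, not our real code.

```python
# Snapshot allocations before/after a suspect workload and diff by source line
# to see where memory growth is attributed. The workload is a stand-in.
import tracemalloc

def suspect_workload():
    # stand-in for the real code path; grows a list just to show the pattern
    return [object() for _ in range(100_000)]

tracemalloc.start()
before = tracemalloc.take_snapshot()
leaked = suspect_workload()
after = tracemalloc.take_snapshot()

for stat in after.compare_to(before, "lineno")[:10]:
    print(stat)  # top allocation growth, attributed to file:line
```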


 
Posted : 27/12/2024 1:20 pm
 Paul
(@paul)

From a technical standpoint, here's our implementation. Architecture: microservices on Kubernetes. Tooling: Istio, Linkerd, and Envoy. Configuration highlights: CI/CD with GitHub Actions workflows. Performance benchmarks showed 99.99% availability. Security: zero-trust networking. We documented everything in our internal wiki and are happy to share snippets if helpful.

Additionally, we found that automation should augment human decision-making, not replace it entirely.

A couple of things worth mentioning: the initial investment was higher than expected, but the long-term benefits exceeded our projections, and we discovered several hidden dependencies during the migration.

One thing I wish I'd known earlier: security must be built in from the start, not bolted on later. Doing so would have saved us a lot of time.

Feel free to reach out if you have more questions; happy to share our runbooks and documentation.
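As a small, hedged example of the availability spot-check we run against the cluster with the official Kubernetes Python client: count Ready pods versus total in a namespace. The namespace name is a placeholder.

```python
# Count Ready vs total pods in a namespace as a quick availability spot-check.
# Uses the official kubernetes Python client; namespace is a placeholder.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when run inside a pod
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod(namespace="payments").items
ready = sum(
    1
    for p in pods
    if any(c.type == "Ready" and c.status == "True" for c in (p.status.conditions or []))
)
print(f"{ready}/{len(pods)} pods ready")
```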


 
Posted : 28/12/2024 7:58 pm
(@benjamin.taylor696)

We tackled this from a different angle using Istio, Linkerd, and Envoy. The main reason was that cross-team collaboration is essential for success. However, I can see how your method would be better for larger teams. Have you considered integrating with an incident management system?

A couple of things worth mentioning: we had to iterate several times before finding the right balance, and unexpected benefits included a better developer experience and faster onboarding.


 
Posted : 30/12/2024 4:48 pm
(@nicholas.morgan692)

Great post! We've been doing this for about 23 months now and the results have been impressive. Our main learning was that cross-team collaboration is essential for success. We also discovered that the initial investment was higher than expected, but the long-term benefits exceeded our projections. For anyone starting out, I'd recommend drift detection with automated remediation.

I'd recommend checking out the official documentation for more details.

One more thing worth mentioning: team morale improved significantly once the manual toil was automated away.

One thing I wish I'd known earlier: failure modes should be designed for, not discovered in production. That mindset would have saved us a lot of time.

Feel free to reach out if you have more questions; happy to share our runbooks and documentation.
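To illustrate the drift-detection part, a rough sketch of the loop: run terraform plan with -detailed-exitcode (0 = clean, 2 = drift) and hand off to whatever remediation you trust. The remediation hook here is deliberately just a print.

```python
# Drift detection sketch: `terraform plan -detailed-exitcode` returns 0 when the
# state matches, 2 when drift/changes exist, and 1 on error.
import subprocess

result = subprocess.run(
    ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
    capture_output=True,
    text=True,
)

if result.returncode == 2:
    print("Drift detected:\n", result.stdout)
    # placeholder: trigger automated remediation, e.g. open a PR that re-applies state
elif result.returncode != 0:
    raise RuntimeError(f"terraform plan failed:\n{result.stderr}")
else:
    print("No drift.")
```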


 
Posted : 31/12/2024 4:29 am
(@christina.gutierrez3)
Topic starter

We hit this same problem! Symptoms: frequent timeouts. Root cause analysis revealed connection pool exhaustion. Fix: corrected the routing rules. Prevention: load testing. Total time to resolve was a few hours, but now we have runbooks and monitoring to catch this early.

Additionally, we found that observability is not optional: you can't improve what you can't measure.

Feel free to reach out if you have more questions; happy to share our runbooks and documentation. I'd also recommend checking out the community forums for more details.
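For anyone hitting the same pool exhaustion, here's a hedged example of the kind of pool settings that keep it from turning into request timeouts, shown with SQLAlchemy. The DSN and the numbers are illustrative; tune them to your workload.

```python
# Illustrative pool settings: bound the pool, fail fast on exhaustion, and
# pre-ping to weed out dead connections. DSN and numbers are placeholders.
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql+psycopg2://app:secret@db.internal:5432/orders",
    pool_size=10,        # steady-state connections per process
    max_overflow=5,      # temporary burst connections beyond pool_size
    pool_timeout=5,      # fail fast instead of hanging when the pool is exhausted
    pool_pre_ping=True,  # detect and replace dead connections before use
)
```

Failing fast on an exhausted pool is what turns a silent pile-up into an obvious, alertable error.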


 
Posted : 31/12/2024 7:43 pm
(@christine.moore9)

We went a different direction on this using Datadog, PagerDuty, and Slack. The main reason was that cross-team collaboration is essential for success. However, I can see how your method would be better for legacy environments. Have you considered automated rollback based on error rate thresholds?

One thing I wish I'd known earlier: observability is not optional; you can't improve what you can't measure. That would have saved us a lot of time.

One more thing worth mentioning: the hardest part was getting buy-in from stakeholders outside engineering.
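On the automated-rollback question, the shape it usually takes is roughly this: read the error ratio from your metrics backend and roll the deployment back when it crosses a threshold. Everything here (the Prometheus query, the deployment name, the 5% threshold) is a placeholder, not a vetted pipeline.

```python
# Threshold-based rollback sketch: read the 5xx error ratio from Prometheus and
# roll back a Kubernetes deployment when it exceeds the threshold.
import subprocess
import requests

PROM = "http://prometheus.monitoring:9090"
QUERY = (
    'sum(rate(http_requests_total{status=~"5.."}[5m]))'
    " / sum(rate(http_requests_total[5m]))"
)

resp = requests.get(f"{PROM}/api/v1/query", params={"query": QUERY}, timeout=10)
resp.raise_for_status()
result = resp.json()["data"]["result"]
error_ratio = float(result[0]["value"][1]) if result else 0.0

if error_ratio > 0.05:
    subprocess.run(
        ["kubectl", "rollout", "undo", "deployment/checkout", "-n", "payments"],
        check=True,
    )
    print(f"error ratio {error_ratio:.2%} exceeded threshold, rolled back")
```

In practice you'd want some hysteresis or a minimum sample count so a single noisy scrape doesn't trigger a rollback.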


 
Posted : 01/01/2025 12:30 am
(@linda.foster79)

What a comprehensive overview! I have a few questions: 1) How did you handle scaling? 2) What was your approach to canary deployments? 3) Did you encounter any issues with latency? We're considering a similar implementation and would love to learn from your experience.

For context, we're using Kubernetes, Helm, ArgoCD, and Prometheus, with Istio, Linkerd, and Envoy on the service mesh side.

Feel free to reach out if you have more questions; happy to share our runbooks and documentation.


 
Posted : 01/01/2025 9:33 am
(@david_jenkins)

Let me tell you how we approached this. We started about 9 months ago with a small pilot. Initial challenges included legacy compatibility. The breakthrough came when we streamlined the process. Key metrics improved: a 90% decrease in manual toil. The team's feedback has been overwhelmingly positive, though documentation still has room for improvement, and improving it is our next step. Lessons learned: measure everything.

Feel free to reach out if you have more questions - happy to share our runbooks and documentation.


 
Posted : 01/01/2025 8:10 pm
(@rebecca.brown460)

We chose a different path here using Datadog, PagerDuty, and Slack. The main reason was that observability is not optional: you can't improve what you can't measure. However, I can see how your method would be better for regulated industries. Have you considered compliance scanning in the CI pipeline?

For context, we're using Kubernetes, Helm, ArgoCD, and Prometheus, with Terraform, AWS CDK, and CloudFormation for infrastructure provisioning.

One more thing worth mentioning: integration with existing tools was smoother than anticipated.
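On the compliance-scanning idea, one minimal way to wire it into CI is to shell out to a scanner such as checkov against the Terraform directory and fail the build on findings. The choice of checkov and the directory path are assumptions on my part, not necessarily what anyone here is running.

```python
# Minimal CI step sketch: run checkov against the Terraform directory and block
# the pipeline if it reports violations (checkov exits non-zero on failed checks).
import subprocess
import sys

result = subprocess.run(
    ["checkov", "--directory", "infra/terraform", "--quiet"],
    capture_output=True,
    text=True,
)
print(result.stdout)

if result.returncode != 0:
    print("Compliance checks failed; blocking the pipeline.", file=sys.stderr)
    sys.exit(result.returncode)
```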


 
Posted : 03/01/2025 5:46 am
(@nicholas.gray779)

Helpful context! We're evaluating this approach ourselves. Could you elaborate on team structure? Specifically, I'm curious how you measured success. Also, how long did the initial implementation take? Any gotchas we should watch out for?

On our side, the end result was an 80% reduction in security vulnerabilities and a 70% reduction in incident MTTR.

A few things worth mentioning: we underestimated the training time needed but it was worth the investment, the hardest part was getting buy-in from stakeholders outside engineering, and integration with existing tools was smoother than anticipated.

For context, we're using Datadog, PagerDuty, and Slack alongside Jenkins, GitHub Actions, and Docker.

Additionally, we found that observability is not optional (you can't improve what you can't measure) and that cross-team collaboration is essential for success.

I'd recommend checking out the official documentation for more details.


 
Posted : 03/01/2025 11:28 pm
(@donna.jimenez105)

Key takeaways from our implementation: 1) document as you go, 2) implement circuit breakers (a minimal sketch is below), 3) share knowledge across teams, 4) build for failure. Common mistakes to avoid: not measuring outcomes. Resources that helped us: Team Topologies. The most important thing is outcomes over outputs.
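On point 2, here's a minimal circuit-breaker sketch, illustrative rather than production code: after a run of consecutive failures the breaker opens and rejects calls for a cooldown period, then allows a single trial call through.

```python
# Minimal circuit breaker: open after `max_failures` consecutive errors, reject
# calls while open, and allow one trial call after `reset_after` seconds.
import time

class CircuitBreaker:
    def __init__(self, max_failures=5, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open; call rejected")
            self.opened_at = None          # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0                  # success resets the failure count
        return result
```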

We also found that failure modes should be designed for, not discovered in production, and that automation should augment human decision-making, not replace it entirely.


 
Posted : 04/01/2025 4:03 am
(@sara)

Good point! We diverged a bit, using Elasticsearch, Fluentd, and Kibana. The main reason was that starting small and iterating is more effective than a big-bang transformation. However, I can see how your method would be better for regulated industries. Have you considered chaos engineering tests in staging?

One thing I wish I'd known earlier: cross-team collaboration is essential for success; it would have saved us a lot of time.

One more thing worth mentioning: we underestimated the training time needed, but it was worth the investment.
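To make the chaos-testing suggestion concrete, the simplest possible experiment looks something like this: delete one random pod in a staging namespace and let your monitoring tell you whether anything user-visible happened. The namespace and label selector are placeholders, and this should obviously never point at production.

```python
# Simplest chaos experiment sketch: delete one random pod in staging and rely on
# the deployment controller to replace it while monitoring watches for impact.
import random
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod(namespace="staging", label_selector="app=checkout").items
victim = random.choice(pods)
v1.delete_namespaced_pod(name=victim.metadata.name, namespace="staging")
print(f"deleted {victim.metadata.name}; watch dashboards for user-visible impact")
```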


 
Posted : 05/01/2025 8:57 am
Page 1 / 2