<?xml version="1.0" encoding="UTF-8"?>        <rss version="2.0"
             xmlns:atom="http://www.w3.org/2005/Atom"
             xmlns:dc="http://purl.org/dc/elements/1.1/"
             xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
             xmlns:admin="http://webns.net/mvcb/"
             xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:content="http://purl.org/rss/1.0/modules/content/">
        <channel>
            <title>AI DevOps - OpsX DevOps Team Forum</title>
            <link>https://opsx.team/community/ai-devops/</link>
            <description>OpsX DevOps Team Discussion Board</description>
            <language>en-US</language>
            <lastBuildDate>Tue, 07 Apr 2026 23:58:35 +0000</lastBuildDate>
            <generator>wpForo</generator>
            <ttl>60</ttl>
							                    <item>
                        <title>Update: Implementing GitOps workflow with ArgoCD and Kubernetes</title>
                        <link>https://opsx.team/community/ai-devops/update-implementing-gitops-workflow-with-argocd-and-kubernetes-255/</link>
                        <pubDate>Mon, 29 Sep 2025 17:21:13 +0000</pubDate>
                        <description><![CDATA[This is exactly our story too. We learned: Phase 1 (6 weeks) involved tool evaluation. Phase 2 (2 months) focused on team training. Phase 3 (1 month) was all about optimization. Total invest...]]></description>
                        <content:encoded><![CDATA[This is exactly our story too. We learned: Phase 1 (6 weeks) involved tool evaluation. Phase 2 (2 months) focused on team training. Phase 3 (1 month) was all about optimization. Total investment was $50K but the payback period was only 9 months. Key success factors: executive support, dedicated team, clear metrics. If I could do it again, I would invest more in training.

I'd recommend checking out the official ArgoCD documentation for more details.

The end result was a 90% decrease in manual toil.

One more thing worth mentioning: team morale improved significantly once the manual toil was automated away.

Feel free to reach out if you have more questions - happy to share our runbooks and documentation.

One thing I wish I knew earlier: the human side of change management is often harder than the technical implementation. Would have saved us a lot of time.]]></content:encoded>
						                            <category domain="https://opsx.team/community/ai-devops/">AI DevOps</category>                        <dc:creator>Matthew Ramos</dc:creator>
                        <guid isPermaLink="true">https://opsx.team/community/ai-devops/update-implementing-gitops-workflow-with-argocd-and-kubernetes-255/</guid>
                    </item>
				                    <item>
                        <title>Deep dive: Implementing zero trust security in Kubernetes</title>
                        <link>https://opsx.team/community/ai-devops/deep-dive-implementing-zero-trust-security-in-kubernetes-202/</link>
                        <pubDate>Sun, 10 Aug 2025 05:21:13 +0000</pubDate>
                        <description><![CDATA[We went through something very similar. The problem: deployment failures. Our initial approach was simple scripts, but that didn&#039;t work because it was too error-prone. What actually worked: real-tim...]]></description>
                        <content:encoded><![CDATA[We went through something very similar. The problem: deployment failures. Our initial approach was simple scripts, but that didn't work because it was too error-prone. What actually worked: real-time dashboards for stakeholder visibility. The key insight was that documentation debt is as dangerous as technical debt. Now we're able to deploy with confidence.

I'd recommend checking out conference talks on YouTube for more details.

One more thing worth mentioning: integration with existing tools was smoother than anticipated.

Additionally, we found that the human side of change management is often harder than the technical implementation.

For context, we're using Kubernetes, Helm, ArgoCD, and Prometheus, with Elasticsearch, Fluentd, and Kibana for logging.

Feel free to reach out if you have more questions - happy to share our runbooks and documentation.]]></content:encoded>
						                            <category domain="https://opsx.team/community/ai-devops/">AI DevOps</category>                        <dc:creator>Jeffrey Alvarez</dc:creator>
                        <guid isPermaLink="true">https://opsx.team/community/ai-devops/deep-dive-implementing-zero-trust-security-in-kubernetes-202/</guid>
                    </item>
				                    <item>
                        <title>Practical guide: Optimizing GitHub Actions for faster CI/CD pipelines</title>
                        <link>https://opsx.team/community/ai-devops/practical-guide-optimizing-github-actions-for-faster-cicd-pipelines-158/</link>
                        <pubDate>Thu, 12 Jun 2025 03:21:13 +0000</pubDate>
                        <description><![CDATA[Great post! We&#039;ve been doing this for about 18 months now and the results have been impressive. Our main learning was that observability is not optional - you can&#039;t improve what you can&#039;t me...]]></description>
                        <content:encoded><![CDATA[Great post! We've been doing this for about 18 months now, and the results have been impressive. Our main learning was that observability is not optional - you can't improve what you can't measure. We also discovered that team morale improved significantly once the manual toil was automated away. For anyone starting out, I'd recommend adding compliance scanning to the CI pipeline.

Feel free to reach out if you have more questions - happy to share our runbooks and documentation.

The end result was a 3x increase in deployment frequency and 99.9% availability, up from 99.5%.

For context, we're using Istio, Linkerd, and Envoy.]]></content:encoded>
						                            <category domain="https://opsx.team/community/ai-devops/">AI DevOps</category>                        <dc:creator>William Smith</dc:creator>
                        <guid isPermaLink="true">https://opsx.team/community/ai-devops/practical-guide-optimizing-github-actions-for-faster-cicd-pipelines-158/</guid>
                    </item>
				                    <item>
                        <title>Part 2: Setting up a multi-region disaster recovery strategy on AWS</title>
                        <link>https://opsx.team/community/ai-devops/part-2-setting-up-a-multi-region-disaster-recovery-strategy-on-aws-314/</link>
                        <pubDate>Thu, 12 Jun 2025 03:21:13 +0000</pubDate>
                        <description><![CDATA[Not to be contrarian, but I see this differently when it comes to team structure. In our environment, we found that Vault, AWS KMS, and SOPS worked better for secrets management. That said, context matt...]]></description>
                        <content:encoded><![CDATA[Not to be contrarian, but I see this differently when it comes to team structure. In our environment, we found that Vault, AWS KMS, and SOPS worked better for secrets management. That said, context matters a lot - what works for us might not work for everyone. The key is to focus on outcomes.
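
To make that concrete, here's a rough sketch of how a service can read a credential out of Vault with the hvac Python client. The address, mount point, secret path, and field name are placeholders for illustration, not our actual layout, so treat it as a sketch rather than a drop-in.

import os
import hvac

# Vault address and token are illustrative; in production we'd use a proper auth method.
client = hvac.Client(
    url="https://vault.example.internal:8200",
    token=os.environ["VAULT_TOKEN"],
)

# Read a KV v2 secret; "apps/payments/db" and "password" are made-up examples.
secret = client.secrets.kv.v2.read_secret_version(
    path="apps/payments/db",
    mount_point="secret",
)
db_password = secret["data"]["data"]["password"]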

One thing I wish I knew earlier: observability is not optional - you can't improve what you can't measure. Would have saved us a lot of time.

Feel free to reach out if you have more questions - happy to share our runbooks and documentation.

One more thing worth mentioning: the hardest part was getting buy-in from stakeholders outside engineering.

The end result was a 3x increase in deployment frequency and a 70% reduction in incident MTTR.]]></content:encoded>
						                            <category domain="https://opsx.team/community/ai-devops/">AI DevOps</category>                        <dc:creator>Rachel Morales</dc:creator>
                        <guid isPermaLink="true">https://opsx.team/community/ai-devops/part-2-setting-up-a-multi-region-disaster-recovery-strategy-on-aws-314/</guid>
                    </item>
				                    <item>
                        <title>Update: Migrating from monolith to microservices: Lessons learned</title>
                        <link>https://opsx.team/community/ai-devops/update-migrating-from-monolith-to-microservices-lessons-learned-286/</link>
                        <pubDate>Fri, 04 Apr 2025 22:21:13 +0000</pubDate>
                        <description><![CDATA[What a comprehensive overview! I have a few questions: 1) How did you handle scaling? 2) What was your approach to blue-green? 3) Did you encounter any issues with consistency? We&#039;re conside...]]></description>
                        <content:encoded><![CDATA[What a comprehensive overview! I have a few questions: 1) How did you handle scaling? 2) What was your approach to blue-green? 3) Did you encounter any issues with consistency? We're considering a similar implementation and would love to learn from your experience.

For context, we're using Elasticsearch, Fluentd, and Kibana.

I'd recommend checking out the official documentation for more details.

The end result was a 70% reduction in incident MTTR.

Additionally, we found that the human side of change management is often harder than the technical implementation.

Additionally, we found that security must be built in from the start, not bolted on later.

Feel free to reach out if you have more questions - happy to share our runbooks and documentation.]]></content:encoded>
						                            <category domain="https://opsx.team/community/ai-devops/">AI DevOps</category>                        <dc:creator>Patricia Morgan</dc:creator>
                        <guid isPermaLink="true">https://opsx.team/community/ai-devops/update-migrating-from-monolith-to-microservices-lessons-learned-286/</guid>
                    </item>
				                    <item>
                        <title>Part 2: Setting up a multi-region disaster recovery strategy on AWS</title>
                        <link>https://opsx.team/community/ai-devops/part-2-setting-up-a-multi-region-disaster-recovery-strategy-on-aws-260/</link>
                        <pubDate>Fri, 14 Mar 2025 08:21:13 +0000</pubDate>
                        <description><![CDATA[On the technical front, several aspects deserve attention. First, network topology. Second, backup procedures. Third, security hardening. We spent significant time on testing and it was wort...]]></description>
                        <content:encoded><![CDATA[On the technical front, several aspects deserve attention. First, network topology. Second, backup procedures. Third, security hardening. We spent significant time on testing, and it was worth it. Code samples are available on our GitHub if anyone wants to take a look. Performance testing showed a 2x improvement.
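
To give a flavour of the backup procedures piece, here's a rough sketch (assuming boto3; the regions, account ID, and snapshot identifiers are placeholders, not our real setup) of copying an RDS snapshot into the standby region:

import boto3

# Placeholder regions and identifiers - adjust to your own primary/standby pair.
rds = boto3.client("rds", region_name="us-west-2")

# Copy last night's snapshot from the primary region so it can be restored during a failover.
rds.copy_db_snapshot(
    SourceDBSnapshotIdentifier="arn:aws:rds:us-east-1:123456789012:snapshot:app-nightly",
    TargetDBSnapshotIdentifier="app-nightly-dr-copy",
    SourceRegion="us-east-1",
    CopyTags=True,
)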

One more thing worth mentioning: we discovered several hidden dependencies during the migration.

Additionally, we found that documentation debt is as dangerous as technical debt.

For context, we're using Kubernetes, Helm, ArgoCD, and Prometheus, with Elasticsearch, Fluentd, and Kibana for logging.

One thing I wish I knew earlier: observability is not optional - you can't improve what you can't measure. Would have saved us a lot of time.

Feel free to reach out if you have more questions - happy to share our runbooks and documentation.]]></content:encoded>
						                            <category domain="https://opsx.team/community/ai-devops/">AI DevOps</category>                        <dc:creator>Benjamin Rivera</dc:creator>
                        <guid isPermaLink="true">https://opsx.team/community/ai-devops/part-2-setting-up-a-multi-region-disaster-recovery-strategy-on-aws-260/</guid>
                    </item>
				                    <item>
                        <title>Deep dive: Building a DevOps culture in a traditional enterprise</title>
                        <link>https://opsx.team/community/ai-devops/deep-dive-building-a-devops-culture-in-a-traditional-enterprise-161/</link>
                        <pubDate>Fri, 07 Mar 2025 11:21:13 +0000</pubDate>
                        <description><![CDATA[Can confirm from our side. The most important factor was that observability is not optional - you can&#039;t improve what you can&#039;t measure. We initially struggled with scaling issues but found that r...]]></description>
                        <content:encoded><![CDATA[Can confirm from our side. The most important factor was that observability is not optional - you can't improve what you can't measure. We initially struggled with scaling issues but found that real-time dashboards for stakeholder visibility worked well. The ROI has been significant - we've seen a 3x improvement.

Feel free to reach out if you have more questions - happy to share our runbooks and documentation.

The end result was an 80% reduction in security vulnerabilities.

Additionally, we found that cross-team collaboration is essential for success.

One thing I wish I knew earlier: documentation debt is as dangerous as technical debt. Would have saved us a lot of time.]]></content:encoded>
						                            <category domain="https://opsx.team/community/ai-devops/">AI DevOps</category>                        <dc:creator>Donald Price</dc:creator>
                        <guid isPermaLink="true">https://opsx.team/community/ai-devops/deep-dive-building-a-devops-culture-in-a-traditional-enterprise-161/</guid>
                    </item>
				                    <item>
                        <title>Deep dive: Optimizing GitHub Actions for faster CI/CD pipelines</title>
                        <link>https://opsx.team/community/ai-devops/deep-dive-optimizing-github-actions-for-faster-cicd-pipelines-217/</link>
                        <pubDate>Sat, 22 Feb 2025 18:21:13 +0000</pubDate>
                        <description><![CDATA[This level of detail is exactly what we needed! I have a few questions: 1) How did you handle monitoring? 2) What was your approach to canary? 3) Did you encounter any issues with latency? W...]]></description>
                        <content:encoded><![CDATA[This level of detail is exactly what we needed! I have a few questions: 1) How did you handle monitoring? 2) What was your approach to canary? 3) Did you encounter any issues with latency? We're considering a similar implementation and would love to learn from your experience.

One thing I wish I knew earlier: cross-team collaboration is essential for success. Would have saved us a lot of time.

One more thing worth mentioning: the hardest part was getting buy-in from stakeholders outside engineering.

Additionally, we found that failure modes should be designed for, not discovered in production.

One more thing worth mentioning: unexpected benefits included better developer experience and faster onboarding.

I'd recommend checking out the official GitHub Actions documentation for more details.

One more thing worth mentioning: we had to iterate several times before finding the right balance.]]></content:encoded>
						                            <category domain="https://opsx.team/community/ai-devops/">AI DevOps</category>                        <dc:creator>Gregory Davis</dc:creator>
                        <guid isPermaLink="true">https://opsx.team/community/ai-devops/deep-dive-optimizing-github-actions-for-faster-cicd-pipelines-217/</guid>
                    </item>
				                    <item>
                        <title>Update: PostgreSQL performance tuning for high-traffic applications</title>
                        <link>https://opsx.team/community/ai-devops/update-postgresql-performance-tuning-for-high-traffic-applications-212/</link>
                        <pubDate>Mon, 03 Feb 2025 17:21:13 +0000</pubDate>
                        <description><![CDATA[From a technical standpoint, here&#039;s our implementation. Architecture: microservices on Kubernetes. Tools used: Elasticsearch, Fluentd, and Kibana. Configuration highlights: IaC with Terraform modul...]]></description>
                        <content:encoded><![CDATA[From a technical standpoint, here's our implementation. Architecture: microservices on Kubernetes. Tools used: Elasticsearch, Fluentd, and Kibana. Configuration highlights: IaC with Terraform modules. Performance benchmarks showed 99.99% availability. Security considerations: container scanning in CI. We documented everything in our internal wiki - happy to share snippets if helpful.
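
As a small illustration of how we lean on the logging stack day to day, here's a rough sketch of pulling recent error lines with the Elasticsearch Python client (assuming the 8.x client; the endpoint, index pattern, and field names are invented for the example):

from elasticsearch import Elasticsearch

# Hypothetical in-cluster endpoint - adjust to your own EFK deployment.
es = Elasticsearch("http://elasticsearch.logging.svc:9200")

# Fetch the 20 most recent ERROR-level entries from the last 15 minutes.
resp = es.search(
    index="fluentd-*",
    query={
        "bool": {
            "must": [{"match": {"level": "ERROR"}}],
            "filter": [{"range": {"@timestamp": {"gte": "now-15m"}}}],
        }
    },
    sort=[{"@timestamp": {"order": "desc"}}],
    size=20,
)

for hit in resp["hits"]["hits"]:
    print(hit["_source"].get("message", ""))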

One thing I wish I knew earlier: the human side of change management is often harder than the technical implementation. Would have saved us a lot of time.

Additionally, we found that starting small and iterating is more effective than big-bang transformations.

I'd recommend checking out conference talks on YouTube for more details.

Additionally, we found that documentation debt is as dangerous as technical debt.]]></content:encoded>
						                            <category domain="https://opsx.team/community/ai-devops/">AI DevOps</category>                        <dc:creator>Aaron Gutierrez</dc:creator>
                        <guid isPermaLink="true">https://opsx.team/community/ai-devops/update-postgresql-performance-tuning-for-high-traffic-applications-212/</guid>
                    </item>
				                    <item>
                        <title>PostgreSQL performance tuning for high-traffic applications</title>
                        <link>https://opsx.team/community/ai-devops/postgresql-performance-tuning-for-high-traffic-applications-135/</link>
                        <pubDate>Sat, 01 Feb 2025 15:21:13 +0000</pubDate>
                        <description><![CDATA[Our PostgreSQL database was struggling with 10,000 QPS. Performance tuning journey: connection pooling with PgBouncer, query optimization using EXPLAIN ANALYZE, proper indexing strategy, tab...]]></description>
                        <content:encoded><![CDATA[Our PostgreSQL database was struggling with 10,000 QPS. Our performance tuning journey: connection pooling with PgBouncer, query optimization using EXPLAIN ANALYZE, a proper indexing strategy, table partitioning for large tables, and read scaling with replicas. We also tuned shared_buffers, work_mem, and other parameters. We now handle 50,000 QPS without breaking a sweat. What PostgreSQL optimization tips do you have?
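
For anyone who hasn't leaned on EXPLAIN ANALYZE yet, here's a minimal sketch of the kind of check we run (assuming psycopg2; the table, column, and connection details are placeholders):

import psycopg2

# Placeholder connection details - in practice this goes through PgBouncer on port 6432.
conn = psycopg2.connect(host="127.0.0.1", port=6432, dbname="app", user="app")

query = "SELECT * FROM orders WHERE customer_id = %s ORDER BY created_at DESC LIMIT 20"

with conn, conn.cursor() as cur:
    # EXPLAIN ANALYZE actually runs the query and reports the real plan and timings.
    cur.execute("EXPLAIN ANALYZE " + query, (12345,))
    for (line,) in cur.fetchall():
        print(line)

A sequential scan showing up in that output is usually the cue to revisit the indexing strategy.]]></content:encoded>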
						                            <category domain="https://opsx.team/community/ai-devops/">AI DevOps</category>                        <dc:creator>Samantha Brown</dc:creator>
                        <guid isPermaLink="true">https://opsx.team/community/ai-devops/postgresql-performance-tuning-for-high-traffic-applications-135/</guid>
                    </item>
							        </channel>
        </rss>
		