100x

Software Development

About us

AI agents to troubleshoot complex software. Our AI agents analyze tickets, alerts, logs, metrics, traces, code, and knowledge to pinpoint problems and remediate production issues.

Website
https://100x.so
Industry
Software Development
Company size
2-10 employees
Type
Privately Held

Updates

  • 100x reposted this

    Anand Sainath 💯

    Co-Founder & Head of Engineering | Ex Moveworks, Tableau

    Building AI agents that actually work is not easy. At 100x, we build AI agents to troubleshoot complex software. Hear from our AI engineering team, Eddie (Yixin) G. and Swaraj Raibagi, on how we observe, instrument, and optimize our agents using Arize AI's Phoenix.

    From Arize AI:

    When incidents happen, fast resolution is everything. That’s why 100x (fresh out of stealth!) is building AI agents to help engineering teams troubleshoot with speed and precision. But to make these agents reliable, they need deep visibility into their performance.

    In this conversation, Dat (Daryl) Ngo (Arize AI) sits down with Eddie (Yixin) G. and Swaraj Raibagi to discuss how they use Phoenix for observability, tracing, and performance monitoring. From auto-instrumentation to OpenTelemetry integrations, they break down how Phoenix helps them close the gap between alerts and resolution.

    See the full conversation here: https://lnkd.in/gQ5_Ck6K Or check out the clip below to see how they're changing AI-driven troubleshooting. 👇
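
The conversation above mentions auto-instrumentation and OpenTelemetry integrations with Phoenix but does not show any setup. The following is a minimal, hypothetical sketch of what wiring an LLM-backed agent into Phoenix can look like, using the arize-phoenix OpenTelemetry helper and an OpenInference auto-instrumentor; the project name, collector endpoint, model, and prompt are placeholder assumptions, not 100x's actual configuration.

```python
# Minimal sketch: exporting agent traces to a Phoenix collector via OpenTelemetry.
# Assumes `arize-phoenix-otel`, `openinference-instrumentation-openai`, and `openai`
# are installed and OPENAI_API_KEY is set; names/endpoint below are placeholders.
from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor
from openai import OpenAI

# Register a tracer provider that exports spans to a running Phoenix instance.
tracer_provider = register(
    project_name="troubleshooting-agent",        # placeholder project name
    endpoint="http://localhost:6006/v1/traces",  # default local Phoenix collector
)

# Auto-instrument OpenAI client calls so each LLM request becomes a span.
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

# Any call made through the instrumented client is now traced.
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize this stack trace: ..."}],
)
print(response.choices[0].message.content)
```

Once instrumented this way, each LLM call appears as a span in the Phoenix UI, which is the kind of visibility into agent behavior the post describes.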

  • As we build agents for observability, observing our own agents is crucial - thanks to Arize for making this seamless!


  • Imagine a world where on-call is the absolute last resort, not the norm

    Rahul Kayala

    Founder & CEO | Ex- Moveworks, Microsoft, Apple

    On-call shouldn't be a rite of passage. It should be a last resort.

    Modern on-call principles:
    1. Systems should self-diagnose
    2. Context should be automatic
    3. Knowledge should compound
    4. Patterns should be recognized
    5. Prevention should be primary

    Shift from: "Who's on-call?" To: "What prevented the call?"

    Your engineers deserve better than 2am alerts. Your systems can provide it.

  • 100x reposted this

    Rahul Kayala

    Founder & CEO | Ex- Moveworks, Microsoft, Apple

    We analyzed 1000+ incidents. Here's what separated 15 min fixes from 2 hr outages:

    Most teams focus on getting better tooling or more data access. But here's what actually determines resolution speed:
    → Knowing which questions to ask first
    → Understanding what patterns indicate problems
    → Having context about past failure modes
    → Seeing how similar issues were solved before

    Think about your best engineer during an incident. They're following an investigation pattern built from years of experience:
    → "Last time this happened, it was a connection pool issue"
    → "When I see this error pattern, I usually check..."
    → "This metric spike typically means..."

    This is the "senior engineer algorithm" - and it's usually invisible to everyone else.

    Making these investigation patterns visible and reusable in the throes of an incident is huge:
    → Knowledge transfer happens naturally
    → New engineers learn actual debugging patterns
    → Teams discover common failure modes
    → Investigation steps become reusable

    These days, tools and data aren't the bottleneck. It's scaling your team's incident investigation knowledge.

  • 100x reposted this

    Rahul Kayala

    Founder & CEO | Ex- Moveworks, Microsoft, Apple

    Last year, I helped 10 elite engineering teams cut their incident response time by 60%. Here's what worked:

    Most "unexpected" incidents follow predictable patterns, so understanding the system is crucial. We did this by:

    Service Dependency Mapping
    → Documenting both direct and indirect dependencies
    → Mapping critical user flows to underlying services
    → Tracking SLO impact across service boundaries

    Resource Profiling
    → Establishing baseline behaviors for key services
    → Documenting peak traffic patterns and resource needs

    When every team monitors and alerts differently, debugging becomes exponentially harder. So, standardization became our next focus. This looked like a collection of templates & utilities to help individual engineering teams:
    → Creating ready-to-use monitoring and alert templates for common service types
    → Providing default dashboards teams can deploy in minutes
    → Improving signal to noise ratio by auto-identifying noisy alerts & flagging alerts with >80% auto-resolution

    We focused on getting the fundamentals right. This helps human engineers in the short term and AI operators in the long term. If you’re considering AI agents for DevOps / Incident Response, happy to share more learnings and best practices.
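
The post above mentions auto-identifying noisy alerts and flagging those with >80% auto-resolution, without showing how that figure is computed. Below is a rough, hypothetical sketch of that kind of heuristic only: it groups historical alert events by rule, computes the share that resolved without human action, and flags rules above a configurable threshold. The AlertEvent shape, field names, and helper names are illustrative assumptions, not the author's actual implementation.

```python
# Hypothetical sketch: flag alert rules whose firings overwhelmingly auto-resolve.
# The AlertEvent shape and the 0.8 threshold are illustrative assumptions.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class AlertEvent:
    rule: str            # name of the alerting rule that fired
    auto_resolved: bool  # True if the alert cleared without human action

def flag_noisy_rules(events: list[AlertEvent], threshold: float = 0.8) -> dict[str, float]:
    """Return {rule: auto-resolution rate} for rules whose rate exceeds the threshold."""
    totals: dict[str, int] = defaultdict(int)
    auto: dict[str, int] = defaultdict(int)
    for e in events:
        totals[e.rule] += 1
        if e.auto_resolved:
            auto[e.rule] += 1
    return {
        rule: auto[rule] / totals[rule]
        for rule in totals
        if auto[rule] / totals[rule] > threshold
    }

# Example: a rule that auto-resolves 9 times out of 10 gets flagged for tuning.
events = [AlertEvent("db-conn-pool-high", True)] * 9 + [AlertEvent("db-conn-pool-high", False)]
events += [AlertEvent("disk-full", False)] * 3
print(flag_noisy_rules(events))  # {'db-conn-pool-high': 0.9}
```

In practice the events would come from the alerting system's history rather than an in-memory list; the 0.8 threshold simply mirrors the >80% figure mentioned in the post.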
