How to Investigate the Post-Incident Fallout with Laura Maguire, PhD
Length:
31 minutes
Released:
Feb 8, 2022
Format:
Podcast episode
Description
About Laura
Laura leads the research program at Jeli.io. She has a Master’s degree in Human Factors & Systems Safety and a PhD in Cognitive Systems Engineering. Her doctoral work focused on distributed incident response practices in DevOps teams responsible for critical digital services. She was a researcher with the SNAFU Catchers Consortium from 2017-2020 and her research interests lie in resilience engineering, coordination design and enabling adaptive capacity across distributed work teams. As a backcountry skier and alpine climber, she also studies cognition & resilient performance in high risk, high consequence mountain environments.
Links:
Howie: The Post-Incident Guide: https://www.jeli.io/howie-the-post-incident-guide/
Jeli: https://www.jeli.io
Twitter: https://twitter.com/lauramdmaguire
Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.
Corey: Today’s episode is brought to you in part by our friends at MinIO the high-performance Kubernetes native object store that’s built for the multi-cloud, creating a consistent data storage layer for your public cloud instances, your private cloud instances, and even your edge instances, depending upon what the heck you’re defining those as, which depends probably on where you work. It’s getting that unified is one of the greatest challenges facing developers and architects today. It requires S3 compatibility, enterprise-grade security and resiliency, the speed to run any workload, and the footprint to run anywhere, and that’s exactly what MinIO offers. With superb read speeds in excess of 360 gigs and 100 megabyte binary that doesn’t eat all the data you’ve gotten on the system, it’s exactly what you’ve been looking for. Check it out today at min.io/download, and see for yourself. That’s min.io/download, and be sure to tell them that I sent you.
Corey: This episode is sponsored in part by our friends at Sysdig. Sysdig is the solution for securing DevOps. They have a blog post that went up recently about how an insecure AWS Lambda function could be used as a pivot point to get access into your environment. They’ve also gone deep in-depth with a bunch of other approaches to how DevOps and security are inextricably linked. To learn more, visit sysdig.com and tell them I sent you. That’s S-Y-S-D-I-G dot com. My thanks to them for their continued support of this ridiculous nonsense.
Corey: Welcome to Screaming in the Cloud. I’m Corey Quinn. One of the things that’s always been a treasure and a joy in working in production environments is things breaking. What do you do after the fact? How do you respond to that incident? Now, very often in my experience, you dive directly into the next incident because no one has time to actually fix the problems but just spend their entire careers firefighting. It turns out that there are apparently alternate ways. My guest today is Laura Maguire who leads the research program at Jeli, and her doctoral work focused on distributed incident response in DevOps teams responsible for critical digital services. Laura, thank you for joining me.
Laura: Happy to be here, Corey, thanks for having me.
Corey: I’m still just trying to wrap my head around the idea of there being a critical digital service, as someone whose primary output is, let’s be honest, shitposting. But that’s right, people do use the internet for things that are a bit more serious than making jokes that are at least funny only to me. So, what got you down this path? How did you get to be the person that you are in the industry and standing in the position you hold?
Laura: Yeah, I have had a long circuitous route to get to where I am today, but one of the common