Learning from Incidents is what good SREs do with Laura Nolan

Incidents happen! According to Laura Nolan, who was an SRE at Google and Slack, healthy organizations should take proper time to analyze and learn from them. Doing so improves future incident response as well as overall system resiliency.

Tune in to this episode to hear Laura's tips and tricks on what makes a good SRE organization. It starts with writing good incident reports and researching the incident reports of any software and services you are considering using. We also spent a good amount of time discussing root cause analysis, where she highlighted an incident that happened during her time at Google and what she learned from it about outdated alerting.

Thanks, Laura, for a great discussion and lots of insights.

Here are the additional links we discussed during the podcast:

Laura on LinkedIn: https://www.linkedin.com/in/laura-nolan-bb7429/
Laura on Twitter: https://twitter.com/lauralifts
Incident Template talk @ SRECon: https://www.usenix.org/conference/srecon22emea/presentation/nolan-break
What SRE could be talk @ SRECon: https://www.usenix.org/conference/srecon22emea/presentation/nolan-sre
Howie Post-Incident Guide: https://www.jeli.io/howie/welcome
My philosophy on Alerting article:

About the Podcast

The brutal truth about digital performance engineering and operations.

Andreas (aka Andi) Grabner and Brian Wilson are veterans of the digital performance world. Combined, they have seen too many applications fail to scale and perform up to expectations. With more rapid deployment models made possible through continuous delivery and a mentality shift sparked by DevOps, they feel it's time to share their stories. In each episode, they and their guests discuss different topics concerning performance, ranging from common performance problems for specific technology platforms to best practices in developing, testing, deploying and monitoring software performance and user experience. Be prepared to learn a lot about metrics.

Andi & Brian both work at Dynatrace, where they get to witness more real world customer performance issues than they can TPS report at.