Catchpoint, in partnership with Blameless, today published an annual survey of 559 site reliability engineers (SREs) that found 59% of respondents do not view tool sprawl as a major concern. Another 40% said tool sprawl is a moderate (33%) to serious (8%) problem.
More than half of respondents said they themselves build as much as 30% of the tools they use, and 54% noted their organization tracks three or more different types of telemetry data to drive observability. The top three types of telemetry data collected are infrastructure (62%), application (58%) and network monitoring (55%). The survey also suggested the median toil level of SREs dropped 5% year-over-year.
Just under half of respondents (46%) also noted they saw no or low value from artificial intelligence for IT operations (AIOps), compared to 30% who have seen moderate to high value. Another 25% are still unsure.
Finally, the survey concluded that organizations that have adopted a “just culture,” in which incident reporting is not met with punitive measures, are 500% more likely to be Elite performers in terms of the DevOps Research and Assessment (DORA) metrics defined by Google.
Leo Vasilou, director of product marketing for Catchpoint, said the survey results provide some insight into the challenges SREs face as they try to achieve DevOps goals at a time when hiring has become more constrained by an economic downturn.
That stress, however, has less to do with managing toolchains than it does with ensuring that applications driving critical business processes are consistently available, he noted.
SREs have emerged as a distinct DevOps function in recent years, focused almost exclusively on programmatically managing the dependencies between a wide range of applications and the underlying infrastructure on which they run. Optimizing applications in the cloud-native era has become more challenging because the number of dependencies between the microservices that make up a modern distributed computing environment continues to increase exponentially. In many cases, applications are continually added to IT environments without a commensurate increase in the overall size of the SRE team.
Regardless of the economic climate, hiring and retaining SREs remains a challenge given the demand for IT professionals with those skills. Of course, there is no widely recognized formal SRE certificate, so who qualifies as an SRE tends to vary from one organization to the next. The one thing that is generally agreed upon is that SREs automate a wide range of tasks that would otherwise require multiple IT administrators to perform. Glassdoor also reports that, on average, an SRE makes north of $125,000, which ranks among the highest salary levels within any IT organization.
Each organization, depending on its geographic region, would need to determine how many IT administrators it could replace if it hired an SRE. That doesn’t mean IT administrators who rely on a wide range of graphical tools to manage IT tasks are going away any time soon, but as DevOps continues to evolve, the dynamics between SREs and IT administrators will continue to evolve as well.