What if we could use an artificial intelligence (AI) engine to predict the next lines of software code to build? What if machine learning (ML) could trigger the creation of automated test cases to validate our solutions?
Automation has been a focus for the information and communications technology (ICT) industry for years, and some of its lessons could apply to the software developer community and to other verticals as well. Today, more streamlined execution of operational and software delivery processes, such as continuous integration, continuous delivery and continuous deployment (CI/CD) in app development, continues to drive new levels of cost reduction, allowing products to reach the market and be updated faster.
However, with the increased amount of data come more complex ecosystems and the need for the next level of automation. Hyperautomation, or the use of AI/ML to further streamline processes, could build on these CI/CD processes to revolutionize the way multiple industries deliver software and optimize operational processes.
Today’s Network Complexities and Challenges
In the telecommunications industry, disruptions from hyperscalers and other innovative players, as well as increasing levels of investment in 5G technology, are forcing the hand of communications service providers (CSPs). This is causing them to turn to automation to improve their levels of operational efficiency and innovation to keep up with the need to push frequent updates.
However, challenges remain in the form of:
- Total cost of ownership and operating costs: With more frequent needs for upgrading existing networks, additional levels of automation including AI/ML and hyperautomation are required to manage and maintain today’s complex networks.
- Time to market: In an increasingly competitive market, the CSP’s ability to react to market needs and new marketing ideas, and to rapidly launch offerings that address them for end users and enterprise customers, is a key competitive differentiator.
- Customer experience: Digitalizing business processes, providing an increased level of self-service and seeking ways to optimize customers’ overall journey remain important needs.
- Employee productivity: In an effort to increase revenue, run their networks more efficiently and improve the end-user customer experience, CSPs also need to increase employee productivity. They can do this by automating the simplest and most repetitive low-level tasks, which frees employees to solve bigger problems.
The Power of AI and ML
The automation journey started with breaking down complex processes into smaller parts and automating discrete parts using scripts, runbook automation (RBA) and/or robotic process automation (RPA).
With the introduction of AI emulating human intelligence through mechanical and computational processes and ML using algorithms that learn from data to make predictions and decisions, CSPs can now bring automation and hyperautomation to the next level.
The potential use cases for the delivery and maintenance of telco software solutions are numerous, and many of these lessons can be applied to other sectors. Some examples include:
Autonomous Fault and Anomaly Detection for Cloud-Native Solutions
Today’s network solutions are deployed on containers and microservices. Identifying issues and assuring service quality become increasingly challenging with thousands of workloads running in parallel. Container failure can cause severe service disruptions. In fact, a single service failure or abnormal behavior could have a butterfly effect on the overall system and negatively affect customer satisfaction.
A proactive, AI-based monitoring solution can continuously search for anomalous behavior in the environment. Different container metrics and logs can be analyzed automatically to detect problems proactively and help find the right corrective actions for the application workflows. This is based on identifying anomalies in the container orchestration metrics related to pods, workloads and namespaces. These techniques enable a more reliable and predictable application runtime environment with higher service quality, faster identification and resolution of system faults and, ultimately, a better overall customer experience.
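As a rough illustration of the idea, the sketch below flags anomalous samples in a single container metric using a rolling z-score. This is a deliberately simple statistical stand-in for the AI-based detector described above, not the method any particular product uses. The function name, the window size, the threshold and the sample CPU values are all assumptions made for this example.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate strongly from the trailing window."""
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as anomalous.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical per-minute CPU usage (millicores) for one pod.
pod_cpu = [210, 205, 198, 202, 207, 204, 950, 206, 201]
print(find_anomalies(pod_cpu))  # → [6]
```

In practice the same check would run per pod, workload and namespace, and a learned model would usually replace the fixed threshold.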
Machine Learning-Triggered Closed-Loop Automation
Large clusters of compute, storage and networking resources are managed throughout a data center to enable the deployment of custom workloads. Because applications running on virtualized and cloud infrastructure see usage highs and lows, infrastructure components must be deployed or the cluster reconfigured continually to ensure cost-efficient service availability.
A machine learning-based solution can detect anomalies in the infrastructure. Specifically, with closed-loop automation, enterprises can bring in a layer of intelligence over the current setup to monitor application performance and infrastructure metrics and identify anomalies that require immediate system attention. If anomalies are found, they can trigger the automated addition of capacity to the infrastructure. The solution keeps monitoring the metrics, and once the anomalies have subsided and the original capacity can handle the utilization again, it rolls back the added virtual or cloud-native functions. This allows for an efficient and fully automated use of network capacity, minimizing the need for human intervention.
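The monitor, scale-out and roll-back loop can be sketched as follows. This is a minimal illustration under stated assumptions: a fixed utilization threshold stands in for the machine learning-based anomaly detector, load and capacity are expressed in the same abstract units, and the function name and parameters are hypothetical.

```python
def run_closed_loop(load_samples, base_capacity=4, step=2, high_water=0.8):
    """Replay load samples and return the capacity chosen after each one."""
    capacity = base_capacity
    history = []
    for load in load_samples:
        if load / capacity > high_water:
            # Stand-in for "anomaly detected": add capacity.
            capacity += step
        elif capacity > base_capacity and load / base_capacity <= high_water:
            # The original capacity can handle the load again: roll back.
            capacity = base_capacity
        history.append(capacity)
    return history


# Hypothetical load trace with one usage peak.
loads = [2.0, 3.0, 3.6, 5.5, 5.0, 3.0, 2.0]
print(run_closed_loop(loads))  # → [4, 4, 6, 8, 8, 4, 4]
```

Rolling back only when the base capacity would stay under the same threshold avoids removing capacity that would immediately have to be added again.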
Smart Code Recommendations
One of the most clear-cut ways tools can reduce programmers’ software development effort is code completion. AI-powered code completion is a convenient way to access descriptions of functions and their parameter lists. These solutions also speed up software development by reducing keyboard input and the need to memorize names. They allow users to refer less frequently to external documentation, as interactive documentation appears dynamically in the form of tooltips. AI-powered code completion can also draw on an automatically generated in-memory database of classes, variable names and other constructs or references, built by applying machine learning to open code repositories.
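To make the in-memory database idea concrete, here is a minimal sketch that indexes identifiers from a few source snippets and ranks completions for a prefix by how often each name occurs. Frequency counting is only a stand-in for a learned ranking model, and the function names and the tiny corpus are assumptions made for this example.

```python
import re
from collections import Counter


def build_index(sources):
    """Count every identifier-like token found in the given source texts."""
    counts = Counter()
    for text in sources:
        counts.update(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text))
    return counts


def complete(index, prefix, limit=3):
    """Suggest names starting with prefix, most frequent first."""
    matches = [
        (name, n) for name, n in index.items()
        if name.startswith(prefix) and name != prefix
    ]
    # Sort by descending frequency, then alphabetically for stable output.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [name for name, _ in matches[:limit]]


# Hypothetical snippets standing in for a code repository.
corpus = [
    "def parse_config(path): return load_file(path)",
    "def parse_args(argv): return parse_config(argv[0])",
    "result = parse_config('app.yaml')",
]
index = build_index(corpus)
print(complete(index, "parse_"))  # → ['parse_config', 'parse_args']
```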
AI-Driven Test Recommendation
Significant time is spent understanding solution requirements and identifying critical scenarios for testing. This is usually followed by manual test plan creation and execution. To address this, AI/ML techniques can be applied to identify which tests need to be run during each cycle of software development, including automating the analysis of the required documentation. This can reduce overall test-cycle time and improve productivity.
AI-Enabled Test Automation
API testing is critical for validating application logic and needs to go beyond the GUI layer to validate the core of an application. This implies complexity because the protocols are usually difficult to understand, the API inventory is usually large and sequencing of the API calls is critical to a successful test specification and execution. With AI/ML and natural language processing techniques, the analysis of the solution test plan can be automated to obtain recommendations on what test frameworks to utilize, identify and suggest the parameters for each test case, identify the API call sequence and automate the development of test cases based on reusable functions. This can increase overall quality of service, simplify change management, improve the productivity of the testing team and offer more reliable test cases.
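Once the dependencies between API calls have been extracted from a test plan, one simple way to derive a valid call sequence is a topological sort. The sketch below assumes that extraction has already happened and uses the standard-library `graphlib` module (Python 3.9 or later). The API names and their dependencies are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical API calls mapped to the calls that must run before them.
dependencies = {
    "create_session": set(),
    "create_subscriber": {"create_session"},
    "activate_service": {"create_subscriber"},
    "delete_session": {"activate_service"},
}


def call_sequence(deps):
    """Return the API calls in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


order = call_sequence(dependencies)
print(order)
```

A cycle in the extracted dependencies raises `graphlib.CycleError`, which is a useful early signal that the test plan was misread.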
Test Failure Analytics
During execution of automated test cases, multiple failures can be observed in various logs. Manually analyzing the logs and correlating them with the application issues can be time-consuming. Also, the identification of unreliable tests impacting system stability is complex.
AI/ML techniques such as vectorization, K-means clustering or topic modeling can be applied to automate the collection of test logs across applications and to label failures, including clustering failed tests by type. The test subject matter expert (SME) is provided with a failure analysis report that automatically summarizes issues and problems from many logs across applications, simplifying the identification of issues in test automation areas or scripts. These techniques can significantly increase test productivity and reduce overall troubleshooting time.
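A minimal sketch of the vectorize-then-cluster step is shown below: each failure message becomes a bag-of-words vector, and a small K-means implementation groups similar failures. The log lines, the function names and the deterministic farthest-point initialization are assumptions made for this example. A real pipeline would more likely use TF-IDF vectors and a library implementation of K-means.

```python
from collections import Counter


def vectorize(logs):
    """Turn each log line into a bag-of-words count vector."""
    vocab = sorted({word for log in logs for word in log.lower().split()})
    vectors = []
    for log in logs:
        counts = Counter(log.lower().split())
        vectors.append([counts[word] for word in vocab])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=10):
    """Cluster vectors into k groups and return one label per vector."""
    # Deterministic start: first vector, then the farthest remaining ones.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid wins.
        labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical failure messages taken from test logs.
logs = [
    "connection timeout while calling billing service",
    "connection timeout while calling inventory service",
    "assertion failed expected 200 got 500",
    "assertion failed expected 201 got 500",
]
labels = kmeans(vectorize(logs), k=2)
print(labels)  # → [0, 0, 1, 1]
```

Each resulting label can then be summarized in the failure analysis report, for example as "timeouts" versus "assertion failures".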
Accelerated Root Cause Identification and Analysis (RCI&A)
When a problem occurs in a telco network, it tends to cause a ripple effect, affecting other integrated services and creating an alert storm. In such scenarios, root cause identification can become complex and cumbersome. Event aggregation and synchronization are needed to reduce environment complexity and variance.
Machine learning weighted analysis techniques can be used to determine how much influence individual IT metrics have on the overall health and performance of hosted applications and services. This makes it easier for the team operating the network to pinpoint the root cause of any service impact in real time. Efficient AI/ML techniques for accelerated RCI&A can significantly shorten the time needed to identify and resolve problems.
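The source does not specify which weighting technique is used, so the sketch below uses one simple stand-in: it weights each metric by the absolute Pearson correlation between its samples and a service health score, then ranks the metrics. The metric names, the health values and the function names are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_metrics(metrics, health):
    """Rank metrics by how strongly they move with the health score."""
    weights = {
        name: abs(pearson(values, health)) for name, values in metrics.items()
    }
    return sorted(weights.items(), key=lambda item: item[1], reverse=True)


# Hypothetical samples collected over the same six intervals.
metrics = {
    "db_latency_ms": [20, 22, 21, 80, 95, 90],
    "cpu_percent": [40, 42, 39, 45, 41, 43],
    "disk_io_mbps": [5, 5, 5, 5, 5, 5],
}
health = [0.99, 0.98, 0.99, 0.70, 0.60, 0.65]

ranking = rank_metrics(metrics, health)
for name, weight in ranking:
    print(f"{name}: {weight:.2f}")
```

Here the database latency ranks first, which points the operations team at the most likely contributor. Correlation alone does not prove causation, so the ranking is a starting point for investigation rather than a verdict.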
AI/ML Driving Operational Innovation and Market Agility
In telco, there is great potential for AI/ML-driven use cases to further improve the cloud-native journey and enable an IT-like software life cycle management process for the solutions of the future.
More broadly, highly automated and self-healing networks will drive higher levels of efficiency and customer satisfaction. This includes faster testing with fewer resources, more efficient utilization of network resources and cloud infrastructure, faster time to market with more software releases per year at similar capacity, and higher quality, enabling faster and more agile monetization of the CSPs’ capabilities.