When it comes to software development, data is king. From business metrics to network throughput, measuring the performance and quality of a product is crucial to understanding where and how to improve an application. The most popular way to collect and analyze application performance data is with application performance monitoring (APM) software.
APM tools are applications and services that help software application managers keep track of whether the infrastructure behind an application is performing as expected. While these platforms can help reduce the financial overhead of an application by fine-tuning hardware and software requirements, the ultimate goal of APM software is to ensure a high-quality end-user experience.
How does APM software work?
In a nutshell, APM software provides the tools necessary to quickly discover, isolate and solve performance issues that would otherwise hurt the end-user experience. While every tool works a little differently, most APM platforms aggregate application performance metrics from every available source, from log files to hardware component statistics to network throughput reports. In other words, if a part of the system generates information, APM tools collect and quantify it.
But APM tools don't just quantify and monitor the performance of an application. They also provide mechanisms for correlating data, identifying bottlenecks and even alerting stakeholders to potential problems within a system. All of these functions combined make up a platform that monitors and manages the health of an application's infrastructure, while also providing all of the information necessary for identifying the root cause of application performance issues.
APM tools aim to collect just about everything there is to know about a system and its dependencies, but the method of that collection can come in a number of different shapes and sizes, depending on the provider. Regardless of the collection method, most APM platforms collect data from three categories: hosting platform, application environment and supporting infrastructure.
With extensive research into APM software, TechTarget editors have focused this series of articles on vendors that offer APM capabilities as a separate platform rather than as part of a larger system. Our research included Gartner, Forrester and TechTarget surveys.
Whether it is hosted in an on-premises server rack or a cloud-based virtual environment, every application requires a platform to run on. While hardware has given way to virtualization, admins can still gather and analyze the same base metrics. Things like processor utilization, memory demands and disk read/write speeds can provide a clear picture about how well an application is running on its provided architecture.
It is important to note that hosting platform performance is one of the shallower aspects of APM that admins can aggregate. This is because hosting platform data is often little more valuable than the check engine light in a car. It can indicate that there is a problem, but far more information is needed to identify what that problem actually is.
Every server, virtual or otherwise, needs a CPU. The CPU is what determines the number of operations per second the server can perform. Generally speaking, this means that, if the CPU utilization on a given server is incredibly high, then application performance issues are virtually guaranteed to exist because the number of operations per second left for the application is drastically reduced. While this is an incredibly basic metric to track, it is one of the most useful for identifying what kind of issues an application is experiencing.
Where high processor utilization can limit the number of operations per second an application server can handle, high memory usage can bring a system to a grinding halt by limiting the amount of ephemeral data the CPU can store. Like processing power, memory is a finite resource on a machine, so identifying memory-intensive processes on an application server can go a long way toward optimization and stability.
The faster data can be read from -- and written to -- a disk, the faster an application can run. While the need to measure disk latency is becoming a thing of the past thanks to the cloud, it can still be a useful metric for identifying read/write-heavy processes and potential hardware issues.
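To make the check-engine-light idea concrete, here is a minimal Python sketch of a host-level check that flags which of the three base metrics is out of range. The `HostSample` type, the `flag_host_metrics` function and the limits are hypothetical and chosen only for illustration; a real APM platform would collect these values itself and use its own thresholds.

```python
from dataclasses import dataclass

# Hypothetical limits, picked for illustration only.
LIMITS = {"cpu_percent": 90.0, "memory_percent": 85.0, "disk_latency_ms": 20.0}


@dataclass
class HostSample:
    """One snapshot of the base hosting platform metrics."""
    cpu_percent: float
    memory_percent: float
    disk_latency_ms: float


def flag_host_metrics(sample, limits=LIMITS):
    """Return the names of the metrics that exceed their limit."""
    return [name for name, limit in limits.items() if getattr(sample, name) > limit]


sample = HostSample(cpu_percent=97.0, memory_percent=40.0, disk_latency_ms=3.5)
print(flag_host_metrics(sample))  # ['cpu_percent']
```

As with the warning light, the result says only that the CPU is the place to look, not which process or code path is responsible.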
In modern application development, containerization has become a favorite. Container technology -- think Docker -- is becoming so prevalent that the ability to monitor container management infrastructure is a near requirement in APM software. In addition to the standard hardware metrics, container management tools, like Docker Swarm, Kubernetes and Mesosphere, have introduced a whole suite of new server-level information to better identify issues within a running application.
Where the hosting platform is the hardware component of an application, the application environment is the software component. As discussed above, hardware issues are only a small part of APM. A larger aspect is the application itself, because, without it, there would be no cause to monitor anything. Application performance is a constant dance of hardware and software optimizations, and no matter how efficient and powerful the hardware is, poorly written software can cost a lot in both time and money.
APM tools help mitigate this risk by not only identifying issues in the hardware, but also within the software itself. While developers themselves must determine how much access they have to application-level information that an APM tool can monitor, there are a few interesting use cases to consider, such as error rates, automated load balancing statistics and code profiling.
Errors are unavoidable. That said, when they happen, it is important to gather as much information as possible about them. In the context of APM, admins can use this information to root out the cause of high processor or memory utilization, or provide additional context toward identifying the real culprit behind errors.
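As a rough illustration, error information can be reduced to an overall error rate plus a count per error type. The Python sketch below assumes a hypothetical record format (dictionaries with `level` and `type` keys); real APM tools define their own schemas.

```python
from collections import Counter


def error_summary(records):
    """Return (error_rate, [(error_type, count), ...]) for a list of log records."""
    errors = [r for r in records if r["level"] == "ERROR"]
    rate = len(errors) / len(records) if records else 0.0
    by_type = Counter(r["type"] for r in errors)
    return rate, by_type.most_common()


# Hypothetical log records for illustration.
records = [
    {"level": "INFO", "type": None},
    {"level": "ERROR", "type": "TimeoutError"},
    {"level": "INFO", "type": None},
    {"level": "ERROR", "type": "TimeoutError"},
    {"level": "ERROR", "type": "ValueError"},
    {"level": "INFO", "type": None},
]
rate, by_type = error_summary(records)
print(rate)     # 0.5
print(by_type)  # [('TimeoutError', 2), ('ValueError', 1)]
```

A spike in one error type that lines up with a spike in processor or memory utilization is the kind of correlation this breakdown is meant to make visible.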
Automated load balancing
Most modern applications rely on automated load balancing to deal with the highs and lows of processor utilization and network bandwidth. While this is an incredibly valuable way to maintain high uptime, it can mask performance issues by keeping the hosting platform metrics at reasonable numbers. To mitigate this, admins can configure APM tools to not only monitor the status of one server, but also monitor the status of every server. This configuration can provide valuable insight into the "true" hosting platform metrics, while also offering a way to mitigate unexpected infrastructure costs.
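The masking effect is easy to see with numbers. In the Python sketch below, the server names, CPU values and the 90% limit are all made up for illustration: the fleet average looks healthy, while a per-server view exposes the one machine that is struggling.

```python
from statistics import mean


def fleet_cpu_report(cpu_by_server, limit=90.0):
    """Compare the fleet-wide average with a per-server check."""
    average = mean(cpu_by_server.values())
    hot = sorted(name for name, value in cpu_by_server.items() if value > limit)
    return {"average": average, "hot_servers": hot}


# Hypothetical CPU utilization (percent) behind one load balancer.
fleet = {"web-1": 35.0, "web-2": 40.0, "web-3": 98.0, "web-4": 27.0}
print(fleet_cpu_report(fleet))  # {'average': 50.0, 'hot_servers': ['web-3']}
```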
Code execution profiling
Some APM tools take analysis to the next level by profiling the production code execution. This can come in the form of identifying bottlenecks in the execution of a memory-intensive process, or it can even highlight slow database queries and provide ways to speed them up. This type of analysis is one of the most valuable features an APM platform can provide, as it offers a second set of eyes on everything an application is doing.
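For a feel of what profiling output looks like, the following sketch uses Python's built-in `cProfile` and `pstats` modules on a deliberately slow, hypothetical function. This is a stand-in and not how any particular APM product works; commercial tools typically rely on their own agents and often use lower-overhead techniques in production.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive loop, used here as the code under inspection."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)  # five most expensive entries
    return result, buffer.getvalue()


result, report = profile_call(slow_sum, 100_000)
print(report)
```

The report lists each function with its call count and cumulative time, which is the raw material for spotting a bottleneck.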
While not every company has moved entirely into the cloud, it's a good bet that the vast majority of them have at least one foot in the proverbial waters. But this shift has introduced significantly more dependencies into the application development lifecycle. Content distribution networks, APIs, caches and databases are just some things that a successful application requires to thrive. Manually monitoring each service is simply impractical without the use of an APM platform.
It is rare for applications at scale to not require some sort of network component. Whether it is communicating with dependent services or directly with the end user, reliable network communication is paramount to application performance. Because of this, most APM tools offer mechanisms for monitoring not only network latency, but also the number of requests -- both incoming and outgoing -- as well as the response status of those requests.
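The three network measurements mentioned above (latency, request counts by direction and response status) can be summarized in a few lines. The Python sketch below assumes a hypothetical request record with `direction`, `status` and `latency_ms` fields; it is an illustration of the idea rather than any vendor's data model.

```python
from collections import Counter
from statistics import median


def summarize_requests(requests):
    """Summarize latency, direction and status class for a batch of requests."""
    latencies = [r["latency_ms"] for r in requests]
    return {
        "count": len(requests),
        "median_latency_ms": median(latencies) if latencies else None,
        "by_direction": dict(Counter(r["direction"] for r in requests)),
        "by_status_class": dict(Counter(f"{r['status'] // 100}xx" for r in requests)),
    }


# Hypothetical request records for illustration.
requests = [
    {"direction": "incoming", "status": 200, "latency_ms": 12},
    {"direction": "incoming", "status": 500, "latency_ms": 250},
    {"direction": "outgoing", "status": 200, "latency_ms": 40},
    {"direction": "outgoing", "status": 404, "latency_ms": 35},
]
summary = summarize_requests(requests)
print(summary["median_latency_ms"])  # 37.5
print(summary["by_status_class"])    # {'2xx': 2, '5xx': 1, '4xx': 1}
```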
Caches, databases and data stores, while functionally different, all serve the same fundamental purpose: to store data. While some APM software can identify performance problems in complex database queries, it can also identify larger issues in caching and simple storage platforms, as well as surface related details, such as communication or quota information.
AI in APM software
AI is the hottest buzzword in software sales and with good reason: It saves time and money. To illustrate the value of AI, consider large-scale closed-circuit TV systems. While they provide some value in identifying problems after they've happened, they don't identify problems as they are happening. This is because, at scale, it is virtually impossible to have human eyes on every camera feed. The same is true of the metrics an APM platform collects: There are simply too many for a person to watch in real time. AI helps solve this problem by providing an intelligent and automated way to track and triage application performance issues.
Although AI is a relative newcomer to APM, the ultimate goal behind adding some automated intelligence to these platforms is to save time. A great example of how admins can use AI to more efficiently monitor application performance is through automated alerting thresholds. Rather than manually setting thresholds for critical alerts, admins can use machine learning to identify and set the optimal threshold for these alerts. In addition, they can use this same automated intelligence to automatically classify and rank issues by order of severity and even diagnose and determine the root cause of a particular performance issue. As applications become larger and more distributed, these features will be more than just useful; they will be critical.
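To show the general idea behind an automated threshold, the Python sketch below derives an alert level from historical values instead of a hand-picked number. It uses a simple statistical baseline (mean plus a multiple of the standard deviation), which is a stand-in for, not an example of, the machine learning a commercial APM product might apply. The function name, the multiplier `k` and the sample data are all assumptions.

```python
from statistics import mean, stdev


def suggest_threshold(history, k=3.0):
    """Suggest an alert threshold k standard deviations above the historical mean.

    Needs at least two data points, because stdev is undefined for fewer.
    """
    return mean(history) + k * stdev(history)


# Hypothetical response times in milliseconds from a quiet period.
history = [100, 110, 90, 105, 95]
threshold = suggest_threshold(history)

new_value = 180
if new_value > threshold:
    print(f"alert: {new_value} ms exceeds learned threshold of {threshold:.1f} ms")
```

Recomputing the threshold as new data arrives lets the alert level follow the application's normal behavior rather than a value someone set once and forgot.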