Application Performance Management
Application Performance Management (APM) can be defined as the workflow and related IT tools deployed to detect, diagnose, remedy, and report on application performance issues, so that application performance meets or exceeds the expectations of end users and the business. Application performance relates to how fast transactions are completed on behalf of the end user, or how fast information is delivered to the end user, by the application via a particular network, application, and Web services infrastructure. APM can be applied to both packaged and custom applications. One particular variety, Web APM, focuses on managing Web applications. While most APM tools are designed for client-server environments, newer tools address the needs of dynamic Web applications. These needs include monitoring application performance as experienced by end users, business groups, and Web services, and relating potential performance issues to points inside or outside the data center for speedy problem diagnosis and resolution.

METHODOLOGY IN APPLICATION PERFORMANCE MANAGEMENT

Monitoring Service Response Time

In a Web application server, service response time can be treated as a measure of customer satisfaction. Even if a system contains bugs, a bug that does not affect service response time or the site's functionality cannot be seen as a problem. Conversely, even if no bug is found in the system, a service response time that is too slow to satisfy customers means the system itself has a problem and cannot be considered normal. Service response time is therefore an important source of information for measuring a system's stability and diagnosing system problems. The following sections describe how to use service response time to resolve system performance issues, and why monitoring system resources alone is not the correct approach to Application Performance Management.

Resource Usage Cannot Exceed 100%

System resource usage cannot exceed 100%.
This means that system resource usage alone cannot be used to diagnose system capacity. Consider a situation where vmstat is used to monitor CPU usage, and CPU usage is constantly very high, at 95 to 100%. Is this a problem? Most system administrators cannot tell. All they can say is that the CPU is being used heavily. By monitoring system resources alone, administrators cannot determine whether the number of incoming requests exceeds system capacity. For example, suppose it takes 20 concurrent requests to max out the CPU of a Web application server (WAS). What if there are 30 concurrent requests? Whether there are 20 or 30 requests, CPU usage will be 100% in both cases. And of course, administrators usually do not know in advance how many concurrent requests will max out a given system resource.

Monitoring all system resources is inefficient.

Another limitation of resource monitoring is that there are too many things to monitor. Any given system contains many hardware and software resources, such as CPU, memory, network, heap, and connection pools. Monitoring all of these resources individually is inefficient (and probably impossible), and it is not really necessary either.

Incoming requests exceeding system capacity result in delayed response time.

To overcome the limitations of system resource monitoring, service response time must be monitored. Once incoming requests exceed system capacity, service response time keeps increasing, which tells administrators that a resource shortage exists somewhere in the system. Since response time increases whenever any system resource is lacking, response time can serve as a single indicator of resource shortage.

Response time must be measured per transaction.

How, then, should service response time be measured? Before discussing this point, let's look at the relationship between a service and its resources.
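Before moving on, the saturation argument above (20 versus 30 concurrent requests) can be illustrated with a toy model. This is only a sketch under stated assumptions: the function names, the 1-second base response time, and the simple proportional slowdown are hypothetical and are not taken from any APM product or measurement. The only number borrowed from the text is the saturation point of 20 concurrent requests.

```python
# Toy model of a server whose CPU saturates at a fixed number of
# concurrent requests. All names and numbers are illustrative assumptions.

def cpu_usage_percent(concurrent_requests: int, saturation_point: int) -> float:
    """CPU usage grows with load but is capped at 100%."""
    return min(100.0, 100.0 * concurrent_requests / saturation_point)


def response_time_seconds(
    concurrent_requests: int, saturation_point: int, base_seconds: float
) -> float:
    """Below saturation each request takes the base time.

    Above saturation the requests share the CPU, so each one is assumed
    to take proportionally longer.
    """
    overload_factor = max(1.0, concurrent_requests / saturation_point)
    return base_seconds * overload_factor


SATURATION_POINT = 20  # concurrent requests that max out the CPU (from the text)
BASE_SECONDS = 1.0     # assumed response time of an unloaded request

if __name__ == "__main__":
    for load in (10, 20, 30):
        cpu = cpu_usage_percent(load, SATURATION_POINT)
        rt = response_time_seconds(load, SATURATION_POINT, BASE_SECONDS)
        print(f"{load} requests: CPU {cpu:.0f}%, response time {rt:.2f}s")
```

In this model the CPU reading is the same 100% at 20 and at 30 requests, while the response time differs (1.00s versus 1.50s), which is the reason the text prefers response time as the signal.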
In a web system, a service may interact with many different components, such as classes, databases, LDAP, and files. When the system resources tied to each component are combined, the number of resources involved can be very large. Also, some resources are used only by specific requests, while others are shared by many different services. In short, the relationship between resources and services is an N:M relation that cannot be clearly defined. In a line graph, this N:M relationship can only be expressed as an average response time grouped by service name or functional category. Instead, each individual transaction must be plotted separately.

There are a few reasons why service response time must be measured individually rather than in groups. First, when identical services are executed multiple times, the response time may be delayed for specific transactions only. No matter how the grouping is done, an individual response time is diluted when it is averaged with the others in its group. Second, there is a mapping issue between response time and profiling. If the grouping is done by service name, the mapping makes some sense, but if the grouping is done by business object, the mapping becomes too complicated to use effectively. Third, services cannot be classified easily by name. Since the service name is determined by the initial request that called the service, it does not capture the internal changes that occur during processing. Grouping transactions whose behavior changes dynamically during processing, simply because they share the same service name, is not a very effective way to group them.
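The first reason above, that averaging dilutes an individual slow transaction, can be seen in a small sketch. The response times, the 3-second threshold, and the helper name are all hypothetical values chosen for illustration; they are not data from a real system.

```python
from statistics import mean

# Hypothetical response times, in seconds, for ten executions of the
# same service. One execution (index 6) is badly delayed.
RESPONSE_TIMES = [0.2, 0.3, 0.2, 0.25, 0.3, 0.2, 9.0, 0.25, 0.3, 0.2]

# Assumed limit above which a single transaction counts as too slow.
SLOW_THRESHOLD_SECONDS = 3.0


def slow_transactions(times, threshold):
    """Return (index, seconds) for every transaction slower than threshold."""
    return [(i, t) for i, t in enumerate(times) if t > threshold]


average = mean(RESPONSE_TIMES)
slow = slow_transactions(RESPONSE_TIMES, SLOW_THRESHOLD_SECONDS)

if __name__ == "__main__":
    # The grouped average stays under the threshold and hides the delay.
    print(f"average: {average:.2f}s, looks slow: {average > SLOW_THRESHOLD_SECONDS}")
    # Looking at each transaction separately exposes the delayed one.
    print(f"slow transactions: {slow}")
```

With these assumed values the average is about 1.12 seconds, well under the threshold, while the per-transaction view still reports the single 9-second execution.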