Application performance has grown increasingly difficult to manage in the OS/390 and z/OS environments. Contributing factors to these difficulties include corporate downsizing, a dwindling mainframe knowledge base, and rapid improvements to hardware and software.
However, we cannot forget the large number of application changes currently underway at many companies. These changes are occurring to provide access to legacy systems and legacy data from new Web and wireless business systems. Additionally, in most cases, application programmers make changes to support these initiatives with little consideration for performance. To them, performance is an issue for the tech support staff or systems programmers.
Through attrition and divergent corporate staffing requirements, the number of mainframe-skilled performance analysts continues to diminish. These shrinking numbers mean that fewer people are keeping an eye on system performance, which typically turns performance management into a firefighting exercise during peak periods, when slow response becomes embarrassingly noticeable to users and customers. Even when an application is well tuned, program and file attributes change over time, and even the smallest change can cause a performance problem. Performance “ignorance” reduces application availability, escalates costs through extraneous processing and surges in batch processing with excessive CPU utilization and unnecessary job wait time, and results in unnecessary hardware upgrades. In addition, especially for the Web and wireless applications that access legacy systems, poor performance, and the poor response times that come with it, can bring a business to its knees.
However, it is possible to realize performance improvements and consistent processing efficiency in the OS/390 and z/OS environment using several methods: methods that do not require additional mainframe-expert staffing resources and that can reduce constant battles with application development teams. These methods address and overcome the major causes of performance degradation that typically remain unnoticed until peak processing periods: hidden programming errors, improper or inefficient DB2 database calls, and poor use of VSAM buffers.
Copyright 2002 Technical Enterprises, Inc. Reprinted with permission from Technical Support Magazine.
Finding Hidden Programming Errors
Application programmers rarely believe in hidden programming errors. Denial remains strong after SDSF reveals that a single program is reaching 80 percent CPU utilization. Moreover, even when they do admit to the existence of a performance problem, such an admission is usually tethered to the claim that performance is a systems problem, not an application issue.
It is possible to resolve the finger-pointing dilemma by identifying the hidden programming error. First, it is a good idea to validate SDSF and look at CPU utilization within an address space. While this high-level task may seem redundant in contrast to the SDSF statistics, it provides an overview of the entire address space and helps to reveal whether the program in question has been impacted by any other activity. For example, such an analysis can indicate that a program is consuming more CPU resources than necessary, causing several other programs to be in a wait state. It is the facts that matter most, and this task confirms whether the program in question truly is experiencing a problem. Most importantly, this task helps provide some of the information needed to pinpoint the cause of the problem.
After validating the SDSF statistics, it usually becomes necessary to explore further: CSECT utilization within the load module, or instruction or statement utilization within the CSECT. Information such as the percentage of execution in supervisor state versus problem program state helps to determine whether the problem resides within the program code.
Taking an analysis to the instruction level makes it possible to determine exactly which instruction is consuming resources, both within the context of a specific CSECT and within the context of total CPU. For example, while a higher-level CSECT analysis may reveal that CSECT IXLM8MR is using 20 percent of total CPU, an instruction analysis can clearly indicate that one specific instruction within the CSECT accounts for 100 percent of that CSECT’s 20 percent. Additionally, for COBOL programs, a statement analysis is beneficial for clearly revealing the statement within the program that may be causing the problem.
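The relationship between the CSECT-level and instruction-level figures can be illustrated with a small Python sketch. It assumes a hypothetical list of CPU samples, each recorded as a (CSECT name, instruction offset) pair; the function names, the offsets, and the `OTHERCS` CSECT are invented for illustration and do not represent the output format of any particular monitoring tool.

```python
from collections import Counter

def csect_shares(samples):
    """Return each CSECT's fraction of all CPU samples."""
    total = len(samples)
    counts = Counter(csect for csect, _ in samples)
    return {csect: n / total for csect, n in counts.items()}

def instruction_shares(samples, csect):
    """For one CSECT, map each instruction offset to a pair:
    (share of that CSECT's samples, share of all samples)."""
    total = len(samples)
    offsets = [off for name, off in samples if name == csect]
    counts = Counter(offsets)
    return {off: (n / len(offsets), n / total) for off, n in counts.items()}

# Hypothetical data: 100 samples, 20 of them in IXLM8MR at a single offset.
samples = [("IXLM8MR", 0x1A4)] * 20 + [("OTHERCS", 0x010)] * 80

print(csect_shares(samples)["IXLM8MR"])               # 0.2
print(instruction_shares(samples, "IXLM8MR")[0x1A4])  # (1.0, 0.2)
```

Here the single offset accounts for all of the CSECT's samples (1.0) and therefore for the CSECT's entire 20 percent of total CPU (0.2), which is the situation described in the example above.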
These activities determine the root of the problem by producing information that is difficult to dispute. Moreover, with the facts in hand, it becomes possible to accurately assign responsibility for correcting the problem.
Resolving Improper or Inefficient DB2 Database Calls
Improper DB2 database calls often cause a substantial drain on the operating system. In many instances, the root cause of the performance drain remains hidden behind what appears to be slow program processing. However, a detailed analysis helps to reveal that the actual performance culprit may reside within improper SQL statements or invalid pointers to tables and indexes. Utilizing the EXPLAIN data can help solve these problems.
For SQL errors, it is vital to pinpoint not only the exact statement, but exactly how that statement is being used within the context of the batch job or CICS transaction, the DB2 subsystem, the authorization ID, the DB2 plan, and the DBRM. It is equally important to understand whether the statement is dynamic or static, as well as the type of operation being performed: delete, insert, select, or update. Most importantly, to validate that the problem is actually occurring within the SQL statement, it is important to know the actual percentage of CPU time used by that statement. Understanding CPU usage by SQL statement helps identify the statements that require further research through an analysis of EXPLAIN data.
Analyzing the EXPLAIN data associated with an SQL statement provides the means to increase the granularity of research and to determine whether problems exist within DB2 tables that result in performance degradation. An EXPLAIN data analysis further helps determine whether SQL statements need improvement. It is also important to understand whether the method employed for accessing a table is the cause of degradation. For example, a tablespace scan with no index may result in unnecessary overhead. When it is discovered that a table is lacking a suitable index, you can alert the development team that a change should be made to how the table is accessed. An analysis can also help reveal the accuracy of a table definition. For example, an incomplete definition, such as a missing primary index, may result in an inordinate amount of extraneous processing that consumes resources.
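As a rough illustration of this kind of check, the Python sketch below scans rows taken from the DB2 PLAN_TABLE and flags those with an ACCESSTYPE of `R`, the value DB2 records for a tablespace scan (`I` indicates index access). The column names are standard PLAN_TABLE columns, but the rows, table names, and index name are hypothetical, and the rows are assumed to have already been fetched into Python dictionaries by some other means.

```python
def find_tablespace_scans(plan_rows):
    """Return the PLAN_TABLE rows whose access path is a tablespace scan."""
    return [row for row in plan_rows if row["ACCESSTYPE"] == "R"]

# Hypothetical EXPLAIN output for two statements.
plan_rows = [
    {"QUERYNO": 10, "TNAME": "ORDERS", "ACCESSTYPE": "R",
     "MATCHCOLS": 0, "ACCESSNAME": ""},
    {"QUERYNO": 20, "TNAME": "CUSTOMER", "ACCESSTYPE": "I",
     "MATCHCOLS": 1, "ACCESSNAME": "XCUST01"},
]

for row in find_tablespace_scans(plan_rows):
    print(f"Query {row['QUERYNO']}: tablespace scan on {row['TNAME']}")
```

A flagged row is only a candidate for further research; a scan of a small table can be a perfectly reasonable access path.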
Improving the Use of VSAM Buffers
OS/390 and z/OS offer two ways of buffering VSAM data. The first method uses non-shared resources (NSR), where a buffer pool is dedicated to the processing of a single VSAM file. The second method uses local shared resources (LSR), where the buffer pool is shared among multiple VSAM files. While both NSR and LSR can be defined within an application, there is a significant difference in how well each suits a given access pattern: sequential processing of data works best using NSR, and random processing works best using LSR. An incorrect buffering strategy results in a significant increase in I/O, which in turn causes long-running batch jobs and poor performance. The correct buffering strategy keeps I/O to a minimum, which brings greater CPU efficiency and improved elapsed or response time.
Does this mean that applications need to be changed? Sometimes they do. First, however, it is vital to learn whether the performance-plagued application is experiencing a problem due to buffering or whether the problem exists elsewhere within the operating system. That in turn requires understanding the type, access method, format, and options of the file. For example, is the VSAM file a KSDS? Is the file an RRDS? Is it extended format or extended format compressed? Are the buffering options NSR or LSR? Do the dataset options call for keyed access (KEY), sequential access (SEQ), addressed access (ADR), or control interval access (CNV)? Comprehensive knowledge of the construct and use of the file provides the information needed to determine whether the application requires modification.
However, with dwindling resources and time, it often proves cost-ineffective and time-ineffective to perform these tasks. Instead, another option is to seek a method to dynamically optimize all VSAM, regardless of access strategy, through a series of rules based on device type, access type and file attributes. These rules can be used to build VSAM resource pools for shared resources, defer direct PUTs until the buffer is needed, save DASD space by allowing Sequential Insert Strategy and reduce Start I/Os. Rules can be used to dynamically override incorrect or inaccurate application program definitions while performing VSAM Sequential I/O load leveling. Sequential I/O load leveling balances the sequential I/O load of various batch jobs, ensuring that the overall workload completes in an optimum period.
For example, regardless of the definitions within an application program, random-access VSAM processing can be automatically directed to use LSR, which eliminates buffer stealing and exploits look-aside processing, ESA Hiperspaces, and in-core indexes. Additionally, rules can be established to ensure creation of the optimal number of buffers to achieve maximum performance. For sequential processing, rules can be established to ensure allocation of the correct number of NSR buffers for look-ahead processing.
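The core of such a rule, matching the buffering technique to the access pattern regardless of what the program requested, can be sketched in a few lines of Python. This is a minimal illustration of the idea only: the `VsamOpen` record, its field values, and the dataset names are hypothetical, and a real rule set would also weigh device type and file attributes and would determine buffer counts.

```python
from dataclasses import dataclass

@dataclass
class VsamOpen:
    dataset: str      # dataset name (hypothetical)
    pattern: str      # observed access pattern: "sequential" or "random"
    requested: str    # buffering the program asked for: "NSR" or "LSR"

def choose_buffering(request):
    """Return (technique, overridden) for one file open.

    Rule taken from the text: sequential access works best with NSR,
    random access works best with LSR.
    """
    best = "NSR" if request.pattern == "sequential" else "LSR"
    return best, best != request.requested

opens = [
    VsamOpen("PROD.CUSTOMER.KSDS", "random", "NSR"),
    VsamOpen("PROD.HISTORY.ESDS", "sequential", "NSR"),
]
for req in opens:
    technique, overridden = choose_buffering(req)
    note = " (overrides program definition)" if overridden else ""
    print(f"{req.dataset}: {technique}{note}")
```

In this example the randomly accessed file is redirected from NSR to LSR, while the sequentially processed file keeps the NSR buffering it requested.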
Of course, it is well known that the fastest I/O is the one that is never issued. That is to say, an application that is having I/O performance problems will perform better if the number of I/Os can be cut down. The whole concept behind SMS automatically determining block sizes according to the DASD device was to reduce the number of I/Os for that specific device type. Similarly, for VSAM data, automatic, effective buffering can reduce the number of I/Os, thereby improving application performance.
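The effect of block size on I/O counts is simple arithmetic, shown here as a short Python sketch. It assumes fixed-length records and, as a deliberate simplification, one I/O per block; the record count is made up, and 27,920 is used as an example of a half-track block size for 80-byte records on a 3390 device.

```python
import math

def block_count(records, lrecl, blksize):
    """Number of blocks needed to hold fixed-length records.

    Treated here as the number of I/Os, assuming one I/O per block.
    """
    records_per_block = blksize // lrecl
    return math.ceil(records / records_per_block)

# One million 80-byte records, unblocked versus half-track blocked.
print(block_count(1_000_000, 80, 80))      # 1000000
print(block_count(1_000_000, 80, 27_920))  # 2866
```

Under these assumptions, blocking the same data cuts the I/O count from one million to fewer than three thousand, which is the kind of saving that effective buffering aims for with VSAM data.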
Performance in the Future
It is highly likely that OS/390 and z/OS performance management issues will continue to increase in future years, especially as more business systems seek to tap into mainframe resources. Even with the forthcoming 64-bit register CPUs that provide increased processing capabilities, performance issues will continue to exist. Just take a look at how many programs today still operate “below the line” and experience performance issues.
Performance will always remain an issue for tech support and systems programming teams. It is up to those teams to remain vigilant and to analyze, identify, and gather the necessary information to correct the problems, especially those caused by poor development practices. Employing methods that address and overcome the major causes of performance degradation remains a vital and necessary task.