Performance data is collected from several sources.
To make debug information and transformation reports available, refer to the section on building with the required compiler options.
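As an illustrative sketch only (myapp.c is a placeholder; the full set of required options is listed in that section), a build that preserves debug information with an XL compiler might look like this:

    # -g emits the debug information used to map sampled addresses
    # back to source lines; -O2 keeps optimization enabled.
    xlc -g -O2 myapp.c -o myapp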
A simple change to the build or runtime environment can often have a large impact on an application's performance without any change to the application's source code. The build and runtime hosts can be analyzed and "scored" based on several criteria, including hardware level, OS level, compiler version, and build options. A report called the System Scorecard is generated, which provides recommendations for improving the system configuration for better application performance.
The raw performance data collected for a running application comes from low-level system tools that are based on sampling. For example, tprof continually records samples of the state of the processor's instruction pointer, which contains the memory address of the currently executing instruction. After the performance run is complete, the compiler debug information is used to map the recorded addresses back to their corresponding source code lines. The sample rate is configurable but is typically once every 10 milliseconds. Each sample is called a tick.
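As a minimal sketch, assuming a hypothetical binary named myapp, a tprof profiling run on AIX can be started with the -x option, which executes and profiles the given command:

    # Profile the system while ./myapp runs; ticks are attributed to
    # the processes that were executing when each sample was taken.
    tprof -x ./myapp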
Sampling-based performance data collection does have its drawbacks. There could be several processes executing on the target system in addition to the application being profiled, and it is common for many of the ticks to be attributed to those competing processes. For this reason it is recommended to run on a relatively "quiet" system. An ideal situation would be a machine or LPAR dedicated to performance testing.
The ticks that correspond to the application being profiled show up under the "My Application" node in the Hotspots Browser. All other ticks are assigned to the "Other Processes" node. On AIX, system idle time for each processor is attributed to a process named wait, so "Other Processes" will always contain a wait process for each processor. If a significant number of ticks are being attributed to other non-idle processes, consider reducing the system load during subsequent performance runs if possible.
Since the data is based on sampling, increasing the number of samples increases the statistical significance of the collected data. It is recommended that the application being profiled run for at least 30 seconds, and preferably much longer. This can be done by choosing an input set that causes the application to run for an extended length of time, or by using a script to run the application in a loop, as in the sketch below.
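For instance, a short-running workload could be repeated with a small driver script; myapp and input.dat are placeholders:

    # Repeat a short-running workload so the profiled run lasts long
    # enough to accumulate a statistically useful number of samples.
    i=0
    while [ "$i" -lt 100 ]; do
        ./myapp input.dat
        i=$((i + 1))
    done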
By default, ticks are not displayed. Instead, hotness is represented as a percentage relative to a scope, such as the application or the entire system. To enable the display of ticks, open the preferences dialog from the main menu and navigate to the appropriate preferences page. Under Timing Data, select the option to show both percentage and ticks.
Call stack sampling data is also collected by low-level system tools: procstack on AIX and OProfile on Linux.
Approximately once every second, the application's call stack is sampled and all of the application's currently executing functions are recorded. After the performance run is complete, this information can be analyzed using the Invocations Browser.
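For reference, procstack prints the call stacks of a running process given its process ID, which is essentially what the periodic collection does; 12345 is a placeholder PID:

    # Print the current call stack of every thread in process 12345.
    procstack 12345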
XLC compilers can produce XML report files that describe the optimizations performed during compilation. These reports are not strictly required, but if they are generated, more information will be available for analysis. One of the most interesting things these reports reveal is the location of function calls that were inlined during compilation.
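As a hedged example, recent XL compilers accept a -qlistfmt option for requesting these XML reports; the available suboptions vary by compiler version, so verify against your compiler's documentation (myapp.c is a placeholder):

    # Request XML report files during compilation; the inlines
    # suboption (where supported) focuses on inlining decisions.
    xlc -O3 -qlistfmt=xml=inlines -c myapp.c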