Introduction – Performance is primarily about the usage of resources (e.g., processors, memory, I/O), so understanding what the resources of interest are is a good first step. This article looks at resources from a high level in order to make some stereotypical statements about them. Most of the information presented here is obvious, but perhaps you will gain some unexpected insights.
Focusing on performance, I will highlight what measurements are common. Counts track numbers of requests and lengths of queues. Time intervals measure how long resources are held, how long requests wait, and how much time transpires between requests. The two are combined to express rates. When these aspects are measured and studied, the performance of the system can be understood and efficiently managed.
Characteristics and Performance Parameters – Each resource has characteristics that describe aspects of the resource, and performance parameters are associated with those characteristics. The two most fundamental characteristics involve time and space. Time concerns durations in which the resource is involved. Common performance parameters are latency and response time; other examples are duration, elapsed time, busy time, and idle time. Space concerns the data or work that the resource interacts with. Common performance parameters are capacity and throughput. Throughput is a rate (i.e., count/time). Capacity can be a rate or a simple measurement (i.e., a size). Utilization is a performance parameter that is a ratio of either times (e.g., busy time to elapsed time) or spaces (e.g., space used to capacity). Utilizations can be expressed over time as well.
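As a minimal sketch of the two definitions above, throughput can be computed as a count divided by a time interval, and time-based utilization as busy time divided by elapsed time. The function names and the sample numbers are illustrative assumptions, not measurements from any real system.

```python
def throughput(completed_requests: int, interval_s: float) -> float:
    """Rate of completed work: a count divided by a time interval."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return completed_requests / interval_s


def time_utilization(busy_s: float, interval_s: float) -> float:
    """Fraction of the interval during which the resource was in use."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return busy_s / interval_s


# Hypothetical observation: 1200 requests completed in a 60-second window,
# during which the resource was busy for 45 seconds.
requests_per_second = throughput(1200, 60.0)   # 20.0 requests per second
busy_fraction = time_utilization(45.0, 60.0)   # 0.75, i.e., 75% utilized
```

Both values depend on the interval chosen, which is one reason utilization is usually reported over time rather than as a single number.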
Resources have physical characteristics (i.e., what they are made of) which contribute to their financial cost. This cost can often be translated into terms of some fundamental performance parameter (e.g., cost/time, cost/space). Physical things can break or fail, so the reliability of a resource is important. This is often expressed as a duration until failure is likely to occur. Availability is a performance parameter that is a ratio of times expressing how often the resource can perform its function. Non-availability can occur due to lack of space or due to lack of reliability.
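One common way to express availability as a ratio of times is uptime divided by total time. The sketch below assumes that reliability is summarized as a mean time to failure and that a mean repair time is also known; the repair time, the parameter names, and the sample values are assumptions added for illustration and do not come from the discussion above.

```python
def availability(mean_time_to_failure_h: float, mean_time_to_repair_h: float) -> float:
    """Fraction of time the resource can perform its function.

    Assumes the resource alternates between working periods (averaging
    mean_time_to_failure_h) and repair periods (averaging mean_time_to_repair_h).
    """
    total = mean_time_to_failure_h + mean_time_to_repair_h
    if total <= 0:
        raise ValueError("the two durations must sum to a positive value")
    return mean_time_to_failure_h / total


# Hypothetical device: fails on average every 990 hours, takes 10 hours to repair.
device_availability = availability(990.0, 10.0)  # 0.99
```

This covers only non-availability due to failure; non-availability due to lack of space would need a separate measurement.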
Resources require consumables (e.g., power). The consumables can be expressed as a performance parameter. Resources often have emissions (e.g., heat, sound/vibration). The emissions can also be expressed as a performance parameter. Coolant can be a consumable if there is too much heat as an emission. It is of general interest to minimize consumables and emissions, especially in certain computing domains (e.g., mobile devices). [Hennessy:2012] describes a situation in which a chip’s performance can change based upon the temperature that it is operating at because parts of the chip (i.e., processors) turn off.
Most resources are serial in nature but can be shared through a variety of means. Some resources (like processors) time share, meaning that they handle primitive operations sequentially, but interleave blocks of operations from different workloads. Other resources (like storage) can decompose into basic components. These basic components might not be shared, but the resource as a whole is.
Many resources have ranges for the values of their performance parameters, where the values are dictated by the actual design and implementation. Different parameters can be traded, meaning that one parameter can be improved at the expense of another. Time and space parameters usually oppose each other (e.g., decreasing time by increasing space). A common parameter that is sacrificed is cost. When trading parameters, not all of them can be optimized independently. However, given explicit objectives, the parameters as a group might be optimized. “Fast, small, cheap: pick two” is a common adage that highlights this.
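A small software-level illustration of decreasing time by increasing space is caching previously computed results. The sketch below is only an analogy to the hardware trade-offs discussed above: it uses Python's standard `functools.lru_cache`, and the Fibonacci function is an arbitrary example workload chosen for this illustration.

```python
from functools import lru_cache


def fib_recompute(n: int) -> int:
    """Uses no extra storage, but recomputes the same subproblems many times."""
    return n if n < 2 else fib_recompute(n - 1) + fib_recompute(n - 2)


@lru_cache(maxsize=None)
def fib_cached(n: int) -> int:
    """Stores every result once computed, trading memory for execution time."""
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)


# Both versions return the same value; the cached one evaluates each
# argument from 0 to 20 only once and keeps 21 results in memory.
same_answer = fib_recompute(20) == fib_cached(20)
stored_results = fib_cached.cache_info().currsize
```

Bounding the cache (a finite `maxsize`) moves the balance back toward less space and more time.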
Types of Resources – Resources can be divided into processors, storage, communication media, and Input/Output (I/O) devices.
Processors – A processor is a resource that executes work. Some processors, such as servers, are a complex organization of many resources. Processors are usually time-shared. A processor has an associated utilization which expresses the fraction of time during an interval that the resource was in use.
Processor performance is highly dependent upon the associated memory (storage) hierarchy. Common performance parameters are: execution time/latency/response time, throughput, utilization, and availability. A processor-specific parameter that is commonly quoted is the clock rate (e.g., n gigahertz). Throughput is often expressed in instructions per second or (floating point) operations per second. Execution time might be given for a known benchmark.
Storage – Storage resources store data. Storage has some unique characteristics associated with it when compared to other resource types. Storage is usually decomposable into basic units (e.g., bytes), but, because of resource management, it is aggregated into bigger units (e.g., blocks, pages). Each unit can be allocated and held indefinitely. Most storage is recoverable, meaning it can be freed and re-used. Write-once media is an example of irrecoverable storage. Storage might be volatile; such storage loses its contents when power is removed. With proper management, most storage can be shared.
Storage has a limit, called its capacity. Storage utilization is not a ratio of the time that the storage is in use, but a ratio of the storage used to the capacity; this, of course, can vary with time. The latency of accessing storage is a performance measurement, and it will vary across the levels of a storage hierarchy. When storage is an I/O device, throughput is also of interest.
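A minimal sketch of storage utilization as a ratio of space used to capacity follows. The helper names are hypothetical; `shutil.disk_usage` is a standard-library call that reports the total, used, and free bytes of the file system containing a path.

```python
import shutil


def space_utilization(used_bytes: float, capacity_bytes: float) -> float:
    """Ratio of storage used to storage capacity (not a ratio of times)."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    return used_bytes / capacity_bytes


def disk_utilization(path: str = ".") -> float:
    """Utilization of the file system that holds `path`, sampled right now."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return space_utilization(usage.used, usage.total)
```

Calling `disk_utilization()` at regular intervals and recording the results gives the variation with time mentioned above.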
Reliability concerns the ability of the storage to retain the data for recall. Availability is closely related to utilization and reliability; if storage is not available, then either there is insufficient free space or the storage has a failure.
Communication Media – Communication media transmit data from one place to another. In a sense, data are stored in the media. For networks, this “storage” is very brief. Storage (e.g., DVDs, hard drives) can be media as well (these, of course, require a means to move data from an output device to an input device).
Media usually have a capacity. For networks, this is a bound on throughput. For storage, this is the storage limit. Latency, throughput, reliability, and availability are common performance parameters.
I/O Devices – Most I/O devices are an interface (1) between the physical world and the computing system or (2) to a network or storage. There are numerous examples of the former: monitor, printer, keyboard, scanner, camera, microphone. I/O devices might do some processing like a processor resource, but are not considered as such. Most I/O devices cannot be decomposed or (time) shared, requiring serial access. Aggregated devices (like multiple disks) can be shared as a whole, but not the individual devices. Latency, throughput, reliability, and availability are common performance parameters.
Devices may have unique characteristics: resolution (dots per inch), color depth, or signal range.
Others – Constructions within software allow numerous other resources to exist, but they are usually built on top of the ones already mentioned. For example, a database table might be considered a resource in a database management system, but this is just storage. Network ports are just software on top of the network (communication media). Processes and threads are just higher-level resources on top of the processor and memory (storage). Similar statements can be made for (sub)systems.
When resources are assembled into complex systems, the scalability of the system is a characteristic to express. Scalability can have multiple definitions:
- Improving performance for a specific workload by adding more resources, or
- Maintaining performance for an increasing workload by adding more resources.
Evaluating performance for different workloads for a fixed configuration is not scalability. For an example of the first case, Amdahl’s law expresses speedup in terms of the execution time improvement when more processors are added to a system to solve the same problem. Some people do not think this is a valid perspective any more, but in some domains it still is. For an example of the second case, Gustafson’s law extends Amdahl’s law so that the problem size can increase. Again, speedup is the performance parameter communicating the scalability of the system.
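The two laws can be sketched with their usual textbook formulas. The symbols are assumptions introduced here: `p` is the fraction of execution time that can be parallelized and `n` is the number of processors. For Gustafson's law, `p` is measured on the parallel system running the scaled problem.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Speedup for a fixed problem size: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= p <= 1.0 or n < 1:
        raise ValueError("need 0 <= p <= 1 and n >= 1")
    return 1.0 / ((1.0 - p) + p / n)


def gustafson_speedup(p: float, n: int) -> float:
    """Scaled speedup when the problem grows with n: (1 - p) + p * n."""
    if not 0.0 <= p <= 1.0 or n < 1:
        raise ValueError("need 0 <= p <= 1 and n >= 1")
    return (1.0 - p) + p * n


# Hypothetical workload: 90% parallelizable, run on 10 processors.
fixed_size = amdahl_speedup(0.9, 10)      # about 5.26
scaled_size = gustafson_speedup(0.9, 10)  # about 9.1
```

With a fixed problem, the serial 10% caps the speedup at 10 no matter how many processors are added; with a growing problem, the speedup keeps increasing with `n`.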
Resource Management – Resource management (often an operating system) controls the resources. Management strives for higher throughput, overall lower latency (although individual latencies can be higher), and high availability. Resource scheduling algorithms (e.g., first-come, first-served) order the access to resources, addressing the throughput and latency goals. Mutual-exclusion controls improve availability by preventing resources from becoming unavailable due to deadlock. When multiple instances of the same resource exist, load balancing improves overall performance. Resource management introduces overhead, but the overall benefit overcomes this shortcoming.
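The effect of scheduling order on latency can be sketched with a single serial resource and three jobs that all arrive at time 0. First-come, first-served is the policy named above; shortest-job-first is brought in here only as a comparison, and the service times are made-up values.

```python
def waiting_times(service_times):
    """Wait of each job when jobs run one at a time in the given order."""
    waits, clock = [], 0.0
    for service in service_times:
        waits.append(clock)   # the job waits for everything ahead of it
        clock += service      # then holds the resource for its service time
    return waits


def mean(values):
    return sum(values) / len(values)


arrival_order = [8.0, 2.0, 1.0]        # hypothetical service times, in seconds
shortest_first = sorted(arrival_order)  # [1.0, 2.0, 8.0]

fcfs_waits = waiting_times(arrival_order)   # [0.0, 8.0, 10.0], mean 6.0
sjf_waits = waiting_times(shortest_first)   # [0.0, 1.0, 3.0], mean about 1.33
```

The total busy time is 11 seconds under both orders, so throughput is unchanged. The mean wait drops under shortest-job-first, but the 8-second job now waits 3 seconds instead of 0, which is an instance of overall latency improving while an individual latency gets worse.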
Conclusions – This article provides a brief overview of computing resources. It discusses many characteristics of resources and appropriate performance parameters. One simple conclusion is that there do not appear to be very many resource categories (I gave four). Also, performance parameters can be traded against each other when trying to optimize a system. One of my recent papers [Wilson:2011] discusses (quality) trading in a more general sense. What else can you think of?
Dr. Tom Wilson (LinkedIn) is a scientist who specializes in system performance for highly critical real-time systems. Dr. Wilson aims to better understand all of the performance domains without getting too involved in specific technologies. By profession Tom is a Performance Scientist, Mission Analyst at Lockheed Martin and is based out of Orlando, Florida. Tom provides performance analysis (workload analysis, capacity planning) at the user level, which is called mission analysis. He also works on mission activities which require derivation to the system level.