How To Meter A Short Duration Problem

Some performance problems come and go in a minute or two. Depending on the industry, the company goals, and the expectations of the users, these problems are either a big deal or ignored with a yawn.

For short duration performance problems where you know when they will start (market open, 10pm backup, etc.), here are some tips for setting up special metering to catch them:

  • Start your meters well before the problem happens. Have them run a few times to be sure they are working as expected and have them just sleep until 15 minutes before the problem starts.
  • Meter at an interval no longer than ¼ of the expected duration of the event, so that you get multiple samples during the event.
  • Let the meters run for 15 minutes after the problem is usually over.
  • Now you have meters collected before, during, and after the event. Compare and contrast them looking for what changed and what stayed the same during the event.
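
As a rough illustration of the tips above, here is a minimal Python sketch that turns a known event time into a metering window and a sampling interval, and labels each sample as before, during, or after the event so the three groups can be compared. The function names, the 15-minute padding default, and the four-samples-per-event default are assumptions chosen to match the tips, not part of any particular metering tool.

```python
from datetime import datetime, timedelta


def metering_plan(event_start, event_minutes, pad_minutes=15, samples_per_event=4):
    """Return (meter_start, meter_stop, interval_seconds) for a known event window.

    Hypothetical helper: pads the event by pad_minutes on each side and picks
    an interval that yields samples_per_event samples during the event.
    """
    event_len = timedelta(minutes=event_minutes)
    pad = timedelta(minutes=pad_minutes)
    meter_start = event_start - pad
    meter_stop = event_start + event_len + pad
    interval_seconds = event_len.total_seconds() / samples_per_event
    return meter_start, meter_stop, interval_seconds


def phase(sample_time, event_start, event_minutes):
    """Label a sample as 'before', 'during', or 'after' the event."""
    event_end = event_start + timedelta(minutes=event_minutes)
    if sample_time < event_start:
        return "before"
    if sample_time < event_end:
        return "during"
    return "after"


# Example: a 2-minute slowdown expected at 22:00 (the date is arbitrary).
start, stop, interval = metering_plan(datetime(2024, 1, 8, 22, 0), event_minutes=2)
```

In this example the meters would run from 21:45 to 22:17 and sample every 30 seconds. Treat the ¼ figure as an upper bound on the interval; sampling more often is fine if the meters are cheap.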

It is quite common for people to be suspicious that the new metering you are running is making the problem much worse. That’s why it is a very good idea to have it running well before the anticipated start of the problem. There is some cause for this suspicion as a small typo can turn a “once per minute” metering macro into a “fast as you can” metering macro that burns CPU, fills disks, and locks data structures at a highly disruptive rate. Like a physician, your primary goal should be: First, do no harm. It is always a good idea to test your meters at a non-critical time and (if possible) to meter the meters so you can show the resource usage of your meters.
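
One way to guard against the runaway-meter typo and to meter the meters is sketched below in Python. This is only an illustration under stated assumptions: `run_meter`, its `min_interval` floor, and the `collect` callback are hypothetical names, and `time.process_time()` counts CPU used by this Python process only, so a meter that launches external commands would need their usage measured separately.

```python
import time


def run_meter(collect, interval_seconds, samples, min_interval=1.0, sleep=time.sleep):
    """Call collect() `samples` times, sleeping interval_seconds after each call.

    Refuses to run if the interval is below min_interval, so a typo cannot
    turn "once per minute" into "fast as you can".
    Returns the CPU seconds this process used while metering.
    """
    if interval_seconds < min_interval:
        raise ValueError(
            f"interval {interval_seconds}s is below the {min_interval}s floor"
        )
    cpu_before = time.process_time()
    for _ in range(samples):
        collect()
        sleep(interval_seconds)
    return time.process_time() - cpu_before
```

Running this at a non-critical time and recording the returned CPU figure gives you a number to show anyone who suspects the metering itself is the problem.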

If the problem happens without warning, then, if possible, identify something or some event that usually precedes the problem that you can “trigger on” to start the intensive metering. A trigger might be when the queue has over X things in it, or when something fails, or when something restarts, etc. Finding the trigger can sometimes be frustrating, as correlation does not always mean causality. Keep searching the logs and any other meters you have.
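
A queue-depth trigger of the kind described above might look like the following Python sketch. The function name and the `consecutive` parameter are assumptions: requiring several samples in a row above the threshold is one possible way to avoid firing on a single brief spike, and you may prefer a plain single-sample check.

```python
def should_start_intensive(queue_depths, threshold, consecutive=3):
    """Return True when the last `consecutive` samples all exceed `threshold`.

    queue_depths is a list of recent queue-depth samples, oldest first.
    """
    if len(queue_depths) < consecutive:
        return False
    return all(depth > threshold for depth in queue_depths[-consecutive:])
```

Your regular low-rate meter would call this after each sample and start the intensive metering the first time it returns True.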

Sometimes, all you have to go on is that it “happens in the morning” or “mostly on Mondays.” Work with what you’ve got and meter during those times.

If the problem has no known trigger and seems to happen randomly, you’ll have to intensively meter for it until it happens again. This will burn some system resources and give you a mountain of data to wade through. If this is a serious problem, then buckle up and do the work.


Bob Wescott (LinkedIn) is semi-retired after a 30-year career in high tech that was mostly focused on computer performance work. Bob has done professional services work in the field of computer performance analysis, including capacity planning, load testing, simulation modeling, and web performance. He has even written a book on the subject: The Every Computer Performance Book. Bob’s fundamental skill is explaining complex things clearly. He has developed and joyfully taught customer courses at four computer companies and has been a featured speaker at large conferences. Bob’s goal is to be of service, explain things clearly, teach with joy, and lead an honorable life. His goal, at this stage of the game, is to pass on what he has learned to the next generation.
