Background – When was the last time you were told, “Let’s not go overboard with our investments in a performance validation exercise. On my last project we expended a lot of effort on various Non-Functional initiatives, however the issues we experienced in production were never picked up in the Non-Functional testing environment.” Having been involved with the design, build and implementation of some very large systems over the last 15 years, I’ve heard that quote in various circumstances from individuals of differing seniority. When I hear those words, I have a sense of déjà vu. Unfortunately, in this case, not for the best reasons.
We live in a digitally connected world where customers expect seamless end user experiences irrespective of the devices they access their data from, or the location they request the service from. In such a world, where the pace of change around us requires us to constantly evolve, we as system developers, designers and architects have to work hard to make sure we understand what our customers are expecting from us. Expecting to just deliver the same brick and mortar services by digitally enabling them, with little or no empathy for the overall customer experience (functional and non-functional), isn’t necessarily going to deliver the expected outcomes.
From personal experience, among the various Non-Functional attributes, Performance and Scalability come low down on the list of priorities for most organizations, unless the system is a customer facing internet application where poor end user experience potentially equals lost revenue. Availability and Reliability at times get looked at much earlier in the Systems Development Life Cycle, but Performance and Scalability are “mostly” relegated to right at the end. Unless you are a product organization responsible for building a solution which is your bread and butter, it’s highly unlikely that the Non-Functional attributes of Performance, Scalability, Reliability and Availability will be at the top of your agenda. Some parts of it will surely be on your agenda, but the extent to which you choose to focus on these Non-Functional attributes will depend on the extent to which your business will be impacted if these systems do not perform as they should.
Going behind the covers – Given the above background, let’s spend some time understanding where these comments are coming from. The situation described above isn’t an isolated one, and I am sure there are others among you reading this right now who are saying, “Ah yes, that’s surely happened to me as well”. So let’s take a few minutes to understand the human psyche behind such comments and what we might do to fix some of the fundamental issues responsible:
- Inability to articulate sensible Non-Functional requirements – Drafting Non-Functional requirements is easy. Drafting achievable Non-Functional requirements that address real world business requirements, while helping achieve optimal utilization of investments across systems, is a completely different matter. Personal experience over the last decade and a half confirms that the Information Technology industry has been quick to adopt new development paradigms, e.g. DevOps, Continuous Integration, etc. However, it is still struggling with the agility required to put together a good set of Non-Functional requirements and build systems in the record times in which business expects IT to deliver new capability for the launch of new services. There are definitely a lot of organizations out there with good technology delivery capability, but they are mostly limited to industries which deal with complex, large and business critical workloads, e.g. Banking & Finance, Wagering, etc. As I’ve mentioned above, unless you are working for a product or services vendor where the application being designed is your main bread and butter, it’s highly likely that the architecture/performance skills required to engineer scalability, reliability, performance and availability into your application are not as strong as they should be.
- Lack of understanding of the real production workload – One of the other very common issues I’ve stumbled across is the inability of teams to articulate what the current workload across their systems in production really looks like. Over the last few years, tools like Splunk and ELK (Elasticsearch, Logstash, Kibana) have given users the ability to mine their machine data and understand user access patterns a lot better, something that previously required strong Linux skills and being at home using Python/Shell to sed/awk/grep log files and convert statistics into meaningful Cacti graphs. Even so, very frequently I find that teams are unable to truly articulate the different dimensions of their own production workload. In some instances I’ve attributed this to the segregated nature of responsibility across the development/support teams. In some instances it’s just a lack of understanding of how to go about consolidating and visualizing the machine-generated data across the various systems. And sometimes it’s simply down to the relevant teams not wanting to dive deep into the system and understand what’s happening behind the covers. All it takes is a bit of inquisitiveness and asking the right questions to get the right set of information from the people administering the relevant systems. I am not suggesting it’s a walk in the park, but if you lack the inquisitive attitude and are unable to ask the right questions, you’ll end up with substandard workload data or, in the worst case scenario, no workload data at all. Not knowing precisely what your production workload looks like when embarking on your next generation design/solution is, in my opinion, plain ignorance which has high costs associated with it down the line.
- Inability to articulate a sensible Performance Engineering approach – Articulating a sensible approach to managing Performance across the program requires you to have strong architecture skills, strong performance skills and good development skills to translate all of that into scalable code. Investing in the relevant capability early on, and bringing on the right talent to help document your Non-Functional requirements, architect your system for scalability and then recommend a suitable Performance Engineering approach that goes beyond just Performance Testing, will go a long way in making sure you are making the right investments and building a scalable system. Putting together a scalable solution is the first step. Knowing what Non-Functional requirements you should be meeting, what sort of design patterns you should be implementing and the sort of engineering approaches involved in scaling the solution are all critical parts of a holistic Performance Engineering approach. A sensible Performance Engineering approach involves collaboration with the Architects, Developers, Infrastructure Architects and Storage Architects, as well as the Business and Technology teams who understand the customer much better than an external consultant ever will.
- Inability to articulate a Performance Testing approach that can deliver the expected outcomes – While writing this piece I asked myself what it is that makes a great Performance Testing approach. Is it the quality of the paper it’s printed on, the number of years of experience of the Performance Engineer writing the document, or the number of execution cycles that have been baked into the approach? Putting together a good Performance Testing approach/strategy/plan is just one part of the puzzle. The most interesting part of the challenge is executing the Performance Testing approach in such a way that you make good use of all the resources you have been given, in spite of the constraints imposed on you. I personally treat all plans like an MVP (Minimum Viable Product, see Eric Ries’ book The Lean Startup), where I start with a set of assumptions and learn from the outcomes, refining the approach along the way. As a smart man once said, “No battle plan lasts beyond first contact with the enemy”. At a very high level a good Performance Testing approach consists of a good understanding of the system workload, a set of sensible but achievable Non-Functional requirements, a realistic timeframe for achieving the expected outcomes, relevant tools to test and analyse performance across the board and, finally, all the technical skills required to engineer performance into the system. Great technical skills are no replacement for a lack of process or rigor, so avoid going into battle unless you have all the relevant tools required to get your job done. There’s a simple three step mantra I would recommend: Plan, plan and plan.
- Lack of collaboration across the various teams to support the Performance Engineering effort – I find it interesting when I come across organizations where the so-called Performance Engineering or Performance Testing teams are focused purely on injecting workload into the systems, the developers are expected to triage and address the performance and scalability bottlenecks across the various tiers of the system, and the infrastructure/networks/storage experts are expected to provide their 2 cents by looking at the system logs at the conclusion of a test run. Now don’t get me wrong, I am sure there’s value in working that way, but I personally doubt the effectiveness of such an effort in delivering the expected outcomes. What I’ve found works best on any program is a good collaborative effort between IT (customer), developers, architects, performance engineers, infrastructure experts, storage experts, etc. It’s important to have the relevant skills to performance test, monitor the relevant systems across the landscape, triage performance issues as they arise and provide the relevant fixes so that performance testing can continue. But what’s really important is the ability of the architect or the lead performance engineer to pull the whole team together in the right direction.
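To make the point about understanding your production workload a little more concrete, here is a minimal sketch of the kind of log mining described above, written in Python. It is only an illustration under stated assumptions: the log layout (Common Log Format), the sample lines and the function names `summarize_workload` and `peak_hour` are all hypothetical and not taken from any particular system. It counts requests per hour and the mix of transactions, which are two of the first dimensions of a workload worth knowing.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout (Common Log Format), for example:
# 10.0.0.1 - - [12/Mar/2015:09:15:02 +1000] "GET /login HTTP/1.1" 200 512
LOG_LINE = re.compile(r'\[(?P<ts>[^\]]+)\] "(?P<method>[A-Z]+) (?P<path>\S+)')


def summarize_workload(lines):
    """Return (requests per hour, transaction mix) for the given log lines."""
    per_hour = Counter()
    mix = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        per_hour[ts.strftime("%Y-%m-%d %H:00")] += 1
        mix[(match.group("method"), match.group("path"))] += 1
    return per_hour, mix


def peak_hour(per_hour):
    """Return the (hour, request count) pair with the highest volume."""
    return max(per_hour.items(), key=lambda item: item[1])


# Hypothetical sample data, invented purely for illustration.
SAMPLE_LOG = [
    '10.0.0.1 - - [12/Mar/2015:09:15:02 +1000] "GET /login HTTP/1.1" 200 512',
    '10.0.0.2 - - [12/Mar/2015:09:20:41 +1000] "GET /search HTTP/1.1" 200 2048',
    '10.0.0.3 - - [12/Mar/2015:09:59:59 +1000] "GET /login HTTP/1.1" 200 512',
    '10.0.0.1 - - [12/Mar/2015:10:01:10 +1000] "POST /checkout HTTP/1.1" 201 128',
    'this line does not match the assumed layout',
]

if __name__ == "__main__":
    per_hour, mix = summarize_workload(SAMPLE_LOG)
    print("Peak hour:", peak_hour(per_hour))
    for (method, path), count in mix.most_common():
        print(method, path, count)
```

In practice you would feed this real log files (or use Splunk/ELK to do the same at scale) and extend it with response times, payload sizes and concurrency, but even a summary this simple gives you a starting point for a conversation about what the system actually does in production.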
Conclusion – In short, have empathy for your customer. Understand your customer’s needs, and try to understand their business and what they need to be successful at what they do. Continuing to do what we’ve been doing for the last decade just because we think it’s the right thing to do is not necessarily going to get you the expected outcomes. Challenge the status quo, ask questions, be inquisitive, be collaborative and, most importantly, have empathy.
This short piece set out to look at some of the reasons why senior members within organizations/programs fail to see value in investing in engineering initiatives aimed at helping the program meet the agreed Non-Functional requirements. The sad part is that the situation has not changed much since I first began working in technology 15 years ago. The hype around “Going Digital” with your business has definitely accelerated some investment in APM (Application Performance Monitoring) solutions, but the organizational approaches to engineering the key Non-Functional attributes into systems haven’t changed that much.
As an architect, developer, performance tester or performance engineer, think through your approach for engineering performance into the solution and the resources you are going to need to deliver the expected outcomes. It’s highly likely that unless you have a strong argument, your request for resources will be knocked back.
Trevor Warren (LinkedIn) loves hacking open source, designing innovative solutions and building communities. Trevor is inquisitive by nature, loves asking questions and sometimes does get into trouble for doing so. He’s passionate about certain things in life, and building solutions that have the ability to impact people’s lives in a positive manner is one of them. He believes that he can change the world and is doing the little he can to change it in his own little ways. When not hacking open source, building new products, writing content for Practical Performance Analyst, dreaming up new concepts or building castles in the air, you can catch him bird spotting (watching planes fly above his house).
Practical Performance Analyst is an Open Body Of Knowledge on Systems Performance Engineering (SPE) built and maintained by Trevor with the support of his army of volunteer elves (PPA Volunteers). You can reach Trevor at – trevor at practical performance analyst dot com. The views expressed on this web site are his own.