This series of articles focuses on the basic concepts and gotchas around building high-performance applications on Amazon's AWS platform. Dmitry Agranat explores the foundational principles of AWS while gradually introducing the reader to important fundamental concepts around AWS/EC2. Through this series, Dmitry hopes that you, as a Performance Engineer, will gain enough background knowledge of Amazon's cloud capabilities to become an effective contributor in AWS-related discussions and projects within your organization. Ultimately, these articles aim to help readers build a stronger understanding of AWS/EC2 and a greater appreciation for what Amazon's AWS has to offer.
If you are still unsure about the various building blocks of Amazon's Cloud Computing Infrastructure and the basic concepts around it, we highly recommend reading the previous four articles in this series. Links to each of these articles are provided below –
- You can read “Engineering High Performance Applications with AWS – Part 1” by clicking here.
- You can read “Engineering High Performance Applications with AWS – Part 2” by clicking here.
- You can read “Engineering High Performance Applications with AWS – Part 3” by clicking here.
- You can read “Engineering High Performance Applications with AWS – Part 4” by clicking here.
Who Foots the Bill – When you think traditional on-premise infrastructure, you think large Capex and low Opex: your organization pays a large upfront fee to purchase the infrastructure and then pays only for ongoing maintenance and support, which is ideally a fraction of the overall purchase cost. On top of that come the operational costs, which include cooling, power, and datacenter maintenance or hosting costs, including the cost of personnel.
These tertiary costs aren't obvious to you since they fold into the IT group salaries, but they are nevertheless real costs of keeping the infrastructure in your internal data centres humming along. In contrast, with AWS you pay no upfront costs and do not own the underlying physical infrastructure. What you pay for are the operational costs of the virtual infrastructure you have leased, for the duration you have leased it.
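To make the Capex/Opex contrast concrete, here is a rough sketch of the two cost profiles. All dollar figures are invented assumptions for illustration only, not real quotes; plug in your own numbers.

```python
# Rough, illustrative comparison of on-premise vs AWS cost profiles.
# All dollar figures below are invented assumptions, not real quotes.

def on_premise_cost(years, capex=500_000, annual_opex=75_000):
    """Large upfront purchase plus ongoing maintenance/power/personnel."""
    return capex + annual_opex * years

def aws_cost(years, monthly_usage=15_000):
    """No upfront fee; you pay only for what you lease, month by month."""
    return monthly_usage * 12 * years

for years in (1, 3, 5):
    print(f"Year {years}: on-premise ${on_premise_cost(years):,} "
          f"vs AWS ${aws_cost(years):,}")
```

The crossover point, if any, depends entirely on your utilization: infrastructure that sits idle most of the day tilts the comparison heavily toward pay-as-you-go.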
Who’s responsible for a large part of the charges – As we have mentioned in previous articles, if, as part of the Performance Engineering organization, you have convinced your management to use AWS for build, test, performance test and production, you will also have to work hard at keeping the costs down. There are various reasons why the cost of running AWS infrastructure could skyrocket due to Performance Engineering activities:
- Performance Engineering ideally bases all validation, testing and optimization on production-equivalent infrastructure
- Performance Engineering ideally conducts Stress and Volume Testing against production-equivalent workloads, and executes numerous such iterations on a frequent basis
- Performance Engineering most likely generates large volumes of network traffic, disk IOPS and logs due to the huge volume of transactional/batch requests it pushes through the test infrastructure
For the reasons listed above, you as the Performance Engineering lead need to be on top of the entire environment usage model and keep a hawk's eye on the daily environment usage charges. Given all this, you would agree that a very large portion of the overall usage costs during the development phase will be attributed to Performance Engineering activities. This obviously changes once the application goes live and you switch to quarterly or half-yearly release cycles.
Here are some of the questions you should be asking yourself as the Performance Engineering lead –
- What activities are going to generate most of the network traffic across the environment?
- What activities are going to consume most of the IOPS across the environment?
- What activities are going to require the largest and most expensive Instances across the environment?
- Who is going to work on environments with scaled tiers (several web, app and DB servers)?
- Who is going to execute long (endurance) tests, and how many such tests will need to be executed across the testing phases?
It’s again highly likely (during the development phase at least) that all the non-Performance-Engineering activities on AWS will be responsible for only a small part of the overall environment usage costs.
Keeping Costs Under Control – Our advice here is twofold. First, mentally prepare your managers, especially the one paying the bill, for the fact that Performance Engineering will be responsible for the largest share of the costs when using AWS infrastructure; its activities will no doubt be the most expensive part. Use the AWS pricing calculator to perform some back-of-the-envelope calculations.
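A back-of-the-envelope estimate of the Performance Engineering share can be sketched in a few lines. The tier sizes, hourly rates and test hours below are all assumptions for illustration; substitute real figures from the AWS pricing calculator.

```python
# Back-of-the-envelope estimate of the Performance Engineering compute
# cost on AWS. Instance counts, hourly rates and test hours are all
# assumptions for illustration, not real AWS prices.

PE_TIERS = {           # (instance count, assumed $/hour per instance)
    "web": (4, 0.40),
    "app": (4, 0.80),
    "db":  (2, 1.60),
}

def monthly_pe_cost(hours_per_month=300):
    """Cost of running the full scaled environment for the test hours."""
    hourly = sum(count * rate for count, rate in PE_TIERS.values())
    return hourly * hours_per_month

print(f"Estimated PE compute cost: ${monthly_pe_cost():,.2f}/month")
```

Even this crude model makes the point to management: a scaled, production-equivalent environment running frequent long tests dominates the bill.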
Second, start implementing some “saving” policies. In short, when you are not using your performance engineering environments, shut them down. Trust me, you can cut the bill by 50%–75% by implementing effective resource utilization policies, either with your company’s own tools or with AWS services such as CloudFormation. For example –
- Every day at 19:00, all Instances are shut down (a snapshot is taken automatically before shutdown).
- Instances are not auto-started in the morning; they are only started on demand.
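The policy above can be sketched in a few lines of Python. This is a minimal sketch, not a production tool: the `Shutdown` tag name and the exceptions-list convention are assumptions of ours, and the boto3 calls (which do exist in the boto3 EC2 resource API) are shown commented out since they require AWS credentials and would be driven from a scheduled job such as cron.

```python
# Sketch of a nightly shutdown policy for performance test instances.
# Assumption: instances carry a "Shutdown" tag, and anything tagged
# "never" is on the exceptions list (work that must run overnight).

def should_stop(tags, hour):
    """Stop after 19:00 unless the instance is on the exceptions list."""
    if tags.get("Shutdown") == "never":
        return False
    return hour >= 19

# Applying the policy with boto3 (requires AWS credentials; run from a
# scheduled job). Illustrative only:
# import boto3
# ec2 = boto3.resource("ec2")
# running = ec2.instances.filter(
#     Filters=[{"Name": "instance-state-name", "Values": ["running"]}])
# for inst in running:
#     tags = {t["Key"]: t["Value"] for t in (inst.tags or [])}
#     if should_stop(tags, hour=19):
#         inst.create_image(Name=f"pre-shutdown-{inst.id}")  # snapshot first
#         inst.stop()
```

Keeping the decision logic in a pure function like `should_stop` lets you test the policy without touching a live account.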
Obviously you need to maintain an exceptions list where you can enter your environment details, to prevent shutting down those virtual machines that need to do work overnight. You can also set billing alerts per service to catch unwanted leaks ($). The best way to understand your spending is to analyze your first bill, breaking down usage by the various activities performed during that period.
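As a sketch of what a per-service billing alert looks like: AWS exposes estimated charges as the `EstimatedCharges` metric in the `AWS/Billing` CloudWatch namespace (in us-east-1). The threshold, SNS topic ARN and account number below are placeholders we made up; the resulting dict could be passed to boto3's `put_metric_alarm`.

```python
# Sketch of a CloudWatch billing alarm definition. Billing metrics live
# in us-east-1 under the AWS/Billing namespace. The threshold and SNS
# topic ARN below are placeholder assumptions for illustration.

def billing_alarm_params(service, threshold_usd, sns_topic_arn):
    """Alarm when estimated charges for one service exceed a threshold."""
    return {
        "AlarmName": f"billing-{service}-over-{threshold_usd}",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "ServiceName", "Value": service},
                       {"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # billing metrics update roughly every few hours
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }

params = billing_alarm_params(
    "AmazonEC2", 500, "arn:aws:sns:us-east-1:123456789012:billing-alerts")
# To create the alarm (requires credentials):
# import boto3
# boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**params)
```

One alarm per service gives you the leak detection mentioned above without waiting for the monthly bill.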
It does help if you remember to TAG your Instances so you can distinguish them from all the rest. We strongly recommend using AWS Spot Instances, where prices are significantly lower, at least at the beginning, and for QA and developers as a rule. Take into account that Spot Instances are only good for interruption-tolerant tasks: you should always be prepared for the possibility that a Spot Instance will be interrupted and terminated. There are plenty of tools out there to help you cut costs on AWS infrastructure; one we would recommend is AWS Trusted Advisor.
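To illustrate the scale of potential Spot savings, here is a trivial calculation. The on-demand and spot rates below are assumptions, not current AWS prices; real spot prices fluctuate, which is exactly why Spot suits interruption-tolerant work such as repeatable load-generator runs.

```python
# Rough illustration of Spot Instance savings. The on-demand and spot
# rates below are assumptions for illustration, not current AWS prices.

def spot_savings(on_demand_rate, spot_rate, hours):
    """Return (dollars saved, percent saved) over the given hours."""
    saved = (on_demand_rate - spot_rate) * hours
    pct = 100.0 * (on_demand_rate - spot_rate) / on_demand_rate
    return saved, pct

saved, pct = spot_savings(on_demand_rate=0.80, spot_rate=0.24, hours=200)
print(f"Saved ${saved:.2f} ({pct:.0f}%) over 200 test hours")
```

At discounts of this order, running QA and developer environments on Spot quickly compounds across a testing phase.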
Dmitry Agranat (LinkedIn) is a Performance Engineering Principal with a primary focus on bringing a positive impact to overall organizational system performance. His professional career started in 2005 at Mercury (now HP Software). During this period, he has played various roles in Performance Engineering (PE), with an extremely high focus on application and database performance. He has served as Performance Doctor (fixing unhealthy systems), Performance Fireman (triaging production fires), Performance Paramedic (putting the system together by applying field triage, duct tape, and whatever data, domain knowledge, best guess, wishful thinking or folklore is available to fix a problem) and Performance Plumber (removing blockages and searching for leaks). He has worked with many internal and external customers, both solving immediate performance problems and educating them on building a solid organizational performance management methodology. Dmitry believes that Performance Engineering (PE) is still not well understood, agreed upon, unified or quantified, and thus requires building solid and effective entrepreneurship around the performance domain.