Introduction – With cloud service offerings maturing and customers realizing the benefits of moving to cloud based applications/platforms/services, it’s no surprise that IT shops around the world have cloud adoption somewhere on their agenda. In the context of this article we define the cloud as a source of applications/platforms/services provided by a partner or vendor ecosystem, which offers an all-you-can-eat or unlimited on-demand service for the various IT services an organization would otherwise deliver using its internal IT capability, e.g. service infrastructure, backup storage/retention, CRM, etc. It would be rare these days for an IT shop, whether small, medium or large, not to have “Cloud Adoption” as part of its overall Information Technology strategy roadmap. Leveraging efficiencies of scale through “Adoption of Cloud” platforms is an item on most CIOs’ agendas.
We understand, however, that adoption of “Cloud” based platforms and services depends a great deal on the industry you are in, the availability of cloud based applications/platforms/services for your business critical needs, the privacy guidelines that govern your industry, the extent to which the relevant cloud service providers comply with the government regulations that apply to your business, etc. Like most things in life, every decision has its upside and downside. Along with the benefits of a move to the cloud come challenges and risks that you have to consider when designing applications that will be deployed on the cloud. As an Architect, Application Developer, System Administrator, Performance Engineer or Tester you want to be cognizant of the challenges posed by a move to the cloud and of what your organization or project should consider when designing, building and deploying applications for a cloud based service/platform.
This article provides a view of the challenges and risks of designing applications for the cloud while touching upon some of the key considerations you ought to keep in mind when developing or designing applications for the cloud.
Consideration 1 : What are the SLAs your Cloud service provider is signing up to – Over the last few years there has been tremendous hype around the benefits of adopting cloud based applications/platforms/services as these offerings mature. We are sure businesses like yours have a lot to gain by moving critical parts of the IT infrastructure to proven cloud platforms/services which offer much better bang for buck. While the CIO considers a move to the cloud, you as an Application Architect, Performance Engineer or Systems Administrator need to understand the overall Non Functional Requirements for your application and how the Cloud Service provider’s SLAs (Service Level Agreements) tie into those Non Functional Requirements.
Our experience dealing with Cloud Service providers over the years tells us that cloud application/platform service providers generally offer SLAs that are limited to the confines of their data center. What we mean by that is this: from a Non Functional Requirements perspective you might have signed up to deliver an end to end transaction in <=4s, and this transaction spans multiple tiers, i.e. internal web / app / database servers which then connect to a cloud based API. While you are on the hook for delivering a system that meets the agreed Response Time Non Functional Requirement of <=4s, the cloud service provider in this instance is only on the hook for the SLAs that govern them within the confines of their data center. The network path between your tiers and their API, and any time spent in your own tiers, remain your problem.
Unless your contract with the cloud service provider explicitly requires them to deliver capability bound by a certain set of Non Functional Requirements, don’t expect them to come to the party. These Non Functional Requirements could cover response times, throughput, RTO (Recovery Time Objective), RPO (Recovery Point Objective), etc. Cloud platforms/services have definitely matured over the years and we would expect to see consolidation of offerings across certain platforms, so keep an eye on the evolving market while constantly reminding yourself that the SLAs that drive your IT business are not necessarily the SLAs that your cloud service provider is going to sign up to. Do your due diligence on the cloud solutions/platforms that meet your needs, but also design your systems accordingly and account for those traits as part of the solution architecture.
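To make the gap between your end to end commitment and the provider’s commitment concrete, here is a minimal sketch of a per-tier latency budget. Everything in it is an assumption for illustration: the tier names, the millisecond figures and the `covered_by_sla` flag are hypothetical, and simply adding up per-tier figures is only a rough approximation of end to end behavior.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    budget_ms: int          # latency you are allowing this tier (hypothetical figures)
    covered_by_sla: bool    # True only if the provider has contractually committed to it


def remaining_budget_ms(target_ms: int, tiers: list[Tier]) -> int:
    """Headroom left once every tier has used its allowance (negative means over budget)."""
    return target_ms - sum(t.budget_ms for t in tiers)


def uncovered_tiers(tiers: list[Tier]) -> list[str]:
    """Tiers whose latency nobody but you is on the hook for."""
    return [t.name for t in tiers if not t.covered_by_sla]


# Hypothetical breakdown of the <=4s transaction described above.
EXAMPLE_TIERS = [
    Tier("web", 500, covered_by_sla=False),
    Tier("app", 1000, covered_by_sla=False),
    Tier("database", 800, covered_by_sla=False),
    Tier("cloud_api", 1200, covered_by_sla=True),
]

if __name__ == "__main__":
    print("headroom (ms):", remaining_budget_ms(4000, EXAMPLE_TIERS))
    print("not covered by any provider SLA:", uncovered_tiers(EXAMPLE_TIERS))
```

Even in this toy breakdown only one of the four tiers is backed by a provider commitment, which is the point of the exercise: the rest of the 4s is yours to engineer and defend.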
Consideration 2 : What visibility do you have across the components of the system that interface with cloud services/applications – When designing applications internally you mostly have complete control over the components and are able to hook them up to the relevant monitoring and diagnostics tools to understand how these systems behave under different conditions. The ability to instrument our internal IT systems is something we generally take for granted, and it is a capability you definitely want to have when you receive that call late in the evening about a missing transaction or an unresponsive/sluggish system. However, when it comes to interfacing with cloud based applications/platforms/services it’s a completely different ball game. Vendors generally provide limited or no visibility into the performance of their applications/platforms to end customers, and they also avoid tying themselves to stringent Service Level Agreements. So what do you do when you want visibility across a transaction chain which spans multiple components hosted across various 3rd party SaaS (Software As A Service), PaaS (Platform As A Service) or IaaS (Infrastructure As A Service) providers?
As an Architect, Performance Engineer, Developer or Tester, keep this challenge in mind and design your systems accordingly. It’s not always possible to get your vendor to give you insight into system performance for all of the SaaS based components you have invested in. However, using simple instrumentation techniques and a combination of 3rd party monitoring tools, e.g. End User Monitoring, Real User Monitoring, etc., you will gain some insight into the tiers of the application that are causing you all the grief. You can then have your support team work with the vendor to resolve the performance issues. Accounting for the relevant monitoring and diagnostics tools at the start of the program, so that you have visibility across the application stack including third party SaaS based components, is critical to proactively identifying and resolving issues before they become show stoppers.
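As an illustration of the kind of simple instrumentation we have in mind, the sketch below times each call from your side of the wire, so you can see which tier is eating the response time even when the vendor shows you nothing. The tier names and `fake_vendor_call` are hypothetical stand-ins, and a real system would ship these timings to its monitoring tool rather than keep them in memory.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Elapsed seconds per tier, as observed from your own code.
timings = defaultdict(list)


@contextmanager
def timed(tier: str):
    """Record how long the wrapped block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[tier].append(time.perf_counter() - start)


def slowest_tier(samples) -> str:
    """Name of the tier with the highest mean elapsed time."""
    return max(samples, key=lambda tier: sum(samples[tier]) / len(samples[tier]))


def fake_vendor_call() -> None:
    """Hypothetical stand-in for a call to a third party SaaS API."""
    time.sleep(0.05)


if __name__ == "__main__":
    with timed("internal_db"):
        pass  # your own tier would do its work here
    with timed("vendor_api"):
        fake_vendor_call()
    print("slowest tier:", slowest_tier(timings))
```

Timings captured this way include network time to and from the vendor, which is exactly the part their data center SLA does not cover, so they make a useful starting point for the conversation with the vendor’s support team.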
Consideration 3 : What is your approach for dealing with noisy neighbors – As netizens, or citizens of the internet, we have learned to live with the challenges of congestion. In fact we cope quite well with it (most of the time, that is) thanks to the underlying communication protocols that the internet uses. Most of the protocols used on our networks today were designed by bodies such as the IETF and IEEE and are smart enough to deal with the congestion they encounter. Congestion has become part of our lives on the internet and we are evolving smarter ways of dealing with it over time. However, when it comes to cloud based platforms and SaaS based services you have to keep in mind the impact that noisy neighbors could have on your application. Take for example a situation where a few hosted customers reside on the same physical host. One of them suddenly sees a surge in application traffic which begins to consume the majority of the compute resources on the same infrastructure your applications are hosted on. In such a case you are likely to see a degradation of service levels and performance, the extent of which depends on how the workload increases across the other customers residing on the same box.
In some cases the degradation would be so minor that it is barely noticeable, and in other cases so severe that your applications are gasping for breath, unable to respond to incoming customer connections. This is not an uncommon situation given the nature of the public cloud. Also keep in mind that most vendors will not disclose the impact on your service of other workloads running on the same physical resources that host your applications. One way to address the noisy neighbor issue is to design for it, ensuring that your application architecture can cope with degraded service levels or reduced performance by taking appropriate measures. These measures could include disabling the impacted node and spawning additional hosts, or moving from one region to another, so that customers are not impacted. Some of the more mature offerings like AWS (Amazon Web Services) offer the capability to detect degradation in service levels or performance across your applications and automatically move parts of your workload to alternate regions, or possibly even terminate the affected host and launch resources on alternate hosts.
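The detection half of “design for it” can be sketched in a few lines. The example below is an assumption-laden illustration, not any provider’s mechanism: it keeps a rolling window of latency samples per node and flags a node once the median exceeds a multiple of a baseline you have measured. The baseline, factor, window size and node names are all hypothetical, and the actual replacement of a node is left to whatever tooling your platform provides.

```python
from collections import deque
from statistics import median


class NodeHealth:
    """Rolling latency window for one node, compared against a known baseline."""

    def __init__(self, baseline_ms: float, factor: float = 2.0, window: int = 5):
        self.baseline_ms = baseline_ms
        self.factor = factor
        self.samples = deque(maxlen=window)

    def record(self, latency_ms: float) -> None:
        self.samples.append(latency_ms)

    def is_degraded(self) -> bool:
        # Wait for a full window so a single slow request does not trigger a replacement.
        if len(self.samples) < self.samples.maxlen:
            return False
        return median(self.samples) > self.baseline_ms * self.factor


def nodes_to_replace(fleet: dict[str, NodeHealth]) -> list[str]:
    """Names of nodes that should be drained and replaced, in sorted order."""
    return sorted(name for name, health in fleet.items() if health.is_degraded())


if __name__ == "__main__":
    fleet = {"node-a": NodeHealth(100, window=3), "node-b": NodeHealth(100, window=3)}
    for latency in (90, 110, 100):
        fleet["node-a"].record(latency)
    for latency in (250, 400, 90):
        fleet["node-b"].record(latency)
    print("replace:", nodes_to_replace(fleet))
```

Using the median rather than the mean keeps one outlier from condemning a healthy node; whether that trade-off suits your workload is something to validate against your own traffic.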
Consideration 4 : What are the licensing and cost implications with regards to dev/test environments – Moving to the cloud offers numerous advantages, and one of them is the ability to spawn additional environments for development and test without too much effort. Also, given that you are consuming a service, the overhead of managing the development/test environments and keeping them up to date falls entirely on the service provider. However, keep in mind that SaaS vendors charge for development and test environments, and that is something you would need to negotiate as part of the procurement process. Avoid making assumptions about the availability of environments or licenses for development and test. Some providers even have policies that allow you to price development environments but do not provide environment access for performance testing, due to various security and performance issues which they would rather not deal with. The mature service providers do tend to have policies that clearly communicate what customers can and can’t do with the services they have procured, so be careful and do your due diligence accordingly.
Consideration 5 : What is my vendor’s approach for dealing with security, performance, reliability, availability, etc. – When designing your applications you will obviously put effort into documenting the relevant Non Functional Requirements with regards to Performance, Scalability, Reliability, Availability, etc. We recommend that you speak to your service provider to understand how they address the Non Functional Requirements that matter to you as a customer, and what assurances the cloud service/platform provider is able to give you towards meeting those Non Functional Requirements.
Conclusion – We are seeing an increase in consumption of cloud services/applications/platforms, and with it comes additional responsibility that Architects, Developers, Performance Engineers, Testers, etc. have to take on to ensure that the systems/applications they design deal with the unique challenges the cloud imposes on them. None of this is rocket science. If you have done your due diligence well and understand the nature of the challenges and how they impact your system, you will be well positioned to make the architectural changes needed to address them effectively.
Trevor Warren (LinkedIn) loves hacking with open source, designing innovative solutions and building communities. Trevor is inquisitive by nature, loves asking questions and sometimes does get into trouble for doing so. He’s passionate about certain things in life, and building solutions that have the ability to impact people’s lives in a positive manner is one of them. He believes that he can change the world and is doing the little he can to change it in his own little ways. When not hacking open source, building new products, writing content for Practical Performance Analyst, dreaming up new concepts or building castles in the air, you can catch him bird spotting (watching planes fly above his house). You can reach Trevor at – trevor at practical performance analyst dot com. The views expressed on this web site are his own.