Monday, April 8, 2013

2nd Generation Cloud Lessons

So where's the post on 1st generation lessons? I didn't bother to write one because I feel the popular press has done a good job of detailing the promise and shortcomings of the public cloud. The fact that we had to divide cloud into private and public (and the mythical hybrid cloud) underscores that we started on a rough foundation. Marketers have done a great job of making cloud as confusing as possible, to the point that the term is meaningless because it's so misapplied and misunderstood. OK, off the soapbox.

What I want to share are the 10 lessons I've documented from those in the throes of the 2nd generation of cloud, which is primarily the combination of Public PaaS and Private IaaS. These are listed in the order in which companies seem to run into each issue.

Lesson #1 The Fear of Automation
I have seen a tremendous fear of turning over the operation of an infrastructure to rules-based management systems. First, the rules by which administrators operate are not as well defined and understood as managers would like to believe. More intuition and on-the-fly problem solving are involved than anyone wants to admit. It's hard to automate something that is not well understood and for which rules cannot be defined. However, the bigger limitation seems to be fear: if we automate it, we lose control of it. I would agree if the concept of fail-safes and emergency stops didn't exist. If we can automate assembly lines to the point that modern factories are often lights-out environments, we can automate the data center. The same rules apply.

Lesson #2 Dynamics of Image Management
Imaging is an important part of rapid provisioning. Each environment, from the operating system through the application, is wrapped into a nice package for quick delivery to compute resources when required. However, the constant need for patches, updates, emergency fixes and the like makes images obsolete within days of creation. It is important to understand that the days of a static, golden server image are over. There are several approaches, of which at least three seem to prevail: active manual image management, automated image management, and image cascading. Active manual management requires a subset of the team to rebuild images as changes arrive so the images stay up to date. Automated management lets the team push the burden of updates onto a tool which merges changes into the image. Image cascading de-couples the variety of installed components and does an automated install of each rather than a true image copy. Each approach has its merits and drawbacks, but the biggest issue to date is the quality of the available tools.

Lesson #3 Where's the "pay as you go"?
The economics of cloud are what make cloud such an enticing tool for the enterprise. Even in the single-minded private cloud model, the business users who are paying the bills only want to pay for what they consume. Today the IT department solves this problem, if at all, by jacking up the per hour rate so the end result looks like the consumer is only paying for what they use. Per hour consumption of resources makes IT services a commodity and glosses over the value of what they provide. Hence there is tremendous fear in providing a "pay as you go" model to the business. However, once some smart analyst compares internal costs to the public cloud, all hell breaks loose and the explanations begin. In the end it's this "trapped customer" opinion that will go a long way to unraveling a significant part of the on-premise private cloud. The end model needs to provide the business with options: host internally with all the control and security at X per CPU hour, host off-premise in a private cloud at Y with a bit less control and security, or go for the bargain basement at Z for little control and minimum security. Of course, when options Y and Z are on the table it will require the vendors to provide "pay as you go" just like the regular public cloud, and they fear this as much as internal IT.
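To make the X, Y, Z comparison concrete, here is a minimal sketch of the arithmetic a business user would want to see. The rates and the workload numbers are made up purely for illustration (they come from no real price list), and amounts are kept in whole cents to avoid rounding surprises.

```python
# Hypothetical rates in cents per CPU hour. X, Y and Z mirror the three
# hosting options described above; the numbers themselves are invented.
RATES_CENTS = {
    "X: internal, full control": 12,
    "Y: off-premise private cloud": 8,
    "Z: bargain basement": 5,
}


def metered_bill(rate_cents, cpu_hours_used):
    """Pay as you go: the bill follows what was actually consumed."""
    return rate_cents * cpu_hours_used


def allocated_bill(rate_cents, cpus_reserved, hours_in_period):
    """Traditional chargeback: the bill follows what was reserved."""
    return rate_cents * cpus_reserved * hours_in_period


def compare_options(cpu_hours_used, rates=RATES_CENTS):
    """Return the metered bill, in cents, for every hosting option."""
    return {name: metered_bill(rate, cpu_hours_used) for name, rate in rates.items()}


if __name__ == "__main__":
    # A workload that reserved 4 CPUs for a 730 hour month but only
    # consumed 600 CPU hours (both figures are assumptions).
    print("reserved at X:", allocated_bill(12, 4, 730), "cents")
    for option, cents in compare_options(600).items():
        print(option, "->", cents, "cents")
```

The gap between the reserved bill and the metered bills is exactly the conversation that starts once somebody runs the numbers.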

Lesson #4 De-provisioning
Once the "pay as you go" mountain has been climbed, the next challenge is to actively manage the costs. Public cloud users have been on this bandwagon for the past several years, but it's new to the Enterprise. Budgets for cloud are already moving from IT to the lines of business for Public SaaS, so it stands to reason they will move for other technology services too, and with good reason. Want IT to be competitive? Require them to compete! The challenge is identifying servers which are at low utilization and migrating jobs off those servers to servers with capacity while not interrupting users. Tools, integration, and automation are all required to make de-provisioning happen in a meaningful way.
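The core of the de-provisioning problem described above can be sketched in a few lines. This is an illustration under assumptions, not a real tool: the Server class, the 20% "low utilization" threshold and the 80% ceiling on target servers are all hypothetical, job names are assumed to be unique, and a real system would also have to handle live migration so users are not interrupted.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    capacity: float                           # total CPU units (hypothetical unit)
    jobs: dict = field(default_factory=dict)  # job name -> CPU units in use

    @property
    def used(self):
        return sum(self.jobs.values())

    @property
    def utilization(self):
        return self.used / self.capacity


def plan_deprovisioning(servers, low=0.2, ceiling=0.8):
    """Empty under-used servers onto servers with spare capacity.

    Returns (moves, retired): moves is a list of (job, source, target)
    tuples and retired lists the servers that end up with no jobs.
    """
    by_name = {s.name: s for s in servers}
    moves, retired = [], []
    # Look at the emptiest servers first.
    for source in sorted(servers, key=lambda s: s.utilization):
        if source.utilization >= low:
            continue  # busy enough (possibly because it just received jobs)
        targets = [s for s in servers if s is not source and s.name not in retired]
        # Spare room on each target, never filling it beyond the ceiling.
        free = {t.name: t.capacity * ceiling - t.used for t in targets}
        plan = []
        for job, cpu in sorted(source.jobs.items(), key=lambda kv: -kv[1]):
            fits = [t for t in targets if free[t.name] >= cpu]
            if not fits:
                plan = None  # cannot fully empty this server, so leave it alone
                break
            target = min(fits, key=lambda t: free[t.name])  # tightest fit
            free[target.name] -= cpu
            plan.append((job, source.name, target.name))
        if plan is None:
            continue
        for job, _, target_name in plan:
            by_name[target_name].jobs[job] = source.jobs[job]
        source.jobs.clear()
        moves.extend(plan)
        retired.append(source.name)
    return moves, retired
```

With three equally sized servers running at 10%, 15% and 40%, this plan moves everything onto the third server and flags the first two for shutdown. Deciding when it is safe to act on such a plan is where the tools, integration and automation come in.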

Lesson #5 Tool Maturity (and the lack thereof)
By this time organizations get pretty frustrated by the amount of integration and business rules development required, and by the barriers caused by having disparate systems from vendors whose visions may overlap but rarely lead to the same future. Open standards and open source will be the keys, but both will necessitate patience, which threatens the competitive advantage of cloud. Organizations that run complex IT shops well will benefit, while others should understand what they are getting into. Moving to a software-defined world will help, but it will not be the silver bullet.

Lesson #6 The Application Development Gap
Most Private IaaS implementations fail to garner the widespread adoption used in the business case to justify the investment, and a pattern has emerged on the reason why. Application developers are rarely invited to the party. As its name implies, implementers of Private IaaS tend to be infrastructure people who focus on hardware, virtual machines, and operating systems. To them the job is done once the self-serve web site is up and running, enabling consumers to build the server which suits their need. Web developers tend to be comfortable in this model; however, they are not the norm. Most developers cannot double as system administrators and wouldn't know what to do if a Linux or Windows box fell into their lap. My favorite quote of all time from a Computer Science major was "My knowledge of the hardware is limited to the on/off button". And that's why I got my degree in Computer Engineering! Application developers have needs too, and if those needs are ignored they will return the favor. Every large enterprise I know of, with one exception, has made the same gaffe.

Lesson #7 The Role of Automated Testing
Developing applications in the cloud is half the battle; the other half is testing. Clouds tend to be built on web technologies (Apache, server side scripting languages, web services, HTML5 and CSS3, etc.) using web approaches (frameworks, Agile methods, etc.) which result in a much faster pace of development and release. As a result it's important to be able to test the application in each iteration; add in the constant patches and updates and it becomes paramount. Without automated testing cloud development is stillborn.

Lesson #8 N+1
Ugh. From the stories I've heard and situations I've experienced, this is the one challenge that catches EVERYONE off guard. Moving from one cloud version to another is not for the faint of heart, but it's a reality. Major upgrades in capability or underlying components are often the cause of significant outages at even the big players like Amazon, Netflix, eBay and Facebook. The only way to avoid this is to grow new clouds next to existing clouds and migrate over time using automated testing, keen observation, strong alerting and exception management, and a bit of prayer. Falling back to the previous environment is a requirement. It's the change management guru's ultimate test!
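The "grow a new cloud next to the old one" approach boils down to a staged shift with a guaranteed way back. Here is a rough sketch of that control loop. The stage percentages and the is_healthy callback are hypothetical stand-ins for whatever automated tests and alerting an organization actually has; nothing here talks to a real cloud.

```python
def migrate(stages, is_healthy):
    """Shift traffic to the new cloud stage by stage, falling back on failure.

    stages:     increasing percentages of traffic sent to the new cloud.
    is_healthy: callable taking the current percentage and returning True
                when tests and alerts look clean at that level (assumed).
    Returns a history of (percent_on_new_cloud, status) tuples.
    """
    history = []
    for percent in stages:
        if is_healthy(percent):
            history.append((percent, "ok"))
            continue
        # Something broke: record it and send everything back to the
        # previous environment, which is still running alongside.
        history.append((percent, "failed"))
        history.append((0, "rolled back"))
        break
    return history


if __name__ == "__main__":
    # Pretend the new cloud copes with light load but falls over at 50%.
    for step in migrate([5, 25, 50, 100], lambda percent: percent < 50):
        print(step)
```

The important design choice is that the old environment is never torn down until the last stage has passed, so falling back is always a single step.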

Lesson #9 Security Scares
At some point the security team needs to get comfortable with the idea that data will sit off-site. I know of no Fortune 500 company with a well-defined policy on what data can be stored in a cloud off-site and what corresponding standards, processes and tools are to be used. Yet every Fortune 500 company has data stored off-site. If for no other reason, the proliferation of data and the emerging business value of mobility will compel the business to need capabilities outside the traditional data center.

Lesson #10 Leader Egos and Agendas
Managers build kingdoms of control; leaders unify kingdoms into nations. However, many a great leader still has an ego and an agenda, and the wheels can come off the cart quickly when leaders clash. As the budgets move to the business and IT moves toward a commodity, the claws will come out. CIOs are increasingly a competitive threat to COOs and I believe will be the next generation of CEOs, a view shared by several friends in management consulting. I have to believe CFOs, a traditional source of Fortune 500 CEOs for decades, will have a say. In my humble opinion the CIO role will dissolve and CIOs will become the next generation of COOs for all the right reasons: technology and the business will be viewed as one coin instead of two sides.