Tuesday, December 29, 2009

We're Through the Looking Glass - Cloud Security

I feel it's appropriate to consider an Alice in Wonderland world when thinking through the cloud computing landscape and its security implications. Most security experts I know, including many CISOs at clients, some of whom are quoted on the topic, appear to throw water on the burning desire of CTOs everywhere to go "cloud". It's understandable and, in my opinion, quite reasonable. Let's face it - the cloud isn't ready for prime time.

I have no argument against using the cloud for non-critical tasks, but I tell clients day in and day out that we are 2-3 years from enterprise clouding (I love new domains where we can make up words!). Comments like that get me in the good graces of CISOs, at least until my next sentence: "You better get started now." What? Why do we have to get started now if the enterprise version won't be ready for 2-3 years? Because that's when it will be easy and everyone will have it - don't you want a competitive advantage? Then put your nose to the grindstone and get that whole security thing figured out pronto so IT can move forward...

...or get run over, your choice!

With the cloud computing juggernaut gaining speed, now is not the time for "No, but..." responses. What CIOs and CTOs need now are "Yes, if..." answers on how to pursue secure cloud services. We have plenty of existing models, standards, and solutions, so nobody can tell me the cloud is entirely unique. What it does present is a new architecture into which we need to plug known solutions to known problems, plus some new solutions to cover feared gaps.

One of the biggest gaps clients identify today is data security. "How do I know my data is secure at a cloud provider?" Honestly, I don't know in a holistic way, but the old stand-by of encrypting data in transit and data at rest seems to form the foundation of a solution. The immediate response, as the responder's face wrinkles until their eyes become nothing but slits in the creases of skin below their brow: "But that's too much overhead." Oh. So are we taking this security thing seriously or not? If we are, then let's take our foundation and get to work on the speed issue. Solving that problem involves the economics of speed, where money is often the answer; governance, so we don't speedily drive off a cliff; and improving performance, so the overhead of encryption becomes a rounding error.

Economically we don't have much of an issue. Cheap bandwidth. Cheap cloud storage. At $90k per 50TB of data storage at Amazon S3, we can afford encryption even if it increases our data sets in size by an order of magnitude. Governance is an issue, but as we increase the use of automation in the cloud we should be automating governance as well. We need strong tools enabling us to enforce policies, especially on data which hasn't been categorized.
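To put rough numbers behind that claim, here's a quick back-of-the-envelope sketch; the 5 TB archive size and the full order-of-magnitude expansion are deliberately pessimistic assumptions, not measurements:

```python
# Rough storage economics at the quoted rate of $90k per 50 TB per year.
rate_per_tb = 90_000 / 50            # $1,800 per TB per year
raw_tb = 5                           # hypothetical archive size
encrypted_tb = raw_tb * 10           # pessimistic: an order of magnitude growth

print(f"raw:       ${raw_tb * rate_per_tb:,.0f}/yr")        # raw:       $9,000/yr
print(f"encrypted: ${encrypted_tb * rate_per_tb:,.0f}/yr")  # encrypted: $90,000/yr
```

Even at the worst-case expansion, the encrypted archive costs what a single traditional disaster recovery line item might.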

How do we increase performance? Encryption takes time, but if we can convince the cloud storage providers to offer hardware-based encryption we can reduce the cost. Next we need a new way of thinking that takes advantage of what the cloud offers: lots of network bandwidth, storage services available on the fly, and ubiquitous availability.

How about applying a grid storage idea to the problem of data archival? Take a set of data, split it into multiple chunks, each chunk with a sequence number, and encrypt each chunk. Store a random set of chunks at three or more storage vendors and track which data is stored where using a private index engine. Because each site contains only a portion of the total data, the data is non-contiguous, and the assignment is random, the value of the data at any one site is dramatically reduced. A hacker would be required to compromise all the sites, decrypt the data, and reassemble it to get the full picture. With the landscape inherently more difficult to hack and the value of any independent data set low, a less onerous encryption method, such as one using 64-bit keys, can be used.
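A minimal sketch of the split-encrypt-scatter idea, with in-memory dictionaries standing in for the storage vendors and a keyed-hash XOR stream standing in for a real cipher (use AES or similar in practice); all names here are hypothetical:

```python
import hashlib
import secrets

def xor_stream(data: bytes, key: bytes) -> bytes:
    """Toy stand-in for a real cipher: XOR with a keyed hash stream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def archive(data: bytes, providers: list, chunk_size: int = 4) -> list:
    """Split data into sequenced chunks, encrypt each, and scatter them
    across providers at random. The returned index is kept private and is
    the only record of which chunk lives where, under which key."""
    index = []
    for seq, start in enumerate(range(0, len(data), chunk_size)):
        chunk = data[start:start + chunk_size]
        key = secrets.token_bytes(16)
        provider = secrets.choice(providers)
        provider["store"][seq] = xor_stream(chunk, key)
        index.append({"seq": seq, "provider": provider["name"], "key": key})
    return index

def restore(index: list, providers_by_name: dict) -> bytes:
    """Reassemble: fetch each chunk from its provider, decrypt, re-order."""
    parts = []
    for entry in sorted(index, key=lambda e: e["seq"]):
        store = providers_by_name[entry["provider"]]["store"]
        parts.append(xor_stream(store[entry["seq"]], entry["key"]))
    return b"".join(parts)

providers = [{"name": n, "store": {}} for n in ("vendorA", "vendorB", "vendorC")]
idx = archive(b"January billing detail records", providers)
by_name = {p["name"]: p for p in providers}
assert restore(idx, by_name) == b"January billing detail records"
```

Any one vendor holds only a random, encrypted, non-contiguous subset; without the private index, even a full compromise of a single site yields little.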

Where could such a solution be used? How about monthly billing? Once the bill is generated and paid, the details are rarely if ever used again. Archive the data to the cloud. If it needs to be retrieved it can be, but its value is low to begin with for most hackers. Securing the data through obfuscation will make most hackers look for easier targets. Hacking is a numbers game.

So for all the CISOs out there, consider that now is the time to identify the gaps and start looking at how to fill them. One thing I can assure you of as we talk to CIOs and CTOs: the cloud computing train is coming, and it's building some serious momentum.

Be prepared to lead, follow, or get out of the way!

Sunday, November 15, 2009

The Achilles Heel of Cloud Computing

I think everyone understands the "Cloud" in cloud computing is an undefined network incredibly similar to, but not necessarily synonymous with, the Internet. What many apparently have not spent time thinking through is the impact building private and public clouds will have on the network architecture at most public companies. Three fundamentals need to be revisited: the Internet connectivity architecture, Internet bandwidth, and network security.

Many, if not most, large corporations consolidate their Internet connectivity into very few points, sometimes a single one. I was recently working with a large non-governmental organization that has consolidated all of its Internet access for primary and field offices into its Chicago data center. It's a great model for using the Internet, not so good for incorporating the Internet. In the world of Cloud Computing the Internet is less an end point and more one of several intermediate points during the execution of a function. Public clouds must be accessible at all times from any location to be of value. This mandate implies there can be no single point of failure between the corporation and the public Internet. A new architecture is required, with many-to-one access to the Internet instead of a one-to-one model. If New York cannot connect to the Internet, it cannot be allowed to jeopardize the entire corporation. And remember that Internet backbones do go down, and likely will more often under the ever-increasing loads of the foreseeable future. If the public cloud function leverages data within the data center, the reverse is also true: multiple paths provide redundancy. I'm sure some will argue with me, but I cannot make sense of data travelling from Denver to Chicago just to gain access to the Internet; it's an archaic model at best.

The second major issue is that existing Internet bandwidth will have to grow. As traffic moves to the Internet it will move off of internal WANs. We've grown accustomed to cheap bandwidth, but with the explosion of WiFi, the coming of WiMax, and the growth of rich media on the Internet, I expect those days are coming to a close. Even so, I expect it will be cheaper to run over the Internet than through private backbones, which will help drive us to a more federated model for Internet connectivity.

Finally, security as we practice it today becomes problematic. How do we sniff packets between a user and a cloud provider when none of the company's infrastructure sits in between? Surely we could route users through the corporate firewall, but again, this defeats part of the economic model of cloud. We need better tools on the client side to help us manage the security aspects of this federated model. I'm not saying there aren't tools today, but those tools need to improve their automated detection, recording, and reporting capabilities to prevent attacks both inside and outside the company.

I've noticed a change in the approach to networking over the past ten years. In the past, bandwidth was managed loosely to ensure it was adequate. We've since tightened it down, and now we need to be asking our bandwidth providers for more virtualized options enabling the rapid, automated scale-up and scale-down of circuits. We don't want to leave the network out of IT's push to move from a fixed to a variable cost model.

Tuesday, November 10, 2009

Marketing Through the Cloud

Cloud Computing promises to turn many conventional ideas on their heads, such as disaster recovery, data warehousing, and application development. However, one area I see as having the opportunity to explode in value with Cloud is Marketing. I have always had an affinity for marketing, from college, where it was my unofficial minor (as Computer Engineering students we weren't allowed to have minors), through multiple interface roles, from my internship at Eaton/Cutler-Hammer to my time in Sales & Distribution at IBM. The challenge of marketing is to know what the customer thinks and what the customer wants, often in advance of the customer. Much of marketing is driven by research: what works, what doesn't, who, what, where, when, how, why. The only limitation is time, because there is always one more question to ask. At some point that research has to be analyzed and made actionable through inputs to product development, sales, advertising, eCommerce, and IT.

So where does cloud come into play? First, at the most basic level, with the enablement of rapid change represented by SOA. Once an SOA foundation is in place there are no limits to the reach of marketing. Social Networking. Semantic Web. Business Intelligence. Awareness. These are the future of marketing, and all work better on an SOA foundation.

Social networking gives marketers the opportunity to find the influencers, tailor marketing messages, solicit feedback, observe from afar, and even seed new ideas. It's a human lab with no walls and no limitations. I have yet to see tools such as Facebook used for focus groups, or Twitter used to measure interest, or forums harvested for feedback. It may happen, but as a user and an industry insider I've heard no discussion.

Semantic web technologies will be as important to Marketing as they will be to supply chain management. The closer marketing gets to understanding the whole picture, the better it can analyze data in the proper context and provide better input to downstream efforts. Why did someone purchase the product? What was the impetus for purchase? What made them think of the product? What was their first thought about the product? Good questions to ask, good information to know, but today all of this data is gathered post-sale through interviews. What if, via semantic technologies, customers were queried for this data and responded, all without realizing it? Capturing data in real time is always preferable because it eliminates the issues of memory loss and filtering.

With the trove of new data available, new business intelligence tools will emerge, and having the data available in the cloud means the four walls of the data center will no longer limit how the data is analyzed. Specialty firms staffed with PhDs will offer services to slice and dice the data using their proprietary tools and provide an additional layer of context by bringing in additional third-party data sources. Business intelligence - real business intelligence, not the analytical reporting which passes for BI in many companies today - will emerge as a cloud service.

Awareness? What is that? Find me a better name and I'll use it, but through cloud computing and semantic technologies we are a very large and important step closer to enabling awareness: the ability of a computer to understand. Although it will start with small steps, over the next 10 years computers will increasingly direct their own searches for data they feel is missing, assembling their own conclusions based on simple human queries. And when this happens we'll be ready to turn the corner in Marketing and become more proactive than reactive, able to predict events and prepare to seize opportunity. We'll be able to take our supply chain optimizations, which can move the snow blowers to the states with the impending snow storms, and extend that knowledge with who owns a snow blower ready for replacement, what size is needed based on the footprint of their property, who would benefit from snow removal services, who can be prompted to buy in order to use up a remaining store credit. Targeted marketing will begin to take on the 1:1 reality we've been talking about for the past decade.

Cloud is a revolutionary technology which can be adopted in evolutionary steps, making it unique and unavoidable. Cloud brings the world closer together, eliminating some of our artificial barriers and bridging ones which are all too real. For Marketing, this new capability will drive new thinking, new approaches, and new solutions bound only by their need to understand the customer better than the customer understands themselves.

Saturday, November 7, 2009

The Sun Sets on Disaster Recovery (Finally)

Big changes have to come in small doses. Those of us fortunate enough to have several years' experience with SOA and utility computing see so many of the great things cloud can do and how it addresses so many of the complexities within IT. I often explain that part of the tremendous value of Cloud Computing is trapping complexity within layers of abstraction so we don't expose limitations. However, I see one of the biggest killer apps for Cloud Computing, business continuity, as not only trapping disaster recovery within the infrastructure layer but doing away with disaster reactivity entirely!

First, not every failure is a disaster. Failures can and do occur, and we need to be smarter about how we engineer our solutions. Our focus should be on automated recovery, an option which becomes a real solution in a cloud world. If a service dies, another service should be started. If hardware fails, jobs should move to alternate hardware. Creating failover heat maps can go a long way toward identifying and targeting areas where failure recovery needs to be addressed and, hopefully, automated. But a disaster is a large-scale failure, for which we so often employ a different set of tools. Why? Primarily because of our legacy silo approach. If one silo dies we need to move data and jobs to an alternate silo. In a cloud architecture we don't have silos (even in a virtualized architecture the silo is a logical rather than a physical manifestation). So?

Once we architect our solutions to be service oriented and distributed from the start, we lessen the impact of all failures, from the simple to the theatrical. If we lose a data center, that's bad. However, if our solution is already load balancing across data centers, and we ensure by business rule that we always have services available in each center, then our exposure is limited to in-flight transactions and sessions. If we have a true cloud infrastructure, then we should already have the network bandwidth required to perform database mirroring, in which case the disruption of the data center loss is as minimal as we have the ability to make it.
Further advances will come to light in the next few years as databases tackle the federation issue and learn to manage data in logical instances rather than physical domains.
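The "services available in each center" business rule lends itself to mechanical checking. A minimal sketch, with hypothetical service and data center names:

```python
def placement_gaps(placements: dict, data_centers: list) -> list:
    """Return (service, data_center) pairs that violate the business rule
    that every service must be running in every data center."""
    return [(svc, dc)
            for svc, dcs in placements.items()
            for dc in data_centers
            if dc not in dcs]

centers = ["us-east", "us-west"]
running = {"billing": {"us-east", "us-west"}, "catalog": {"us-east"}}
print(placement_gaps(running, centers))   # -> [('catalog', 'us-west')]
```

An orchestrator could run a check like this continuously and start missing instances automatically rather than waiting for a disaster to expose the gap.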

None of this happens, however, without intent.

We need a business case. What is the lost business opportunity per hour of downtime? For many business-critical systems this value is calculable, and I argue that if it's not, there's no reason for recovery. Our solution cost needs to be a small percentage of that potential loss. Today disaster recovery is EXPENSIVE: hot sites and recovery contracts, tapes to retrieve from an off-site location and restore, staff to move around, periodic tests which always end with multiple failures. According to the Symantec 5th Annual IT Disaster Recovery Survey in June 2009, the average annual budget for disaster recovery is $50M. Consider that against the cost of 50TB of storage on Amazon S3: $90k. WOW! So in one fell swoop we can improve recovery speed and accuracy and reduce cost by eliminating tapes, backups, tape recoveries, off-site storage, and the administration costs. And it only gets better!
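The business case itself is simple arithmetic. A sketch with hypothetical figures (only the loss-per-hour and expected-downtime estimates change per system):

```python
def continuity_case(revenue_per_hour: float, downtime_hours: float,
                    solution_cost: float) -> tuple:
    """Annual loss exposure vs. what the continuity solution costs."""
    exposure = revenue_per_hour * downtime_hours
    return exposure, solution_cost / exposure

# Hypothetical: $250k/hr of lost business, 8 expected downtime hours a year,
# against the quoted $90k/yr of cloud storage.
exposure, cost_ratio = continuity_case(250_000, 8, solution_cost=90_000)
print(exposure)              # 2000000
print(round(cost_ratio, 3))  # 0.045 -> storage is ~4.5% of the exposure
```

If that ratio isn't a small fraction, either the system isn't business-critical or the solution is overbuilt.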

Moving into the cloud, we take advantage of all the tools and capabilities that already exist, from service directories and virtual machines to provisioning and orchestration engines, schedulers, and service level managers. We move out of Disaster Recovery and into Business Continuity. The focus shifts from recovering business systems based on a Recovery Time Objective and Recovery Point Objective to providing near-seamless continuity by recovering services and virtual machines and cloudbursting to get needed resources. Is the cloud ready today? It's pretty darn close. Consider that Oracle's ERP solution will back up to and recover from cloud storage. According to Symantec's survey, three key hurdles in the virtualized world are storage management tools to protect data and applications, resource constraints which challenge the backing up of virtual environments, and the fact that today a third of organizations don't back up virtual environments at all.
Today or tomorrow, business continuity brought about by cloud concepts is on the horizon and is a target everyone should be shooting for. It saves money and time and reduces risk. What's not to love?

Tuesday, November 3, 2009

What Should a Cloud Provide

To me the single most important question I have never been asked is "What should a cloud provide?" I've been asked what cloud providers provide, at what cost, and what solutions are available. But nobody has taken the question up a notch, which tells me the focus is on application instead of understanding. We shouldn't limit our discussion to what's available. In such a nascent market we should define what we need, then tell the providers, who can in turn react and build the services deemed most valuable to the market. So what should a cloud provide?

Nearly Unlimited Bandwidth
It may sound impossible, but cloud computing is predicated on having bandwidth available on demand. Workloads and data need to be shifted around the infrastructure at a moment's notice, which means the network cannot be a constraint. This is a tremendous opportunity for the telcos to build out additional network bandwidth. More so, it is a tremendous opportunity for telcos to deliver cloud services themselves out of their existing data centers.

Storage without Backup
Storage on a large scale is cheap. Amazon S3 provides 50TB of storage for $90k/yr. At those prices I can keep three copies of every file and eliminate ALL the costs of backup and recovery for less than the fully loaded cost of three administrators. And costs go down as volume increases.

Resources On Demand
This is the easy one, but when additional processors are required they must be available, fully provisioned, within minutes if not seconds. Going from bare metal to fully loaded with the application and everything else shouldn't take more than 15 minutes. And although today we are bound by the limitations of processor type or operating system, now that technologists see these as barriers, new solutions will evolve to minimize their impact.

On Demand Management Tools
Four important tools to learn and love are provisioning engines, orchestrators, schedulers, and service level managers. Provisioning engines grab extra resources and put them into the correct pool when demand is high, and move resources out of the pool when demand subsides. Orchestrators observe the changes in traffic and determine when resources should be added or removed by the provisioning engine. Schedulers determine where jobs should run within the cloud and communicate with the orchestrators to ensure the required footprint is available. Finally, the service level manager watches over all the applications and intercedes whenever an application threatens to miss its service level targets. At times this may require ramping down usage of non-critical applications in favor of critical applications.
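The division of labor between the orchestrator and the provisioning engine can be sketched in a few lines; this is a toy policy with made-up thresholds and node names, not any vendor's API:

```python
class Provisioner:
    """Moves nodes between a free pool and an application's active pool."""
    def __init__(self, free_nodes):
        self.free = list(free_nodes)

    def grow(self, pool):
        if self.free:
            pool.append(self.free.pop())

    def shrink(self, pool):
        if len(pool) > 1:               # never drain the pool entirely
            self.free.append(pool.pop())

def orchestrate(utilization, pool, provisioner, high=0.8, low=0.3):
    """Orchestrator policy: add capacity when hot, release it when idle."""
    if utilization > high:
        provisioner.grow(pool)
    elif utilization < low:
        provisioner.shrink(pool)

prov = Provisioner(["node3", "node4"])
pool = ["node1", "node2"]
orchestrate(0.92, pool, prov)   # hot: pool grows to 3 nodes
orchestrate(0.10, pool, prov)   # idle: pool shrinks back to 2
print(len(pool), len(prov.free))
```

The orchestrator decides *when*; the provisioning engine decides *how*. A scheduler and service level manager would sit on top, feeding utilization figures in and overriding the policy for critical applications.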

Metering and Chargebacks/Billing
Cloud only makes sense under an economic model of paying by consumption, and that requires consumption and its associated cost to be communicated. Usage must be metered, and the costs must reflect the consumption. Although some costs will still make sense to allocate, the majority must move to a chargeback model.
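A chargeback model is, at its core, metered units times unit rates. A minimal sketch with hypothetical rates and departments:

```python
def chargeback(usage: dict, rates: dict) -> dict:
    """Meter-driven bill: each department pays sum(units * unit rate)."""
    return {dept: round(sum(units * rates[resource]
                            for resource, units in meters.items()), 2)
            for dept, meters in usage.items()}

# Hypothetical unit rates and one month of metered usage.
rates = {"cpu_hours": 0.10, "gb_stored": 0.15, "gb_transferred": 0.08}
usage = {
    "marketing": {"cpu_hours": 1200, "gb_stored": 500, "gb_transferred": 300},
    "finance":   {"cpu_hours": 400,  "gb_stored": 2000, "gb_transferred": 50},
}
print(chargeback(usage, rates))
```

Once every resource passes through a meter like this, the variable cost model falls out naturally, and the remaining allocated costs become the exception rather than the rule.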

Cloud Interoperability
Barriers between clouds should be security-based and nothing more. Clouds must be able to share data, services, and infrastructure; otherwise, instead of a cloud, one ends up with a larger silo. The future value of cloud will be the new capabilities unlocked in areas such as collaboration, supply chain integration, business intelligence, and federated control. Security checkpoints will be important, but they should be the only barrier to integrating clouds.

Business Rules
Since a cloud is a consumption-driven model, there need to be business rules which govern consumption. Look at healthcare today: what drove costs so high was the elimination of barriers to access. We made healthcare easy through co-pays and low deductibles. The easier something is, the more people will use it, which tilts the scale of supply and demand. In response, either supply must increase or prices will rise. So in the cloud we need business rules to govern consumption so it doesn't outstrip supply. What rights are required to execute an application? Who has the necessary rights? Who can grant rights? When will external resources be used? I believe that, just as happened in the mobile telecommunications industry, at some point tiered pricing will enter cloud computing and rates will be higher during the day than overnight. When this change occurs it could have a dramatic impact on the costs of a cloud solution.
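Here's what such a tiered, time-of-day rate could look like; the rates and the 8am-8pm daytime window are invented for illustration:

```python
def tiered_rate(hour: int, day_rate=0.12, night_rate=0.04) -> float:
    """Hypothetical time-of-day pricing: daytime (8am-8pm) costs more."""
    return day_rate if 8 <= hour < 20 else night_rate

def job_cost(cpu_hours_by_hour: dict) -> float:
    """Bill a job given its CPU-hours consumed in each clock hour."""
    return round(sum(units * tiered_rate(hour)
                     for hour, units in cpu_hours_by_hour.items()), 2)

# The same 100 CPU-hours of work, run during the day vs. shifted overnight.
daytime = {hour: 10 for hour in range(9, 19)}
overnight = {hour: 10 for hour in list(range(0, 5)) + list(range(20, 24)) + [5]}
print(job_cost(daytime), job_cost(overnight))   # 12.0 4.0
```

With a rate schedule like this, a scheduler that simply shifts batch work overnight cuts its bill to a third, which is exactly the kind of business rule that governs consumption without hard quotas.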

Monday, November 2, 2009

Three Cheers to Ubuntu

"Sit, Ubu, sit." I can't remember what show it was, but it always ended with a picture of a dog and that line. That's what I thought Ubuntu was when I first heard about it. Only later, through continued pestering, did I realize it was a Debian fork from the mid-2000s. I'm a bit of a Linux bigot who preferred the Slackware release in the Volkerding days and moved to RedHat in the late 1990s. When RedHat went commercial I felt lost for a bit but picked up Fedora. Hearing how great Ubuntu was from friends, I recently decided to give it a try on a Windows Vista machine with chronic problems, and I was impressed.

Oh, don't misunderstand me, I'm not impressed with Ubuntu. I simply haven't used it enough but so far it seems remarkably like, oh, Fedora.

I was impressed with the stance Ubuntu has taken on cloud computing. Ubuntu brags on its site about the inclusion of Eucalyptus (Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems). Eucalyptus is an Amazon EC2 clone that works with EC2, S3, and EBS. Ubuntu takes a relatively unknown open source package, includes it in their server distributions, and makes the argument that you should use Ubuntu as the foundation for your private cloud BECAUSE it will make you compatible with Amazon EC2 when you want to cloudburst. Now that's smart! People forget that Amazon is largely based on open technologies, with Linux virtual machines running the Xen hypervisor.


It makes me wonder where the relationship between Microsoft and Amazon sits. Microsoft and Google are competitors. In the growing cloud space Amazon and Google are competitors. But with Azure, Microsoft is also an Amazon competitor, kind of, right? No in the sense that they use different platforms (Windows vs. Linux), but yes in that they compete for business. Microsoft is so big, though, that perhaps Amazon needs to adopt "the enemy of my enemy is my friend" mentality and get Microsoft to build cloudbursting capability into Amazon EC2. It would mean more licenses of Microsoft server products and would address those Microsoft customers who don't want Azure, just more processor time. And what happens when the concept of the operating system falls apart, as large tech heavyweights such as EMC and Cisco are starting to argue it will? Does that push Microsoft out of the picture?

Regardless of all the above, and perhaps because of it, choosing Linux is easy. Which distribution to choose? Really, any, because Eucalyptus can work with any distro, but Ubuntu has clearly seized the initiative.

My prediction is we'll hear RedHat announce they are working with OpenNebula...

Friday, October 30, 2009

Who Will Be The Expert?

I've noticed a disturbing trend as we work to squeeze more and more computing technology graduates out of our universities. It appears we're dumbing the students down. Three quick examples; I have yet to meet a Generation Y or Z graduate who knows:

1. how to read and resolve a stack dump

2. how databases store data on the hard drive

3. how a network sends messages from one computer to another

Now I know there are some of you out there who do know; unfortunately, I haven't met you. And I run in a circle of technology consultants, whereas if I spent my time at Intel I'm sure more would know. Yet I work with graduates of our top technical schools: Michigan, Cornell, MIT, Virginia Tech, Stanford, Illinois, and Carnegie Mellon, to name a few. Why are my expectations important? Here are the answers:

1. The development consultant could not resolve a stack dump and didn't bother, because it was only happening on one developer's machine, so they reformatted and reinstalled the software. The cause of the problem, however, was an automated update which changed the Java Virtual Machine. When the client moved to production the application wouldn't work, because the operating system came with the later JVM and could not be downgraded. Downgrading the operating system meant losing crucial updates that improved database performance. Had they traced the stack dump they could have learned the cause months in advance and worked out an alternative, instead of calling in all hands for several days and delaying their launch AFTER their public announcement.

2. The data architecture consultant did not understand the relationship of the files within a database, so she incorrectly directed the support group to back up only some of the data files, believing the rest were "configuration files and stuff". It turns out configuration files are pretty important for interpreting the data, and as a result, when the primary host went down hard, the data was unrecoverable because it was incomplete.

3. The security consultant didn't understand that Ethernet delivers packets to every host on the segment, so he believed the connection between two machines was a point-to-point link and therefore impenetrable. About 30 seconds of research would have taught him about promiscuous mode. It took the client many weeks and tens of thousands of dollars to define and implement a new architecture (nobody would take my advice to drop in SSL).

I have a cousin who is a developer at Microsoft and has had a lifelong passion for computing. During his degree program in Computer Science he NEVER learned Assembly Language, C, or C++, instead being forced to focus on Java and its ilk. Java is OK, but it has issues, among them its heft. I spent several years in embedded systems and do not see Assembler being replaced anytime soon. At GM we used Modula, Assembler, and C as the languages to program the Engine Control Modules. Lower-level languages require you to understand how things like schedulers, pipelines, and memory work in order to maximize performance, and sometimes just to make something happen (pointer arithmetic to manipulate memory, for example). I encounter few Java developers who understand how memory is allocated and freed, how time-slicing is performed, or how the cache operates and invalidates its contents.

I don't know everything. I know how to ask and how to learn. I'm never afraid to admit what I don't know. But I take great pride and work hard to know as much as I can. In each of these cases significant problems were stumbled into because the person in charge lacked breadth and depth. Why? Because none of them ever learned the core elements of computing; they were all one trick ponies.

So what is the solution? Luckily we'll have graduates in my program, Computer Engineering, and other engineering and science disciplines to act as the true experts. And those graduates will be swallowed up by the hardware, networking, telecom, plant floor automation, aviation, and embedded systems companies around the world. For the remainder of the world, including consulting firms and corporations in non-R&D roles, I believe it's time to develop computing fundamentals courses to expand their view and understanding of computer technology. Companies today need people with multi-disciplined technology backgrounds both to lead large-scale technology efforts and to provide guidance in troubleshooting. Find the good ones, expose them to a wider perspective, and most often they'll start to look at things with a different set of eyes, which benefits everyone. In my experience I've had hundreds of conversations which end with "I never knew that. That's so interesting. Thank you for explaining it to me." I guess that's why I've worked on projects spanning program/project management, IT strategy, Enterprise Architecture, application rationalization, software development, infrastructure architecture, business intelligence, data warehousing, systems integration, ERP selection, call centers, CRM, SFA, architecture modelling, requirements determination, and many more, and in each one been considered the expert.

CRM, ERP, SFA, DW, and BI are not as challenging as operating system development, but they still need experts!

Sunday, October 25, 2009

The Value of Models

I've had a theory for years which I always intended to research and prove during the course of post-graduate work. My belief is that understanding the core elements of computing (logic gates, transistors, magnetic storage, assembler, Ethernet, etc.) makes all of their applications (databases, business intelligence, web architecture, cloud computing, etc.) easy to understand. Each computing technology is composed of a set of models, and I've found many models repeat themselves. We handle multiple simultaneous requests on processors, networks, and storage using the same time-slicing model, which is the same way Client Service Representatives handle multiple chat requests. My theory is that a person who learns and understands the models has the fastest route to gaining advanced knowledge in any one area and will have the broadest view.
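That shared time-slicing model is easy to make concrete. A round-robin sketch: each requester (here, chat sessions, but it could just as well be processes on a CPU) gets a fixed quantum of attention in turn until its work is done:

```python
from collections import deque

def round_robin(tasks: dict, quantum: int = 2) -> list:
    """Time-slicing model: serve each requester a fixed quantum in turn,
    re-queueing anyone with work remaining. Returns the service order."""
    queue = deque(tasks.items())        # (name, remaining units of work)
    order = []
    while queue:
        name, remaining = queue.popleft()
        order.append(name)              # this requester gets one quantum
        remaining -= quantum
        if remaining > 0:
            queue.append((name, remaining))
    return order

print(round_robin({"chat_a": 3, "chat_b": 5, "chat_c": 2}))
```

The same loop describes a CPU scheduler, a network link multiplexing flows, or a service rep juggling chats; only the names of the quantum and the queue change.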

Over the past 30 years I've become a model-driven person. I have taken the models I learned in college and continuously added new ones or made existing ones more robust to provide my core understanding. I've applied the same rules to business, using my consumer-side interactions (retail purchases, my bank account, etc.) as the foundation for models in each industry. What this model-driven approach gives me is a head start whenever I encounter something new. I have found I can hit the ground in a new business vertical and be considered a technology and process expert within 90 days. My first goal is to understand, second to align one or more of my existing models, third to perform a gap analysis, and fourth to fill in the gaps. The result tends to be a very robust understanding.

I am often asked by my leadership where we can find more of me. It's not me they want to replicate; it's my way of learning and applying knowledge. But it all starts with an open mind. What I find in my competition, regardless of level or job type, is a very myopic view. Early in my career at Eaton/Cutler-Hammer I worked with a software engineer who told me he didn't care about the hardware; he only wanted to know where the on/off switch was. He felt understanding the hardware would take too much time and too much capacity. If only he had realized the models between hardware and software are largely similar.

I work day in, day out with ERP and supply chain gurus, CRM experts, and people focused on Enterprise Transformation. What I find interesting is how many are one- or two-trick ponies. They are considered experts, yet they cannot explain how things really work within their own domain, certainly not to someone new to it. Perhaps they know the processes, which is paramount, but they don't understand the model. Every problem is different, and while I agree one can repeatedly apply the same approach to solving a problem, too often consultants try to apply the same solution. When I dive in, it becomes readily apparent the reason for pushing the same solution is that nobody really understands it, but it's been proven to work. It's a best practice. The truth leaked out in mid-2002, while I was at PricewaterhouseCoopers Consulting, when we were told to use the term "leading practice" in place of "best practice". Now that's logic I can agree with, because there is never one best practice.

As an Enterprise Architect I'm a modeller in a modeller's world. I find it interesting how businesses are now starting to unlock the power of modelling. In a recent internal discussion, one of the 2010 technology trends discussed was the evolution of modelling into a primary business focus. Perhaps not in 2010, but the fact that it was even a topic of discussion surprised me. I guess I'm lucky in that the way I naturally think is evolving as a better mousetrap. Hopefully it has long legs, or my thinking continues to evolve.

Perhaps I should develop a course on modelling...

Sunday, October 18, 2009

Google Makes an Important Step Forward in Cloud

I'll leave my criticisms of the hypocrisy at Google between their "Do no evil" motto and their actions aside for now. I have to applaud Google for resolving one of three hurdles to the use of their Cloud products. Google is now releasing methods to transfer the data users have put into Google products such as Google Docs, Blogger, and Gmail out of Google's data centers. Google calls it the Data Liberation Front, and it has its own website at dataliberation.org. Google will provide an easy-to-use method for exporting data from every product, in bulk, to the user's selected destination. Bravo, Google!

Why is this important? The Cloud is predicated on a virtualized infrastructure (utility computing model) with a service-based software layer (SOA). The combination of these two models creates a powerful foundation to provide the most flexibility in the most efficient manner. Creating arbitrary obstacles to moving data in and out of data stores, using application components instead of only full applications, and changing where the data and applications reside destroys some of the value of Cloud. Cloud has to be bigger than Google, Amazon, Rackspace, IBM, or any other vendor. The emerging Cloud lacks a definition of data ownership. Ownership means the ability to add, change, delete, and move at will and have it still be useful. Companies always seem to forget, whether it's the old ASP model or SaaS, that they need to make sure their data can be exported AND imported into another tool; otherwise they don't own the data but rather have granted that ownership to the application/platform provider by proxy.

For the Fortune 1000 Cloud will start inside the data center, as it already is for large banks and a select few others with vision. To be relevant, public Cloud offerings need to enable, not disable, integration across the public/private cloud boundaries. Users need to own their data which is not the case when a cloud provider partitions it for them but ties it inextricably to their platform. Without ownership standards the Cloud represents nothing more than a new larger external silo, but still a silo.

What Google is doing will raise the bar for all providers, which should go a long way to making Cloud more palatable from a risk point of view. The next steps should be enabling a federated data model, so I can store highly sensitive data at home while using Google for the remainder of the data, but all within a single data model. In addition, Google can go further by enabling its applications as services for integration into other tools. Of course I expect Google will need remuneration and needs to think through how enterprise licensing will work, because everyone knows you get what you pay for. As long as a service is free, you remain at the mercy of the provider.

Oh, the other two obstacles Google needs to figure out? First, encryption of data at rest and in transit, which I know is already on the drawing board and partly implemented in tools such as Google Docs. Second, interoperability standards to enable the shifting of data and applications throughout the cloud. Where Google goes, the public Clouds will follow.
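That first obstacle, the "encrypt in transit and at rest" foundation, can be sketched with nothing but the standard library. This is a toy illustration only (a SHA-256 keystream with an HMAC tag, both hypothetical choices for the sake of the sketch); a real deployment would use a vetted scheme such as AES-GCM over TLS.

```python
import hashlib, hmac, os

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode. Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt-then-MAC: the opaque blob you'd store at rest or send in transit."""
    nonce = os.urandom(16)
    ct = bytes(p ^ k for p, k in zip(plaintext, keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def open_sealed(key: bytes, blob: bytes) -> bytes:
    """Verify the tag before decrypting; reject tampered or wrong-key blobs."""
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    if not hmac.compare_digest(tag, hmac.new(key, nonce + ct, hashlib.sha256).digest()):
        raise ValueError("tampered or wrong key")
    return bytes(c ^ k for c, k in zip(ct, keystream(key, nonce, len(ct))))

key = os.urandom(32)                    # key stays with the data owner
blob = seal(key, b"customer record")    # only this blob goes to the provider
assert open_sealed(key, blob) == b"customer record"
```

The point of the sketch is the ownership argument from above: if the key never leaves you, the provider holds only an opaque blob, and the remaining work is the performance overhead, not the trust model.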

Sunday, October 11, 2009

A Road Made Longer by Doubt

In 2003 I joined the IBM Grid & Virtualization team as the lead architect for Healthcare in the Americas. As a part of Systems Technology Group, our job was to evangelize grid computing and virtualization technologies. We helped early adopters move further faster, investing time and money to learn and have an impact. As the harbinger of new technologies we met LOTS of skepticism and doubt. Funny enough, just about everyone is moving in that direction now, many having given up their chance at innovation by not being the first in their industries to adopt the technologies. Oh well. If I had a dime for every bad business decision about technology I've witnessed in 15+ years of consulting, I'd be retired to my own island.

In 2003 grid computing was synonymous with high performance computing, a boundary we worked tirelessly to break down because it was arbitrary at best. In this endeavor I upgraded one of my computers to the latest Nvidia card, a company I have followed for years, first because of its technology and second because two of its executives went to the same small engineering college I attended. When I researched the specs, looked at the calculations, and compared those to what a medical research institution was attempting, I realized the video GPU had much more to offer than the general-purpose CPU. I talked to a few fellow IBMers who agreed and had been looking at such uses for a few months. Getting my facts together, I approached my client to propose we do some joint research into the use of the GPU for the calculations.

Cost? Zero. I had funding in "blue money" ready to go. Delays? Zero. We had a functioning system, but it was resource constrained. Receptiveness of my client? Also zero. No appetite.

The world has moved on and Nvidia has stayed its course now designing a new GPU architecture specifically to advance high throughput computing. It's a great idea, especially considering Nvidia's past architecture has enabled the sharing of resources across four video cards. Impressive to say the least!
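The appeal of the GPU for that kind of research is easy to sketch: the same pure calculation applied independently to millions of data points, so every element can be computed at once. The sketch below is a conceptual stand-in only, using Python threads in place of GPU threads and a hypothetical per-element "kernel"; real GPU code would be written in something like CUDA.

```python
from concurrent.futures import ThreadPoolExecutor

def kernel(x):
    # One "thread" of work: a pure, independent calculation per element.
    # (Hypothetical formula, standing in for a real scientific kernel.)
    return 3.0 * x * x + 2.0 * x + 1.0

data = list(range(8))

# Serial (CPU-style): one element at a time.
serial = [kernel(x) for x in data]

# Data-parallel (GPU-style): conceptually, every element at once.
with ThreadPoolExecutor(max_workers=8) as pool:
    parallel = list(pool.map(kernel, data))

assert serial == parallel  # identical results; only the execution model differs
```

Because no element depends on any other, a device with thousands of simple cores wins over a handful of fast general-purpose ones, which is exactly what made the GPU pitch to that research institution so compelling.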

I wonder who, beyond Oak Ridge, will bite. More so, I wonder how much faster we could have advanced important causes such as research into pharmacogenomics and protein folding if we had adopted this technology earlier. I have to believe it would have expedited the development, ultimately leading us to the same place but sooner. I don't know about you, but I'm interested in an AIDS vaccine and a cure for cancer BEFORE I die. I'm sure others are too. Isn't there a moral imperative that says research centers should be, perhaps, researching? I found they do as long as the topics aren't too politically sensitive. And for whatever reason the idea we presented was a political landmine. Too many people would have to agree. Too many people would have to approve. It would take too long. There was no guarantee.

Yes. But there is no guarantee in research, or did I miss something?

Too bad for all the people for whom the vaccine arrives one minute later than needed, and for those whom it could have saved had it come to market on the earliest possible path instead of the one easiest to navigate. I guess that's why I left R&D after my internship in college and never wanted to go back (although I've been dragged, reluctantly, back through the halls of R&D a few times). R&D to me is all promise with very little delivery. Guess that's why we do so little in the United States these days. I didn't realize the reason for poor delivery was that researchers were afraid to do... uh... research.

Friday, October 9, 2009

Welcome to the Cloud Everyone!

I find it interesting how some of us have always envisioned a computing world based on "cloud" technologies while others are just figuring out what cloud means. I started getting into virtualization technologies back in 1993 during an internship in the Artificial Intelligence Group at Eaton/Cutler-Hammer. A student in Computer Engineering at MSOE at the time, I used the FIDE package (Fuzzy Inference Development Environment) to do research on fuzzy logic. I could mimic the Motorola MC68HC11 micro-controller, which opened my eyes to the reality of our compute stack. Each layer is an abstraction not for the benefit of the machine, but for the benefit of the human. A CPU cannot differentiate between code at the micro, operating system, or application levels. Packages, libraries, databases, and user interfaces all look the same.

About this time I was asked by our neighbor back at home, President of Ameritech's business services division, what I felt the next big thing in computing would be in ten years. I was off on the timing but I replied "Hey, you guys in the phone company have lots of big computers. I think the future is running the applications I want on those machines and charging me only for what I use. You have way more power than I could ever afford because I only need it for a few nanoseconds at a time." My father, a first generation Computer Engineer, reminded me of the conversation earlier this year.

Post-graduation I worked for General Motors via EDS on the Powertrain Embedded Systems and Controls team, where I developed and managed the teams developing components of the Engine Control Module code as well as development utilities such as Cal-Tools and WinVLT. Again we used software to emulate hardware, further cementing my belief that the hardware was, in some form, a commodity.

Fast forward to the Linux movement around 1996, when I saw the value of open source as a form of virtualization: consolidating the logic of various systems into a common, open application for everyone to learn and expand. Open source had few proponents in business, but that didn't stop friends and me from trying to move an outdated call center outsourcer out of the mainframe age and into the internet age. We didn't succeed, not for a lack of technical capability but for a lack of salesmanship, but that's another story for another post. We did succeed in demonstrating the value of open source, and once we did there was no going back (at least for us; the company sadly regressed through a number of failed implementations, including Lotus Notes - whose idea was that!?!? - and eventually folded into the pages of history, acquired by an Indian firm).

So around 2002 I ran smack into an idea I called serverless computing. The idea is based upon that fundamental realization that most constructs of computing are for our benefit. So why, then, do we need servers? We invented the idea of a server to help us branch out from one system to two. Client. Server. Easy! But what if the client also acts as a server? Preposterous, everyone told me. Crazy. Insane! Never! I submitted a paper to multiple outlets, including my employer IBM, but nobody would give it a second look (honestly, I bet most didn't give it a first look either).

Ok. I took a step back, did some research on grid computing and peer-to-peer networks, and realized I wasn't wrong; it was simply that nobody I talked to could see the light.
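The idea itself takes only a few lines to demonstrate: one process that answers requests AND issues them as a client, so the client/server split becomes a role, not a machine. This is a minimal sketch using plain HTTP between a node and itself as a stand-in for a peer protocol; real peer-to-peer systems add discovery, routing, and replication on top.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class PeerHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The "server" role: answer any peer that asks.
        body = b"hello from a peer"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

# Server half of the node: listen on an ephemeral port in the background.
server = HTTPServer(("127.0.0.1", 0), PeerHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client half of the same node: ask a peer (here, itself) for data.
url = f"http://127.0.0.1:{server.server_port}/"
with urllib.request.urlopen(url) as resp:
    reply = resp.read()
server.shutdown()
assert reply == b"hello from a peer"
```

Scale that one process out to every machine on the network and the dedicated "server" disappears as a required construct, which was the whole point of the paper.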

Well, I welcome those who do. Important people like those at Cisco and EMC who have finally understood the operating system is a limitation. Now that they're thinking, I expect that thinking will evolve and they'll realize it's not just the OS: development environments, platforms, and the very concept of a centralized data center are significant limitations with inherent cost disadvantages. Let's move to a new model which frees us from the limitations of the physical, which has always been the interface boundary for innovation!

Welcome to the Cloud! Welcome to the Party!

For more on Serverless Computing read the unpublished paper Serverless Computing.