Sunday, January 17, 2010

You need it real time? Really? Real Time???

Business intelligence and executive dashboards are all the rage, soon to be complemented, if not already, by customer dashboards. Although I'm no ERP specialist I get involved in all kinds of projects, and one of the most glaring inconsistencies I face is the demand for real time data when the underlying data changes only daily or even weekly.

First we need to drop the term real time and focus on the term most recent. Real time to a guy like me from the plant floor automation/data acquisition/embedded systems/engine control module world means something TOTALLY different. Real time means just what it says, in the moment. No delays.

Second, business people need to think about data with the fourth dimension of time. Why ask for a data update when the data hasn't changed? Whenever I hear the request for real time data I push back by asking "How often does the data change?" Sometimes it's hourly, sometimes periodically throughout the day. Most often it's daily or weekly. Rarely does the data change minute to minute. More important to understand is how the data drives a decision. In one project we grabbed real time service data when all the executives needed was an hourly update. In turn that hourly update was used to perform daily staffing analysis, and even then only when the service data exceeded set boundaries. Since the designers never asked, it was just assumed the data had to be real time. We changed from a real time posting to the executive information portal to an exception based alert and made everyone's life that much easier. This is one of hundreds of examples from healthcare to consumer packaged goods to retail that I have in my work history.
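
The move from real time posting to an exception based alert can be sketched in a few lines. This is only an illustration of the idea, not the code from that project; the names Boundary and check_reading and every number below are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Boundary:
    """Acceptable range for a metric (hypothetical values)."""
    low: float
    high: float


def check_reading(value: float, boundary: Boundary) -> Optional[str]:
    """Return an alert message only when the value leaves the boundary."""
    if value < boundary.low:
        return f"ALERT: {value} is below the lower bound {boundary.low}"
    if value > boundary.high:
        return f"ALERT: {value} is above the upper bound {boundary.high}"
    return None  # in range: nothing is posted to the portal


# Hourly readings (made-up numbers) checked against a made-up boundary.
hourly_service_calls = [42.0, 47.0, 95.0, 51.0]
limits = Boundary(low=20.0, high=80.0)
alerts = [msg for v in hourly_service_calls
          if (msg := check_reading(v, limits)) is not None]
print(alerts)
```

Only the out-of-range hour produces a message; the other three hours generate no traffic to the portal at all.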

Why is this a big deal? Speed costs. It does in racing and it does in technology. The faster you want to go the more it's going to cost you. In IT we're not doing our jobs if we comply with the request just because the business "wants it". When we load up systems with unnecessary requests we slow everyone down; that is the opportunity cost of bad design. A great architectural option is the data intermediary, which sits between the application and the data store to cache query and service results (if the data hasn't changed, the service response won't either).
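
Here is a minimal sketch of such a data intermediary, assuming the data store can answer a cheap "has anything changed?" question through a version number. The class name DataIntermediary, the callables run_query and data_version, and the stand-in store are all hypothetical.

```python
from typing import Any, Callable, Dict, Tuple


class DataIntermediary:
    """Caches query results and re-runs a query only after the data changes."""

    def __init__(self, run_query: Callable[[str], Any],
                 data_version: Callable[[], int]) -> None:
        self._run_query = run_query        # expensive call to the data store
        self._data_version = data_version  # cheap change check (an assumption)
        self._cache: Dict[str, Tuple[int, Any]] = {}

    def query(self, sql: str) -> Any:
        version = self._data_version()
        cached = self._cache.get(sql)
        if cached is not None and cached[0] == version:
            return cached[1]               # data unchanged: reuse the result
        result = self._run_query(sql)
        self._cache[sql] = (version, result)
        return result


# Stand-in data store that counts how often it is actually queried.
store = {"version": 1, "rows": [10, 20, 30], "hits": 0}


def run_query(sql: str) -> int:
    store["hits"] += 1
    return sum(store["rows"])


cache = DataIntermediary(run_query, lambda: store["version"])
first = cache.query("SELECT SUM(x) FROM t")
second = cache.query("SELECT SUM(x) FROM t")   # served from the cache
store["rows"].append(40)
store["version"] += 1                          # the daily load arrives
third = cache.query("SELECT SUM(x) FROM t")    # re-queried after the change
```

The dashboard can ask as often as it likes; the data store only does work when the data has actually moved.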

Educate the business, add the time dimension to all data, and consider how the data drives decisions when building interfaces, services, reports, and dashboards. And the added side benefit? Less complexity!

Tuesday, December 29, 2009

We're Through the Looking Glass - Cloud Security

I feel it's appropriate to consider an Alice in Wonderland world when thinking through the cloud computing landscape and its security implications. Most security experts I know, including many CISO's at clients, some of whom are quoted on the topic, appear to throw water on the burning desire of CTO's everywhere to go "cloud". It's understandable and, in my opinion, quite reasonable. Let's face it - the cloud isn't ready for prime time.

I have no argument against using the cloud for non-critical tasks but I tell clients day in and day out we are 2-3yrs from enterprise clouding (I love new domains where we can make up words!). Comments like that get me in the good graces of CISO's, at least until my next sentence, "You better get started now." What? Why do we have to get started now if the enterprise version won't be ready for 2-3yrs? Well because that's when it will be easy and everyone will have it - don't you want a competitive advantage? Well then put your nose to the grindstone and get that whole security thing figured out pronto so IT can move forward...

...or get run over, your choice!

With the cloud computing juggernaut gaining speed now is not the time for "No, but..." responses. What CIO's and CTO's need now are "Yes, if..." answers on how to pursue secure cloud services. We have lots of existing models, standards, and solutions so nobody can tell me the cloud is entirely unique. What it does present is a new architecture to which we need to plug in known solutions to known problems and some new solutions to cover feared gaps.

One of the biggest gaps clients identify today is data security. "How do I know my data is secure at a cloud provider?" Honestly I don't know in a holistic way, but the old stand-by of encrypting data in transit and data at rest seems to form the foundation of a solution. The immediate response, as the responder's face wrinkles so their eyes become nothing but slits in the creases of skin below their brow, is "But that's too much overhead". Oh. So are we taking this security thing seriously or not? If we are, then again, let's take our foundation and now get to work on the speed issue. Solving that problem involves the economics of speed where money is often the answer, governance so we don't speedily reach a cliff, and improving performance so the overhead of encryption becomes a round-off error.

Economically we don't have much of an issue. Cheap bandwidth. Cheap cloud storage. At $90k per 50TB of data storage at Amazon S3 we can afford encryption even if it increases our data sets by an order of magnitude in size. Governance is an issue but as we increase the use of automation in the cloud we should be automating governance as well. We need strong tools enabling us to enforce policies, especially on data which hasn't been categorized.

How do we increase performance? Encryption takes time but if we can convince the cloud storage providers to provide hardware based encryption we can reduce the cost. Next we need a new way of thinking that takes advantage of cloud: lots of network bandwidth, storage services available on the fly, and ubiquitous availability.

How about applying a grid storage idea to the problem of data archival? Take a set of data and split it up into multiple chunks, each chunk with a sequence number, and encrypt it. Store a random set of chunks at three or more storage vendors and manage which data is stored where using a private index engine. Because each site contains only a portion of the total data, the data is non-contiguous, and the placement is random, the value of the data at any one site is dramatically reduced. A hacker would be required to hack all the sites, decrypt the data, and reassemble it to get the full picture. With the landscape being inherently more difficult to hack and the value of any independent data set being low, a less onerous encryption method, such as one using 64bit keys, can be used.
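
A rough sketch of the chunk, encrypt, and scatter idea follows. It is an illustration under stated assumptions, not a finished design: the three vendors are plain in-memory dictionaries, the function names archive and retrieve are hypothetical, and the cipher is passed in as a callable. The XOR stand-in used in the example is NOT real encryption; it only keeps the sketch self-contained.

```python
import random
from typing import Callable, Dict, List, Tuple

Index = List[Tuple[int, str]]          # (sequence number, vendor name)
Vendors = Dict[str, Dict[int, bytes]]  # vendor name -> {sequence: chunk}


def archive(data: bytes, chunk_size: int, vendors: Vendors,
            encrypt: Callable[[bytes], bytes], rng: random.Random) -> Index:
    """Split, encrypt, and scatter chunks; return the private index."""
    index: Index = []
    for seq, start in enumerate(range(0, len(data), chunk_size)):
        chunk = encrypt(data[start:start + chunk_size])
        vendor = rng.choice(sorted(vendors))   # random placement
        vendors[vendor][seq] = chunk
        index.append((seq, vendor))
    return index


def retrieve(index: Index, vendors: Vendors,
             decrypt: Callable[[bytes], bytes]) -> bytes:
    """Reassemble the original data in sequence order using the index."""
    return b"".join(decrypt(vendors[vendor][seq])
                    for seq, vendor in sorted(index))


def toy_cipher(block: bytes) -> bytes:
    """Placeholder only: a self-inverse XOR, NOT a secure cipher."""
    return bytes(b ^ 0x5A for b in block)


vendors: Vendors = {"vendor_a": {}, "vendor_b": {}, "vendor_c": {}}
data = b"monthly billing detail " * 10
index = archive(data, 32, vendors, toy_cipher, random.Random(7))
restored = retrieve(index, vendors, toy_cipher)
```

The index is the one piece that must stay private: without it, an attacker holding one vendor's chunks has neither the order nor the location of the rest.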

Where could such a solution be used? How about monthly billing. Once the bill is generated and paid, the details are rarely if ever used again. Archive the data to the cloud. If it needs to be retrieved it can be, but its value is low to begin with for most hackers. Securing the data through obfuscation will make most hackers look for easier targets. Hacking is a numbers game.

So for all the CISO's out there consider that now is the time to identify the gaps and start looking at how to fill them in. One thing I can assure you of as we talk to CIO's and CTO's, the cloud computing train is coming and it's starting to build some serious momentum.

Be prepared to lead, follow, or get out of the way!

Sunday, November 15, 2009

The Achilles Heel of Cloud Computing

I think everyone understands the "Cloud" in cloud computing is an undefined network incredibly similar to, but not necessarily synonymous with, the Internet. What many apparently have not spent time thinking through is the impact building private and public clouds will have on the network architecture at most public companies. Three fundamentals which need to be revisited are: the Internet connectivity architecture, Internet bandwidth, and network security.

Many, if not most, large corporations consolidate their internet connectivity into a very few and sometimes a single point. I was working recently with a large non-governmental organization that has consolidated all of its internet access for primary and field offices into its Chicago data center. It's a great model for using the Internet, not so good for incorporating the Internet. In the world of Cloud Computing the Internet is less an end point and more one of several intermediate points during the execution of a function. Public clouds must be accessible at all times from any location to be of value. This mandate implies there is no single point of failure between the corporation and the public internet. A new architecture is required with many-to-one access to the internet instead of a one-to-one model. If New York cannot connect to the Internet it cannot jeopardize the entire corporation. And remember that Internet backbones do go down, and will likely do so more often as loads keep increasing for the foreseeable future. If the public cloud function leverages data within the data center the reverse is true; multiple paths provide redundancy. I'm sure some will argue with me but I cannot make sense of data travelling from Denver to Chicago just to gain access to the Internet; it's an archaic model at best.

The second major issue is existing Internet bandwidth will have to grow. At the same time as traffic moves to the Internet it will move off of internal WAN's. We've grown accustomed to cheap bandwidth but with the explosion of WiFi, the coming of WiMax, and the growth of rich media on the Internet I expect those days are coming to a close. I expect it will be cheaper to run over the Internet than through private backbones which will help drive us to a more federated model for Internet connectivity.

Finally, security as we see it today becomes problematic. How do we sniff packets between a user and a cloud provider when the traffic traverses nothing the company controls in between? Surely we could route users through the corporate firewall but again, this defeats some of the economic model of cloud. We need better tools on the client side to help us manage the security aspects of this federated model. I'm not saying there aren't tools today, but those tools need to improve their automated detection, recording, and reporting capabilities to prevent attacks both inside and outside the company.

I've noticed over the past ten years a change in approach to networking. In the past bandwidth was managed loosely to ensure it was adequate. We've really tightened it down and now we need to be asking our bandwidth providers to provide more virtualized options enabling the rapid, automated scale up and down of circuits. We don't want to leave the network out of the push to move from a fixed to a variable cost model in IT.


Tuesday, November 10, 2009

Marketing Through the Cloud

Cloud Computing promises to turn many conventional ideas on their head such as disaster recovery, data warehousing, and application development. However one area I see as having the opportunity to explode in value with Cloud is Marketing. I have always had an affinity for marketing from college where it was my minor (unofficially because as a Computer Engineering student we weren't allowed to have minors) through multiple interface roles from my internship at Eaton/Cutler-Hammer through my time in Sales & Distribution at IBM. The challenge of marketing is to know what the customer thinks and what the customer wants, often in advance of the customer. Much of marketing is driven by research; what works, what doesn't, who, what, where, when, how, why. The only limitation is time because there is always one more question to ask. At some point that research has to be analyzed and made actionable through inputs to product development, sales, advertising, eCommerce and IT.

So where does cloud come into play? First at the most base level with the enablement of rapid change represented by SOA. Once an SOA foundation is in place there are no limits to the reach of marketing. Social Networking. Semantic Web. Business Intelligence. Awareness. These are the future of marketing and all work better on an SOA foundation.

Social networking gives marketers the opportunity to find the influencers, tailor marketing messages, solicit feedback, observe from afar, and even seed new ideas. It's a human lab with no walls and no limitations. I have yet to see tools such as Facebook used for focus groups, or Twitter to measure interest, or harvesting forums for feedback. It may happen but as a user and an industry insider I've heard no discussion.

Semantic web technologies will be as important to Marketing as they will to supply chain management. The closer marketing gets to understanding the whole picture the better they can analyze data in the proper context and provide better input to downstream efforts. Why did someone purchase the product? What was the impetus for purchase? What made them think of the product? What was their first thought about the product? Good questions to ask, good information to know, but today all of this data is gathered post-sale through interviews. What if, via semantic technologies, customers were queried for this data and responded, all without realizing it? Capturing data real-time is always preferable to eliminate the issues of memory loss and filtering.

With the trove of new data available new business intelligence tools will emerge, and having the data available in the cloud means the four walls of the data center will no longer limit how the data is analyzed. Specialty firms staffed with PhD's will offer services to slice and dice the data using their proprietary tools and provide an additional layer of context by bringing in additional 3rd party data sources. Business intelligence, real business intelligence and not the analytical reporting which passes for BI in many companies today, will emerge as a cloud service.

Awareness? What is that? Find me a better name and I'll use it but through cloud computing and semantic technologies we are a very large and important step closer to enabling awareness; the ability of a computer to understand. Although it will start with small steps, over the next 10 years computers will increasingly direct their own searches for data they feel is missing to assemble their own conclusions based on simple human queries. And when this happens we'll be ready to turn the corner in Marketing and become more proactive than reactive, able to predict events and prepare to seize opportunity. We'll be able to take our supply chain optimizations which can move the snow blowers to the states with the impending snow storms and extend that knowledge with who owns a snow blower ready for replacement, what size is needed based on the footprint of their property, who would benefit from snow removal services, who can be prompted to buy to use a remaining store credit. Targeted marketing will begin to take on the 1:1 reality we've been talking about for the past decade.

Cloud is a revolutionary technology which can be adopted in evolutionary steps making it unique and unavoidable. Cloud brings the world closer together eliminating some of our artificial barriers and bridging ones which are all too real. For Marketing this new capability will drive new thinking, new approaches, and new solutions bound only by their need to understand the customer better than the customer understands themself.

Saturday, November 7, 2009

The Sun Sets on Disaster Recovery (Finally)

Big changes have to come in small doses. For those of us fortunate to have several years' experience with SOA and utility computing we see so many of the great things cloud can do and how it really addresses so many of the complexities within IT. I often explain part of the tremendous value of Cloud Computing is trapping complexity within layers of abstraction so we don't expose limitations. However I see one of the biggest killer apps for Cloud Computing, business continuity, as not only trapping disaster recovery within the infrastructure layer but doing away with disaster reactivity entirely!

First, not every failure is a disaster. Failures can and do occur and we need to be smarter about how we engineer our solutions. Our focus should be on automated recovery; an option which becomes a real solution in a cloud world. If a service dies another service should be started. If hardware fails jobs should move to alternate hardware. Creating heat maps for failover can go a long way to identifying and targeting areas where failure recovery needs to be addressed and hopefully automated. But a disaster is a large scale failure for which we so often employ a different set of tools. Why? Primarily because of our legacy silo approach. If one silo dies we need to move data and jobs to an alternate silo. In a cloud architecture we don't have silos (even in a virtualized architecture the silo is a logical rather than a physical manifestation). So?

Once we architect our solutions to be service oriented and distributed from the start we lessen the impact of all failures from the simple to the theatrical. If we lose a data center that's bad. However if our solution is already load balancing across data centers, and we ensure by business rule we always have services available in each center, then our exposure is limited to in-flight transactions and sessions. If we have a true cloud infrastructure then we should already have the network bandwidth required to perform database mirroring in which case the disruption of the data center loss is as minimal as we have the ability to make it. Further advances will come to light in the next few years as databases tackle the federation issue and learn to manage data in logical instances rather than physical domains.
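
The business rule described above (always keep services available in each center, and let the survivors absorb the load when one center is lost) can be sketched as a small planning function. The function name plan_starts, the data center names, and the instance counts are all hypothetical; a real orchestration tool would do far more.

```python
from typing import Dict, List


def plan_starts(running: Dict[str, int], healthy: List[str],
                min_per_center: int, total_required: int) -> Dict[str, int]:
    """Return how many service instances to start in each healthy center."""
    # Rule 1: every healthy center keeps at least the minimum.
    target = {c: max(running.get(c, 0), min_per_center) for c in healthy}
    # Rule 2: make up any remaining shortfall on the least-loaded centers.
    while healthy and sum(target.values()) < total_required:
        least = min(healthy, key=lambda c: target[c])
        target[least] += 1
    return {c: target[c] - running.get(c, 0) for c in healthy
            if target[c] > running.get(c, 0)}


# Chicago has just been lost; Denver and New York must carry the load.
running = {"chicago": 3, "denver": 1, "new_york": 0}
plan = plan_starts(running, healthy=["denver", "new_york"],
                   min_per_center=2, total_required=6)
print(plan)
```

Run continuously, the same rule handles a single dead service and a lost data center alike, which is the point: the large failure stops needing a separate set of tools.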

None of this happens, however, without intent.

We need a business case. What is the lost business opportunity per hour of downtime? For many business critical systems this value is calculable, and I argue if it's not then there's no reason for recovery. Our solution cost needs to be a small percentage of that potential loss. Today disaster recovery is EXPENSIVE: hot-sites and recovery contracts, tapes to retrieve from an off-site location and restore, staff to move around, periodic tests which always end with multiple failures. According to the Symantec 5th Annual IT Disaster Recovery Survey in June 2009, the average annual budget for disaster recovery is $50M. Consider that against the cost of 50TB of storage on Amazon S3: $90k. WOW! So in one fell swoop we can improve recovery speed and accuracy and reduce cost by eliminating tapes, backups, tape recoveries, off-site storage, and the administration costs. And it only gets better!

Moving into the cloud we take advantage of all the tools and capabilities that already exist from service directories and virtual machines to provisioning and orchestration engines, schedulers and service level managers. We move out of Disaster Recovery and into Business Continuity. The focus shifts from recovering business systems based on a Recovery Time Objective and Recovery Point Objective to providing near seamless continuity via recovering services and virtual machines and cloudbursting to get needed resources. Is the cloud ready today? It's pretty darn close. Consider that Oracle's ERP solution will back up to and recover from cloud storage. According to Symantec's survey three key hurdles in the virtualized world are storage management tools to protect data and applications, resource constraints which challenge the backing up of virtual environments, and that today 1/3 of organizations don't back up virtual environments.

Today or tomorrow business continuity brought about by cloud concepts is on the horizon and is a target everyone should be shooting for. It saves money and time and reduces risk. What's not to love?

Tuesday, November 3, 2009

What Should a Cloud Provide

To me the single most important question I have never been asked is "What Should a Cloud Provide." I've been asked what do Cloud providers provide, and at what cost, and what solutions are available. But nobody has taken the question up a notch which to me means the focus is on application instead of understanding. We shouldn't limit our discussion to what's available. In such a nascent market we should define what we need, then tell the providers who can in turn react and build the services deemed most valuable to the market. So what should a cloud provide?

Nearly Unlimited Bandwidth
It may sound impossible however cloud computing is predicated on having bandwidth available on demand. Workloads and data need to be shifted around the infrastructure at a moment's notice which means the network cannot be a constraint. This is a tremendous opportunity for the telco's to build out additional network bandwidth. More so this is a tremendous opportunity for telco's to deliver cloud services themselves out of their existing data centers.

Storage without Backup
Storage on a large scale is cheap. Amazon S3 provides 50TB of storage for $90k/yr. At those costs I can keep three copies of every file and eliminate ALL the costs of backup and recovery for less than the fully loaded cost of three administrators. And costs go down as volume increases.
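
The arithmetic behind that claim is simple enough to write down. The per-GB rate below is an assumption, chosen because it reproduces the $90k/yr figure for 50TB; the function name annual_cost_dollars is made up, and no administrator salary is assumed.

```python
GB_PER_TB = 1_000                # decimal terabytes (an assumption)
RATE_CENTS_PER_GB_MONTH = 15     # assumed rate that yields $90k/yr for 50TB
MONTHS_PER_YEAR = 12


def annual_cost_dollars(terabytes: int, copies: int = 1) -> int:
    """Yearly storage cost in whole dollars for the given number of copies."""
    cents = (terabytes * GB_PER_TB * RATE_CENTS_PER_GB_MONTH
             * MONTHS_PER_YEAR * copies)
    return cents // 100


one_copy = annual_cost_dollars(50)               # 90000
three_copies = annual_cost_dollars(50, copies=3) # 270000
print(one_copy, three_copies)
```

Three full copies of 50TB come to $270k a year under these assumptions, which is the number to set against the loaded cost of the backup staff and infrastructure being replaced.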

Resources On Demand
This is the easy one but when additional processors are required they must be available, fully provisioned, within minutes if not seconds. From bare metal to fully loaded with application and all shouldn't take more than 15min. And although today we are bound by the limitations of processor type or operating system, now that technologists see them as barriers new solutions will evolve to minimize their impact.

On Demand Management Tools
Four important tools to learn and love are provisioning engines, orchestrators, schedulers, and service level managers. Provisioning engines grab extra resources and put them into the correct pool when demand is high, and move resources out of the pool when demand subsides. Orchestrators observe the changes in traffic and determine when resources should be added or removed by the provisioning engine. Schedulers determine where jobs should run within the cloud and communicate with the orchestrators to ensure the required footprint is available. Finally the service level manager watches over all the applications and intercedes whenever an application threatens to miss its service level targets. At times this may require ramping down usage of non-critical applications in preference of critical applications.
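
To make the division of labor concrete, here is a toy sketch of the first two tools only: an orchestrator that watches traffic and a provisioning engine that moves servers in and out of a pool. The class and function names, the server names, and the capacity of 100 requests per second per server are all assumptions for illustration; schedulers and service level managers are left out.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProvisioningEngine:
    """Moves servers between a shared spare list and an application pool."""
    spare: List[str]
    pool: List[str] = field(default_factory=list)

    def add(self) -> None:
        if self.spare:
            self.pool.append(self.spare.pop())

    def remove(self) -> None:
        if len(self.pool) > 1:           # always keep one server in the pool
            self.spare.append(self.pool.pop())


def orchestrate(engine: ProvisioningEngine, requests_per_sec: float,
                capacity_per_server: float = 100.0) -> None:
    """Ask the engine to resize the pool to cover the observed traffic."""
    needed = max(1, math.ceil(requests_per_sec / capacity_per_server))
    while len(engine.pool) < needed and engine.spare:
        engine.add()
    while len(engine.pool) > needed:
        engine.remove()


engine = ProvisioningEngine(spare=["s1", "s2", "s3", "s4"])
orchestrate(engine, requests_per_sec=250)   # demand is high: grow the pool
busy_pool_size = len(engine.pool)
orchestrate(engine, requests_per_sec=80)    # demand subsides: shrink it
quiet_pool_size = len(engine.pool)
```

The orchestrator decides when, the provisioning engine decides how; keeping the two apart is what lets either one be replaced without touching the other.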

Metering and Chargebacks/Billing
Cloud only makes sense with the economic model of paying by consumption when consumption and its associated cost are communicated. Usage must be metered and the costs must reflect the consumption. Although some costs will still make sense to allocate, the majority of costs must move to a chargeback model.
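
A bare-bones chargeback calculation looks like the sketch below. The departments, resource names, and unit prices are invented for the example; real metering would come from the provider's usage records.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

UsageRecord = Tuple[str, str, int]   # (department, resource, units consumed)

# Hypothetical unit prices in cents.
PRICE_CENTS = {"cpu_hour": 12, "gb_month": 15}


def chargeback(records: Iterable[UsageRecord]) -> Dict[str, int]:
    """Total each department's bill, in cents, from its metered usage."""
    totals: Dict[str, int] = defaultdict(int)
    for department, resource, units in records:
        totals[department] += units * PRICE_CENTS[resource]
    return dict(totals)


records = [
    ("marketing", "cpu_hour", 200),
    ("marketing", "gb_month", 1000),
    ("finance", "cpu_hour", 50),
]
bills = chargeback(records)
print(bills)
```

Each department sees a bill that moves with what it actually consumed, which is the communication the paragraph above calls for.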

Interoperability
Clouds are clouds and as such the barriers between them are security based. Clouds must be able to share data, services, and infrastructure otherwise instead of cloud one ends up with a larger silo. The future value of cloud will be the new capabilities which are unlocked in such areas as collaboration, supply chain integration, business intelligence, and federated control. Security checkpoints will be important but should be the only barrier to integrating clouds.

Business Rules
Since a cloud is a consumption driven model there need to be business rules which govern consumption. Looking at healthcare today, what drove the high cost was the elimination of barriers to access. We made healthcare easy through co-pays and low deductibles. The easier something is, the more people will use it which results in a tilting of the supply and demand scale. In response either supply increases or prices rise. So in the cloud we need business rules to govern consumption so it doesn't outstrip supply. What rights are required to execute an application? Who has the necessary rights? Who can grant rights? When will external resources be used? I believe that just as we did in the mobile telecommunications industry, at some point tiered pricing will enter cloud computing and rates will be higher during the day than overnight. When this change occurs it could have a dramatic impact on the costs of a cloud solution.
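
Two of those rules, who may run an application and what a job costs by time of day, can be sketched as follows. The peak window, the rates, the right name, and the function names are all assumptions for illustration; no provider's actual pricing is implied.

```python
from datetime import time
from typing import Set

PEAK_START, PEAK_END = time(8, 0), time(18, 0)   # assumed daytime window
PEAK_RATE_CENTS = 20                             # per CPU-hour, assumed
OFF_PEAK_RATE_CENTS = 8                          # per CPU-hour, assumed


def may_run(user_rights: Set[str], required_right: str) -> bool:
    """Rule: a user may execute an application only with the required right."""
    return required_right in user_rights


def rate_cents(start: time) -> int:
    """Rule: daytime CPU-hours cost more than overnight CPU-hours."""
    if PEAK_START <= start < PEAK_END:
        return PEAK_RATE_CENTS
    return OFF_PEAK_RATE_CENTS


def job_cost_cents(cpu_hours: int, start: time) -> int:
    return cpu_hours * rate_cents(start)


daytime_cost = job_cost_cents(100, time(14, 0))    # billed at the peak rate
overnight_cost = job_cost_cents(100, time(2, 0))   # billed off-peak
allowed = may_run({"run_billing"}, "run_billing")
```

With rates like these the same 100 CPU-hours cost two and a half times as much at 2pm as at 2am, so simply moving batch work overnight becomes a business rule worth enforcing.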

Monday, November 2, 2009

Three Cheers to Ubuntu

"Sit Ubu. Sit." I can't remember what show it was but it always ended with a picture of a dog and that line. That's what I thought Ubuntu was when I heard about it the first time. Only later through continued pestering did I realize it was a Debian fork from the mid 2000's. I'm a bit of a Linux bigot who preferred the Slackware release in the Volkerding days and moved to RedHat in the late 1990's. When RedHat went commercial I felt lost for a bit but picked up Fedora. Hearing how great Ubuntu was from friends I recently decided to give it a try on a Windows Vista machine with chronic problems and I was impressed.

Oh, don't misunderstand me, I'm not impressed with Ubuntu. I simply haven't used it enough but so far it seems remarkably like, oh, Fedora.

I was impressed with the stance Ubuntu has taken on cloud computing. Ubuntu brags on their site about the inclusion of Eucalyptus (Elastic Utility Computing Architecture Linking Your Programs To Useful Systems). Eucalyptus is an Amazon EC2 clone that works with EC2, S3 and EBS. Ubuntu takes an open source package that is relatively unknown, includes it in their server distributions, and makes the argument you should use Ubuntu as the foundation for your private cloud BECAUSE it will make you compatible with Amazon EC2 when you want to cloudburst. Now that's smart! People forget that Amazon is largely based on open technologies with Linux virtual machines running on the Xen hypervisor.

Pure GENIUS!

It makes me wonder where the relationship between Microsoft and Amazon sits. Microsoft and Google are competitors. In the growing cloud space Amazon and Google are competitors. But with Azure Microsoft is also an Amazon competitor, kind of, right? No in the sense that they use different platforms (Windows vs. Linux), but yes in that they compete for business. But Microsoft is so big perhaps Amazon needs to adopt "the enemy of my enemy is my friend" mentality and get Microsoft to build in cloudbursting capability to Amazon EC2. It would mean more licenses of Microsoft server products and address those Microsoft customers who don't want Azure, just more processor time. What happens when the concept of the operating system falls apart, as large tech heavyweights such as EMC and Cisco are starting to argue? Does that push Microsoft out of the picture?

Regardless of all the above, and perhaps because of it, choosing Linux is easy. Which distribution to choose? Really any because Eucalyptus can work with any distro, but Ubuntu has clearly seized the initiative.

My prediction is we'll hear RedHat announce they are working with OpenNebula...