By Denny Lane
September 14, 2009 06:45 AM EDT
Virtualization and fault-tolerant technology are like the would-be ideal couple in a romantic comedy: a match made in heaven who never meet, even though they're constantly in the same place at the same time. That makes for a funny conundrum on screen, but in the real IT world, virtualization and fault tolerance need to get together quickly and often. IT organizations that are virtualizing their server infrastructures need both technologies if they're going to build platforms that have virtualization's efficiency and also provide the continuous availability required to support enterprise applications.
Virtualization and fault tolerance are both long-established technologies with roots in the mainframe era. Both have gone through transformations that have made them especially relevant in today's IT markets, and each greatly increases the other's value. Full-function fault tolerance - note the qualifier - provides the continuous availability that makes virtualized environments reliable enough to support the most demanding enterprise applications.
The growing interest in continuous availability computing to complement virtualization has led to the inevitable realization by vendors that they need to jump on the bandwagon before their opportunity passes. As a consequence, definitions, features and functions get stretched out of shape to mask the shortcomings of marketers' claims. That can lead IT managers into buying decisions that aren't going to work for them because they're buying less availability than they realize. Vendors have taken to calling any reliability solution "fault tolerant" when only a few products actually meet the criteria. Before committing to a fault-tolerant continuous availability solution to support a virtualization project, IT managers need reliable definitions of the terms so they know what they're getting. And maybe not getting.
Fault Tolerance or High Availability?
Fault tolerance is the apex of reliability technologies, and the only standards-based means for achieving continuous availability, or near-perfect 99.999 percent uptime (a.k.a. five-nines availability) in continuous, round-the-clock processing. Most of the availability solutions on the market today that are called fault tolerant are not. Many of them provide high availability, which is "four nines" (99.99 percent) or below, but not continuous availability. The difference is important.
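To make the gap between "four nines" and "five nines" concrete, the percentages can be converted into the downtime each allows per year. The following is a minimal sketch of that arithmetic, assuming a non-leap year of round-the-clock operation; the function name is illustrative and not taken from any product or standard.

```python
# Illustrative arithmetic only: converts an availability percentage into
# the downtime it allows over one non-leap year of 24x7 operation.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the minutes of downtime per year permitted at this availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    for label, percent in [("four nines", 99.99), ("five nines", 99.999)]:
        minutes = allowed_downtime_minutes(percent)
        print(f"{label} ({percent}%): about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, four nines allows a little under an hour of downtime a year, while five nines allows only a little over five minutes.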
Unlike fault tolerant systems, high availability systems recover from a problem by failing over, or switching to a standby system and restarting applications on another server. Server clusters, for example, are high availability solutions, but they can never be fault tolerant because they allow an interruption in processing during the failover period, which can be anywhere from a few minutes to an hour long. Critical applications, such as emergency 911 or financial trading, can't tolerate that much downtime, so a high-availability solution doesn't work for them. Therefore high availability ranks a category below fault tolerance in the availability stack. "Failover" and "restart" are never part of the fault tolerance lexicon, except to say they do not apply.
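The effect of failover time on uptime can be estimated with simple arithmetic. The sketch below assumes that failovers are the only source of downtime in a non-leap year; the function name and the sample failover counts and durations are hypothetical and chosen only to match the "few minutes to an hour" range described above.

```python
# Illustrative arithmetic only: estimates the availability a system achieves
# in one non-leap year if its only downtime comes from failover events.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def availability_after_failovers(failovers_per_year: int,
                                 minutes_per_failover: float) -> float:
    """Return achieved availability as a percentage."""
    downtime = failovers_per_year * minutes_per_failover
    return 100.0 * (1.0 - downtime / MINUTES_PER_YEAR)


if __name__ == "__main__":
    # Hypothetical scenarios: (number of failovers, minutes each one takes)
    for count, minutes in [(1, 5), (2, 5), (1, 60)]:
        achieved = availability_after_failovers(count, minutes)
        verdict = "meets" if achieved >= 99.999 else "misses"
        print(f"{count} failover(s) of {minutes} min: {achieved:.4f}% ({verdict} five nines)")
```

In this simplified model, a single five-minute failover uses up nearly the entire five-nines allowance for the year, and a second one, or a single hour-long failover, exceeds it.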
What, then, is fault tolerance? The answer depends on the type of fault tolerance - hardware or software. IT managers have to understand their similarities and differences to choose the best approach for a particular need.
At its most basic, "hardware" fault tolerance is designed to prevent unplanned downtime and data loss. All components are duplicated - not just power supplies or fans - and run in complete synchronization so they appear as one logical server to the operating system and the application. Logic and diagnostic software cross-check every operation. If something is amiss within the server, the diagnostics will identify the problem and, if necessary, remove the broken part from service while the rest of the server and the application continue to run completely unaffected. Although fault-tolerant hardware is often knocked for being pricey, entry-level Intel-based servers can be purchased for less than $15,000 (USD).
After a generation of existing only as hardware, fault tolerance for x86 systems is now developing as a software technology. This can muddy the waters. These new software solutions are fault tolerant up to a point. They support continuous availability, but only under certain workloads. They are also unable to harness the full power of virtualization and multiprocessor technologies.
The state of the art in software fault tolerance today is to link two industry-standard x86 servers with cable and software (or to mirror virtual machines in software across two, preferably three, identical x86 servers) so that they run in virtual lock-step, much as fault-tolerant hardware does, and deliver five-nines uptime. Unlike the hardware approach, however, applications and operating systems must be licensed on each physical server.
Software Fault Tolerance: The Good and Bad
There are realities to software fault tolerance that limit its potential in corporate IT. Perhaps the most important of these is that software-based fault tolerance lacks support for symmetric multiprocessing (SMP), which means applications cannot scale beyond a single core per server. In a two-socket server powered by quad-core processors, an application running in fault-tolerant mode is restricted to the compute power of just one of the eight server cores. Further, processor manufacturers are engineering virtualization capabilities into powerful new products that will be grossly underutilized in this scenario. Despite assertions that all applications will run in a software fault-tolerant environment, physical or virtual, many truly business-critical and mission-critical applications are simply too demanding to function properly, if at all.
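The single-core restriction can be expressed as a fraction of the server's total cores. This is a minimal sketch of that calculation, assuming the two-socket, quad-core configuration described above; the function name is illustrative.

```python
# Illustrative arithmetic only: the share of a server's cores available to an
# application that is limited to a fixed number of cores.


def usable_core_fraction(sockets: int, cores_per_socket: int,
                         cores_used: int = 1) -> float:
    """Return the fraction of total cores the application can use."""
    total_cores = sockets * cores_per_socket
    return cores_used / total_cores


if __name__ == "__main__":
    fraction = usable_core_fraction(sockets=2, cores_per_socket=4)
    print(f"Usable share of cores: {fraction:.1%}")
```

For the two-socket, quad-core example, the application can use one core in eight, or 12.5 percent of the cores in the server.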
This is not full-function fault tolerance; it's fault tolerance light, appropriate for workgroups or departments, but not for enterprise applications. It's unlikely these technological shortcomings can be overcome any time soon.
By the narrowest of definitions, the new generation of software solutions on the market is fault-tolerant. However, the end product of true fault tolerance is continuous availability at the highest levels of corporate IT. Mission-critical application availability requires more than saying you have fault tolerance. Continuous availability demands a combination of fault-tolerant hardware and software. That combination makes fault-tolerant technology an ideal match for virtualization, providing the continuous availability that makes virtualized environments a versatile, flexible, and economical platform for enterprise applications.
Copyright © 2009 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Denny Lane
Denny Lane is director of product marketing and management at Maynard, Mass.-based Stratus Technologies.
leonllyu, 09/10/09 01:23:00 PM EDT:
The combination of virtualization and fault tolerance has already been provided in cloud computing (IaaS, specifically), I guess.