Single point of failure

Article Id: WHEBN0026207504

Title: Single point of failure  
Author: World Heritage Encyclopedia
Language: English
Subject: Load balancing (computing), Cascading failure, Failing badly, Double-spending, Manifold (scuba)
Collection: Failure, Fault-Tolerant Computer Systems, Network Architecture, Reliability Engineering, Systems Engineering
Publisher: World Heritage Encyclopedia

Figure: a network diagram in which the router is a single point of failure for communication between the computers.

A single point of failure (SPOF) is a part of a system that, if it fails, will stop the entire system from working.[1] SPOFs are undesirable in any system with a goal of high availability or reliability, be it a business practice, a software application, or another industrial system.

Contents

  • Overview
  • Computing
  • Other fields
  • References

Overview

Systems can be made robust by adding redundancy at every potential SPOF. Consider the owner of a small tree care company who owns only one wood chipper. If the chipper breaks, the owner may be unable to complete the current job and may have to cancel future jobs until a replacement can be obtained.

Redundancy can be achieved at various levels. For instance, the owner of the tree care company may keep spare parts on hand to repair the wood chipper if it fails. At a higher level, the owner may have a second wood chipper that can be brought to the job site. Finally, at the highest level, the owner may have enough equipment available to completely replace everything at the work site in the case of multiple failures.

Assessing potential SPOFs involves identifying the critical components of a complex system whose malfunction would cause a total system failure. Highly reliable systems should not rely on any such individual component.
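The assessment described above can be sketched in code for the simple case of a network. The following is a minimal illustration, not a general assessment method: it assumes that "the system works" means every remaining component can still reach every other one, and it tests each component by removing it and checking connectivity. The names `is_connected`, `single_points_of_failure`, and the `star` topology are hypothetical and chosen only for this example.

```python
from collections import deque


def is_connected(links, nodes):
    """Return True if every node in `nodes` can reach every other one."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in links.get(current, ()):
            # Only follow links to nodes that are still part of the system.
            if neighbor in nodes and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == nodes


def single_points_of_failure(links):
    """List components whose removal disconnects the remaining ones."""
    all_nodes = set(links)
    return sorted(
        node for node in all_nodes
        if not is_connected(links, all_nodes - {node})
    )


# Hypothetical star network: three computers that talk only via one router.
star = {
    "router": ["pc1", "pc2", "pc3"],
    "pc1": ["router"],
    "pc2": ["router"],
    "pc3": ["router"],
}
spofs = single_points_of_failure(star)
```

Under these assumptions the router is the only component reported, since removing any one computer leaves the others able to communicate. The brute-force check is adequate for small systems; larger graphs would likely call for a dedicated articulation-point algorithm.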

Computing

In computing, redundancy can be achieved at the internal component level, at the system level (multiple machines), or at the site level (replication).

At the system level, a load balancer is normally deployed to ensure high availability for a server cluster.

In a high-availability server cluster, each individual server may attain internal component redundancy by having multiple power supplies, hard drives, and other components. System level redundancy could be obtained by having spare servers waiting to take on the work of another server if it fails.
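System-level redundancy of this kind can be illustrated with a short failover sketch. It is an assumption-laden toy, not a description of any particular cluster product: the `Server` class, its `healthy` flag, and `handle_with_failover` are hypothetical names, and a failure is simulated by raising `ConnectionError`.

```python
class Server:
    """Hypothetical stand-in for one machine in the cluster."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, request):
        # Simulate an outage: an unhealthy server cannot answer.
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


def handle_with_failover(servers, request):
    """Try each server in order and return the first successful response."""
    for server in servers:
        try:
            return server.handle(request)
        except ConnectionError:
            continue  # this server failed, so fall through to the next spare
    raise RuntimeError("all servers failed")


# The primary is down, so the spare takes over its work.
cluster = [Server("primary", healthy=False), Server("spare")]
result = handle_with_failover(cluster, "GET /")
```

The request fails only when every server in the list is unavailable, which is the point of keeping spares.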

Since a data center is often a support center for other operations such as business logic, it represents a potential SPOF in itself. Thus, at the site level, the entire cluster may be replicated at another location, where it can be accessed in case the primary location becomes unavailable.

Paul Baran and Donald Davies independently developed packet switching, a key part of "survivable communications networks". Such networks, including ARPANET and the Internet, are designed to have no single point of failure. Multiple paths between any two points on the network allow those points to keep communicating even after the failure of any one path or any one intermediate node, because packets "route around" the damage.
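The idea of routing around damage can be shown on a small graph. The sketch below is only an illustration and does not implement any real routing protocol: it uses a plain breadth-first search over a hypothetical four-node `mesh`, and the function name `find_route` and its `failed` parameter are assumptions made for this example.

```python
from collections import deque


def find_route(links, source, target, failed=frozenset()):
    """Breadth-first search for a path that avoids every failed node."""
    if source in failed or target in failed:
        return None
    previous = {source: None}  # maps each visited node to its predecessor
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            # Walk the predecessor chain back to the source.
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbor in links.get(current, ()):
            if neighbor not in failed and neighbor not in previous:
                previous[neighbor] = current
                queue.append(neighbor)
    return None


# Hypothetical mesh with two independent paths from "a" to "d".
mesh = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c"],
}
normal_route = find_route(mesh, "a", "d")
detour = find_route(mesh, "a", "d", failed={"b"})
```

With node "b" marked as failed, the search still finds a route through "c". Only when both intermediate nodes fail does `find_route` return `None`.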

Other fields

The concept of a single point of failure has also been applied to fields outside of engineering, computers, and networking, such as corporate supply chain management.[2]

Design structures that create single points of failure include bottlenecks and series circuits (in contrast to parallel circuits).
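The contrast between series and parallel arrangements can be made concrete with a small calculation. This sketch assumes that components fail independently of one another, which real systems often violate, and the 99% availability figure is invented for illustration. Under that assumption, a series arrangement works only if every component works, while a parallel arrangement works if at least one does.

```python
from math import prod


def series_availability(availabilities):
    """All components must work: multiply the individual availabilities."""
    return prod(availabilities)


def parallel_availability(availabilities):
    """At least one component must work: 1 minus the chance that all fail."""
    return 1 - prod(1 - a for a in availabilities)


# Assumed figures: two components, each working 99% of the time.
components = [0.99, 0.99]
in_series = series_availability(components)      # about 0.9801
in_parallel = parallel_availability(components)  # about 0.9999
```

In series, every added component is another SPOF and lowers the overall figure; in parallel, every added component raises it.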

References

  1. K. Dooley, Designing Large-Scale LANs, O'Reilly, 2002, p. 31.
  2. Single Point of Failure: The 10 Essential Laws of Supply Chain Risk Management, Wiley, October 7, 2009.

This article was sourced from Creative Commons Attribution-ShareAlike License; additional terms may apply.