Distributed data store

Article Id: WHEBN0000870094

Title: Distributed data store  
Author: World Heritage Encyclopedia
Language: English
Subject: I2P, Dynamo (storage system), Super column family, Voldemort (distributed data store), Clustered file system
Collection: Data Management, Distributed Data Storage, Distributed Data Stores
Publisher: World Heritage Encyclopedia

A distributed data store is a computer network in which information is stored on more than one node, often in a replicated fashion.[1] The term usually refers either to a distributed database, where users store information on a number of nodes, or to a computer network in which users store information on a number of peer network nodes.

Contents

  • Distributed databases
  • Peer network node data stores
  • Examples
    • Distributed non-relational databases
    • Peer network node data stores
  • See also
  • References

Distributed databases

Distributed databases are usually non-relational databases that enable quick access to data across a large number of nodes. Some distributed databases expose rich query abilities, while others are limited to key-value store semantics. Examples of such limited distributed databases include Google's BigTable, which is much more than a distributed file system or a peer-to-peer network,[2] Amazon's Dynamo[3] and Windows Azure Storage.[4]
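
As an illustration of what "key-value store semantics" means in practice, the following minimal Python sketch spreads keys over several in-memory nodes by hashing the key. It is not taken from BigTable, Dynamo, or Windows Azure Storage; the `Node` and `KeyValueStore` classes, the key format, and the node count are all hypothetical names chosen for the example. The point is only that the interface offers `put` and `get` by exact key and nothing resembling an arbitrary query.

```python
import hashlib


class Node:
    """One storage node, modelled as an in-memory dictionary (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class KeyValueStore:
    """Key-value semantics only: put and get by exact key, no queries."""

    def __init__(self, nodes):
        self.nodes = nodes

    def _node_for(self, key):
        # Hash the key so every client picks the same node for the same key.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        self._node_for(key).data[key] = value

    def get(self, key, default=None):
        return self._node_for(key).data.get(key, default)


nodes = [Node(f"node-{i}") for i in range(4)]
store = KeyValueStore(nodes)
for i in range(100):
    store.put(f"user:{i}", {"id": i})

print(store.get("user:7"))
print({n.name: len(n.data) for n in nodes})
```

A production system would likely replace the simple modulo placement with a scheme that tolerates nodes joining and leaving, but the client-facing contract stays the same: look up one key, get one value.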

Because the ability to run arbitrary queries is less important than availability, designers of distributed data stores have increased availability at the expense of consistency. High-speed read/write access therefore comes with reduced consistency: as the CAP theorem proves, a networked data store cannot simultaneously guarantee all three of consistency, availability, and partition tolerance.
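
The trade-off can be made concrete with a small, simplified model. The Python sketch below assumes three replicas and quorum-style reads and writes, where `w` is the number of acknowledgements a write needs and `r` is the number of replicas a read must consult. These names, the `Replica` class, and the partition scenario are assumptions made for illustration, not the protocol of any system cited in this article. With `w = 1, r = 1` the store stays available during a partition but can return stale data; with `w = 2, r = 2` it refuses the read rather than return stale data.

```python
class Replica:
    """A single copy of one value, tagged with a version number."""

    def __init__(self):
        self.version = 0
        self.value = None
        self.reachable = True


def write(replicas, value, version, w):
    """Write to every reachable replica; succeed if at least w acknowledge."""
    acks = 0
    for rep in replicas:
        if rep.reachable:
            rep.value, rep.version = value, version
            acks += 1
    return acks >= w


def read(replicas, r):
    """Consult r reachable replicas and return the newest value among them."""
    answers = [(rep.version, rep.value) for rep in replicas if rep.reachable][:r]
    if len(answers) < r:
        raise RuntimeError("not enough replicas reachable")
    return max(answers, key=lambda a: a[0])[1]


def demo(w, r):
    replicas = [Replica() for _ in range(3)]
    write(replicas, "v1", 1, w)
    replicas[0].reachable = False   # a partition isolates replica 0
    write(replicas, "v2", 2, w)     # only replicas 1 and 2 receive v2
    replicas[0].reachable = True    # replica 0 returns, still holding v1
    replicas[1].reachable = False
    replicas[2].reachable = False   # now only the stale replica is reachable
    try:
        return read(replicas, r)
    except RuntimeError:
        return "unavailable"


print(demo(w=1, r=1))  # available, but the answer is stale
print(demo(w=2, r=2))  # consistent, but the read is refused
```

In this model neither setting gives both an answer and the latest value once the partition separates the client from the up-to-date replicas, which is the situation the CAP theorem describes.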

Peer network node data stores

In peer network data stores, the user can usually reciprocate and allow other users to use their computer as a storage node as well. Information may or may not be accessible to other users depending on the design of the network.

Most peer-to-peer networks are not distributed data stores, because a user's data is available only while that user's node is on the network. The distinction blurs in a system such as BitTorrent, where the originating node can go offline while the content continues to be served. Even so, this holds only for the individual files that redistributors have requested, in contrast to a network such as Freenet, where all computers are made available to serve all files.

Distributed data stores typically use some form of error detection and correction. Some (such as Parchive over NNTP) use forward error correction to recover the original file when parts of that file are damaged or unavailable. Others retry the download from a different mirror.
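
The idea of forward error correction can be sketched with the simplest possible scheme: per-block checksums for detection plus a single XOR parity block for repair. This is a deliberately simple stand-in rather than the method used by Parchive or any other tool; real systems generally rely on stronger codes (such as Reed-Solomon) that can rebuild several lost blocks. The function names and the sample data are hypothetical, and the sketch assumes all blocks have equal length.

```python
import hashlib
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(blocks):
    """Return one checksum per block and a single parity block."""
    checksums = [hashlib.sha256(b).hexdigest() for b in blocks]
    return checksums, xor_blocks(blocks)


def recover(received, checksums, parity):
    """Detect missing or damaged blocks by checksum; rebuild at most one."""
    bad = [i for i, b in enumerate(received)
           if b is None or hashlib.sha256(b).hexdigest() != checksums[i]]
    if len(bad) > 1:
        raise ValueError("a single parity block can repair only one block")
    repaired = list(received)
    if bad:
        good = [b for i, b in enumerate(received) if i != bad[0]]
        # XOR of the surviving blocks and the parity yields the lost block.
        repaired[bad[0]] = xor_blocks(good + [parity])
    return repaired


original = [b"dist", b"ribu", b"ted!"]
checksums, parity = encode(original)

missing = [b"dist", None, b"ted!"]       # one block never arrived
corrupted = [b"dist", b"ribu", b"tXd!"]  # one block arrived damaged

print(recover(missing, checksums, parity))
print(recover(corrupted, checksums, parity))
```

The receiver needs no second download to repair a single bad block, which is what distinguishes forward error correction from the retry-from-another-mirror approach.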

Examples

Distributed non-relational databases

Peer network node data stores

See also

References

  1. ^ Yaniv Pessach. Distributed Storage: Concepts, Algorithms, and Implementations. 
  2. ^ "BigTable: Google's Distributed Data Store". http://the-paper-trail.org/: Paper Trail. Retrieved 2011-04-05. Although GFS provides Google with reliable, scalable distributed file storage, it does not provide any facility for structuring the data contained in the files beyond a hierarchical directory structure and meaningful file names. It’s well known that more expressive solutions are required for large data sets. Google’s terabytes upon terabytes of data that they retrieve from web crawlers, amongst many other sources, need organising, so that client applications can quickly perform lookups and updates at a finer granularity than the file level. [...] The very first thing you need to know about BigTable is that it isn’t a relational database. This should come as no surprise: one persistent theme through all of these large scale distributed data store papers is that RDBMSs are hard to do with good performance. There is no hard, fixed schema in a BigTable, no referential integrity between tables (so no foreign keys) and therefore little support for optimised joins. 
  3. ^ Sarah Pidcock (2011-01-31). "Dynamo: Amazon’s Highly Available Key-value Store" (PDF). http://www.cs.uwaterloo.ca/: University of Waterloo, Cheriton School of Computer Science. p. 2/22. Retrieved 2011-04-05. Dynamo: a highly available and scalable distributed data store 
  4. ^ "Windows Azure Storage". 2011-09-16. Retrieved 6 November 2011. 
This article was sourced from Creative Commons Attribution-ShareAlike License; additional terms may apply.