
Remote Differential Compression

Article Id: WHEBN0011034655

Title: Remote Differential Compression  
Author: World Heritage Encyclopedia
Language: English
Subject: .NET Framework, File Replication Service, Data synchronization, Rsync, Transactional NTFS
Collection: Data Synchronization, Windows Administration, Windows Components
Publisher: World Heritage Encyclopedia

Remote Differential Compression (RDC) is a client–server synchronization algorithm that allows the contents of two files to be synchronized by communicating only the differences between them. It was introduced with Microsoft Windows Server 2003 R2 and is included with later Windows client and server operating systems.

Unlike Binary Delta Compression (BDC), which is designed to operate only on known versions of a single file, RDC makes no assumptions about file similarity or versioning. The differences between files are computed on the fly, so RDC is suited to efficiently synchronizing files that have been updated independently, where network bandwidth is limited, or where the files are large but the differences between them are small.

The algorithm is based on fingerprinting blocks of each file locally, on both replication partners. Many kinds of change cause file contents to move: a small insertion or deletion at the beginning of a file, for example, misaligns everything after it relative to the original. For this reason the blocks used for comparison are delimited not by fixed, arbitrary offsets but by cut points derived from the contents of each file segment. If part of a file changes in length, or blocks of content move elsewhere in the file, the block boundaries of the unchanged parts stay fixed relative to the contents, so the fingerprints of those blocks do not change; only their positions do.

By comparing all the hashes of a file with the hashes of the same file at the other end of the replication pair, RDC can identify which blocks have changed and which have not, even if the contents of the file have been significantly reshuffled. Because comparing large files could require a large number of signature comparisons, the algorithm is applied recursively to the hash sets to detect which blocks of hashes have changed or moved, significantly reducing the amount of data that must be transmitted to compare the files.
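The idea of content-defined cut points can be illustrated with a minimal Python sketch. It is not RDC's actual implementation: the rolling polynomial hash, the "hash divisible by DIVISOR" cut rule, the use of SHA-256 as the block fingerprint, and every name and constant below are assumptions chosen for illustration only.

```python
import hashlib

WINDOW = 16               # bytes covered by the rolling hash (illustrative value)
DIVISOR = 64              # a cut roughly every 64 bytes on average (illustrative)
BASE = 257                # base of the polynomial rolling hash
MOD = (1 << 61) - 1       # large prime modulus for the rolling hash


def chunk_boundaries(data: bytes) -> list[int]:
    """Return the end offsets of content-defined chunks of `data`."""
    cuts = []
    h = 0
    top = pow(BASE, WINDOW - 1, MOD)   # weight of the byte leaving the window
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Drop the oldest byte so the hash covers only the last WINDOW bytes.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        # The cut decision depends only on the window contents, not on the offset.
        if i + 1 >= WINDOW and h % DIVISOR == 0:
            cuts.append(i + 1)
    if data and (not cuts or cuts[-1] != len(data)):
        cuts.append(len(data))         # the final chunk ends at end of file
    return cuts


def signatures(data: bytes) -> list[tuple[bytes, int, int]]:
    """Return (fingerprint, start, end) for each chunk of `data`."""
    sigs = []
    start = 0
    for end in chunk_boundaries(data):
        sigs.append((hashlib.sha256(data[start:end]).digest(), start, end))
        start = end
    return sigs


def missing_ranges(local: bytes, remote_sigs) -> list[tuple[int, int]]:
    """Byte ranges of the remote file whose fingerprints are absent locally."""
    have = {digest for digest, _, _ in signatures(local)}
    return [(start, end) for digest, start, end in remote_sigs if digest not in have]
```

In this sketch, one side would send `signatures(remote_file)` and the other would call `missing_ranges(local_file, remote_sigs)` to learn which byte ranges it still needs. Because each cut depends only on the bytes inside the window, inserting data near the start of a file changes the chunks around the insertion while later chunks keep their fingerprints at shifted offsets.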

Later versions of Windows support cross-file RDC, which finds files similar to the one being replicated and reuses blocks from those files that match blocks of the file being replicated, minimizing the data transferred over the WAN. Cross-file RDC can use blocks of up to five similar files.[1]
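How a receiver might choose which similar files to draw blocks from can be sketched as follows. The article does not describe how RDC measures similarity, so ranking candidates by the number of shared block fingerprints is purely an assumption made for illustration, as are the function and parameter names. Only the limit of five files comes from the text above.

```python
def pick_similar(target: set[bytes],
                 candidates: dict[str, set[bytes]],
                 limit: int = 5) -> list[str]:
    """Return up to `limit` candidate names, ordered by shared fingerprints.

    `target` holds the block fingerprints of the file being replicated;
    `candidates` maps a file name to that file's block fingerprints.
    Overlap counting is a hypothetical stand-in for RDC's similarity test.
    """
    scored = [(len(target & sigs), name) for name, sigs in candidates.items()]
    # Files that share no blocks cannot contribute anything, so drop them.
    scored = [(shared, name) for shared, name in scored if shared > 0]
    # Most shared blocks first; ties are broken by name for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:limit]]
```

Blocks found in the selected files would then be copied locally, leaving only the unmatched blocks to be sent over the network.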

Where files are similar, RDC can significantly reduce the amount of data transferred. In one test, two similar but not identical 2.4 MB bitmap files of the same photograph were used, one of which had a watermark superimposed. Because the files were uncompressed, most of their content matched. When transferred with RDC, only 217 kB was needed, a 92% reduction. For smaller files, the RDC processing overhead may outweigh the bandwidth savings.[2]

RDC is similar in many ways to the older (1996) rsync protocol, but with some useful innovations, in particular the recursive algorithm and cross-file RDC.[2]

RDC is implemented in Windows operating systems essentially as an API, but very little software invokes it, particularly on non-server systems. A myth has arisen that RDC significantly slows local file transfers and should be switched off. A Microsoft TechNet page by a member of the Microsoft Directory Services Team comprehensively debunks this with detailed timings; in addition, a service that no software invokes cannot have any effect, detrimental or otherwise.[3]

References

  1. Microsoft TechNet: DFS Replication: Frequently Asked Questions, section "What is cross-file RDC?", published 16 October 2006, updated 30 January 2013
  2. Remote Differential Compression (aka rsync algorithm for Windows), David Jade, Programming, 15 February 2013
  3. Microsoft TechNet: Ask the Directory Services Team - Debunking the Vista Remote Differential Compression Myth, Ned Pyle [MSFT], 26 June 2009

External links

  • Introduction to DFS replication
  • About Remote Differential Compression
  • Optimizing File Replication over Limited-Bandwidth Networks using Remote Differential Compression