World Library  
Flag as Inappropriate
Email this Article

Collision attack

Article Id: WHEBN0000969624
Reproduction Date:

Title: Collision attack  
Author: World Heritage Encyclopedia
Language: English
Subject: Rebound attack, MD4, NaSHA, GOST (hash function), Grøstl
Collection: Cryptographic Attacks, Cryptographic Hash Functions
Publisher: World Heritage Encyclopedia

Collision attack

In cryptography, a collision attack on a cryptographic hash tries to find two inputs producing the same hash value, i.e. a hash collision. In contrast to a preimage attack the hash value is not specified.

There are roughly two types of collision attacks:

Collision attack
Find two different messages m1 and m2 such that hash(m1) = hash(m2).
Chosen-prefix collision attack
Given two different prefixes p1, p2 find two appendages m1 and m2 such that hash(p1 ∥ m1) = hash(p2 ∥ m2) (where is the concatenation operation).


  • Classical collision attack 1
  • Chosen-prefix collision attack 2
  • Attack scenarios 3
    • Digital signatures 3.1
  • See also 4
  • References 5
  • External links 6

Classical collision attack

Mathematically stated, a collision attack finds two different messages m1 and m2, such that hash(m1) = hash(m2). In a classical collision attack, the attacker has no control over the content of either message, but they are arbitrarily chosen by the algorithm.

Much like symmetric-key ciphers are vulnerable to brute force attacks, every cryptographic hash function is inherently vulnerable to collisions using a birthday attack. Due to the birthday problem, these attacks are much faster than a brute force would be. A hash of n bits can be broken in 2n/2 time (evaluations of the hash function).

More efficient attacks are possible by employing cryptanalysis to specific hash functions. When a collision attack is discovered and is found to be faster than a birthday attack, a hash function is often denounced as "broken". The NIST hash function competition was largely induced by published collision attacks against two very commonly used hash functions, MD5[1] and SHA-1. The collision attacks against MD5 have improved so much that it takes just a few seconds on a regular computer.[2] Hash collisions created this way are usually constant length and largely unstructured, so cannot directly be applied to attack widespread document formats or protocols.

However, workarounds are possible by abusing dynamic constructs present in many formats. In this way, two documents would be created which are as similar as possible in order to have the same hash value. One document would be shown to an authority to be signed, and then the signature could be copied to the other file. Such a malicious document would contain two different messages in the same document, but conditionally display one or the other through subtle changes to the file:

  • Some document formats like PostScript, or macros in Microsoft Word, have conditional constructs.[3][4] (if-then-else) that allow testing whether a location in the file has one value or another in order to control what is displayed.
  • TIFF files can contain cropped images, with a different part of an image being displayed without affecting the hash value.[4]
  • PDF files are vulnerable to collision attacks by using color value (such that text of one message is displayed with a white color that blends into the background, and text of the other message is displayed with a dark color) which can then be altered to change the signed document's content.[4]

Chosen-prefix collision attack

An extension of the collision attack is the chosen-prefix collision attack, which is specific to Merkle–Damgård hash functions. In this case, the attacker can choose two arbitrarily different documents, and then append different calculated values that result in the whole documents having an equal hash value. This attack is much more powerful than a classical collision attack.

Mathematically stated, given two different prefixes p1, p2, the attack finds two appendages m1 and m2 such that hash(p1 ∥ m1) = hash(p2 ∥ m2) (where is the concatenation operation).

In 2007, a chosen-prefix collision attack was found against MD5, requiring roughly 250 evaluations of the MD5 function. The paper also demonstrates two X.509 certificates for different domain names, with colliding hash values. This means that a certificate authority could be asked to sign a certificate for one domain, and then that certificate could be used to impersonate another domain.[5]

A real-world collision attack was published in December 2008 when a group of security researchers published a forged

  • "Meaningful Collisions", attack scenarios for exploiting cryptographic hash collisions
  • Fast MD5 and MD4 Collision Generators - Bishop Fox (formerly Stach & Liu). Create MD4 and MD5 hash collisions using groundbreaking new code that improves upon the techniques originally developed by Xiaoyun Wang. Using a 1.6 GHz Pentium 4, MD5 collisions can be generated in an average of 45 minutes, and MD4 collisions can be generated in an average of 5 seconds. Originally released on 22Jun2006.

External links

  1. ^ a b Xiaoyun Wang, Dengguo Feng, Xuejia Lai, Hongbo Yu: Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD, Cryptology ePrint Archive Report 2004/199, 16 Aug 2004, revised 17 Aug 2004. Retrieved July 27, 2008.
  2. ^ M.M.J. Stevens (June 2007). "On Collisions for MD5" (PDF). [...] we are able to find collisions for MD5 in about 224.1 compressions for recommended IHVs which takes approx. 6 seconds on a 2.6GHz Pentium 4. 
  3. ^ Magnus Daum, Stefan Lucks. "Hash Collisions (The Poisoned Message Attack)".  
  4. ^ a b c Max Gebhardt, Georg Illies, Werner Schindler. "A Note on the Practical Value of Single Hash Collisions for Special File Formats" (PDF). 
  5. ^ Marc Stevens, Arjen Lenstra, Benne de Weger (2007-11-30). "Chosen-prefix Collisions for MD5 and Colliding X.509 Certificates for Different Identities". 
  6. ^ Alexander Sotirov; et al. (2008-12-30). "Creating a rogue CA certificate". 
  7. ^ "Microsoft releases Security Advisory 2718704".  
  8. ^ Marc Stevens (7 June 2012). "CWI Cryptanalist Discovers New Cryptographic Attack Variant in Flame Spy Malware". Centrum Wiskunde & Informatica. Retrieved 9 June 2012. 
  9. ^ "Hash Collision Q&A". Cryptography Research Inc. 2005-02-15. Archived from the original on 2008-07-17. Because of the way hash functions are used in the HMAC construction, the techniques used in these recent attacks do not apply 
  10. ^ Shai Halevi and Hugo Krawczyk, Randomized Hashing and Digital Signatures


See also

  1. Mallory creates two different documents A and B, that have an identical hash value (collision).
  2. Mallory then sends document A to Alice, who agrees to what the document says, signs its hash and sends it back to Mallory.
  3. Mallory copies the signature sent by Alice from document A to document B.
  4. Then she sends document B to Bob, claiming that Alice signed the other document (document B). Because the digital signature matches the document hash, Bob's software is unable to detect the modification.

The usual attack scenario goes like this:

Note that all public key certificates, like SSL certificates, also rely on the security of digital signatures and are compromised by hash collisions.

Because digital signature algorithms cannot sign a large amount of data efficiently, most implementations use a hash function to reduce ("compress") the amount of data that needs to be signed down to a constant size. Digital signature schemes are often vulnerable to hash collisions, unless using techniques like randomized hashing.[10]

Digital signatures

Many applications of crytographic hash functions do not rely on collision resistance, thus collision attacks do not affect their security. For example, HMACs are not vulnerable.[9] For the attack to be useful, the attacker must be in control of the input to the hash function.

Attack scenarios

The Flame malware successfully used a new variation of a chosen-prefix collision attack to spoof code signing of its components by a Microsoft root certificate that still used the compromised MD5 algorithm.[7][8]

and at least one Microsoft code-signing certificate was still using MD5 in May 2012. [6] certificate authorities were still willing to sign MD5-verified certificates in December 2008,[1]

This article was sourced from Creative Commons Attribution-ShareAlike License; additional terms may apply. World Heritage Encyclopedia content is assembled from numerous content providers, Open Access Publishing, and in compliance with The Fair Access to Science and Technology Research Act (FASTR), Wikimedia Foundation, Inc., Public Library of Science, The Encyclopedia of Life, Open Book Publishers (OBP), PubMed, U.S. National Library of Medicine, National Center for Biotechnology Information, U.S. National Library of Medicine, National Institutes of Health (NIH), U.S. Department of Health & Human Services, and, which sources content from all federal, state, local, tribal, and territorial government publication portals (.gov, .mil, .edu). Funding for and content contributors is made possible from the U.S. Congress, E-Government Act of 2002.
Crowd sourced content that is contributed to World Heritage Encyclopedia is peer reviewed and edited by our editorial staff to ensure quality scholarly research articles.
By using this site, you agree to the Terms of Use and Privacy Policy. World Heritage Encyclopedia™ is a registered trademark of the World Public Library Association, a non-profit organization.

Copyright © World Library Foundation. All rights reserved. eBooks from Project Gutenberg are sponsored by the World Library Foundation,
a 501c(4) Member's Support Non-Profit Organization, and is NOT affiliated with any governmental agency or department.