Null character

Article Id: WHEBN0000338161

Title: Null character  
Author: World Heritage Encyclopedia
Language: English
Subject: Tar (computing), ASCII, Control character, /dev/zero, C string handling
Collection: Computer Security Exploits, Control Characters, Nothing
Publisher: World Heritage Encyclopedia

The null character (also null terminator), abbreviated NUL, is a control character with the value zero.[1][2] It is present in many character sets, including ISO/IEC 646 (or ASCII), the C0 control code, the Universal Character Set (or Unicode), and EBCDIC. It is available in nearly all mainstream programming languages.[3]

The original meaning of this character was like NOP: when sent to a printer or a terminal, it does nothing (some terminals, however, incorrectly display it as a space). When electromechanical teleprinters were used as computer output devices, one or more null characters were sent at the end of each printed line to allow time for the mechanism to return to the first printing position on the next line. On punched tape, the character is represented with no holes at all, so a new unpunched tape is initially filled with null characters, and text could often be "inserted" into a reserved space of null characters by punching the new characters into the tape over the nulls.

Today the character has much more significance in C and its derivatives and in many data formats, where it serves as a reserved character used to signify the end of a string,[4] often called a null-terminated string.[5] This allows the string to be any length with only one byte of overhead; the alternative of storing a count requires either a string length limit of 255 (with a one-byte count) or an overhead of more than one byte. Other advantages and disadvantages are described under null-terminated string.
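The trade-off between a terminator and a stored count can be sketched in C. The following is a minimal illustration, not taken from any particular library: the struct counted_string and both function names are hypothetical, and the one-byte count is chosen only to show where the 255-byte limit comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-prefixed string: a one-byte count caps the length at 255. */
struct counted_string {
    unsigned char len;   /* number of payload bytes, 0..255 */
    char data[255];      /* payload, not null-terminated */
};

/* Bytes needed to store `text` as a null-terminated string:
   the payload plus exactly one terminating NUL byte. */
size_t nul_terminated_size(const char *text) {
    return strlen(text) + 1;
}

/* Copy `text` into a counted string.
   Returns 0 on success, or -1 if the text does not fit in a one-byte count. */
int counted_string_set(struct counted_string *out, const char *text) {
    size_t n = strlen(text);
    if (n > 255) {
        return -1;
    }
    out->len = (unsigned char)n;
    memcpy(out->data, text, n);
    return 0;
}
```

Under these assumptions, a five-character string costs six bytes when null-terminated, while the counted form rejects any text longer than 255 bytes unless the count field is widened.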

Contents

  • Representation
  • Encoding
  • See also
  • References
  • External links

Representation

The null character is often represented as the escape sequence \0 in source code string literals or character constants.[6] In many languages (such as C, which introduced this notation), this is not a separate escape sequence but an octal escape sequence with a single octal digit of 0. As a consequence, \0 must not be followed by any of the digits 0 through 7; otherwise it is interpreted as the start of a longer octal escape sequence.[7] Other escape sequences found in various languages are \000, \x00, \z, and the Unicode representation \u0000. A null character can be placed in a URL with %00.
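The octal-escape pitfall can be seen directly in C. The sketch below assumes an ASCII execution character set; the array names are illustrative only. It relies on two properties of C: an octal escape consumes at most three octal digits, and adjacent string literals are concatenated after escape sequences have been converted.

```c
#include <assert.h>
#include <stddef.h>

/* "\012" is a single octal escape (value 10, the line feed in ASCII),
   not a NUL followed by the digits '1' and '2'. */
const char merged[] = "\012";

/* Splitting the literal keeps the NUL separate from the digits:
   the bytes are NUL, '1', '2', plus the implicit terminator. */
const char split[] = "\0" "12";

/* Writing all three octal digits also works, because the escape
   stops after three digits: NUL, '1', '2', plus the terminator. */
const char padded[] = "\00012";
```

Here sizeof merged is 2 (one character plus the implicit terminator), whereas sizeof split and sizeof padded are both 4.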

The ability to represent a null character does not always mean the resulting string will be correctly interpreted, as many programs will consider the null to be the end of the string. Thus the ability to type it (in case of unchecked user input) creates a vulnerability known as null byte injection and can lead to security exploits.[8]
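A simplified sketch of how such a mismatch arises is given below. It assumes a hypothetical program in which one layer validates input by its full byte length while a later layer treats the same bytes as a C string; the function names ends_with and has_embedded_nul are illustrative, not part of any standard API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Length-aware suffix check, as a layer that tracks byte counts might do.
   It looks at the last bytes of the buffer regardless of embedded NULs. */
bool ends_with(const char *buf, size_t len, const char *suffix) {
    size_t slen = strlen(suffix);
    return len >= slen && memcmp(buf + len - slen, suffix, slen) == 0;
}

/* One possible mitigation: reject input that contains a NUL byte
   anywhere within its declared length. */
bool has_embedded_nul(const char *buf, size_t len) {
    return memchr(buf, '\0', len) != NULL;
}
```

For the 15-byte input "report.php\0.jpg", ends_with reports that the name ends in ".jpg", yet strlen and every other C string function see only "report.php". Checking has_embedded_nul before handing the buffer to string-based code is one way to close the gap in this sketch.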

In caret notation the null character is ^@. On some keyboards, one can enter a null character by holding down Ctrl and pressing @ (which usually requires also holding Shift and pressing another key such as 2 or P). It is also common to be able to type a null with Ctrl+2, Alt+256, or Ctrl+Space.
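The link between ^@ and the value zero follows from how caret notation is derived in ASCII: the symbol after the caret is the control code with the 0x40 bit inverted. A minimal sketch, assuming ASCII and using the hypothetical function name caret_to_control:

```c
#include <assert.h>

/* Map the symbol written after '^' to the control code it denotes,
   assuming ASCII. Returns -1 for symbols outside caret notation. */
int caret_to_control(char symbol) {
    if (symbol == '?') {
        return 0x7F;               /* ^? denotes DEL */
    }
    if (symbol >= '@' && symbol <= '_') {
        return symbol ^ 0x40;      /* invert the 0x40 bit: '@' (0x40) becomes 0 */
    }
    return -1;
}
```

With this mapping, '@' yields 0 (the null character) and 'G' yields 7 (the bell character).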

In documentation the null character is sometimes represented as a single-em-width symbol containing the letters "NUL". In Unicode, there is a character with a corresponding glyph for visual representation of the null character, "symbol for null", U+2400 (␀), not to be confused with the actual null character, U+0000.

Encoding

In all modern character sets the null character has a code point value of zero. In most encodings this is translated to a single code unit with a zero value. For instance, in UTF-8 it is a single zero byte. However, in Modified UTF-8 the null character is encoded as two bytes: 0xC0, 0x80. This allows the byte with the value of zero, which is now not used for any character, to be used as a string terminator.
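The substitution described for Modified UTF-8 can be sketched as a small C routine. This is an illustration under stated assumptions rather than a complete encoder: the function name and the ENCODE_FAILED constant are hypothetical, and only the null-character rule is implemented (Modified UTF-8 as used by Java also treats supplementary characters differently, which is not handled here).

```c
#include <assert.h>
#include <stddef.h>

#define ENCODE_FAILED ((size_t)-1)

/* Copy `in_len` bytes of UTF-8 from `in` to `out`, writing each zero byte
   (U+0000) as the two-byte sequence 0xC0 0x80, then append a single zero
   byte as terminator. Returns the number of bytes written before the
   terminator, or ENCODE_FAILED if `out_cap` is too small. */
size_t encode_nul_modified_utf8(const unsigned char *in, size_t in_len,
                                unsigned char *out, size_t out_cap) {
    size_t o = 0;
    if (out_cap == 0) {
        return ENCODE_FAILED;
    }
    for (size_t i = 0; i < in_len; i++) {
        size_t need = (in[i] == 0x00) ? 2 : 1;
        if (o + need >= out_cap) {      /* keep one byte free for the terminator */
            return ENCODE_FAILED;
        }
        if (in[i] == 0x00) {
            out[o++] = 0xC0;
            out[o++] = 0x80;
        } else {
            out[o++] = in[i];
        }
    }
    out[o] = 0x00;                      /* the only zero byte in the output */
    return o;
}
```

Encoding the three bytes 'a', 0x00, 'b' produces 'a', 0xC0, 0x80, 'b' followed by the terminator, so the embedded null character survives while the output remains usable as a null-terminated string.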

See also

References

  1. ^ ASCII format for Network Interchange.  
  2. ^ "The set of control characters of the ISO 646" (PDF). Secretariat ISO/TC 97/SC 2. 1975-12-01. p. 4.4. Position: 0/0, Name: Null, Abbreviation: Nul 
  3. ^ "A byte with all bits set to 0, called the null character, shall exist in the basic execution character set; it is used to terminate a character string literal." — ANSI/ISO 9899:1990 (the ANSI C standard), section 5.2.1
  4. ^ "A string is a contiguous sequence of characters terminated by and including the first null character" — ANSI/ISO 9899:1990 (the ANSI C standard), section 7.1.1
  5. ^ Working Draft, Standard for Programming Language C++ (PDF) (ISO 14882 standard working draft),  
  6. ^ Kernighan and Ritchie, C, p. 38
  7. ^ In YAML this combination is a separate escape sequence.
  8. ^ Null Byte Injection WASC Threat Classification Null Byte Attack section.

External links

  • Null Byte Injection WASC Threat Classification Null Byte Attack section
  • Poison Null Byte Introduction Introduction to Null Byte Attack

This article was sourced from Creative Commons Attribution-ShareAlike License; additional terms may apply.