World Library  
Flag as Inappropriate
Email this Article

Tree-adjoining grammar

Article Id: WHEBN0000567555
Reproduction Date:

Title: Tree-adjoining grammar  
Author: World Heritage Encyclopedia
Language: English
Subject: Embedded pushdown automaton, Thread automaton, Deep linguistic processing, Indexed grammar, Formal grammar
Collection: Generative Linguistics, Grammar Frameworks
Publisher: World Heritage Encyclopedia

Tree-adjoining grammar

Tree-adjoining grammar (TAG) is a grammar formalism defined by Aravind Joshi. Tree-adjoining grammars are somewhat similar to context-free grammars, but the elementary unit of rewriting is the tree rather than the symbol. Whereas context-free grammars have rules for rewriting symbols as strings of other symbols, tree-adjoining grammars have rules for rewriting the nodes of trees as other trees (see tree (graph theory) and tree (data structure)).


  • History 1
  • Description 2
  • Complexity and application 3
    • Equivalences 3.1
  • Lexicalized Tree-adjoining grammar 4
  • Notes 5
  • References 6
  • External links 7


TAG originated in investigations by Joshi and his students into the family of adjunction grammars (AG),[1] the "string grammar" of Zellig Harris.[2] AGs handle endocentric properties of language in a natural and effective way, but do not have a good characterization of exocentric constructions; the converse is true of rewrite grammars, or phrase-structure grammar (PSG). In 1969, Joshi introduced a family of grammars that exploits this complementarity by mixing the two types of rules. A few very simple rewrite rules suffice to generate the vocabulary of strings for adjunction rules. This family is distinct from the Chomsky-Schützenberger hierarchy but intersects it in interesting and linguistically relevant ways.[3] The center strings and adjunct strings can also be generated by a dependency grammar, avoiding the limitations of rewrite systems entirely.[4] [5]


The rules in a TAG are trees with a special leaf node known as the foot node, which is anchored to a word. There are two types of basic trees in TAG: initial trees (often represented as '\alpha') and auxiliary trees ('\beta'). Initial trees represent basic valency relations, while auxiliary trees allow for recursion.[6] Auxiliary trees have the root (top) node and foot node labeled with the same symbol. A derivation starts with an initial tree, combining via either substitution or adjunction. Substitution replaces a frontier node with another tree whose top node has the same label. The root/foot label of the auxiliary tree must match the label of the node at which it adjoins. Adjunction can thus have the effect of inserting an auxiliary tree into the center of another tree.[4]

Other variants of TAG allow multi-component trees, trees with multiple foot nodes, and other extensions.

Complexity and application

Tree-adjoining grammars are more powerful (in terms of weak generative capacity) than context-free grammars, but less powerful than linear context-free rewriting systems,[7] indexed[note 1] or context-sensitive grammars.

A TAG can describe the language of squares (in which some arbitrary string is repeated), and the language \{a^n b^n c^n d^n | 1 \le n \}. This type of processing can be represented by an embedded pushdown automaton. Languages with cubes (i.e. triplicated strings) or with more than four distinct character strings of equal length cannot be generated by tree-adjoining grammars.

For these reasons, tree-adjoining grammars are often described as mildly context-sensitive. These grammar classes are conjectured to be powerful enough to model natural languages while remaining efficiently parsable in the general case.[8]


Vijay-Shanker and Weir (1994)[9] demonstrates that Linear Indexed Grammars, Combinatory Categorial Grammars, Tree-adjoining Grammars, and Head Grammars are weakly equivalent formalisms, in that they all define the same string languages.

Lexicalized Tree-adjoining grammar

Lexicalized Tree-Adjoining Grammars (LTAG) are a variant of TAG in which each elementary tree (initial or auxiliary) is associated with a lexical item. A lexicalized grammar for English has been developed by the XTAG Research Group of the Institute for Research in Cognitive Science at the University of Pennsylvania. [5]


  1. ^ since for each tree-adjoining grammar, a linear indexed grammar can be found producing the same language, see below, and for the latter, a weakly equivalent (proper) indexed grammar can be found, in turn, see Indexed grammar#Computational Power


  1. ^ Joshi, Aravind; S. R. Kosaraju; H. Yamada (1969). "String Adjunct Grammars". Proceedings Tenth Annual Symposium on Automata Theory, Waterloo, Canada.  Joshi, Aravind K.; Kosaraju, S. Rao; Yamada, H. M. (1972), "String Adjunct Grammars: I. Local and Distributed Adjunction", Information and Control 21 (2): 93–116,  
  2. ^ Harris, Zellig S. (1962). String analysis of sentence structure. Papers on Formal Linguistics 1. The Hague: Mouton & Co. 
  3. ^ Joshi, Aravind (1969). "Properties of Formal Grammars with Mixed Types of Rules and Their Linguistic Relevance". Proceedings Third International Symposium on Computational Linguistics, Stockholm, Sweden. 
  4. ^ a b Joshi, Aravind; Owen Rambow (2003). "A Formalism for Dependency Grammar Based on Tree Adjoining Grammar" (PDF). Proceedings of the Conference on Meaning-Text Theory. 
  5. ^ a b "A Lexicalized Tree Adjoining Grammar for English". 
  6. ^ Jurafsky, Daniel; James H. Martin (2000). Speech and Language Processing. Upper Saddle River, NJ: Prentice Hall. p. 354. 
  7. ^ Kallmeyer, Laura (2010). Parsing Beyond Context-Free Grammars. Springer. Here: p.215-216
  8. ^ Joshi, Aravind (1985). "How much context-sensitivity is necessary for characterizing structural descriptions". In D. Dowty, L. Karttunen, and A. Zwicky, (eds.). Natural Language Processing: Theoretical, Computational, and Psychological Perspectives. New York, NY: Cambridge University Press. pp. 206–250. 
  9. ^ Vijay-Shanker, K. and Weir, David J. 1994. The Equivalence of Four Extensions of Context-Free Grammars. Mathematical Systems Theory 27(6): 511–546.

External links

  • The XTAG project, which uses a TAG for natural language processing.
  • A tutorial on TAG
  • Another tutorial with focus on comparison with Lexical Functional Grammar and grammars extraction from Treebank
  • SemConst Documentation A quick survey on Syntax and Semantic Interface problematic within the TAG framework.
  • The TuLiPa project The Tübingen Linguistic Parsing Architecture (TuLiPA) is a multi-formalism syntactic (and semantic) parsing environment, designed mainly for multi-component tree adjoining grammars with tree tuples
  • The Metagrammar Toolkit which provides several tools to edit and compile MetaGrammars into TAGs. It also include a wide coverage French Metagrammars.
  • LLP2 A lexicalized tree adjoining grammar parser which provides an easy to use graphical environment (page in French)

This article was sourced from Creative Commons Attribution-ShareAlike License; additional terms may apply. World Heritage Encyclopedia content is assembled from numerous content providers, Open Access Publishing, and in compliance with The Fair Access to Science and Technology Research Act (FASTR), Wikimedia Foundation, Inc., Public Library of Science, The Encyclopedia of Life, Open Book Publishers (OBP), PubMed, U.S. National Library of Medicine, National Center for Biotechnology Information, U.S. National Library of Medicine, National Institutes of Health (NIH), U.S. Department of Health & Human Services, and, which sources content from all federal, state, local, tribal, and territorial government publication portals (.gov, .mil, .edu). Funding for and content contributors is made possible from the U.S. Congress, E-Government Act of 2002.
Crowd sourced content that is contributed to World Heritage Encyclopedia is peer reviewed and edited by our editorial staff to ensure quality scholarly research articles.
By using this site, you agree to the Terms of Use and Privacy Policy. World Heritage Encyclopedia™ is a registered trademark of the World Public Library Association, a non-profit organization.

Copyright © World Library Foundation. All rights reserved. eBooks from Project Gutenberg are sponsored by the World Library Foundation,
a 501c(4) Member's Support Non-Profit Organization, and is NOT affiliated with any governmental agency or department.