
Concentration parameter

Author: World Heritage Encyclopedia
Language: English
Subject: Dirichlet process, Hidden Markov model
Collection: Statistical Terminology, Theory of Probability Distributions
Publisher: World Heritage Encyclopedia

In probability theory and statistics, a concentration parameter is a special kind of numerical parameter of a parametric family of probability distributions. Concentration parameters occur in two settings: in the von Mises–Fisher distribution, and in conjunction with distributions whose domain is itself a set of probability distributions, such as the symmetric Dirichlet distribution and the Dirichlet process. The rest of this article focuses on the latter usage.

The larger the value of the concentration parameter, the more evenly the probability mass of the resulting distribution is spread (the more it tends towards the uniform distribution). The smaller the value of the concentration parameter, the sparser the resulting distribution, with most values or ranges of values having a probability near zero (in other words, the more it tends towards a distribution concentrated on a single point, the degenerate distribution defined by the Dirac delta function).
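The effect described above can be checked numerically. The following is a minimal sketch, assuming the standard construction of a Dirichlet sample as normalized independent gamma draws; the function names `sample_symmetric_dirichlet` and `average_max`, the dimension 5, and the chosen parameter values are illustrative assumptions, not part of the source article.

```python
import random

def sample_symmetric_dirichlet(alpha, k, rng):
    """Draw one probability vector from a symmetric Dirichlet
    with per-component parameter alpha and dimension k."""
    # Standard construction: normalize k independent Gamma(alpha, 1) draws.
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    if total == 0.0:
        # For tiny alpha every draw can underflow to 0.0. Fall back to a
        # point mass on a random component (the limiting behaviour).
        hot = rng.randrange(k)
        return [1.0 if i == hot else 0.0 for i in range(k)]
    return [d / total for d in draws]

def average_max(alpha, k, n_draws, rng):
    """Average size of the largest component over n_draws samples."""
    total = 0.0
    for _ in range(n_draws):
        total += max(sample_symmetric_dirichlet(alpha, k, rng))
    return total / n_draws

rng = random.Random(0)
for alpha in (0.01, 1.0, 100.0):
    # Near 1.0 means almost all mass sits on one component (sparse);
    # near 1/k means the mass is spread almost evenly.
    print(alpha, round(average_max(alpha, 5, 1000, rng), 3))
```

With a small parameter the largest component should absorb nearly all of the mass, while with a large parameter it should stay close to 1/k, which is 0.2 in this sketch.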

In the case of multivariate Dirichlet distributions, there is some confusion over how to define the concentration parameter. In the topic modelling literature, it is often defined as the sum of the individual Dirichlet parameters;[1] when discussing symmetric Dirichlet distributions (where the parameters are the same for all dimensions), it is often defined to be the value of the single Dirichlet parameter used in all dimensions. This second definition is smaller than the first by a factor of the dimension of the distribution.
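The relationship between the two conventions is a simple rescaling by the dimension. The sketch below is illustrative only; the helper names `total_concentration` and `per_component_value` and the example values (dimension 50, parameter 0.1) are assumptions made for the example.

```python
import math

def total_concentration(alphas):
    """Sum-of-parameters convention (as used in topic modelling)."""
    return math.fsum(alphas)

def per_component_value(total, k):
    """Single-parameter convention for a symmetric Dirichlet of dimension k."""
    return total / k

k = 50
alphas = [0.1] * k                       # symmetric: same value in every dimension
total = total_concentration(alphas)      # k times the per-component value
print(total, per_component_value(total, k))
```

When reading a reported concentration parameter, it is therefore worth checking which convention the source uses, since the two differ by a factor of k.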

A concentration parameter of 1 (or k, the dimension of the Dirichlet distribution, by the definition used in the topic modelling literature) results in all sets of probabilities being equally likely, i.e. in this case the Dirichlet distribution of dimension k is equivalent to a uniform distribution over a (k−1)-dimensional simplex. Note that this is not the same as what happens when the concentration parameter tends towards infinity. In the former case, all resulting distributions are equally likely (the distribution over distributions is uniform). In the latter case, only near-uniform distributions are likely (the distribution over distributions is highly peaked around the uniform distribution). Meanwhile, in the limit as the concentration parameter tends towards zero, only distributions with nearly all mass concentrated on one of their components are likely (the distribution over distributions is highly peaked around the k possible Dirac delta distributions centered on one of the components, or in terms of the (k−1)-dimensional simplex, is highly peaked at the corners of the simplex).
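The three regimes can be compared by evaluating the symmetric Dirichlet log-density, log Γ(kα) − k log Γ(α) + (α − 1) Σ log p_i, at a few points of the simplex. This is a minimal sketch using the per-component convention for α; the function name `symmetric_dirichlet_logpdf` and the test points are assumptions chosen for illustration.

```python
import math

def symmetric_dirichlet_logpdf(p, alpha):
    """Log-density of a symmetric Dirichlet(alpha) at probability vector p.
    All entries of p must be strictly positive and sum to 1."""
    k = len(p)
    log_norm = math.lgamma(k * alpha) - k * math.lgamma(alpha)
    return log_norm + (alpha - 1.0) * sum(math.log(x) for x in p)

uniform = [1 / 3, 1 / 3, 1 / 3]
skewed = [0.8, 0.1, 0.1]
corner = [0.98, 0.01, 0.01]

for alpha in (0.1, 1.0, 50.0):
    # alpha = 1: the same value at every point (flat density).
    # alpha >> 1: highest at the uniform vector.
    # alpha << 1: highest near a corner of the simplex.
    values = [symmetric_dirichlet_logpdf(p, alpha) for p in (uniform, skewed, corner)]
    print(alpha, [round(v, 3) for v in values])
```

For α = 1 the last term vanishes, so the log-density equals log Γ(k) everywhere on the simplex, which is the uniform case described above.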

As an example of where a sparse prior (concentration parameter much less than 1) is called for, consider a topic model, which is used to learn the topics that are discussed in a set of documents, where each "topic" is described using a categorical distribution over a vocabulary of words. A typical vocabulary might have 100,000 words, leading to a 100,000-dimensional categorical distribution. The prior distribution for the parameters of the categorical distribution would likely be a symmetric Dirichlet distribution. However, a coherent topic might only have a few hundred words with any significant probability mass. Accordingly, a reasonable setting for the concentration parameter might be 0.01 or 0.001. With a larger vocabulary of around 1,000,000 words, an even smaller value, e.g. 0.0001, might be appropriate.
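The sparsity argument can be illustrated by drawing a single "topic" from a symmetric Dirichlet prior and counting how many words are needed to cover most of its probability mass. This is a rough sketch under stated assumptions: the vocabulary is reduced to a hypothetical 20,000 words to keep the run fast, the 90% coverage threshold is arbitrary, and the helper names are invented for the example.

```python
import random

def sample_symmetric_dirichlet(alpha, k, rng):
    """Draw one probability vector by normalizing Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    if total == 0.0:
        # All draws underflowed (possible for tiny alpha): use a point mass.
        hot = rng.randrange(k)
        return [1.0 if i == hot else 0.0 for i in range(k)]
    return [d / total for d in draws]

def words_needed(p, mass=0.9):
    """Smallest number of highest-probability words whose total is >= mass."""
    covered = 0.0
    for count, prob in enumerate(sorted(p, reverse=True), start=1):
        covered += prob
        if covered >= mass:
            return count
    return len(p)

vocab_size = 20_000   # hypothetical, smaller than the 100,000 in the text
rng = random.Random(0)
for alpha in (1.0, 0.01, 0.001):
    topic = sample_symmetric_dirichlet(alpha, vocab_size, rng)
    print(alpha, words_needed(topic))
```

With a parameter of 1 a large share of the vocabulary is needed to reach the threshold, whereas with values such as 0.01 or 0.001 a small fraction of the words should carry most of the mass, in line with the idea that a coherent topic uses only a limited set of words.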

References

  1. Wallach, Hanna M.; Murray, Iain; Salakhutdinov, Ruslan; Mimno, David (2009). Proceedings of the 26th Annual International Conference on Machine Learning (ICML '09). New York, NY, USA: ACM. pp. 1105–1112.

This article was sourced under the Creative Commons Attribution-ShareAlike License; additional terms may apply.