Multinomial probit

Article Id: WHEBN0014758355

Title: Multinomial probit  
Author: World Heritage Encyclopedia
Language: English
Subject: Regression analysis, General linear model, Mixed logit, Polynomial regression, Fixed effects model
Publisher: World Heritage Encyclopedia

In statistics and econometrics, the multinomial probit model is a generalization of the probit model used when there are several possible categories that the dependent variable can fall into. As such, it is an alternative to the multinomial logit model as one method of multiclass classification. It is not to be confused with the multivariate probit model, which is used to model correlated binary outcomes for more than one dependent variable.

General specification

It is assumed that we have a series of observations Y_i, for i = 1...n, of the outcomes of multi-way choices from a categorical distribution of size m (there are m possible choices). Along with each observation Y_i is a set of k observed values x_{1,i}, ..., x_{k,i} of explanatory variables (also known as independent variables, predictor variables, features, etc.). Some examples:

  • The observed outcomes might be "has disease A, has disease B, has disease C, has none of the diseases" for a set of rare diseases with similar symptoms, and the explanatory variables might be characteristics of the patients thought to be pertinent (sex, race, age, blood pressure, body-mass index, presence or absence of various symptoms, etc.).
  • The observed outcomes are the votes of people for a given party or candidate in a multi-way election, and the explanatory variables are the demographic characteristics of each person (e.g. sex, race, age, income, etc.).

The multinomial probit model is a statistical model that can be used to predict the likely outcome of an unobserved multi-way trial given the associated explanatory variables. In the process, the model attempts to explain the relative effect of differing explanatory variables on the different outcomes.

Formally, the outcomes Y_i are described as being categorically distributed data, where each outcome value h for observation i occurs with an unobserved probability p_{i,h} that is specific to the observation i at hand because it is determined by the values of the explanatory variables associated with that observation. That is:

Y_i|x_{1,i},\ldots,x_{k,i} \ \sim \operatorname{Categorical}(p_{i,1},\ldots,p_{i,m}),\text{ for }i = 1, \dots , n

or equivalently

\Pr[Y_i=h|x_{1,i},\ldots,x_{k,i}] = p_{i,h},\text{ for }i = 1, \dots , n,

for each of m possible values of h.

Latent variable model

Multinomial probit is often written in terms of a latent variable model:

\begin{align} Y_i^{1\ast} &= \boldsymbol\beta_1 \cdot \mathbf{X}_i + \varepsilon_1 \, \\ Y_i^{2\ast} &= \boldsymbol\beta_2 \cdot \mathbf{X}_i + \varepsilon_2 \, \\ \ldots & \ldots \\ Y_i^{m\ast} &= \boldsymbol\beta_m \cdot \mathbf{X}_i + \varepsilon_m \, \end{align}


\boldsymbol\varepsilon \sim \mathcal{N}(0,\boldsymbol\Sigma)


Y_i = \begin{cases} 1 & \text{if }Y_i^{1\ast} > Y_i^{2\ast},\ldots,Y_i^{m\ast} \\ 2 & \text{if }Y_i^{2\ast} > Y_i^{1\ast},Y_i^{3\ast},\ldots,Y_i^{m\ast} \\ \ldots & \ldots \\ m &\text{otherwise.} \end{cases}

That is,

Y_i = \arg\max_{h=1}^m Y_i^{h\ast}
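
The latent-variable formulation can be simulated directly, which may help make the argmax rule concrete. The following is a minimal sketch, not a reference implementation: it uses only the Python standard library, the names `cholesky` and `draw_choice` and all numeric values are made up for illustration, and it assumes that each observation receives its own independent draw of the error vector.

```python
import math
import random

def cholesky(S):
    """Lower-triangular L with L L^T = S (S must be positive definite)."""
    m = len(S)
    L = [[0.0] * m for _ in range(m)]
    for r in range(m):
        for c in range(r + 1):
            s = S[r][c] - sum(L[r][t] * L[c][t] for t in range(c))
            L[r][c] = math.sqrt(s) if r == c else s / L[c][c]
    return L

def draw_choice(x, betas, L, rng):
    """Draw one outcome in {1, ..., m} for the explanatory vector x."""
    m = len(betas)
    z = [rng.gauss(0.0, 1.0) for _ in range(m)]
    # eps = L z has covariance L L^T = Sigma.
    eps = [sum(L[h][t] * z[t] for t in range(h + 1)) for h in range(m)]
    # Latent values Y_i^{h*} = beta_h . X_i + eps_h.
    latent = [sum(b * v for b, v in zip(betas[h], x)) + eps[h]
              for h in range(m)]
    # The observed outcome is the alternative with the largest latent value.
    return max(range(m), key=latent.__getitem__) + 1

# Hypothetical example: m = 3 alternatives, k = 2 explanatory variables
# (an intercept and one regressor).
rng = random.Random(0)
betas = [[0.0, 0.0], [0.5, 1.0], [-0.5, -1.0]]
Sigma = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.5], [0.0, 0.5, 1.0]]
L = cholesky(Sigma)
X = [[1.0, rng.gauss(0.0, 1.0)] for _ in range(500)]
Y = [draw_choice(x, betas, L, rng) for x in X]
```

Here `Y` holds 500 simulated outcomes labelled 1 to 3; the off-diagonal 0.5 in `Sigma` makes the errors of alternatives 2 and 3 positively correlated.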

Note that this model allows for arbitrary correlation between the error variables, so it does not necessarily satisfy independence of irrelevant alternatives.
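
One way to see the effect of correlated errors is to estimate choice probabilities by simulation and compare the ratio Pr[Y = 1] / Pr[Y = 2] with and without a third alternative; under independence of irrelevant alternatives that ratio would be the same in both cases. The sketch below is illustrative only: the function `simulated_shares`, the correlation 0.95, and the equal deterministic parts are assumptions chosen for the example, not part of the model definition.

```python
import math
import random

def simulated_shares(V, rho, available, n_draws=100_000, seed=0):
    """Monte Carlo estimate of the choice probabilities.

    V         : deterministic parts beta_h . X_i of three alternatives
    rho       : error correlation between alternatives 2 and 3
                (list indices 1 and 2); alternative 1 is independent
    available : list indices of the alternatives in the choice set
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    wins = {h: 0 for h in available}
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(3)]
        # Unit-variance errors with corr(eps[1], eps[2]) = rho.
        eps = [z[0], z[1], rho * z[1] + scale * z[2]]
        best = max(available, key=lambda h: V[h] + eps[h])
        wins[best] += 1
    return {h: w / n_draws for h, w in wins.items()}

V = [0.0, 0.0, 0.0]  # all alternatives equally attractive on average
full = simulated_shares(V, rho=0.95, available=[0, 1, 2])
reduced = simulated_shares(V, rho=0.95, available=[0, 1])

ratio_full = full[0] / full[1]           # alternative 3 present
ratio_reduced = reduced[0] / reduced[1]  # alternative 3 removed
print(ratio_reduced, ratio_full)
```

With these settings the ratio is close to 1 when only alternatives 1 and 2 are available, but clearly larger than 1 once alternative 3, whose error is strongly correlated with that of alternative 2, enters the choice set.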

When \boldsymbol\Sigma is the identity matrix (such that there is no correlation or heteroscedasticity), the model is called independent probit.
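
In the independent case each choice probability reduces to a one-dimensional integral: conditioning on \varepsilon_h = z, alternative h is chosen when every other error satisfies \varepsilon_j < z + V_h - V_j, where V_h = \boldsymbol\beta_h \cdot \mathbf{X}_i, so \Pr[Y_i = h] = \int \phi(z) \prod_{j \ne h} \Phi(z + V_h - V_j)\,dz. This is a standard consequence of the definition rather than something stated in the text above. The sketch below evaluates it with a simple midpoint rule; the function name, the integration range, and the example values are assumptions.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: pdf is phi, cdf is Phi

def independent_probit_probs(V, lo=-8.0, hi=8.0, steps=2000):
    """Pr[Y = h] for every h when Sigma is the identity matrix.

    V holds the deterministic parts beta_h . X_i for one observation.
    The integral over eps_h is approximated by the midpoint rule on
    [lo, hi]; the normal density is negligible outside that range.
    """
    width = (hi - lo) / steps
    probs = []
    for h, v_h in enumerate(V):
        total = 0.0
        for s in range(steps):
            z = lo + (s + 0.5) * width
            term = STD_NORMAL.pdf(z)
            for j, v_j in enumerate(V):
                if j != h:
                    # Probability that alternative j stays below h.
                    term *= STD_NORMAL.cdf(z + v_h - v_j)
            total += term * width
        probs.append(total)
    return probs

probs = independent_probit_probs([0.0, 0.5, -0.5])
print(probs, sum(probs))
```

As a sanity check, with two alternatives the integral should agree with the binary probit expression \Phi((V_2 - V_1)/\sqrt{2}) for choosing alternative 2, since the difference of two independent standard normal errors has variance 2.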

This article was sourced from Creative Commons Attribution-ShareAlike License; additional terms may apply.