Measurement software

Psychometric software is software that is used for psychometric analysis of data from tests, questionnaires, or inventories reflecting latent psychoeducational variables. While some psychometric analyses can be performed with standard statistical software like SPSS, most analyses require specialized tools.

Sources

Because only a few commercial businesses (most notably Assessment Systems Corporation and Scientific Software International) develop specialized psychometric tools, there exist many free tools developed by researchers and educators. Important websites for free psychometric software include:

  • CASMA at the University of Iowa, USA
  • REMP at the University of Massachusetts, USA
  • Software from Brad Hanson
  • Software from John Uebersax
  • Software from J. Patrick Meyer
  • Software directory at the Institute for Objective Measurement
  • Software from Lihua Yao

Classical test theory

Classical test theory is an approach to psychometric analysis that has weaker assumptions than item response theory and is more applicable to smaller sample sizes.

CITAS

CITAS (Classical Item and Test Analysis Spreadsheet) is a free Excel workbook designed to provide scoring and statistical analysis of classroom tests. Item responses (ABCD) and keys are typed or pasted into the workbook, and the output populates automatically. Unlike other programs, CITAS does not require the user to "run" anything or to have experience in psychometric analysis, which makes it accessible to school teachers and professors. It is available as a free download.
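The scoring step described above, matching lettered responses against an answer key, can be illustrated with a minimal Python sketch. This is not CITAS code; the function name `score_responses` and the sample data are hypothetical and serve only to show the idea.

```python
from typing import List


def score_responses(responses: List[str], key: str) -> List[int]:
    """Return one total score per examinee.

    Each response string holds one letter per item; an item counts as
    correct when the letter matches the key at the same position.
    """
    return [sum(r == k for r, k in zip(resp, key)) for resp in responses]


# Hypothetical five-item test taken by four examinees.
key = "ABCDA"
responses = ["ABCDA", "ABCDB", "BBCDA", "AACCA"]
scores = score_responses(responses, key)
print(scores)  # [5, 4, 4, 3]
```

Omitted or multiply marked responses are not handled here; a real scoring tool would need a rule for them.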

jMetrik

jMetrik is free and open source software for comprehensive psychometric analysis, associated with the University of Virginia. Current methods include classical item analysis, differential item functioning (DIF) analysis, confirmatory factor analysis, item response theory, IRT equating, and nonparametric item response theory. The item analysis includes proportion, point biserial, and biserial statistics for all response options. Reliability coefficients include Cronbach's alpha, Guttman's lambda, the Feldt-Gilmer coefficient, the Feldt-Brennan coefficient, decision consistency indices, the conditional standard error of measurement, and reliability if item deleted. The DIF analysis is based on nonparametric item characteristic curves and the Mantel-Haenszel procedure. DIF effect sizes and ETS DIF classifications are included in the output. Confirmatory factor analysis is limited to the common factor model for congeneric, tau-equivalent, and parallel measures. Fit statistics are reported along with factor loadings and error variances. IRT methods include the Rasch, partial credit, and rating scale models. IRT equating methods include mean/mean, mean/sigma, Haebara, and Stocking-Lord procedures.

jMetrik also includes basic descriptive statistics and a graphics facility that produces bar charts, pie charts, histograms, kernel density estimates, and line plots.

jMetrik is a pure Java application that runs on 32-bit and 64-bit versions of Windows, Mac, and Linux operating systems. jMetrik requires Java 1.6 on the host computer. jMetrik is available as a free download from www.ItemAnalysis.com.
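Of the reliability coefficients listed above, Cronbach's alpha is the simplest to compute by hand. The following Python sketch applies the textbook formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). It is not jMetrik's implementation; the function name and data are hypothetical, and population variances are used throughout as an assumption (the ratio is the same if sample variances are used consistently).

```python
from statistics import pvariance
from typing import List


def cronbach_alpha(scores: List[List[float]]) -> float:
    """Cronbach's alpha for a matrix of examinee rows by item columns."""
    k = len(scores[0])  # number of items
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 0/1 item scores: four examinees, three items.
data = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(data)
print(round(alpha, 3))
```

The sketch assumes at least two items and non-zero variance in the total scores; otherwise the formula is undefined.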

Iteman

Iteman is a commercial program specifically designed for classical test analysis, producing rich text (RTF) reports with graphics, narratives, and embedded tables. It calculates the proportion and point biserial of each item, as well as high/low subgroup proportions, and produces detailed graphics of item performance. It also calculates typical descriptive statistics, including the mean, standard deviation, reliability, and standard error of measurement, for each domain and for the overall test. It is only available from Assessment Systems Corporation [4].
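The two item statistics named above can be sketched in a few lines of Python. The proportion is the share of examinees answering the item correctly, and the point biserial is the correlation between the 0/1 item score and the total score. This is an illustration only, not Iteman's algorithm: the function name and data are hypothetical, and the total score here includes the item itself (an uncorrected point biserial), which is an assumption of this sketch.

```python
from math import sqrt
from typing import List, Tuple


def item_stats(matrix: List[List[int]]) -> List[Tuple[float, float]]:
    """Return (proportion correct, point biserial) for each item column."""
    n = len(matrix)
    totals = [sum(row) for row in matrix]
    mean_t = sum(totals) / n
    sd_t = sqrt(sum((t - mean_t) ** 2 for t in totals) / n)
    results = []
    for j in range(len(matrix[0])):
        item = [row[j] for row in matrix]
        p = sum(item) / n
        if 0 < p < 1 and sd_t > 0:
            # Mean total score of examinees who answered the item correctly.
            mean_correct = sum(t for t, x in zip(totals, item) if x == 1) / sum(item)
            r_pb = (mean_correct - mean_t) / sd_t * sqrt(p / (1 - p))
        else:
            r_pb = float("nan")  # undefined when the item or total has no variance
        results.append((p, r_pb))
    return results


# Hypothetical 0/1 item scores: four examinees, three items.
matrix = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
stats = item_stats(matrix)
for p, r in stats:
    print(round(p, 2), round(r, 3))
```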

Lertap

Lertap (Laboratory of Educational Research Test Analysis Program) is a comprehensive software package for classical test analysis developed for use with Microsoft Excel. It includes test, item, and option statistics, classification consistency and mastery test analysis, procedures for cheating detection, and extensive graphics (e.g., trace lines for item options, conditional standard errors of measurement, scree plots, boxplots of group differences, histograms, scatterplots).

Differential item functioning (DIF) analysis is supported in the Excel 2007, Excel 2010, Excel 2011 (Macintosh), and Excel 2013 versions of Lertap. Mantel-Haenszel methods are used, and graphs of the results are provided.
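To show what a Mantel-Haenszel DIF statistic involves, the sketch below computes the common odds ratio across score-matched strata and converts it to the ETS delta scale (delta = −2.35 × ln of the odds ratio). It is not Lertap code. The function name, the tuple layout (reference correct, reference incorrect, focal correct, focal incorrect), and the counts are all hypothetical.

```python
from math import log
from typing import List, Tuple

Stratum = Tuple[int, int, int, int]  # (ref correct, ref incorrect, focal correct, focal incorrect)


def mantel_haenszel(strata: List[Stratum]) -> Tuple[float, float]:
    """Return the MH common odds ratio and its ETS delta transformation."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    alpha_mh = num / den
    return alpha_mh, -2.35 * log(alpha_mh)


# Two hypothetical score strata, each with an odds ratio of 2
# in favour of the reference group.
strata = [(20, 10, 10, 10), (40, 20, 20, 20)]
alpha_mh, delta_mh = mantel_haenszel(strata)
print(round(alpha_mh, 3), round(delta_mh, 3))
```

A negative delta indicates that the item favours the reference group. The sketch does not guard against empty strata or a zero denominator, and it omits the significance test that normally accompanies the effect size.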

Lertap can produce ASCII data files ready for input to Xcalibre and BILOG-MG.

Several sample datasets for use with Lertap and/or other item and test analysis programs are available [6].

Lertap was developed by Larry Nelson at [8].

TAP

TAP (the Test Analysis Program) is a free program for basic classical analysis developed by Gordon Brooks at Ohio University.

ViSta-CITA

ViSta-CITA (Classical Item and Test Analysis) is a module included in the Visual Statistics System (ViSta) that focuses on graphical-oriented methods applied to psychometric analysis. It is freely available at [9]. It was developed by Ruben Ledesma, J. Gabriel Molina, Pedro M. Valero-Mora, and Forrest W. Young.

Item response theory calibration

Item response theory (IRT) is a psychometric approach which assumes that the probability of a certain response is a direct function of an underlying trait or traits. Various functions have been proposed to model this relationship, and the different calibration packages reflect this. Several software packages have been developed for additional analysis such as equating; they are listed in the next section.
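One widely used family of such functions is the logistic model with up to three item parameters. The Python sketch below evaluates the three-parameter logistic (3PL) form, P(θ) = c + (1 − c) / (1 + exp(−a(θ − b))), where a is discrimination, b is difficulty, and c is the lower asymptote. The function name is hypothetical, and the scaling constant of 1.7 that some programs apply is omitted here as a simplifying assumption.

```python
from math import exp


def irt_3pl(theta: float, a: float = 1.0, b: float = 0.0, c: float = 0.0) -> float:
    """Probability of a correct response under the 3PL model.

    With c = 0 this reduces to the 2PL model; with c = 0 and a common
    value of a for all items it has the 1PL (Rasch-type) form.
    """
    return c + (1.0 - c) / (1.0 + exp(-a * (theta - b)))


# An examinee whose trait level equals the item difficulty.
print(irt_3pl(0.0))                       # 0.5 when c = 0
print(irt_3pl(0.0, a=1.2, b=0.0, c=0.2))  # halfway between c and 1
```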

BILOG-MG

BILOG-MG is a software program for IRT analysis of dichotomous (correct/incorrect) data, including fit and [11].

Facets

Facets is a software program for Rasch analysis of rater- or judge-intermediated data, such as essay grades, diving competitions, satisfaction surveys and quality-of-life data. Other applications include rank-order data, binomial trials and Poisson counts. For availability, see Software directory at the Institute for Objective Measurement.

flexMIRT

flexMIRT is a new multilevel and multiple group IRT software package for item analysis and test scoring. This IRT software package fits a variety of unidimensional and multidimensional item response theory models (also known as item factor analysis models) to single-level and multilevel data in any number of groups. It is available from Vector Psychometric Group, LLC [12].

ICL

ICL (IRT Command Language) performs IRT calibrations, including the 1, 2, and 3 parameter logistic models as well as the partial credit model and generalized partial credit model. It can also generate response data. As the name implies, it is completely command code driven, with no graphical user interface. It is available as a free download.

jMetrik

jMetrik, described in detail above under 'Classical test theory', also performs IRT calibration. Its IRT methods include the Rasch, partial credit, and rating scale models, and its IRT equating methods include mean/mean, mean/sigma, Haebara, and Stocking-Lord procedures. It is available as a free download from www.ItemAnalysis.com.

MULTILOG

MULTILOG is an extension of BILOG to data with polytomous (multiple) responses. It is commercial, and only available from Scientific Software International [15].

BMIRT

BMIRT [16] is a free multi-purpose Java application that conducts item calibration and ability estimation in a multidimensional, multi-group item response theory (IRT) framework; it can fit dichotomous or polytomous models, along with mixed models. It supports both exploratory and confirmatory analyses, for both compensatory and noncompensatory MIRT models.

PARSCALE

PARSCALE is a program designed specifically for polytomous IRT analysis. It is commercial, and only available from Scientific Software International [18].

PARAM-3PL

PARAM-3PL here.

TESTFact

Testfact features [20]:

  • Marginal maximum likelihood (MML) exploratory factor analysis and classical item analysis of binary data
  • Computes tetrachoric correlations, principal factor solution, classical item descriptive statistics, fractile tables and plots
  • Handles up to 10 factors using numerical quadrature: up to 5 for non-adaptive and up to 10 for adaptive quadrature
  • Handles up to 15 factors using Monte Carlo integration techniques
  • Varimax (orthogonal) and PROMAX (oblique) rotation of factor loadings
  • Handles an important form of confirmatory factor analysis known as "bifactor" analysis: the factor pattern consists of one main factor plus group factors
  • Simulation of responses to items based on user-specified parameters
  • Correction for guessing and not-reached items
  • Allows imposition of constraints on item parameter estimates
  • Handles omitted and not-presented items
  • Detailed online HELP documentation includes syntax and annotated examples

WINMIRA 2001

WINMIRA 2001 is a program for analyses with the [22].

Winsteps

Winsteps is a program designed for analysis with the [23]. A previous DOS-based version, BIGSTEPS, is also available.

Xcalibre

Xcalibre is a commercial program that performs marginal maximum likelihood estimation of both dichotomous (1PL-Rasch, 2PL, 3PL) and all major polytomous IRT models. The interface is point-and-click; no command code is required. Its output includes both spreadsheets and a detailed, narrated report document with embedded tables and figures, which can be printed and delivered to subject matter experts for item review. It is only available from Assessment Systems Corporation [24].

IATA

IATA is a software package for analysing psychometric and educational assessment data. The interface is point-and-click, and all functionality is delivered through wizard-style interfaces that are based on different workflows or analysis goals, such as pilot testing or equating. IATA reads and writes csv, Excel and SPSS file formats, and produces exportable graphics for all statistical analyses. Each analysis also includes heuristics suggesting appropriate interpretations of the numerical results. IATA performs factor analysis, (1PL-Rasch, 2PL, 3PL) scaling and calibration, differential item functioning (DIF) analysis, (basic) computer aided test development, equating, IRT-based standard setting, score conditioning, and plausible value generation. It is available for free from Polymetrika International [25].

Additional item response theory software

Because of the complexity of IRT, there exist few software packages capable of calibration. However, many software programs exist for specific ancillary IRT analyses such as equating and scaling. Examples of such software follow.

eqboot

eqboot is an open source, syntax-based Java application, developed by J. Patrick Meyer, for conducting IRT equating and computing the bootstrap standard error of equating. The program runs on any 32- or 64-bit operating system that has the Java Runtime Environment (JRE) version 1.6 or higher installed. At the moment, the program only supports equating with binary items. EQBOOT will compute equating constants using the mean/mean, mean/sigma, and Haebara procedures, among others. It is available from www.ItemAnalysis.com.
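As an illustration of what equating constants are, the following Python sketch computes the slope A and intercept B of the mean/sigma method from the difficulty estimates of common items on two forms: A is the ratio of the standard deviations and B aligns the means. This is not eqboot code; the function name, argument names, and difficulty values are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def mean_sigma(b_new: List[float], b_base: List[float]) -> Tuple[float, float]:
    """Mean/sigma linking constants that place the new form on the base scale.

    After linking, difficulty b becomes A * b + B, ability theta becomes
    A * theta + B, and discrimination a becomes a / A.
    """
    A = pstdev(b_base) / pstdev(b_new)
    B = mean(b_base) - A * mean(b_new)
    return A, B


# Hypothetical common-item difficulties; the base values equal 2 * b + 0.5.
b_new = [-1.0, 0.0, 1.0]
b_base = [-1.5, 0.5, 2.5]
A, B = mean_sigma(b_new, b_base)
print(round(A, 3), round(B, 3))
```

The mean/mean method differs only in how A is obtained (from the mean discriminations rather than the standard deviations of the difficulties), while the Haebara and Stocking-Lord methods choose A and B by minimising differences between characteristic curves.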

LinkMIRT

LinkMIRT [26] is a free Java application that links two sets of item parameters in a multidimensional IRT (MIRT) framework. The software can implement the Stocking and Lord method, the mean/mean method, and the mean/sigma method. Linking by common-person and by random equivalent-groups designs is supported.

SimuMIRT

SimuMIRT [27] is a program that simulates multidimensional data (examinee ability and item responses) for a fixed form (i.e., paper and pencil) test, from a user-specified set of parameters. The rater-effect model is supported.

SimuMCAT

SimuMCAT [29]). Two exposure control approaches are possible: the traditional Sympson-Hetter approach and a maximum exposure control approach. It is also possible to implement content constraints using the Priority Index method. Different stopping rules are implemented for fixed-length and variable-length tests. The user specifies true examinee ability, item pools, and item selection procedures, and the program outputs selected items with item responses and ability estimates. Bayesian and non-Bayesian methods can be specified by the user. Examinee abilities and item pools can also be generated by the program from user-specified distributions.

IRTEQ

IRTEQ

ResidPlots-2

ResidPlots-2 University of Massachusetts Amherst.

WinGen

WinGen

ST

ST [33] conducts item response theory (IRT) scale transformations for dichotomously scored tests.

POLYST

POLYST [34] conducts IRT scale transformations for dichotomously and polytomously scored tests.

STUIRT

STUIRT [35] conducts IRT scale transformations for mixed-format tests (tests that include some multiple choice items and some polytomous items).

Decision consistency

Decision consistency methods are applicable to criterion-referenced tests such as licensure exams and academic mastery testing.

Iteman

Iteman [36] provides an index of decision consistency as well as a classical estimate of the conditional standard error of measurement at the cutscore, which is often requested for accreditation of a testing program.

jMetrik

jMetrik [37] is free and open source software for conducting a comprehensive psychometric analysis. Detailed information is listed above. jMetrik includes Huynh's decision consistency estimates if cut-scores are provided in the item analysis.

Lertap

Lertap [38] calculates several statistics related to decision and classification consistency, including Livingston's coefficient, the Brennan-Kane dependability index, kappa, and an estimate of p(0), the proportion of consistent classifications, derived using the Peng-Subkoviak adaptation of Huynh's method. More detailed information concerning Lertap is provided above, under 'Classical test theory'.
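The meaning of p(0) and kappa can be shown with a small Python sketch. Note that it uses a different and simpler setting than the single-administration estimates mentioned above: it assumes pass/fail classifications are available from two administrations and computes the observed agreement and Cohen's kappa directly. The function name and data are hypothetical.

```python
from typing import List, Tuple


def agreement_and_kappa(first: List[int], second: List[int]) -> Tuple[float, float]:
    """Observed agreement p0 and Cohen's kappa for two 0/1 classifications."""
    n = len(first)
    p0 = sum(x == y for x, y in zip(first, second)) / n
    pass_first = sum(first) / n
    pass_second = sum(second) / n
    # Agreement expected by chance from the two marginal pass rates.
    pc = pass_first * pass_second + (1 - pass_first) * (1 - pass_second)
    return p0, (p0 - pc) / (1 - pc)


# Hypothetical pass (1) / fail (0) decisions for eight examinees.
first = [1, 1, 1, 0, 0, 1, 0, 1]
second = [1, 1, 0, 0, 0, 1, 1, 1]
p0, kappa = agreement_and_kappa(first, second)
print(round(p0, 3), round(kappa, 3))
```

Kappa is undefined when chance agreement equals 1 (for example, when everyone passes on both occasions); the sketch does not handle that case.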

General statistical analysis software

Software designed for general statistical analysis can often be used for certain types of psychometric analysis. Moreover, code for more advanced types of psychometric analysis is often available.

R

[40].

SPSS

SPSS, originally called the Statistical Package for the Social Sciences, is a commercial general statistical analysis program where the data is presented in a spreadsheet layout and common analyses are menu driven.

S-Plus

S-Plus is a commercial analysis package based on the programming language S.

SAS

SAS is a commercially available package for statistical analysis and manipulation of data. It is command-based.
