Package ‘polycor’

 


August 27, 2016

 

Version 0.7-9

Date 2016-08-26

Title Polychoric and Polyserial Correlations

Depends R (>= 3.3.0)

Imports stats, mvtnorm, Matrix

ByteCompile yes

LazyLoad yes

Description Computes polychoric and polyserial correlations by quick “two-step” methods or ML, optionally with standard errors; tetrachoric and biserial correlations are special cases.

License GPL (>= 2)

URL https://r-forge.r-project.org/projects/polycor/, http://CRAN.R-project.org/package=polycor

Author John Fox [aut, cre]
Maintainer John Fox <jfox@mcmaster.ca>
Repository CRAN
Repository/R-Forge/Project polycor

Repository/R-Forge/Revision 13

Repository/R-Forge/DateTimeStamp 2016-08-26 18:25:37

Date/Publication 2016-08-27 00:22:11
NeedsCompilation no

R topics documented:

hetcor

polychor

polyserial

print.polycor

 

hetcor Heterogeneous Correlation Matrix 

 

 

 

 

Description

Computes a heterogeneous correlation matrix, consisting of Pearson product-moment correlations between numeric variables, polyserial correlations between numeric and ordinal variables, and polychoric correlations between ordinal variables.

Usage

    hetcor(data, ..., ML = FALSE, std.err = TRUE, bins=4, pd=TRUE)
    ## S3 method for class 'data.frame'
    hetcor(data, ML = FALSE, std.err = TRUE,
      use = c("complete.obs", "pairwise.complete.obs"), bins=4, pd=TRUE, ...)
    ## Default S3 method:
    hetcor(data, ..., ML = FALSE, std.err = TRUE, bins=4, pd=TRUE)
    ## S3 method for class 'hetcor'
    print(x, digits = max(3, getOption("digits") - 3), ...)
    ## S3 method for class 'hetcor'
    as.matrix(x, ...)

Arguments

data a data frame consisting of factors, ordered factors, logical variables, and/or numeric variables, or the first of several variables.

... variables and/or arguments to be passed down.

ML if TRUE, compute maximum-likelihood estimates; if FALSE, compute quick two-step estimates.

std.err if TRUE, compute standard errors.

bins number of bins to use for continuous variables in testing bivariate normality; the default is 4.

pd if TRUE and if the correlation matrix is not positive-definite, an attempt will be made to adjust it to a positive-definite matrix, using the nearPD function in the Matrix package. Note that default arguments to nearPD are used (except corr=TRUE); for more control call nearPD directly.

use if "complete.obs", remove observations with any missing data; if "pairwise.complete.obs", compute each correlation using all observations with valid data for that pair of variables.

x an object of class "hetcor" to be printed, or from which to extract the correlation matrix.

digits number of significant digits.
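The description of pd suggests calling nearPD directly when more control is needed. A minimal sketch of such a call follows; the matrix R is a made-up example that is symmetric with unit diagonal but not positive-definite, and maxit = 200 is an arbitrary illustrative setting rather than a recommendation.

```r
library(Matrix)

# Hypothetical input: symmetric, unit diagonal, but one negative eigenvalue
R <- matrix(c( 1.0, 0.9, -0.9,
               0.9, 1.0,  0.9,
              -0.9, 0.9,  1.0), 3, 3)
min_eig_before <- min(eigen(R, symmetric = TRUE)$values)

# corr = TRUE asks for a nearby correlation matrix (unit diagonal);
# maxit raises the iteration limit above its default
fit <- nearPD(R, corr = TRUE, maxit = 200)
R_pd <- as.matrix(fit$mat)

min_eig_after <- min(eigen(R_pd, symmetric = TRUE)$values)
round(R_pd, 4)
```

The adjusted matrix is in the mat component of the returned list; comparing the smallest eigenvalue before and after is one way to check the adjustment.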

 

Value

Returns an object of class "hetcor" with the following components:

correlations the correlation matrix.

type the type of each correlation: "Pearson", "Polychoric", or "Polyserial".

std.errors the standard errors of the correlations, if requested.

n the number (or numbers) of observations on which the correlations are based.

tests p-values for tests of bivariate normality for each pair of variables.

NA.method the method by which any missing data were handled: "complete.obs" or "pairwise.complete.obs".

ML TRUE for ML estimates, FALSE for two-step estimates.

Note

Although the function reports standard errors for product-moment correlations, transformations (the most well known is Fisher’s z-transformation) are available that make the approach to asymptotic normality much more rapid.
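As an illustration of the transformation mentioned in this note, the sketch below builds an approximate confidence interval for a Pearson correlation on Fisher's z scale. The helper name fisher_ci and the inputs r = 0.5 and n = 100 are hypothetical, and the approach assumes bivariate-normal numeric variables; it is not part of the package.

```r
# Approximate confidence interval for a Pearson correlation via Fisher's z.
# fisher_ci is a hypothetical helper, not a polycor function.
fisher_ci <- function(r, n, level = 0.95) {
  z <- atanh(r)                        # Fisher's z-transformation
  se <- 1 / sqrt(n - 3)                # approximate standard error of z
  crit <- qnorm(1 - (1 - level) / 2)   # normal critical value
  tanh(c(lower = z - crit * se, upper = z + crit * se))  # back to r scale
}

ci <- fisher_ci(r = 0.5, n = 100)
ci
```

Because the interval is formed on the z scale and then mapped back, it stays inside (-1, 1) and is not symmetric about r.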

Author(s)

John Fox <jfox@mcmaster.ca>

References

Drasgow, F. (1986) Polychoric and polyserial correlations. Pp. 68-74 in S. Kotz and N. Johnson, eds., The Encyclopedia of Statistics, Volume 7. Wiley.

Olsson, U. (1979) Maximum likelihood estimation of the polychoric correlation coefficient. Psychometrika 44, 443-460.

Rodriguez, R.N. (1982) Correlation. Pp. 193-204 in S. Kotz and N. Johnson, eds., The Encyclopedia of Statistics, Volume 2. Wiley.

Ghosh, B.K. (1966) Asymptotic expansion for the moments of the distribution of correlation coefficient. Biometrika 53, 258-262.

Olkin, I., and Pratt, J.W. (1958) Unbiased estimation of certain correlation coefficients. Annals of Mathematical Statistics 29, 201-211.

See Also

polychor, polyserial, nearPD

Examples

    if(require(mvtnorm)){
        set.seed(12345)
        R <- matrix(0, 4, 4)
        R[upper.tri(R)] <- runif(6)
        diag(R) <- 1
        R <- cov2cor(t(R) %*% R)
        round(R, 4)  # population correlations
        data <- rmvnorm(1000, rep(0, 4), R)
        round(cor(data), 4)   # sample correlations
    }
    if(require(mvtnorm)){
        x1 <- data[,1]
        x2 <- data[,2]
        y1 <- cut(data[,3], c(-Inf, .75, Inf))
        y2 <- cut(data[,4], c(-Inf, -1, .5, 1.5, Inf))
        data <- data.frame(x1, x2, y1, y2)
        hetcor(data)  # Pearson, polychoric, and polyserial correlations, 2-step est.
    }
    if(require(mvtnorm)){
        hetcor(x1, x2, y1, y2, ML=TRUE) # Pearson, polychoric, polyserial correlations, ML est.
    }

 

polychor Polychoric Correlation 

 

 

 

Description

Computes the polychoric correlation (and its standard error) between two ordinal variables or from their contingency table, under the assumption that the ordinal variables dissect continuous latent variables that are bivariate normal. Either the maximum-likelihood estimator or a (possibly much) quicker “two-step” approximation is available. For the ML estimator, the estimates of the thresholds and the covariance matrix of the estimates are also available.

Usage

    polychor(x, y, ML = FALSE, control = list(), std.err = FALSE, maxcor=.9999)

Arguments

x a contingency table of counts or an ordered categorical variable; the latter can be numeric, logical, a factor, or an ordered factor, but if a factor, its levels should be in proper order.

y if x is a variable, a second ordered categorical variable.

ML if TRUE, compute the maximum-likelihood estimate; if FALSE, the default, compute a quicker “two-step” approximation.

control optional arguments to be passed to the optim function.

std.err if TRUE, return the estimated variance of the correlation (for the two-step estimator) or the estimated covariance matrix (for the ML estimator) of the correlation and thresholds; the default is FALSE.

maxcor maximum absolute correlation (to ensure numerical stability).

Details

The ML estimator is computed by maximizing the bivariate-normal likelihood with respect to the thresholds for the two variables ($\tau^x_i$, $i = 1, \ldots, r - 1$; $\tau^y_j$, $j = 1, \ldots, c - 1$) and the population correlation ($\rho$). Here, $r$ and $c$ are respectively the number of levels of x and y. The likelihood is maximized numerically using the optim function, and the covariance matrix of the estimated parameters is based on the numerical Hessian computed by optim.
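For reference, the likelihood being maximized can be written in the standard form below. The symbols $n_{ij}$ (observed count in cell $(i, j)$ of the contingency table), $\pi_{ij}$ (the corresponding cell probability), and $\Phi_2$ (the standard bivariate-normal distribution function) are notation introduced here for illustration, with the conventions $\tau^x_0 = \tau^y_0 = -\infty$ and $\tau^x_r = \tau^y_c = +\infty$.

```latex
\ell(\rho, \tau^x, \tau^y) = \sum_{i=1}^{r} \sum_{j=1}^{c} n_{ij} \log \pi_{ij},
\qquad
\pi_{ij} = \Phi_2(\tau^x_i, \tau^y_j; \rho)
         - \Phi_2(\tau^x_{i-1}, \tau^y_j; \rho)
         - \Phi_2(\tau^x_i, \tau^y_{j-1}; \rho)
         + \Phi_2(\tau^x_{i-1}, \tau^y_{j-1}; \rho)
```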

The two-step estimator is computed by first estimating the thresholds ($\tau^x_i$, $i = 1, \ldots, r - 1$, and $\tau^y_j$, $j = 1, \ldots, c - 1$) separately from the marginal distribution of each variable. Then the one-dimensional likelihood for $\rho$ is maximized numerically, using optim if standard errors are requested, or optimise if they are not. The standard error computed treats the thresholds as fixed.
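The two steps can be sketched in base R as follows. This is an illustration of the idea under stated assumptions, not the package's own code: rect_prob and two_step_polychor are hypothetical helpers, cell probabilities are obtained by one-dimensional numerical integration instead of a bivariate-normal routine, every category is assumed to be non-empty, and the simulated data are made up.

```r
# Probability that a standard bivariate-normal pair with correlation rho
# falls in the rectangle (a1, a2] x (b1, b2] (hypothetical helper).
rect_prob <- function(rho, a1, a2, b1, b2) {
  s <- sqrt(1 - rho^2)
  integrand <- function(x) {
    dnorm(x) * (pnorm((b2 - rho * x) / s) - pnorm((b1 - rho * x) / s))
  }
  integrate(integrand, lower = a1, upper = a2)$value
}

# Two-step estimate from a contingency table (hypothetical helper).
two_step_polychor <- function(tab, maxcor = 0.9999) {
  n <- sum(tab)
  # Step 1: thresholds from the marginal cumulative proportions
  row_cuts <- c(-Inf, qnorm(cumsum(rowSums(tab)) / n)[-nrow(tab)], Inf)
  col_cuts <- c(-Inf, qnorm(cumsum(colSums(tab)) / n)[-ncol(tab)], Inf)
  # Step 2: negative log-likelihood in rho alone, thresholds held fixed
  negloglik <- function(rho) {
    ll <- 0
    for (i in seq_len(nrow(tab))) {
      for (j in seq_len(ncol(tab))) {
        p <- rect_prob(rho, row_cuts[i], row_cuts[i + 1],
                       col_cuts[j], col_cuts[j + 1])
        ll <- ll + tab[i, j] * log(max(p, 1e-12))
      }
    }
    -ll
  }
  optimise(negloglik, interval = c(-maxcor, maxcor))$minimum
}

# Simulated latent variables with correlation 0.5, then discretized
set.seed(12345)
n <- 2000
z1 <- rnorm(n)
z2 <- 0.5 * z1 + sqrt(1 - 0.5^2) * rnorm(n)
x <- cut(z1, c(-Inf, 0.75, Inf))
y <- cut(z2, c(-Inf, -1, 0.5, 1.5, Inf))
rho_hat <- two_step_polychor(table(x, y))
rho_hat
```

Holding the thresholds fixed reduces the problem to a search over a single parameter, which is why the two-step estimate is quicker than full ML.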

Value

If std.err is TRUE, returns an object of class "polycor" with the following components:

type set to "polychoric".

rho the polychoric correlation.

row.cuts estimated thresholds for the row variable (x), for the ML estimate.

col.cuts estimated thresholds for the column variable (y), for the ML estimate.

var the estimated variance of the correlation, or, for the ML estimate, the estimated covariance matrix of the correlation and thresholds.

n the number of observations on which the correlation is based.

chisq chi-square test for bivariate normality.

df degrees of freedom for the test of bivariate normality.

ML TRUE for the ML estimate, FALSE for the two-step estimate.

Otherwise, returns the polychoric correlation.

Author(s)

John Fox <jfox@mcmaster.ca>

References

Drasgow, F. (1986) Polychoric and polyserial correlations. Pp. 68–74 in S. Kotz and N. Johnson, eds., The Encyclopedia of Statistics, Volume 7. Wiley.

Olsson, U. (1979) Maximum likelihood estimation of the polychoric correlation coefficient. Psychometrika 44, 443-460.

See Also

hetcor, polyserial, print.polycor, optim

Examples

    if(require(mvtnorm)){
        set.seed(12345)
        data <- rmvnorm(1000, c(0, 0), matrix(c(1, .5, .5, 1), 2, 2))
        x <- data[,1]
        y <- data[,2]
        cor(x, y)  # sample correlation
    }
    if(require(mvtnorm)){
        x <- cut(x, c(-Inf, .75, Inf))
        y <- cut(y, c(-Inf, -1, .5, 1.5, Inf))
        polychor(x, y)  # 2-step estimate
    }
    if(require(mvtnorm)){
        set.seed(12345)
        polychor(x, y, ML=TRUE, std.err=TRUE)  # ML estimate
    }

 

polyserial Polyserial Correlation 

 

 

 

Description

Computes the polyserial correlation (and its standard error) between a quantitative variable and an ordinal variable, based on the assumption that the joint distribution of the quantitative variable and a latent continuous variable underlying the ordinal variable is bivariate normal. Either the maximum-likelihood estimator or a quicker “two-step” approximation is available. For the ML estimator, the estimates of the thresholds and the covariance matrix of the estimates are also available.

Usage

    polyserial(x, y, ML = FALSE, control = list(), std.err = FALSE,
      maxcor=.9999, bins=4)

Arguments

x a numerical variable.

y an ordered categorical variable; can be numeric, logical, a factor, or an ordered factor, but if a factor, its levels should be in proper order.

ML if TRUE, compute the maximum-likelihood estimate; if FALSE, the default, compute a quicker “two-step” approximation.

control optional arguments to be passed to the optim function.

std.err if TRUE, return the estimated variance of the correlation (for the two-step estimator) or the estimated covariance matrix of the correlation and thresholds (for the ML estimator); the default is FALSE.

maxcor maximum absolute correlation (to ensure numerical stability).

bins the number of bins into which to dissect x for a test of bivariate normality; the default is 4.

Details

The ML estimator is computed by maximizing the bivariate-normal likelihood with respect to the thresholds for y ($\tau^y_j$, $j = 1, \ldots, c - 1$, where $c$ is the number of levels of y) and the population correlation ($\rho$). The likelihood is maximized numerically using the optim function, and the covariance matrix of the estimated parameters is based on the numerical Hessian computed by optim.
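One standard way to write this likelihood is shown below, as a sketch under the assumption that x has been standardized. The symbols $z_k$ (standardized value of x for observation $k$), $j_k$ (observed category of y for observation $k$), $\phi$ and $\Phi$ (standard-normal density and distribution function) are notation introduced here, with $\tau^y_0 = -\infty$ and $\tau^y_c = +\infty$.

```latex
\ell(\rho, \tau^y) = \sum_{k=1}^{n} \left[ \log \phi(z_k)
  + \log \left\{
      \Phi\!\left( \frac{\tau^y_{j_k} - \rho z_k}{\sqrt{1 - \rho^2}} \right)
    - \Phi\!\left( \frac{\tau^y_{j_k - 1} - \rho z_k}{\sqrt{1 - \rho^2}} \right)
    \right\} \right]
```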

The two-step estimator is computed by first estimating the thresholds ($\tau^y_j$, $j = 1, \ldots, c - 1$) from the marginal distribution of y. Then, if the standard error of $\hat{\rho}$ is requested, the one-dimensional likelihood for $\rho$ is maximized numerically using optim; the standard error computed treats the thresholds as fixed. If the standard error is not requested, $\hat{\rho}$ is computed directly.
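A direct computation of this kind can be sketched in base R. The closed form used here is an assumption for illustration, not a statement of the package's internals: it rescales the Pearson correlation between x and the integer category codes of y by the standard deviation of the codes and the normal densities at the estimated thresholds. The helper name two_step_polyserial and the simulated data are hypothetical.

```r
# Closed-form two-step polyserial estimate (hypothetical helper).
two_step_polyserial <- function(x, y) {
  tab <- table(y)
  n <- sum(tab)
  # Step 1: thresholds from the marginal cumulative proportions of y
  cuts <- qnorm(cumsum(tab) / n)[-length(tab)]
  # Step 2: rescale the correlation between x and the category codes
  y_num <- as.numeric(as.factor(y))
  sqrt((n - 1) / n) * sd(y_num) * cor(x, y_num) / sum(dnorm(cuts))
}

# Simulated data: latent correlation 0.5, y discretized into four levels
set.seed(12345)
n <- 5000
x <- rnorm(n)
latent <- 0.5 * x + sqrt(1 - 0.5^2) * rnorm(n)
y <- cut(latent, c(-Inf, -1, 0.5, 1.5, Inf))
rho_hat <- two_step_polyserial(x, y)
rho_hat
```

The factor sqrt((n - 1) / n) converts the sample standard deviation returned by sd to the divisor-n version that the formula calls for.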

Value

If std.err is TRUE, returns an object of class "polycor" with the following components:

type set to "polyserial".

rho the polyserial correlation.

cuts estimated thresholds for the ordinal variable (y), for the ML estimator.

var the estimated variance of the correlation, or, for the ML estimator, the estimated covariance matrix of the correlation and thresholds.

n the number of observations on which the correlation is based.

chisq chi-square test for bivariate normality.

df degrees of freedom for the test of bivariate normality.

ML TRUE for the ML estimate, FALSE for the two-step estimate.

Otherwise, returns the polyserial correlation.

Author(s)

John Fox <jfox@mcmaster.ca>

References

Drasgow, F. (1986) Polychoric and polyserial correlations. Pp. 68–74 in S. Kotz and N. Johnson, eds., The Encyclopedia of Statistics, Volume 7. Wiley.

See Also

hetcor, polychor, print.polycor, optim

Examples

    if(require(mvtnorm)){
        set.seed(12345)
        data <- rmvnorm(1000, c(0, 0), matrix(c(1, .5, .5, 1), 2, 2))
        x <- data[,1]
        y <- data[,2]
        cor(x, y)  # sample correlation
    }
    if(require(mvtnorm)){
        y <- cut(y, c(-Inf, -1, .5, 1.5, Inf))
        polyserial(x, y)  # 2-step estimate
    }
    if(require(mvtnorm)){
        polyserial(x, y, ML=TRUE, std.err=TRUE) # ML estimate
    }

 

print.polycor Print Method for polycor Objects 

 

 

 

Description

print method for objects of class polycor, produced by polychor and polyserial.

Usage

    ## S3 method for class 'polycor'
    print(x, digits = max(3, getOption("digits") - 3), ...)

Arguments

x an object of class polycor, as returned by polychor or polyserial.

digits number of significant digits to be printed.

... not used.

Value

Invisibly returns x; used for its side effect, i.e., printing.

Author(s)

John Fox <jfox@mcmaster.ca>

See Also

polychor, polyserial

Examples

    if(require(mvtnorm)){
        set.seed(12345)
        data <- rmvnorm(1000, c(0, 0), matrix(c(1, .5, .5, 1), 2, 2))
        x <- data[,1]
        y <- data[,2]
        cor(x, y) # sample correlation
    }
    if(require(mvtnorm)){
        x <- cut(x, c(-Inf, .75, Inf))
        y <- cut(y, c(-Inf, -1, .5, 1.5, Inf))
        polychor(x, y, ML=TRUE, std.err=TRUE)  # polychoric correlation, ML estimate
    }

