Navigation

This example will demonstrate latent variable modeling via the common factor model using RAM matrices for model specification. We’ll walk through two applications of this approach: one with a single latent variable, and one with two latent variables. As with previous examples, these two applications are split into four files, with each application represented separately with raw and covariance data. These examples can be found in the following files:

The common factor model is a method for modeling the relationships between observed variables believed to measure or indicate the same latent variable. While there are a number of exploratory approaches to extracting latent factor(s), this example uses structural modeling to fit confirmatory factor models. The model for any person and path diagram of the common factor model for a set of variables - are given below.

While 19 parameters are displayed in the equation and path diagram above (six manifest variances, six manifest means, six factor loadings and one factor variance), we must constrain either the factor variance or one factor loading to a constant to identify the model and scale the latent variable. As such, this model contains 18 parameters. Unlike the manifest variable examples we’ve run up until now, this model is not fully saturated. The means and covariance matrix for six observed variables contain 27 degrees of freedom, and thus our model contains 9 degrees of freedom.

Our first step to running this model is to include the data to be analyzed. The data for this example contain nine variables. We’ll select the six we want for this model using the selection operators used in previous examples. Both raw and covariance data are included below, but only one is required for any model.

The following code contains all of the components of our model. Before running a model, the OpenMx library must be loaded into R using either the require() or library() function. All objects required for estimation (data, matrices, an expectation function, and a fit function) are specified. This code uses the mxModel function to create an MxModel object, which we will then run.

This mxModel function can be split into several parts. First, we give the model a name. The first argument in an mxModel function has a special function. If an object or variable containing an MxModel object is placed here, then mxModel adds to or removes pieces from that model. If a character string (as indicated by double quotes) is placed first, then that becomes the name of the model. Models may also be named by including a name argument. This model is named "CommonFactorModelMatrixSpecification".

The second component of our code creates an MxData object. The example above, reproduced here, first references the object where our data is, then uses the type argument to specify that this is raw data.

dataRaw <- mxData( observed=myFADataRaw, type="raw")

If we were to use a covariance matrix and vector of means as data, we would replace the existing mxData function with this one:

Model specification is carried out using mxMatrix functions to create matrices for a RAM specified model. The A matrix specifies all of the asymmetric paths or regressions in our model. In the common factor model, these parameters are the factor loadings. This matrix is square, and contains as many rows and columns as variables in the model (manifest and latent, typically in that order). Regressions are specified in the A matrix by placing a free parameter in the row of the dependent variable and the column of independent variable.

The common factor model requires that one parameter (typically either a factor loading or factor variance) be constrained to a constant value. In our model, we will constrain the first factor loading to a value of 1, and let all other loadings be freely estimated. All factor loadings have a starting value of one and labels of "l1" - "l6".

The second matrix in a RAM model is the S matrix, which specifies the symmetric or covariance paths in our model. This matrix is symmetric and square, and contains as many rows and columns as variables in the model (manifest and latent, typically in that order). The symmetric paths in our model consist of six residual variances and one factor variance. All of these variances are given starting values of one and labels "e1" - "e6" and "varF1".

The third matrix in our RAM model is the F or filter matrix. Our data contains six observed variables, but the A and S matrices contain seven rows and columns. For our model to define the covariances present in our data, we must have some way of projecting the relationships defined in the A and S matrices onto our data. The F matrix filters the latent variables out of the expected covariance matrix, and can also be used to reorder variables.

The F matrix will always contain the same number of rows as manifest variables and columns as total (manifest and latent) variables. If the manifest variables in the A and S matrices precede the latent variables and are in the same order as the data, then the F matrix will be the horizontal adhesion of an identity matrix and a zero matrix. This matrix contains no free parameters, and is made with the mxMatrix function below.

The last matrix of our model is the M matrix, which defines the means and intercepts for our model. This matrix describes all of the regressions on the constant in a path model, or the means conditional on the means of exogenous variables. This matrix contains a single row, and one column for every manifest and latent variable in the model. In our model, the latent variable has a constrained mean of zero, while the manifest variables have freely estimated means, labeled "meanx1" through "meanx6".

The final parts of this model are the expectation function and the fit function. The expectation defines how the specified matrices combine to create the expected covariance matrix of the data. The fit defines how the expectation is compared to the data to create a single scalar number that is minimized. In a RAM specified model, the expected covariance matrix is defined as:

The expected means are defined as:

The free parameters in the model can then be estimated using maximum likelihood for covariance and means data, and full information maximum likelihood for raw data. Although users may define their own expected covariance matrices using mxExpectationNormal and other functions in OpenMx, the mxExpectationRAM function computes the expected covariance and means matrices when the A, S, F and M matrices are specified. The M matrix is required both for raw data and for covariance or correlation data that includes a means vector. The mxExpectationRAM function takes four arguments, which are the names of the A, S, F and M matrices in your model. The mxFitFunctionML yields maximum likelihood estimates of structural equation models. It uses full information maximum likelihood when the data are raw.

The model now includes an observed covariance matrix (i.e., data), model matrices, an expectation function, and a fit function. So the model has all the required elements to define the expected covariance matrix and estimate parameters.

The model can now be run using the mxRun function, and the output of the model can be accessed from the $output slot of the resulting model. A summary of the output can be reached using summary().

Rather than specifying the model using RAM notation, we can also write the model explicitly with self-declared matrices, matching the formula for the expected mean and covariance structure of the one factor model:

We start with displaying the complete script. Note that we have used the succinct form of coding and that the mxData command did not change.

The first mxMatrix statement declares a Full6x1 matrix of factor loadings to be estimated, called “facLoadings”. We fix the first factor loading to 1 for identification. Even though we specify just one start value of 1 which is recycled for each of the elements in the matrix, it becomes the fixed value for the first factor loading and the start value for the other factor loadings. The second mxMatrix is a symmetric1x1 which estimates the variance of the factor, named “facVariances”. The third mxMatrix is a Diag6x6 matrix for the residual variances, named “resVariances”. The fourth mxMatrix is a Full1x6 matrix of free elements for the means of the observed variables, called “varMeans”. The fifth mxMatrix is a Full1x1 matrix with a fixed value of zero for the factor mean, named “facMeans”.

We then use two algebra statements to work out the expected means and covariance matrices. Note that the formula’s for the expression of the expected covariance and the expected mean vector map directly on to the mathematical equations. The arguments for the mxExpectationNormal function now refer to these algebras for the expected covariance and expected means. The dimnames are used to map them onto the observed variables. The fit function compares the expectation and the observation (i.e. data) to optimize free parameters.

The common factor model can be extended to include multiple latent variables. The model for any person and path diagram of the common factor model for a set of variables - and - are given below.

Our model contains 21 parameters (six manifest variances, six manifest means, six factor loadings, two factor variances and one factor covariance), but each factor requires one identification constraint. Like in the common factor model above, we will constrain one factor loading for each factor to a value of one. As such, this model contains 19 parameters. The means and covariance matrix for six observed variables contain 27 degrees of freedom, and thus our model contains 8 degrees of freedom.

The data for the two factor model can be found in the myFAData files introduced in the common factor model. For this model, we will select three x variables (x1-x3) and three y variables (y1-y3).

Specifying the two factor model is virtually identical to the single factor case. The mxData function has been changed to reference the appropriate data, but is identical in usage. We’ve added a second latent variable, so the A and S matrices are now of order 8x8. Similarly, the F matrix is now of order 6x8 and the M matrix of order 1x8. The mxExpectationRAM has not changed. The code for our two factor model looks like this:

The four mxMatrix functions have changed slightly to accomodate the changes in the model. The A matrix, shown below, is used to specify the regressions of the manifest variables on the factors. The first three manifest variables ("x1"-"x3") are regressed on "F1", and the second three manifest variables ("y1"-"y3") are regressed on "F2". We must again constrain the model to identify and scale the latent variables, which we do by constraining the first loading for each latent variable to a value of one.

The S matrix has an additional row and column, and two additional parameters. For the two factor model, we must add a variance term for the second latent variable and a covariance between the two latent variables.

The F and M matrices contain only minor changes. The F matrix is now of order 6x8, but the additional column is simply a column of zeros. The M matrix contains an additional column (with only a single row), which contains the mean of the second latent variable. As this model does not contain a parameter for that latent variable, this mean is constrained to zero.

The model is now ready to run using the mxRun function, and the output of the model can be accessed from the $output slot of the resulting model. A summary of the output can be reached using summary().