SPSS

This section closely follows Peugh and Enders (2005). It demonstrates how to group-mean center level-1 covariates and estimate multilevel models using SPSS syntax. Note that it is also possible to use the Mixed Models option under the Analyze pull-down menu (see Norusis 2005, pgs. 197-246). However, length considerations limit the examples here to syntax. The SPSS syntax editor can be accessed by going to File → New → Syntax.

In the HSB data file, the student-level SES variable is in its original metric (a standardized scale with a mean of zero). Researchers dealing with hierarchically structured data often wish to center a level-1 variable around the mean of all cases within the same level-2 group in order to facilitate interpretation of the intercept. To group-mean center a variable in SPSS, first use the AGGREGATE command to compute mean SES scores by school. In this example, the syntax would be:

AGGREGATE OUTFILE='sesmeans.sav'
/BREAK=id
/meanses=MEAN(ses) .

The OUTFILE subcommand specifies that the means are written out to the file sesmeans.sav in the working directory. The BREAK subcommand specifies the groups within which to compute means. The final line creates a variable named meanses that contains the school means.

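Next, the working data are sorted by school ID and merged with the group means using the SORT CASES and MATCH FILES commands. The centered variable is then created using the COMPUTE command. (Grand-mean centering is simpler: subtract the overall sample mean, obtained for example from DESCRIPTIVES, in a single COMPUTE statement of the form COMPUTE newvar = oldvar - M. Note that the MEAN function inside COMPUTE averages its arguments within a case, not across cases, so it cannot supply the sample mean.) The syntax for the group-mean centering steps would be: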

SORT CASES BY id .
MATCH FILES
/TABLE='sesmeans.sav'
/FILE=*
/BY id .
COMPUTE centses = ses - meanses .
EXECUTE .

The subcommands for MATCH FILES ask SPSS to take the data file saved using the AGGREGATE command and merge it with the working data (denoted by *). The matching variable is the school ID.
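As an aside, releases of SPSS whose AGGREGATE command supports MODE=ADDVARIABLES can attach aggregated values directly to the working data, which avoids the separate file and the merge. The following is a minimal sketch under that assumption, and it further assumes that BREAK may be omitted to aggregate over all cases. The names meanses2, grandses, centses2, and gcentses are hypothetical and chosen only to avoid overwriting the variables created above.

```spss
* Add each school's mean SES to the working data without a separate file.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=id
  /meanses2=MEAN(ses) .
* With no BREAK, all cases form one group, giving the grand mean.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /grandses=MEAN(ses) .
COMPUTE centses2 = ses - meanses2 .
COMPUTE gcentses = ses - grandses .
EXECUTE .
* Quick check: centses2 should average to about zero within every school.
MEANS TABLES=centses2 BY id
  /CELLS=MEAN COUNT .
```

The final MEANS command is only a sanity check: any school whose mean of the group-mean centered variable is not close to zero points to a problem in the aggregation or merge.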

With the data prepared, the next step is to estimate the models of interest. The following syntax corresponds to the empty model (5):

MIXED mathach
/PRINT = SOLUTION TESTCOV
/FIXED = INTERCEPT
/RANDOM = INTERCEPT | SUBJECT(id) .

The command for estimating multilevel models is MIXED, followed immediately by the dependent variable. PRINT = SOLUTION requests that SPSS report the fixed effects estimates and standard errors. FIXED and RANDOM specify which variables to treat as fixed and random effects, respectively. The SUBJECT option following the vertical line | identifies the grouping variable, in this case school ID.

The fixed and random effect estimates for this and subsequent models are displayed in Table 1 at the bottom of the page. The intercept in the empty model is equal to the overall average math achievement score, which for this sample is 12.637. The variance component corresponding to the random intercept is 8.614; for the level-1 error it is 39.1483. Including the TESTCOV keyword on the PRINT subcommand requests that SPSS report Wald-Z significance tests for the variance components, each equal to the estimate divided by its standard error. In this example, the value of the Wald-Z statistic is 6.254, which is significant (p<.001). Note, however, that these tests should not be taken as conclusive. Singer (1998, pg. 351) writes,

"the validity of these tests has been called into question both because they rely on large sample approximations (not useful with the small sample sizes often analyzed using multilevel models) and because variance components are known to have skewed (and bounded) sampling distributions that render normal approximations such as these questionable."

A more thorough test would thus estimate a second model constraining the variance component to equal zero and compare the two models using a likelihood ratio test.
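One way to carry out such a comparison, sketched here as an assumption rather than taken from the source, is to refit the empty model without the RANDOM subcommand, so that the intercept variance is implicitly fixed at zero. Because both models share the same fixed part, their -2 restricted log likelihood values (reported in the Information Criteria table of the output) can be compared.

```spss
* Comparison model: same fixed part as the empty model, no random intercept.
* REML is requested explicitly so both fits use the same estimation method.
MIXED mathach
  /METHOD = REML
  /PRINT = SOLUTION
  /FIXED = INTERCEPT .
```

The difference between the two -2 restricted log likelihood values is referred to a chi-square distribution with one degree of freedom. Since a variance cannot be negative, the null value lies on the boundary of the parameter space, and the resulting p-value tends to be conservative.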

The two variance components can be used to partition the variance across levels according to equation 6 above. The intraclass correlation coefficient for this example is equal to 8.614 / (8.614 + 39.1483) = .1804, meaning that roughly 18% of the variance is attributable to school traits. Because the intraclass correlation coefficient shows a fair amount of variation across schools, model 2 adds two school-level variables. These variables are sector, defining whether a school is private or public, and meanses, which is the average student socioeconomic status in the school. The SPSS syntax to estimate this model is:

MIXED mathach WITH meanses sector
/PRINT = SOLUTION TESTCOV
/FIXED = INTERCEPT meanses sector
/RANDOM = INTERCEPT | SUBJECT(id) .

The results, displayed in the second column of Table 1, show that meanses and sector significantly affect a school's average math achievement score. The intercept, representing the expected math achievement score for a student in a public school with average SES, is equal to 12.1283. A one unit increase in average SES raises the expected school mean by 5.5334. Private schools have expected math achievement scores 1.2254 units higher than public schools. The variance component corresponding to the random intercept has decreased to 2.3140, demonstrating that the inclusion of the two school-level variables has explained much of the level-2 variation. However, the estimate is still more than twice the size of its standard error, suggesting that there remains a significant amount of unexplained school-level variance (though the same caution about over-interpreting this test still applies).

A final model introduces a student-level covariate, the group-mean centered SES variable centses. Because it is possible that the effect of socioeconomic status may vary across schools, SES is treated as a random effect. In addition, sector and meanses are included to model the slope on the student-level SES variable. Modeling the slope of a random effect is the same as specifying a cross-level interaction, which can be specified in the FIXED subcommand as in the following syntax:

MIXED mathach WITH meanses sector centses
/PRINT = SOLUTION TESTCOV
/FIXED = INTERCEPT meanses sector centses meanses*centses sector*centses
/RANDOM = INTERCEPT centses | SUBJECT(id) COVTYPE(UN) .

One important change over the previous models is the addition of the COVTYPE(UN) option, which specifies a structure for the level-2 covariance matrix. Only a single school-level variance component was estimated in the previous two models, so it was unnecessary to deal with covariances. When there is more than one level-2 variance component, SPSS will assume a particular covariance structure. In many cross-sectional applications of multilevel models, the researcher does not wish to put any constraints on this covariance matrix. Thus the UN in the COVTYPE option specifies an unstructured matrix. In other contexts, the researcher may wish to specify a first-order autoregressive (AR1), compound symmetry (CS), identity (ID), or other structure. These alternatives are more restrictive but may sometimes be appropriate.

The results from this final model appear in the last column of Table 1. The fixed effects are all significant. Given the inclusion of the group-mean centered SES variable, the intercept is interpreted as the expected math achievement of a student at his or her school's average SES, in a public school with average SES levels. In this model, the expected outcome is 12.1279. Because there are interactions in the model, the marginal fixed effects of each variable will depend on the value of the other variable(s) involved in the interaction. The marginal effect of a one-unit change in a student's SES score on math achievement depends on whether a school is public or private as well as on the school's average SES score. For a public school (where sector=0), the marginal effect of a one-unit change in the group-mean centered student SES variable is equal to γ10 + γ11(MEANSES) = 2.945041 + 1.039232(MEANSES). For a private school (where sector=1), the marginal effect of a one-unit change in student SES is equal to γ10 + γ11(MEANSES) + γ12 = 2.945041 + 1.039232(MEANSES) - 1.642674 = 1.302367 + 1.039232(MEANSES). When cross-level interactions are present, graphical means may be appropriate for exploring the contingent nature of marginal effects in greater detail (Raudenbush & Bryk 2002; Brambor, Clark, and Golder 2006). Here the simplest interpretation is that the effect of student-level SES is significantly higher in schools with higher average SES and significantly lower in private schools.
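To explore these conditional effects graphically, one option is to compute the model-implied SES slope for every school from the fixed effect estimates quoted above and plot it against meanses. This is only a sketch: the variable name sesslope is hypothetical, and sector is assumed to be coded 0 for public and 1 for private schools, as in the text.

```spss
* Model-implied slope of centses, using the estimates reported above.
COMPUTE sesslope = 2.945041 + 1.039232*meanses - 1.642674*sector .
EXECUTE .
* One point per student, so points from the same school coincide.
GRAPH
  /SCATTERPLOT(BIVAR)=meanses WITH sesslope BY sector .
```

Because the coefficient on meanses is the same in both sectors, the points fall on two parallel lines, with the private-school line lying 1.642674 units below the public-school line.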

The variance component for the random intercept continues to be significant, suggesting that there remains some variation in average school performance not accounted for by the variables in the model. The variance component for the random slope, however, is not significant. Thus the researcher may be justified in estimating an alternative model that constrains this variance component to equal zero.
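A sketch of such an alternative model, assuming the same variables as above, simply drops centses from the RANDOM subcommand so that only the intercept varies across schools. With a single random effect, the COVTYPE option is no longer needed.

```spss
* Alternative model: SES slope fixed across schools, random intercept only.
MIXED mathach WITH meanses sector centses
  /PRINT = SOLUTION TESTCOV
  /FIXED = INTERCEPT meanses sector centses meanses*centses sector*centses
  /RANDOM = INTERCEPT | SUBJECT(id) .
```

Because the fixed part is unchanged, the -2 restricted log likelihood of this model can be compared with that of the random-slope model. The comparison involves two parameters, the slope variance and the intercept-slope covariance, and is subject to the same caution that applies to other tests of variance components.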


