Using dcm2

Model fit is a key requirement for making inferences from psychometric models (J. Chen et al., 2013). To support the inferences made from a model, the model should adequately fit the data (e.g., Ames & Penfield, 2015). For diagnostic classification models (DCMs), Rupp et al. (2010) defined three methods for evaluating model fit: resampling techniques, posterior predictive model checking, and limited-information fit statistics. Resampling techniques are computationally intensive, so the time they require may not be feasible. Posterior predictive model checking requires Bayesian estimation, which is not how many models are estimated. Resampling techniques and posterior predictive model checking are therefore impractical in many settings. Limited-information fit statistics do not have these drawbacks, and Rupp et al. described them as the most promising option for DCMs.

Maydeu-Olivares and Joe (2005) defined the M_r family of limited-information fit statistics, where the fit statistic uses the rth-order marginal proportions. Maydeu-Olivares and Joe (2005, 2006) recommended using the M2 limited-information fit statistic. Subsequent work has applied the M2 statistic to DCMs (e.g., F. Chen et al., 2018; Liu et al., 2016).
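To make "marginal proportions" concrete, the following base R sketch computes the first- and second-order margins of a small binary response matrix, which are the quantities a second-order statistic such as M2 summarizes. The response matrix and the object names (resp, p1, p2) are made up for illustration. This is not how dcm2 computes M2 internally, and it only shows the observed side of the comparison.

```r
# Toy 0/1 response matrix: 6 respondents (rows) by 3 items (columns).
# These values are hypothetical and chosen only for illustration.
resp <- matrix(c(1, 1, 0,
                 1, 0, 0,
                 0, 1, 1,
                 1, 1, 1,
                 0, 0, 0,
                 1, 1, 0),
               ncol = 3, byrow = TRUE)

n <- nrow(resp)

# First-order margins: proportion of respondents answering each item correctly.
p1 <- colMeans(resp)

# Second-order margins: proportion answering both items of a pair correctly.
# crossprod(resp) counts joint successes for every pair of items.
p2_full <- crossprod(resp) / n

# Keep one value per item pair, in the order (1,2), (1,3), (2,3).
p2 <- p2_full[lower.tri(p2_full)]

p1
p2
```

A limited-information statistic compares these observed margins with the margins implied by the fitted model, rather than comparing the full table of response patterns.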

We created the dcm2 package, which contains functions for evaluating the model fit of DCMs using the M2 statistic. The package provides native support for models estimated with GDINA, and package authors can create methods for other classes of models.

You can install the release version of dcm2 from CRAN:

install.packages("dcm2")

To install the development version from GitHub use:

# install.packages("remotes")
remotes::install_github("atlas-aai/dcm2")

Usage

Once dcm2 has been installed, we can estimate a DCM and use the M2 statistic to evaluate model fit.

library(dcm2)
library(tidyverse)
library(GDINA)

We included simulated data in the dcm2 package to demonstrate how the package can be used. We can load this data using:

full_data <- dcm2::sample_data
q_matrix <- full_data$q_matrix
data <- full_data$data

Next, we estimate a DCM for this data. We will use the GDINA package to estimate a log-linear cognitive diagnosis model (LCDM; Henson et al., 2009). First, however, we need to format the data.

fit_dat <- data %>%
  tidyr::pivot_wider(names_from = "item_id",
                     values_from = "score") %>%
  dplyr::select(-"resp_id") %>%
  as.matrix() %>%
  unname()
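To see what this reshaping step produces, here is a small base R sketch of the same long-to-wide idea on made-up data. It assumes the long data has the columns resp_id, item_id, and score used above, and that both id columns are integers starting at 1. The toy values and the object names long and wide are hypothetical.

```r
# Hypothetical long-format responses: one row per respondent-item pair.
long <- data.frame(
  resp_id = rep(1:3, each = 2),
  item_id = rep(1:2, times = 3),
  score   = c(1, 0, 0, 0, 1, 1)
)

# Build an empty respondent-by-item matrix, then fill each cell using its
# (resp_id, item_id) position. This relies on both ids running from 1 upward.
wide <- matrix(NA_real_,
               nrow = max(long$resp_id),
               ncol = max(long$item_id))
wide[cbind(long$resp_id, long$item_id)] <- long$score

# One row per respondent, one column per item, no dimension names.
wide
```

The result has the same shape as fit_dat: an unnamed numeric matrix with respondents in rows and items in columns.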

Now that the data is formatted, we can fit the model using:

gdina_mod <- GDINA::GDINA(dat = fit_dat,
                          Q = data.frame(q_matrix),
                          model = "logitGDINA",
                          control = list(conv.type = "neg2LL"))

Finally, we use the fit_m2() function to calculate the M2 fit statistic for the model estimated with the GDINA package. This function returns a tibble containing all of the output for the M2 fit statistic.

fit_m2(gdina_mod, ci = 0.9)

In this output, m2 reports the chi-squared value of the M2 fit statistic, df reports the degrees of freedom for the chi-squared test, and pval reports the p value of that test. rmsea reports the root mean square error of approximation (RMSEA) for the M2 statistic, and ci_lower and ci_upper report the lower and upper bounds of the 90% confidence interval for the RMSEA. The 90% level is the default setting for the fit_m2() function, based on Kline (2015). Finally, srmsr reports the standardized root mean squared residual (SRMSR) for the M2 statistic.
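As a rough illustration of how an RMSEA point estimate relates to the chi-squared value and its degrees of freedom, the sketch below uses a commonly cited form, sqrt(max((M2 - df) / (N * df), 0)), where N is the number of respondents. This form is an assumption on our part; the exact computation in fit_m2() may differ in detail. The helper rmsea_from_m2() and the input values are hypothetical.

```r
# Point estimate of the RMSEA from a chi-squared-type statistic.
# Assumed form: sqrt(max((M2 - df) / (N * df), 0)).
# This is a sketch and is not taken from the dcm2 source code.
rmsea_from_m2 <- function(m2, df, n) {
  sqrt(max((m2 - df) / (n * df), 0))
}

# Hypothetical inputs where the statistic exceeds its degrees of freedom.
rmsea_from_m2(m2 = 30, df = 13, n = 1000)

# When the statistic is below its degrees of freedom, the estimate is
# truncated at 0.
rmsea_from_m2(m2 = 9, df = 13, n = 1000)
```

Under this form, any statistic smaller than its degrees of freedom gives an RMSEA of exactly 0, regardless of the sample size.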

As the final step in this demonstration, we interpret the model fit results. The results above indicated M2 = 9.35, df = 13, p = .75, which suggests that the model fits the data, since the p value is greater than .05. The RMSEA was 0, with a 90% confidence interval ranging from 0 to .023, and the SRMSR was .018. The 90% confidence interval for the RMSEA suggests the model fits the data well, since the entire confidence interval is less than .06 (Hu & Bentler, 1999). Similarly, the SRMSR suggests the model fits the data well, since it is less than .08 (Hu & Bentler, 1999). Taken together, this evidence suggests the model fits the data well.
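The interpretation above can be retraced in a few lines of base R. The sketch below recomputes the p value from the reported M2 and degrees of freedom with pchisq(), then applies the cutoffs cited above (.05 for the p value, .06 for the RMSEA, .08 for the SRMSR). The numbers are copied from the reported results rather than read from a fit_m2() object, and the object names are hypothetical.

```r
# Values reported in the text above.
m2          <- 9.35
df          <- 13
rmsea_upper <- 0.023  # upper bound of the 90% CI for the RMSEA
srmsr       <- 0.018

# Upper-tail probability of a chi-squared distribution with df degrees
# of freedom; this should agree with the reported p value up to rounding.
p_value <- pchisq(m2, df = df, lower.tail = FALSE)
round(p_value, 2)

# Apply each cutoff cited in the text.
checks <- c(
  chi_square_p = p_value > .05,      # test does not reject the model
  rmsea_ci     = rmsea_upper < .06,  # whole RMSEA interval below .06
  srmsr        = srmsr < .08         # SRMSR below .08
)

checks
all(checks)
```

Keeping the three checks in a named vector makes it easy to see which criterion fails when a model does not fit.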

References

Ames, A. J., & Penfield, R. D. (2015). An NCME instructional module on item-fit statistics for item response theory models. Educational Measurement: Issues and Practice, 34(3), 39–48.
Chen, F., Liu, Y., Xin, T., & Cui, Y. (2018). Applying the M2 statistic to evaluate the fit of diagnostic classification models in the presence of attribute hierarchies. Frontiers in Psychology, 9, 1875.
Chen, J., de la Torre, J., & Zhang, Z. (2013). Relative and absolute fit evaluation in cognitive diagnosis modeling. Journal of Educational Measurement, 50, 123–140. https://doi.org/10.1111/j.1745-3984.2012.00185.x
Henson, R., Templin, J., & Willse, J. T. (2009). Defining a family of cognitive diagnosis models using log-linear models with latent variables. Psychometrika, 74(2), 191–210. https://doi.org/10.1007/s11336-008-9089-5
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6, 1–55.
Kline, R. B. (2015). Principles and practice of structural equation modeling (4th ed.). Guilford Press.
Liu, Y., Tian, W., & Xin, T. (2016). An application of M2 statistic to evaluate the fit of cognitive diagnostic models. Journal of Educational and Behavioral Statistics, 41(1), 3–26.
Maydeu-Olivares, A., & Joe, H. (2005). Limited- and full-information estimation and goodness-of-fit testing in 2^n contingency tables: A unified framework. Journal of the American Statistical Association, 100(471), 1009–1020. https://doi.org/10.1198/016214504000002069
Maydeu-Olivares, A., & Joe, H. (2006). Limited information goodness-of-fit testing in multidimensional contingency tables. Psychometrika, 71(4), 713–732. https://doi.org/10.1007/s11336-005-1295-9
Rupp, A. A., Templin, J., & Henson, R. A. (2010). Diagnostic measurement: Theory, methods, and applications. Guilford Press.