Quantifying the Reliability and Replicability of Psychopathology Network Characteristics

Forbes, Miriam K.; Wright, Aidan G. C.; Markon, Kristian E.; Krueger, Robert F.

doi:10.6084/m9.figshare.8198699

Version 2 2021-06-15, 13:20

Version 1 2019-05-29, 13:30

journal contribution

posted on 2021-06-15, 13:20 authored by Miriam K. Forbes, Aidan G. C. Wright, Kristian E. Markon, Robert F. Krueger

Pairwise Markov random field (PMRF) networks, including Gaussian graphical models (GGMs) and Ising models, have become the “state-of-the-art” method for psychopathology network analyses. Recent research has focused on the reliability and replicability of these networks. In the present study, we compared the existing suite of methods for maximizing and quantifying the stability and consistency of PMRF networks (i.e., lasso regularization, plus the bootnet and NetworkComparisonTest packages in R) with a set of metrics for directly comparing the detailed network characteristics interpreted in the literature (e.g., the presence, absence, sign, and strength of each individual edge). We compared GGMs of depression and anxiety symptoms in two waves of data from an observational study (n = 403) and reanalyzed four posttraumatic stress disorder GGMs from a recent study of network replicability. Taken at face value, the existing suite of methods indicated that overall the network edges were stable, interpretable, and consistent between networks, but the direct metrics of replication indicated that this was not the case (e.g., 39–49% of the edges in each network were unreplicated across the pairwise comparisons). We discuss reasons for these apparently contradictory results (e.g., relying on global summary statistics versus examining the detailed characteristics interpreted in the literature) and conclude that the limited reliability of the detailed characteristics of networks observed here is likely to be common in practice, but overlooked by current methods. Poor replicability underpins our concern surrounding the use of these methods, given that generalizable conclusions are fundamental to the utility of their results.
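
To make the idea of a direct, edge-level comparison concrete, the following is a minimal sketch in Python. It is not the authors' code and does not reproduce their exact metrics, which the abstract does not specify. The function names (`compare_edges`, `proportion_unreplicated`), the `tol` threshold, the toy matrices, and the definition of "unreplicated" (an edge present in one network but absent in the other) are all assumptions made for illustration.

```python
def compare_edges(net_a, net_b, tol=0.0):
    """Compare two symmetric edge-weight matrices defined over the same nodes.

    An edge counts as present when its absolute weight exceeds `tol`
    (an illustrative threshold, not taken from the paper).
    """
    n = len(net_a)
    counts = {"both": 0, "only_a": 0, "only_b": 0, "sign_flips": 0}
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only: each edge once
            wa, wb = net_a[i][j], net_b[i][j]
            in_a, in_b = abs(wa) > tol, abs(wb) > tol
            if in_a and in_b:
                counts["both"] += 1
                if (wa > 0) != (wb > 0):  # shared edge with opposite sign
                    counts["sign_flips"] += 1
            elif in_a:
                counts["only_a"] += 1
            elif in_b:
                counts["only_b"] += 1
    return counts


def proportion_unreplicated(counts, which="only_a"):
    """Share of one network's edges that are absent in the other network."""
    total = counts["both"] + counts[which]
    return counts[which] / total if total else 0.0


# Hypothetical 4-node networks (made-up weights, not data from the study).
wave1 = [
    [0.0, 0.30, 0.00, 0.20],
    [0.30, 0.0, 0.25, 0.00],
    [0.00, 0.25, 0.0, -0.10],
    [0.20, 0.00, -0.10, 0.0],
]
wave2 = [
    [0.0, 0.28, 0.15, 0.00],
    [0.28, 0.0, 0.20, 0.00],
    [0.15, 0.20, 0.0, 0.12],
    [0.00, 0.00, 0.12, 0.0],
]

counts = compare_edges(wave1, wave2)
print(counts)
print(proportion_unreplicated(counts, "only_a"))
```

In this toy example the two networks have the same number of edges, so a global summary would look similar, yet one edge in each network is missing from the other and one shared edge changes sign. That is the kind of discrepancy between global summaries and detailed characteristics that the abstract describes.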