Learning disentangled representations to harmonize connectome network measures

Nancy R. Newlin; Michael E. Kim; Praitayini Kanakaraj; The BIOCARD Study Team; Kimberly R. Pechman; Niranjana Shashikumar; Elizabeth Moore; Derek B. Archer; Timothy J. Hohman; Angela L. Jefferson; Daniel Moyer; Bennett A. Landman

doi:10.1117/1.JMI.12.1.014004

14 February 2025

Funded by: National Institutes of Health (NIH), National Institutes of Health (NIH)/National Center for Advancing Translational Sciences (NCATS), National Institute on Aging (U.S. National Institute on Aging)

Journal of Medical Imaging, Vol. 12, Issue 1, 014004 (February 2025). https://doi.org/10.1117/1.JMI.12.1.014004

Abstract

Purpose

Connectome network metrics are commonly regarded as fundamental properties of the brain, and their alterations have been implicated in the development of Alzheimer’s disease, multiple sclerosis, and traumatic brain injury. However, these metrics are actually estimated properties through a multistage propagation from local voxel diffusion estimations, regional tractography, and region of interest mapping. These estimation processes are significantly influenced by choices specific to imaging protocols and software, producing site-wise effects.

Approach

Recent advances in disentanglement techniques offer opportunities to learn representational spaces that separate factors that cause domain shifts from intrinsic biological factors. Although these techniques have been applied in unsupervised brain anomaly detection and image-level features, their application to the unique manifold structures of connectome adjacency matrices remains unexplored. Here, we explore the conditional variational autoencoder structure for generating site-invariant representations of the connectome, allowing the harmonization of brain network measures.

Results

Focusing on the context of aging, we conducted a study involving 823 patients across two sites. This approach effectively segregates site-specific influences from biological features, aligns network measures across different domains (Cohen’s D < 0.2 and Mann–Whitney U-test < 0.05), and maintains associations with age (2.71 × 10⁻² ± 2.86 × 10⁻³ error in years) and sex (0.92 ± 0.02 accuracy).

Conclusions

Our findings demonstrate that using latent representations significantly harmonizes network measures and provides robust metrics for multi-site brain network analysis.

1. Introduction

Diffusion-weighted imaging (DWI), tractography, and connectomics are advanced neuroimaging techniques that have enhanced our understanding of brain structure and connectivity.¹,² DWI measures the diffusion of water molecules in brain tissue, providing detailed maps of white matter fiber integrity.³ Tractography combines these data to estimate white matter fibers, creating comprehensive maps of brain connectivity and structures.¹,⁴ Connectomics, the study of these connections, delves into the intricate network of neural interactions.⁵ Together, these methodologies are key in studying Alzheimer’s disease⁶,⁷ and the aging process,⁸ offering insights into the degeneration of brain networks and the consequent cognitive decline. By quantifying changes in brain connectivity with connectomics, we can better understand the progression of Alzheimer’s disease and identify potential markers for early diagnosis and therapeutic targets.⁵,⁹,¹⁰

Two major hurdles in leveraging DWI to study aging and cognitive decline are small sample sizes¹¹ and the complexity of integrating different diffusion datasets.¹²,¹³ There are a growing number of multi-center diffusion imaging studies that span multiple scanner manufacturers and acquisition protocols or “sites.” The Alzheimer’s Disease Neuroimaging Initiative¹⁴ and the National Alzheimer’s Coordinating Center¹⁵ incorporate data from multiple scanner vendors and protocols, the Open Access Series of Imaging Studies¹⁶ includes multiple protocols, and the Baltimore Longitudinal Study of Aging¹⁷ has data from distinct scanning hardware. However, expanding DWI dataset sources to include various acquisitions, scanners, or centers introduces confounding biases due to site-specific factors. Our current network connectivity measures inherently contain such site-specific information¹⁸,¹⁹ (Fig. 1).

Fig. 1

The proposed model is a VAE with a site-conditional decoder, originally proposed in Ref. 21. The encoder block comprises two linear layers and two ReLU activation functions. Following encoding, the latent space, $z$, is reparametrized by sampling from the learned mean ($μ_{z}$) and log-variance ($\log σ_{z}^{2}$). The decoding block has two linear layers, resulting in the connectome reconstructed in the $c$ site domain, ${\hat{x}}^{c}$. We constrain the latent space with multiple prediction heads for patient sex (${\hat{y}}_{sex}$), age (${\hat{y}}_{age}$), and network measures (${\hat{y}}_{network measures}^{c}$).

Previous efforts to address these biases have focused on non-linear harmonization in diffusion magnetic resonance imaging (MRI), primarily at the image level,²⁰–²² and more broadly across MRI.²³,²⁴ These methods often aim to remove predictive information related to the site variable using techniques such as adversarial losses,²²,²³ variational bounds,²⁰ contrastive losses,²⁵ and ad hoc disentanglement methods.²⁴ As shown in Ref. 21, optimizing these losses is the same as minimizing mutual information between a site variable and the learned representation.

The connectome is a two-dimensional graph representation that encapsulates biological and site-specific information. Each brain region translates to a connectome node, and the streamlines connecting brain regions are connectome edges.⁵ In recognizing the connectome’s potential as a lower-dimensional but rich source of connectivity information, we explore the possibility of disentangling biological and site-specific information in connectomes and extracting meaningful, site-invariant features.²⁶

In this approach, we learn representations of the connectome that have disentangled site and biological features (Fig. 1). Then, we shift the representations to a common site domain to reduce confounding biases in the connectome and network measures.

2. Methods

We propose that it is possible to disentangle information in connectomes to extract biological features with minimal or no site-specific information. Site information is a conglomeration of non-biological hardware, protocol, and study parameters that should not drive analysis.¹³ We train a neural network that produces representations that are uninformative of the site variable yet preserves relevant biological signals (Fig. 1). Then, we map this representation back to connectivity matrices conditional on a (possibly different) site variable. We show that by manipulating that site variable at test time, we choose which site domain to reconstruct (Fig. 2). Further, we can choose a common site domain for network connectivity analysis. Our method uses multi-head prediction tasks on the latent space alongside the main reconstruction task to achieve this site-minimal-bio-maximal representation.

Fig. 2

We average all connectomes from the matched testing cohort for each site. Top: Although data are matched across sites, we see systematic differences among the average connectomes, particularly in intra-hemispheric connections. We reconstruct all connectomes in a common domain to reduce inter-site differences in connectome edges. Middle: Site 1 domain. Bottom: Site 2 domain.

2.1. Data

DWI scans from two sites were combined for joint analysis: the Biomarkers of Cognitive Decline Among Normal Individuals cohort (BIOCARD)²⁷ and the Vanderbilt Memory and Aging Project (VMAP).²⁸ BIOCARD patients were scanned on a 3T Philips Achieva scanner (Eindhoven, The Netherlands) at Johns Hopkins University in Baltimore, Maryland, United States. Diffusion-weighted images were acquired from a spin echo sequence (TR = 7.5 s, TE = 75 ms, resolution = $0.828\ \mathrm{mm} \times 0.828\ \mathrm{mm} \times 2.2\ \mathrm{mm}$, $b$-values = 0 and $700\ \mathrm{s/mm^{2}}$, and number of gradients = 33). VMAP patients were scanned on a Philips 3T Achieva scanner (Best, The Netherlands) at Vanderbilt University in Nashville, Tennessee, United States. Diffusion-weighted images were acquired from a spin echo sequence (TR = 8.9 ms, TE = 4.6 ms, resolution = $2\ \mathrm{mm} \times 2\ \mathrm{mm} \times 2\ \mathrm{mm}$, $b$-values = 0 and $1000\ \mathrm{s/mm^{2}}$, and number of gradients = 32).

Data used in this study are split into training and testing cohorts. The training cohort comprised 347 scans from BIOCARD (site 1) and 324 scans from VMAP (site 2), all free of cognitive impairment. Training data from VMAP have ages $74.1 \pm 7.5$ and 114 women. BIOCARD training data are ages $73.1 \pm 5.7$ with 205 women. The testing cohort is 77 matched participants (one scan from each participant), free of cognitive impairment, ages $72.9 \pm 7.6$, and 57% women.

To isolate site-wise differences between these two datasets without traveling subjects,²⁹,³⁰ we curate a subset of patients that have the same demographics. The matching process finds a 1:1 cross-site mapping of each patient based on demographics. For example, a 59-year-old female from the VMAP cohort is matched with a 59.5-year-old female from the BIOCARD cohort. Matched patients are the same sex and $\pm 1$ year in age. The matching was done using the pyPheWAS maximal group matching tool (version 4.1.1).³¹
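The matching criteria described above (same sex, age within one year, 1:1 across sites) can be illustrated with a minimal greedy sketch. This is not the pyPheWAS maximal group matching algorithm used in the study, and a greedy pass does not guarantee the largest possible matched set. The function name `match_participants` and the `(participant_id, sex, age)` tuple layout are assumptions made for illustration.

```python
def match_participants(site_a, site_b, max_age_gap=1.0):
    """Greedy 1:1 cross-site matching on sex and age.

    site_a, site_b: lists of (participant_id, sex, age) tuples (hypothetical layout).
    Returns a list of (id_a, id_b) pairs; each participant is used at most once.
    """
    unused_b = list(site_b)
    pairs = []
    # Visit site A participants in age order so the result is deterministic.
    for id_a, sex_a, age_a in sorted(site_a, key=lambda p: p[2]):
        candidates = [
            p for p in unused_b
            if p[1] == sex_a and abs(p[2] - age_a) <= max_age_gap
        ]
        if not candidates:
            continue  # no eligible partner; this participant stays unmatched
        best = min(candidates, key=lambda p: abs(p[2] - age_a))
        unused_b.remove(best)
        pairs.append((id_a, best[0]))
    return pairs


# Toy data: the first pair mirrors the 59 vs. 59.5-year-old example in the text.
vmap = [("v1", "F", 59.0), ("v2", "M", 70.0)]
biocard = [("b1", "F", 59.5), ("b2", "M", 73.0)]
matched = match_participants(vmap, biocard)
print(matched)
```

In this toy example only the two women are paired, because the men differ in age by three years and therefore fall outside the one-year window.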

2.2. Diffusion Processing

The proposed model learns from connectome representations of diffusion tractography, which are derived from the DWI outlined in Sec. 2.1. DWI from all participants were first preprocessed to remove eddy current, motion, and echo-planar imaging distortions prior to any model fitting.³² We used MRtrix3³³ to perform tractography over fiber orientation distributions with the anatomically constrained tractography framework (seeded on the gray matter–white matter interface, allowing backtracking, terminating with a five-tissue-type mask, and generating 10 million streamlines³⁴). Afterward, we map the tractogram to a connectome representation using the Desikan–Killiany atlas³⁵ with 84 cortical parcellations from FreeSurfer.³⁶ We use the Brain Connectivity Toolbox (version-2019-03-03) to compute 12 brain network measures⁵ and filter out connections containing less than 0.00001% of the total number of streamlines.³⁴ Network measures computed with this toolbox are the site-biased ground truth used in model training.

We consider modularity, average node betweenness centrality, assortativity, average node participation coefficient, average node clustering, average node strength, average local efficiency, global efficiency, density, rich club coefficient, characteristic path length, and number of edges in the characteristic path length (“edge count”).⁵ Modularity is the quality of division of the network into modules.⁵ Betweenness centrality is the sum of the ratio between the shortest paths that contain the node and the total number of the shortest paths.⁵ Assortativity is the correlation coefficient among the degrees of all nodes on two opposite ends of a connection and reflects network resilience.⁵ Participation coefficient characterizes the diversity of intermodular connections of nodes.⁵ Clustering coefficient is the probability that any two neighbors of a node are also connected.⁵ Here, node strength is the number of streamlines connecting a node.⁵ Global efficiency is the ability for information to move around the whole network, and local efficiency is the ability of information to move around nodal subsets.⁵ Density is the ratio of the number of present connections to the total number of connections possible.⁵ Rich club networks have subgroups of central nodes that tend to interact with one another.³⁷ The characteristic path length is the average shortest path, in millimeters, to traverse the network.⁵
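Two of the simpler measures defined above, density and node strength, can be computed directly from a connectome adjacency matrix. The sketch below is a NumPy illustration and not the Brain Connectivity Toolbox implementation; it assumes a symmetric, undirected matrix of streamline counts, and the function names and the four-node toy matrix are hypothetical.

```python
import numpy as np


def density(adj):
    """Fraction of possible undirected connections (no self-loops) that are present."""
    n = adj.shape[0]
    present = np.count_nonzero(np.triu(adj, k=1))  # count each edge once
    return present / (n * (n - 1) / 2)


def node_strength(adj):
    """Sum of streamline counts on the edges attached to each node."""
    return adj.sum(axis=1) - np.diag(adj)  # exclude self-connections


# Toy 4-node connectome with streamline counts as edge weights.
adj = np.array([
    [0, 10, 0, 5],
    [10, 0, 2, 0],
    [0, 2, 0, 0],
    [5, 0, 0, 0],
])
print(density(adj))               # 3 of 6 possible edges are present
print(node_strength(adj).mean())  # average node strength
```

For the toy matrix, three of the six possible edges are present, so the density is 0.5, and the node strengths are 15, 12, 2, and 5.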

2.3. Model Architecture

We implemented a variational autoencoder (VAE) with site-conditional restrictions on the latent space, $z$. The proposed model architecture is outlined in Fig. 1. First, the encoder processes the input data using two linear layers and two non-linear activations (ReLU). Following encoding, the latent space is reparametrized by sampling from the learned distribution. Next, the decoder takes this compact representation and reconstructs it back into the original data format while accounting for site-specific characteristics. In addition to recreating the original connectome, the model is trained to predict other useful information, such as the patient’s sex, age, and brain connectivity features. These predictions help ensure that the latent space is organized in a way that captures meaningful patterns and relationships in the data. During test time, we set the decoding conditions to project network measures and the reconstructed connectome to a site domain of the user’s choice.
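The data flow described above (encode, reparametrize, decode conditional on a site code) can be sketched as an untrained NumPy forward pass. This is an illustration under stated assumptions and not the authors' released implementation: the hidden and latent sizes, the separate mean and log-variance projections, the ReLU between the two decoder layers, and the one-hot site code are all assumed, and the prediction heads are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# 84 x 84 connectome flattened to a vector; hidden and latent sizes are assumed.
N_EDGES, N_HIDDEN, N_LATENT, N_SITES = 84 * 84, 64, 16, 2


def init_linear(n_in, n_out):
    """Small random weights and zero bias for one linear layer."""
    return rng.normal(0.0, 0.01, size=(n_in, n_out)), np.zeros(n_out)


def relu(x):
    return np.maximum(x, 0.0)


params = {
    "enc1": init_linear(N_EDGES, N_HIDDEN),
    "enc2": init_linear(N_HIDDEN, N_HIDDEN),
    "mu": init_linear(N_HIDDEN, N_LATENT),       # assumed projection to the mean
    "logvar": init_linear(N_HIDDEN, N_LATENT),   # assumed projection to log-variance
    "dec1": init_linear(N_LATENT + N_SITES, N_HIDDEN),
    "dec2": init_linear(N_HIDDEN, N_EDGES),
}


def layer(name, x):
    w, b = params[name]
    return x @ w + b


def forward(x, decode_site):
    """Encode connectomes, sample z, and decode into the requested site domain."""
    h = relu(layer("enc2", relu(layer("enc1", x))))
    mu, logvar = layer("mu", h), layer("logvar", h)
    # Reparameterization: z = mu + sigma * eps, with eps ~ N(0, I).
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)
    # The decoder sees z concatenated with a one-hot site code c.
    c = np.eye(N_SITES)[decode_site]
    x_hat = layer("dec2", relu(layer("dec1", np.concatenate([z, c], axis=1))))
    return x_hat, mu, logvar


x = rng.random((3, N_EDGES))                         # three toy connectomes
true_site = np.array([0, 1, 0])
x_same, mu, logvar = forward(x, true_site)           # decode in each scan's own site
x_site1, _, _ = forward(x, np.zeros(3, dtype=int))   # project every scan to site 1
print(x_same.shape, mu.shape, x_site1.shape)
```

The second call shows the test-time use described in the text: the encoder never sees the site label, so choosing the site code passed to the decoder selects the domain of the reconstruction.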

We chose this architecture to learn site-minimal, bio-maximal representations of the connectome. We accomplish this goal by maximizing mutual information ( $I (x, y)$ ) of $z$ and biological attributes (let $a$ be patient age and $s$ be patient sex) and $z$ with network connectivity properties ( $B$ ). Simultaneously, we minimize mutual information of $z$ with site (let $c$ be the site encoding). We translate these relationships to loss functions as follows: to maximize mutual information, we minimize loss [mean squared error (MSE) or binary cross entropy]. To minimize mutual information between $z$ and site, we minimize MSE loss of the conditional reconstruction [the first term in Eq. (1)] and maximize KL divergence [the second term in Eq. (1)].

Eq. (1)

$$I(z, c) \leq \log p(x \mid z, c) - \mathrm{KL}\left[q(z \mid x) \,\|\, q(z)\right]$$

The second term in this bound is difficult to compute directly but, as shown in Moyer et al.,²⁰ can be approximated using the standard normal Gaussian in place of induced marginal $q (z)$ . This fits nicely with existing VAE literature,³⁸ is computationally tractable, and, as we show, performs well empirically for removing site information. We use Eq. (1) to replace $I (z, c)$ in Eq. (2).
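When the induced marginal $q(z)$ is replaced by a standard normal and the encoder outputs a diagonal Gaussian, the KL term has the closed form familiar from the VAE literature: $\frac{1}{2}\sum_{i}(μ_{i}^{2} + σ_{i}^{2} - \log σ_{i}^{2} - 1)$. The sketch below evaluates this expression for one sample; the function name is hypothetical and the diagonal-Gaussian encoder is an assumption consistent with the mean and log-variance outputs shown in Fig. 1.

```python
import math


def kl_to_standard_normal(mu, logvar):
    """KL[N(mu, diag(exp(logvar))) || N(0, I)] for one latent vector."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0
        for m, lv in zip(mu, logvar)
    )


# A posterior equal to the prior carries no penalty.
print(kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))
# Shifting one unit-variance dimension by 1 costs 0.5 nats.
print(kl_to_standard_normal([1.0], [0.0]))
```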

Therefore, the overall loss function, $ℓ_{total}$ , is the sum of five sub-component losses: connectome reconstruction ( $ℓ_{recon}$ ), mean squared site-conditional prediction error for brain network measures for BIOCARD ( $ℓ_{B} (c)$ , $c = 1$ ) and VMAP ( $ℓ_{B} (c)$ , $c = 2$ ), mean squared age prediction error ( $ℓ_{age}$ ), and binary cross entropy loss of sex predictions ( $ℓ_{sex}$ ). The final sub-component loss is KL divergence, which is a key component in our learning as it limits site information in $z$ .

Toward that end, our component loss function can be rewritten as mutual information terms (up to constant entropic terms):

Eq. (2)

$$I(z, c) - I(z, B) - I(z, a) - I(z, s) = ℓ_{recon}(c) + ℓ_{kl} + ℓ_{age} + ℓ_{sex} + ℓ_{B}(c) = ℓ_{total}$$
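As a worked illustration of Eq. (2), the sketch below adds the component losses for a single toy sample. It assumes an unweighted sum, which is how Eq. (2) is written; any per-term weighting used in practice is not stated in the text. The helper names, argument names, and toy values are hypothetical, and the KL term is passed in as a precomputed number.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def bce(prob, label, eps=1e-7):
    """Binary cross entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for p, y in zip(prob, label):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(prob)


def total_loss(x, x_hat, kl, age, age_hat, sex, sex_prob, measures, measures_hat):
    """Unweighted sum of the component losses, following Eq. (2)."""
    return (
        mse(x_hat, x)                    # reconstruction loss for site c
        + kl                             # KL divergence term
        + mse(age_hat, age)              # age prediction error
        + bce(sex_prob, sex)             # sex prediction loss
        + mse(measures_hat, measures)    # network measure error for site c
    )


loss = total_loss(
    x=[1.0, 0.0], x_hat=[1.0, 2.0],
    kl=0.5,
    age=[0.7], age_hat=[0.7],
    sex=[1], sex_prob=[0.5],
    measures=[0.3], measures_hat=[0.3],
)
print(loss)
```

Here the reconstruction term contributes 2.0, the KL term 0.5, and the uninformative sex prediction contributes $\log 2$, while the age and network measure heads are exact and contribute nothing.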

2.4. Co-Learning and Individual Learning Schemes

We explore two learning schemes for optimizing the latent space. In the first scheme, the model learns to predict one brain network measure from the site-invariant latent space (“individual learning”). We then train 12 separate models, one for each brain network measure. In the second scheme, the model learns to predict all 12 brain network measures at the same time from the site-invariant latent space (“co-learning”).

2.5. Bootstrapping

To assess training stability, we bootstrapped training data. At each bootstrap iteration, 80% of training data are randomly sampled without replacement. The same testing data are used to evaluate each iteration.
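The resampling step can be sketched with the standard library. The example assumes 100 iterations (the number of bootstrapped experiments reported in Fig. 3) and 671 training scans (347 + 324 from Sec. 2.1); the function name, the seed, and the use of integer scan identifiers are hypothetical.

```python
import random


def bootstrap_subsets(scan_ids, n_iterations=100, fraction=0.8, seed=0):
    """Draw `fraction` of the training scans without replacement, once per iteration."""
    rng = random.Random(seed)
    k = int(round(fraction * len(scan_ids)))
    return [rng.sample(scan_ids, k) for _ in range(n_iterations)]


training_ids = list(range(671))   # stand-in identifiers for the training scans
subsets = bootstrap_subsets(training_ids)
print(len(subsets), len(subsets[0]))
```

Each subset would train one model, and every model would then be evaluated on the same held-out matched testing set.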

2.6. Evaluating Harmonization Performance

We measure successful harmonization by evaluating differences in the connectomes and network measures with respect to their site variable. Although we do not use traveling participants in this study, we do match patients across sites based on their age, sex, and cognitive status. We expect network measures computed for this matched dataset to have similar distributions. We measure the disparity between sites with Cohen’s $D$ and the Mann–Whitney $U$-test of medians (with $p$-value $< 0.05$ as significant). Cohen’s $D$ is a standardized effect size that reports the difference between two means compared with the overall data variability.³⁹ Successful harmonization would reduce large (Cohen’s $D > 0.5$) differences to small (Cohen’s $D < 0.2$) and remove significant differences in the medians among sites.³⁹
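Cohen’s $D$ can be computed as the difference in means divided by a pooled standard deviation. The sketch below assumes the pooled-variance form of the statistic and labels effect sizes using the thresholds quoted in this section (below 0.2 small, above 0.5 large, with values in between treated as medium); the function names and toy values are hypothetical.

```python
import math
import statistics


def cohens_d(a, b):
    """Difference in means divided by the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var)


def effect_label(d):
    """Label an effect size using the thresholds quoted in the text."""
    d = abs(d)
    if d < 0.2:
        return "small"
    if d <= 0.5:
        return "medium"
    return "large"


site1 = [1.0, 2.0, 3.0, 4.0, 5.0]   # toy values of one network measure
site2 = [2.0, 3.0, 4.0, 5.0, 6.0]
d = cohens_d(site1, site2)
print(round(d, 3), effect_label(d))
```

The accompanying test of medians is available in SciPy as `scipy.stats.mannwhitneyu`, which returns the $U$ statistic and its $p$-value for two samples.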

The second benchmark of harmonization performance is preserving biological variation. Previous studies⁴⁰–⁴² demonstrated that connectomes contain relevant signals related to patient age and sex. During model training, the latent space is shaped by five tasks, two of which (sex and age prediction) are included to encourage the model to retain relevant biological covariates.

3. Results

Although data are matched across sites, we observe systematic differences among the average uncorrected connectomes, particularly in intra-hemispheric connections (Fig. 2). The model successfully projects connectomes and brain network measures to a common site domain. Inter-site differences in all network measures are reduced to small and medium effect size differences (Fig. 3). In addition, significant site-wise differences in the median in modularity, assortativity, average node clustering coefficient, average node strength, average node local efficiency, global efficiency, density, rich club, and characteristic path length are no longer significant (Fig. 4).

Fig. 3

We evaluate the harmonization efficacy of learning each network measure separately (blue) and co-learning all measures together (orange) across 100 bootstrapped experiments. Both schemes reduce inter-site differences to small and medium effect sizes by projecting data to a common site (site 1, left; site 2, right); however, co-learning improves performance in measures marked with significant differences in medians (* $p < 0.05$). In addition, the co-learning structure has comparable or improved sex prediction accuracy (middle) and age prediction error (bottom).

Fig. 4

Here, we project network measures for the demographically matched testing set to site 1 using the proposed correction method. We show histograms of site 1 values originally from the BIOCARD cohort (blue) and site 2 values originally from the VMAP cohort (orange) before and after correction. Significant inter-site differences (* $p$-value $< 0.05$) in the median are ameliorated in the domain-shifted network measures.

Individually and co-learned models retain biological information while reducing inter-site variation (Fig. 3). Under the co-learning scheme, patient sex is predicted with $0.92 \pm 0.02$ accuracy, and patient age is predicted with $2.71 \times 10^{-2} \pm 2.86 \times 10^{-3}$ error in years.

4. Discussion

There is a growing need for DWI-derived features that can be used reliably in multi-site studies. Harmonization methods play a critical role in pooling data from multiple sites with varying sample sizes and age ranges, enabling the study of complex pathologies that would otherwise suffer from limited statistical power and generalizability.¹³ This need is particularly acute in Alzheimer’s disease research, where cohort sizes are typically small.¹¹ These cohorts require harmonization techniques that are robust to variations in sample size and overlapping age distributions.¹³ To address this, we propose a method that extracts site-invariant features and harmonized network measures, serving as quantitative tools for multi-site analyses of neurodegeneration and aging.

A key advantage of our method is its independence from traveling subjects or perfect demographic matching, which are often impractical requirements. Although we used demographic matching to validate harmonization performance, the model itself does not require matched data during training or testing. This flexibility makes our approach more scalable and applicable to diverse datasets. Currently, the model has been calibrated using connectomes from patients without cognitive impairment across only two sites. Future work will aim to expand its applicability to patients with cognitive impairment and extend its utility to datasets spanning multiple sites.

One of the most widely used harmonization methods is ComBat, which estimates and reduces site-specific effects on the sample mean and variance.⁴³ Although efficient and lightweight, ComBat is only reliable for datasets with more than 162 scans and minimal mean age differences among sites.⁴⁴ These limitations restrict its utility in studies with smaller cohorts or significant demographic variations.

An alternative to ComBat harmonization is calibrating scanner differences using rotationally invariant features of DWI rather than downstream features.⁴⁵,⁴⁶ This method involves computing the average rotationally invariant signal for each site and generating a multiplicative template to normalize the signals across sites.⁴⁵ Although this approach ensures consistency in signal intensities, it requires all images to be co-registered to a common space. This step is computationally intensive and can result in a significant loss of information. Furthermore, it necessitates demographic matching across sites, which poses additional challenges in multi-site studies.⁴⁵

On the whole, our proposed method offers a promising solution for harmonizing multi-site DWI data by extracting site-invariant features and harmonized network measures. Unlike existing methods such as ComBat or rotationally invariant signal calibration, our approach balances computational efficiency with robustness to demographic mismatches. By enabling the integration of diverse datasets, our method holds the potential for advancing the study of neurodegeneration and aging in a multi-site context.

5. Conclusion

We assert the efficacy of a conditional variational autoencoder for minimizing mutual information between the latent space and site variable. We explored two schemes for further optimizing the latent space by maximizing information related to network connectivity and biological covariates. Co-learning has improved harmonization efficacy and biological preservation over the individually learned scheme. With the proposed model, we extract site-invariant connectome features that are predictive of brain network structure, patient sex, and patient age.

Disclosures

The authors have no conflicts of interest to declare. The authors used generative artificial intelligence (AI) to create code segments based on task descriptions, as well as to debug, edit, and autocomplete code. The conceptualization, ideation, and all prompts provided to the AI originated entirely from the authors’ creative and intellectual efforts.

Code and Data Availability

Code is publicly available at https://github.com/nancynewlin-masi/LearningSiteInvariantConnectomeFeatures.

Acknowledgments

Data used in the preparation of this article were derived from BIOCARD study data, supported by grant U19–AG033655 from the National Institute on Aging. The BIOCARD Study consists of seven cores and two projects with the following members: (1) the administrative core (Marilyn Albert, Corinne Pettigrew, and Barbara Rodzon), (2) the clinical core (Marilyn Albert, Anja Soldan, Rebecca Gottesman, Corinne Pettigrew, Leonie Farrington, Maura Grega, Gay Rudow, Rostislav Brichko, Scott Rudow, Jules Giles, and Ned Sacktor), (3) the imaging core (Michael Miller, Susumu Mori, Anthony Kolasny, Hanzhang Lu, Kenichi Oishi, Tilak Ratnanather, Peter vanZijl, and Laurent Younes), (4) the biospecimen core (Abhay Moghekar, Jacqueline Darrow, Alexandria Lewis, and Richard O’Brien), (5) the informatics core (Roberta Scherer, Ann Ervin, David Shade, Jennifer Jones, Hamadou Coulibaly, Kathy Moser, and Courtney Potter), (6) the biostatistics core (Mei-Cheng Wang, Yuxin Zhu, and Jiangxia Wang), (7) the neuropathology core (Juan Troncoso, David Nauen, Olga Pletnikova, and Karen Fisher), (8) project 1 (Paul Worley, Jeremy Walston, and Mei-Fang Xiao), and (9) project 2 (Mei-Cheng Wang, Yifei Sun, and Yanxun Xu). Study data were obtained from VMAP. VMAP data were collected by Vanderbilt Memory and Alzheimer’s Center investigators at Vanderbilt University Medical Center. This work was supported in part by the National Institutes of Health (NIH) [Grant Nos. R01-EB017230 (PI: Landman) and K01-AG073584 and K24-AG046373 (PI: Jefferson)], National Institute on Aging (NIA) (Grant Nos. R01-AG034962, R01-AG056534, and U24AG074855), and Alzheimer’s Association (Grant No. IIRG-08-88733). In addition, this work was supported by Vanderbilt’s High-Performance Computer Cluster for Biomedical Research (Award No. S10-OD023680), and the Vanderbilt Institute for Clinical and Translational Research is funded by the National Center for Advancing Translational Sciences Clinical Translational Science Award Program (Award No. 5UL1TR002243-03). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH, NIA, or Alzheimer’s Association. We used generative AI technologies to assist in structuring sentences and performing grammatical checks. We take accountability for the review of all content generated by AI in this work.

References

1.

S. Mori and P. C. M. Van Zijl, “Fiber tracking: principles and strategies—a technical review,” NMR Biomed., 15 (7–8), 468 –480 https://doi.org/10.1002/nbm.781 (2002). Google Scholar

2.

J. Zhang et al., “Three-dimensional anatomical characterization of the developing mouse brain by diffusion tensor microimaging,” Neuroimage, 20 (3), 1639 –1648 https://doi.org/10.1016/S1053-8119(03)00410-5 NEIMEF 1053-8119 (2003). Google Scholar

3.

D. Le Bihan and M. Iima, “Diffusion magnetic resonance imaging: what water tells us about biological tissues,” PLoS Biol., 13 (7), e1002203 https://doi.org/10.1371/journal.pbio.1002203 (2015). Google Scholar

4.

T. E. Conturo et al., “Tracking neuronal fiber pathways in the living human brain,” Appl. Phys. Sci., 96 10422 –10427 https://doi.org/10.1073/pnas.96.18.10422 (1999). Google Scholar

5.

M. Rubinov and O. Sporns, “Complex network measures of brain connectivity: uses and interpretations,” Neuroimage, 52 (3), 1059 –1069 https://doi.org/10.1016/j.neuroimage.2009.10.003 NEIMEF 1053-8119 (2010). Google Scholar

6.

Y. Sun et al., “Prediction of conversion from amnestic mild cognitive impairment to Alzheimer’s disease based on the brain structural connectome,” Front. Neurol., 9 1178 –1178 https://doi.org/10.3389/fneur.2018.01178 (2019). Google Scholar

7.

A. Ebadi et al., “Ensemble classification of Alzheimer’s disease and mild cognitive impairment based on complex graph measures from diffusion tensor images,” Front. Neurosci., 11 (Feb.), 56 https://doi.org/10.3389/fnins.2017.00056 1662-453X (2017). Google Scholar

8.

Y. Wang et al., “Longitudinal changes of connectomes and graph theory measures in aging,” Proc SPIE, 12032 120321U https://doi.org/10.1117/12.2611845 (2022). Google Scholar

9.

E. Bullmore and O. Sporns, “Complex brain networks: graph theoretical analysis of structural and functional systems,” Nat. Rev. Neurosci., 10 (4), 312 –312 https://doi.org/10.1038/nrn2618 NRNAAN 1471-003X (2009). Google Scholar

10.

E. Bullmore, “Human brain networks in health and disease,” Curr. Opin. Neurol., 22 (4), 340 –347 https://doi.org/10.1097/WCO.0b013e32832d93dd (2009). Google Scholar

11.

S. M. Smith and T. E. Nichols, “Statistical challenges in ‘big data’ human neuroimaging,” Neuron, 97 (2), 263 –268 https://doi.org/10.1016/j.neuron.2017.12.018 NERNET 0896-6273 (2018). Google Scholar

12.

C. M. Tax et al., “Cross-scanner and cross-protocol diffusion MRI data harmonisation: a benchmark database and evaluation of algorithms,” Neuroimage, 195 285 –299 https://doi.org/10.1016/j.neuroimage.2019.01.077 NEIMEF 1053-8119 (2019). Google Scholar

13.

M. S. Pinto et al., “Harmonization of brain diffusion MRI: concepts and methods,” Front. Neurosci., 14 396 https://doi.org/10.3389/fnins.2020.00396 1662-453X (2020). Google Scholar

14.

“ADNI|ADNI 3,” https://adni.loni.usc.edu/adni-3/ Google Scholar

15.

“National Alzheimer’s Coordinating Center,” https://naccdata.org/ Google Scholar

16.

D. S. Marcus et al., “Open Access Series of Imaging Studies (OASIS): cross-sectional MRI data in young, middle aged, nondemented, and demented older adults,” J. Cognit. Neurosci., 19 (9), 1498 –1507 https://doi.org/10.1162/jocn.2007.19.9.1498 JCONEO 0898-929X (2007). Google Scholar

17.

L. Ferrucci, “The Baltimore Longitudinal Study of Aging (BLSA): a 50-year-long journey and plans for the future,” J. Gerontol. – Ser. A Biol. Sci. Med. Sci., 63 (12), 1416 –1419 https://doi.org/10.1093/gerona/63.12.1416 (2008). Google Scholar

18.

N. R. Newlin et al., “Comparing voxel- and feature-wise harmonization of complex graph measures from multiple sites for structural brain network investigation of aging,” Proc. SPIE, 12464 124642B https://doi.org/10.1117/12.2653947 PSISDG 0277-786X (2023). Google Scholar

19.

A. I. Onicas et al., “Multisite harmonization of structural DTI networks in children: an A-CAP study,” Front. Neurol., 13 850642 https://doi.org/10.3389/fneur.2022.850642 (2022). Google Scholar

20.

D. Moyer et al., “Scanner invariant representations for diffusion MRI harmonization,” Magn. Reson. Med., 84 (4), 2174 –2189 https://doi.org/10.1002/mrm.28243 MRMEEN 0740-3194 (2020). Google Scholar

21.

D. Moyer et al., “Invariant representations without adversarial training,” Neural Inf. Process. Syst., (2018). Google Scholar

22.

M. Liu et al., “Style transfer using generative adversarial networks for multi-site MRI harmonization,” Lect. Notes Comput. Sci., 12903 313 –322 https://doi.org/10.1007/978-3-030-87199-4_30 LNCSD9 0302-9743 (2021). Google Scholar

23.

K. Kamnitsas et al., “Unsupervised domain adaptation in brain lesion segmentation with adversarial networks,” Lect. Notes Comput. Sci., 10265 597 –609 https://doi.org/10.1007/978-3-319-59050-9_47 LNCSD9 0302-9743 (2016). Google Scholar

24.

L. Zuo et al., “Unsupervised MR harmonization by learning disentangled representations using information bottleneck theory,” Neuroimage, 243 118569 https://doi.org/10.1016/j.neuroimage.2021.118569 NEIMEF 1053-8119 (2021). Google Scholar

25.

V. Nath et al., “Inter-scanner harmonization of high angular resolution DW-MRI using null space deep learning,” Comput. Diffus. MRI, 2019 193 –201 (2019). Google Scholar

26.

N. R. Newlin et al., “Learning site-invariant features of connectomes to harmonize complex network measures,” Proc. SPIE Int. Soc. Opt. Eng., 12930 129302E https://doi.org/10.1117/12.3009645 (2024). Google Scholar

27.

“BIOCARD Home Page (NS),” https://www.biocard-se.org/public/BIOCARD%20Home%20Page.html Google Scholar

28.

A. L. Jefferson et al., “The Vanderbilt Memory & Aging Project: study design and baseline cohort overview,” J. Alzheimer’s Dis., 52 (2), 539 –559 https://doi.org/10.3233/JAD-150914 (2016). Google Scholar

29.

Q. Tong et al., “Reproducibility of multi-shell diffusion tractography on traveling subjects: a multicenter study prospective,” Magn. Reson. Imaging, 59 1 –9 https://doi.org/10.1016/j.mri.2019.02.011 MRIMDQ 0730-725X (2019). Google Scholar

30.

R. Kurokawa et al., “Cross-scanner reproducibility and harmonization of a diffusion MRI structural brain network: a traveling subject study of multi-b acquisition,” Neuroimage, 245 118675 https://doi.org/10.1016/j.neuroimage.2021.118675 NEIMEF 1053-8119 (2021). Google Scholar

31.

C. I. Kerley et al., “pyPheWAS: a phenome-disease association tool for electronic medical record analysis,” Neuroinformatics, 20 (2), 483 –505 https://doi.org/10.1007/s12021-021-09553-4 1539-2791 (2022). Google Scholar

32.

L. Y. Cai et al., “PreQual: an automated pipeline for integrated preprocessing and quality assurance of diffusion weighted MRI images,” Magn. Reson. Med., 86 (1), 456–470 https://doi.org/10.1002/mrm.28678 MRMEEN 0740-3194 (2021).

33.

J. D. Tournier et al., “MRtrix3: a fast, flexible and open software framework for medical image processing and visualisation,” Neuroimage, 202 116137 https://doi.org/10.1016/j.neuroimage.2019.116137 NEIMEF 1053-8119 (2019).

34.

N. R. Newlin et al., “Characterizing streamline count invariant graph measures of structural connectomes,” J. Magn. Reson. Imaging, 58 1211–1220 https://doi.org/10.1002/jmri.28631 (2023).

35.

“CorticalParcellation—Free Surfer Wiki,” https://surfer.nmr.mgh.harvard.edu/fswiki/CorticalParcellation

36.

B. Fischl, “FreeSurfer,” Neuroimage, 62 (2), 774–781 https://doi.org/10.1016/j.neuroimage.2012.01.021 (2012).

37.

V. Colizza et al., “Detecting rich-club ordering in complex networks,” Nature Phys., 2 110–115 https://doi.org/10.1038/nphys209 (2006).

38.

D. P. Kingma and M. Welling, “Auto-encoding variational Bayes,” CoRR (2013).

39.

J. Cohen, Statistical Power Analysis for the Behavioral Sciences, 2nd ed., Routledge (1988).

40.

M. Ingalhalikar et al., “Sex differences in the structural connectome of the human brain,” Proc. Natl. Acad. Sci. U. S. A., 111 (2), 823–828 https://doi.org/10.1073/pnas.1316909110 (2014).

41.

E. L. Dennis et al., “Changes in anatomical brain connectivity between ages 12 and 30: a HARDI study of 467 adolescents and adults,” in Proc. IEEE Int. Symp. Biomed. Imaging: from Nano to Macro, 904 (2012).

42.

S. J. Stam, “The influence of ageing on complex brain networks: a graph theoretical analysis,” Hum. Brain Mapp., 30 (1), 200–208 https://doi.org/10.1002/hbm.20492 HBRME7 1065-9471 (2007).

43.

J. P. Fortin et al., “Harmonization of multi-site diffusion tensor imaging data,” Neuroimage, 161 149–170 https://doi.org/10.1016/j.neuroimage.2017.08.047 NEIMEF 1053-8119 (2017).

44.

M. E. Kim et al., “Empirical assessment of the assumptions of ComBat with diffusion tensor imaging,” J. Med. Imaging, 11 (2), 024011 https://doi.org/10.1117/1.JMI.11.2.024011 (2024).

45.

N. R. Newlin et al., “MidRISH: unbiased harmonization of rotationally invariant harmonics of the diffusion signal,” Magn. Reson. Imaging, 111 113–119 https://doi.org/10.1016/j.mri.2024.03.033 (2024).

46.

H. Mirzaalian et al., “Multi-site harmonization of diffusion MRI data in a registration framework,” Brain Imaging Behav., 12 (1), 284 https://doi.org/10.1007/s11682-016-9670-y (2018).

Biography

Nancy R. Newlin is a graduate student studying computer science at Vanderbilt University. Her current interests are in studying brain network connectivity changes in aging and cognitively impaired populations.

Biographies of the other authors are not available.

CC BY: © The Authors. Published by SPIE under a Creative Commons Attribution 4.0 International License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.

Citation

Nancy R. Newlin, Michael E. Kim, Praitayini Kanakaraj, The BIOCARD Study Team, Kimberly R. Pechman, Niranjana Shashikumar, Elizabeth Moore, Derek B. Archer, Timothy J. Hohman, Angela L. Jefferson, Daniel Moyer, and Bennett A. Landman "Learning disentangled representations to harmonize connectome network measures," Journal of Medical Imaging 12(1), 014004 (14 February 2025). https://doi.org/10.1117/1.JMI.12.1.014004

Received: 18 August 2024; Accepted: 13 January 2025; Published: 14 February 2025

JOURNAL ARTICLE
11 PAGES

KEYWORDS

Brain

Alzheimer disease

Education and training

Diffusion weighted imaging

Neuroimaging

Biological imaging

Biological research
