Abstract
Objective To validate the Canada-Denmark (CANDEN) MRI scoring system for the spine in axial spondyloarthritis with updated lesion definitions.
Methods Lesion definitions in the CANDEN system were updated and illustrated by a consensus set of reference images. Sagittal spine MRIs of 40 patients with axial spondyloarthritis obtained at baseline and at week 52 after initiation of treatment with the tumour necrosis factor inhibitor golimumab were evaluated in unknown chronology by seven readers blinded to all other data.
Results CANDEN MRI spine inflammation score had very good reliability for status scores (single-measure intraclass correlation coefficient (ICC) of 21 reader pairs median of 0.91 (IQR 0.88–0.92)) and change scores (ICC 0.88 (0.86–0.92)). CANDEN MRI spine fat score had good to very good reliability for status scores (ICC 0.79 (0.75–0.86)) and moderate to good reliability for detecting change (ICC 0.59 (0.46–0.73)). CANDEN MRI spine bone erosion score and CANDEN MRI spine new bone formation score had slight to moderate reliability for status scores (ICC 0.38 (0.32–0.52) and 0.39 (0.27–0.49), respectively).
Conclusion The CANDEN MRI spine scoring system allows a comprehensive evaluation of inflammation, fat, bone erosion and new bone formation of the spine in patients with axial spondyloarthritis. It demonstrated very good reliability for detecting change in inflammation, moderate to good reliability for detecting change in fat, and slight to moderate reliability for detecting bone erosions and new bone formation. Studies with longer follow-up or patients with more advanced spinal involvement may be needed to reliably detect change in bone erosion and new bone formation scores.
Trial registration number NCT02011386.
- magnetic resonance imaging
- spondyloarthritis
- outcomes research
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Key messages
What is already known about this subject?
The Canada-Denmark (CANDEN) scoring system is a comprehensive scoring system that allows evaluation of inflammatory and structural lesions of the spine in patients with axial spondyloarthritis.
What does this study add?
In a multireader setting with experienced as well as newly trained MRI readers, the CANDEN MRI spine scoring system demonstrated very good reliability for detecting change in inflammation, moderate to good reliability for detecting change in fat, and slight to moderate reliability for detecting bone erosions and new bone formation.
Four inflammation subscores that capture different anatomical parts of the spine were developed.
How might this impact on clinical practice?
The scoring system can be used to investigate how drugs with different modes of action influence the individual components of spine inflammation and damage, the mutual relationship between lesions, as well as the evolution of inflammation and damage in the entire spine.
Introduction
In 2009, the Canada-Denmark (CANDEN) MRI working group published definitions of pathologies, an atlas of inflammatory and structural lesions of the spine, and the reliability for scoring various lesions among the founders of the system (RGWL, WM, SJP and MØ).1–5 Later, preliminary CANDEN MRI spine scores of inflammation, fat lesion, bone erosion and new bone formation were developed based on these lesion definitions, and they were applied in two studies of patients with axial spondyloarthritis (axSpA) and ankylosing spondylitis (AS) treated with a tumour necrosis factor (TNF) inhibitor,6–8 where significant changes in the scores over 1–5 years occurred, except for bone erosion. These results suggest that the CANDEN MRI spine scoring system allows a comprehensive, anatomy-based evaluation of both inflammatory and structural lesions, and that it can provide patient-level sum scores for inflammatory lesions, fat lesions, bone erosion and new bone formation that are responsive to change.
Several other MRI scoring systems of the spine have been developed. The Ankylosing Spondylitis spine Magnetic Resonance Imaging (ASspiMRI) method provides an activity score of 0–6 and a chronicity score of 0–6 for each discovertebral unit (DVU).9 10 The Berlin method provides an activity score of 0–3 per DVU and was later amended to allow assessment of inflammation of the posterior segments with a score of 0–2, and erosions, fatty bone marrow deposition and bone proliferation, each with a score of 0–3 per DVU.11–16 The Spondyloarthritis Research Consortium of Canada (SPARCC) MRI Spine Inflammation Index divides each DVU into four vertebral quadrants, which are assessed for presence/absence of bone marrow oedema with additional scores for intensity and depth. This is done on three consecutive slices, and provides a score of 0–18 for each DVU and has been used for scoring either the 6 worst affected DVUs or all 23 DVUs.17–21 Assessment of inflammation in pedicles, facet joints, transverse processes, spinous processes and soft tissues in relation to the SPARCC method has been investigated.22 The Leeds method counts the number of inflammatory lesions in the vertebral bodies, spinous processes, facet joints and soft tissue in the lumbar spine.23 24 The Aarhus method provides an activity score of 0–3, and an additional score of 1 for costovertebral involvement for each DVU and a chronicity score of 0–9 for each DVU.25 The ASspiMRI, Berlin and SPARCC scoring methods for inflammatory activity were compared in a multireader study.26
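As an illustration of how a per-DVU maximum of 18 arises in the SPARCC method described above, the arithmetic can be sketched as follows. The sketch assumes one point per affected quadrant on each of three slices, plus one point per slice for intensity and one per slice for depth; the function name and the input layout are hypothetical and are not taken from the SPARCC publications.

```python
def sparcc_dvu_score(slices):
    """Sum a SPARCC-style inflammation score for one discovertebral unit.

    `slices` holds three dicts, one per consecutive sagittal slice, each with
    'quadrants' (four 0/1 values), 'intense' (bool) and 'deep' (bool).
    Assumed weighting: 1 point per affected quadrant, plus 1 point per slice
    for intensity and 1 point per slice for depth.
    """
    if len(slices) != 3:
        raise ValueError("expected exactly three consecutive slices")
    total = 0
    for s in slices:
        quadrants = s["quadrants"]
        if len(quadrants) != 4 or any(q not in (0, 1) for q in quadrants):
            raise ValueError("each slice needs four quadrant values of 0 or 1")
        total += sum(quadrants) + int(s["intense"]) + int(s["deep"])
    return total
```

Under these assumptions the maximum is 3 × (4 + 1 + 1) = 18 per DVU, which matches the range quoted in the text.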
The CANDEN MRI spine scoring system is a comprehensive system which permits a detailed description of the involvement of different spinal structures, various topographic parts of the vertebral bodies, the facet joints, the spinous processes, the transverse processes, the ribs and soft tissue. By covering various inflammatory and structural lesion types, the CANDEN MRI spine scoring system may help identify subgroups of patients and different disease trajectories. It also allows investigation of the relation between different lesion types over time at the lesion level and at the patient level,27–29 and how this association may vary depending on the mode of action of the applied therapy. In this updated version of the scoring system, we have harmonised the definitions and rules across the various lesion types in order to ease the use of the system. Sum scores for inflammation, fat and erosion differ slightly from the preliminary sum scores that were previously published,6–8 but differences are minor; the decision to make this update of the system was driven by a desire for internally consistent rules, definitions and accompanying atlas.
The objective of this work was to update the lesion definitions including an atlas with consensus reference images, clarify the CANDEN MRI spine scoring system for inflammation and structural lesions, and assess the reliability of the scoring system in a multireader setting.
Methods
Patient characteristics
Sagittal MRIs of the spine of 40 patients with axSpA obtained at baseline and after 52 weeks in the MANGO (Novel MRI ANd biomarkers in GOlimumab-treated patients with axial spondyloarthritis) cohort study (ClinicalTrials.gov: NCT02011386) were used. All patients were diagnosed with axSpA and fulfilled the imaging arm of the Assessment of SpondyloArthritis international Society (ASAS) axSpA criteria. TNF inhibitor treatment with golimumab 50 mg once every month was initiated in all patients at baseline. The median age of the 40 patients was 33 years (IQR 28–48, range 22–73), 55% were men, 73% were human leucocyte antigen B27 (HLA-B27)-positive, 24 (60%) fulfilled the radiographic part of the modified New York criteria, while 16 (40%) had non-radiographic axSpA, with a median time since diagnosis of 0.7 years (IQR 0.2–3.4) and median symptom duration of 4.6 years (IQR 2.1–9.8). At baseline, 53% had elevated C-reactive protein (CRP) (defined as >5 mg/L), Ankylosing Spondylitis Disease Activity Score (ASDAS) median of 3.6 (IQR 3.1–4.0) and SPARCC sacroiliac joint (SIJ) inflammation median score of 14 (IQR 8–22). At week 52, 21% had elevated CRP, ASDAS median of 1.4 (IQR 0.8–2.2) and SPARCC SIJ inflammation median score of 2 (IQR 0–4).
MRI methodology
MRI of the spine was performed at baseline and at week 52. Spine images were acquired in two halves (upper and lower spine). MRI was performed on a single high-field Philips Medical Systems Ingenia scanner (3.0 T) by sagittal T1-weighted turbo spin echo: repetition time (TR): 663 ms, echo time (TE): 8 ms, matrix: 444×319, field of view (FOV): 400×180 mm, slice thickness 4 mm, interslice gap 0.4 mm, and acquisition duration for upper/lower spine: 2 min 7 s each; and by short tau inversion recovery: TR: 3476 ms, TE: 70 ms, inversion time: 210 ms, matrix: 400×307/400×307, FOV: 400×180 mm, slice thickness 4 mm, interslice gap 0.4 mm, and acquisition duration for upper/lower spine: 5 min 23 s each.
Two readers screened the images for variation in spinal segmentation using C2 as the reference vertebra. Four patients had partial or complete sacralisation of L5 or lumbarisation of S1. In case of lumbarisation of S1, the DVU between S1 and S2 was not scored. Images were read independently by seven readers, of whom five were experienced readers (MØ, SJP, UW, WM and RGWL) and two were newly trained readers with less experience (SK and GK). All readings were performed without knowledge of clinical, biochemical and other imaging information, and with readers blinded to chronology.
Description of the CANDEN scoring system
Lesion definitions applied in the CANDEN MRI spine scoring system are provided in table 1, and all the rules for scoring lesions and calculating the CANDEN patient-level scores and subscores are provided in table 2. To facilitate use of the system, the original set of reference images for all four lesion types (inflammatory lesions, fat lesions, bone erosion and new bone formation), with definitions of lesions, rules for scoring, and examples of lesions above and below the threshold for scoring, was updated and made available at www.carearthritis.com and www.copecare.dk.
Statistical analysis
Patient-level sum scores for each lesion type (inflammatory lesions, fat lesions, bone erosion and new bone formation) were calculated based on each reader’s scores. The reliability of patient-level sum scores was assessed by pairwise intraclass correlation coefficients (ICC) for status scores and for change scores between all reader pairs, based on two-way single-measure models by absolute agreement and two-way average-measure models by absolute agreement. Similar analyses were also performed separately for the vertebral bodies and posterior elements, as well as for all individual components. Agreement plots in which a reader’s scores are plotted against the mean score of all readers allow a graphical assessment of the agreement between multiple readers. The reliability at the lesion level was assessed for status scores and change scores between all reader pairs by pairwise kappa with squared weights. Kappa values and ICC values ≥0.8 were considered to represent very good reliability, ≥0.6 to <0.8 good reliability, ≥0.4 to <0.6 moderate reliability, ≥0.2 to <0.4 slight reliability, and <0.2 poor reliability.
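The reliability measures described above can be sketched in a few lines of code. The following is a minimal illustration, not the software used in the study (which is not stated): it computes the two-way absolute-agreement ICC in its single-measure and average-measure forms from the standard mean-square formulas, applies it to every reader pair, computes kappa with squared weights, and maps a value to the reliability categories given in the text. All function names, the data layout, and the `n_categories` parameter are hypothetical.

```python
from itertools import combinations
from statistics import mean


def icc_absolute_agreement(scores):
    """Two-way ICC for absolute agreement: (single-measure, average-measure).

    `scores` is a list of rows, one per patient, each holding one score per reader.
    Undefined (division by zero) if every score in the table is identical.
    """
    n, k = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-patient mean square
    msc = ss_cols / (k - 1)               # between-reader mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square

    single = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    average = (msr - mse) / (msr + (msc - mse) / n)
    return single, average


def pairwise_single_measure_icc(scores_by_reader):
    """Single-measure ICC per reader pair; input maps reader -> patient scores."""
    result = {}
    for a, b in combinations(sorted(scores_by_reader), 2):
        paired = list(zip(scores_by_reader[a], scores_by_reader[b]))
        result[(a, b)] = icc_absolute_agreement(paired)[0]
    return result


def quadratic_weighted_kappa(a, b, n_categories):
    """Cohen's kappa with squared disagreement weights for integer scores 0..n_categories-1."""
    n = len(a)
    observed = [[0.0] * n_categories for _ in range(n_categories)]
    for x, y in zip(a, b):
        observed[x][y] += 1 / n
    pa = [sum(row) for row in observed]
    pb = [sum(observed[i][j] for i in range(n_categories)) for j in range(n_categories)]
    num = den = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            weight = (i - j) ** 2
            num += weight * observed[i][j]
            den += weight * pa[i] * pb[j]
    return 1 - num / den


def reliability_label(value):
    """Map an ICC or kappa value to the categories used in the text."""
    if value >= 0.8:
        return "very good"
    if value >= 0.6:
        return "good"
    if value >= 0.4:
        return "moderate"
    if value >= 0.2:
        return "slight"
    return "poor"
```

With seven readers, `pairwise_single_measure_icc` returns 21 values, one per reader pair; their median and IQR can then be summarised, for example with `statistics.median` and `statistics.quantiles`.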
The responsiveness for changes in CANDEN MRI spine inflammation score, CANDEN MRI spine fat score, CANDEN MRI spine bone erosion score and CANDEN MRI spine new bone formation score between baseline and week 52 was assessed based on the observed status and change scores. Change at group level was tested for statistical significance using Wilcoxon signed-rank test.
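A group-level test of this kind can be sketched as follows. This is a minimal illustration of a two-sided Wilcoxon signed-rank test using the normal approximation with tie correction and no continuity correction; the software and exact settings used in the study are not stated, so these choices are assumptions, and the function name is hypothetical.

```python
from math import erfc, sqrt


def wilcoxon_signed_rank(baseline, followup):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Zero differences are dropped and tied absolute differences share their
    average rank. Returns (W_plus, W_minus, p_value).
    """
    diffs = [f - b for b, f in zip(baseline, followup) if f != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 0.0, 1.0

    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for idx in range(i, j):
            ranks[idx] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)

    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (min(w_plus, w_minus) - mean_w) / sqrt(var_w)
    p_value = min(1.0, erfc(abs(z) / sqrt(2)))  # two-sided tail probability
    return w_plus, w_minus, p_value
```

For small samples or many ties, an exact test from an established statistics package would be preferable to this approximation.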
Results
The CANDEN MRI spine inflammation score and all its subscores decreased significantly, the CANDEN MRI spine fat score increased significantly, whereas the CANDEN MRI spine bone erosion score and the CANDEN MRI spine new bone formation score remained largely unchanged (see table 3).
The reliability of the CANDEN MRI spine inflammation score was good to very good, and the reliability of the CANDEN MRI spine fat score was moderate to very good, for status scores and change scores during active treatment after 52 weeks of follow-up (see figure 1). The reliability of the CANDEN MRI spine bone erosion score and the CANDEN MRI spine new bone formation score was slight to moderate for status scores.
All readers had a largely similar mean CANDEN MRI spine inflammation score and CANDEN MRI spine fat score, both regarding baseline score and change in score during follow-up. Most readers identified no overall change in CANDEN MRI spine bone erosion score or CANDEN MRI spine new bone formation score during follow-up. Differences between readers’ scores tended to be smaller for patients with small mean scores and larger for patients with large mean scores for all four lesion types (see online supplementary figure 1 and online supplementary table 1).
Almost all components of the CANDEN MRI spine inflammation score, that is, anterior corner, posterior corner, non-corner, anterolateral, posterolateral, transverse process, rib, facet joint and soft tissue inflammatory lesions, decreased significantly during the 52-week follow-up. Spinous process inflammation also tended to decrease, although this did not reach statistical significance (see online supplementary table 2).
Similarly, for the CANDEN MRI spine fat score, anterior corner, posterior corner, and anterolateral and posterolateral fat lesions all increased significantly during the 52-week follow-up. Non-corner and facet joint fat lesions tended to increase, although this did not reach statistical significance. None of the components of the CANDEN MRI spine bone erosion score or the CANDEN MRI spine new bone formation score changed significantly during 52-week follow-up, although anterior corner and non-corner bone erosions tended to decrease slightly, and anterior and posterior corner new bone formation tended to increase slightly (see online supplementary table 3).
Discussion
The revised CANDEN MRI spine scoring system demonstrated very good inter-reader agreement for the assessment of inflammatory lesions and moderate to very good inter-reader agreement for the assessment of fat lesions for both status and change scores in this multireader study involving seven readers. The individual components of these scores, whether from the vertebral bodies, that is, anterior corner, posterior corner and non-corner lesions, and anterolateral and posterolateral vertebral body lesions, or from the facet joints (for both inflammation and fat), or from the transverse processes, ribs, spinous processes and soft tissue (for inflammation only), showed responsiveness to change. Inflammatory lesions decreased, while fat lesions increased during treatment, although for some components with a low frequency of lesions, change over time did not reach statistical significance. The proposed four subscores of spinal inflammation, that is, the vertebral body corner inflammation subscore, the spondylodiscitis subscore, the facet joint inflammation subscore and the posterolateral elements inflammation subscore, all decreased significantly during follow-up in this cohort of patients treated with the TNF inhibitor golimumab.
The observed slight to moderate reliability for bone erosion and new bone formation status scores at baseline and their poor reliability for change over time may in part be explained by bone erosion and new bone formation being rather infrequent lesions in this cohort. The cohort consisted of biological treatment-naïve patients with axSpA, where many had only limited or no involvement of the spine, and where little structural change happened during 52 weeks of follow-up. This length of follow-up may be too short a timeframe for detecting structural changes in patients with axSpA, but we were able to identify a slight progression of new bone formation at the group level. When comparing the reliability for assessing spinal structural lesions across various studies, the proportion of patients who reached different stages in the evolution of disease should be taken into account. Higher reliability may be reached if patients were to be selected based on spinal involvement above a certain threshold at baseline, or in studies with longer follow-up since disease progression is slow. Another contributor to the slight to moderate reliability for bone erosion and new bone formation status scores is that structural lesions are often small and hard to detect on T1-weighted MRI, where cortical bone provides no MRI signal and where contrast and spatial resolution are limited compared with radiography. We have previously shown that the CANDEN MRI spine new bone formation score correlates with clinical examination (Bath Ankylosing Spondylitis Metrology Index) as well as with a radiographic score (modified Stoke Ankylosing Spondylitis Spine Score, mSASSS).7 Future studies may explore how well structural changes as assessed with MRI correlate with radiography or CT findings.
We have limited data on applying the CANDEN MRI spine scoring system to patients with severe involvement of the spine and with longer follow-up. A small longitudinal study with serial lower spine MRI in patients with AS identified progression of CANDEN MRI spine fat score at 0.4 years of follow-up and CANDEN MRI spine new bone formation score at 0.8 years of follow-up at the group level, with additional progression in CANDEN MRI spine new bone formation also noted at up to 5 years of follow-up, whereas no change in CANDEN MRI spine bone erosion score was observed.8 This suggests that the CANDEN system can be used for monitoring structural MRI lesions of the spine in patients with axSpA at the group level. Further validation on the ability of the system to capture change in erosion and new bone formation across the entire spectrum of spondyloarthritis is needed.
Fat lesions and bone erosion of the facet joints have been added to make the scoring system comprehensive. The image quality of MRI scans obtained today allows such lesions to be visualised, although in this cohort these lesion types were rare. An assessment of fat lesions and bone erosion in the facet joints also allows studies of development of ankylosis of the facet joints over time.
Lateral slices were tentatively assessed for bone erosion, but we decided not to include scoring of bone erosion on lateral slices in the scoring system. Not only in this read-out, but also in the overall experience of the group members, such lesions were considered infrequent, and it is inherently difficult or impossible to assess bone erosion and new bone formation at those locations confidently when only sagittal images are available due to partial volume artefacts. For the same reason, bone spurs and ankylosis are also scored only on central slices, not on lateral slices. It was decided to exclude non-corner erosion from the scoring system, since these lesions were also considered infrequent. Moreover, protrusion of nucleus pulposus through the vertebral endplate (Schmorl’s nodes), unrelated to spondyloarthritis, was judged to be a major confounder.
Inflammatory and fat lesions have been shown to predict radiographic progression at the same location on conventional spinal radiographs30 31; for the CANDEN MRI spine scoring system, this specifically pertains to anterior corner inflammatory lesions and anterior corner fat lesions.27–29,32 The value of the various MRI spine scores for predicting structural progression, for example, worsening in mSASSS, needs further clarification.
The CANDEN MRI spine scoring system may be used in its entirety, or parts of it may be used selectively, depending on the objectives of the study. In a randomised study with short follow-up time where the aim is to identify a between-group difference for change in spinal inflammation over time, it may be decided to only score inflammatory lesions. To keep scores comparable across studies, the terminology ‘CANDEN MRI spine inflammation score’, ‘CANDEN MRI spine fat score’, ‘CANDEN MRI spine bone erosion score’ and ‘CANDEN MRI spine new bone formation score’ should be used only when all components are scored and calculated as described in this article.
It may be considered a limitation of the scoring system that it is complex and requires thorough training of readers as well as experience with musculoskeletal MRI. Not only must the rules and definitions of the scoring system be well understood, but readers must also master the anatomy of the spine as visualised by MRI and be able to identify the various lesion types seen in patients with spondyloarthritis. The system is not designed for use in daily clinical practice but is intended for research use.
Even though the scoring system is intended to be used in patients diagnosed with spondyloarthritis, such patients may also have concomitant degeneration of intervertebral discs and endplates with related bone marrow changes. We currently propose that in trials and observational cohorts of patients diagnosed with axSpA, all lesions that fulfil the definitions outlined here should be scored. The reader should not make a judgement whether a lesion is likely to be related to spondyloarthritis or not, but should decide whether the lesion fulfils the definitions provided. This gives a higher sensitivity for capturing all spinal lesions that are present, at the cost of reduced specificity for spondyloarthritis, because lesions caused by, for example, degeneration of intervertebral discs are also scored. Other ways of handling disc degeneration may be considered depending on the research design and the purpose of the study, for example, excluding scores from DVUs that are considered to have loss of at least 50% of normally expected disc height or where disc degeneration is obvious.33 The issue of degenerative spinal changes needs to be addressed in future studies to find a consistent handling of this major limitation of any MRI spine assessment.
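One of the alternative approaches mentioned above, excluding DVUs with marked loss of disc height before summing scores, could look like the following in a sensitivity analysis. This is only a sketch: the function name, the dictionary layout and the representation of disc height loss as a fraction are assumptions, and the 0.5 default simply mirrors the 50% threshold given in the text.

```python
def exclude_degenerated_dvus(dvu_scores, disc_height_loss, threshold=0.5):
    """Sum DVU scores after dropping DVUs with disc height loss >= threshold.

    dvu_scores: dict mapping a DVU label to its lesion score.
    disc_height_loss: dict mapping a DVU label to the fraction (0..1) of
        normally expected disc height that is lost; missing labels count as 0.
    Returns (sum of retained scores, sorted list of excluded DVU labels).
    """
    kept = {
        dvu: score
        for dvu, score in dvu_scores.items()
        if disc_height_loss.get(dvu, 0.0) < threshold
    }
    excluded = sorted(set(dvu_scores) - set(kept))
    return sum(kept.values()), excluded
```

Returning the excluded labels alongside the sum makes it straightforward to report how many DVUs were removed per patient.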
It is possible that some therapeutic interventions may vary in their impact on different components, for example, entheseal inflammation in the vertebral bodies versus synovial joint inflammation in the facet joints. Using this scoring system, it is possible to calculate four inflammation subscores that may predominantly represent inflammation of different spinal structures, and which may respond differently to various treatments and may have different prognostic significance.
In conclusion, a comprehensive system for the evaluation of inflammation, fat, bone erosion and new bone formation of the spine in patients with axSpA has been updated and validated. The system is designed for assessment of the individual types of MRI lesions and for acquiring total scores for the different types of lesions (inflammation, fat, erosion and new bone formation). Moderate to very good reliability for the assessment of inflammatory lesions and fat lesions for both status scores and change scores and slight to moderate reliability for bone erosion and new bone formation status scores were demonstrated. The scoring system may be used to investigate how drugs with different modes of action influence the individual components of MRI spine inflammation and damage, the mutual relationship between lesions, as well as the evolution of inflammation and damage in the entire spine.
Acknowledgments
The authors would like to acknowledge senior radiographer Jakob Møller and Kasper Gosvig MD PhD, Department of Radiology, Herlev and Gentofte Hospital; and MANGO trial investigators Inge Juul Sørensen MD PhD, Rigshospitalet Glostrup, Bente Jensen MD, Frederiksberg Hospital, Ole Rintek Madsen MD PhD DMSc, Herlev and Gentofte Hospital, and Mette Klarlund MD PhD, Nordsjællands Hospital. Thanks to CaRE Arthritis (www.carearthritis.com) for development and use of the web-based scoring interface and DICOM viewer and hosting the online meetings.
Footnotes
Contributors The decision to revise the Canada-Denmark MRI spine scoring system and to perform a multireader reliability scoring exercise was made by all authors. SJP and MØ acquired the images. All authors revised the slideshow files that contain definitions and reference images during the six online meetings. All authors read the MRI scans. SK performed the statistical analyses and drafted the manuscript. All authors approved the final version of the reference image set and the manuscript.
Funding Images for this analysis were provided from an investigator-initiated trial (MANGO) sponsored by MSD. MSD had no role in the study design or in the collection, analysis or interpretation of the data, the writing of the manuscript, or the decision to submit the manuscript for publication, and publication of this article was not contingent upon approval by MSD. SK received grants from The Danish Rheumatism Association and Rigshospitalet.
Competing interests MØ has received research support and/or consultancy/speaker fees from AbbVie, BMS, Boehringer Ingelheim, Celgene, Eli Lilly, Centocor, GSK, Hospira, Janssen, Merck, Mundipharma, Novartis, Novo, Orion, Pfizer, Regeneron, Roche, Schering-Plough, Takeda, UCB and Wyeth. SJP has received speaker fees from MSD, Pfizer, AbbVie, Novartis and UCB, has been an advisory board member for AbbVie and Novartis, and has received research support from AbbVie, MSD and Novartis. UW received speaking fees from AbbVie for serving as convenor for workshops on imaging in SpA. WM has received research support and/or consultancy/speaker fees from AbbVie, Boehringer, Celgene, Eli Lilly, Janssen, Novartis, Pfizer and UCB. RGWL has received consultancy fees from Parexel. The remaining authors declare that they have no competing interests.
Patient consent for publication Not required.
Ethics approval The study was approved by the Regional Committee on Health Research Ethics, Capital Region of Denmark, approval number: H1-2013-118. All patients provided written consent.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement Data are available upon reasonable request.