Original research
Sex-specific diagnostic efficacy of MRI in axial spondyloarthritis: challenging the ‘One Size Fits All’ notion
  1. Sevtap Tugce Ulas1,2,
  2. Fabian Proft3,
  3. Torsten Diekhoff1,
  4. Valeria Rios3,
  5. Judith Rademacher2,3,
  6. Mikhail Protopopov3,
  7. Juliane Greese1,
  8. Iris Eshed4,5,
  9. Lisa C Adams6,7,
  10. Kay Geert A Hermann1,
  11. Sarah Ohrndorf8,
  12. Denis Poddubnyy3 and
  13. Katharina Ziegeler1
  1. 1Department of Radiology, Charité Universitätsmedizin Berlin, Berlin, Germany
  2. 2Berlin Institute of Health, Berlin, Germany
  3. 3Department of Gastroenterology, Infectiology and Rheumatology (including Nutrition Medicine), Charité Universitätsmedizin Berlin, Berlin, Germany
  4. 4Diagnostic Imaging, Sheba Medical Center, Tel Hashomer, Israel
  5. 5Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
  6. 6Department of Radiology, Technische Universität München, Munich, Germany
  7. 7Department of Radiology, Stanford University School of Medicine, Stanford, California, USA
  8. 8Department of Rheumatology and Clinical Immunology, Charité Universitätsmedizin Berlin, Berlin, Germany
  1. Correspondence to Dr Katharina Ziegeler; katharina.ziegeler{at}


Objectives Sex-specific differences in the presentation of axial spondyloarthritis (axSpA) may contribute to a diagnostic delay in women. The aim of this study was to investigate the diagnostic performance of MRI findings comparing men and women.

Methods Patients with back pain from six different prospective cohorts (n=1194) were screened for inclusion in this post hoc analysis. Two blinded readers scored the MRI data sets independently for the presence of ankylosis, erosion, sclerosis, fat metaplasia and bone marrow oedema. χ2 tests were performed to compare lesion frequencies. Contingency tables were used to calculate markers for diagnostic performance, with clinical diagnosis as the standard of reference. The positive and negative likelihood ratios (LR+/LR–) were used to calculate the diagnostic OR (DOR) to assess the diagnostic performance.

Results After application of exclusion criteria, 526 patients (379 axSpA (136 women and 243 men) and 147 controls with chronic low back pain) were included. No major sex-specific differences in the diagnostic performance were shown for bone marrow oedema (DOR m: 3.0; f: 3.9). Fat metaplasia showed a better diagnostic performance in men (DOR 37.9) than in women (DOR 5.0). Lower specificity was seen in women for erosions (77% vs 87%), sclerosis (44% vs 66%) and fat metaplasia (87% vs 96%).

Conclusion The diagnostic performance of structural MRI markers is substantially lower in female patients with axSpA; active inflammatory lesions show comparable performance in both sexes, while still overall inferior to structural markers. This leads to a comparably higher risk of false positive findings in women.

  • magnetic resonance imaging
  • low back pain
  • spondylitis, ankylosing

Data availability statement

Data are available upon reasonable request from the corresponding author.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


  • Axial spondyloarthritis has a larger diagnostic delay in women than in men.

  • Female patients with chronic low back pain are more likely to suffer from degenerative or mechanical stress-induced disease of the sacroiliac joints than men, which may be connected to significant sex differences in joint biomechanics.


  • Imaging appearance of axial spondyloarthritis differs between the sexes, with more ankylosis and fat metaplasia in men and more sclerosis in women; however, no sex-specific differences were shown for bone marrow oedema in axial spondyloarthritis patients.

  • Diagnostic performance of established imaging markers on MRI is substantially lower in women. Notably, ankylosis, which is generally viewed as the most specific imaging marker for axial spondyloarthritis, carries a significant risk of false positives in female patients.

  • Exclusion of lesions from the ventral joint third, that is, the mechanical load zone, increases diagnostic performance.


  • Based on these findings, future revisions of imaging criteria may consider sex-specific recommendations, improving diagnostic accuracy for male and female patients.


Axial spondyloarthritis (axSpA) is a chronic inflammatory disease of the axial skeleton that has historically been regarded as a predominantly male disease,1 but is more recently recognised as affecting both sexes to a similar degree.2–4 However, disease presentation may differ considerably between men and women, with men facing a higher risk of structural damage whereas women are more susceptible to peripheral manifestations.3 Furthermore, previous studies showed that female patients with axSpA report perception of pain, stiffness, fatigue and loss of mobility at higher rates than their male counterparts.3 5 6 Concomitant diagnoses such as depression and fibromyalgia are also seen more commonly in women.7 In women, chronic low back pain (LBP), a clinical hallmark of axSpA, is more closely associated with degenerative or mechanical stress-induced disease of the sacroiliac joint (SIJ). The female SIJ is exposed to more mechanical strains,8 especially during pregnancy and childbirth.9 Therefore, joint lesions detected in the diagnostic process are more commonly attributed to degenerative or mechanical joint disease, for example, osteitis condensans ilii10 or axial osteoarthritis. In addition, the female SIJ is comparatively more prone to exhibit variations in anatomical form, which has recently been linked to both degenerative and inflammatory lesions on imaging.11

These different factors may contribute to underdiagnosis of axSpA in women clinically, although imaging findings are more likely to lead to overdiagnosis.2 Though crucial for the diagnosis of sacroiliitis, bone marrow oedema (BME) has limited specificity, particularly in women, mainly since mechanical overload, such as intensive sporting activities and pregnancy, and recent labour, can lead to periarticular sacroiliac BME persisting for several years.12 13 For this reason, BME alone, especially in the ventral joint portion,10 14 does not appear to be well suited to differentiate early axSpA, especially non-radiographic axSpA, from non-axSpA findings,15 and should generally not be interpreted without regard to structural lesions, especially erosions.16 In addition, erosions are relatively common even in healthy elderly individuals.17

All these factors may contribute to a longer diagnostic delay for axSpA in women,18 19 resulting in greater impairment of quality of life and physical function.20 A potential tool to overcome diagnostic delay and misdiagnosis is sex-disaggregation of medical data, which is practised in a variety of medical specialties, but has only recently gained interest within the rheumatological community2 and is controversial.21 Since imaging plays a vital role in the diagnostic process of axSpA, the establishment of a sex-specific imaging phenotype is a promising step towards overcoming sex-disparities. The purpose of this analysis was to investigate the differences in diagnostic performance of MRI findings in men and women.

Materials and methods


This study was designed as a post hoc analysis of six different prospective cohorts: the GErman Spondyloarthritis Inception Cohort (GESPIC), with its three arms (ankylosing spondylitis (AS), Crohn, Uveitis),22–25 the Optimal Referral Strategy for Early Diagnosis of Axial Spondyloarthritis (OptiRef) study,26 the SacroIliac joint MRI and CT (SIMACT) study,27 and the Virtual Non-Calcium—Susceptibility Weighted Imaging (VNC-SWI) study.28 In OptiRef, SIMACT and VNC-SWI, patients with chronic LBP and suspected axSpA were included. GESPIC-AS enrolled patients with active radiographic axSpA before initiation of a biological disease-modifying anti-rheumatic drug, while the other two arms enrolled patients with either Crohn’s disease or acute anterior uveitis with or without SpA. The patient cohort was divided into an axSpA group (based on the diagnosis made by the rheumatologist) and a control group, which included patients without axSpA. A more comprehensive description of the respective cohorts can be obtained from the cited references; for the purpose of this investigation, only baseline data were used. Patients with missing or incomplete imaging (no available T1-weighted and short-tau inversion recovery (STIR) sequences or images containing artefacts), missing clinical data (eg, missing patient characteristics such as age and sex) or no data regarding back pain were excluded (see figure 1).

Figure 1

Patient inclusion and clinical characteristics. Significantly higher frequencies or mean values in comparisons between female and male subjects of the same group, as detected by unpaired t-tests or χ2 tests are printed in bold and marked with an asterisk (*). axSpA, axial spondyloarthritis; BASDAI, Bath Ankylosing Spondylitis Disease Activity Index; CRP, C-reactive protein; GESPIC, GErman Spondyloarthritis Inception Cohort; HLA, human leucocyte antigen; LBP, low back pain; OptiRef, Optimal Referral Strategy for Early Diagnosis of Axial Spondyloarthritis; SIMACT, SacroIliac joint MRI and CT; VNC-SWI, virtual Non-Calcium—Susceptibility Weighted Imaging.

Patient and public involvement

There was no specific patient or public involvement in this investigation.

Scoring system and lesion definition

After pseudonymisation of the oblique-coronal T1-weighted and STIR sequence imaging data sets, all images were analysed using a structured scoring system, which was a simplified version of those used in previous studies.28 First, each side was evaluated for ankylosis (none/partial/complete), resulting in a total sum score of 0–4; only changes in the cartilaginous portion of the joint were graded in this category, not extra-articular bridging osteophytes. Second, the SIJ was divided into three portions: ventral, middle and dorsal third. The ventral third of the SIJ was defined as the location at which the true pelvis was in the centre of the MR image. The visualisation of the sacral foramina was defined as the middle third of the SIJ. The dorsal third was characterised by the depiction of the sacral nerve roots. In each portion, the iliac and sacral parts on both sides were assessed separately for presence of erosion, sclerosis, fat metaplasia and BME with a scoring range of 0–1 for each parameter, resulting in a total sum score of 0–12 for each parameter. Both readers were trained and calibrated on a set of test cases prior to the scoring. All images were read by two residents in diagnostic imaging (STU/KZ with 2/6 years of experience in musculoskeletal (MSK) imaging, respectively) who were blinded to all clinical data. Cases of disagreement, as well as cases of ankylosis in controls, were resolved in a consensus reading under the supervision of an expert MSK radiologist with 12 years of experience (TD).
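
The sum scores described above can be made concrete with a short sketch. This is an illustrative Python example, not the authors' scoring software: the function names and the tuple layout are hypothetical, and the mapping of none/partial/complete ankylosis to 0/1/2 points per side is an assumption that is merely consistent with the stated 0–4 range.

```python
from itertools import product

SIDES = ("left", "right")
PORTIONS = ("ventral", "middle", "dorsal")
BONES = ("iliac", "sacral")

# Assumed point values per side; the text only states the 0-4 total range.
ANKYLOSIS_POINTS = {"none": 0, "partial": 1, "complete": 2}


def lesion_sum_score(findings, lesion, portions=PORTIONS):
    """Count the locations at which `lesion` was scored as present.

    `findings` is a set of (lesion, side, portion, bone) tuples. With all
    three portions this yields 0-12 (2 sides x 3 portions x 2 bones).
    """
    return sum(
        (lesion, side, portion, bone) in findings
        for side, portion, bone in product(SIDES, portions, BONES)
    )


def ankylosis_sum_score(grades):
    """Sum the per-side ankylosis grades, giving a total of 0-4."""
    return sum(ANKYLOSIS_POINTS[grades[side]] for side in SIDES)


# Hypothetical patient: two erosions and one focus of bone marrow oedema.
findings = {
    ("erosion", "left", "ventral", "iliac"),
    ("erosion", "right", "middle", "sacral"),
    ("bme", "left", "dorsal", "sacral"),
}
print(lesion_sum_score(findings, "erosion"))
print(lesion_sum_score(findings, "erosion", portions=("middle", "dorsal")))
print(ankylosis_sum_score({"left": "partial", "right": "complete"}))
```

Passing only the middle and dorsal portions restricts the range to 0–8, which corresponds to an analysis that leaves out the ventral joint third.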

Statistical analysis

Comparisons of lesion frequencies per location and group were performed with χ2 tests; sum scores were compared using unpaired t-tests. Indicators of diagnostic accuracy for different lesions and their combinations were computed from cross-tabulations: sensitivity (SE), specificity (SP), positive and negative predictive values and likelihood ratios (LR+/LR−). In addition, the diagnostic OR (DOR) was calculated, which is simply LR+ divided by LR–.29 A good LR+ was defined as >10 and a good LR– as <0.1,30 and a DOR of ≥10 was considered to describe a strong test.31 All analyses were carried out using SPSS V.27 with a two-tailed significance level of alpha=0.05. To avoid inflation of the alpha-error, significance levels for comparisons of lesion frequencies per region were adjusted for multiple comparisons with a Bonferroni correction (n=12), resulting in an adjusted significance level of alpha=0.004 for these analyses.
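
The accuracy indicators named above all follow from a single 2×2 cross-tabulation, which the following Python sketch illustrates. It is not the SPSS procedure used in the study; the function and class names are hypothetical. The example counts are an assumption: they were reconstructed from the percentages reported for fat metaplasia in women (42.6% of 136 women with axSpA, specificity of 87% among 92 female controls) and are not taken from the study tables.

```python
import math
from dataclasses import dataclass


@dataclass
class DiagnosticMetrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    lr_pos: float
    lr_neg: float
    dor: float


def _ratio(num, den):
    """Divide, returning infinity (or NaN for 0/0) instead of raising."""
    if den == 0:
        return math.nan if num == 0 else math.inf
    return num / den


def diagnostic_metrics(tp, fp, fn, tn):
    """Derive accuracy indicators from the four cells of a 2x2 table."""
    se = _ratio(tp, tp + fn)
    sp = _ratio(tn, tn + fp)
    lr_pos = _ratio(se, 1 - sp)   # LR+ = SE / (1 - SP)
    lr_neg = _ratio(1 - se, sp)   # LR- = (1 - SE) / SP
    return DiagnosticMetrics(
        sensitivity=se,
        specificity=sp,
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
        lr_pos=lr_pos,
        lr_neg=lr_neg,
        dor=_ratio(lr_pos, lr_neg),  # DOR = LR+ / LR-
    )


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons


# Reconstructed (assumed) counts for fat metaplasia in women.
women_fat = diagnostic_metrics(tp=58, fp=12, fn=78, tn=80)
print(f"SE={women_fat.sensitivity:.3f} SP={women_fat.specificity:.3f} "
      f"LR+={women_fat.lr_pos:.2f} LR-={women_fat.lr_neg:.2f} "
      f"DOR={women_fat.dor:.1f}")
print(f"adjusted alpha={bonferroni_alpha(0.05, 12):.4f}")
```

With these reconstructed counts the DOR rounds to 5.0, in line with the value reported for women, and the Bonferroni-adjusted level for 12 comparisons rounds to 0.004.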



A total of 1194 patients were evaluated, and 526 patients were included in the analysis after applying exclusion criteria (missing clinical data, incomplete imaging, no evident LBP). Of these, 379 (72.1%; 136 women and 243 men with a mean age of 37.6±11.4) were clinically diagnosed with axSpA (202 radiographic axSpA, 63 women; 177 non-radiographic axSpA, 98 women) by expert rheumatologists. Figure 1 shows patient inclusion and clinical characteristics. In the control group (n=147; 92 women and 55 men with a mean age of 37.6±11.4), 96 patients were diagnosed with mechanical disease of the SIJ (including osteitis condensans and diffuse idiopathic skeletal hyperostosis) or degenerative spinal disease and 51 patients were classified as having non-specific back pain.

Distribution and extent of lesions

A summary of lesion frequencies and extent of findings expressed as patient-level sum scores is given in table 1. No major sex-specific differences in the distribution of BME and erosions were found. A significantly higher prevalence of fat metaplasia (58.8% vs 42.6%; p=0.003) and ankylosis (24.3% vs 7.4%; p<0.001) was shown in male patients with axSpA. Of the four control patients with ankylosis, three were diagnosed with diffuse idiopathic skeletal hyperostosis (DISH) and one with non-specific back pain (an imaging example is given as online supplemental file 3). Sclerosis was generally more common in women, both in the axSpA (75.0% vs 57.6%; p<0.001) and in the control group (56.5% vs 34.5%; p=0.011). The spatial distribution of these lesions in patients with axSpA, compared between female and male patients, is shown in figure 2. The figure shows that the excess sclerosis in women is found in the ventral and middle iliac joint portions, while fat metaplasia in men is found in the ventral and middle sacral bone marrow. Data for controls are shown in figure 3; this analysis yielded no significant difference in spatial distribution of lesions between the sexes.
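
A frequency comparison of this kind can be sketched as a Pearson χ2 test on a 2×2 table. The Python example below is illustrative only and uses the standard library rather than SPSS; the function name is hypothetical and no continuity correction is applied. The counts are an assumption: they were reconstructed from the reported percentages for fat metaplasia (58.8% of 243 men and 42.6% of 136 women with axSpA), so the result need not reproduce the reported p value exactly.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom, without
    continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom: P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: men, women; columns: fat metaplasia present, absent (assumed counts).
stat, p = chi2_2x2(143, 100, 58, 78)
print(f"chi2={stat:.2f}, p={p:.4f}, significant at 0.05: {p < 0.05}")
```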

Table 1

Distribution of lesions (patient level)

Figure 2

Distribution of lesions among patients with axSpA. Relative (%) lesion frequencies, given as comparisons between female patients with axSpA (n=136) and male patients with axSpA (n=243). Asterisk (*) denotes significantly (p<0.004) higher proportions in comparison to the other sex, p values were derived from χ2 tests. axSpA, axial spondyloarthritis.

Figure 3

Distribution of lesions among controls. Relative (%) lesion frequencies, given as comparisons between female (n=92) and male (n=55) controls. Asterisk (*) denotes significantly (p<0.004) higher proportions in comparison to the other sex, p values were derived from χ2 tests.

Diagnostic performance

Diagnostic performance expressed as likelihood ratios is given in table 2, while a comprehensive compilation of single and multiple parameter accuracy is provided as online supplemental file 1. BME performed slightly better in women, with a DOR of 3.9 versus 3.0. However, the largest difference in single parameter performance was found for fat metaplasia, which had a far higher DOR in men (37.9) than in women (5.0). In addition to fat metaplasia, erosion and sclerosis performed at least slightly better in men (DOR 15.1 vs 7.8 and 2.6 vs 2.3, respectively). All markers performed better when only the middle and dorsal joint portions were assessed, except for ankylosis, which was assessed per joint. The strongest diagnostic performance of parameter combinations was found for (partial) ankylosis and erosions of the middle and dorsal joint portions with a DOR of 10.9 for women and 28.6 for men. Inclusion of further imaging markers resulted in marked decreases of LR+ without sufficient improvements of LR–, leading to an overall weaker diagnostic performance. Division of the study population not just by gender but also by disease duration yielded insufficient sample sizes, so that detailed analysis of this aspect was not undertaken; however, SE, SP and DOR of ankylosis, erosion, sclerosis, fat metaplasia and BME in different disease duration groups are given as online supplemental file 2.

Table 2

Diagnostic performance


This is the first large-scale analysis to investigate the sex-specific diagnostic performance of MR imaging in axSpA. While we found differences in the imaging appearance and diagnostic performance of individual imaging markers, we did not find different optimal combinations of imaging parameters in MRI for men and women.

Contrary to expectations, no major sex-specific differences were found in the distribution of erosions and of BME. The greatest differences were found for fat metaplasia and ankylosis. Ankylosis is commonly considered a highly specific imaging marker in axSpA; while our data confirm this in men, we found a slightly more limited diagnostic value for ankylosis in women, with 3 out of 13 women with ankylosis not suffering from axSpA versus 1 out of 55 men, although the overall low rates of ankylosis in women should be taken into account. As controls with ankylosis were likely to suffer from DISH, we believe more focus should be given to this differential diagnosis in cases with partial ankylosis without erosions.32 In our analysis, exclusion of the ventral joint third resulted in improved diagnostic performance of imaging markers for both men and women. The findings are in line with those reported for CT33 and previously for MRI,14 and are best explained by the fact that the ventral joint portions are prone to degenerative lesions as they constitute the mechanical load zone of the SIJ.34 In a previous study, sex-specific differences in the extent of BME were shown in the general population depending on HLA-B27 positivity.15 In contrast to these results, we showed no major differences in BME between male and female patients with axSpA or between the sexes in the control group. This lack of a difference in the distribution of BME between the sexes reaffirms the well-known importance of this specific active lesion as a diagnostic tool for axSpA.
It is assumed that structural damage and severe radiological progression are more common in male than in female patients with axSpA,3 who exhibit a slower radiographic progression, which may explain the relatively greater number of women diagnosed with non-radiographic axSpA as well as the longer delay in diagnosis.35 Adding to this deficit in SE, we also found a comparatively lower SP for ankylosis, rendering this imaging marker less suited in female patients with axSpA than their male counterparts. Notably, even the most favourable combination of parameters in our cohort did not yield sufficient diagnostic accuracy to confirm (LR+ f: 4.3 or m: 7.1) or rule out axSpA (LR– f: 0.4 or m: 0.2) based on MRI, while a recent investigation of our group demonstrated that CT could indeed be used as a confirmatory test with a good LR+ of 18.3.33
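
The reasoning in the preceding paragraph can be checked against the thresholds given in the statistical analysis (a good LR+ is >10, a good LR– is <0.1). The following Python sketch is illustrative; the function name and the returned dictionary keys are hypothetical, and only the likelihood ratios quoted in the text are used.

```python
def interpret_likelihood_ratios(lr_pos, lr_neg, good_lr_pos=10.0, good_lr_neg=0.1):
    """Flag whether a test is strong enough to confirm or rule out disease.

    The default thresholds are those stated in the statistical analysis.
    """
    return {
        "confirms": lr_pos > good_lr_pos,
        "rules_out": lr_neg < good_lr_neg,
    }


# Likelihood ratios reported for the most favourable MRI parameter combination.
reported = {"women": (4.3, 0.4), "men": (7.1, 0.2)}
for group, (lr_pos, lr_neg) in reported.items():
    print(group, interpret_likelihood_ratios(lr_pos, lr_neg))
```

For both sexes neither threshold is met, whereas an LR+ of 18.3, as quoted for CT, would exceed the confirmatory threshold.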

Evidence is accumulating that the clinical presentation of axSpA differs between the sexes.3–5 Furthermore, women show greater functional impairment as well as worse patient-reported outcomes and lower response to therapy compared with men; early diagnosis, and thus targeted initiation of adequate therapy, is crucial to prevent disease progression and increase quality of life.4 However, accurate sex-specific data from clinical trials are still lacking.21 Different sex-specific phenotypes of axSpA appear to exist. This highlights not only the need to better identify the signs and symptoms of axSpA according to sex but also to establish sex-specific classification criteria and align them to the corresponding profiles.

These results need to be interpreted with caution, for different reasons. First, only conventional T1-weighted spin-echo sequences were used, which have well-established limitations regarding the assessment of structural lesions.27 The MR images were compiled from different study cohorts using different imaging protocols. This may have had an impact on the assessment of the lesions. The use of high-resolution gradient-echo sequences would have significantly increased the diagnostic accuracy of structural lesions. However, the readers' agreement was moderate to excellent for the joint lesions.36 Furthermore, backfill and enthesitis were not considered in the evaluation, potentially limiting the scope of this investigation. We also did not use a more detailed scoring system, such as the Spondyloarthritis Research Consortium of Canada scores or the Berlin 24-regions method, which might have been helpful in eliciting more subtle differences in the spatial distribution of lesions, especially for BME. Further limitations include a very heterogeneous control population, which is neither perfectly representative of the normal population, nor of the patients that report to a rheumatologist with suspicion of axSpA. Our control patients exhibited higher rates of HLA-B27 positive subjects than expected (37% of the women and 47% of the men) as well as an uneven gender distribution, which may have introduced some bias to the imaging data and reduces the generalisability of our findings.15 Despite being comprised of different larger patient cohorts, the sample size was too small to study gender disparities with regard to disease duration. Most importantly, the MRI images under investigation were also used in the diagnostic process, which carries a risk of circular reasoning bias.

In conclusion, our study provides evidence of relevant sex differences in the diagnostic performance of MR imaging in axSpA. Further data, including high-resolution gradient-echo MRI or dedicated CT, are needed to overcome gender inequity in diagnostic imaging in axSpA.

Ethics statements

Patient consent for publication

Ethics approval

All patients gave written informed consent before enrolment. All investigations were approved by the institutional ethics review board prior to study commencement (EA4/161/15; EA4/170/16; EA1/0886/16; EA1/073/10; NCT 01277419). There was no specific patient or public involvement in this investigation. All data and materials presented in this study are available on request from the corresponding author.


This research project was funded by the Assessment of Spondyloarthritis international Society (ASAS) (research grant for KZ). The authors thank the Berlin Institute of Health for personal funding (STU, JR, LCA and TD) and providing essential infrastructure for data collection.



  • STU and FP are joint first authors.

  • DP and KZ are joint senior authors.

  • Twitter @ProftDr, @mprotopopov

  • Contributors STU: design of scoring system, image scoring, data evaluation, article draft, critical revision of the manuscript for important intellectual content. FP: patient acquisition, data collection, data evaluation, article draft, critical revision of the manuscript for important intellectual content. TD: image scoring, data evaluation, critical revision of the manuscript for important intellectual content. VR and JR: patient acquisition, data collection, critical revision of the manuscript for important intellectual content. MP and SO: patient acquisition, critical revision of the manuscript for important intellectual content. JG, IE, LCA and K-GAH: data collection, critical revision of the manuscript for important intellectual content. DP: patient acquisition, data evaluation, critical revision of the manuscript for important intellectual content. KZ: conception and design of the study, design of scoring system, image scoring, data evaluation, statistical calculations, article draft, critical revision of the manuscript for important intellectual content, acts as the guarantor of this study.

  • Funding The funding sources were not involved in study design, in the collection, analysis and interpretation of data, in the writing of the report or in the decision to submit the paper for publication.

  • Competing interests STU is participant in the BIH-Charité Junior Digital Clinician Scientist Program funded by the Charité – Universitätsmedizin Berlin and the Berlin Institute of Health. FP reports grants and personal fees from Novartis, Lilly and UCB, as well as personal fees from AbbVie, AMGEN, BMS, Celgene, Hexal, Janssen, MSD, Pfizer and Roche. TD reports personal fees from Novartis, Lilly, MSD and Canon MS. JR and LCA are participants in the BIH-Charité Clinician Scientist Program funded by the Charité – Universitätsmedizin Berlin and the Berlin Institute of Health. MP reports personal fees from Novartis. K-GAH reports personal fees from AbbVie, MSD, Pfizer and Novartis, he is also the co-founder of BerlinFlame GmbH. DP reports grants and personal fees from AbbVie, Eli Lilly, MSD, Novartis, Pfizer, and personal fees from Biocad, Gilead, GlaxoSmithKline, Janssen, MSD, Moonlake, Novartis, Pfizer, Samsung Bioepis and UCB. KZ reports funding (research grant) from the Assessment of Spondyloarthritis international Society (ASAS) during the conduct of this study. All other authors have no funding to report.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.