Original research
Diagnostic accuracy in axial spondyloarthritis: a systematic evaluation of the role of clinical information in the interpretation of sacroiliac joint imaging
  1. Tim Pohlner1,
  2. Dominik Deppe2,
  3. Katharina Ziegeler2,
  4. Fabian Proft1,
  5. Mikhail Protopopov1,
  6. Judith Rademacher1,
  7. Valeria Rios Rodriguez1,
  8. Murat Torgutalp1,
  9. Jürgen Braun3,4,
  10. Torsten Diekhoff2 and
  11. Denis Poddubnyy1,5
  1. 1Department of Gastroenterology, Infectiology and Rheumatology (including Nutrition Medicine), Charité Universitätsmedizin Berlin, Berlin, Germany
  2. 2Department of Radiology, Charité Universitätsmedizin Berlin, Berlin, Germany
  3. 3Ruhr University Bochum, Bochum, Germany
  4. 4RVZ Steglitz, Berlin, Germany
  5. 5Epidemiology Unit, DRFZ, Berlin, Germany
  1. Correspondence to Dr Denis Poddubnyy; denis.poddubnyy{at}


Objectives Radiography and MRI of the sacroiliac joints (SIJ) are relevant for the diagnosis and classification of patients with axial spondyloarthritis (axSpA). This study aimed to evaluate the impact of clinical information (CI) on the accuracy of imaging interpretation.

Methods Out of 109 patients referred because of suspicion of axSpA with complete imaging sets (radiographs and MRI of SIJ), 61 were diagnosed with axSpA (56%). Images were independently evaluated by three radiologists in four consecutive reading campaigns: radiographs and radiographs+MRI without and with CI including demographic data, SpA features, physical activity and pregnancy. Radiographs were scored according to the modified New York criteria, and MRIs for inflammatory and structural changes compatible with axSpA (yes/no). The clinical diagnosis was taken as reference standard. The compatibility of imaging findings with a diagnosis of axSpA (precision) before and after the provision of CI and radiologists’ confidence with their findings (0–10) were evaluated.

Results The precision of radiograph evaluation without versus with CI increased from 70% to 78% (p=0.008), and for radiographs+MRI from 81% to 82% (p=1.0). For radiographs alone, the sensitivity and specificity of radiologic findings were 51% and 94% without and 60% and 100% with CI, while, for radiographs+MRI, they were 74% and 90% vs 71% and 98%, respectively. The diagnostic confidence of radiologists increased from 5.2±1.9 to 6.0±1.7 with CI for radiographs, and from 6.7±1.6 to 7.2±1.6 for radiographs+MRI.

Conclusion The precision, specificity and diagnostic confidence of radiologic evaluation increased when CI was provided.

  • magnetic resonance imaging
  • inflammation
  • spondyloarthritis

Data availability statement

Data are available on reasonable request. The original study data may be made available on a reasonable request, which should be directed to the corresponding author.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:



  • Imaging (radiography and MRI) is an important component of the diagnostic approach in suspected axial spondyloarthritis (axSpA), but the evidence on the impact of clinical information (CI) on the imaging interpretation in the axSpA context was lacking.


  • This study demonstrates that including structured CI in sacroiliac joint imaging interpretation enhances the precision, specificity and radiologists’ diagnostic confidence.


  • The findings advocate for the integration of structured CI into the radiologic assessment process in axSpA to prevent misdiagnosis and misclassification. They emphasise the importance of effective communication between rheumatologists and radiologists to improve patient outcomes.


Axial spondyloarthritis (axSpA) is a chronic inflammatory immune-mediated disease that belongs to the spectrum of spondyloarthritides (SpA). AxSpA predominantly affects the axial skeleton by inflammation and structural changes.1 2 Patients suffer from pain, stiffness, restricted mobility and functional deficits. The most typical symptom—inflammatory back pain (IBP)—is mostly caused by inflammation in the sacroiliac joints (SIJ) and/or in the spine. Extraspinal manifestations (peripheral arthritis, enthesitis, dactylitis) and extramusculoskeletal manifestations such as anterior uveitis, psoriasis and inflammatory bowel disease (IBD) may occur.1 2 AxSpA is strongly associated with HLA-B27.3

The Assessment of SpondyloArthritis international Society (ASAS) classification criteria4 and the modified New York criteria (mNYC)5 are currently used for classification of patients with axSpA. Using radiographs, structural changes in the SIJ (‘radiographic sacroiliitis’) can be detected, but the reliability of scoring radiographs of SIJ was shown to be limited.6 In contrast, MRI can visualise not only active inflammation in the form of bone marrow oedema (BME) but also structural changes in the SIJ such as erosions, backfill, fat metaplasia and ankylosis.7–9 Importantly, contrast agents are not needed to achieve that.10 The definition of MRI changes compatible with axSpA has been recently updated by ASAS for both the SIJ and the spine, for classification purposes.11 12 Thus, MRI plays a major role in the diagnosis and classification of axSpA, also because of its ability to detect inflammation in early disease stages, when no structural changes are depicted by radiographs.

A considerable delay in diagnosing patients with axSpA has recently been confirmed.13 An early and correct diagnosis of axSpA is critical in the era of potent anti-inflammatory therapies and treat-to-target strategies. Precise imaging evaluation plays a crucial role in establishing a correct and early diagnosis of axSpA.

In this context, a question across medical specialties nowadays is whether appropriate clinical information (CI) about patients undergoing diagnostic imaging has an impact on radiologists’ interpretation of images. Although most authors found more arguments in favour of CI,14 15 the evidence in the field of axSpA is still lacking.

Recently, an ASAS expert group has developed international recommendations regarding the content of CI that should be provided to the radiologist and how the radiologist should describe the radiological findings of patients with suspected axSpA undergoing imaging.16 The recommended information to be transferred includes the history and characteristics of back pain, HLA-B27 status, SpA parameters, physically demanding job and the level of physical activity and the history of pregnancies.

The main purpose of this study was to evaluate the impact of the predefined CI on the diagnostic accuracy of the interpretation of imaging findings (radiography and MRI of SIJ) in the context of diagnosing axSpA.


Patient selection and imaging evaluation

All clinical and imaging data used in this work were collected in the OptiRef study17 that included:

  1. A total of 180 patients presenting to a rheumatologist using a self-referral tool.

  2. A total of 181 patients referred by orthopaedic surgeons and general practitioners using the ‘Berlin referral tool’.18

  3. A total of 92 patients referred by general practitioners, orthopedists and other physicians with suspected axSpA without the application of a specific referral approach.

All included patients had undergone structured medical and physical examinations by rheumatologists. Demographic data were collected, together with the duration and onset of back pain, the presence of IBP, history and/or presence of arthritis, enthesitis, dactylitis and extraarticular SpA manifestations (anterior uveitis, psoriasis, IBD), response to non-steroidal anti-inflammatory drugs (NSAIDs) and family history of SpA. In addition, C reactive protein (CRP) level and HLA-B27 status were determined, and radiography and MRI of the SIJ were performed if clinically justified. Furthermore, information on occupation, sport activity and the history of pregnancies and deliveries was collected. A diagnosis of axSpA or no axSpA was made by the rheumatologists and used as the reference standard.

For the current study, we selected a total of 109 patients based on the availability of relevant CI and both imaging modalities (radiographs and MRI of the SIJ). For MRI, T1-weighted and short tau inversion recovery (STIR) sequences in the oblique coronal plane were required.

All images were evaluated by three musculoskeletal radiologists (readers): reader 1 had 6 years, reader 2 had 3 years and reader 3 had 12 years of experience. Readers were blinded to the rheumatologists’ diagnoses.

All 109 cases were separately evaluated in 4 consecutive rounds: (1) radiographs of SIJ without CI, (2) radiographs of SIJ with CI, (3) radiographs and MRI of SIJ without CI and (4) radiographs and MRI of SIJ with CI.

The following CI was provided: age, sex, height and weight, duration of back pain, localisation of pain, the presence of IBP, other SpA manifestations, a family history of SpA, response to NSAIDs (defined as significant reduction of pain 24–48 hours after intake of a full dose), occupation (predominantly mental or physical work or both), sport activity (regular sports yes/no), HLA-B27 (positive/negative) and CRP level (mg/L). In addition, the number of births and the time span between the last delivery and the image acquisition was provided.

After completion of one round, the next round was released. Already completed rounds could not be seen once the readers had completed these steps. Readers had no access to CI other than provided in rounds 2 and 4.

Radiographs of SIJ were graded according to the mNYC.5

MRIs of SIJ were evaluated as follows.

  1. Are there signs of inflammatory activity compatible with axSpA (yes/no)?

  2. Are there inflammatory changes (including non-axSpA attributed changes*) at all (yes/no)?

  3. Are there structural changes compatible with axSpA (yes/no)?

  4. Are there structural changes (including non-axSpA attributed structural changes*) at all (yes/no)?

*Non-axSpA-related changes included, for example, changes deemed to be mechanical or degenerative in nature.

In all rounds, readers were asked whether the imaging findings were compatible with a diagnosis of axSpA (yes/no) and how confident they were with this evaluation (assessed by a Numerical Rating Scale (NRS), in which 0 was equivalent to ‘not certain’ and 10 to ‘very certain’). In the rounds with CI, readers were asked to estimate how strongly the CI had influenced their findings (NRS with 0: no influence; 10: very strong influence).

Statistical analyses

For the sample size calculation, we assumed that without CI, a 20% discrepancy would be observed between the overall judgement on imaging and the final diagnosis made by the rheumatologist (the reference). With CI, the percentage of discrepant diagnostic judgements was expected to be about 5%. Thus, it was calculated that at least 85 images were needed to find a statistically significant difference between results without and with CI using McNemar’s test with a power of 80% and α=0.05.
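As an illustration of how such a calculation can be approached, the following Python sketch applies a commonly used normal-approximation formula for the number of pairs required by McNemar's test. The function name and the discordant-pair proportions are assumptions made for this example. The text reports only the marginal discrepancy rates (20% and 5%), not the discordant proportions, so the sketch is not expected to reproduce the reported figure of 85 exactly.

```python
from math import ceil, sqrt
from statistics import NormalDist


def mcnemar_sample_size(p_improved, p_worsened, alpha=0.05, power=0.80):
    """Approximate number of paired cases needed for a two-sided McNemar test.

    p_improved: assumed proportion of cases judged wrongly without CI but correctly with CI.
    p_worsened: assumed proportion of cases judged correctly without CI but wrongly with CI.
    Both inputs are hypothetical planning values, not figures reported in the study.
    """
    psi = p_improved + p_worsened      # total proportion of discordant pairs
    delta = p_improved - p_worsened    # difference between the discordant proportions
    if delta == 0:
        raise ValueError("The discordant proportions must differ.")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = (z_alpha * sqrt(psi) + z_beta * sqrt(psi - delta ** 2)) ** 2 / delta ** 2
    return ceil(n)


# Assumption for illustration only: errors without and with CI occur independently,
# so the discordant proportions follow from the 20% and 5% discrepancy rates.
p_improved = 0.20 * 0.95   # wrong without CI, correct with CI
p_worsened = 0.80 * 0.05   # correct without CI, wrong with CI
n_pairs = mcnemar_sample_size(p_improved, p_worsened)
print(n_pairs)
```

The required number of pairs depends strongly on the assumed proportion of discordant pairs, which is why different planning assumptions lead to different sample sizes for the same marginal rates.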

The main study question on the influence of CI on the diagnostic precision of imaging evaluation was separately investigated for rounds 1 and 2 (radiographs) and 3 and 4 (radiographs and MRI combined). The agreement between the readers’ majority judgement (at least two of three readers) on whether a case is ‘compatible with the diagnosis of axSpA’ or ‘not compatible with the diagnosis of axSpA’ and the predefined reference (diagnosis by rheumatologist) is presented as a percentage, referred to as precision. The sensitivity, specificity, positive predictive value (ppV) and negative predictive value (npV) were calculated. The precision in the round without CI was compared with the precision in the corresponding round with CI using the McNemar test. These analyses were also performed for each reader individually.
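The analysis steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names and the eight toy cases are hypothetical, and the exact (binomial) form of the McNemar test is used here because the text does not specify which variant was applied.

```python
from math import comb


def majority_vote(reader_calls):
    """True if at least two of the three readers rated the case as compatible with axSpA."""
    return sum(reader_calls) >= 2


def ratio(numerator, denominator):
    """Safe division that returns NaN when a cell of the 2x2 table is empty."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(predicted, reference):
    """Compare majority imaging judgements with the clinical reference diagnosis."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    fp = sum(1 for p, r in pairs if p and not r)
    fn = sum(1 for p, r in pairs if not p and r)
    return {
        # "precision" follows the article's usage: overall agreement with the reference
        "precision": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def mcnemar_exact(correct_without, correct_with):
    """Two-sided exact McNemar p value for paired correct/incorrect judgements."""
    gained = sum(1 for a, b in zip(correct_without, correct_with) if not a and b)
    lost = sum(1 for a, b in zip(correct_without, correct_with) if a and not b)
    n = gained + lost
    if n == 0:
        return 1.0
    k = min(gained, lost)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy data: 8 cases, reference diagnosis and three reader calls per round.
reference = [True, True, True, True, False, False, False, False]
calls_without = [(1, 1, 0), (1, 0, 0), (0, 0, 0), (1, 1, 1),
                 (1, 1, 0), (0, 0, 0), (0, 1, 0), (0, 0, 0)]
calls_with = [(1, 1, 1), (1, 1, 0), (0, 0, 1), (1, 1, 1),
              (0, 1, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0)]

predicted_without = [majority_vote(c) for c in calls_without]
predicted_with = [majority_vote(c) for c in calls_with]
metrics_without = diagnostic_metrics(predicted_without, reference)
metrics_with = diagnostic_metrics(predicted_with, reference)

correct_without = [p == r for p, r in zip(predicted_without, reference)]
correct_with = [p == r for p, r in zip(predicted_with, reference)]
p_value = mcnemar_exact(correct_without, correct_with)
```

Note that the McNemar test uses only the cases whose correctness changed between the two rounds, which is why it suits a paired design in which the same images are read twice.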

Fleiss’ kappa values were calculated to analyse the interrater reliability of the main outcome—compatibility of the imaging findings with a diagnosis of axSpA in the rounds with and without CI.
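For readers unfamiliar with the statistic, the following Python sketch shows how Fleiss' kappa can be computed for a fixed number of readers per case. The function name and the toy ratings are hypothetical and serve only as an illustration, not as the code used in the study.

```python
def fleiss_kappa(ratings, categories=(0, 1)):
    """Fleiss' kappa for cases rated by the same number of readers.

    ratings: one list per case, holding one category label per reader,
             for example [1, 1, 0] for three readers (1 = compatible with axSpA).
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][j] = number of readers assigning case i to category j
    counts = [[row.count(c) for c in categories] for row in ratings]
    # observed agreement per case, then averaged over all cases
    p_case = [
        (sum(k * k for k in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_case) / n_cases
    # agreement expected by chance from the overall category proportions
    p_category = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(len(categories))
    ]
    p_expected = sum(p * p for p in p_category)
    if p_expected == 1.0:
        return float("nan")  # undefined when all ratings fall into one category
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy data: six cases, three readers each.
toy_ratings = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [0, 0, 0], [1, 0, 0], [1, 1, 1]]
kappa = fleiss_kappa(toy_ratings)
```

On the conventional Landis and Koch scale, values between 0.61 and 0.80 are usually described as substantial agreement.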


The demographic data of all 109 included patients are shown in table 1. In total, 61 patients (56%) had been diagnosed with axSpA (39 with radiographic and 22 with non-radiographic axSpA), while in 48 patients axSpA was excluded and other diagnoses were made, such as non-specific low back pain and degenerative changes in the SIJ/spine. As expected, patients with axSpA were younger, more frequently male and had a higher frequency of most SpA features. There were only small differences regarding physical activity and sports. Pregnancy was less often reported in axSpA.

Table 1

Demographic and clinical characteristics of the included patients

Due to imaging quality issues and incomplete findings of the radiologists, 8 cases could not be evaluated in rounds 1 and 2 (radiographs evaluation), while 10 cases were excluded in rounds 3 and 4 (radiographs and MRI). Thus, there were a total of 101 cases available in rounds 1 and 2, while there were 99 cases in rounds 3 and 4.

The results of the evaluation of SIJ radiographs without and with CI are presented in table 2 and figure 1. The precision of the findings was better when CI was available: 70.3% vs 78.2% (p=0.008). All other parameters also tended to improve with the availability of CI. While there were moderate improvements regarding sensitivity and npV, specificity and ppV reached 100%. The inter-rater reliability (kappa values) increased to substantial agreement with CI.

Figure 1

The change in precision, sensitivity, specificity, positive and negative predictive value of imaging evaluation associated with providing clinical information to a radiologist. npV, negative predictive value; ppV, positive predictive value; SIJ, sacroiliac joints.

Table 2

Diagnostic accuracy of evaluation of radiographs of the sacroiliac joints in relation to the availability of clinical information

The results of the radiographs and MRI evaluation (the approach simulating clinical practice when available imaging modalities are evaluated simultaneously) are presented in table 3 and figure 1. In the situation of the availability of the comprehensive imaging information, only a small non-significant increase in precision (from 80.8% to 81.8%) related to CI was found. Interestingly, while sensitivity tended to decrease (from 74.1% to 70.1%), the specificity of the imaging findings increased (from 90.2% to 97.6%). The ppV slightly improved, npV did not change much. The inter-rater reliability showed substantial agreement with or without CI. Overall, the reliability of assessment was better for radiographs+MRI than in the radiographs evaluation alone.

Table 3

Diagnostic accuracy of imaging evaluation including radiographs and MRI of sacroiliac joints in relation to the availability of clinical information

Table 4 shows the impact of CI on the performance of individual readers. Despite some variability, the overall trend was towards increased precision and specificity of the assessment when CI was provided.

Table 4

Diagnostic accuracy of the individual radiologists’ evaluation of radiographs and MRIs of sacroiliac joints in relation to the availability of clinical information

The mean (SD) diagnostic confidence (0–10) of the radiologists increased from 5.2 (1.9) without CI to 6.0 (1.7) with CI for radiographs (table 2), and from 6.7 (1.6) to 7.2 (1.6) for radiographs+MRI (table 3). Similar trends were also observed on the level of individual readers, with a larger impact of CI observed for radiographs than for radiographs+MRI.

The mean (SD) self-assessed impact of CI (0–10) on the evaluation of imaging was 3.7 (1.9) for radiographs alone and 3.8 (1.2) for radiographs+MRI.


The results of this study show that the structured CI provided to radiologists evaluating SIJ images has an overall positive influence on the related interpretation of images. When comparing the radiologists’ findings without and with CI we found clinically relevant differences in their interpretation of images—especially for radiographs.

The precision of the evaluation of SIJ radiographs in relation to the clinical diagnosis of axSpA was 70.3% without and 78.2% with CI—a statistically and clinically significant difference. For the radiographs+MRI combination, the precision was 80.8% without CI and 81.8% with CI—only a minor, non-significant difference. This is most likely related to the higher level of uncertainty of interpretation of SIJ radiographs as compared with MRI (or a combination of radiographs+MRI). The evaluation of radiographs and MRI combined provided more accurate results as compared with radiographs alone even without CI. This is explained, at least in part, by the fact that only about 64% of axSpA patients were classified as radiographic axSpA (patients with definite radiographic SIJ changes based on mNY criteria) and could, therefore, be captured by radiographs. On the other hand, radiologists may just be more certain with their findings when evaluating MRI (+radiographs) of the SIJ allowing for a more sensitive evaluation of structural lesions in parallel to the visualisation of active inflammatory changes, which are simultaneously present in the majority of axSpA patients.17

Accordingly, the sensitivity of SIJ X-ray findings (mNY criteria) for a clinical diagnosis of axSpA was only 51% without and 60% with CI. However, the specificity increased from 94% to 100% with CI, and the ppV approached 100% as well. This was strengthened by rather good kappa values, corresponding to a substantial agreement between readers. In comparison, the sensitivity for the clinical diagnosis of axSpA with combined radiographs and MRI findings was 74% without and 71% with CI. However, the specificity increased from 90% to 98% with CI, and the ppV was ultimately 98%, again accompanied by very good kappa values.

The provision of CI seems especially critical when the radiologists’ findings are doubtful. If CI supports a diagnosis of axSpA, radiologists are more prone to report imaging findings as compatible with it. Conversely, when there are no clinical indications of axSpA, radiologists are reluctant to do so. This was supported by the increase in subjectively estimated diagnostic certainty, although the perceived impact of CI on the imaging interpretation was not high. Of interest, the radiologists reported lower confidence levels for radiographs alone as compared with radiographs in combination with MRI. This indicates that CI was weighted as more important for the assessment of radiographs.

The main argument in favour of providing CI before evaluation of SIJ images in the context of axSpA diagnosis is that this knowledge may increase the diagnostic accuracy of the assessment as shown in other fields.14 15 19 20 In addition to the diagnostic aspects, important safety information such as renal function and contrast agent intolerance has to be transmitted in any case.21 22 On the other hand, authors were concerned that the provision of CI could prevent an objective assessment and bias the findings and interpretation of the images. Other authors have recommended to first document the findings without further background knowledge, and then study the clinical data before making the final interpretation.23 However, this is not very realistic due to the time constraints in daily routine.

Based on a systematic review,15 the diagnostic accuracy of radiological findings increased, especially when structured CI was provided. In our study, we provided structured and standardised CI and did not address the question of the impact of structured versus unstructured CI. We also did not study the relevance of previous findings and images—both can influence radiological interpretation.

When radiologists were asked about the quality of their communication with clinicians, lack of or inadequate CI was among the major problems reported. The majority of radiologists saw a clear advantage in having CI as a background of imaging evaluation.24 This was similar for the diagnostic accuracy of imaging in axSpA by radiologists.17 22 The authors stressed that findings such as BME and structural lesions visualised on MRI should always be interpreted in light of the clinical context since there are several differential diagnoses to be considered such as osteitis condensans,25 fractures and infectious sacroiliitis.22 26

This study has some limitations. First of all, the study design is retrospective. Including only patients with complete imaging material could have caused a selection bias with inclusion of less clear cases, which required MRI in addition to radiographs. Second, the impact of CI on MRI interpretation alone was not specifically studied, although we expect that the results would have been very similar to those for radiographs+MRI. The extent to which the results of the study can be transferred to everyday clinical practice remains to be seen. This study involved radiologists with more experience in axSpA imaging than usual. However, the differences between readers and the variability of findings suggest that results could differ considerably, especially for radiologists with less experience in musculoskeletal imaging. The study results presented were partly based on the principle of majority decision, that is, if two radiologists agreed, this was considered a positive finding. This scientific democratic principle is not inherently flawless but represents one of the widely accepted ways to identify true positive findings. Finally, imaging is an important part of the diagnostic approach and imaging results certainly have an impact on the final diagnosis. However, the gold standard in this study was the clinical diagnosis by rheumatologists. This means that the diagnosis of axSpA could have been made in patients without imaging changes (at least in the SIJ) or could have incorporated other imaging findings (eg, CT, spinal imaging that was not available for assessment in this study, or historical images). As a complex construct, the diagnosis takes the positive and negative results of multiple diagnostic tests into account and considers other, more likely explanations of symptoms.

The need for conventional radiography for a diagnosis of axSpA has recently been debated and there is an argument to prioritise MRI as the standard method for SIJ imaging when axSpA is suspected.7 However, the European Alliance of Associations for Rheumatology still recommends radiographs as the imaging method of first choice for patients with suspected axSpA. Priority for MRI is only given for young patients with short symptom duration,27 even though recent studies have convincingly shown that MRI is superior to radiographs in terms of sensitivity and specificity in general and with regard to structural SIJ changes in axSpA.7 8 No doubt, the great advantage of MRI is the simultaneous detection of active inflammatory and structural changes.7 11 28 In addition, the anatomic location of changes is relevant,29 as is knowledge about the relatively high percentage of minor changes, for example, among runners and postpartum women,30 and even in the general population.31

This study was conducted in a controlled experimental setting, distinctly separating rounds with and without CI. It is important to note that in routine clinical practice, the scenario of analysing images without any CI, such as age and sex, is uncommon. Typically, at least basic demographic information and the reason for imaging are provided. However, the CI given to radiologists often includes only these basic details and lacks other potentially relevant information for accurate diagnosis. The experimental design, with its clear distinction between rounds, was crucial to precisely assess the impact of CI on imaging evaluation.

Finally, we provided the full set of CI as recommended by ASAS and compared it with no CI. Therefore, it was not possible to determine which specific CI was most relevant for improving the performance of imaging interpretation. This point needs further research.

What could these results mean for clinical practice? According to the results of this study, radiologists in charge of evaluating SIJ changes should receive relevant CI related to a diagnosis of axSpA. The sensitivity of the radiologists’ findings cannot reach 100% in relation to the clinical diagnosis if there are no changes in imaging of axSpA patients. This could be the case in patients with axSpA with primary spinal involvement and no SIJ changes. Importantly, our study showed that specificity close to 100% can be achieved by radiologists if CI is provided. This means that false positive results can be largely avoided in experienced hands. Further, it seems reasonable that clinicians should be consulted especially when radiologists are not certain about radiographic SIJ findings, since this is where the potential for improvement is greatest. At the same time, radiologists should inform clinicians if they are particularly certain or uncertain about the findings. The radiologists’ findings showed a considerable degree of heterogeneity and variability that could be due to the different experience of those involved. However, this is not different among rheumatologists with different degrees of experience and also between local and central reading.32 Thus, rheumatologists should know about the expertise of their cooperating radiologist, because this may influence the quality of their diagnoses if they are unable to judge the images themselves. However, limited expertise of rheumatologists may also play a role in the interpretation of clinical and imaging findings.

In conclusion, our data support the transmission of predefined clinical data to radiologists, which can improve the precision and specificity of sacroiliac imaging (radiography and MRI) evaluation and help to avoid misdiagnosis and misclassification of axSpA.

Ethics statements

Patient consent for publication

Ethics approval

This study involves human participants and was performed according to the principles of the 1975 Declaration of Helsinki and Good Clinical Practice. The OptiRef study was approved by the Ethical Committee of the Charité – Universitätsmedizin Berlin (EA4/161/15). All patients provided informed consent prior to inclusion in the study.


We are thankful to Kay-Geert Hermann, Susanne Lüders, Burkhard Muche, Imke Redeker, Joachim Sieper, Laura Spiller and Anne-Kathrin Weber for their contributions to the OptiRef study. We would like to thank all orthopaedists and primary care physicians who referred their patients. Further, we thank Annegret Langdon and Julia Schally for their support in data collection and data management and Torsten Karge for set-up and support of the online self-referral questionnaire.



  • X @ProftDr, @mprotopopov

  • TD and DP contributed equally.

  • Contributors All authors played a significant role in the development and completion of this work. Specifically, each contributed to one or more of the following aspects: data collection, data analysis and interpretation and drafting or revising the manuscript. All authors have approved the final version of the manuscript. DP acts as guarantor and accepts full responsibility for the work and the conduct of the study, had access to the data, and controlled the decision to publish.

  • Funding The work presented in this manuscript has no specific funding. The OptiRef study was supported by a research grant from Novartis.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.