Objectives To explore whether changes in a composite (power Doppler/greyscale ultrasound (PDUS)) synovitis score, developed by the OMERACT-EULAR-Ultrasound Task Force, predict disease activity outcomes in rheumatoid arthritis (RA).
Methods Patients with RA who were methotrexate inadequate responders starting abatacept were evaluated. Individual joint PDUS scores were combined in the Global OMERACT-EULAR Synovitis Score (GLOESS) for metacarpophalangeal joints (MCPs) 2–5, all joints (22 paired) and a reduced (9 paired) joint set. The predictive value of changes in GLOESS at week 1–16 evaluations for clinical status and response (Disease Activity Score (DAS)28 (C reactive protein, CRP) <2.6; DAS28(CRP) ≤3.2; DAS28(CRP) ≥1.2 improvement) up to week 24, and correlations between DAS28 and GLOESS were assessed.
Results Eighty-nine patients completed the 24-week treatment period. Changes in GLOESS (MCPs 2–5) from weeks 1 to 16 were unable to predict DAS28 outcomes up to week 24. However, significant improvements in GLOESS (MCPs 2–5) were observed at week 12 in patients with DAS28 ≥1.2 improvement at week 24 versus those who did not achieve that clinical response. In patients achieving DAS28 ≥1.2 improvement or DAS28 ≤3.2 at week 24, changes in GLOESS (22 and 9 paired joint sets) were greater in patients who already achieved DAS28 ≥1.2 at week 12 than in those who did not. No significant correlations were found between changes in DAS28 and GLOESS definitions at any time point.
Conclusions PDUS was not correlated with clinical status or response as measured by DAS28-derived criteria, and PDUS changes were not predictive of clinical outcome. The discrepancies require further exploration.
Trial registration number NCT00767325; Results.
- Rheumatoid Arthritis
- Disease Activity
- DMARDs (biologic)
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
What is already known about this subject?
Power Doppler with greyscale ultrasound (PDUS) is a non-invasive, bedside, objective and sensitive tool for visualising synovial inflammatory joint changes in rheumatoid arthritis (RA) that are not detected by conventional clinical and radiographic examinations.
What does this study add?
The primary analysis of the APPRAISE study of abatacept treatment in RA demonstrated the responsiveness of the composite PDUS Global OMERACT-EULAR Synovitis Score (GLOESS) when applied at a patient level, showing the rapid onset of action of abatacept at 1 week. Although improvements were also demonstrated using DAS28, a clinical improvement of ≥1.2 was reached only after 1 month. The current analyses demonstrated a lack of correlation between GLOESS outcomes and clinical status or response as measured by DAS28-derived criteria, which is an important finding to underline.
How might this impact on clinical practice?
The lack of correlation between PDUS and clinical measures suggests that these tools evaluate different aspects of disease activity in RA and should be considered complementary in clinical practice.
Recently, the European League Against Rheumatism (EULAR) recommended that the treatment of rheumatoid arthritis (RA) should target remission or low disease activity in every patient, with adjustments in therapy if there is no improvement within 3 months or if the target is not reached within 6 months.1 To follow this treat-to-target strategy, rheumatologists need to tightly monitor patients to ensure that they reach the target using clinical indices.
The combined use of power Doppler and greyscale ultrasound (PDUS) represents an easy, non-invasive, bedside imaging modality that has been demonstrated to be an objective and sensitive tool for visualising synovial inflammatory joint changes in RA that were not detected by conventional clinical and radiographic examinations.2–6 Several factors, such as machine characteristics and operator-dependent interpretation, are known to influence the sensitivity of detecting synovitis by PDUS. Therefore, the Outcome Measures in Rheumatology-Ultrasound (OMERACT-US) Task Force, with funding from EULAR, developed a standardised composite PDUS scoring system for synovitis in RA designed to be applicable to all joints and consistent between machines. To facilitate the assessment of global synovial activity, the group also developed a patient PDUS activity score, the Global OMERACT-EULAR Synovitis Score (GLOESS), calculated from the sum of composite PDUS scores for all joints examined. GLOESS has since been validated in cross-sectional and longitudinal data sets.7
APPRAISE was the first prospective, multicentre, international study to use this composite PDUS score at joint and patient levels to measure the early signs of response to treatment with abatacept in biological-naïve adult patients with active RA despite methotrexate (MTX) therapy.8 This study demonstrated the responsiveness of the composite PDUS GLOESS when applied at a patient level, showing the rapid onset of action of abatacept, independently of the number of joints examined. The responsiveness of GLOESS equalled that of clinical assessment by Disease Activity Score (DAS)28 C reactive protein (CRP). Despite the clear capability of PDUS for monitoring the effects of treatment in RA demonstrated in APPRAISE and other published clinical studies,9–12 discordant correlations have been found between PDUS scores and clinical outcomes measured at the same time point.13,14
The aim of this paper was to present the results of predefined secondary, exploratory and post hoc analyses, which investigated whether changes in GLOESS at assessments from weeks 1 to 16 were predictive of clinical response measured by DAS28 at later assessments, and also whether GLOESS could differentiate between multiple definitions of clinical response or status using DAS28 up to week 12 and at week 24.
The APPRAISE study methodology has been reported previously.8 Briefly, APPRAISE (NCT00767325) was a 24-week, Phase IIIb, open-label, multicentre, single-arm study investigating the responsiveness of the OMERACT-EULAR-composite PDUS score in biological-naïve patients (≥18 years) with active RA and an inadequate response to MTX therapy starting abatacept. Patients received intravenous abatacept (approximately 10 mg/kg) at baseline (day 1), and at weeks 2, 4, 8, 12, 16, 20 and 24, in addition to stable doses of concomitant MTX (≥15 mg/week). Oral corticosteroid use (stable dose of ≤10 mg prednisone/day) was permitted. Patients had American College of Rheumatology (ACR)-defined RA (1987 classification criteria) for at least 6 months, and were receiving MTX (≥15 mg/week) for at least 3 months prior to baseline, with a stable MTX dose for at least 28 days before baseline (except in cases of intolerance). Patients were required to have active disease, defined by a baseline DAS28 (CRP) score of >3.2 or tender and swollen joint counts (TJC and SJC) of ≥6 and a CRP level of greater than the upper normal limit.
Patients underwent bilateral PDUS examinations of metacarpophalangeal joints (MCPs) 2–5 at screening and baseline, and of 44 (22 paired) joints (MCPs 1–5, proximal interphalangeal joints (PIPs) 1–5, wrist, elbow, shoulder (glenohumeral), knee, ankle (tibiotalar), hind foot (talonavicular and calcaneocuboidal) and metatarsophalangeal joints (MTPs) 1–5) at baseline (day 1), and at weeks 1, 2, 4, 6, 8, 12, 16, 20 and 24. The PDUS examinations were performed at each site by an independent expert in musculoskeletal ultrasound who was blinded to the clinical evaluations. Medium-level to high-level ultrasound machines were used (Esaote Technos MPX, MyLab 70, Toshiba Aplio, GE Logic (series 5, 7, 9 and E 9) or Siemens Acuson Antares), employing high-frequency (12–18 MHz) transducers. Doppler parameters were adjusted according to the device used (range of pulse repetition frequency 400–800 Hz; Doppler frequency 7–11.1 MHz).8
The presence of hypoechoic synovial hyperplasia (SH) and joint effusion (JE), both assessed using greyscale, and of synovial vascularisation, assessed using power Doppler (PD), were scored using semiquantitative scales. The presence of synovitis (SH and PD, without JE) was scored for each joint according to the semiquantitative OMERACT-EULAR-US composite PDUS scale, giving a score of 0–3 for each joint. GLOESS was calculated for MCPs 2–5 of both hands and for the 22 paired joints, using the sum of the composite PDUS scores for all joints examined, giving a potential score of 0–24 for MCPs 2–5, and of 0–132 for the 22 paired joints. A new reduced, 9 paired joint set score (including both large and small joints: shoulder, elbow, wrist, MCP1, MCP4, PIP2, knee, MTP3 and MTP5) was also determined using principal component analysis and was found to adequately represent the comprehensive 22 paired joint GLOESS.8
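The GLOESS calculation described above can be sketched as a simple sum of per-joint composite scores. This is a minimal illustration only: the function name, joint labels and example scores are hypothetical and are not taken from the study.

```python
from typing import Mapping

MAX_JOINT_SCORE = 3  # composite PDUS score per joint ranges from 0 to 3


def gloess(joint_scores: Mapping[str, int]) -> int:
    """Sum per-joint composite PDUS scores (each 0-3) into a GLOESS value."""
    for joint, score in joint_scores.items():
        if not 0 <= score <= MAX_JOINT_SCORE:
            raise ValueError(f"{joint}: score {score} outside 0-{MAX_JOINT_SCORE}")
    return sum(joint_scores.values())


# Hypothetical scores for MCPs 2-5 of both hands (8 joints in total).
mcp_scores = {f"{side}_MCP{n}": 1 for side in ("L", "R") for n in range(2, 6)}
mcp_scores["R_MCP2"] = 3  # one severely affected joint

total = gloess(mcp_scores)
max_mcp = MAX_JOINT_SCORE * len(mcp_scores)  # upper bound for MCPs 2-5
max_all = MAX_JOINT_SCORE * 44               # upper bound for 22 paired joints

print(total, max_mcp, max_all)
```

The two upper bounds reproduce the potential score ranges stated in the text (0–24 for MCPs 2–5 and 0–132 for the 22 paired joints).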
DAS28 (CRP) was evaluated at baseline (day 1) and at weeks 1, 2, 4, 6, 8, 12, 16, 20 and 24. Mean change in DAS28 from baseline, the proportion of patients achieving DAS28 improvement ≥1.2, and the proportion of patients achieving DAS28 <2.6 or DAS28 ≤3.2 were assessed.
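The three DAS28-derived criteria can be expressed as simple threshold checks. The sketch below assumes hypothetical DAS28 values and a hypothetical function name; the rounding to two decimals is an assumption added here to avoid floating-point artefacts at the 1.2 boundary, not a rule reported by the study.

```python
from typing import Dict


def das28_outcomes(baseline: float, current: float) -> Dict[str, bool]:
    """Classify a patient against the three DAS28 (CRP) criteria."""
    # Improvement is the decrease from baseline; rounded (assumption) so
    # that values such as 5.0 - 3.8 compare cleanly against 1.2.
    improvement = round(baseline - current, 2)
    return {
        "das28_lt_2_6": current < 2.6,
        "das28_le_3_2": current <= 3.2,
        "improvement_ge_1_2": improvement >= 1.2,
    }


# Hypothetical patient: DAS28 of 5.4 at baseline and 3.0 at follow-up.
example = das28_outcomes(baseline=5.4, current=3.0)
print(example)
```

In this hypothetical case the patient meets the DAS28 ≤3.2 and ≥1.2 improvement criteria but not DAS28 <2.6, which shows how the three definitions can disagree for the same patient.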
Receiver operating characteristic (ROC) analysis was used to assess whether changes in GLOESS (MCPs 2–5) or in any of the component scores (SH, JE and PD) at any of the assessments from weeks 1 to 16 were predictive of clinical status (DAS28 <2.6; DAS28 ≤3.2) and response (improvement in DAS28 ≥1.2) at later time points. An area under the ROC curve of ≥0.7 was considered acceptable for prediction of disease activity, as suggested by Hosmer and Lemeshow.15 Descriptive statistics were used to compare changes from baseline to weeks 12 and 24 in GLOESS (MCPs 2–5) and the component scores in patients who achieved a clinical status or response at week 24 versus those who did not.
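To make the 0.7 threshold concrete, the area under the ROC curve can be computed as the probability that a randomly chosen responder shows a larger GLOESS improvement than a randomly chosen non-responder (the Mann-Whitney formulation). The sketch below uses invented numbers and is not the study's analysis code; the software and exact procedure used by the authors are not described in the text.

```python
from typing import Sequence

AUC_CUTOFF = 0.7  # threshold considered acceptable in the text


def roc_auc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """AUC as P(pos > neg) over all pairs, counting ties as one half."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical GLOESS improvements (baseline minus follow-up).
responders = [5, 3, 4, 2]       # patients meeting a DAS28 criterion later
non_responders = [1, 3, 2, 4]   # patients not meeting it

auc = roc_auc(responders, non_responders)
acceptable = auc >= AUC_CUTOFF
print(auc, acceptable)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation; in the study, no GLOESS or component change reached the 0.7 cut-off.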
Data were analysed to see if PDUS could identify patients with a meaningful clinical status or response at week 12 versus week 24. In patients achieving DAS28 ≤3.2 or improvement in DAS28 ≥1.2 at week 24, descriptive statistics were used to assess whether changes in GLOESS at week 12 could differentiate patients who also achieved DAS28 ≥1.2 improvement at week 12 from those who did not. These analyses were completed using MCPs 2–5, the 22 paired and reduced 9 paired joint sets.
To explore further the relationship between GLOESS and clinical response, additional post hoc correlation analyses were performed using Pearson's (parametric) or Spearman's (non-parametric) correlation coefficient with 95% CIs. Correlations between changes at different time points were assessed within and between GLOESS and DAS28 outcomes. Details of the correlation analyses performed are presented in online supplementary table S1. Effect size, expressed as standardised response means (SRM) of GLOESS and DAS28, was investigated on the basis of mean changes from baseline to weeks 1, 12 and 24. Correlation analyses using absolute values and changes from baseline were completed with the last observation carried forward to quantify the treatment effect of abatacept plus MTX over time. Analyses were performed using all available GLOESS and component scores (no imputation of missing data). Missing DAS28 values were imputed. No correction for multiple statistical tests was applied.
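As an illustration of the non-parametric option mentioned above, Spearman's coefficient is Pearson's coefficient applied to ranks. The following is a minimal sketch with hypothetical data; it does not compute the 95% CIs reported in the study and does not reproduce the authors' software.

```python
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's product-moment correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson's r computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical changes from baseline in DAS28 and GLOESS for five patients.
das28_change = [1, 2, 3, 4, 5]
gloess_change = [2, 1, 4, 3, 5]
rho = spearman(das28_change, gloess_change)
print(rho)
```

Because it depends only on ranks, Spearman's coefficient is unaffected by the different scales of DAS28 and GLOESS, which is one reason a rank-based measure is often preferred when distributions are skewed.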
In total, 89 (86%) of the 104 patients enrolled between December 2008 and October 2011 completed the 24-week, open-label treatment period. Demographic and baseline characteristics of the study population have been reported previously, as well as the results on the responsiveness of the score.8 In patients with DAS28 measurements available at week 12 and week 24 (n=98, which included 9 patients with DAS28 measurements who had not completed the open-label treatment period), the numbers of patients who achieved DAS28 ≥1.2 improvement at week 12 and who also achieved DAS28 ≥1.2 improvement or DAS28 ≤3.2 at week 24 were 50/98 (51%) and 37/98 (38%), respectively, while the numbers of patients who did not achieve DAS28 ≥1.2 improvement at week 12 but achieved DAS28 ≥1.2 improvement or DAS28 ≤3.2 at week 24 were 10/98 (10%) and 10/98 (10%), respectively.
Relationship between GLOESS and meaningful clinical status or response
All patients showed an improvement in GLOESS (MCPs 2–5) at week 12, regardless of clinical status or response at week 24 (figure 1). In particular, significantly greater improvements were observed from baseline to week 12 in GLOESS and in all three component scores (SH, JE and PD) in patients who achieved DAS28 ≥1.2 improvement at week 24 compared with those who did not (figure 1A). However, the ROC area under the curve never reached or exceeded the predefined cut-off of 0.7; therefore, changes in GLOESS and in all component scores for MCPs 2–5 at any time point up to week 16 were unable to adequately predict clinical status or response at week 24 or at any other time point, regardless of criteria used. Moreover, no difference was observed in the mean decrease of GLOESS (MCPs 2–5) from baseline to week 12 in patients who achieved or did not achieve DAS28 <2.6 or DAS28 ≤3.2 at week 24 (figure 1B, C). The only component score that was different in patients who reached DAS28 <2.6 or DAS28 ≤3.2 at week 24 versus those who did not was the JE score. Similar results were observed using the 22 paired and 9 paired joint sets (table 1).
In patients who achieved DAS28 ≤3.2 or DAS28 ≥1.2 improvement at week 24, changes from baseline to week 12 in GLOESS for the 22 paired and 9 paired joint sets were able to detect some differences between patients who achieved DAS28 ≥1.2 improvement at week 12 versus those who did not (figure 2).
Previous correlation analyses demonstrated an absence of correlation between changes from baseline in DAS28 and changes in GLOESS or component scores; very low correlations between GLOESS (MCPs 2–5) or component scores and swollen joint counts (SJC) and/or tender joint counts (TJC); and no correlations between early changes in GLOESS or component scores and changes in the sum of swollen joints at week 12 or 24.8
When we looked at correlations within each assessment method (ie, between DAS28 scores at different time points or between GLOESS scores at different time points), moderate-to-high positive correlations were observed between improvements up to week 12 and improvements at week 24 in DAS28 (table 2). Moderate-to-high positive correlations were also observed between changes in GLOESS from baseline to weeks 2, 4, 8 and 12 and change in GLOESS (both 22 paired and 9 paired joint sets) from baseline to week 24 (table 3). However, changes in GLOESS during the first week of treatment were weakly correlated with overall change in GLOESS during the 24-week treatment period. Overall, SRM values for GLOESS (MCPs 2–5, 22 paired and reduced 9 paired joint sets) were smaller than SRM values for DAS28 for mean change from baseline at weeks 1, 12 and 24 (see online supplementary tables S1 and S2). A summary of the findings of the correlation analysis is presented in table 4.
This study demonstrated an almost complete absence of association between ultrasound and clinical status or response in RA in patients starting treatment with abatacept. Whereas both modalities (PDUS and composite DAS) were responsive to treatment in this setting,8 the extent of response in one modality was not reflected in the extent of improvement in the other. In addition, early changes in GLOESS were not predictive of a good clinical status or response at 24 weeks. This lack of correlation or predictive capacity was not caused by inconsistency of results over time, as correlations within modality (ie, between early and late GLOESS, and between early and late DAS28 outcomes) were moderate to strong, as were SRM values. This lack of correlation is a novel finding, although previous studies have reported a lack of complete overlap between ultrasound-assessed disease activity and composite clinical measures,16–18 with one study suggesting that the simplified disease activity index is closer to PDUS assessment of disease state than DAS28.19 Another study found that ultrasound had low correlations with disease activity assessment at baseline, while after 12 months of adalimumab treatment, correlation coefficients had improved.20
Several explanations for these results can be considered. First, the sensitivity to change as estimated by SRM calculations (which was better for DAS28 than for GLOESS, regardless of the joint set) may suggest that clinical efficacy measures were more sensitive to change than ultrasound. However, these discrepancies can also be explained by the wider SD for PDUS assessments than for DAS28, as the SRM calculation is dependent on the SD of the measure.21 Second, we identified some patients who appeared to have a complete disconnect between GLOESS (and component scores) and DAS28, SJC or TJC (data not shown). This may be explained by the greater sensitivity of PDUS for detecting minimal and/or subclinical synovitis compared with clinical examination. A recent paper confirmed that, in patients in disease remission (clinical disease activity index ≤2.8), only grade 3 (severe) PDUS-assessed synovitis correlated with clinical examination and clinical activity scores, demonstrating that when clinical evaluation is taken as the gold standard, the capability of PDUS is reduced.22 A number of patients with RA can score pain and TJC higher than PDUS assessments would indicate. Finally, the severity of PDUS synovitis (MCPs 2–5) was relatively low (grade 1–2) at baseline, supporting the apparent discordance between clinical and PDUS evaluations when assessing whether a joint was inflamed or not. This latter aspect requires further investigation. Overall, this study suggests that PDUS and DAS28 capture different aspects of disease activity that present different kinetics of response over time.
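The dependence of the SRM on the SD can be illustrated with a small calculation, assuming the usual definition of the SRM as the mean change divided by the SD of the change. The two series below are hypothetical and share the same mean change; only their spread differs.

```python
from statistics import mean, stdev
from typing import Sequence


def standardised_response_mean(changes: Sequence[float]) -> float:
    """SRM: mean change from baseline divided by the sample SD of the change."""
    return mean(changes) / stdev(changes)


# Hypothetical changes from baseline; both series have a mean change of -2.
narrow = [-2.0, -1.0, -3.0, -2.0]  # small between-patient variability
wide = [-6.0, 2.0, -4.0, 0.0]      # large between-patient variability

srm_narrow = standardised_response_mean(narrow)
srm_wide = standardised_response_mean(wide)
print(round(srm_narrow, 2), round(srm_wide, 2))
```

With an identical mean change, the series with the wider SD yields a much smaller SRM in absolute terms, so a lower SRM for one measure does not by itself show that the measure moved less on average.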
The findings of this study should be considered exploratory. The limitations of the study have been discussed previously and include the single-arm, open-label design, the sample size and the variables relating to the technology used.8 The analyses presented here are intended to inform future prospective studies of PDUS in RA. The present study suggests that PDUS scores improve as a consequence of effective treatment, but DAS28 cannot be used as a comparator for ultrasound, as they do not reflect the same aspects of the arthritis inflammatory process. The use of other measures of disease activity that are more stringent than DAS28, such as the Clinical Disease Activity Index (CDAI) and the Simplified Disease Activity Index (SDAI), would probably provide better correlation results.
In conclusion, PDUS was a responsive measure of disease activity in patients with RA starting abatacept plus MTX, but the extent of PDUS response was not correlated with clinical status or response, as measured by DAS28-derived criteria, and early PDUS changes were not predictive of a later good clinical outcome. This suggests that PDUS adds independent information on treatment response, and its contribution should be explored further.
The authors would like to thank the OMERACT-EULAR-US Task Force and principal investigators of the APPRAISE study. APPRAISE principal investigators: Silvano Adami, Vivi Bakkenheim, HBH, Stefano Bombardieri, M-AD'A, Paul Emery, Liana Euller-Ziegler, Gianfranco Ferraccioli, Maurizio Galeazzi, Philippe Gaudin, Walter Grassi, Annamaria Iagnocco, Herbert Kellner, Thierry Lequerré, IM, EN, MØ, Fredeswinda Romero, Istvan Szombati, Lene Terslev, Jacqueline Uson, Esther Vicente, OV and RW. The authors also thank Emilie Barré, Karina Van Holder, Wendy Kerselaers, Harry Goyvaerts, Stephane Munier and Nathalie Schmidely from Bristol-Myers Squibb, Coralie Poncet from DOCS International and Christel Perrone from the CRO ICON for their contribution to the study design, analysis and study conduct. Professional medical writing and editorial assistance was provided by Gary Burd, PhD, CMPP, and Laura McDonagh, PhD, at Caudex Medical and was funded by Bristol-Myers Squibb.
CG: Affiliation at time of study.
Funding This study was funded by Bristol-Myers Squibb.
Competing interests M-AD'A has received speaker's bureau fees from Bristol-Myers Squibb, AbbVie, UCB, MSD, Novartis and Roche Pharma, as well as research grants from Pfizer. MB has received consulting fees from Bristol-Myers Squibb and personal fees from Mundipharma, Roche, Pfizer, GSK and Novartis. HBH has received honoraria for scientific lectures from AbbVie, Roche, BMS, Pfizer and MSD, as well as grants for scientific studies from AbbVie, Roche and Pfizer. IM has received consulting fees from Bio Iberica pharma, AbbVie, GE and ESAOTE. EN has received consulting fees from AbbVie, Roche Pharma, Bristol-Myers Squibb, Pfizer, UCB, General Electric and Esaote, as well as research grants from MSD. MØ has received personal fees from AbbVie, BMS, Boehringer-Ingelheim, Celgene, Eli Lilly, Janssen, Genmab, GSK, Merck, Pfizer, Regeneron, Roche, Sanofi and UCB, grants from AbbVie, Merck and UCB, as well as non-financial support from AbbVie, BMS, Janssen, Merck, Pfizer, Roche and UCB. CG is a former employee of Bristol-Myers Squibb. MLB is an employee of Bristol-Myers Squibb.
Ethics approval Hopital Ambroise Pare, Comite De Protection Des Personne Idf Vlll Lab D'Anatomopathologie, 9 Ave Charles De Gaulle, Boulogne-Billancourt 92100, France. Comitato Etico Per La Sperimentazione Del Farmaco, Asur, Territoriale 5 Di Jesi Via Gallodoro 68, Jesi (An) 60035, Italy. Azienda Ospedaliera Universitaria Policlinico G. Martino, Comitato Etico Scientifico Via Consolare Valeria, 1, Messina 98124, Italy. Leeds East Research Ethics Committee, Yorkshire & Humber Rec Cntr Off. First Floor, Millside Mill Pond Lane, Leeds, England LS6 4RA, UK. Ceic Fundacio Unio Catalana D'Hospitals, Area De Serveis C/ Bruc 72-74 1a, Barcelona 08009, Spain. Azienda Universitaria Senese, Comitato Etico Locale La Sperimentazione Clinica Dei Medicinali Farmacia Aous—Viale Bracci, Siena 53100, Italy. De Videnskabsetiske Komiteer For Region Hovedstaden, Kongens Vaenge 2, Hillerod 3400, Denmark. Com Sperimentazione Clin Dei Med Azienda Osp-Univ Pisana, Via Roma 67, Pisa 56126, Italy. Comitato Etico Dell'Universita Cattolica Del Sacro Cuore, Policlinico Universitario Agostino Gemelli Di Roma Largo A Gemelli 8, Roma 00168, Italy. Rek Sorost, Frederik Holsts Hus Ulleval Terrasse Ulleval Sykehus Kirkeveien 166, Oslo 0450, Norway. Ceic—Fundacion Jimenez Diaz- Ute, Avda. Reyes Catolicos, 2-2a, Madrid 28040, Spain. Hospital Mostoles, Ceic Area 8 C/ Rio Jucar S/N, Madrid 28935, Spain. Universita Di Roma La Sapienza, Comitato Etico Dell'Azienda Policlinico Umberto I, Roma 00161, Italy. Ceic Area 9-Hosp Severo Ochoa De Leganes Y Hosp De Fuenlabrad, Avenida Orellana S/N Leganes, Madrid 28911, Spain. Hospital Universitario La Princesa, C/ Diego De Leon, 62, Madrid 28006, Spain. Farmakologiai Etk Bizottsaga, Arany J.U. 6-8., Budapest 1051, Hungary. Comitato Etico Per La Sperimentazione Dell’ Azienda, Ospedaliera Istituti Ospitalieri Di Verona Piazzale Stefani 1, Verona 37126, Italy. Ethikkommission Der Ludwig-Maximilians Universitaet, Marchioninistr. 15, Muenchen 81377, Germany.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.