2016 American College of Rheumatology/European League Against Rheumatism Criteria for Minimal, Moderate, and Major Clinical Response in Juvenile Dermatomyositis
  1. Lisa G Rider1,
  2. Rohit Aggarwal2,
  3. Angela Pistorio3,
  4. Nastaran Bayat1,
  5. Brian Erman4,
  6. Brian M Feldman5,
  7. Adam M Huber6,
  8. Rolando Cimaz7,
  9. Rubén J Cuttica8,
  10. Sheila Knupp de Oliveira9,
  11. Carol B Lindsley10,
  12. Clarissa A Pilkington11,
  13. Marilynn Punaro12,
  14. Angelo Ravelli13,
  15. Ann M Reed14,
  16. Kelly Rouster-Stevens15,
  17. Annet van Royen-Kerkhof16,
  18. Frank Dressler17,
  19. Claudia Saad Magalhaes18,
  20. Tamás Constantin19,
  21. Joyce E Davidson20,21,
  22. Bo Magnusson22,
  23. Ricardo Russo23,
  24. Luca Villa24,
  25. Mariangela Rinaldi24,
  26. Howard Rockette2,
  27. Peter A Lachenbruch1,
  28. Frederick W Miller1,
  29. Jiri Vencovsky25,
  30. Nicolino Ruperto24,
  31. for the International Myositis Assessment and Clinical Studies Group and the Paediatric Rheumatology International Trials Organisation
  1. 1NIEHS, NIH, Bethesda, Maryland, USA
  2. 2University of Pittsburgh, Pittsburgh, Pennsylvania, USA
  3. 3Istituto Giannina Gaslini, Servizio di Epidemiologia e Biostatistica, Genoa, Italy
  4. 4Social and Scientific Systems, Inc., Durham, North Carolina, USA
  5. 5The Hospital for Sick Children, Toronto, Ontario, Canada
  6. 6IWK Health Centre, Halifax, Nova Scotia, Canada
  7. 7University of Firenze, Florence, Italy
  8. 8Hospital de Niños Pedro de Elizalde, University of Buenos Aires, Buenos Aires, Argentina
  9. 9Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
  10. 10University of Kansas City Medical Center, Kansas City, Kansas, USA
  11. 11Great Ormond Street Hospital for Children NHS Trust, London, UK
  12. 12University of Texas Southwestern Medical Center, Dallas, Texas, USA
  13. 13Istituto Giannina Gaslini, Pediatria II - Reumatologia, and Università degli Studi di Genova, Genoa, Italy
  14. 14Duke University, Durham, North Carolina, USA
  15. 15Emory University School of Medicine, Atlanta, Georgia, USA
  16. 16University Medical Centre Utrecht, Wilhelmina Children's Hospital, Utrecht, The Netherlands
  17. 17Hannover Medical School, Hannover, Germany
  18. 18Universidade Estadual Paulista Júlio de Mesquita Filho, Botucatu, São Paulo, Brazil
  19. 19Semmelweis University, Budapest, Hungary
  20. 20Royal Hospital for Sick Children, Glasgow, UK
  21. 21Royal Hospital for Sick Children, Edinburgh, UK
  22. 22Karolinska University Hospital, Stockholm, Sweden
  23. 23Hospital de Pediatría Garrahan, Buenos Aires, Argentina
  24. 24Istituto Giannina Gaslini, Pediatria II - Reumatologia, PRINTO, Genoa, Italy
  25. 25Charles University, Prague, Czech Republic
  1. Correspondence to Dr Nicolino Ruperto, Istituto Giannina Gaslini, Pediatria II, PRINTO, Via G. Gaslini 5, Genoa 16147, Italy; nicolaruperto{at}gaslini.org

An International Myositis Assessment and Clinical Studies Group/Paediatric Rheumatology International Trials Organisation Collaborative Initiative

Abstract

The objective was to develop response criteria for juvenile dermatomyositis (DM). We analysed the performance of 312 definitions that used core set measures from either the International Myositis Assessment and Clinical Studies Group (IMACS) or the Paediatric Rheumatology International Trials Organisation (PRINTO) and were derived from natural history data and a conjoint analysis survey. They were further validated using data from the PRINTO trial of prednisone alone compared to prednisone with methotrexate or cyclosporine and the Rituximab in Myositis (RIM) trial. At a consensus conference, experts considered 14 top candidate criteria based on their performance characteristics and clinical face validity, using nominal group technique. Consensus was reached for a conjoint analysis–based continuous model with a total improvement score of 0–100, using absolute per cent change in core set measures of minimal (≥30), moderate (≥45), and major (≥70) improvement. The same criteria were chosen for adult DM/polymyositis, with differing thresholds for improvement. The sensitivity and specificity were 89% and 91–98% for minimal improvement, 92–94% and 94–99% for moderate improvement, and 91–98% and 85–86% for major improvement, respectively, in juvenile DM patient cohorts using the IMACS and PRINTO core set measures. These criteria were validated in the PRINTO trial for differentiating between treatment arms for minimal and moderate improvement (p=0.009–0.057) and in the RIM trial for significantly differentiating the physician's rating for improvement (p<0.006). The response criteria for juvenile DM consisted of a conjoint analysis–based model using a continuous improvement score based on absolute per cent change in core set measures, with thresholds for minimal, moderate, and major improvement.

  • Dermatomyositis
  • Polymyositis
  • Treatment

Juvenile dermatomyositis (DM) is a systemic autoimmune disease characterised by chronic skeletal muscle inflammation and weakness. Core set measures to assess juvenile DM disease activity have been established and validated by the International Myositis Assessment and Clinical Studies Group (IMACS) and the Paediatric Rheumatology International Trials Organisation (PRINTO), with provisional endorsement by the American College of Rheumatology and the European League Against Rheumatism.1–6 Both core sets include physician and parent global activity, muscle strength, and physical function. IMACS also includes the most abnormal serum muscle enzyme value and extramuscular global activity, whereas PRINTO includes instead a health-related quality of life measure, the Child Health Questionnaire7 and a global activity score, the Disease Activity Score.8 IMACS measures muscle strength using manual muscle testing, and PRINTO measures muscle strength using the Childhood Myositis Assessment Scale.1 ,2 ,5

Combinations of these measures to determine clinical improvement were developed to enhance the sensitivity of responses and decrease the sample sizes needed, by using large prospective natural history data sets and expert clinician consensus as the gold standard. For both PRINTO and IMACS, at least 20% improvement in 3 of 6 core set measures with no more than 1 or 2 worsening (which cannot be muscle strength) had been established as preliminary response criteria, and additional combinations of improvement in the core set measures serve as secondary response criteria.9 ,10 PRINTO adapted its top criteria for minimal clinical improvement to moderate and major improvement by using cutoffs of 50% and 70%, similar to the improvement criteria for juvenile idiopathic arthritis (JIA).11–13
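The preliminary rule described above (at least 20% improvement in 3 of 6 core set measures, with a cap on worsening measures that excludes muscle strength) can be sketched as a simple check. This is only an illustration: the function name, the measure keys, and in particular the `worsen_cutoff` value are assumptions, since the text does not state how much deterioration counts as worsening.

```python
from typing import Dict


def meets_preliminary_criteria(
    pct_improvement: Dict[str, float],
    strength_key: str = "muscle_strength",
    improve_cutoff: float = 20.0,
    worsen_cutoff: float = 30.0,  # ASSUMPTION: not specified in the text
    min_improved: int = 3,
    max_worsened: int = 2,
) -> bool:
    """Return True if the preliminary response rule is satisfied.

    pct_improvement maps each core set measure to its per cent change,
    signed so that positive values mean improvement and negative values
    mean worsening (the caller handles the direction of each scale).
    """
    improved = [k for k, v in pct_improvement.items() if v >= improve_cutoff]
    worsened = [k for k, v in pct_improvement.items() if v <= -worsen_cutoff]
    # Muscle strength is not allowed to be one of the worsening measures.
    if strength_key in worsened:
        return False
    return len(improved) >= min_improved and len(worsened) <= max_worsened


# Hypothetical patient: three measures improve, one (enzyme) worsens.
example = {
    "physician_global": 35.0,
    "parent_global": 25.0,
    "muscle_strength": 22.0,
    "physical_function": 5.0,
    "muscle_enzyme": -40.0,
    "extramuscular": 0.0,
}
print(meets_preliminary_criteria(example))
```

Setting `max_worsened=1` reproduces the stricter variant ("no more than 1 worsening") mentioned in the text.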

Although the preliminary response criteria for juvenile DM advanced the assessment of patients and their responses to treatment, those criteria were limited by differences in the core set measures and final consensus response criteria between IMACS and PRINTO, a lack of randomised controlled trial data for full validation, and inadequate exploration of more sensitive approaches using hybrid or continuous methods.14 The preliminary response criteria also considered each core set measure equally rather than differentially weighting them. However, most myositis experts agree that some core set measures are more important, such as physician global activity and muscle strength.3 ,15 For PRINTO studies, physician global evaluation of disease activity, muscle strength, and parent global evaluation of the child's overall well-being were weighted as the most important core set measures in a logistic regression analysis.3 ,10 Moreover, the preliminary response criteria did not validate criteria for moderate or major improvement. There is, therefore, a clear need to have standardised improvement criteria for all levels of improvement in future clinical trials, similar to the standardised criteria developed for rheumatoid arthritis (RA) and JIA.

For these reasons, IMACS and PRINTO engaged in a joint effort to develop fully validated response criteria for juvenile DM, including criteria for minimal, moderate, and major clinical response. This report focuses on the consensus conference in which the top candidate definitions of response leading to the final juvenile DM response criteria were considered.

Methods

In previous reports,16 ,17 we described the methodology used a) to create patient profiles using natural history data and obtain expert consensus on minimal, moderate, and major improvement,16 b) to determine differential weights of the core set measures using conjoint analysis, and c) to draft six types of candidate definitions for response criteria using the myositis expert survey on thresholds of improvement and data-driven methods, such as logistic regression and conjoint analysis (table 1).

Table 1

Types of candidate definitions for response criteria that were developed and tested

Conjoint analysis is a choice modeling or discrete choice experiment, which is a valid methodology for developing composite criteria and has been used recently in rheumatology.19–22 In the conjoint analysis surveys administered using 1000Minds online software,23 experts were presented with pairs of hypothetical patient scenarios; each patient had different levels of improvement in the same 2 core set measures, assuming other core set measures remained the same. Experts rated which of the 2 scenarios had greater improvement. Based on the rater's response, the relative weights of core set measures and their levels of improvement were established and used to develop a scoring system by mathematical methods based on linear programming24 such that when all 6 core set measures are considered together, the maximum score (total improvement score) possible for representing a patient's improvement is 100, and the minimum score is 0.

We then compared the performance characteristics of the drafted definitions in the patient profiles, using expert consensus ratings as a gold standard, and externally validated the candidate response criteria by applying them to clinical trial data. This process led to the development of traditional categorical as well as continuous candidate definitions for response criteria, with thresholds for minimal, moderate, and major improvement.18 Continuous candidate definitions can also be considered hybrid definitions, because the same definition can be used either as a continuous outcome measure by using the total improvement score or as a categorical outcome measure by using the thresholds for minimal, moderate, and major improvement.

Candidate definitions were evaluated using consensus profile ratings as the gold standard, by assessing sensitivity, specificity, and area under the curve (AUC) to compare the performance of these candidate definitions. Those that performed well in the consensus profiles (sensitivity and specificity both ≥80%, AUC ≥0.9 for minimal, and AUC ≥0.8 for moderate and major improvement, using IMACS or PRINTO core set measures1) were externally validated. The PRINTO trial randomised patients with new-onset juvenile DM to receive prednisone alone (n=47) or prednisone combined with methotrexate or cyclosporine (n=46 patients per treatment arm).13 χ2 analysis was used to compare the percentages of patients meeting the candidate definitions for response at the primary end point (6 months) for the combined treatment arms versus the prednisone-alone (placebo) arm. Definitions with a significant difference (p<0.05) between treatment arms for minimal improvement were further considered. Both PRINTO and IMACS core set measures were available in this trial.
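The comparison of responder percentages between treatment arms can be illustrated with a Pearson χ2 test on a 2×2 table. The sketch below uses only the Python standard library and no continuity correction (the text does not say whether one was applied); the function name and the counts in the usage line are hypothetical and are not trial data.

```python
import math
from typing import Tuple


def chi_square_2x2(resp_a: int, n_a: int, resp_b: int, n_b: int) -> Tuple[float, float]:
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    resp_a / n_a: responders and arm size in arm A; likewise for arm B.
    Assumes every expected cell count is greater than zero.
    """
    observed = [[resp_a, n_a - resp_a], [resp_b, n_b - resp_b]]
    row_totals = [n_a, n_b]
    col_totals = [resp_a + resp_b, (n_a - resp_a) + (n_b - resp_b)]
    total = n_a + n_b
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 70 of 92 responders versus 25 of 47 responders.
chi2, p = chi_square_2x2(70, 92, 25, 47)
print(f"chi2={chi2:.2f}, p={p:.4f}, significant at 0.05: {p < 0.05}")
```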

A second trial validation data set included 48 juvenile DM patients enrolled in the Rituximab in Myositis (RIM) trial for treatment-refractory patients. It had a randomised placebo-phase design in which patients received either rituximab or placebo at weeks 0 and 1, and at weeks 8 and 9 their treatment assignment was reversed in a blinded manner.25 We used the Mann-Whitney U test to determine whether each candidate definition could differentiate between the treating physician's rating of improvement (score range 1–7) at 6 months, a time point when most patients improved and that was also comparable to that in the PRINTO trial. For the RIM trial, only the IMACS core set measures were available.
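As a rough illustration of the Mann-Whitney comparison, the U statistic can be computed by counting, over all pairs, how often a responder's physician rating exceeds a non-responder's rating, with ties counted as one half. The function name and the ratings below are hypothetical; a real analysis would use a statistics package to obtain the p value with tie correction.

```python
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U, U / (len(x) * len(y))) for sample x versus sample y.

    U counts the pairs in which the x value exceeds the y value, with
    ties contributing 0.5. The second value is the estimated probability
    that a random x exceeds a random y (ties split evenly).
    """
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u, u / (len(x) * len(y))


# Hypothetical physician ratings of improvement (score range 1-7).
ratings_responders = [6, 5, 7, 6, 4]
ratings_nonresponders = [3, 4, 2, 4]
print(mann_whitney_u(ratings_responders, ratings_nonresponders))
```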

We then selected the top candidate definitions, up to 4 top-performing definitions from each of the six different types of candidate definitions (table 1), for consideration at the final consensus conference as a manageable number of definitions to discuss.

Consensus conference

Nominal group technique was used at a consensus conference held in Paris, France on 9–10 June 2014, led by experienced moderators (LGR and NR, for the paediatric working group). The methodologies used to develop the new candidate response criteria and performance characteristics of each type of candidate definition were reviewed with the participants in a general session. The 12 paediatric working group participants first independently and then as a group reviewed the performance characteristics of the 14 top candidate definitions of response criteria for juvenile DM. Data for minimal, moderate, and major clinical response were presented for each definition, including a detailed spreadsheet of the performance in the patient profiles using the IMACS and PRINTO core set measures (sensitivity, specificity, and AUC, as well as kappa values and ORs). AUC was defined as the average of the sensitivity and specificity values for all categorical candidate definitions, as well as for thresholds for minimal, moderate, and major improvement in continuous candidate definitions. In addition, for continuous definitions, an AUC for the total improvement score was determined from the receiver operating characteristic (ROC) curve as a plot of sensitivity versus (1 − specificity) for total improvement scores as well as for thresholds.26–28 Results of the external validation for each candidate definition from the PRINTO and RIM clinical trial data sets were also presented.
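The categorical AUC described above, the average of sensitivity and specificity against the expert consensus rating, can be sketched as follows. The function name and the example labels are hypothetical, and the sketch assumes that the gold standard contains at least one improved and one not-improved profile.

```python
from typing import Sequence, Tuple


def sensitivity_specificity_auc(
    predicted: Sequence[bool], gold: Sequence[bool]
) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, their average).

    predicted: whether the candidate definition rates each profile as improved.
    gold: whether expert consensus rated the same profile as improved.
    """
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    tn = sum(1 for p, g in zip(predicted, gold) if not p and not g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, (sensitivity + specificity) / 2.0


# Hypothetical ratings for five patient profiles.
definition_says = [True, True, False, False, True]
experts_say = [True, True, True, False, False]
print(sensitivity_specificity_auc(definition_says, experts_say))
```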

Paediatric working group

After reviewing the performance of the 14 top performing candidate definitions, the 12 paediatric working group participants developed consensus response criteria for minimal, moderate, and major improvement in juvenile DM. The participants were informed of the secondary goal of reaching consensus on response criteria for both juvenile DM and adult DM/polymyositis (PM). Participants were first asked to rank their top five choices, considering the data presented, based on face validity, feasibility, and generalisability, and to determine which response criteria were most clinically meaningful. The voting process was conducted in a systematic manner with a predetermined format using nominal group technique29 ,30 facilitated by an internet-based system developed by staff at the PRINTO coordinating centre.31 ,32 Voting was done anonymously and independently using the online voting software.

After the initial round of voting, the results were shared with the group. Each participant was then asked to explain his or her top- and bottom-ranked choices to the group. The rounds of voting continued in the same manner until consensus was reached (≥80% of the votes) or until it was clear that consensus would not be reached. Between each round, after the participants were shown the results, the administrators were allowed to remove candidate definitions that decisively received a small proportion of the votes. In the final round, participants were asked to select their final top response criteria. The paediatric working group also voted on additional issues, including use of both IMACS and PRINTO core set measures and response criteria for juvenile DM that would interchange both the IMACS and PRINTO measures. Participants also voted on retesting the performance of the top candidate response criteria in future trials.

Combined paediatric and adult working group

After consensus was attained for juvenile DM response criteria, a combined working group of 22 paediatric and adult experts was formed to determine whether consensus could be reached on final, common response criteria for both juvenile DM and adult DM/PM. Common response criteria that would include both juvenile DM and adult DM/PM were considered for use in clinical trials, which might facilitate drug approvals for myositis treatment. Experienced moderators (LGR, RA, FWM, and NR) led the combined working group. For the first round of votes, the top adult and paediatric definitions from the final round of voting in each working group were considered. The online voting system was utilised again, and each participant discussed his or her top-choice candidate definition, using nominal group technique in a round-robin manner. At each round, participants were asked to select only one top candidate response criteria set; discussion was stopped once consensus of ≥80% was reached. For determining the thresholds of improvement for the selected definition, the required consensus was ≥70%, which was done by post-conference voting.

Results

The performance characteristics of 101 of 312 candidate definitions were excellent (sensitivity and specificity of ≥80%, AUC ≥0.90 for minimal improvement), and 30 candidate definitions also performed well in 2 clinical trials, in which they differentiated between treatment arms (p<0.05 for minimal improvement) and differentiated the treating physician's improvement score at week 24 (p<0.001).15

Top candidate definitions for response criteria

Fourteen top-performing candidate definitions were brought to the paediatric working group for consideration at the consensus conference (table 2 and online supplementary tables S1 and S2, available on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.40060/abstract). These candidate criteria included nine categorical definitions in which different criteria were set for minimal, moderate, and major improvement and five continuous definitions in which improvement points are given on a continuous scale that corresponds to the magnitude of improvement, with different thresholds for minimal, moderate, and major improvement. Among the nine categorical definitions, two were previously published IMACS and PRINTO response criteria,9–11 four were newly drafted definitions based on a survey of experts, and three were weighted definitions. Among the continuous definitions, two were developed by logistic regression, and three were developed from the conjoint analysis survey. Among the 14 candidate criteria considered, 11 were based on relative per cent change, and 3 were based on absolute per cent change in the core set measures.

Table 2

Detailed performance characteristics of patient profiles for the top 5 candidate definitions presented at the consensus conference*

The performance characteristics of these 14 candidate definitions are shown in table 2 and online supplementary table S1 (available on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.40060/abstract). In the patient profiles, with expert consensus as a gold standard, all definitions presented at the conference had sensitivity and specificity of ≥87% (AUC≥0.90) for minimal improvement (table 2 and online supplementary table S1). For moderate improvement, specificity decreased but was ≥80% (AUC≥0.88), and for major improvement specificity was generally ≥75% (AUC≥0.84). For continuous definitions, the AUCs (from ROC curves) for the total improvement score were generally better than the AUCs (average of sensitivity and specificity) for the thresholds of minimal, moderate, and major improvement. Performance was similar between the IMACS and PRINTO core set measures for each definition.

Almost all candidate criteria were validated using the PRINTO trial at 6 months, when they could differentiate between treatment arms, with p<0.05 for minimal improvement (table 2 and online supplementary table S1). All candidate criteria were also validated in 48 juvenile DM patients in the RIM trial.25 All definitions could differentiate the median treating physician's improvement score at week 24 (p≤0.006).

Consensus conference voting

Among the 14 candidate definitions, 13 and 11 candidate definitions of response were promoted in the first and second voting rounds, respectively. In round three, six candidate definitions were chosen, each receiving a similar number of votes. These six included the three conjoint analysis–based continuous definitions, a conjoint analysis–based weighted definition, a logistic regression absolute per cent change definition, and the previously published PRINTO preliminary response criteria.8 ,9 In the fourth round of voting and discussion, participants reached consensus on final top response criteria, a conjoint analysis–based continuous model using absolute per cent change in the IMACS or PRINTO core set measures (table 3).

Table 3

Final top response criteria for minimal, moderate, and major improvement in juvenile dermatomyositis (DM) and combined adult DM/PM and juvenile DM clinical trials and studies*

Table 2 and online supplementary table S1 (available on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.40060/abstract) show the performance characteristics in the patient profiles and the trial validation for each of the top candidate response criteria presented at the conference. For the top conjoint analysis–based continuous response criteria using absolute per cent change in each of the core set measures, the sensitivity and specificity in the patient profiles was generally >90% and the AUC >0.90 for both the IMACS and PRINTO measures. For the PRINTO trial, a difference in the treatment arms was detected for minimal and moderate improvement using the top response criteria, and in the RIM trial a difference in the physician's rating of improvement when the response criteria rated the patient as improved versus not improved was detected for minimal, moderate, and major improvement.

Paediatric experts favoured the conjoint analysis–based continuous response criteria because of the continuous improvement score that corresponds to the magnitude of improvement and provides the ability to categorise a patient's degree of change into minimal, moderate, and major improvement. The continuous model definitions also differentially weight the various core set measures, which experts thought were consistent with their assessment of the relative importance of each of the core set measures. The top response criteria were based on absolute per cent change in core set measures, which was also favoured by the participants because, given the various visual analog scale (VAS) measurements used in the core set measures, the absolute per cent changes were more congruent than relative per cent changes with actual changes that the myositis experts see in clinical practice.

Final response criteria chosen by the combined paediatric and adult working group

For this round of votes, the top 2 paediatric (table 2) and adult definitions18 were considered. Two rounds of voting resulted in final consensus response criteria, with 91% of participants voting for the conjoint analysis–based continuous response criteria based on absolute per cent change in the core set measures (table 3). It was agreed that the top response criteria would be used in future clinical trials that combined juvenile DM and adult DM/PM. Because the final response criteria were similar, participants favoured using response criteria that would be common to juvenile DM and adult DM/PM, and they favoured combined studies when possible as well as the possibility of comparing outcomes in separate studies using the same final response criteria.

Other votes

In a post-conference final vote using the Delphi method, 74% of the participants agreed to use the following paediatric threshold values for minimal, moderate, and major response in juvenile DM: total improvement score ≥30 (on a scale of 0–100) for minimal, ≥45 for moderate, and ≥70 for major improvement. In contrast, the final thresholds for minimal, moderate, and major response in adult DM/PM were ≥20, ≥40, and ≥60, respectively. The paediatric working group also reached consensus that, given the overall similarity between the IMACS and PRINTO response criteria, joint IMACS/PRINTO response criteria for juvenile DM are being proposed. The current development of the response criteria in parallel between the IMACS and PRINTO core set measures necessitates that either all of the IMACS or all of the PRINTO core set measures be used. The paediatric experts, however, committed to measure both IMACS and PRINTO core set measures in future therapeutic trials, with 92% agreement, and to continue to test the interchangeability of the IMACS and PRINTO core set measures. The group also unanimously agreed to retest the validity of the top five candidate definitions for response criteria and to utilise the other four definitions as secondary end points in future clinical trials. The top 3 of these criteria, the conjoint analysis definitions, are the same for both juvenile DM and adult DM/PM, with different thresholds of improvement (table 3 and online supplementary table S3, available on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.40060/abstract).
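The thresholds reported above translate directly into a categorical response. The sketch below maps a total improvement score (0–100) to a response level using the juvenile DM cut-offs (≥30, ≥45, ≥70) or the adult DM/PM cut-offs (≥20, ≥40, ≥60); the function and dictionary names are illustrative only.

```python
# Cut-offs for (minimal, moderate, major) response, as reported in the text.
THRESHOLDS = {
    "juvenile": (30.0, 45.0, 70.0),
    "adult": (20.0, 40.0, 60.0),
}


def classify_response(total_improvement_score: float, population: str = "juvenile") -> str:
    """Map a total improvement score (0-100) to a response category."""
    if not 0.0 <= total_improvement_score <= 100.0:
        raise ValueError("total improvement score must lie between 0 and 100")
    minimal, moderate, major = THRESHOLDS[population]
    if total_improvement_score >= major:
        return "major"
    if total_improvement_score >= moderate:
        return "moderate"
    if total_improvement_score >= minimal:
        return "minimal"
    return "none"


# The same score can fall in different categories for the two populations.
print(classify_response(42.5, "juvenile"), classify_response(42.5, "adult"))
```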

Discussion

Conjoint analysis–based continuous response criteria, based on absolute per cent change in the core set measures, were developed as the consensus- and data-driven response criteria for minimal, moderate, and major improvement in juvenile DM. For the response criteria, either IMACS or PRINTO core set measures could be used. In addition, it was agreed that the same response criteria, using the IMACS core set measures but with different thresholds for improvement, would be the consensus response criteria for adult DM/PM trials and combined juvenile DM and adult DM/PM trials in the future.18

The comprehensive process used to develop final response criteria for minimal, moderate, and major improvement in juvenile DM included the use of large, prospective, natural history data sets for juvenile DM and data from two randomised controlled trials for validation, which included a wide range of disease activity and different stages of disease, from recently diagnosed to treatment-refractory patients.13 ,15 ,25 The involvement of many clinical experts who had experience using the core set measures in juvenile DM patients was also critical. They provided input at several points throughout the process, including determining thresholds for improvement in core set measures by which definitions of response were drafted, achieving gold standard ratings of improvement by evaluating and developing consensus patient profiles, completing the conjoint analysis surveys to develop differential weights for the core set measures, and participating in the final consensus conference to achieve consensus for common response criteria with the greatest clinical face validity. The current response criteria (table 3) also resolve the differences between PRINTO and IMACS core set measures by testing candidate definitions of response criteria in parallel using both sets of measures and showing that they are largely interchangeable, and that their performance is comparable. Moreover, this project brought both IMACS and PRINTO consortia to work together for this rare disease.

The combined group of paediatric and adult experts selected the same top-choice definition but with differing thresholds for improvement, which had very similar performance characteristics and were thought to be more appropriate for use in clinical trials that would, in the future, combine adult and paediatric patients.

The final response criteria selected, conjoint analysis–based continuous response criteria using absolute per cent change in core set measures, have many advantages. For each measure, improvement points are calculated based on the level of change in that measure, and each core set measure is differentially weighted, such that changes in muscle strength and physician global activity are weighted more heavily than changes in the most abnormal enzyme value or quality of life. A total improvement score can be obtained as a continuous measure, and the means or medians of total improvement scores can be compared between treatment arms.33 A total improvement score between 0 and 100 also corresponds to the degree of improvement, with higher scores corresponding to a greater magnitude of improvement. This score may be more sensitive to change, resulting in smaller trial sample sizes.33 ,34 Alternatively, thresholds for minimal, moderate, and major improvement have been established that allow dichotomous use of the response criteria as well. Therefore, this is truly a hybrid model that can be used as either a continuous or categorical outcome measure within the same response criteria depending on the trial design and needs of the study.
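The scoring idea can be sketched as a sum of improvement points over the six core set measures. The point bands below are HYPOTHETICAL placeholders chosen only so that the maxima add up to 100 and so that muscle strength and physician global activity carry the most weight; they are not the published values, which are given in table 3.

```python
from typing import Dict, List, Tuple

# HYPOTHETICAL bands: (absolute per cent improvement must exceed this, points).
# These numbers are illustrative only and are NOT the published point values.
HYPOTHETICAL_BANDS: Dict[str, List[Tuple[float, float]]] = {
    "physician_global": [(5, 5), (20, 12), (40, 20)],
    "parent_global": [(5, 3), (20, 6), (40, 10)],
    "muscle_strength": [(5, 10), (20, 20), (40, 30)],
    "physical_function": [(5, 4), (20, 8), (40, 15)],
    "muscle_enzyme": [(5, 2), (20, 5), (40, 10)],
    "extramuscular": [(5, 5), (20, 10), (40, 15)],
}


def measure_points(abs_pct_improvement: float, bands: List[Tuple[float, float]]) -> float:
    """Return the points of the highest band whose lower bound is exceeded."""
    points = 0.0
    for lower_bound, band_points in bands:  # bands are in ascending order
        if abs_pct_improvement > lower_bound:
            points = band_points
    return points


def total_improvement_score(abs_pct_improvement: Dict[str, float]) -> float:
    """Sum the improvement points over all measures (missing measures score 0)."""
    return sum(
        measure_points(abs_pct_improvement.get(name, 0.0), bands)
        for name, bands in HYPOTHETICAL_BANDS.items()
    )


# Hypothetical patient: absolute per cent improvement in each measure.
patient = {
    "physician_global": 25, "parent_global": 10, "muscle_strength": 45,
    "physical_function": 0, "muscle_enzyme": 6, "extramuscular": 21,
}
print(total_improvement_score(patient))
```

The resulting total can be compared between treatment arms as a continuous outcome or passed to the thresholds for minimal, moderate, and major improvement.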

The response criteria allow input from all the core set measures instead of relying on only a few measures to determine whether a patient has experienced improvement. However, although these response criteria were developed using all six core set measures, the response criteria could still be used if fewer core set measures were obtained, allowing for greater flexibility in the types of patients and improvements that can occur, but we caution that the response criteria are most accurate when all six core set measures are used. As such, the response criteria signify a major advance in assessing improvement in therapeutic trials and other clinical research studies by providing data-driven response criteria that were developed by consensus of major stakeholders in the field who come from all over the world.

Prior response criteria in rheumatic diseases have included33 ,34 relative per cent change,35 ,36 whereas myositis response criteria are based on absolute per cent change. The experts favoured the use of absolute per cent change for various reasons. In this study, several core set measures used a 10-cm VAS, and the experts thought that absolute per cent change better represents the degree of change they see in clinical practice. Moreover, absolute per cent changes can be calculated when the baseline core set measure is 0 and give similar results for similar degrees of change at either end of the VAS.
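The difference between the two kinds of change can be made concrete with a 10-cm VAS. The sketch below assumes that absolute per cent change is the change divided by the range of the scale and that relative per cent change is the change divided by the baseline value; the text does not spell out either formula, so both definitions and the function names are assumptions.

```python
from typing import Optional


def relative_pct_change(baseline: float, final: float) -> Optional[float]:
    """Change as a percentage of the baseline value; None if the baseline is 0."""
    if baseline == 0:
        return None  # undefined: division by zero
    return (final - baseline) / baseline * 100.0


def absolute_pct_change(baseline: float, final: float,
                        scale_min: float, scale_max: float) -> float:
    """Change as a percentage of the full range of the scale (assumed definition)."""
    return (final - baseline) / (scale_max - scale_min) * 100.0


# A 2-cm change on a 10-cm VAS, at both ends of the scale and from a zero baseline.
for baseline, final in [(8.0, 6.0), (2.0, 0.0), (0.0, 2.0)]:
    rel = relative_pct_change(baseline, final)
    ab = absolute_pct_change(baseline, final, 0.0, 10.0)
    print(f"{baseline} -> {final}: relative={rel}, absolute={ab}")
```

Under these assumptions, the same 2-cm change gives the same absolute per cent change wherever it occurs on the scale, whereas the relative change varies with the baseline and is undefined when the baseline is 0.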

The participants also favoured using the same response criteria for juvenile DM and adult DM/PM, but with cut points or thresholds for improvement specific to paediatric or adult patients. Having common response criteria facilitates the potential to conduct combined clinical trials, such as the RIM trial,25 and to compare the outcomes of trials and studies conducted separately. Participants agreed to include other top-performing definitions that were highly rated as secondary end points for future clinical trials. Among these were not only other conjoint analysis–based continuous models but also the published PRINTO preliminary response criteria.10 ,11 Future work should also evaluate whether a baseline composite score threshold derived from the PRINTO or IMACS core set measures could be used as inclusion criteria for future clinical trials.

Limitations of the present work include the lack of a placebo group in the RIM trial. For this reason, the physician's assessment of improvement at 6 months was used instead. We were fortunate to have another controlled clinical trial for juvenile DM that had three treatment arms to use for external validation,13 in which we evaluated the ability of the candidate definitions to differentiate between treatment arms. Although thresholds for major improvement were developed and validated in fewer patients, we believe that it was sufficient given that 29% of patients had major improvement in patient profiles, and 17% had major improvement in the clinical trials used for validation. The final conjoint analysis–based continuous response criteria also do not address worsening in the core set measures; however, this generally does not affect the outcome, because when patients are rated as improved, no more than 1 or 2 measures worsen in our clinical data sets. Also, although we tested the interchange of IMACS and PRINTO core set measures, we tested these variations as 2 parallel core set measures but did not examine intermixing the PRINTO and IMACS core set measures. Further work to examine the interchangeability of the IMACS and PRINTO core set measures will be needed.

The data sets used to develop the new response criteria primarily contained information about patients with a recent diagnosis or those experiencing a disease flare, and further work is needed to determine how the response criteria perform in patients with longstanding disease or those with significant disease-related damage. Finally, although application of the criteria might seem cumbersome, as regularly done for JIA and RA, the evaluation of improvement will be facilitated by appropriate dedicated software or ‘apps’, or in the future, by simplification of the manner in which the core set measures are evaluated (eg, similar to the Juvenile Arthritis Disease Activity Score for JIA).37 The time required to apply these criteria is estimated to be 25–35 min to complete the core set measures at each visit1 and 2–3 min to hand-calculate the total improvement score and degree of response. Both IMACS and PRINTO are developing a web-based tool as well as a downloadable calculator that will allow easy administration of the response criteria and immediate calculation. The apparent complexity is, however, counterbalanced by the establishment of different validated levels of improvement, which constitute the real novelty of this project and which have never been validated as such for either RA or JIA, despite being regularly reported in clinical trials.
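As an illustration of the kind of hand calculation and calculator described above, the following is a minimal Python sketch, not the IMACS or PRINTO tool. It assumes that the absolute per cent change of a measure is computed as (final value − baseline value)/range × 100, that each measure's change is mapped to points through a banded table, and that the summed total improvement score is compared with cut points. All function names are hypothetical, and the band values and cut points in the usage lines are illustrative placeholders that must be checked against the published scoring table before any real use.

```python
from typing import Dict, List, Tuple


def absolute_percent_improvement(baseline: float, final: float,
                                 scale_min: float, scale_max: float,
                                 higher_is_better: bool = False) -> float:
    """Absolute per cent change relative to the scale range, signed so that
    a positive value always means improvement."""
    span = scale_max - scale_min
    if span <= 0:
        raise ValueError("scale_max must be greater than scale_min")
    change = (final - baseline) * 100.0 / span
    # For activity scales (lower is better) a decrease is an improvement.
    return change if higher_is_better else -change


def measure_points(improvement_pct: float,
                   bands: List[Tuple[float, float]]) -> float:
    """Return the points of the highest band whose lower bound is exceeded.
    `bands` is a list of (exclusive lower bound in %, points)."""
    points = 0.0
    for lower_bound, band_points in sorted(bands):
        if improvement_pct > lower_bound:
            points = band_points
    return points


def classify_response(total_score: float,
                      cut_points: Dict[str, float]) -> str:
    """Return the highest response category whose cut point is reached."""
    label = "no improvement"
    for name, threshold in sorted(cut_points.items(), key=lambda kv: kv[1]):
        if total_score >= threshold:
            label = name
    return label


# Illustrative placeholders only (assumptions, not the published table).
example_bands = [(5.0, 7.5), (15.0, 15.0), (25.0, 17.5), (40.0, 20.0)]
example_cut_points = {"minimal": 30.0, "moderate": 45.0, "major": 70.0}

# A 0-10 activity scale falling from 8 to 4 is a 40% absolute improvement.
pct = absolute_percent_improvement(8.0, 4.0, 0.0, 10.0)
pts = measure_points(pct, example_bands)
print(pct, pts, classify_response(45.0, example_cut_points))
```

In a full calculation, the points from all core set measures would be summed to give the total improvement score before classification; the single-measure example here only shows the mechanics.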

In summary, conjoint analysis–based continuous response criteria that establish different thresholds for minimal, moderate, and major improvement and utilise the absolute per cent change in core set measures were chosen as the consensus response criteria for juvenile DM and were validated using both natural history and trial data. These response criteria should be highly acceptable and widely used given that they were developed with consensus among many myositis experts worldwide. They should be sensitive in detecting differences in improvement and in quantitating the degree of improvement, as seen in the two clinical trials. Thus, clinical trials that test new therapies for juvenile DM should be easier to design, conduct, and compare.

This criteria set has been approved by the American College of Rheumatology (ACR) Board of Directors and the European League Against Rheumatism (EULAR) Executive Committee. This signifies that the criteria set has been quantitatively validated using patient data, and it has undergone validation based on an independent data set. All ACR/EULAR-approved criteria sets are expected to undergo intermittent updates.

The ACR is an independent, professional, medical and scientific society that does not guarantee, warrant, or endorse any commercial product or service.

Acknowledgments

We thank the following individuals for providing invaluable input and feedback on project development and support: Dr Daniel Aletaha (European League Against Rheumatism), Drs Suzette Peng and Sarah Yim (Food and Drug Administration), Drs Thorsten Vetter and Richard Vesely (European Medicines Agency), Bob Goldberg and Theresa Curry (The Myositis Association), Rhonda McKeever and Patti Lawler (Cure JM Foundation), and Irene Oakley (Myositis UK). We also thank Drs Michael Ward and Steven Pavletic for their critical review of the manuscript. Paul Hansen, who with Franz Ombler owns and co-invented the 1000Minds software referred to in the article, provided intellectual and logistic support for this project.

References

Footnotes

  • Handling editor Tore K Kvien

  • LGR and RA contributed equally and JV and NR contributed equally.

  • This article is published simultaneously in the May 2017 issue of Arthritis & Rheumatology.

  • Submitted for publication 11 February 2016; accepted in revised form 31 January 2017

  • Collaborators Appendix A MEMBERS OF THE INTERNATIONAL MYOSITIS ASSESSMENT AND CLINICAL STUDIES GROUP AND THE PAEDIATRIC RHEUMATOLOGY INTERNATIONAL TRIALS ORGANISATION WHO CONTRIBUTED TO DEVELOPING THE RESPONSE CRITERIA

    Steering committee: Lisa G Rider (co-principal investigator), Nicolino Ruperto (co-principal investigator), Rohit Aggarwal (methodology lead), Frederick W Miller, Jiri Vencovsky.

    Statistical team: Rohit Aggarwal, Brian Erman, Nastaran Bayat, Angela Pistorio, Adam M. Huber, Brian M Feldman, Paul Hansen, Howard Rockette, Peter A Lachenbruch, Nicolino Ruperto, Lisa G. Rider.

    Paediatric core set survey group: Maria Apaz, Suzanne Bowyer, Rolando Cimaz, Tamás Constantin, Megan Curran, Joyce Davidson, Brian M. Feldman, Thomas Griffin, Adam M. Huber, Olcay Jones, Susan Kim, Bianca Lang, Carol Lindsley, Daniel Lovell, Claudia Saad Magalhaes, Lauren M. Pachman, Clarissa Pilkington, Andrea Ponyi, Marilynn Punaro, Pierre Quartier, Athimalaipet V Ramanan, Angelo Ravelli, Ann Reed, Robert Rennebohm, David D Sherry, Clovis A Silva, Elizabeth Stringer, Annet van Royen-Kerkhof, Carol Wallace.

    Clinical trial or natural history study data set contributors: Frederick W. Miller, Chester V. Oddis, Ann Reed, Lisa G. Rider, Nicolino Ruperto, and PRINTO members.

    Paediatric patient profile working group: Maria Apaz, Tadej Avcin, Mara Becker, Michael W. Beresford, Rolando Cimaz, Tamás Constantin, Megan Curran, Ruben Cuttica, Joyce Davidson, Frank Dressler, Jeffrey Dvergsten, Sheila Knupp Feitosa de Oliveira, Brian M. Feldman, Virginia Paes Leme Ferriani, Berit Flato, Valeria Gerloni, Thomas Griffin, Michael Henrickson, Claas Hinze, Mark Hoeltzel, Adam M. Huber, Maria Ibarra, Norman Ilowite, Lisa Imundo, Olcay Jones, Susan Kim, Daniel Kingsbury, Bianca Lang, Carol Lindsley, Daniel Lovell, Alberto Martini, Claudia Saad Magalhaes, Bo Magnusson, Sheilagh Maguiness, Susan Maillard, Pernille Mathiesen, Liza McCann, Susan Nielsen, Lauren M. Pachman, Murray Passo, Clarissa Pilkington, Marilynn Punaro, Pierre Quartier, Egla Rabinovich, Athimalaipet V. Ramanan, Angelo Ravelli, Ann Reed, Robert Rennebohm, Lisa G. Rider, Rafael Rivas-Chacon, Angela Byun Robinson, Kelly Rouster-Stevens, Ricardo Russo, Lidia Rutkowska-Sak, Adriana Sallum, Helga Sanner, Heinrike Schmeling, Duygu Selcen, Bracha Shaham, David D. Sherry, Clovis A. Silva, Charles H. Spencer, Robert Sundel, Marc Tardieu, Akaluck Thatayatikom, Janjaap van der Net, Annet van Royen-Kerkhof, Dawn Wahezi, Carol Wallace, Francesco Zulian.

    Conjoint analysis, paediatric group: Rolando Cimaz, Tamás Constantin, Ruben Cuttica, Joyce Davidson, Frank Dressler, Brian M. Feldman, Thomas Griffin, Michael Henrickson, Adam M. Huber, Lisa Imundo, Bianca Lang, Carol Lindsley, Claudia Saad Magalhaes, Bo Magnusson, Susan Maillard, Lauren M. Pachman, Murray Passo, Clarissa Pilkington, Marilynn Punaro, Angelo Ravelli, Ann Reed, Lisa G. Rider, Kelly Rouster-Stevens, Ricardo Russo, Bracha Shaham, Robert Sundel, Janjaap van der Net, Annet van Royen-Kerkhof.

    Participants in consensus conference, paediatric working group: Rolando Cimaz, Rubén Cuttica, Sheila Knupp Feitosa de Oliveira, Brian M. Feldman, Adam M. Huber, Carol B. Lindsley, Clarissa Pilkington, Marilynn Punaro, Angelo Ravelli, Ann Reed, Kelly Rouster-Stevens, Annet van Royen-Kerkhof.

    Participants in consensus conference, adult working group: Anthony Amato, Hector Chinoy, Robert G Cooper, Maryam Dastmalchi, Marianne de Visser, David Fiorentino, David Isenberg, James Katz, Andrew Mammen, Chester V. Oddis, Jiri Vencovsky, Steven R Ytterberg.

  • See Appendix A for members of the International Myositis Assessment and Clinical Studies Group and the Paediatric Rheumatology International Trials Organisation who contributed to developing the response criteria.

  • Contributors All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. LGR and NR had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Study conception and design: LGR, RA, FWM, JV, NR. Acquisition of data: LGR, RA, AP, NB, BE, BMF, AMH, RC, RJC, SKO, CBL, CAP, MP, AR, AMR, KR, AR, FD, CS, TC, JED, BM, RR, LV, MR, HR, PAL, JV, NR. Analysis and interpretation of data: LGR, RA, CAP, NB, BE, BMF, AMH, HR, PAL, FWM, JV, NR.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; internally peer reviewed.

  • Supported in part by the American College of Rheumatology, the European League Against Rheumatism, the NIH (Intramural Research Programs of the National Institute of Environmental Health Sciences (NIEHS), National Center for Advancing Translational Sciences, and National Institute of Arthritis and Musculoskeletal and Skin Diseases), Istituto G. Gaslini and the Paediatric Rheumatology International Trials Organisation (PRINTO), Cure JM Foundation, Myositis UK, and the Myositis Association. Dr. Vencovsky's work was supported by the Ministry of Health, Czech Republic (Institute of Rheumatology project for conceptual development of a research organisation, 00023728).
