Development and preliminary validation of the Sjögren’s Tool for Assessing Response (STAR): a consensual composite score for assessing treatment effect in primary Sjögren’s syndrome
  1. Raphaele Seror1,2,
  2. Gabriel Baron3,4,
  3. Marine Camus1,2,
  4. Divi Cornec5,6,
  5. Elodie Perrodeau3,4,
  6. Simon J Bowman7,8,9,
  7. Michele Bombardieri10,
  8. Hendrika Bootsma11,
  9. Jacques-Eric Gottenberg12,13,
  10. Benjamin Fisher14,15,
  11. Wolfgang Hueber16,
  12. Joel A van Roon17,
  13. Valérie Devauchelle-Pensec5,6,
  14. Peter Gergely18,
  15. Xavier Mariette1,2,
  16. Raphael Porcher3,4
  17. on behalf of the NECESSITY WP5 - STAR development working group
    1. 1 Paris-Saclay University, INSERM UMR1184: Centre for Immunology of Viral Infections and Autoimmune Diseases, Le Kremlin-Bicetre, France
    2. 2 Rheumatology, Assistance Publique-Hôpitaux de Paris (AP-HP), Hôpitaux universitaires Paris-Sud - Hôpital Bicêtre, Le Kremlin-Bicêtre, France
    3. 3 Assistance Publique Hôpitaux de Paris, Hôtel Dieu hospital, Paris, France
    4. 4 Centre d'Epidémiologie Clinique, INSERM U1153, Faculté de Médecine, Université Paris Descartes, Paris, France
    5. 5 Rhumatologie, CHU Brest, Brest, France
    6. 6 Université de Brest, INSERM UMR 1227, LBAI, Brest, France
    7. 7 Rheumatology, University Hospitals Birmingham, Birmingham, UK
    8. 8 Rheumatology, Milton Keynes University Hospital, Milton Keynes, UK
    9. 9 University of Birmingham, Birmingham, UK
    10. 10 Experimental Medicine and Rheumatology, William Harvey Research Institute, Queen Mary University of London, London, UK
    11. 11 Rheumatology and Clinical Immunology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
    12. 12 Rheumatology, University Hospital of Strasbourg, Strasbourg, France
    13. 13 Université de Strasbourg, IBMC, CNRS, UPR3572, Strasbourg, France
    14. 14 National Institute for Health Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK
    15. 15 Rheumatology Research Group, Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK
    16. 16 Novartis Pharma, Basel, Switzerland
    17. 17 Immunology, Rheumatology and Clinical Immunology, Center of Translational Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
    18. 18 Novartis Institutes for BioMedical Research Basel, Basel, Switzerland
    1. Correspondence to Dr Raphaele Seror, Rheumatology, Hopital Bicetre, Le Kremlin-Bicetre 94270, France; raphaele.seror{at}aphp.fr

    Abstract

    Objective To develop a composite responder index in primary Sjögren’s syndrome (pSS): the Sjögren’s Tool for Assessing Response (STAR).

    Methods To develop STAR, the NECESSITY (New clinical endpoints in primary Sjögren’s syndrome: an interventional trial based on stratifying patients) consortium used data-driven methods based on nine randomised controlled trials (RCTs) and consensus techniques involving 78 experts and 20 patients. Based on reanalysis of rituximab trials and the literature, the Delphi panel identified a core set of domains with their respective outcome measures. STAR options combining these domains were proposed to the panel for selection and improvement. For each STAR option, sensitivity to change was estimated by the C-index in nine RCTs. Delphi rounds were run for selecting STAR. For the options remaining before the final vote, a meta-analysis of the RCTs was performed.

    Results The Delphi panel identified five core domains (systemic activity, patient symptoms, lachrymal gland function, salivary gland function and biological parameters), and 227 STAR options combining these domains were selected to be tested for sensitivity to change. After two Delphi rounds, a meta-analysis of the 20 remaining options was performed. The candidate STAR was then selected by a final vote based on metrological properties and clinical relevance.

    Conclusion The candidate STAR is a composite responder index that includes all main disease features in a single tool and is designed for use as a primary endpoint in pSS RCTs. The rigorous and consensual development process ensures its face and content validity. The candidate STAR showed good sensitivity to change and will be prospectively validated by the NECESSITY consortium in a dedicated RCT.

    • Sjogren's syndrome
    • outcome assessment, health care
    • patient reported outcome measures

    Data availability statement

    No data are available. Data have been obtained and analysed in accordance with the NECESSITY consortium agreement and relevant data sharing agreements signed among the parties. They cannot be shared with other partners.

    This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.

    Key messages

    What is already known about this subject?

    • Today, there is still no disease-modifying antirheumatic drug licensed for patients with primary Sjögren’s syndrome (pSS).

    • One explanation lies in the limitations of the outcome measures currently used as primary endpoints, for example, a high placebo response rate, evaluation of either symptoms or systemic activity but not both, and important disease features not being assessed.

    What does this study add?

    • We herein developed a consensual composite endpoint, the Sjögren’s Tool for Assessing Response (STAR), using data-driven methods based on nine randomised controlled trials and consensus techniques based on the opinion of 78 experts and 20 patients.

    • STAR aims to resolve the issues with current outcome measures in pSS and encompasses all disease features in a single tool.

    How might this impact on clinical practice or future developments?

    • STAR is intended for use in clinical trials as an efficacy endpoint and aims to become the reference standard outcome measure in pSS.

    Introduction

    For decades, evidence-based therapy in primary Sjögren’s syndrome (pSS) has largely been based on sicca features or patient-reported outcomes (PROs). Over the past 20 years, work from an international consortium, supported by the European Alliance of Associations for Rheumatology (EULAR), has led to the development and validation of the consensual EULAR Sjögren’s Syndrome Disease Activity Index (ESSDAI) and EULAR Sjögren’s Syndrome Patient Reported Index (ESSPRI).1–3 Both have emerged as reference standards to measure systemic activity and patients’ symptoms, respectively.

    Thus, ESSDAI has been used as a primary endpoint in recent randomised controlled trials (RCTs) testing biologics, and for the first time in pSS four RCTs have met their primary endpoint.4–7 ESSDAI has shown promising capability to monitor changes in disease activity and assess therapeutic efficacy. Nonetheless, several trials failed to show improvement in ESSDAI,8–10 perhaps due to inefficacy of the drugs, but also potentially due to the relatively high placebo response rates observed with ESSDAI. The lack of efficacy may also be explained by the absence from ESSDAI of important features, such as patients’ symptoms and glandular function.11 Recent RCTs showed that improvement in ESSDAI does not necessarily translate to improvement in PROs.4–7 12 Thus, used as the sole primary endpoint, ESSDAI does not capture all important disease features. These limitations are inherent to scale constructs and highlight the need for a composite endpoint able to assess the disease globally.13

    The NECESSITY (New clinical endpoints in primary Sjögren’s syndrome: an interventional trial based on stratifying patients) consortium (https://www.necessity-h2020.eu/) includes pSS experts from academia, the pharmaceutical industry and patient groups, brought together to develop a new composite responder index, the Sjögren’s Tool for Assessing Response (STAR). STAR aims to resolve the issues with current outcome measures in pSS and is intended for use in clinical trials as an efficacy endpoint. We herein report its development process.

    Methods

    The development and preliminary retrospective validation of STAR followed the OMERACT (Outcome Measures in Rheumatology) guidelines14 and consisted of three steps (figure 1), combining data-driven methods from nine RCTs (table 1) and consensus methods. The Delphi panel was formed by 78 pSS international experts (57 clinicians, 21 scientists) and 20 patients with pSS (online supplemental 1).

    Figure 1

    STAR development process. STAR, Sjögren’s Tool for Assessing Response.

    Table 1

    Description of the nine randomised controlled trials used for the development of STAR and their classification by the expert panel

    Step 1: identification of the STAR core set

    This step aimed to select the core set of domains of relevance in assessing treatment response in pSS and the measurement tool and definition of response for each domain.

    We used data from two rituximab trials because, although they failed to demonstrate treatment efficacy in their primary endpoint relying on PROs, clinical experience suggests that rituximab should work in at least some patients and for some endpoints.15–17 When only a portion of patients respond to the treatment, it might preclude the identification of an average treatment effect in the whole population. Many statistical methods exist to maximise the chances of detecting parameters that show differential change between active and placebo arms. We here used the virtual twins approach, which identifies subgroups with enhanced probability of response based on their baseline characteristics and which estimates the treatment effect in each subgroup while correcting for optimism due to the data-driven process.18

    Identification of subsets of responders

    The panellists first agreed, based on expertise, literature and patient feedback, on baseline variables to include in the analyses (ie, main pSS characteristics suspected to be associated with response to treatment), and on the definitions of response to treatment, based on existing outcome measures in pSS and validated cut-offs.

    Virtual twins regression trees were computed for each definition of response and each set of baseline variables. Responder subsets (ie, a branch of virtual twins analysis) were selected by the lead team based on statistical criteria (a relative risk of response to treatment vs placebo notably higher than in the whole population and a sufficient number of patients (≥60) for statistical power) and clinical relevance (subset identified by a definition of response including both physician and PROs).
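
    As a minimal sketch of the statistical screen described above, the following Python computes the relative risk of response in a subset and in the whole population and applies the ≥60-patient rule. The function names, the dictionary keys and the `min_rr_ratio` margin are assumptions introduced for illustration: the text only requires the subset relative risk to be notably higher than in the whole population, without giving a numerical threshold, and the clinical relevance criterion is not modelled here.

```python
def relative_risk(resp_trt, n_trt, resp_pbo, n_pbo):
    """Relative risk of response, active arm versus placebo arm."""
    if n_trt <= 0 or n_pbo <= 0:
        raise ValueError("both arms must contain patients")
    risk_pbo = resp_pbo / n_pbo
    if risk_pbo == 0:
        raise ValueError("placebo response rate is zero; relative risk undefined")
    return (resp_trt / n_trt) / risk_pbo


def keep_subset(subset, whole, min_patients=60, min_rr_ratio=1.5):
    """Statistical screen for a candidate responder subset.

    `subset` and `whole` are dicts with keys resp_trt, n_trt, resp_pbo, n_pbo.
    `min_rr_ratio` is a hypothetical margin standing in for 'notably higher';
    it does not come from the article.
    """
    n_subset = subset["n_trt"] + subset["n_pbo"]
    rr_subset = relative_risk(subset["resp_trt"], subset["n_trt"],
                              subset["resp_pbo"], subset["n_pbo"])
    rr_whole = relative_risk(whole["resp_trt"], whole["n_trt"],
                             whole["resp_pbo"], whole["n_pbo"])
    return n_subset >= min_patients and rr_subset >= min_rr_ratio * rr_whole
```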

    Identification of items sensitive to change

    The items most sensitive to change were identified based on their effect size (ES; with their 95% CI) for the between-group difference of change in score from baseline to week 24 and in score at week 24 in each responder subset and the whole population. The ES for the difference between groups was assessed by the Cohen’s d measure, assuming a pooled SD.19 CIs were estimated using the non-centrality parameter approach. This method searches for the best non-central parameter (NCP) of the non-central t distribution for the desired tail probabilities, and these NCPs are then converted to the corresponding ES.20 The larger the ES, the greater the sensitivity to change.21 ES values are commonly considered large (>0.8), moderate (0.5–0.8) or small (<0.5). The following outcomes were analysed: specific scores (ESSDAI, ESSPRI, and physician and patient global assessment), dryness (global, oral and ocular), pain and fatigue Visual Analogue Scale, glandular function (Schirmer’s test and salivary flow), and biological variables (β2 globulin, serum IgG, γ-globulin, erythrocyte sedimentation rate, rheumatoid factor (RF) and C4 complement).
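
    The effect size computation described above can be sketched as follows, assuming each arm is given as a list of change scores (baseline to week 24). The function names are illustrative, and the sketch covers only the point estimate and the size labels; the CI based on the non-central t distribution is not implemented here.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(active, placebo):
    """Cohen's d for the between-group difference, using a pooled SD."""
    n1, n2 = len(active), len(placebo)
    if n1 < 2 or n2 < 2:
        raise ValueError("each arm needs at least two observations")
    # Pooled variance weights each arm's sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(active)
                  + (n2 - 1) * variance(placebo)) / (n1 + n2 - 2)
    return (mean(active) - mean(placebo)) / sqrt(pooled_var)


def es_label(d):
    """Label an effect size with the conventional thresholds quoted in the text."""
    size = abs(d)
    if size > 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    return "small"
```

    The sign of d depends on the direction of the score: for scales where a decrease means improvement, the arms may need to be swapped or the change scores negated before interpreting the result.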

    Selection of domains, items and definition of response

    The results of the analyses on sensitivity to change and relevant literature review were presented to the Delphi panel. The scoping review of the literature on outcome measures in pSS will be published elsewhere. Based on these data and on clinical experience, the Delphi panellists were asked to rate the importance of measuring each outcome in the context of assessing treatment response in clinical trials (from not important (1–3) to critical (7–9) on a 9-point Likert scale) and to provide comments and suggest new domains or measurements. Items scored as critical (score ≥7) by ≥50% of the panellists were selected and were defined as the domains to include in STAR. Several items and definitions of response were selected in each domain.
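
    The selection rule stated above (scored as critical, that is ≥7, by at least 50% of the panellists) can be written as a short check. This is a sketch with illustrative names; it assumes one integer rating per panellist on the 1 to 9 scale and does not model comments or newly suggested domains.

```python
def is_selected(ratings, critical_min=7, min_share=0.5):
    """Return True when at least `min_share` of ratings are critical (>= critical_min)."""
    if not ratings:
        raise ValueError("no ratings supplied")
    if any(r < 1 or r > 9 for r in ratings):
        raise ValueError("ratings must lie on the 9-point scale (1 to 9)")
    critical = sum(1 for r in ratings if r >= critical_min)
    return critical / len(ratings) >= min_share
```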

    Step 2: construction of STAR options

    The lead team prepared the drafts of the STAR options, combining the items and definitions of response identified previously. These draft options, along with the recently developed concise Composite of Relevant Endpoints for Sjögren’s Syndrome (CRESS),22 were presented to the panellists, who selected by vote which designs would be analysed in the next step. They could also suggest combinations and alternative measurement tools or thresholds. Designs with ≥50% of votes, modified as per experts’ suggestions, were selected.

    Step 3: evaluation of sensitivity to change of STAR options and selection of the candidate STAR

    This phase aimed at selecting the candidate STAR and relied on analysis of nine RCTs completed at the time of analysis (table 1).

    Analysis of sensitivity to change of STAR options

    The responder rate in each group for binary options (or the mean score for continuous options) was calculated for each STAR option in each RCT. Sensitivity to change was estimated using the concordance (C) index,23 which for a binary outcome is similar to the area under the receiver operating characteristic curve. It ranges from 0 to 1 and is interpreted as follows: 1, perfectly discriminant; between 0.5 and 1, more discriminant than random; 0.5, no better than random; and <0.5, worse than random.
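
    One common way to compute a concordance index between two trial arms is to compare every active-arm patient with every placebo-arm patient, counting ties as one half. The sketch below follows that pairwise definition; it is an assumption about the form of the calculation, not a reproduction of the method in the cited reference, and it assumes that higher values mean a better response (1 = responder, 0 = non-responder for binary options).

```python
def c_index(active, placebo):
    """Probability that a random active-arm patient scores higher than a
    random placebo-arm patient, with ties counted as 0.5."""
    if not active or not placebo:
        raise ValueError("both arms must contain patients")
    wins = 0
    ties = 0
    for a in active:
        for p in placebo:
            if a > p:
                wins += 1
            elif a == p:
                ties += 1
    return (wins + 0.5 * ties) / (len(active) * len(placebo))
```

    For a binary option this reduces to 0.5 + 0.5 × (responder rate in the active arm − responder rate in the placebo arm), so identical responder rates give exactly 0.5.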

    Voting for top 10

    These analyses along with explanations on data interpretation were presented to the expert panel. They were asked to vote for their top 10 options. During a follow-up meeting, the results of the vote were discussed to consensually select the options for the next step.

    Meta-analysis of the selected options

    To better appraise the sensitivity to change of the remaining STAR options, the Delphi panel decided to perform a meta-analysis of the nine RCTs. The Delphi panel voted on which trials they considered positive, negative or ‘in between’ with regard to primary but also key secondary endpoints. A study that failed to meet its primary outcome was considered ‘in between’ if the experts agreed that there was sufficient signal of benefit in the secondary outcomes. Meta-analyses were run for 'positive' and ‘in between’ trials together in which positive results were expected, and separately for negative trials in which no difference between groups was expected.

    For binary outcomes, meta-analyses were run using the Mantel-Haenszel method with the Paule-Mandel estimator for τ², Q-profile method for the CI of τ² and τ, and continuity correction of 0.5 in studies with zero cell frequencies.24 For continuous outcomes, the inverse variance method was used with the Paule-Mandel estimator for τ², Q-profile method for the CI of τ² and τ, and Hedges’ g.

    For binary scores, the treatment effect was expressed as OR, where 1 or below indicates absence of any effect, while above 1 favours the experimental treatment. For continuous scores, the treatment effect was expressed as standardised mean difference, where 0 indicates absence of any effect, while above 0 favours the experimental treatment. Consequently, a STAR option that is sensitive and specific to change should have a treatment effect close to the null effect for the negative trials and as far as possible from the null effect for the positive trials.
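
    To make the pooled OR concrete, the sketch below computes the classical Mantel-Haenszel pooled odds ratio across trials, adding 0.5 to every cell of a trial that has a zero cell. It is deliberately simplified: it gives the fixed-effect point estimate only and does not implement the Paule-Mandel estimator of τ² or the Q-profile CIs mentioned above. The function name and the table layout are assumptions made for illustration.

```python
def mantel_haenszel_or(tables, correction=0.5):
    """Pooled odds ratio over 2x2 tables.

    Each table is (a, b, c, d):
      a = responders, active arm       b = non-responders, active arm
      c = responders, placebo arm      d = non-responders, placebo arm
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        # Continuity correction applied only to trials with a zero cell.
        if 0 in (a, b, c, d):
            a, b, c, d = (x + correction for x in (a, b, c, d))
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("pooled odds ratio undefined for these tables")
    return numerator / denominator
```

    A pooled OR above 1 favours the experimental treatment, in line with the interpretation given in the text.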

    Voting for top 3

    The results of the meta-analyses were shared with the Delphi panel, who then voted for their top 3 options. During a follow-up meeting, the results were discussed to consensually select the options for the next step.

    Voting for the candidate STAR

    A final vote was run to select the candidate STAR based on clinical relevance.

    Patient involvement

    The NECESSITY Patient Advisory Group (PAG) representatives were involved in all steps and participated in every discussion meeting. Other patients contacted by the PAG representatives participated anonymously in the development of STAR (steps 1 and 3). Only PAG representatives participated in step 2 because this exercise required technical knowledge of endpoint construction. The background information provided in each survey was tailored to the patients.

    Results

    Step 1: identification of the STAR core set

    Identification of subsets of responders

    The Delphi panel selected two sets of baseline variables for analyses, one with ESSDAI and ESSPRI total scores (set 1) and one with their subscales/domains (set 2) (online supplemental 2), and proposed 14 definitions of response to treatment (online supplemental 3). Virtual twins regression trees were computed and the lead team selected four responder subsets (online supplemental 4).

    Identification of items sensitive to change

    Analysis of sensitivity to change of each outcome revealed that some outcomes improved significantly more in the rituximab arms than in the placebo arms in at least one responder subset and/or in the whole population (figure 2): (1) among PROs, dryness (overall, oral or ocular) and ESSPRI; (2) among objective dryness measures, unstimulated whole salivary flow (UWSF) but not Schirmer’s test; and (3) among biological markers, serum IgG, γ-globulin and RF levels. By contrast, systemic scores did not improve in any subset, except for physician global assessment in subset 3. The results were similar when analysing the ES for between-group differences of change in score from baseline to week 24 (figure 2) or the final value at week 24 (online supplemental 5).

    Figure 2

    Sensitivity to change of each individual outcome in the combined analysis of the TEARS and TRACTISS rituximab trials. Sensitivity to change is represented by the Cohen’s effect size and the 95% CI. Analyses relied on a combined analysis of data from TEARS and TRACTISS rituximab trials. Cohen’s effect size and 95% CI for the standardised difference in mean change from baseline to W24 were computed for each outcome in the four responder subsets and in the whole population of the two trials. ESR, erythrocyte sedimentation rate; ESSDAI, EULAR Sjögren’s Syndrome Disease Activity Index; ESSPRI, EULAR Sjögren’s Syndrome Patient Reported Index; PtGA, patient global assessment; PhGA, physician global assessment; RF, rheumatoid factor; TEARS, Tolerance and Efficacy of Rituximab in primary Sjögren Syndrome; TRACTISS, TRial of Anti-B Cell Therapy In patients with primary Sjögren Syndrome; UWSF, unstimulated whole salivary flow; VAS, Visual Analogue Scale; W, duration in weeks.

    Selection of domains, items and definition of response

    Five domains were identified by the Delphi panel: systemic activity, patient symptoms, lachrymal gland function, salivary gland function and biological parameters. No other domain was suggested.

    For each domain, voting results, as well as clinical relevance, feasibility at clinical sites, and acceptability for patients and regulatory agencies, were considered when selecting the measurement tools. Thus, the Delphi panel selected either one or two measurement tools per domain (online supplemental 6 and 7). For the systemic domain, clinESSDAI was preferred to ESSDAI to avoid redundant recording of the biological parameter.25 For each glandular domain, two measurement tools were included to ensure the score could be calculated regardless of equipment availability at clinical sites.

    Step 2: construction of STAR options

    Various designs for STAR were prepared by the lead team (table 2). The designs were inspired by the Disease Activity Score 28,26 Systemic Lupus Responder Index,27 American College of Rheumatology response criteria28 and by the recently developed clinical CRESS.22 Various cut-off values were proposed for each measurement. In some designs, due to their importance, systemic activity and PROs were defined as major domains that must improve to meet the definition of a responder. A total of 227 options were selected after voting and a discussion meeting (online supplemental 8).

    Table 2

    Description of the STAR design proposed (step 2) and tested for sensitivity to change (step 3)

    Step 3: evaluation of sensitivity to change of STAR options and selection of the candidate STAR

    Analysis of sensitivity to change was run for the 227 options in the nine RCTs (online supplemental 9). Options in STAR design 1 were rejected because it was not possible to obtain a stable estimation of domain weights to construct a score. Of the 225 remaining options, 189 were never selected and were rejected, and 16 additional options, found to be redundant or less clinically relevant than the others, were rejected during the follow-up meeting. Consequently, 20 options moved to the next step.

    Based on the panellists’ classification of RCTs (online supplemental 10), meta-analyses were computed separately for trials considered 'positive' and for trials considered 'negative' by the experts (figure 3) to allow for comparison of sensitivity and specificity to change, respectively. Based on these results, the panellists voted for their top 3 options. Five options not selected by any panellist were not included in the final vote.

    Figure 3

    Results of meta-analyses on six studies considered positive and three considered negative by the experts. Meta-analyses were performed for the 20 STAR options that reached the final step and are presented for binary endpoints (panel A) and continuous endpoints (panel B). Interpretation: a score that is sensitive to change and specific to the treatment should have a treatment effect close to the null effect in the negative trials and as far as possible from the null effect in the positive trials. cont, continuous; CRESS, Composite of Relevant Endpoints for Sjögren’s Syndrome; SMD, standardised mean difference; STAR, Sjögren’s Tool for Assessing Response; th5, threshold 5; th6, threshold 6; V, version.

    During the follow-up meeting, the panellists agreed that the selection of the candidate STAR from the remaining 15 options should be based on clinical relevance. The rationale for selection was as follows. A decrease in clinESSDAI was preferred to a set score (<5 points) at the final evaluation, to avoid classifying patients with low baseline activity as responders in this domain when their score had not changed from baseline. ESSPRI was preferred to individual dryness scales because it is a validated score. The panellists selected the published minimal clinically important difference (MCID) as the response cut-off for clinESSDAI (≥3 points) and ESSPRI (≥1 point). The experts also rejected the ‘no worsening’ clause because there is no published consensual definition for worsening of these outcomes and the options with this clause did not show better discriminative capacity (table 3). Finally, since the other 19 options (online supplemental 11) had good psychometric properties, they will be evaluated as exploratory endpoints in the NECESSITY clinical trial (EudraCT no: 2019-002470-32; online supplemental 12).
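
    The two cut-offs quoted above can be illustrated with a short check of the major domains (systemic activity and patient symptoms), of which at least one must improve for a response. This is only a partial sketch with hypothetical function and key names: the full candidate STAR (table 3), including the glandular and biological domains and any scoring weights, is not reproduced in this text and is not implemented here.

```python
CLINESSDAI_MCID = 3  # decrease of >= 3 points, as quoted in the text
ESSPRI_MCID = 1      # decrease of >= 1 point, as quoted in the text


def major_domain_response(clinessdai_base, clinessdai_final,
                          esspri_base, esspri_final):
    """Check improvement in the two major domains, defined as a decrease
    from baseline of at least the MCID."""
    systemic = (clinessdai_base - clinessdai_final) >= CLINESSDAI_MCID
    symptoms = (esspri_base - esspri_final) >= ESSPRI_MCID
    return {
        "systemic": systemic,
        "symptoms": symptoms,
        "any_major": systemic or symptoms,
    }
```

    Because response is defined as a decrease from baseline, a patient with a low and unchanged clinESSDAI is not counted as a responder in the systemic domain, which is the situation the panellists wanted to avoid with a fixed final-score threshold.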

    Table 3

    Candidate STAR

    Discussion

    The NECESSITY consortium, supported by an international panel of pSS experts, scientists, methodologists and patients, developed a consensual single tool for pSS that globally assesses all disease features and is intended for use as an efficacy endpoint in RCTs: the composite responder index STAR. STAR fulfils the truth, discrimination and feasibility criteria recommended by OMERACT. The strength of our work relies on a rigorous process combining both consensus techniques based on the opinion of a large panel and data-driven methods generated from nine trials. In the analyses performed separately for trials considered negative and positive by the expert consensus, our study demonstrated that the candidate STAR is able to show treatment efficacy in positive trials and, unlike some alternative options, did not erroneously detect significant between-arm differences in trials considered negative (figure 3).

    Designing a primary endpoint in pSS is challenging due to the wide spectrum of disease features and the great heterogeneity and complexity of signs and symptoms. Major changes in RCT design recently led to the adoption of ESSDAI as the primary outcome and allowed, for the first time, demonstration of treatment efficacy (table 1). These trials also suggested that other outcomes might improve with treatment, such as ESSPRI, UWSF and biological components (IgG and RF levels). However, recent trials focused on patients with moderate to high systemic disease activity, excluding a large proportion of patients with no systemic complications but with high symptom burden. In pSS, low quality of life is mainly driven by PROs rather than systemic activity29; also, these two domains poorly correlate.11 30 31 STAR can evaluate treatment response in the full spectrum of patients with pSS, including those with low systemic activity but high burden of symptoms, for whom there remains an important unmet need. Indeed, to avoid the pitfalls of a data-driven process relying on a single trial, the development of the candidate STAR relied on nine trials, some of which included patients with low systemic disease activity, with various timepoints of evaluation (12–48 weeks, but 24 weeks in most cases). A recent important initiative from a group in the Netherlands, also a NECESSITY partner, proposed the CRESS based on reanalysis of the ASAP-III (Abatacept Sjögren Active Patients Phase III Study) trial.8 22 The CRESS includes the same five domains as STAR, confirming their clinical relevance in the global assessment of pSS. However, STAR has defined two major domains, systemic activity and patient symptoms, and the definition of response requires improvement of at least one. Thus, unlike CRESS, STAR requires improvement of PROs in patients with low systemic activity. Also, in negative trials, where no difference between arms is expected, the candidate STAR correctly did not detect any difference between arms, whereas other options such as the concise CRESS did (figure 3). STAR also includes improvement of glandular function using simple and validated measures, that is, Schirmer’s test, sicca ocular staining score (OSS)32 and UWSF, but also includes salivary gland ultrasound, leaving the door open to more sophisticated tests to evaluate these domains in the future. Lastly, although IgG and RF levels do not reflect patients’ perceived disease burden, the experts decided to include them because they considered that, whatever the mechanism of action of the drug, decreasing the levels of these biomarkers, which are signs of activity (IgG) or predictive markers of lymphoma (RF), is a therapeutic goal.33 34

    Nevertheless, our study has some limitations. The main issue is circular reasoning, since pSS experts may be tempted to define a patient as a responder or a non-responder, or a trial as positive or negative, based on pre-existing indexes. This may give high weight to previous indexes, leaving little room for very innovative items, which by definition were not included in previous RCTs and cannot be evaluated at this stage. However, these definitions relied on a high level of consensus (online supplemental 10) after evaluation of multiple independent RCTs. Finally, in most of the trials, OSS, ultrasound data and RF levels were not available and thus the impact of these outcomes on STAR response cannot be evaluated at this stage.

    The NECESSITY PAG strongly supports the STAR outcome (see letter of support in online supplemental 13). Recommendations from the European Medicines Agency (EMA) were sought through a scientific advice procedure, and the EMA has offered to publish on their website a letter of support for STAR (https://www.ema.europa.eu/en/documents/other/letter-support-sjogrens-tool-assessing-response-star_en.pdf). Also, additional steps are being worked on in collaboration with OMERACT to fulfil all requirements and for STAR to be formally endorsed.

    Even though this process relied on a nearly unequalled number of experts and RCTs, STAR, beyond the present retrospective validation, has to be prospectively validated in an independent population in the NECESSITY RCT (online supplemental 12). The strength of this validation step is its evaluation of the psychometric properties of STAR, in particular its discriminant capacity in an interventional study where active and placebo arms will be compared. Also, patients will be stratified according to systemic activity, allowing the evaluation of the properties of STAR in any patient with pSS with either high systemic activity or high level of symptoms. We strongly encourage the use of the candidate STAR to evaluate its properties in diverse patient populations with treatments of various mechanisms of action to definitively validate STAR as a gold standard outcome measure for RCTs in pSS.

    Ethics statements

    Patient consent for publication

    Ethics approval

    This study involves human participants. The data sets analysed in this study are not publicly available, except for the baminercept trial. Data came from nine completed clinical trials that have all been conducted in accordance with Good Clinical Practice and the Declaration of Helsinki and have obtained adequate ethical approval. Participants gave informed consent to participate in the study before taking part.

    Acknowledgments

    We thank the NECESSITY consortium participants as well as the following experts: Esen Karamursel Akpek, Alan Baer, Chiara Baldini, Elena Bartoloni, Marí-Alfonso Begona, Johan Brun, Vatinee Bunya, Laurent Chiche, Troy Daniels, Paul Emery, Robert Fox, Roberto Giacomelli, John Gonzales, John Greenspan, Robert Moots, Susumu Nishiyama, Elizabeth Price, Christophe Richez, Caroline Shiboski, Roser Solans Laque, Muthiah Srinivasan, Peter Olsson, Tsutomu Takeuchi, Frederick Vivino, Paraskevi Voulgari, Daniel Wallace, Ava Wu and Wen Zhang. We thank the anonymous patients from the NECESSITY Patient Advisory Group and the Sjögren's Foundation for their valuable contribution to the Delphi process. We thank Dr EW St Clair and Dr AN Baer, who made publicly available the data of the baminercept trial.

    Footnotes

    • Handling editor Josef S Smolen

    • XM and RP contributed equally.

    • Collaborators NECESSITY WP5 - STAR development working group: Suzanne Arends (Department of Rheumatology and Clinical Immunology, University Medical Center Groningen, Groningen, The Netherlands), Francesca Barone (Centre for Translational Inflammation Research, Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK), Albin Björk (Division of Rheumatology, Department of Medicine, Karolinska Institutet, Stockholm, Sweden), Coralie Bouillot (Association Française du Gougerot Sjögren et des Syndromes Secs, France), Guillermo Carvajal Alegria (University of Brest, Inserm, CHU de Brest, LBAI, UMR1227, Brest, France; Service de Rhumatologie, Centre de Référence Maladies Autoimmunes Rares CERAINO, CHU Cavale Blanche, Brest, France), Wen-Hung Chen (GlaxoSmithKline, Research Triangle Park, North Carolina, USA), Kenneth Clark (GlaxoSmithKline Medicines Research Centre, Stevenage, Hertfordshire, UK), Konstantina Delli (Department of Oral and Maxillofacial Surgery, University Medical Center Groningen (UMCG), University of Groningen, The Netherlands), Salvatore de Vita (Rheumatology Clinic, University Hospital of Udine, Italy), Liseth de Wolff (Department of Rheumatology and Clinical Immunology, University Medical Center Groningen, Groningen, The Netherlands), Jennifer Evans (Novartis Pharmaceuticals Corporation USA), Stéphanie Galtier (Institut de Recherches Internationales Servier (IRIS), Suresnes Cedex, France), Saviana Gandolfo (Rheumatology Clinic, Department of Medical Area, University of Udine, ASUFC, Udine, Italy), Mickael Guedj (Institut de Recherches Internationales Servier (IRIS), Suresnes Cedex, France), Dewi Guellec (CHU de Brest, Service de Rhumatologie, Inserm, CIC 1412, Brest, France), Safae Hamkour (Center of Translational Immunology, Department of Immunology, University Medical Center Utrecht, Utrecht, The Netherlands), Dominik Hartl (Novartis Institutes for BioMedical Research, Basel, Switzerland), Malin V Jonsson (Section for Oral and Maxillofacial Radiology, Department of Clinical Dentistry, Faculty of Medicine, University of Bergen, Norway), Roland Jonsson (Broegelmann Research Laboratory, Department of Clinical Science, University of Bergen; Department of Rheumatology, Haukeland University Hospital, Bergen, Norway), Frans GM Kroese (Department of Rheumatology and Clinical Immunology, University Medical Center Groningen, Groningen, The Netherlands), Aike Albert Kruize (Department of Rheumatology and Clinical Immunology, University Medical Center Utrecht, Utrecht, The Netherlands), Laurence Laigle (Translational Medicines, Institut de Recherches Internationales Servier (IRIS), Suresnes Cedex, France), Véronique Le Guern (AP-HP, Hôpital Cochin, Centre de référence maladies auto-immunes et systémiques rares, service de médecine interne, Paris, France), Wen-Lin Luo (Department of Biometrics and Statistical Science, Novartis Pharmaceuticals, East Hanover, New Jersey), Esther Mossel (Department of Rheumatology and Clinical Immunology, University Medical Center Groningen, Groningen, The Netherlands), Wan-Fai Ng (Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, UK; NIHR Newcastle Biomedical Research Centre and NIHR Clinical Research Facility, Newcastle upon Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne), Gaëtane Nocturne (Department of Rheumatology, Université Paris-Saclay, INSERM U1184: Centre for Immunology of Viral Infections and Autoimmune Diseases, Assistance Publique - Hôpitaux de Paris, Hôpital Bicêtre, Le Kremlin-Bicêtre, Paris, France), Marleen Nys (Global Biometric Sciences, Bristol Myers Squibb, Braine L’Alleud, Belgium), Roald Omdal (Clinical Immunology Unit, Department of Internal Medicine, Stavanger University Hospital, Stavanger, Norway), Jacques-Olivier Pers (LBAI, UMR1227, University of Brest, Inserm, Brest, France; CHU de Brest, Brest, France), Maggy Pincemin (Association Française du Gougerot Sjögren et des Syndromes Secs, France), Manel Ramos-Casals (Department of Autoimmune Diseases, Hospital Clinic de Barcelona Institut Clinic de Medicina Dermatologia, Barcelona, Catalunya, Spain), Philippe Ravaud (Centre d’Epidémiologie Clinique, Hôpital Hôtel-Dieu, Assistance Publique - Hôpitaux de Paris, Paris, France), Neelanjana Ray (Global Drug Development - Immunology, Bristol Myers Squibb, Princeton, New Jersey, USA), Alain Saraux (CHU de Brest, Service de Rhumatologie, Université de Brest, Inserm, UMR1227; Lymphocytes B et Autoimmunité, Université de Brest, Inserm, LabEx IGO, Brest, France), Athanasios Tzioufas (Rheumatology Clinic, Department of Medical Area, University of Udine, ASUFC, Udine, Italy), Gwenny Verstappen (Department of Rheumatology and Clinical Immunology, University Medical Center Groningen, Groningen, The Netherlands), Arjan Vissink (Department of Oral and Maxillofacial Surgery, University Medical Center Groningen, Groningen, The Netherlands), Marie Wahren-Herlenius (Division of Rheumatology, Department of Medicine, Karolinska Institutet, Stockholm, Sweden).

    • Contributors RS, XM and RP had full access to all data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Conception and design of the study: RS, DC, SJB, MB, HB, J-EG, BF, WH, JAvR, VD-P, PG, XM, RP. Analysis of data: RS, GB, EP, RP. Interpretation of data: RS, DC, SJB, MB, HB, J-EG, BF, WH, JAvR, VD-P, PG, XM, RP. Drafting of the manuscript: RS, MC, XM, RP. Critical revision of the manuscript: RS, MC, XM, RP, DC, SJB, MB, HB, J-EG, BF, WH, JAvR, VD-P, PG. Contribution of data: SJB, HB, J-EG, WH, JAvR, VD-P, PG, XM, MB. All authors approved the manuscript’s content before submission. RS is the author acting as guarantor.

    • Funding This project has received funding from the Innovative Medicines Initiative 2 Joint Undertaking (JU) under grant agreement number 806975. JU receives support from the European Union’s Horizon 2020 research and innovation programme and EFPIA. The present article reflects only the authors’ view and JU is not responsible for any use that may be made of the information it contains.

    • Disclaimer The views expressed in this publication are those of the authors and not necessarily those of the NHS, the National Institute for Health Research or the Department of Health.

    • Competing interests RS has received consulting fees from GlaxoSmithKline, Boehringer, Janssen and Novartis, participated in an advisory board for Janssen, and received support for attending meetings from GlaxoSmithKline and Amgen. DC has received consulting fees from GlaxoSmithKline, Bristol Myers Squibb, Janssen, Amgen, Pfizer and Roche. J-EG has received honoraria from AbbVie, Bristol Myers Squibb, Eli Lilly, Galapagos, Gilead, Pfizer, Roche, Sanofi, Novartis, MSD, CSL Behring and Genzyme and received a grant from Bristol Myers Squibb. SJB and BF receive funding from the National Institute for Health Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK. SJB has provided consultancy services in the field of clinical trial design for Sjögren’s syndrome for AbbVie, AstraZeneca, Galapagos and Novartis Pharmaceuticals in 2018–2021. BF has received grants from Servier, Galapagos and Janssen, provided consultancy services for Novartis, Bristol Myers Squibb, Janssen and Servier, and received honoraria from Bristol Myers Squibb and Novartis. MB has received grants from Amgen/MedImmune, Janssen and GlaxoSmithKline, and personal fees from UCB, Amgen/MedImmune, Janssen and GlaxoSmithKline. HB has received grants from Bristol Myers Squibb and Roche, and consulting fees from Bristol Myers Squibb, Roche, Novartis, MedImmune and Union Chimique Belge. WH and PG are employees of Novartis Pharma and recipients of Novartis stocks. XM has received a grant from Ose Pharmaceuticals and consultancy fees from Bristol Myers Squibb, Galapagos, GlaxoSmithKline, Janssen, Novartis, Pfizer and UCB. GB, MC, EP, JAvR, VD-P and RP declare no competing interests.

    • Patient and public involvement Patients and/or the public were involved in the design, or conduct, or reporting, or dissemination plans of this research. Refer to the Methods section for further details.

    • Provenance and peer review Not commissioned; externally peer reviewed.

    • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.