
Original article
Evaluation of the efficacy and safety of sarilumab combination therapy in patients with rheumatoid arthritis with inadequate response to conventional disease-modifying antirheumatic drugs or tumour necrosis factor α inhibitors: systematic literature review and network meta-analyses
  1. Ernest Choy1,
  2. Nick Freemantle2,
  3. Clare Proudfoot3,
  4. Chieh-I Chen4,
  5. Laurence Pollissard5,
  6. Andreas Kuznik4,
  7. Hubert Van Hoogstraten6,
  8. Erin Mangan7,
  9. Paulo Carita5 and
  10. Thi-Minh-Thao Huynh8
  1. 1 Division of Infection and Immunity, Cardiff University, Cardiff, UK
  2. 2 Institute for Clinical Trials and Methodology, University College London, London, UK
  3. 3 Formerly, Health Economics and Outcomes Research, Sanofi, Guildford, UK
  4. 4 Health Economics and Outcomes Research, Regeneron Pharmaceuticals, Inc, New York City, New York, USA
  5. 5 Global Health Economics & Value Assessment, Sanofi France, Chilly-Mazarin, France
  6. 6 Global Medical Affairs, I&I, Sanofi, Bridgewater, New Jersey, USA
  7. 7 Medical Affairs, Regeneron Pharmaceuticals, Inc, New York City, New York, USA
  8. 8 Real World Evidence & Clinical Outcome Generation, Sanofi France, Chilly-Mazarin, France
  1. Correspondence to Dr Thi-Minh-Thao Huynh; thi-minh-thao.huynh{at}


Objective To compare efficacy and safety of subcutaneous sarilumab 200 mg and 150 mg every 2 weeks plus conventional synthetic disease-modifying antirheumatic drugs (+csDMARDs) versus other targeted DMARDs+csDMARDs and placebo+csDMARDs, in inadequate responders to csDMARDs (csDMARD-IR) or tumour necrosis factor α inhibitors (TNFi-IR).

Methods Systematic literature review and network meta-analyses (NMA) conducted on 24 week efficacy and safety outcomes: Health Assessment Questionnaire Disability Index, modified total sharp score (mTSS, including 52 weeks), American College of Rheumatology (ACR) 20/50/70, European League Against Rheumatism Disease Activity Score 28-joint count erythrocyte sedimentation rate (DAS28)<2.6; serious infections/serious adverse events (including 52 weeks).

Results 53 trials were selected for NMA. csDMARD-IR: Sarilumab 200 mg+csDMARDs and 150 mg+csDMARDs were superior versus placebo+csDMARDs on all outcomes. Against most targeted DMARDs, sarilumab 200 mg showed no statistically significant differences, except superiority to baricitinib 2 mg, tofacitinib and certolizumab on 24 week mTSS. Sarilumab 150 mg was similar to all targeted DMARDs. TNFi-IR: Sarilumab 200 mg was similar to abatacept, golimumab, tocilizumab 4 mg and 8 mg/kg intravenously and rituximab on ACR20/50/70, superior to baricitinib 2 mg on ACR50 and DAS28<2.6 and to abatacept, golimumab, tocilizumab 4 mg/kg intravenously and rituximab on DAS28<2.6. Sarilumab 150 mg was similar to targeted DMARDs but superior to baricitinib 2 mg and rituximab on DAS28<2.6 and inferior to tocilizumab 8 mg on ACR20 and DAS28<2.6. Serious adverse events, including serious infections, appeared similar for sarilumab versus comparators.

Conclusions Results suggest that in csDMARD-IR and TNFi-IR (a smaller network), sarilumab+csDMARD had superior efficacy and similar safety versus placebo+csDMARDs and at least similar efficacy and safety versus other targeted DMARDs+csDMARDs.

  • sarilumab
  • biologic disease-modifying antirheumatic drugs
  • rheumatoid arthritis
  • network meta-analysis

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


Key messages

What is already known about this subject?

  • The addition of bDMARDs to csDMARDs is recommended in guidelines from ACR and EULAR for achieving remission or reducing disease activity in patients with RA who have an inadequate response to csDMARDs alone.

  • Given the variety of treatments currently available for RA, a comprehensive evaluation of the comparative effectiveness and safety of sarilumab against other DMARDs is necessary to inform treatment decisions and health technology assessments, as well as to guide evidence-based medicine.

What does this study add?

  • In the absence of head-to-head trials, network meta-analysis can provide estimates of comparative effectiveness via the combined evaluation of direct and indirect trial evidence.

  • For inadequate responders to csDMARDs or tumour necrosis factor inhibitors, sarilumab 150 mg and 200 mg subcutaneous every 2 weeks plus csDMARDs had superior efficacy and similar safety versus continued use of csDMARDs alone. Sarilumab 150 mg and 200 mg had at least similar efficacy versus all other comparable doses of targeted DMARDs added to csDMARDs.

How might this impact on clinical practice?

  • Physicians may use these results from this clinical study to inform treatment decisions for patients with RA.


The addition of biological disease-modifying antirheumatic drugs (bDMARDs) (including tumour necrosis factor-α inhibitors (TNFi), T cell costimulatory inhibitors, anti-B cell agents and anti-interleukin-6 receptor (anti-IL-6R) monoclonal antibodies) or targeted synthetic DMARDs (tsDMARDs) to conventional synthetic DMARDs (csDMARDs) is recommended in guidelines issued by both The American College of Rheumatology (ACR)1 and the European League Against Rheumatism (EULAR)2 for achieving remission or reducing disease activity in patients with rheumatoid arthritis (RA) who have inadequate response (IR) to csDMARDs alone. Sarilumab is a human immunoglobulin (Ig) G1 anti-IL-6Rα monoclonal antibody for the treatment of RA as monotherapy or combination therapy with csDMARDs.3–6 Given the variety of treatments currently available for RA, a comprehensive evaluation of the comparative effectiveness and safety of sarilumab against other DMARDs is necessary to inform treatment decisions and health technology assessments, as well as to guide evidence-based medicine.7

Active comparator randomised controlled trials (RCTs) are the gold standard methodological approach for comparative efficacy.8 However, research is mainly characterised by placebo-controlled studies, while head-to-head trials are not readily available. In the absence of head-to-head studies, network meta-analysis (NMA) can provide estimates of comparative effectiveness via the combined evaluation of direct and indirect trial evidence;9–11 treatments can then be compared with each other via common comparators.
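To illustrate how two treatments can be compared through a common comparator, the sketch below applies a simple adjusted indirect comparison on the log OR scale. It is a simplified frequentist analogue, not the Bayesian NMA used in this study; the function name and the input values are hypothetical.

```python
import math

def indirect_comparison(log_or_ac, se_ac, log_or_bc, se_bc, z=1.96):
    """Indirect estimate of A vs B through a common comparator C.

    The log ORs are subtracted and their variances are added, because the
    two estimates come from independent sets of trials.
    """
    log_or_ab = log_or_ac - log_or_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    ci = (math.exp(log_or_ab - z * se_ab), math.exp(log_or_ab + z * se_ab))
    return math.exp(log_or_ab), ci

# Hypothetical inputs: A vs placebo OR = 3.0, B vs placebo OR = 2.0
or_ab, ci_ab = indirect_comparison(math.log(3.0), 0.20, math.log(2.0), 0.25)
```

The indirect interval is always wider than either direct interval, which is one reason why sparse networks yield imprecise comparisons.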

This NMA was conducted to evaluate the comparative efficacy and safety of subcutaneous (SC) sarilumab at doses of 150 mg and 200 mg, administered every 2 weeks (q2w) and added to csDMARDs. Sarilumab was evaluated versus other licensed treatments for RA, including csDMARDs, bDMARDs and tsDMARDs, at recommended doses for the treatment of RA, in two groups of patients: csDMARD-IR and TNFi-IR. The csDMARD-IR population was studied separately for combination therapy and monotherapy. The focus of the current NMA is on patients receiving an addition of a bDMARD or tsDMARD to their existing csDMARD treatment regimen.


A systematic literature review (SLR) and NMA were conducted following methods in line with PRISMA guidelines12 and recommended in the current National Institute for Health and Care Excellence (NICE) specification for manufacturer and sponsor submission of evidence13 and the 2016 NICE technology appraisal of adalimumab, etanercept, infliximab, certolizumab pegol, golimumab, tocilizumab and abatacept for RA.14

Study selection

Searches for the SLR were conducted in MEDLINE, EMBASE, Cochrane databases (all with no backwards time limit) and conference proceedings (since 2013), on evidence published until 6 December 2016, and studies were selected according to predefined PICOS (population/intervention/comparator/outcome/study design) criteria12 13 15 16 (table 1). All titles, abstracts and articles were screened independently by two researchers, with study selection following published best practice guidelines for NMA.13 15 16 Data on study design, patient characteristics, efficacy, safety and patient-reported outcomes (PROs) at the time points 12 (±4), 24 (±4) and 52 (±8) weeks for all studies (except open-label extensions) were extracted independently by two reviewers in a predefined data extraction process.

Table 1

Population/intervention/comparator/outcome/study design and search criteria for the systematic review

Evidence for the NMA was filtered for drugs licensed for RA at doses approved in Europe, the USA and Canada. In addition, the investigational drug baricitinib 2 mg daily (qd) and 4 mg qd combined with methotrexate/csDMARD were included as this agent was at advanced regulatory stages at the time of analyses. Rituximab (currently only licensed for the TNFi-IR population) was included for the csDMARD-IR population in the interest of providing a bridge for relevant comparators, while anakinra was excluded due to its uncommon use, in addition to its reported limited effectiveness relative to other biologics.1

All trials comparing one intervention of interest with at least one other intervention of interest or methotrexate or ≥1 csDMARD(s) were considered in the evidence base. Small studies (fewer than 30 patients per arm) were excluded from the evidence base on the basis that small studies have been shown to distort meta-analyses.17 Studies that did not report any outcomes of interest were also excluded.

Treatment categorisation

Treatment categorisation was based on grouping all the available treatments for inclusion in the networks (table 2). Methotrexate and csDMARD used as background therapies were considered similar and grouped, while randomised treatment groups with one csDMARD+methotrexate were separated from those including two csDMARDs+methotrexate. Different licensed dosages and different routes of administration (eg, intravenous (IV) vs SC delivery) of the same treatment were pooled in many cases, on the basis of evidence of equivalence (table 2). These decisions were explored by examining forest plots of the OR for ACR20 at 24 weeks in individual studies by group of interventions. If the confidence intervals were overlapping (eg, for infliximab studies), the doses were pooled. The validity of the decisions was also confirmed via clinician input.
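The overlap check described above can be sketched as follows. This is an illustrative example only: it assumes a Wald-type 95% confidence interval for the OR, and the responder counts and function names are hypothetical rather than taken from the included studies.

```python
import math

def odds_ratio_ci(resp_trt, n_trt, resp_ctl, n_ctl, z=1.96):
    """OR and Wald confidence interval from responder counts in two arms."""
    a, b = resp_trt, n_trt - resp_trt
    c, d = resp_ctl, n_ctl - resp_ctl
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), (math.exp(log_or - z * se), math.exp(log_or + z * se))

def intervals_overlap(ci_1, ci_2):
    """True when two (lower, upper) intervals share at least one point."""
    return ci_1[0] <= ci_2[1] and ci_2[0] <= ci_1[1]

# Hypothetical ACR20 counts for two dosing regimens of the same drug
or_1, ci_1 = odds_ratio_ci(60, 100, 30, 100)
or_2, ci_2 = odds_ratio_ci(55, 100, 28, 100)
pool_doses = intervals_overlap(ci_1, ci_2)
```

Overlapping intervals are a pragmatic screen rather than a formal test of equivalence, which is consistent with the additional reliance on clinician input.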

Table 2

Key features of patient demographics and baseline data for selected studies

Outcomes examined for the NMA included: ACR 20%, 50% and 70% (ACR20/50/70) response criteria, EULAR Disease Activity Score 28-joint count (DAS28) remission (defined as DAS28 erythrocyte sedimentation rate (ESR) or C reactive protein (CRP) <2.6), Health Assessment Questionnaire Disability Index (HAQ-DI) change from baseline (CFB), modified total sharp score (mTSS) CFB, incidence of serious infections (SIs) and serious adverse events (SAEs). However, as different studies reported different scores for radiographic progression, for example, van der Heijde mTSS or Genant total sharp score, only the studies reporting van der Heijde mTSS were considered for this endpoint; the other scoring systems were deemed to be incomparable.18

All efficacy outcomes were examined at 24 weeks; mTSS was also evaluated at week 52 in addition to week 24; SI and SAE in the csDMARD-IR and TNFi-IR populations were evaluated at week 24 and week 52, respectively.

Network meta-analysis

NMA feasibility assessment

The sufficiency of the evidence base to draw feasible networks was assessed for all outcomes of interest. The exchangeability assumption is critical and requires that selected trials measure the same underlying relative treatment effects. Deviations from this assumption can be evaluated through two metrics: (1) heterogeneity (ie, evaluation of comparability in characteristics and results across included studies) and (2) consistency (ie, evaluation of consistency between direct and indirect evidence).

A high level of variability in placebo response was observed across both the csDMARD-IR and TNFi-IR networks. Such heterogeneity of response in the placebo arms of the studies (ie, placebo+csDMARDs in combination studies) has previously been noted in other RA clinical studies and by NICE.19 Therefore, to account for the variation in the placebo responses across studies, alternative analytic methods were applied in the present NMA.

For the larger csDMARD-IR combination network, NMA with regression on baseline risk (BR-NMA) was used to adjust for variability in placebo responder rates. The BR-NMA model is similar to the conventional NMA method with the addition of an adjustment for the baseline odds and better adjusts for potential bias introduced by variability in the placebo responder rates across the different studies. This approach is recommended by NICE Decision Support Unit (DSU) guidelines.20 However, as only binary outcomes have sufficient data to facilitate the BR-NMA, NMA with regression on baseline risk for placebo response was conducted on binary outcomes (ACR20/50/70 and DAS28 remission) as the base case model for the csDMARD-IR population.
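For orientation, one common formulation of a baseline-risk-adjusted random-effects model for binary outcomes is shown below. The exact specification used in this analysis is not given in the text, so the notation (a single shared regression coefficient β, centring on the mean baseline log odds) is an assumption.

```latex
\operatorname{logit}(p_{ik}) = \mu_i + \left[\delta_{ik} + \beta\,(\mu_i - \bar{\mu})\right] I(k \neq 1),
\qquad
\delta_{ik} \sim N\!\left(d_{t_{ik}} - d_{t_{i1}},\ \sigma^2\right)
```

Here p_ik is the response probability in arm k of study i, μ_i is the log odds of response in the baseline (placebo+csDMARDs) arm, δ_ik is the study-specific log OR versus that arm, d_t is the pooled effect of treatment t relative to the reference and σ is the between-study SD. Setting β to zero recovers the conventional NMA model.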

For any regression, a relatively high number of studies per covariate is necessary; otherwise, the model is unlikely to converge and the estimates are less precise, resulting in wide credible intervals around the point estimates. In previous NMAs, prior to the publication of NICE guidance to address the problem of high variation of study effects, a conventional OR approach was applied, which gave inconsistent results (eg, relative effects may have been overestimated for treatments whose studies had a low study effect, and vice versa).19 Therefore, for the smaller TNFi-IR network, an alternative method of NMA based on risk differences (RD-NMA) was adopted,13 21 whereby a risk difference scale is used in place of a log OR scale; responder levels are treated as continuous outcomes following a normal distribution. This approach was based on Spiegelhalter and colleagues21 and practical guidance in the NICE DSU Guidance on Network Meta-Analysis.20
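As a minimal sketch of the risk difference scale, the example below computes a risk difference and its standard error from arm-level responder counts under a normal approximation. The counts and the function name are hypothetical, and this covers only the input to an RD-NMA, not the network model itself.

```python
import math

def risk_difference(resp_trt, n_trt, resp_ctl, n_ctl):
    """Risk difference and its standard error under a normal approximation."""
    p_trt = resp_trt / n_trt
    p_ctl = resp_ctl / n_ctl
    rd = p_trt - p_ctl
    se = math.sqrt(p_trt * (1 - p_trt) / n_trt + p_ctl * (1 - p_ctl) / n_ctl)
    return rd, se

# Hypothetical ACR20 counts: 60/100 responders on active, 30/100 on placebo
rd, se = risk_difference(60, 100, 30, 100)
ci = (rd - 1.96 * se, rd + 1.96 * se)
```

Unlike the OR, the risk difference remains defined when one arm has zero events, which helps with rare outcomes such as SI.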

For safety outcomes, a conventional OR model was used for SAE in the csDMARD combination population, and for SI and SAE in the TNFi-IR population. RD-NMA was applied for SI in the csDMARD-IR population due to convergence issues in the OR model.

Bayesian NMA

The selected outcomes, that is, relative efficacy and safety of the treatments of interest, were evaluated using a Bayesian NMA approach,16 22 23 which involves a likelihood distribution, a model with parameters and prior distributions for these parameters. In this analysis, a linear model with normal likelihood distribution was used for continuous outcomes, and a binomial likelihood with a log link for the dichotomous outcomes.20 21 Flat (non-informative) prior distributions were assumed for nearly all outcomes so as not to influence the observed results by the prior distribution; this approach was consistent with NICE guidelines.20 Prior distributions of the baseline treatments and relative treatment effects were normal, with zero mean and variance of 10 000, while a uniform distribution with range zero to five was used as the prior of the between-study SD.

For most outcomes, random-effects and fixed-effects models were evaluated to allow for heterogeneity of treatment effects between studies. Random-effects models were applied where sufficient data were available; where the number of studies was smaller (eg, most outcomes in the TNFi-IR population), it was necessary to use the fixed-effects model, as random-effects models would provide unrealistically wide credible intervals for such limited datasets. Where both random-effects and fixed-effects models were run, the choice of base case was informed by Deviance Information Criterion (DIC) values.21 Total residual deviance (compared against the number of fitted data points) was also considered in model selection, indicating the adequacy of the model to the data. In addition, the consistency of modelled data with directly reported trial results was also taken into consideration in selecting the preferred model.
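The DIC comparison can be sketched as below, using the standard definition of DIC as the posterior mean deviance plus the effective number of parameters. The deviance values and the 3-point threshold are illustrative assumptions, not figures from this analysis.

```python
def dic(mean_deviance, deviance_at_posterior_mean):
    """Return (DIC, pD), where pD is the effective number of parameters."""
    p_d = mean_deviance - deviance_at_posterior_mean
    return mean_deviance + p_d, p_d

def preferred_model(dic_fixed, dic_random, threshold=3.0):
    """Keep the simpler fixed-effects model unless random effects lower the
    DIC by more than `threshold` (an assumed rule of thumb)."""
    return "random" if dic_fixed - dic_random > threshold else "fixed"

# Hypothetical deviance summaries for the same network
dic_fixed, pd_fixed = dic(110.0, 95.0)
dic_random, pd_random = dic(96.0, 76.0)
choice = preferred_model(dic_fixed, dic_random)
```

Comparing the total residual deviance with the number of fitted data points gives a complementary check of absolute fit that DIC alone does not provide.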

Posterior densities for unknown parameters were estimated using Markov chain Monte Carlo simulations. All results for conventional OR and RD-NMA were based on 100 000 iterations on three chains, with a burn-in of 20 000 iterations. All results for BR-NMA models were based on 70 000 iterations on three chains, with a burn-in of 15 000 iterations. Convergence was assessed by visual inspection of trace plots. The accuracy of the posterior estimates was assessed using the Monte Carlo error for each parameter (Monte Carlo error <1% of the posterior SD). All models were implemented using WinBUGS.14
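One way to check the Monte Carlo error criterion is sketched below, using a batch-means estimate. WinBUGS reports its own Monte Carlo error; the batch-means estimator, the number of batches and the simulated draws here are assumptions made for illustration.

```python
import math
import random
import statistics

def monte_carlo_error(draws, n_batches=50):
    """Batch-means estimate of the Monte Carlo standard error of the mean."""
    size = len(draws) // n_batches
    batch_means = [
        statistics.fmean(draws[i * size:(i + 1) * size]) for i in range(n_batches)
    ]
    return statistics.stdev(batch_means) / math.sqrt(n_batches)

random.seed(1)
# Stand-in for post-burn-in MCMC output of a single parameter
draws = [random.gauss(0.5, 0.2) for _ in range(40000)]
ratio = monte_carlo_error(draws) / statistics.stdev(draws)
meets_criterion = ratio < 0.01
```

Real MCMC output is autocorrelated, so the ratio for a given number of iterations is usually larger than for the independent draws simulated here.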

Bayesian NMA provided posterior distributions of the relative treatment effects between interventions and the probability that one treatment is better than another for each outcome of interest. The results of the NMA are presented in terms of ‘point estimates’ (median of posterior) for the relative treatment effects, along with the 95% credible intervals.
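The summaries described above can be derived from posterior draws as in the following sketch. The draws are simulated and the function names are hypothetical; in practice the draws would come from the fitted model.

```python
import random
import statistics

def summarise(draws):
    """Posterior median and equal-tailed 95% credible interval."""
    ordered = sorted(draws)
    n = len(ordered)
    lower = ordered[int(0.025 * n)]
    upper = ordered[int(0.975 * n) - 1]
    return statistics.median(ordered), (lower, upper)

def prob_better(draws_a, draws_b):
    """Share of paired draws in which treatment A has the larger effect."""
    return sum(a > b for a, b in zip(draws_a, draws_b)) / len(draws_a)

random.seed(0)
# Hypothetical log OR draws for two treatments versus a common comparator
d_a = [random.gauss(1.0, 0.3) for _ in range(10000)]
d_b = [random.gauss(0.8, 0.3) for _ in range(10000)]
median_a, cri_a = summarise(d_a)
p_a_better = prob_better(d_a, d_b)
```

A credible interval for the difference between two treatments that includes zero on the log scale corresponds to the 'similar efficacy' wording used in the results.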

Scenario analyses

A series of scenario analyses were conducted whereby outlier studies excluded (or included) in the base case were included (or excluded) in separate scenarios (online supplementary table 3). In csDMARD-IR, a scenario was tested to address the potential modifying effect of patient weight. Weight was selected as a potential modifier by first establishing the link via a scatter plot and trend line, and then evaluating the regression and the coefficient of determination (R2) between patient characteristics at baseline and ACR20. This process identified weight as a potential effect modifier. However, meta-regression using average weight of the study as a variable was not possible due to the level of missing data for weight across the studies. Instead, those studies conducted in exclusively Asian populations were excluded in a scenario analysis. The basis of this exclusion was that Asian ethnicity would serve as a proxy for populations with relatively lower weight than other populations.
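The screening step for potential effect modifiers can be illustrated with a simple study-level regression. The values below are invented for illustration and the function name is hypothetical; they do not reproduce the data used in this analysis.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Hypothetical study-level values: mean baseline weight (kg) and ACR20 response rate
mean_weight = [58.0, 62.0, 70.0, 75.0, 80.0, 84.0]
acr20_rate = [0.68, 0.66, 0.60, 0.61, 0.55, 0.54]
r2 = r_squared(mean_weight, acr20_rate)
```

A high R2 at study level only flags a candidate modifier; it cannot separate the effect of weight from other characteristics that differ between studies.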


In a separate scenario, the ATTRACT and SWEFOT studies were included and mTSS at 52 weeks was examined; ATTRACT (and the connected SWEFOT study of infliximab versus triple-combination therapy with two csDMARDs and methotrexate) was initially excluded in the base case due to a high mTSS at baseline. In an additional scenario in csDMARD-IR, TNFi were pooled together as a class; ACR outcomes were compared with the base case, which evaluated the TNFi individually. This scenario was evaluated to inform cost-effectiveness evaluations of sarilumab.

Finally, a scenario analysis in the TNFi-IR population considered exclusion of the GO-AFTER study, which evaluated a mix of monotherapy and combination therapy.


Literature search and selection

The literature search identified a total of 15 698 citations (figure 1) relevant to DMARD combination treatments and monotherapies for RA. Three hundred and nine citations that met the screening criteria, reporting results of 108 trials, were retrieved. Of these, 87 RCTs were included in the SLR, but 32 were excluded because of a sample size of n<30, not reporting outcomes of interest, invalid study design or not being linked in the network (including RACAT and Machado 2014, which reported data on the outcomes of interest but could not be linked in the analysis networks; these were subsequently pooled with other TNFi studies in scenario analysis). RACAT was also excluded because its control arm is not a single csDMARD but a combination of sulfasalazine and hydroxychloroquine. There were no equivalent controls from other RCTs.

Figure 1

Systematic review and network meta-analyses study selection flow chart. *45 studies reporting outcomes at week 24 and one study reporting outcome at week 52. csDMARD, conventional synthetic disease-modifying antirheumatic drugs; IR, inadequate responders; NMA, network meta-analyses; TNFi, tumour necrosis factor α inhibitors.

A total of 46 RCTs (45 studies at week 24 and one study at week 52) were included for the csDMARD-IR population and nine RCTs were included for the TNFi-IR population for the present NMA in combination treatments (figure 1). These included the three sarilumab+csDMARD combination treatment RCTs: MOBILITY-A, MOBILITY-B and TARGET.

NMA evidence base

Although sarilumab has been evaluated in phase III studies across both csDMARD-IR and TNFi-IR patient populations, availability of data for the other comparators varied across the two populations; most data were in csDMARD-IR patients, with fewer RCTs in the TNFi-IR setting, limiting the ability to accurately evaluate the comparative efficacy of combination therapies in TNFi-IR. For both patient populations, the networks for EULAR response were small and a high level of variability was observed in response rates between different studies; these results are therefore not reported here. The networks for ACR response, with ACR20 in particular, were the most robust for both populations (figure 2), with most interventions included in multiple trials. Consistent with previously published studies, high variation in the placebo response rates was observed across studies.14 24

Figure 2

Evidence base networks for American College of Rheumatology 20 outcomes at 24 weeks. Comi, combination; csDMARD, conventional synthetic disease-modifying antirheumatic drugs; IR, inadequate responders; MTX, methotrexate; TNFi, tumour necrosis factor α inhibitors.

Key features of patient demographics and baseline data from the selected studies are provided in table 2.

csDMARD-IR studies

Among 46 trials included in csDMARD combination population (online supplementary table 1), 29 were phase III trials, seven were phase II trials, two were phase II/III trials and eight did not mention trial phase. Study durations varied from 24 up to 52 weeks with several studies allowing for open-label extensions. In 33 studies, patients had to have been on stable methotrexate for at least 12 weeks prior to entering the study, in four studies, this criterion was not required and in the rest of the studies, no information was reported. Sample sizes varied from less than 40 patients to more than 400 patients per randomised group. Rescue medication was permitted in 25 of the trials, not permitted in two trials and not reported in the remainder of the trials.

TNFi-IR studies

The TNFi-IR studies included seven phase III trials and one trial that did not mention the RCT phase (online supplementary table 2). Study duration varied from 24 up to 104 weeks. Sample sizes varied from 42 patients per arm to more than 200 patients per arm. Rescue medication was allowed in five of the trials and not reported in the remainder of the trials. Overall, eight studies reported ACR20/50/70 and HAQ-DI at 24 weeks (respectively) and were included in the NMA; the others included different endpoints that were not evaluated in this NMA.

Base case NMA results

NMA results for the csDMARD-IR and TNFi-IR populations are shown in tables 3 and 4. Versus csDMARDs in the csDMARD-IR population, superior efficacy was observed for sarilumab 200 mg and sarilumab 150 mg on all outcomes (table 3). Sarilumab 200 mg showed superior efficacy versus baricitinib 2 mg, tofacitinib and certolizumab combinations on 24 week mTSS, and similar efficacy versus baricitinib 4 mg, adalimumab, etanercept, golimumab, infliximab and tocilizumab combinations (all doses) on all other outcomes. Sarilumab 150 mg showed similar efficacy to all lower doses of targeted DMARD combinations on all outcomes. Rates of SI/SAE were similar for sarilumab 150 mg and sarilumab 200 mg versus all comparators in csDMARD-IR.

Table 3

Summary results for sarilumab 200 mg q2w combination vs other DMARD combinations in the csDMARD-IR population: median estimates of relative treatment effects (95% credible intervals) (base case) for week 24 efficacy and week 52 mTSS and safety (SI, SAE)

Table 4

Summary results for sarilumab 200 mg q2w combinations vs other combinations in the TNFi-IR population: median estimates of relative treatment effects (95% credible intervals) (base case) for week 24 efficacy and safety (SI, SAE)

In the TNFi-IR population (table 4), superior efficacy was observed for sarilumab 200 mg versus baricitinib 2 mg combination on ACR50 and DAS28<2.6 and versus abatacept, golimumab, tocilizumab 4 mg/kg IV and rituximab combinations on DAS28<2.6. On ACR20/50/70, similar efficacy was observed for sarilumab 200 mg compared with abatacept, golimumab, tocilizumab 4 mg and 8 mg/kg IV and rituximab combinations. Sarilumab 150 mg had superior efficacy versus baricitinib 2 mg and rituximab combinations on DAS28<2.6, similar efficacy to all other bDMARD combinations (all lowest approved dose) on all outcomes and similar efficacy to tocilizumab 4 mg on ACR70; however, efficacy was lower versus tocilizumab 8 mg on ACR20 and DAS28 remission. SAEs, including SIs, appeared similar for sarilumab 200 mg and 150 mg versus all comparators.

Scenario analyses

In the csDMARD-IR scenario, which excluded six studies assessing Asian patients only, results were similar to the ACR20/50 base cases for sarilumab 200 mg against all comparators. However, sarilumab 200 mg was superior to tocilizumab IV 4 mg/kg combination for ACR70. For the scenario that included the ATTRACT and SWEFOT studies, sarilumab 200 mg combination therapy showed superiority to csDMARD and sarilumab 150 mg combination for mTSS at 52 weeks, and inferiority to infliximab combination in the fixed-effects model. In the random-effects model, sarilumab 200 mg combination was comparable to all treatments. In the scenario that pooled all 13 TNFi treatment interventions from the 43 studies included in the csDMARD-IR network, sarilumab 200 mg combination therapy was found to be superior to csDMARDs and comparable to all other combination therapies. The scenario in the TNFi-IR population excluding the GO-AFTER study obtained results consistent with the base case for sarilumab 200 mg against all the comparators except for golimumab combination on ACR20/70.


Active comparator controlled, randomised trials evaluating the comparative efficacy and safety of bDMARDs or tsDMARDs are few and limited to adalimumab as an active comparator.4 25–29 In the absence of head-to-head trial evidence, indirect comparison through an NMA provides best estimates of comparative efficacy. NMA also provides fully conditional estimates of relative treatment effect. This NMA was undertaken to compare sarilumab versus relevant csDMARD, bDMARD and tsDMARD comparators in csDMARD-IR and TNFi-IR adult RA patient populations.

In the csDMARD-IR network, sarilumab showed significantly better efficacy versus csDMARDs, consistent with the head-to-head evidence from MOBILITY-B, and similar efficacy and safety to combination therapies including all licensed biologics, and the tsDMARDs tofacitinib and baricitinib. Typically, safety outcomes presented broad credible intervals due to their relatively low occurrence.

Sarilumab showed significantly better efficacy versus csDMARD in the TNFi-IR population, consistent with the head-to-head evidence from TARGET and comparable efficacy and safety to other biological regimens for most outcomes. Both doses of sarilumab showed favourable outcomes on ACR50 and on DAS28 remission in the TNFi-IR population compared with combination therapies with baricitinib and tocilizumab and all other bDMARDs and tofacitinib.

Strengths and limitations

There are considerable challenges in undertaking an NMA when there is heterogeneity in the placebo arms across trials. Variability in placebo response was observed across both csDMARD-IR and TNFi-IR networks. Some degree of variation in patient characteristics across studies is an inevitable feature of the RA evidence base given the evolution of clinical trial design and patient populations over the 20-year period since the first biological trials. Furthermore, geographic location may be another potential confounding factor in RA clinical trials. There are also key differences in the inclusion/exclusion of studies. If these characteristics are effect modifiers of the relative treatment effects of interest, the heterogeneity of the evidence base15 23 30 can limit the validity of indirect comparisons. Therefore, a scenario analysis was conducted to test weight as a potential modifier.

In addition, a high level of heterogeneity of response in the placebo arms of studies (ie, placebo+csDMARDs in combination studies) has been previously noted by NICE, using certolizumab pegol in RA as an example,19 where the treatment effect expressed as log ORs had a negative relationship with the baseline risk.14 24 This is an issue that can particularly limit indirect comparisons. One explanation for heterogeneity in placebo arms of recent studies may be that more recent trials in RA have included larger proportions of patients from the Latin American region, whereas earlier trials included a higher proportion of patients from North America and Western Europe. For the Latin American region, higher response rates in RA have been noted by NICE in the placebo arm as well as the active arm, compared with other regions.19 This phenomenon was also observed in the phase III MOBILITY and TARGET trials for sarilumab and other RA trials, including the GO-FORWARD trial of golimumab and the tocilizumab trials. Several reasons could account for regional variation, including differences in background and prior care, differences in patient conceptualisation of PRO components of outcome measures and differences in physician approach to practice. In the present study, variation in the placebo responses across studies was addressed by applying alternative analytical methods. We attempted to address this issue within the NMA methodology, as baseline risk regression has been suggested as a solution to this and has been previously used in RA.19 20 31 It performs well when there is a large number of trials in a network, as in the csDMARD-IR population.

An additional challenge was met for the smaller TNFi-IR network. While sparse data preclude a number of analytic options, including meta-regression, NMA on risk difference is a promising strategy to address this limitation.14 The TNFi-IR outcome networks were small (with at most seven studies) and it was therefore difficult to obtain model convergence and precise estimation by following the random-effects model (REM) approach. To address this issue, less vague priors were used for the relative treatment effect on the log-odds scale (under the belief of OR=(0,500), d~Normal (0,10)) and for the study effect (under the belief of p=(0.005, 0.995), mu~Normal (0,10)), and, based on the work of Spiegelhalter and colleagues, for the coefficient of regression.32 The latter was estimated from BR of ACR20/50/70 in csDMARD-IR with variance less than 1 (SD=(0.13;0.85)) and with a mean of 0 in order to allow for both negative and positive values: B~Normal (0,1). The uniform prior for the between-study SD was also narrowed gradually. However, even with informative priors, very wide credible intervals were obtained; BR-NMA results for the TNFi-IR population were highly uncertain (eg, OR of ACR20 of sarilumab 200 mg combination versus csDMARDs observed in the TARGET trial was 3.28 with 95% CI 2.11, 5.12, while the NMA regression result was 2.50 with 95% CI 0.82, 6.78).
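The range of values implied by a normal prior on the log-odds scale can be checked with a short calculation. The sketch assumes that the second argument of Normal (0,10) is a variance, which the text does not state explicitly; the function names are hypothetical.

```python
import math

def implied_or_range(variance, z=1.96):
    """Central 95% range of the OR implied by a Normal(0, variance) prior
    on the log OR."""
    half_width = z * math.sqrt(variance)
    return math.exp(-half_width), math.exp(half_width)

def implied_probability_range(variance, z=1.96):
    """Central 95% range of a probability implied by a Normal(0, variance)
    prior on its logit."""
    half_width = z * math.sqrt(variance)
    expit = lambda x: 1.0 / (1.0 + math.exp(-x))
    return expit(-half_width), expit(half_width)

or_low, or_high = implied_or_range(10.0)
p_low, p_high = implied_probability_range(10.0)
```

Under this assumption the upper OR bound comes out just below 500, in line with the stated belief, while the implied probability range is slightly wider than the (0.005, 0.995) quoted.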

Thus, in the present NMA, the RD-NMA models worked well, even in a situation with few studies or in the case of rare events (eg, SI or SAE) and predicted data well, with a higher degree of certainty than the BR-NMA. We confirmed the reliability of this approach by reconducting analyses for the ACR20/50/70 networks using a probit random-effect model and informative priors19 20 31 for the between-study variance (log normal with mean −2.56 and variance of 1.74*1.74, proposed by Turner et al (2012)),33 and results were consistent with RD-NMA models.
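To give a sense of scale for the informative prior on the between-study variance, the sketch below computes the median and central 95% range of a log-normal distribution with the stated parameters. The function name is hypothetical and the calculation is only a reading aid for the prior, not part of the reported analysis.

```python
import math

def lognormal_summary(mu, sigma, z=1.96):
    """Median and central 95% range of a log-normal(mu, sigma^2) variable."""
    median = math.exp(mu)
    return median, (math.exp(mu - z * sigma), math.exp(mu + z * sigma))

# Parameters quoted in the text for the between-study variance
tau2_median, tau2_range = lognormal_summary(-2.56, 1.74)
tau_median = math.sqrt(tau2_median)  # corresponding between-study SD
```

The prior median for the between-study variance is below 0.1 on the model scale, so the prior favours modest heterogeneity while still allowing much larger values.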

Finally, we faced a situation whereby different dosing regimens for some drugs were evaluated across studies. To solve this issue, we first assessed whether the CIs of the individual studies overlapped. In most cases there was overlap; these studies were therefore pooled, and the validity of this approach was justified via clinical input. In the one case where there was no overlap, etanercept 25 mg two times a week versus etanercept 50 mg once a week, clinical input informed the decision to pool these comparable regimens.
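The overlap screen described above amounts to a simple interval comparison. The sketch below shows one way to express it; the interval values are hypothetical and the function name is illustrative. As the text notes, overlap was only a first step, and the final pooling decision rested on clinical input.

```python
def intervals_overlap(ci_a, ci_b):
    """Return True if two (lower, upper) intervals share at least one point."""
    (a_lo, a_hi), (b_lo, b_hi) = ci_a, ci_b
    if a_lo > a_hi or b_lo > b_hi:
        raise ValueError("each interval must be given as (lower, upper)")
    return a_lo <= b_hi and b_lo <= a_hi

# HYPOTHETICAL 95% CIs for the same outcome under three dosing regimens.
regimen_a = (1.8, 4.2)
regimen_b = (2.5, 6.0)
regimen_c = (4.5, 7.0)

print(intervals_overlap(regimen_a, regimen_b))  # overlapping: candidate for pooling
print(intervals_overlap(regimen_a, regimen_c))  # disjoint: refer to clinical input
```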

The robustness of this NMA derives from the exploration and application of rigorous methods to account for heterogeneity and from the inclusion of up-to-date evidence, including the new bDMARD sarilumab and the tsDMARD baricitinib. A range of efficacy and safety outcomes also provided a comprehensive picture of the comparative efficacy and safety of sarilumab in the csDMARD-IR and TNFi-IR populations, to inform clinical decision-making and the conduct of health technology assessments. The most robust networks, ACR20/50, used only one common comparator on all comparisons with sarilumab on these endpoints. Moreover, there was no major concern of inconsistencies given that the appropriate models were implemented; so, for outcomes with plentiful studies, as in the csDMARD-IR population, the results were considered robust. Four scenario analyses confirmed the results against the base case analysis, where comparisons were feasible.

Many NMAs have been published in RA, differing in their precise aims, inclusion criteria, analyses performed and results. Thorlund and colleagues34 reviewed 13 published NMAs and, despite similar stated eligibility criteria and objectives, found differences in the estimated treatment effects, the inclusion of trials, analytic approaches and endpoints evaluated. For example, some studies report DAS28-ESR and others report DAS28-CRP. In the present NMA, we examined both outcomes, although the variability in outcome definition may have affected the DAS28 results, and so it may not be appropriate to compare the results of this NMA fully with those of previously published NMAs. However, published NMAs have shown similar efficacy and safety between different biological drugs for the majority of comparisons14 34 and the results of the present NMA for those biologics are in line with these findings.

In the present NMA, there were limitations to the conclusions that could be made for the efficacy of sarilumab versus use of a further TNFi in TNFi-IR patients due to the very limited evidence base. The only trial that could be included was the GO-AFTER trial, in which only ~58% of patients had failed their previous TNFi because of lack of efficacy. This percentage is lower than in the other included studies, in which almost 100% of patients had failed a previous TNFi due to lack of efficacy (eg, TARGET, 92.3%). Therefore, the conclusions regarding the relative efficacy and safety of sarilumab (or other non-TNFis) versus TNFis in TNFi-IR patients should be interpreted with caution.

Nonetheless, this NMA was conducted following best practice guidelines and demonstrated that sarilumab SC at both 150 mg and 200 mg doses in combination with csDMARDs or methotrexate has superior efficacy compared with csDMARDs alone and comparable or better efficacy compared with other biological and targeted synthetic combination therapies in both csDMARD-IR and TNFi-IR patient populations. Sarilumab 150 mg and 200 mg had similar efficacy and safety to tocilizumab 4 mg/kg and 8 mg/kg intravenously. SAEs including SIs appeared similar for sarilumab 150 mg and 200 mg versus all comparators.


The authors would like to thank Parexel for conducting the literature searches and analyses. Medical writing assistance and editorial support, under the direction of the authors, were respectively provided by Gauri Saal, MA Economics, and Sinead Stewart, both of Prime (Knutsford, UK), funded by Sanofi/Regeneron Pharmaceuticals according to Good Publication Practice guidelines. The Sponsor was involved in the study design, collection, analysis and interpretation of data as well as data checking of information provided in the manuscript. The authors had unrestricted access to study data, were responsible for all content and editorial decisions and received no honoraria related to the development of this publication.




  • Funding This study was sponsored by Sanofi and Regeneron Pharmaceuticals, Inc.

  • Competing interests EC has received research grants, consultancy and speaker fees from Amgen, Biogen, Bristol-Myers Squibb, Boehringer Ingelheim, Celgene, Chugai Pharma, Eli Lilly, Janssen, Novimmune, Novartis, Pfizer, Regeneron, Roche, R-Pharm, Sanofi, Tonix and UCB. T-M-TH, LP, HvH and PC are employees of Sanofi and hold stock and/or stock options in the company. CP is a former employee of and current shareholder in Sanofi and current employee of Novartis. C-IC, AK and EM are employees of Regeneron Pharmaceuticals, Inc. and hold stock and/or stock options in the company. NF has received consulting fees from Sanofi, Novo Nordisk, Ipsen, Allergan, Takeda, Biogen, AbbVie and Lifecell.

  • Patient consent Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement No associated data will be shared.