Objectives In this systematic review, we aim to identify laboratory biomarkers that predict response to tumour necrosis factor inhibitors (TNFi) in patients with rheumatoid arthritis (RA).
Methods EMBASE, PubMed and Cochrane Library (CENTRAL) were searched for studies that presented predictive accuracy measures of laboratory biomarkers, or in which these were calculable. Likelihood ratios were calculated in order to determine whether a test result relevantly changed the probability of response. Likelihood ratios between 2 and 10 or between 0.5 and 0.1 were considered weak predictors, and ratios above 10 or below 0.1 were considered strong predictors of response. Primary focus was on biomarkers studied ≥3 times.
Results From 41 included studies, data on 99 different biomarkers were extracted. Five biomarkers were studied ≥3 times, being (1) anti-cyclic citrullinated peptide (CCP), (2) rheumatoid factor, (3) –308 polymorphism in the TNF-α gene, (4) SE copies in the HLA-DRB1 gene and (5) FcGR2A polymorphism. No studies showed a strong predictive association and only one study on anti-CCP showed a weak positive association.
Conclusions No biomarkers were found that consistently showed a (strong) predictive effect for response to TNFi in patients with RA. Given the disappointing yield of previous predictive biomarker research, future studies should focus on exploring, combining and validating the most promising laboratory biomarkers identified in this review, and searching for new predictors. Besides this, they should focus on contexts where prediction-aided decision-making can have a large impact (even with limited predictive value of markers/models).
PROSPERO registration number CRD42021278987.
- Arthritis, Rheumatoid
- Tumor Necrosis Factor Inhibitors
Data availability statement
All data relevant to the study are included in the article or uploaded as supplemental information.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
WHAT IS ALREADY KNOWN ON THIS TOPIC
Although previous systematic reviews did not find biomarkers for prediction of tumour necrosis factor inhibitor (TNFi) response in patients with rheumatoid arthritis (RA), the number of studies investigating biomarkers is constantly increasing.
WHAT THIS STUDY ADDS
This review provides updated information about the predictive value of laboratory biomarkers for treatment response to TNFi in RA.
None of the biomarkers identified in this review showed consistent and relevant predictive effects.
HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY
Future studies should focus on exploring and validating the most promising laboratory biomarkers identified in this review, searching for new potential predictors and combining promising predictors.
Researchers should focus on contexts where prediction-aided decision-making can have a large impact (even with limited predictive value of markers/models).
If treatment targets for rheumatoid arthritis (RA) are not achieved with conventional synthetic (cs) disease-modifying antirheumatic drugs (DMARDs), current guidelines recommend starting a biological DMARD (bDMARD) or a targeted synthetic DMARD (tsDMARD).1 There are different types of bDMARDs, with tumour necrosis factor inhibitors (TNFi) being the most widely used. As TNFi have proven their effectiveness, are well tolerated and have been used worldwide for more than 20 years, they are often the first bDMARD prescribed in clinical practice. However, previous studies show that a substantial proportion of patients do not respond to, or tolerate, their first bDMARD treatment.2–4
If the effect of the first bDMARD is not sufficient after 3–6 months, patients switch to another bDMARD or tsDMARD following a trial-and-error approach. During this process, patients may experience temporarily higher disease activity. Although this is usually bridged by use of glucocorticoids, the increase in disease activity causes burden in terms of clinical symptoms (eg, pain, fatigue), decreased functioning and an increased risk of irreversible joint damage.5 Additionally, there are costs associated with higher disease activity and with every switch in medication (eg, consultations, absenteeism, loading doses). Reaching remission or low disease activity earlier in the treatment course is also beneficial: first, earlier remission is associated with sustained remission, and second, dose tapering can be initiated sooner, which lowers the chance of dose-dependent side effects and reduces costs. Thus, being able to predict response to different treatments might be of value.
In the past, several reviews assessed the predictive value of biomarkers for prediction of response to bDMARD treatment in rheumatology, but these reviews did not identify strong or consistent biomarkers for the prediction of response to biological treatment.6–8 This indicates that finding a valuable biomarker is difficult, especially since the biomarker should be of added value in the context of current treat-to-target clinical care. Despite this, the number of studies investigating biomarkers has increased considerably over the past years, and recent systematic evidence synthesis is lacking. This is particularly true for the field of laboratory biomarkers, with new markers, analysis techniques and easier access to genetic testing. Therefore, in the current review, we focus on laboratory biomarkers that can be measured by biochemical tests in blood and/or urine.
We aimed to systematically summarise data on predictive value of laboratory biomarkers that are measured before the start of TNFi treatment and predict response after 3–6 months in patients diagnosed with RA.
Search strategy and article selection
EMBASE, PubMed and Cochrane Library (CENTRAL) were searched for relevant papers from inception until September 2020. The search strategy contained four domains: the patient group (patients with RA), the (pharmacological) intervention (TNFi), the predictor (biomarkers) and outcome parameters (response criteria). The complete search strategy can be found in online supplemental file 1. This review was registered in PROSPERO (2021 CRD42021278987) and the AMSTAR-2 (A MeaSurement Tool to Assess systematic Reviews) checklist was used as reporting guideline.9
Articles were independently screened by two authors (BvdB and MHMW) for title and abstract according to the following prespecified criteria. Articles proceeded if they (1) concerned human studies in patients with RA treated with TNFi (etanercept, adalimumab, infliximab, golimumab or certolizumab pegol); (2) investigated a laboratory biomarker, measurable by tests in blood and/or urine (synovial biomarkers were excluded as our focus was on markers that could be easily implemented in routine care, which often does not include synovium biopsies); (3) defined response by 28-joint Disease Activity Score (DAS28), European Alliance of Associations for Rheumatology (EULAR) or American College of Rheumatology (ACR) response criteria as these are considered the most valid response measures; (4) included ≥50 patients in the analysis and (5) were written in English, for correct interpretation by the research team. Full-text reports were obtained if the inclusion criteria were met or if any uncertainty was present based on title/abstract screening. During full-text analysis, articles were excluded if the biomarker was determined after start of TNFi treatment, if the biomarker was not predefined (eg, genome-wide association studies were excluded if no marker-specific validation was performed), if response was measured <12 weeks or >30 weeks after start of TNFi, if the article contained no original data or if any criteria from the title/abstract screening were not met in the full-text report. Additionally, studies were only included if predictive accuracy measures (sensitivity/specificity) of the biomarker were reported or if it was possible to calculate these (eg, number of true/false positives/negatives given). Multivariable models including biomarkers were also included. Randomised controlled trials as well as prospective and retrospective cohort studies were included, as we deemed these designs to be appropriate for answering our research question.
Reasons for exclusion of studies in the full-text phase were recorded and can be found in online supplemental file 2. Additional studies were identified by scanning the reference lists of included studies or relevant reviews identified through the search, scanning papers that cited included studies and by consulting experts, in order to ensure literature saturation.
From each study, the following data were extracted: general information (ie, authors, title, year of publication), study and patient characteristics (ie, sample size, type of TNFi, duration of follow-up, disease duration, medication history, concurrent csDMARD use), biomarker characteristics (ie, name, cut-off), primary outcome (ie, scoring system for disease activity, definition of response) and results (ie, true positives, true negatives, false positives, false negatives). Data extraction was done independently and in parallel by two reviewers (MHMW and BvdB) for a random sample of the eligible studies. The results were compared and differences discussed until agreement was reached. After agreement was reached, the remaining data were extracted and checked by two reviewers (MHMW and BvdB, respectively). Data extraction was done using a data extraction form. Dichotomisation is essential for application in clinical practice, as treatment choices are dichotomous. Therefore, data needed to be recategorised as binary and presented as a 2×2 table. For biomarkers with >2 categories (ie, genetic biomarkers), we used the category (ie, genetic variant) that was most commonly used in other studies using that predictor. The outcome was defined as response yes/no. For EULAR response criteria, moderate and good responders were pooled if possible, but studies solely reporting EULAR good response were also included. Studies reporting an absolute DAS28 ≤3.2 after follow-up or an absolute improvement in DAS28 of ≥1.2 were included, as these criteria were considered to be sufficiently comparable with EULAR response criteria. For ACR response criteria, ACR50 was preferred as this was deemed comparable with EULAR good and moderate response, but if only ACR20 was mentioned, this was also accepted. ACR70 outcomes were not included as these were considered too strict compared with the EULAR response definition. The preferred duration of follow-up was 6 months.
For studies showing results at multiple time points, the points closest to 6 months (24 weeks) were included with a minimum of 12 weeks and a maximum of 30 weeks of follow-up.
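The time point rule above can be illustrated with a short sketch. This is not the authors' extraction procedure, only one way to express it: the function name `select_time_point` is hypothetical, and the tie-breaking rule (preferring the later of two equally close assessments) is an assumption, as the text does not say how ties were handled.

```python
from typing import Optional, Sequence

TARGET_WEEKS = 24   # preferred follow-up (6 months)
MIN_WEEKS = 12      # earliest accepted assessment
MAX_WEEKS = 30      # latest accepted assessment


def select_time_point(weeks_reported: Sequence[float]) -> Optional[float]:
    """Return the reported assessment closest to 24 weeks within 12-30 weeks.

    Returns None when no reported time point falls inside the window,
    which corresponds to exclusion of the study.
    """
    eligible = [w for w in weeks_reported if MIN_WEEKS <= w <= MAX_WEEKS]
    if not eligible:
        return None
    # Assumption: on a tie (eg, 22 and 26 weeks) the later assessment is used.
    return min(eligible, key=lambda w: (abs(w - TARGET_WEEKS), -w))


# Hypothetical study reporting response at 4, 12, 26 and 52 weeks
print(select_time_point([4, 12, 26, 52]))
```

A study reporting only week 6 and week 36 would return `None` under this sketch, matching the exclusion of response measured <12 or >30 weeks after start of TNFi.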
Risk of bias assessment
Quality of included studies was assessed using the Quality In Prognosis Studies (QUIPS) tool. This tool addresses six domains of bias, but items 5 (Study Confounding) and 6 (Statistical Analysis and Reporting) were not scored because we extracted unadjusted and unanalysed data from the studies. Each of the domains was judged as having low, moderate or high risk of bias. All included studies were assessed by two authors (BvdB and MHMW), after which results were compared and differences discussed until agreement was reached for each domain for each study.
Biomarkers were divided into two groups: biomarkers studied at least three times and biomarkers studied once or twice. For each study, the positive predictive value (PPV), negative predictive value (NPV), sensitivity and specificity of the specific biomarker are presented, including when the biomarker consists of multiple variables (ie, prediction models). Likelihood ratios (LRs) are calculated in order to determine whether a test result relevantly changes the probability of response. LRs between 2 and 10 are considered weak positive predictors and ratios greater than 10 strong positive predictors of response; conversely, LRs between 0.5 and 0.1 are considered weak negative predictors and ratios below 0.1 strong negative predictors of response.10 These predictive value criteria in combination with the quality of included studies (risk of bias) show which biomarkers are promising. When pooling results was deemed appropriate, an additional meta-analysis was performed.
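As an illustration of how these measures follow from a 2×2 table, a minimal sketch is given below. It uses the standard definitions of sensitivity, specificity, PPV, NPV, LR+ and LR−; the names `TwoByTwo`, `accuracy_measures` and `classify_lr` are hypothetical, the counts in the example are invented, and treating the boundary values (exactly 2, 10, 0.5 or 0.1) as belonging to the weak categories is an assumption, since the text does not specify this.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    tp: int  # biomarker positive, responder
    fp: int  # biomarker positive, non-responder
    fn: int  # biomarker negative, responder
    tn: int  # biomarker negative, non-responder


def accuracy_measures(t: TwoByTwo) -> dict:
    """Standard accuracy measures; assumes no row or column total is zero."""
    sens = t.tp / (t.tp + t.fn)
    spec = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        # LR+ = sensitivity / (1 - specificity); infinite if no false positives
        "lr_pos": sens / (1 - spec) if spec < 1 else float("inf"),
        # LR- = (1 - sensitivity) / specificity; infinite if no true negatives
        "lr_neg": (1 - sens) / spec if spec > 0 else float("inf"),
    }


def classify_lr(lr: float) -> str:
    """Apply the review's thresholds (boundary handling is an assumption)."""
    if lr > 10:
        return "strong positive"
    if lr >= 2:
        return "weak positive"
    if lr < 0.1:
        return "strong negative"
    if lr <= 0.5:
        return "weak negative"
    return "not relevant"


# Invented counts, for illustration only
m = accuracy_measures(TwoByTwo(tp=40, fp=10, fn=20, tn=30))
print(classify_lr(m["lr_pos"]), "/", classify_lr(m["lr_neg"]))
```

In this invented example sensitivity is 40/60 and specificity 30/40, so LR+ is about 2.7 and LR− about 0.44, and both would count only as weak predictors.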
Our search in PubMed, Embase and Cochrane Library resulted in 3455 articles suitable for screening of title and abstract, after which 235 full-text articles were screened (figure 1). During full-text evaluation, a total of 194 articles were excluded; reasons for exclusion are depicted in figure 1. From the remaining 41 studies, data on 99 different biomarkers were extracted. Results of the risk of bias assessment showed that 8 of 41 studies scored low risk of bias on each subdomain of the QUIPS tool, and 8 studies scored high risk of bias on ≥1 subdomain. Detailed results of the risk of bias assessment can be found in online supplemental file 3.
Biomarkers studied more than two times
Five biomarkers were analysed in more than two studies (table 1), being (1) anti-cyclic citrullinated peptide (anti-CCP), (2) rheumatoid factor (RF), (3) –308 polymorphism in the TNF-α gene, in which the GG genotype was considered the variant predictive of response, (4) presence of one or two SE copies in the HLA-DRB1 gene and (5) FcGR2A polymorphism (rs1081274), in which the RR genotype was considered the variant predictive of response. These biomarkers were studied in 24 unique studies for which study characteristics are shown in table 2. These studies included different TNFi, that is, etanercept, adalimumab, infliximab, certolizumab pegol and golimumab. Response was measured using different response criteria: EULAR (n=14), relative DAS28 decrease >1.2 (n=5), further (n=2), ACR50 (n=1), EULAR good response (n=1) and absolute DAS28 >3.2 (n=1). None of the studies showed an LR greater than 10 or below 0.1 for any of the biomarkers. For presented LRs, predictors were not combined with other known predictors of response. Anti-CCP was investigated in eight studies. One study showed a weak positive association (LR+ between 2.0 and 10).11 The effect for the other studies was non-significant, and the direction of the effect was conflicting. Five studies showed a positive direction of the effect of anti-CCP positivity in relation to response11–15 and three studies showed a negative direction.16–18 This was also true for RF, as four out of nine studies showed a negative direction of the effect of RF positivity towards response,16 18–20 one study showed no association21 and four studies showed a positive direction of the effect of RF positivity.11–13 22 These conflicting results for anti-CCP and RF were all univariate. Some studies also performed additional multivariable analyses accounting for other variables, and these results showed no statistical significance for anti-CCP and RF as a predictor. 
Seven studies addressed the −308 polymorphism of the TNF-α gene and response to TNFi. Of these, six studies showed a positive direction of effect (LR+ between 1.05 and 1.63) between the GG genotype and response to etanercept and infliximab,23–28 and one study showed a negative direction of effect between the GG genotype and response to adalimumab.29 Four studies investigated SE copies in the HLA-DRB1 gene24 29–31 and three studies investigated the FcGR2A polymorphism32–34; none of them showed significant predictive value.
Biomarkers studied once or twice
Seventy biomarkers were studied once or twice (online supplemental file 4). The majority of these biomarkers were gene polymorphisms and proteins. No biomarkers showed an LR greater than 10 or below 0.1 in any individual study. However, three biomarkers (all studied once) showed both a sensitivity and a specificity above 70%. These were high levels (>3.5 pg/mL) of granulocyte-macrophage colony-stimulating factor,35 high interleukin (IL)-34 concentration (>194.12 pg/mL)36 and the combined biomarker of high serum IL-6 and low survivin level (high IL-6 defined as >41.59 pg/mL and low survivin defined as ≤780.74 pg/mL).37
In this review, we summarised literature on laboratory biomarkers potentially predictive of response to TNFi in patients with RA. None of the five biomarkers analysed in more than two studies, being anti-CCP status, RF status, −308 GG polymorphism in the TNF-α gene, one or two HLA-DRB1 SE copies and the RR polymorphism in the FcGR2A gene, showed an LR greater than 10 or below 0.1. One out of eight studies on anti-CCP showed a weak positive association (LR greater than 2). The five biomarkers studied more than two times showed inconsistent directions between studies, questioning the predictive value of these biomarkers.
Our review included additional studies addressing laboratory biomarkers for TNFi response compared with the previous review by Cuppen et al.8 However, the findings are similar, as both reviews concluded that results for anti-CCP status, RF status and the −308 GG polymorphism in the TNF-α gene were non-significant and inconsistent among studies in TNFi. Presence of one or two HLA-DRB1 SE copies was mentioned by Cuppen et al as a promising biomarker due to an added predictive value of >15%; however, this marker had only been studied once at that time. In the current review, we found only very weak associations for this biomarker in four included studies, questioning the predictive value of this biomarker as well.
As no single strong response predictors seem to exist, combination of multiple biomarkers might be necessary. Beforehand, we expected to find multivariable models, consisting of multiple laboratory biomarkers, in our review. Our search did yield a number of multivariable prediction models; however, these included patient characteristics as well and were therefore beyond the scope of this review. Exclusion of patient, disease and treatment characteristics was chosen as no characteristics have shown strong predictive effect influencing clinical decision-making.7 However, this can be considered a limitation of our review. We found one multivariable predictor that met our inclusion criteria, which was the combination of serum IL-6 and survivin.37 This biomarker showed a sensitivity and specificity of 80% and 91%, respectively, and is therefore promising, although replication is warranted.
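To make concrete what a sensitivity of 80% and a specificity of 91% imply in terms of LRs and the probability of response, the sketch below applies the standard odds form of Bayes' theorem. The function names are hypothetical, and the pretest probability of response (60%) is an assumed value chosen purely for illustration; it is not taken from this review or from the cited study.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple:
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability: float, lr: float) -> float:
    """Convert probability to odds, multiply by the LR, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)


SENSITIVITY = 0.80   # reported for the IL-6/survivin combination
SPECIFICITY = 0.91   # reported for the IL-6/survivin combination
PRE_TEST = 0.60      # ASSUMPTION: illustrative pretest probability of response

lr_pos, lr_neg = likelihood_ratios(SENSITIVITY, SPECIFICITY)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"P(response | positive) = {post_test_probability(PRE_TEST, lr_pos):.2f}")
print(f"P(response | negative) = {post_test_probability(PRE_TEST, lr_neg):.2f}")
```

With these inputs LR+ is roughly 8.9 and LR− roughly 0.22, so both fall in the weak range of the thresholds used in this review, which is consistent with no biomarker reaching an LR above 10 or below 0.1.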
Some limitations of this study have to be considered. First, we wanted to be able to calculate specificity, sensitivity, PPV, NPV, LR+ and LR−. These measures are important for demonstrating the accuracy of a biomarker in predicting response, and for judging the credibility of the findings.38 However, this criterion led to exclusion of many studies (n=80). Incomplete reporting of data may have been partly caused by the fact that prediction of response was rarely the main research question, and may have introduced reporting bias. For feasibility reasons, we did not contact corresponding authors of these excluded articles for data in the correct format, which can be considered a limitation as well. On the other hand, we aimed to maximise the chance of finding an effect if a true effect was present. Therefore, the most sensible options were accepted regarding duration of follow-up. Since no biomarker showed a strong relation in terms of its LR, the risk of bias did not influence interpretation of our results.
The interpretation of the potential added value of biomarkers in clinical care is complex and often counterintuitive. First, the context in which prediction is of added value to current clinical care should be evaluated, taking into account whether the result of the predictor can truly influence clinical decision-making. Additionally, characteristics of the predictor should be taken into account such as consistent test results, measurable irrespective of cotreatment, result available without delay and low costs. Lastly, prediction is often embedded in research with a different purpose, leading to poor outcome reporting and no validation. It is relatively easy to take a first step by looking at predictors in a cohort or trial, but it is important to include all known predictors, report correct predictive outcome measures and perform validation.
In conclusion, this review provides a systematic overview of laboratory biomarkers for prediction of response to TNFi in RA. Currently, no single biomarker leads to a relevant change in the probability of response or can be of value for clinical practice. Future studies should focus on exploring the most promising laboratory biomarkers identified in this review, searching for new potential predictors and combining promising predictors. Researchers should pay attention to the context in which the biomarker is used, outcome reporting and validation of findings.
Contributors Literature search—MHMW and BvdB. Data collection and analysis—MHMW and BvdB. Guarantor—MW. All authors contributed to data interpretation and writing of the manuscript. Furthermore, all authors approved the final version of the manuscript.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.