Abstract
Objectives To assess the sensitivity and specificity of the 2019 EULAR/American College of Rheumatology (ACR) classification criteria for systemic lupus erythematosus (SLE) in outpatients at an academic tertiary care centre and to compare them to the 1997 ACR and the 2012 Systemic Lupus International Collaborating Clinics criteria.
Methods Prospective and retrospective observational cohort study.
Results 3377 patients were included: 606 with SLE, 1015 with non-SLE autoimmune-mediated rheumatic diseases (ARD) and 1756 with non-ARD diseases (hepatocellular carcinoma, primary biliary cirrhosis, autoimmune hepatitis). The 2019 criteria were more sensitive than the 1997 criteria (87.0% vs 81.8%), but less specific (98.1% vs 99.5% in the entire cohort and 96.5% vs 98.8% in patients with non-SLE ARD), resulting in Youden Indexes for patients with SLE/non-SLE ARD of 0.835 and 0.806, respectively. The most sensitive items were history of antinuclear antibody (ANA) positivity and detection of anti-double-stranded deoxyribonucleic acid (dsDNA) antibodies. These were also the least specific items. The most specific items were class III/IV lupus nephritis and the combination of low C3 and low C4 complement levels, followed by class II/V lupus nephritis, either low C3 or low C4 complement levels, delirium and psychosis, when these were not attributable to non-SLE causes.
Conclusions In this cohort from an independent academic medical centre, the sensitivity and specificity of the 2019 lupus classification criteria were confirmed. Overall agreement of the 1997 and the 2019 criteria was very good.
- Systemic Lupus Erythematosus
- Lupus Erythematosus, Systemic
- Autoimmune Diseases
Data availability statement
No data are available. Not applicable.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
WHAT IS ALREADY KNOWN ON THIS TOPIC
The 2019 EULAR/American College of Rheumatology (ACR) classification criteria for systemic lupus erythematosus (SLE) were designed and validated to ensure highly specific selection of patients with SLE for clinical studies. Data are scarce on how these criteria perform in clinical practice and how they compare with the 1997 ACR classification.
WHAT THIS STUDY ADDS
This study applies the 2019 and the 1997 classification criteria to a large cohort of patients with and without SLE from an academic tertiary care centre. Compared with the original validation cohort, the sensitivity of the 2019 criteria was somewhat lower and the specificity was comparable. There was good overall agreement of the 2019 classification with the 1997 classification.
HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY
This study underscores the applicability of the 2019 EULAR/ACR classification in clinical practice.
Introduction
Systemic lupus erythematosus (SLE) is an enigmatic disease. The name was most likely coined in the early Middle Ages, if not earlier, to denote the devouring of flesh like a wolf’s bite in affected patients.1 However, just like a wolf may prowl through the woods and never be seen, the disease notoriously eludes diagnosis and may be hard to grasp.
In order to facilitate the identification of patients with lupus for clinical trials, classification criteria have been devised and revised over the years. Early work to arrive at a useful classification began in 1962,2 was published in 1971,3 and subsequently revised. The 1997 classification that was adopted by the American College of Rheumatology (ACR)4 was widely accepted and used for clinical trials (as well as for the diagnosis at the bedside) worldwide for some 20 years. This classification was a modification of an earlier version which had been published in 1982 and was also widely used for many years.5 In 1997, the immunological criterion was updated. However, the updated classification itself was never validated until the Systemic Lupus International Collaborating Clinics (SLICC) published an alternative classification in 2012 and compared its performance to the 1997 classification.6 When applied against the validation cohort of the SLICC classification, the 2012 SLICC classification was more sensitive than the 1997 ACR classification (97% vs 83%), but less specific (84% vs 96%),6 thereby increasing the risk of including patients without SLE in lupus trials, which would dilute any treatment effect. Of note, in the original 1982 publication, sensitivity and specificity both were 96%.5
In 2019, the EULAR and the ACR jointly published new classification criteria that incorporated state-of-the-art knowledge about the role of serological findings in SLE.7 8 The working group employed sophisticated methodology. The new classification was meticulously derived from a carefully curated cohort of patients with lupus and non-lupus, and the initial publication included a carefully assembled validation cohort.7 The non-lupus subjects in this cohort had ‘conditions mimicking SLE’, and prior to both derivation and validation, all subjects were centrally adjudicated for the absence or presence of SLE.
The sensitivity of the 2019 EULAR/ACR classification was improved over the 1997 version and comparable to the 2012 SLICC classification. The specificity of the 2019 classification was 93% in the validation cohort, which was equal to the 1997 ACR classification4 when applied to the same cohort, and superior to the 2012 SLICC classification, which, interestingly, exhibited the same specificity as in the initial publication (84%).6 Overall, the 2019 criteria had a larger combined sensitivity and specificity than the 1997 and the 2012 criteria (1.90 vs 1.76 and 1.80, respectively, in the validation cohort).7
Importantly, the 2019 classification declared a history of antinuclear antibody (ANA) positivity as an entry criterion. This excludes a small number of patients with lupus who have never had a positive ANA test from being classified as SLE, which the authors deemed acceptable for the purpose of recruiting for clinical trials.7
Derivation and validation of the 2019 EULAR/ACR criteria were extremely well done, and the new classification was implemented immediately by rheumatologists worldwide. But how does the classification perform in daily clinical practice, where recruitment for studies takes place? Here we report the application of the criteria in a cohort of patients with SLE, patients with non-SLE autoimmune-mediated rheumatic diseases (ARD), and patients with hepatic diseases from a large outpatient clinic of a university medical centre in Germany.
Methods
Subjects
This observational cohort study included patients with a diagnosis of SLE as well as patients with other autoimmune-mediated rheumatic diseases (ARD) and patients from the hepatology clinic (non-ARD) of a single academic medical centre in Germany. The 1997 ACR and the 2019 EULAR/ACR classification systems were applied prospectively to all lupus subjects as they presented to the clinic. Lupus criteria were recorded if and only if the signs and symptoms could not be attributed to an alternative diagnosis, as adjudicated by the patient’s rheumatologist. For patients without lupus, classification was performed retrospectively by chart review.
Patient and public involvement
Patients were not involved in the design and execution of this study.
Diagnosis of SLE
All cases of suspected SLE in the rheumatology clinic were discussed by two experienced physicians. The diagnosis was made or dismissed by consensus. If no agreement was reached, a third rheumatologist was consulted.
Statistical analysis
Sensitivity and specificity of the classification systems as well as individual items were computed in the usual manner. The 1997 ACR and the 2019 EULAR/ACR classification systems were compared using McNemar’s test and by Cohen’s kappa. All analyses were carried out with SAS V.9.4.
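The quantities named above can be written out as a minimal sketch. The analyses in this study were carried out in SAS; the Python below is only an illustration of the standard definitions, and the function names and the example counts are hypothetical and not taken from the study tables.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of patients with the disease who fulfil the criteria."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of patients without the disease who do not fulfil the criteria."""
    return true_neg / (true_neg + false_pos)


def cohens_kappa(both_pos: int, only_a: int, only_b: int, both_neg: int) -> float:
    """Chance-corrected agreement between two classification systems A and B."""
    n = both_pos + only_a + only_b + both_neg
    observed = (both_pos + both_neg) / n
    a_pos = (both_pos + only_a) / n  # proportion classified positive by A
    b_pos = (both_pos + only_b) / n  # proportion classified positive by B
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return (observed - expected) / (1 - expected)


# Hypothetical counts, for illustration only
print(sensitivity(true_pos=90, false_neg=10))
print(specificity(true_neg=190, false_pos=10))
print(cohens_kappa(both_pos=40, only_a=5, only_b=5, both_neg=50))
```

Kappa is computed from the paired 2×2 table of the two systems, which is the same table that McNemar’s test uses for its discordant cells.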
Results
Study population
The study cohort comprised 3377 subjects: 606 from the lupus clinic, 1015 patients without SLE from the rheumatology clinic (non-SLE ARD) and 1756 patients without ARD (table 1). Among the patients with non-SLE ARD, 474 had a diagnosis of vasculitis, 421 had rheumatoid arthritis and 120 had undifferentiated connective tissue disease. Among the patients without ARD, 1172 had hepatocellular carcinoma, 253 had primary biliary cirrhosis and 331 had autoimmune hepatitis. In the SLE cohort, 88.3% were women as opposed to 50.1% in the non-SLE cohort (table 1). The subjects were predominantly Caucasian.
Performance of the 2019 versus the 2012 and the 1997 classification systems
In the current SLE cohort, the 2019 EULAR/ACR criteria were more sensitive than both the 2012 SLICC and the 1997 ACR criteria (87.0% vs 85.3% and 81.8%, respectively; table 2). The specificity of the 2019 EULAR/ACR criteria was somewhat lower than that of both the 2012 and the 1997 systems, regardless of which cohort served as the non-lupus reference. In patients with non-SLE ARD, specificities were 96.5%, 98.1% and 98.8% with the 2019, the 2012 and the 1997 criteria, respectively (table 2). In patients without ARD, specificities were 99.1%, 99.5% and 99.8%, respectively (table 2). With the non-SLE ARD cohort as the reference, the resulting Youden Indexes for the 2019 and the 1997 classifications were 0.835 and 0.806; the combined sensitivities and specificities7 were 183.5% and 180.6%, respectively.
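The Youden Index is sensitivity plus specificity minus 1, so the figures above can be reproduced from the reported percentages. The short sketch below does this arithmetic for the non-SLE ARD reference cohort; the function and variable names are illustrative only.

```python
def youden_index(sens: float, spec: float) -> float:
    """Youden's J = sensitivity + specificity - 1 (both given as proportions)."""
    return sens + spec - 1.0


# Sensitivity and specificity as reported in the text (non-SLE ARD as reference)
reported = {
    "2019 EULAR/ACR": (0.870, 0.965),
    "1997 ACR": (0.818, 0.988),
}

for name, (sens, spec) in reported.items():
    j = youden_index(sens, spec)
    combined = 100 * (sens + spec)
    print(f"{name}: Youden Index {j:.3f}, combined {combined:.1f}%")
```

The combined sensitivity and specificity is the same quantity shifted by 100 percentage points, which is why the two measures always rank the classification systems identically.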
Overall, the agreement of the 1997 ACR and the 2019 EULAR/ACR classification systems was very good with a kappa coefficient of 0.88. However, the 2019 criteria are more likely to detect lupus than the 1997 criteria: In 2.6% of all cases from the combined cohorts, the 2019 criteria would detect lupus where the 1997 criteria would not (table 3). Conversely, in 0.6% of cases, the 1997 criteria would detect lupus where the 2019 criteria would not (p<0.0001 by McNemar’s test).
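McNemar’s test uses only the discordant pairs. As a rough illustration, the sketch below approximates the discordant counts from the rounded percentages given above (2.6% and 0.6% of 3377 subjects); these counts are assumptions reconstructed from the text, not the exact cell counts of table 3, and the version without continuity correction is shown for simplicity.

```python
import math


def mcnemar(b: int, c: int) -> tuple[float, float]:
    """McNemar's chi-square statistic (no continuity correction) and p value.

    b and c are the two discordant cell counts of the paired 2x2 table.
    """
    stat = (b - c) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


n_total = 3377
only_2019 = round(0.026 * n_total)  # approximate: positive by 2019 criteria only
only_1997 = round(0.006 * n_total)  # approximate: positive by 1997 criteria only

stat, p_value = mcnemar(only_2019, only_1997)
print(f"discordant pairs: {only_2019} vs {only_1997}")
print(f"chi-square = {stat:.1f}, p = {p_value:.2g}")
```

Even with these approximate counts, the p value falls well below 0.0001, in line with the result reported above.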
When the entry criterion of a history of ANA positivity was disregarded, sensitivity of the 2019 criteria increased from 87% to 89.9%, while specificity decreased from 98.1% to 97.2% (table 4).
Items’ performance of the 2019 classification system
When the individual items of the 2019 EULAR/ACR classification were analysed for the current cohort, the entry criterion of a history of ANA positivity had a sensitivity of 94.9% (95% CI, 92.8% to 96.5%; table 4), making ANA positivity the single most sensitive item. The next most sensitive item was detection of anti-double-stranded deoxyribonucleic acid (dsDNA) antibodies (78.9%; 95% CI, 75.4% to 82.1%; table 4).
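To show how a confidence interval of this kind arises, the sketch below computes a Wilson score interval for a proportion. Two assumptions apply: the text does not state which interval method was used, so Wilson is a stand-in, and the count of 575 ANA-positive patients out of 606 is reconstructed from the reported 94.9%.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Assumed counts: 606 patients with SLE, of whom 31 lacked documented ANA positivity
ana_positive, n_sle = 575, 606
low, high = wilson_interval(ana_positive, n_sle)
print(f"sensitivity = {ana_positive / n_sle:.1%}")
print(f"95% CI (Wilson) = {low:.1%} to {high:.1%}")
```

An exact (Clopper-Pearson) interval would differ slightly in the last digit, so small deviations from the published bounds are expected.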
Among the 31 patients with SLE (5.1%) who did not have documented ANA positivity, 14 had dsDNA antibodies and 18 had malar rash. Three had lupus nephritis, and in 13 cases, complement (C3/C4) was depleted.
Two items of the 2019 classification were highly specific for SLE: class III/IV lupus nephritis and the combination of low C3 and low C4 complement levels were each 100% specific with narrow CIs (table 4). These were followed by class II/V lupus nephritis, either low C3 or low C4 complement levels, delirium, psychosis and autoimmune haemolysis, each of which had 99.9% specificity.
Considering the individual non-SLE diagnoses, specificities of low C3 and low C4 were 100% for each diagnosis. Specificities of either low C3 or low C4 were 100% for all diagnoses but two: vasculitis and undifferentiated connective tissue disease, where one patient in each group met this criterion.
As expected, the least specific item was a history of ANA positivity (66.6%), followed by thrombocytopenia (87.2%; table 4). Interestingly, even anti-dsDNA antibodies were only 91.1% specific (table 4). Leucopenia was 94.6% specific. All other items had specificities of at least 96.1% (table 4).
Discussion
The 2019 EULAR/ACR criteria for the classification of SLE are the most recent and up-to-date effort to provide a clear framework for the identification of patients with SLE that can be included in clinical trials. SLE is a heterogeneous disease, more a chameleon than a wolf, and this makes it particularly challenging to design and carry out trials of novel medications.9 Lupus trials are notoriously prone to fail to demonstrate benefit.10 Being able to clearly define the study cohort is one important prerequisite to pave the way to success.11
For the purpose of generating homogeneity, the 2019 EULAR/ACR classification includes a history of ANA positivity as an entry criterion.7 This provided for maximum specificity, which was one goal of the EULAR/ACR working group, at the expense of decreased sensitivity.7 Importantly, the classification criteria were not designed to make the diagnosis of SLE, which remains a clinical decision.8 Nonetheless, all of the items in the classification are typical features of SLE, and as a matter of fact, SLE classification criteria have been useful as an adjunctive tool to make the clinical diagnosis ever since their inception.
Here we applied the new criteria to a patient cohort from a large academic centre in Germany with a wide range of rheumatological as well as a number of important non-rheumatological diagnoses that are managed in our outpatient clinic. The specificity of the 2019 criteria, when applied to our non-SLE ARD cohort, was higher than the specificity in the original 2019 validation cohort (96.5% vs 93%, respectively)7 as well as the original derivation and validation cohorts of the 2012 SLICC classification (92% and 84%, respectively).6 When applied to the non-ARD cohort, specificity was even higher (99.1%); however, these patients without ARD are not typically screened for study recruitment. In our patients with SLE, sensitivity was 87%, which is considerably lower than the 96% sensitivity in the original validation cohort.7 Thirteen per cent of patients who were diagnosed with SLE at our institution did not meet the 2019 EULAR/ACR criteria. One possible explanation could be that local experts tended to make the diagnosis in patients who might not have been regarded as patients with SLE by the adjudication committee of the EULAR/ACR working group. If so, this would mean that screening already diagnosed patients with lupus for clinical trials would identify fewer patients, but more likely those with an unequivocal diagnosis of SLE, which was one of the aims of developing novel classification criteria.8 Systemic lupus is a heterogeneous disease. If stringent classification criteria were confused with clinical decision-making, many patients with chronic and debilitating systemic disease might not be managed in a way that relieves their symptoms and improves their long-term prognosis.
The combined sensitivities and specificities of the 2019 criteria and the 1997 criteria were 183.5% and 180.6%, respectively. In the original validation cohort, these numbers were 190% and 176%.7 Thus, the overall performance of the 2019 classification was lower, whereas the performance of the 1997 classification was better in our cohort than in the validation cohort.
The current study is not the first to externally validate the 2019 EULAR/ACR criteria.12 However, to our knowledge, with 606 patients with SLE and 2771 patients without SLE, our cohort is by far the largest that has been published to date. In a recent analysis of the Australian Lupus Registry and Biobank that included 394 patients with lupus and 123 patients with non-SLE ARD, sensitivity and specificity of the 2019 criteria were 94.9% and 87.8%, respectively,13 which is roughly the reverse of the pattern in our cohort (87.0% and 96.5%, respectively). The numbers were similar in a subset of patients with early disease (≤15 months).13 In a Spanish study of 79 patients with long-standing disease (mean duration, 15 years), sensitivity was 86.1%, similar to the sensitivity in our cohort.
When the diagnostic performances of the individual items were analysed, low C3 and/or C4 complement levels, lupus nephritis and delirium and psychosis had very high specificities (100% or 99.9%). This contrasts with the items’ performance of the combined original derivation and validation cohorts, especially with regard to complement levels that were only highly specific when both C3 and C4 were reduced (specificity, 96%), but not when only one level was reduced (specificity, 83%).14 We cannot tell with certainty why only two patients without SLE had either low C3 or low C4, resulting in this striking difference to the original derivation and validation cohorts, where more patients without SLE exhibited some kind of complement depletion.
When the attribution rule was not applied, joint involvement had very low specificity (82.9%); when properly attributed, specificity rose to 98.1%. This underscores the importance of correct attribution of signs and symptoms to SLE; signs and symptoms must not be taken into account if there is an alternative explanation that is more plausible than lupus.14
As already pointed out, classification criteria are not intended to be used to make a diagnosis. Notwithstanding, if the 2019 criteria were indeed used to diagnose patients with SLE at our centre, 13% of patients would not be diagnosed. With the 1997 ACR criteria, the sensitivity of 81.8% means that almost one in five patients who had already been diagnosed with and treated for SLE would not be considered to have SLE, had the classification criteria been applied as diagnostic criteria. This highlights the importance of clinical judgement in the management of SLE.
Still, clinical judgement may not always be accurate, especially in cases of early SLE that may be mistaken for other conditions, including ‘undifferentiated’ connective tissue disease. This may result in specificities that are lower than expected, especially against the background of the published literature.
The 2019 classification introduced a history of ANA detection as an entry criterion for the classification as SLE. This increases specificity from 97.2% to 98.1%, although at the cost of a loss of sensitivity, which decreases from 89.9% to 87% when the entry criterion is applied. As remarked on by the authors of the original publication,7 maximum specificity is desired in order for the criteria to serve their purpose, which is to identify a homogeneous patient population for the inclusion in clinical trials. The decreased sensitivity must be kept in mind when using the criteria as a support tool for clinical decision-making. The classification criteria are not diagnostic criteria, and to rely solely on these criteria will leave some patients with SLE undiagnosed and, consequently, untreated.
To our knowledge, this is the largest study of the performance of the 2019 classification in an unselected cohort of patients from an academic tertiary care centre, which regularly participates in clinical trials of patients with lupus. As such, it provides important information about the actual implementation of this new classification system, which is highly relevant for the selection of a homogeneous group of patients for these trials. We deliberately chose to also include patients from the hepatology clinic at our centre in order to better understand the performance of the classification. Autoimmune hepatitis and primary biliary cirrhosis are two autoinflammatory diseases whose signs and symptoms might also be mistaken for lupus if these patients are referred to a rheumatologist by a primary care provider. The subjects with hepatocellular carcinoma are included in our continuous scientific data collection, and we did not exclude them from the data set for the current study in order to be able to better judge how the lupus classification discriminates SLE from clearly non-inflammatory disease.
Our study has the limitation that it was a single-centre study. At the same time, this may be regarded as a strength of this analysis: The data presented herein stem from the application of SLE classification criteria in everyday practice, and as such provide a good estimate of the performance of the criteria when actually recruiting study subjects.
Taken together, the current, large study confirms the high specificity of the 2019 EULAR/ACR criteria, while at the same time demonstrating good overall agreement with the 1997 ACR criteria.
Ethics statements
Patient consent for publication
Ethics approval
This study involves human participants and was approved by the Independent Ethics Committee of the state of Rhineland-Palatinate, nos. 837.467.13 (9152-F) and 2019-14695. Participants gave informed consent to participate in the study before taking part.
Footnotes
Twitter @bovender_de
IS and DK contributed equally.
SCB-L and JW-M contributed equally.
Contributors IS analysed the data. DK wrote and revised the manuscript. AW programmed the clinical database. KP collected data and assisted with the analysis. PC collected data and assisted with the analysis. EMS collected data and assisted with the analysis. JW-M conceived the project, collected and analysed the data and revised the manuscript. JW-M is responsible for the overall content as guarantor.
Funding This work was supported by the Deutsche Forschungsgemeinschaft (DFG), grant no. WE5779/2-3, to JW-M. EMS is supported by the Clinician Scientist Fellowship ‘Else Kröner Research College: 2018_Kolleg.05’. SCB-L is supported by the Clinician Scientist Fellowship ‘TransMed Jumpstart Program: 2019_A72’ which is funded by the Else Kröner Fresenius Foundation.
Competing interests JW-M received honoraria from GlaxoSmithKline, AstraZeneca, Otsuka and Novartis unrelated to this work. DK received honoraria from AstraZeneca and Novartis unrelated to this work. The other authors declare nothing to disclose.
Provenance and peer review Not commissioned; externally peer reviewed.