Computable phenotype for real-world, data-driven retrospective identification of relapse in ANCA-associated vasculitis

Objective ANCA-associated vasculitis (AAV) is a relapsing-remitting disease, resulting in incremental tissue injury. The gold-standard relapse definition (Birmingham Vasculitis Activity Score, BVAS>0) is often missing or inaccurate in registry settings, leading to errors in ascertainment of this key outcome. We sought to create a computable phenotype (CP) to automate retrospective identification of relapse using real-world data in the research setting.
Methods We studied 536 patients with AAV and >6 months follow-up recruited to the Rare Kidney Disease registry (a national longitudinal, multicentre cohort study). We followed five steps: (1) independent encounter adjudication using primary medical records to assign the ground truth, (2) selection of data elements (DEs), (3) CP development using multilevel regression modelling, (4) internal validation and (5) development of additional models to handle missingness. Cut-points were determined by maximising the F1-score. We developed a web application for CP implementation, which outputs an individualised probability of relapse.
Results Development and validation datasets comprised 1209 and 377 encounters, respectively. After classifying encounters with diagnostic histopathology as relapse, we identified five key DEs: DE1, change in ANCA level; DE2, suggestive blood/urine tests; DE3, suggestive imaging; DE4, immunosuppression status; DE5, immunosuppression change. F1-score, sensitivity and specificity were 0.85 (95% CI 0.77 to 0.92), 0.89 (95% CI 0.80 to 0.99) and 0.96 (95% CI 0.93 to 0.99), respectively. Where DE5 was missing, DE2 plus either DE1 or DE3 was required to match the accuracy of BVAS.
Conclusions This CP accurately quantifies the individualised probability of relapse in AAV retrospectively, using objective, readily accessible registry data. This framework could be leveraged for other outcomes and relapsing diseases.
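The cut-point selection described in the Methods (maximising the F1-score) can be illustrated with a minimal sketch. This is not the study's code: the function names, the rule that an encounter is called a relapse when its probability is at or above the cut-point, the tie-breaking rule and the toy data are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when that denominator is zero."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_cut_point(y_true: Sequence[int], probs: Sequence[float]) -> Tuple[float, float]:
    """Return (cut_point, f1) maximising F1 over the observed probabilities.

    Assumption: an encounter is labelled relapse when prob >= cut_point.
    Ties are broken in favour of the lowest cut-point.
    """
    best_cut, best_f1 = 0.0, -1.0
    for cut in sorted(set(probs)):
        preds = [1 if p >= cut else 0 for p in probs]
        score = f1_score(y_true, preds)
        if score > best_f1:
            best_cut, best_f1 = cut, score
    return best_cut, best_f1


# Toy data invented for illustration only (not registry data).
toy_truth = [0, 0, 1, 0, 1, 1]
toy_probs = [0.10, 0.20, 0.35, 0.40, 0.80, 0.90]
cut, f1 = best_cut_point(toy_truth, toy_probs)
print(f"cut-point={cut}, F1={f1:.3f}")
```

On the toy data the search settles on the cut-point that captures all three true relapses at the cost of one false positive, which is the trade-off the F1-score rewards when relapse is the minority class.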


Steps in building the CP for relapse
Step 1 - Independent expert adjudication of encounters to assign the reference probability of relapse (ground truth)
Remission was defined as the absence of symptoms, signs and/or objective evidence of vasculitis activity. Encounters were adjudicated (in advance of this study) by a committee of expert clinicians (at least two of: JS, SM, NC and ML), using each patient's entire medical record. The medical records included clinical notes, medication records and all laboratory, radiological and histopathological data across the patient's entire longitudinal disease course. In keeping with clinical practice, and the observational nature of the study, some investigations were not performed, or the results were missing, in a small number of cases. The committee assigned one of four probability categories: definite, high probability, possible or no relapse. The degree of certainty corresponded to the strength of supporting objective evidence: if clear histopathological evidence of active vasculitis was present, a 'definite relapse' label was applied; suggestive laboratory or radiological evidence led to a 'high probability of relapse' label; and encounters with a convincing clinical scenario but lacking objective evidence were labelled as 'possible relapse'.
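The evidence hierarchy behind the four adjudication categories can be sketched as a simple ordered rule. This is only an illustration of the ordering described above: the committee's adjudication was a clinical judgement over the full record, and the three boolean inputs and all names below are hypothetical simplifications, not fields from the registry.

```python
from enum import IntEnum


class RelapseCategory(IntEnum):
    """Adjudicated probability of relapse, ordered by strength of evidence."""
    NO_RELAPSE = 0
    POSSIBLE = 1
    HIGH_PROBABILITY = 2
    DEFINITE = 3


def adjudicate(histopathology_active: bool,
               suggestive_lab_or_imaging: bool,
               convincing_clinical_scenario: bool) -> RelapseCategory:
    """Apply the strongest available evidence first (hypothetical inputs)."""
    if histopathology_active:
        # Clear histopathological evidence of active vasculitis.
        return RelapseCategory.DEFINITE
    if suggestive_lab_or_imaging:
        # Suggestive laboratory or radiological evidence.
        return RelapseCategory.HIGH_PROBABILITY
    if convincing_clinical_scenario:
        # Convincing clinical picture without objective evidence.
        return RelapseCategory.POSSIBLE
    return RelapseCategory.NO_RELAPSE


print(adjudicate(False, True, True).name)
```

Using an ordered enumeration keeps the categories comparable, so a binary ground truth can later be derived by choosing a category threshold; which threshold the study used is not stated in this section.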
Step 2 - Selection of data elements and corresponding value sets
Models with a small number of predictors are more attractive to implement in the clinical arena (minimising data requirements and maximising understandability). Limiting complexity also reduces the margin for error when applying the model to other datasets. Rather than predicting future relapse, the aim was to select data elements that uniformly characterise patient encounters, thereby objectively aiding retrospective relapse labelling and allowing automation of the expert adjudication process.
Expert domain knowledge was elicited by a semi-formalised Delphi approach to inform the selection of data elements. A methodology paper is currently being prepared for publication by our group. The most informative biomarkers with regard to relapse were: anti-MPO, anti-PR3, creatinine, uCD163, CRP, proteinuria (negative to positive) and haematuria (negative to positive).
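For the two urine markers, the informative signal is a change from negative to positive between encounters. A minimal sketch of such a flag is shown below. The function name, the boolean encoding of dipstick results and the handling of missing values are assumptions; thresholds for the other biomarkers are not given in this section and are deliberately left out.

```python
from typing import Optional


def turned_positive(previous: Optional[bool], current: Optional[bool]) -> Optional[bool]:
    """Flag a negative-to-positive change between two consecutive results.

    True  -> previous result negative and current result positive
    False -> any other observed pair
    None  -> either result is missing (assumed handling, not from the study)
    """
    if previous is None or current is None:
        return None
    return (not previous) and current


# Hypothetical consecutive dipstick results for one patient.
print(turned_positive(False, True))   # new proteinuria or haematuria
print(turned_positive(True, True))    # persistently positive, no change
print(turned_positive(None, True))    # no prior result to compare against
```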

Patient and Public Involvement
Patient involvement was primarily through the national patient group, 'Vasculitis Ireland Awareness' (VIA), focus groups and 'question and answer' sessions at their national annual meeting. Patients were instrumental in prioritising relapse as a key focus for research, which led to this study. AAV relapse is also a recognised research target for the wider vasculitis community. Patients were involved in the design and conduct of this research. Julie Power, a patient representative, joined the study steering committee. We developed a study newsletter in conjunction with VIA (hosted on their website: https://vasculitis-ia.org/) to inform participants and the wider vasculitis community about study updates. We will disseminate study findings through this website, social media channels and at patient sessions at the biannual International Vasculitis and ANCA Workshop.
BMJ Publishing Group Limited (BMJ) disclaims all liability and responsibility arising from any reliance placed on this supplemental material which has been supplied by the author(s).
Supplementary Figure 2: (A) A bar chart demonstrating the percentage completeness of BVAS across six European vasculitis registries in the FAIRVASC consortium: the Italian registry, the Joint Vasculitis Registry in German-speaking countries (Germany/Austria/Switzerland; GeVas), the Polish Vasculitis Registry (PolVas), the Skåne Swedish Registry, the French Vasculitis Study Group (FVSG) and the Czech Vasculitis Registry. (B) The performance metrics of BVAS when compared to the adjudicated probability of relapse in the Rare Kidney Disease (RKD) registry. (C) A stacked bar chart demonstrating the distribution of the recorded BVAS compared to the adjudicated probability of relapse across all encounters with an available BVAS in the RKD registry (31%). The shading of the bars denotes the adjudicated probability of relapse as per the legend.
The model rank and F1-scores correspond to those in Figure 3. Prevalence of relapse was 17% in models 1-31 and 15% in the BVAS analysis. Data element (DE) key: 1=ANCA titre, 2=Suggestive bloods/urine, 3=Suggestive imaging, 4=Immunosuppressive (IS) status, 5=IS response. Abbreviations: 95% CI, 95% confidence interval; BVAS, Birmingham Vasculitis Activity Score; PPV, positive predictive value; NPV, negative predictive value; AUC, area under the receiver operating characteristic curve.

Table 3: The mean (95% CI) of all performance metrics for the 31 model combinations of the 5 data elements, ranked by F1-score. Models 1-22 have a classification accuracy at least as good as the BVAS >0 definition of relapse, denoted by the 95% CI of their F1-score crossing the F1-score point estimate (0.7) for the BVAS >0 definition.
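The count of 31 models follows from taking every non-empty subset of the 5 data elements (2^5 - 1 = 31). The sketch below enumerates those subsets and applies one reading of the comparison with BVAS: a model is treated as at least as good when the upper bound of its F1-score 95% CI reaches the BVAS point estimate of 0.7. That reading, the function names and the second example interval are assumptions for illustration, not the study's code.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

DATA_ELEMENTS = ("DE1", "DE2", "DE3", "DE4", "DE5")
BVAS_F1 = 0.7  # F1-score point estimate for BVAS > 0, as quoted in the caption


def all_model_combinations(elements: Sequence[str] = DATA_ELEMENTS) -> List[Tuple[str, ...]]:
    """Every non-empty subset of the data elements, smallest subsets first."""
    return [combo
            for size in range(1, len(elements) + 1)
            for combo in combinations(elements, size)]


def at_least_as_good_as_bvas(f1_ci_upper: float, bvas_f1: float = BVAS_F1) -> bool:
    """Assumed criterion: the F1 95% CI reaches or exceeds the BVAS estimate."""
    return f1_ci_upper >= bvas_f1


models = all_model_combinations()
print(len(models))

# Reported CI for the full model: 0.77 to 0.92.
print(at_least_as_good_as_bvas(0.92))
# Hypothetical interval 0.40 to 0.60, invented to show a failing case.
print(at_least_as_good_as_bvas(0.60))
```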