Objective To develop and validate the evidence-based and consensus-based Behçet’s Syndrome Overall Damage Index (BODI).
Methods Starting from 120 literature-retrieved preliminary items, the BODI underwent multiple Delphi rounds with an international multidisciplinary panel consisting of rheumatologists, internists, ophthalmologists, neurologists, and patient delegates until consensus was reached on the final content. The BODI was validated in a cross-sectional multicentre cohort of 228 patients with Behçet’s syndrome (BS) through the study of (a) correlation between BODI and Vasculitis Damage Index (VDI), (b) correlation between BODI and disease activity measures (ie, Behçet’s Disease Current Activity Form (BDCAF), Physician Global Assessment (PGA), Patient Global Assessment (PtGA)), (c) content and face validity and (d) feasibility.
Results The final BODI consists of 4 overarching principles and 46 unweighted items grouped into 9 organ domains. It showed good to excellent reliability, with a mean Cohen’s k of 0.84 (95% CI 0.78 to 0.90) and a mean intra-class correlation coefficient of 0.88 (95% CI 0.80 to 0.95). Overall, 128 (56.1%) patients had a BODI score ≥1, with a median score of 1.0 (range 0–14). The BODI significantly correlated with the VDI (r=0.693, p<0.001), demonstrating that it effectively measures damage (construct validity), but had greater sensitivity in identifying major organ damage and did not correlate with disease activity measures (ie, BDCAF: p=0.807, PGA: p=0.820, PtGA: p=0.794), thereby discriminating damage from disease activity, the major confounding factor. The instrument was deemed credible (face validity), complete (content validity) and feasible by an independent group of clinicians.
Conclusions Pending further validation, the BODI may be used to assess organ damage in patients with BS in the context of observational and controlled trials.
- Behcet’s disease
- Outcomes research
- Systemic vasculitis
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Behçet’s syndrome (BS) is a multisystem recurring inflammatory disorder characterised by the deregulation of both the innate and adaptive immune responses, and it is classified among the types of vasculitis.1 Distinctively, BS has a strong genetic background and is concentrated in the region spanning from the Mediterranean basin to the Far East between latitudes 30° and 45° North.2 3 The clinical manifestations of BS have an unpredictable relapsing-remitting course and may vary across a wide spectrum, from limited mucocutaneous lesions up to severe and even life-threatening events such as blindness, large vessel involvement, parenchymal central nervous system inflammation and gut perforation.4 The treatment of BS aims to promptly suppress inflammation and prevent recurrences to avoid irreversible damage.5 Nevertheless, both disease activity and treatment exposure may lead to irreversible anatomic and functional organ damage, which correlates with increased morbidity and mortality.6
Disease activity refers to reversible inflammatory manifestations that might resolve spontaneously or if adequately treated, whereas damage refers to irreversible manifestations that, once they have occurred, will not respond to treatment escalation.7 8 According to the Outcome Measures in Rheumatology (OMERACT) working group, both measures of disease activity and damage should be included, as separate and complementary entities, into the core outcome set for BS that should be adopted in clinical trials and daily practice.9 10 Although various indexes for the evaluation of disease activity in BS are available or under development, no specific overall damage measure is currently available or in conception.11 The most commonly used damage assessment tool for systemic vasculitides is the Vasculitis Damage Index (VDI).12 The VDI has been mainly validated in small vessel vasculitides8 but not in BS, where it has significant limitations in terms of face and content validity as well as in the ability to discriminate damage from disease activity.10–13
The aim of the present study was to develop and initially validate the Behçet’s Syndrome Overall Damage Index (BODI), the first evidence-based and consensus-based instrument for assessing, describing and measuring organ damage in patients with BS.
The BODI project (ClinicalTrials.gov Identifier: NCT03803462) was initiated in September 2016 as a result of the spontaneous cooperation of an international group of experts in the management of BS (table 1). The project included the development of an outcome measure to score irreversible organ damage in BS, followed by the preliminary validation of the newly developed instrument in a cross-sectional multicentre cohort of patients affected with BS.
As a first step, an extensive review of the literature and a comprehensive analysis of the existing damage assessment tools were performed to (a) develop a draft definition of the overarching principles for scoring damage and (b) provide a list of preliminary items pertaining to damage and their definitions. The literature review was performed by the bibliographic fellows in MEDLINE via PubMed, covering January 1970 to December 2016, using a combination of relevant index terms including ‘Behcet’ and the different types of organ involvement and manifestations. Additional papers were obtained by checking the references of the selected studies as well as from other sources known to the authors. Among the existing tools for scoring damage, the VDI,12 the Autoinflammatory Disease Damage Index (ADDI)14 and the Systemic Lupus International Cooperating Clinics/American College of Rheumatology Damage Index (SDI)15 were evaluated.
The preliminary items were selected, rejected and merged or split to define a list of candidate items pertaining to damage on the basis of the following predetermined inclusion criteria: (a) attributability, (b) irreversibility, (c) distinguishability from disease activity, (d) epidemiological relevance and (e) potential effect on long-term outcomes. A detailed glossary reporting the definition for each candidate item was created.
Multiple rounds of a web-based Delphi method were performed to review, implement and refine the drafted overarching principles and candidate damage items until consensus was reached on the final BODI content and the glossary definitions. An expert panel (EP) consisting of a multidisciplinary investigator group (rheumatologists, internal medicine specialists, immunologists, neurologists and ophthalmologists) from 10 Southern European centres and patient delegates was involved in the Delphi exercise. Patient representatives worked with the EP, providing input to the process of damage index development and to refine the preliminary items of damage. In particular, they were asked to provide their perspective on those items with highly perceived impact on the quality of life (QoL). However, it was difficult to involve patients in the areas of outcome measure validation due to the technical methods required for data analysis.
Each participant in the Delphi was asked to rate on a 5-point Likert scale (5=strongly agree; 4=agree; 3=unsure; 2=disagree; 1=strongly disagree) her/his agreement with the proposed overarching principles of damage, the inclusion of the individual items in the index and the suitability of their definitions. The respective bibliographic references were made available to the EP using the open-source software Zotero (www.zotero.org). Experts were also asked to suggest potential new items and their definitions. Items that were scored ≥4 on the Likert scale by ≥80% of the EP were included in the index; those that were scored ≥4 by 50–79% of the EP were conditionally included in the subsequent Delphi round after being revised according to the provided comments and suggestions; those that were scored ≥4 by <50% of the EP members were rejected. Final agreement was reached when all items and their respective definitions reached ≥80% consensus from the EP.
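The inclusion rule described above can be illustrated with a minimal Python sketch. The function name, the example ratings and the handling of agreement shares between 79% and 80% (treated here as conditional inclusion) are assumptions made for illustration only; they are not taken from the BODI project materials.

```python
def classify_item(ratings, agree_threshold=4):
    """Classify one candidate item from its 5-point Likert ratings.

    Returns "include" when >=80% of panellists score the item >=4,
    "revise" when at least 50% but fewer than 80% do (conditional
    inclusion in the next Delphi round), and "reject" otherwise.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    n = len(ratings)
    agree = sum(1 for r in ratings if r >= agree_threshold)
    # Integer comparisons avoid floating-point edge cases at exactly 80% or 50%.
    if 5 * agree >= 4 * n:
        return "include"
    if 2 * agree >= n:
        return "revise"
    return "reject"


# Hypothetical ratings from a 10-member panel (not study data).
print(classify_item([5, 5, 4, 4, 4, 4, 4, 4, 3, 2]))  # 8 of 10 agree
print(classify_item([5, 4, 4, 4, 4, 3, 3, 2, 2, 1]))  # 5 of 10 agree
print(classify_item([5, 4, 3, 3, 2, 2, 1, 1, 1, 1]))  # 2 of 10 agree
```

Under this sketch, the Delphi exercise would stop once every remaining item is classified as "include".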
A multistep validation of the newly developed instrument was performed based on the OMERACT Filter 2.0 principles.16
A reliability exercise was designed by developing a set of clinical vignettes, referenced from real clinical cases, in which both ‘true’ and ‘confounding/false’ damage items were included. ‘True’ items met the BODI criteria for damage and thus had to be scored. ‘Confounding/false’ items did not meet the BODI criteria and thus should not be scored. After a training process consisting of a user manual and a video tutorial (https://vimeo.com/264992929), a group of investigators not involved in the development of the instrument was asked to score the BODI for each clinical vignette. The inter-rater agreement was analysed by evaluating Cohen’s kappa (K) and the intraclass correlation coefficient (ICC) to assess the agreement between observers in categorising (present/absent) the individual items and calculating the final score, respectively. The categorisation of ‘true’ and ‘false’ items as well as the BODI score obtained by the facilitating group represented the gold standard. Both K and ICC were calculated for all possible pairs of assessors and between each assessor and the gold standard. A good to excellent level of reliability (K≥0.7) was required for clinicians to participate in the validation process.
Afterwards, a multicentre cohort was established by asking one clinician from each centre to enrol 30 consecutive patients with BS to test the BODI content, construct, discrimination and criterion validity. The inclusion criteria were (a) diagnosis of BS fulfilling the International Study Group (ISSG)17 criteria or the International Criteria for Behçet's Disease (ICBD)18, (b) disease duration ≥12 months, (c) age at enrolment ≥18 years and (d) informed consent given.
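As an illustration of the agreement statistic used for the present/absent categorisation, the following Python sketch computes Cohen’s kappa for two raters. It is a textbook formulation rather than the study’s own analysis code (the analyses were run in SPSS), and the ratings shown are invented for the example.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each rater's
    marginal category frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty item list")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected >= 1.0:
        # Both raters used a single identical category: kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical present (1) / absent (0) scores for 10 vignette items.
assessor = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
gold_standard = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
kappa = cohen_kappa(assessor, gold_standard)
print(round(kappa, 2), "meets K>=0.7 requirement:", kappa >= 0.7)
```

In this invented example the two raters agree on 8 of 10 items, chance agreement is 0.5 and kappa is 0.6, which would fall below the K≥0.7 level required for participation in the validation process.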
For every patient, demographics, clinical manifestations and previous and ongoing medications were recorded at baseline. Damage was assessed by the BODI and the VDI. To test the effective comprehensiveness of the new tool (content validity), recruiting physicians were asked to report potential damage items detected in the validation cohort but not included in the BODI. To assess the construct validity (if the instrument truly measures what it claims to measure), the BODI score was correlated with the VDI. To assess the discrimination validity (if the instrument discriminates what it claims to measure from confounding factors, ie, damage from disease activity), the BODI score was correlated with disease activity measures, such as the Behçet’s Disease Current Activity Form (BDCAF),19 the Physician Global Assessment (PGA) on a 0–10 cm visual analogue scale (VAS) and the VAS—Patient Global Assessment (PtGA).20 Construct validity was further assessed by analysing factors associated with the BODI score. Health-related QoL (HR-QoL) was evaluated using the physical component summary (PCS) and the mental component summary (MCS) of the Short Form 36-V2 Health Survey (SF-36V2).21 The ability of the BODI to record damage accrual over time was examined by analysing the change in the BODI score over 5 years in patients with longer follow-up.
Finally, an online survey was submitted to a panel of clinicians to record their overall judgement on the credibility (face validity), completeness (content validity) and feasibility of the instrument.
Categorical variables are expressed as absolute numbers and frequencies (%). Normally and non-normally distributed continuous variables are reported as the mean±SD and median and IQR, respectively. Spearman’s coefficient (ρ) was used to test the correlation between the BODI score and other quantitative measures of damage (ie, the VDI) and disease activity (ie, BDCAF, PGA, PtGA). In case of missing values, participants were contacted and asked to fill them out. Given the exploratory nature of our analysis, no study size calculation was performed a priori, but a sample size of at least 200 patients was considered sufficient to satisfactorily complete the validation exercise. To minimise the potential source of bias, the truth (face validity, content validity, construct validity and criterion validity), discrimination (reliability and sensitivity to change) and feasibility of the newly developed tool were tested using multiple approaches.
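For readers unfamiliar with the rank correlation used here, the following Python sketch shows one standard way to compute Spearman’s ρ: values are converted to ranks (ties receive the average rank) and Pearson’s correlation is then computed on the ranks. This is an illustrative formulation, not the study’s SPSS procedure, and the function names are our own.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical paired damage scores for six patients (not study data).
print(spearman_rho([0, 1, 1, 2, 4, 7], [0, 0, 1, 2, 3, 5]))
```

Because damage scores are small integers with many ties, the average-rank treatment matters; the shortcut formula based on squared rank differences is exact only when there are no ties.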
Univariate analysis was performed to assess associations with damage measured by the BODI. Forward–backward multiple regression models were fitted with covariates with p<0.05 to identify factors independently associated with the BODI score.
Statistical analyses were performed using SPSS software (SPSS for Macintosh, version 24, SPSS, Chicago, Illinois, USA). All statistical tests were two-sided. P values less than 0.05 were considered significant.
A list of 120 candidate items of damage with their respective definitions was generated (online supplementary table 1). Of these, 58 were original/new items not included in the VDI, ADDI or SDI.
Two online Delphi rounds were performed, and both were completed by 100% of the EP members. Overall, 104 comments, criticisms and suggestions regarding overarching principles, candidate items and their definitions were recorded. Four potential new candidate items were proposed by the EP and two were proposed by the patient delegates during the first Delphi round. At the end of the second Delphi round, 46 items were accepted with a high agreement level.
The BODI consists of 4 overarching principles and 34 items with 12 subitems, categorised into 9 organ/system domains: mucocutaneous, musculoskeletal, ocular, vascular, cardiovascular, neuropsychiatric, gastrointestinal, reproductive system and miscellaneous. Each item and subitem scores 1 point. The total score ranges from 0 to 46. The overarching principles for scoring damage are reported in table 2. The final version of the BODI is shown as a printable form in figure 1, and the glossary with the definitions of the individual items is retrievable online (online supplementary table 2).
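Because the items are unweighted, the total score is simply a count of the damage entries recorded as present. The Python sketch below illustrates this arithmetic under the assumption that each of the 46 scorable entries (34 items plus 12 subitems) is represented by one present/absent flag; the flat list layout and the names used are hypothetical and do not reproduce the printable form or the glossary.

```python
BODI_ENTRY_COUNT = 46  # 34 items + 12 subitems, each worth 1 point


def bodi_score(present_flags):
    """Sum the unweighted BODI entries marked as present.

    present_flags is a sequence of 46 booleans, one per scorable entry
    (hypothetical layout, for illustration only).
    """
    if len(present_flags) != BODI_ENTRY_COUNT:
        raise ValueError(f"expected {BODI_ENTRY_COUNT} entries")
    return sum(1 for flag in present_flags if flag)


# Hypothetical patient with three damage entries present.
flags = [False] * BODI_ENTRY_COUNT
for index in (0, 12, 30):  # arbitrary positions, not real item numbers
    flags[index] = True
print(bodi_score(flags))  # 3
```

The score is bounded by 0 (no entry present) and 46 (all entries present), matching the range stated above.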
There were 8 complete reliability exercises comprising a data set of 440 unique paired responses. The mean (95% CI) Cohen’s K calculated for all possible pairs of observers and between each observer and the gold standard were 0.73 (0.70–0.76) and 0.84 (0.78–0.90), respectively. Similarly, the mean ICC calculated between observers was 0.78 (0.57–0.94), and the mean ICC between each observer and the gold standard was 0.88 (0.80–0.95).
The multicentre validation cohort consisted of 228 patients with BS: 227 patients scored 4 points or more on the ICBD criteria, 184 patients fulfilled the ISSG criteria and 183 fulfilled both sets of criteria. The BODI validation cohort characteristics are reported in table 3. Overall, 128 (56.1%) patients had at least one damage item (BODI ≥1). The mean and median BODI scores were 1.6 (±2.2) and 1.0 (0–2.0), respectively, with scores ranging from 0 to 13. See online supplementary table 3 for the baseline features of the BODI validation cohort separately for the patients without damage.
No further damage items beyond those classifiable by the BODI were reported from the recruiting physicians. The BODI construct validity was supported by a highly significant correlation with the VDI (r=0.740, p<0.001) measured in the multicentre cohort. The new instrument was able to better capture damage in the major organ domains than the VDI (figure 2). Vascular and neurological damages were numerically more frequent and scored statistically significantly higher values when assessed by the BODI than when assessed by the VDI. Ocular and gastrointestinal damage scores were numerically higher using the BODI, even if the relatively low number of events observed in our cohort prevented statistical significance from being reached. The burden of mucocutaneous damage was lower when measured with the BODI (20.6%) than when measured with the VDI (46.9%).
The ability of the BODI to discriminate damage from disease activity was supported by the lack of a significant correlation between the BODI and the BDCAF (r=−0.016, p=0.807), PGA (r=0.002, p=0.971) or PtGA (r=−0.030, p=0.658). In contrast, a significant correlation was found between the VDI and the BDCAF (r=0.141, p=0.034) but not the PGA (r=0.015, p=0.820) or the PtGA (r=−0.017, p=0.794).
In the multivariate analysis, male sex (β=0.137, p=0.017), disease duration (β=0.232, p<0.001), treatment with tumour necrosis factor inhibitors (TNFi) (β=0.215, p<0.001) and previous major organ involvement (vascular and/or neurologic and/or gastrointestinal) (β=0.389, p<0.001) were independently associated with a higher BODI score (table 4) and the occurrence of any BODI damage (BODI ≥1) (online supplementary table 4). To rule out any possible bias related to patient selection, the recruiting centre was included in the multivariate models as a covariate, which showed that the centre had no influence on the damage measured by the BODI score.
No significant association was recorded between the BODI and HR-QoL when assessed by the physical component summary (PCS) (r=−0.030, p=0.660) and the MCS (r=0.074, p=0.272) of the Short Form 36-V2 Health Survey (SF-36V2).
In terms of sensitivity to change, when data from 144 patients with a longer disease duration were analysed, the mean increase in the BODI score over 5 years was 0.31 (±0.74) (p<0.001), confirming that the BODI is able to measure clinically detectable changes in organ damage burden over time. According to the BODI overarching principles, no patients showed decreased damage scores over time.
Overall, 17 clinicians completed the feasibility questionnaire. All of them judged the BODI to be credible, covering the full spectrum of potential damage in patients with BS, further supporting the face and content validity of the instrument. In terms of feasibility, 94.1% of those interviewed stated that the instrument was understandable, easy and acceptable for use in daily practice and clinical trials (Supplementary File 5). Furthermore, the damage assessment by the BODI was judged to be acceptable in terms of the time needed to complete it by 94.1% of responders, who declared that the mean (range) completion time was 8.7 (2–20) min.
The present study focused on the development and preliminary validation of an index specifically designed to identify, describe and measure overall damage in patients with BS. The BODI consists of a package that includes guidelines for scoring, a list of damage items and the glossary. It will allow the evaluation of overall damage as a major outcome in clinical settings and the systematic measurement of damage in clinical studies. The BODI was developed to assess damage as a measure of treatment failure and as a predictor of long-term outcomes, especially mortality. Therefore, it may represent the answer to the unmet need of specific outcome measures for damage suggested by the EULAR task force5 and the OMERACT working group on BS.9
The development of the BODI was grounded in solid evidence-based and consensus-based methodology. An extensive literature review and an accurate analysis of existing tools were integrated with a multidisciplinary Delphi method to ensure a wide coverage of all potential aspects of damage in BS and both the validity and acceptability of the consensus statements. The inclusion of explicit input from patients with BS was essential to further ensure the face and content validity of the instrument.22 The patients’ involvement was particularly important for the definition of those items of damage with a potential impact on perception of QoL, such as damage to the musculoskeletal and reproductive systems.
Whether attribution to BS should be a mandatory criterion for scoring damage was one of the most debated issues. However, damage in BS is frequently multifactorial, and discriminating among disease-related abnormalities, drug toxicities and comorbidities is often not possible. Therefore, to minimise the risk of interobserver variability and to preserve the reliability of the instrument, the EP agreed that damage must be scored if it develops after BS onset, regardless of its attribution. Further, because of its impact on face and construct validity, the distinguishability of damage from disease activity was another major subject of debate on several candidate BODI items. Indeed, some potentially reversible lesions may take a long time before healing or evolving towards anatomical and/or functional sequelae. Such lesions (eg, skin ulcers, deep venous, arterial and intracardiac thrombosis) may persist for a long time after disease activity is suppressed and may negatively affect other long-term outcomes that BODI aims to predict, such as disability or mortality. Therefore, according to the principles of most existing damage assessment tools, the EP agreed to define damage over time, that is, if a disease manifestation does not respond to appropriate treatment after a suitable amount of time, then it should be scored in the BODI. Furthermore, some irreversible lesions (eg, ischemic heart disease, gastrointestinal perforation) are more attributable to disease activity as soon as they occur; however, because of their consequence in terms of non-healing scars, they also have the embedded property and high probability to capture related damage. Actually, one of the main concepts in scoring damage is to recognise aspects of the disease that will not respond to medications, but that should be recorded for their impact on the morbidity burden.
To enhance the instrument comprehensiveness and reliability, the authors agreed that such lesions should be scored as damage after a consensus-defined time gap of 6 months. Indeed, for these difficult issues debated in the Delphi process, the BODI experts reached the same conclusions and operational solutions adopted in the main existing damage assessment tools.12 14 15 23
The Delphi method concluded in only two rounds and resulted in an extremely high level of agreement regarding the accepted items and their definitions, probably supported by the initial evidence-based selection of candidate damage items.
The subsequent validation process demonstrated the BODI reliability, validity and sensitivity to change. Indeed, after a specific training, a good to excellent inter-rater agreement was recorded. Moreover, an independent group of clinicians judged the BODI to be credible, comprehensive, understandable and feasible. The BODI content validity is supported by the evidence-based and consensus-based methodology adopted for its development, including the lack of potential damage items reported in the validation cohort but not covered by the instrument. Supporting the BODI construct validity is the finding of a significant correlation with the VDI, the most widely used damage assessment tool for vasculitides. Although some of the items in the BODI are shared by the VDI, the former was shown to be more comprehensive and specific for patients with BS by capturing a higher number of BS-related damage items, such as neurological and vascular lesions, and preventing the scoring of active manifestations, such as oral ulcers. Notably, the BODI discrimination validity was supported by the lack of correlations between the BODI and the BDCAF, PGA or PtGA. Further indirect proofs of the BODI construct validity are provided by its independent association with demographic (ie, male sex, longer disease duration) and clinical factors (major organ involvement, use of TNFi) that were expected to influence damage accrual. The relationship between damage and HR-QoL in BS should be analysed in future prospective studies because the failure to demonstrate a cross-sectional correlation between them may be due to a mechanism of adaptation, as demonstrated in other chronic autoimmune diseases.24 25
Some limitations of this study need to be acknowledged. The panel involved in building the consensus might be considered too restricted. However, because the BODI drafting was based on a rigorous evidence-based methodology, we believed that a relatively small number of experts was appropriate. On the other hand, the multidisciplinary nature of the EP and the active involvement of patients markedly contributed to making the instrument comprehensive, specific and easy to use. Additionally, the small number of patients of non-Southern European descent limits the generalisability of the validation of the BODI to populations from other endemic areas. However, we expect that the ability of the BODI to capture damage in the different major organ domains would guarantee covering the amount of damage potentially related to the known geographical variation in disease expression.26 Nevertheless, a prospective validation with a wider and ethnically heterogeneous cohort is planned to generally evaluate the impact of those BODI items seldom recorded or absent (ie, ‘severe aortic regurgitation’, ‘constrictive pericarditis’, ‘major tissue loss’, ‘amyloidosis’) in the present cohort and to weight each damage item depending on its relevance for predicting mortality.
In conclusion, we developed the first outcome measure for BS overall damage assessment that can be used in daily practice and clinical studies. The preliminary validation of the BODI showed highly promising performance in terms of comprehensiveness, specificity, reliability, sensitivity to change and feasibility.
What is already known about this subject?
In patients affected with Behçet’s syndrome, irreversible anatomic and functional organ damage may arise from acute inflammatory attack or drug toxicity.
EULAR recommendations for the management of Behçet’s syndrome highlighted the prevention of irreversible organ damage as a major goal of treatment.
What does this study add?
The present study provides the first tool specifically designed to identify, describe and measure organ damage in patients with Behçet’s syndrome.
How might this impact on clinical practice?
The BODI will allow the analysis of overall damage as a major outcome in clinical settings and the systematic measurement of damage in clinical studies.
A prospective validation with a wider and ethnically heterogeneous cohort is planned.
The authors would like to thank all the patients participating in this research.
Twitter Gerard Espinosa @gerardespinosa5.
Collaborators The BODI Project group, Nestor Avgoustidis, Elisabetta Chessa, Giovanni Ciancio, Mattia Congia, Raquel Faria, Roberto Rios Garcés, Giulio Guerrini, Gema Lledó Ibáñez, Piero Mascia, Ignasi Rodriguez Pinto, Vincenzo Venerito, Antonio Vitale.
Contributors Study conception: MP, AM, AF. Substantial contributions to study design: MP, AM, AF, GB, LC, AC, RC, MG, FI, PN, AMS, CV. Substantial contributions to the acquisition, analysis and interpretation of the data: MP, AF, GE, LSP, NK, ALM, GL, IO, VP, ES, GB, LC, AC, RC, JC, MG, FI, PN, AMS, CV, MM, AM. Drafting the article or revising it critically for important intellectual content: MP, AF, GE, LSP, NK, ALM, GL, IO, VP, ES, GB, LC, AC, RC, JC, MG, FI, PN, AMS, CV, MM, AM. Final approval of the version of the article to be published: MP, AF, GE, LSP, NK, ALM, GL, IO, VP, ES, GB, LC, AC, RC, JC, MG, FI, PN, AMS, CV, MM, AM.
Funding The BODI Project was partly supported by a grant from the Italian Behçet’s syndrome patient association (SIMBA ONLUS).
Competing interests None declared.
Patient consent for publication Not required.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement Data are available upon reasonable request. All data relevant to the study are included in the article or uploaded as supplementary information.