Objectives To define the elementary ultrasound (US) lesions in giant cell arteritis (GCA) and to evaluate the reliability of the assessment of US lesions according to these definitions in a web-based reliability exercise.
Methods Potential definitions of normal and abnormal US findings of temporal and extracranial large arteries were retrieved by a systematic literature review. As a subsequent step, a structured Delphi exercise was conducted involving an expert panel of the Outcome Measures in Rheumatology (OMERACT) US Large Vessel Vasculitis Group to agree definitions of normal US appearance and key elementary US lesions of vasculitis of temporal and extracranial large arteries. The reliability of these definitions on normal and abnormal blood vessels was tested on 150 still images and videos in a web-based reliability exercise.
Results Twenty-four experts participated in both Delphi rounds. From 25 original statements, nine definitions were obtained for normal appearance, vasculitis and arteriosclerosis of cranial and extracranial vessels. The ‘halo’ and ‘compression’ signs were the key US lesions in GCA. The reliability of the definitions for normal temporal and axillary arteries, the ‘halo’ sign and the ‘compression’ sign was excellent, with inter-rater agreements of 91–99% and mean kappa values of 0.83–0.98 for both inter-rater and intra-rater reliabilities across all 25 experts.
Conclusions The ‘halo’ and the ‘compression’ signs are regarded as the most important US abnormalities for GCA. The inter-rater and intra-rater agreement of the new OMERACT definitions for US lesions in GCA was excellent.
- giant cell arteritis
- systemic vasculitis
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
What is already known about this subject?
Ultrasound (US) of temporal, axillary and other arteries is increasingly used for confirming a suspected diagnosis of giant cell arteritis (GCA) in clinical practice.
Although several prospective studies comparing US with the clinical diagnosis and/or results of temporal artery biopsy reveal a good diagnostic performance, its diagnostic value has been questioned particularly because of a lack of data on reliability.
What does this study add?
This study includes the first systematic literature review on US definitions of normal and abnormal temporal and extracranial arteries in suspected GCA.
This is the first study that provides definitions of the normal US appearance and key elementary lesions of vasculitis of temporal and extracranial large arteries based on international expert consensus.
Inter-rater and intra-rater reliabilities for reading stored US images and videos of normal and vasculitic temporal and axillary arteries applying the consensus-based definitions are excellent.
How might this impact on clinical practice?
These consensus-based definitions provide clinicians with a clear guideline on how to evaluate US findings in suspected GCA.
They provide a basis for future trials in GCA including US as an inclusion criterion and evaluating US as an outcome parameter.
This study shows that images and videos of US examinations of temporal and axillary arteries in suspected GCA can be stored and reliably re-evaluated by experts.
Giant cell arteritis (GCA) is the most common primary systemic vasculitis, occurring predominantly in Caucasian populations.1 GCA mainly involves large and medium-sized arteries, predominantly branches of the external carotid arteries such as the temporal arteries, and the aorta and its large branches such as the subclavian and axillary arteries. Temporal artery biopsy has been regarded as the gold standard for decades; however, biopsy is invasive, and it lacks sensitivity, particularly in extracranial large vessel (LV)-GCA.2 Imaging techniques including ultrasound (US), MRI and positron emission tomography-CT are increasingly being used in diagnosis of GCA and may in future replace biopsy in many cases.3 4 Notably, US is less invasive, reveals a higher sensitivity, particularly in extracranial disease, and results become available faster.5 Early diagnosis and treatment of patients with GCA are important since patients may develop irreversible ischaemic complications, including vision loss and stroke. The implementation of fast track clinics that involve US as a point-of-care test for patients with suspected GCA has led to a decrease of permanent vision loss.6 7 A recently published multicentre study showed that a diagnostic algorithm including US is cost-effective compared with a conventional strategy focusing on biopsy only.2
GCA is characterised by inflammatory infiltration of the artery wall resulting in the so-called ‘halo’ sign, first described in 1995, which is a hypoechoic (dark) thickening of the vessel wall as visualised by US.8 In contrast to the healthy artery, the inflammatory wall thickening is not compressible upon application of pressure with the US probe. This feature has recently been termed the ‘compression’ sign.9
Several studies have been conducted thus far to investigate the accuracy, construct and criterion validity of US in the diagnosis of GCA, and four meta-analyses of these studies have been published until now.10–13 Despite the growing body of evidence supporting the utility of US in GCA, standardised definitions of the elementary normal and abnormal appearance and their reliability are lacking. Therefore, an Outcome Measures in Rheumatology (OMERACT) Large Vessel Vasculitis (LVV)-US Working Group was formed in order to agree on the US lesions suggestive of GCA as well as to test the reliability of these definitions.
The first aim of this study was to retrieve currently available definitions of US key elementary lesions describing vasculitis in temporal and extracranial large arteries by a systematic literature review (SLR). Second, we intended to produce consensus-based definitions of normal and GCA characteristic appearances of temporal and extracranial large arteries as detected by US, using a Delphi process among international experts. This Delphi process included definitions of the US appearance of (1) normal, (2) arteriosclerotic and (3) vasculitic temporal and axillary arteries and (4) a consensus on which anatomical structures and findings should be considered when performing US in suspected GCA. The third aim was to test the inter-rater and intra-rater reliabilities of the definitions of each elementary US lesion in GCA using a web-based exercise.
The study design followed the stipulated OMERACT methodology in accordance with previous studies of the OMERACT US Working Group for defining disease characteristic lesions and testing reliability of US in other rheumatic diseases.14–16 The OMERACT LVV-US Working Group was formed at the Annual American College of Rheumatology (ACR) meeting in Boston, Massachusetts, USA, in 2014.
SLR to identify previously applied US definitions of LVV
According to the OMERACT standard operating procedures, a SLR was conducted to identify definitions of normal and abnormal US appearance of large arteries applied in previous studies. Details on the key question, search, data synthesis and quality assessment are provided in the online supplementary material. In brief, two authors (CDu and CDe) searched the PubMed, EMBASE and the Cochrane Library databases using Medical Subject Headings (MeSH) terms, full text and truncated words (see online supplementary material for full search strategy) from the inception dates (1946, 1974 and 1993, respectively) to 23 November 2014. The following inclusion criteria were applied: (1) number of patients enrolled ≥20 patients and (2a) full research articles of prospective or retrospective studies on diagnostic accuracy of US in suspected LVV (ie, cranial and extracranial LV-GCA, Takayasu arteritis (TAK) and idiopathic aortitis as these exhibit similar US pathologies) using an appropriate reference standard (ie, clinical diagnosis, published criteria and/or positive temporal artery biopsy) or (2b) cross-sectional studies assessing LVV by US in patients with established GCA, polymyalgia rheumatica (PMR) or TAK. Data were extracted using a predefined template. The Quality Assessment of Diagnostic Accuracy Studies-2 and Quality In Prognosis Studies tools were used to assess quality of diagnostic accuracy and prognostic studies, respectively17 18 (see online supplementary tables S4 and S6).
Delphi consensus on definitions of LVV elementary US appearances
The group decided to focus the Delphi exercise on US key lesions for GCA only, because of the paucity of US data in TAK and idiopathic aortitis.
Based on the results from the SLR, the steering committee (CDu, CDe and WAS) developed a Word™-based written questionnaire that included 25 statements. Of these 25 statements, 3 addressed the definitions of the appearances of normal and arteriosclerotic temporal and extracranial large arteries; 15 statements addressed 5 definitions of the ‘halo’ sign, stenosis (temporal and extracranial large arteries), occlusion, ‘compression’ sign (temporal arteries) and vessel wall pulsation (temporal arteries); and 7 statements addressed the requirements for diagnosis of vasculitis by US.
Twenty-five physicians experienced in US and/or LVV were invited by email to participate. They were from 14 countries (Austria, Czech Republic, Denmark, France, Germany, Italy, Norway, Poland, Portugal, Slovenia, Spain, The Netherlands, UK and USA). The group consisted of 22 rheumatologists, 1 internist and 2 physicians in the last year of rheumatology training. Nine, six, four, two and four participants had performed >300, 101–300, 51–100, 21–50 and <20 diagnostic GCA US examinations, respectively. Sixteen were currently offering a diagnostic GCA US clinic. The participants were asked to rate their level of agreement or disagreement with each statement on a 1–5 Likert scale with 1=strongly disagree, 2=disagree, 3=neither agree nor disagree, 4=agree and 5=strongly agree. A Likert score of 4 or 5 was considered as agreement. Consensus on a statement was considered achieved only when agreement exceeded 75%. Statements satisfying these requirements were used for the definition of the most important US elementary appearances for the diagnosis of vasculitis. Statements that achieved agreement in the first Delphi round but attracted suggestions for improved wording were rephrased according to the experts’ comments and reappraised in the second round. Statements with <75% agreement in the first round were not taken forward to the second round.
The questionnaire also included a rating of the importance of the different US elementary appearances for the diagnosis of cranial and extracranial LVV using a Likert scale as mentioned above. Up to two reminders were sent out to the experts if they had not responded within the given time limit. The answers of the first Delphi round were summarised with the percentage of agreement to each statement. For the second Delphi round, all comments of the panelists were anonymised and re-sent together with a questionnaire revised by the steering committee to those experts who had responded in the first round. At a face-to-face meeting of the expert panel (‘round 3’), held at the 2015 San Francisco ACR Meeting, the wording of one definition was slightly revised.
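The agreement rule described above can be illustrated with a minimal Python sketch. It is not the tooling used by the steering committee: the function names and the example ratings are hypothetical, and the sketch assumes that a statement with exactly 75% agreement does not reach consensus, since the text requires >75%.

```python
from typing import Dict, List

AGREEMENT_SCORES = {4, 5}    # Likert 4 = agree, 5 = strongly agree
CONSENSUS_THRESHOLD = 75.0   # per cent; assumption: consensus needs strictly more


def percent_agreement(ratings: List[int]) -> float:
    """Return the percentage of panellists who rated the statement 4 or 5."""
    if not ratings:
        raise ValueError("no ratings supplied")
    if any(r not in {1, 2, 3, 4, 5} for r in ratings):
        raise ValueError("ratings must be on the 1-5 Likert scale")
    agreeing = sum(1 for r in ratings if r in AGREEMENT_SCORES)
    return 100.0 * agreeing / len(ratings)


def reaches_consensus(ratings: List[int]) -> bool:
    """Apply the >75% agreement rule to one statement."""
    return percent_agreement(ratings) > CONSENSUS_THRESHOLD


# Hypothetical ratings from a 24-member panel (illustrative, not study data).
example_ratings: Dict[str, List[int]] = {
    "statement_A": [5] * 14 + [4] * 6 + [3] * 3 + [2] * 1,  # 20 of 24 agree
    "statement_B": [5] * 8 + [4] * 10 + [3] * 4 + [2] * 2,  # 18 of 24 agree
}

for name, ratings in example_ratings.items():
    print(name, round(percent_agreement(ratings), 1), reaches_consensus(ratings))
```

In this illustration, the hypothetical statement_A passes with 20 of 24 panellists agreeing, whereas statement_B sits exactly at 75% and therefore fails under the strict reading of the threshold.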
Inter-rater and intra-rater web-based reliability exercise
All members of the OMERACT LVV-US Working Group were asked to submit 16 representative still images and 20 representative videos (figures 1-3): eight still images and eight videos represented normal anatomical segments (common temporal artery, frontal branch, parietal branch and axillary arteries) in longitudinal and transverse planes; and the eight other still images and eight videos represented the same segments exhibiting the ‘halo’ sign. Four additional videos showed a positive and a negative ‘compression’ sign of the temporal artery branches in longitudinal and transverse views, respectively. All pathological images and videos originated from patients with active disease who met the expanded ACR classification criteria of GCA, and in whom diagnosis was confirmed either by temporal artery biopsy or on a clinical basis, including US and follow-up.19 The images and videos were collected by a facilitator of the group (SC) who constructed an electronic database using REDCap (Research Electronic Data Capture; Vanderbilt University, Nashville, Tennessee, USA) hosted by a server from the Italian Society for Rheumatology.20
From 550 submitted images and videos, 150 images and videos were selected by the facilitator for the web-based reliability exercise: 20 videos of axillary arteries, 20 still images of axillary arteries, 45 videos of temporal arteries, 45 still images of temporal arteries and 20 videos of the ‘compression’ sign applied to temporal arteries. The distribution between longitudinal/transverse views and normal/pathological vessels was as follows: temporal artery still images and videos: transverse 56, longitudinal 54, pathological 57 and normal 53. Axillary artery still images and videos: transverse 18, longitudinal 22, pathological 19 and normal 21. A link with the web-based exercise was sent to the same physicians who participated in the Delphi process, asking them to apply the definitions agreed in the Delphi exercise to decide whether each still image or video was suggestive of vasculitis according to the definitions. Two weeks after the first evaluation, the participants received the same images and videos in a different order for evaluating the intra-rater agreement.
All images and videos were anonymised for patients’ data, the centre where the image was obtained, US machine settings/producer and intima-media thickness (IMT) measurements. Images and videos from patients were only submitted from countries without restrictions for patient image transfer.
In the SLR and in the Delphi process, only descriptive statistics were used. Intra-rater and inter-rater reliabilities were calculated using the kappa coefficient (κ). Intra-rater reliability was assessed by Cohen’s κ, and inter-rater reliability was studied by calculating the mean κ on all pairs (ie, Light’s κ).21 Kappa coefficients were interpreted according to Landis and Koch with κ values of 0–0.2 considered poor, 0.2–0.4 fair, 0.4–0.6 moderate, 0.6–0.8 good and 0.8–1 excellent.22 The percentage of observed agreement (ie, the percentage of observations that obtained the same score) and prevalence of the observed lesions were also calculated. Analyses were performed using R Statistical Software (Foundation for Statistical Computing, Vienna, Austria).
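The reliability statistics named above can be sketched in plain Python as follows. The study's analyses were performed in R; this block is only an illustration with hypothetical ratings (1 = suggestive of vasculitis, 0 = not suggestive) and hypothetical function names. The handling of values that fall exactly on a band boundary is an assumption, because the stated ranges share their end points.

```python
from itertools import combinations
from typing import List


def observed_agreement(a: List[int], b: List[int]) -> float:
    """Fraction of items on which two sets of ratings give the same score."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a: List[int], b: List[int]) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_o = observed_agreement(a, b)
    categories = set(a) | set(b)
    # Chance agreement from the marginal category frequencies of each rating set.
    p_e = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if p_e == 1.0:
        return 1.0  # both sets use one identical category throughout
    return (p_o - p_e) / (1.0 - p_e)


def lights_kappa(raters: List[List[int]]) -> float:
    """Light's kappa: mean of Cohen's kappa over all pairs of raters."""
    pairs = list(combinations(raters, 2))
    if not pairs:
        raise ValueError("at least two raters are required")
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


def prevalence(ratings: List[int], positive: int = 1) -> float:
    """Fraction of ratings that scored the lesion as present."""
    return ratings.count(positive) / len(ratings)


def interpret(kappa: float) -> str:
    """Bands as stated in the text; boundary values assigned to the lower band."""
    for upper, label in [(0.2, "poor"), (0.4, "fair"), (0.6, "moderate"), (0.8, "good")]:
        if kappa <= upper:
            return label
    return "excellent"


# Hypothetical ratings of ten images by three raters (not study data).
r1 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
r2 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
r3 = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

inter_rater = lights_kappa([r1, r2, r3])
print("Light's kappa:", round(inter_rater, 3), interpret(inter_rater))
print("observed agreement r1 vs r2:", observed_agreement(r1, r2))
```

Intra-rater reliability would use the same `cohen_kappa` function, applied to one rater's first-round and second-round scores for the same images.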
SLR on definitions of key elementary US lesions describing vasculitis
Out of 2960 articles screened, 39 studies were finally included (see online supplementary figure S1). Some of these studies addressed more than one key objective (and are reported in the following as if they were separate articles). Twenty-four articles focused on diagnostic accuracy of US in GCA9 23–45 (main study characteristics, detailed results including risk of bias and provided definitions of US key elementary lesions are summarised in online supplementary tables S2–4); three studies investigated the value of US for the prediction of GCA outcome46–48 (online supplementary tables S5–6); 13 studies reported the possible role of US for monitoring disease activity23 27 28 30 33 36 37 39 40 46–49 (online supplementary table S7); and 14 cross-sectional studies assessed LVV by US in patients with GCA, PMR and TAK4 30 37 49–59 (online supplementary table S8). All diagnostic accuracy studies evaluated the role of US for the diagnosis of cranial GCA; two of them also included patients with extracranial LV-GCA.37 44 In seven reports, arterial involvement of patients with PMR was addressed,23 27 36 50–53 and two cross-sectional studies used US to assess the involvement of LV in patients with TAK.58 59 No diagnostic accuracy study was identified for TAK and isolated idiopathic aortitis.
Most US studies in patients with GCA and PMR tested the ‘halo’ sign (n=36)4 9 23–44 46–57 as a key elementary lesion defining vasculitis. Other US signs of vasculitis reported (mostly in combination with the ‘halo’ sign) were stenosis (n=21),4 9 23–25 30 32 33 36–38 41 46 49–56 occlusion (n=18),4 9 23 30–32 36 38 41 46 49–56 the ‘compression’ sign (n=2)9 45 and a conspicuous vessel wall pulsation by M-mode (n=1).31 Cut-off values of the IMT for the definition of the ‘halo’ sign were provided in nine studies,4 27 38 39 43 50 53 54 57 ranging from 0.3 to 1 mm for temporal arteries and from 1.3 to 2 mm for extracranial large arteries. For TAK, the term ‘macaroni’ sign has been used in two studies describing the same pathology as the ‘halo’ sign.58 59 Stenosis, occlusion and arterial dilatation have also been addressed as US key elementary signs in patients with TAK.58 59
No separate definitions for the distinction between acute and chronic vasculitic lesions have been published, neither for GCA nor for TAK.
Twenty-four of the 25 invited participants responded to the first Delphi questionnaire (96% response rate). All 24 participants also responded to the second round of the Delphi questionnaire (100% response rate).
In round 1, a consensus was achieved on nine definitions on normal temporal and extracranial large arteries, arteriosclerosis, ‘halo’ sign, stenosis of temporal and extracranial large arteries, occlusion, ‘compression’ sign (temporal arteries) and US assessment of the ‘compression’ sign (temporal arteries) (table 1). A definition of the ‘halo’ sign not including the measurement of the IMT was preferred by the group, because of the high variance of proposed cut-off values for temporal and extracranial large arteries found in the SLR and the lack of validated data at that time.4 27 38 39 43 50 53 54 57
In round 2, three definitions (arteriosclerosis, ‘halo’ sign and stenosis of temporal arteries) were redefined, voted and agreed upon. The statements on vessel wall pulsation (definition and assessment) and the assessment of the ‘halo’ sign by measurement of vessel wall thickness did not reach the threshold for consensus. At the OMERACT LVV-US face-to-face group meeting (‘round 3’), the second part of the definition on ‘stenosis in temporal arteries’ was rephrased from ‘… before or behind the stenosis’ to ‘… proximal or distal to the stenosis’. The final definitions for normal and pathological cranial and extracranial vessels are described in table 1.
The ‘halo’ and ‘compression’ signs were deemed to be the most important US signs for cranial and extracranial GCA with 100% and 83.3% agreement, respectively. Of the panelists, 95.8% thought that the ‘halo’ sign needs to be present to meet the minimum requirement for vasculitis.
Web-based exercise on still images and videos
Eighteen members from 13 different countries had submitted images and videos including five different US brands (Hitachi, Esaote, GE, Siemens and Philips) using linear transducers with maximum grey scale frequencies of 15, 18 or 22 MHz. Twenty-five group members participated in the web-based exercise in round 1, and 25/25 participants (100%) performed the exercise in round 2.
The reliability of the 25 participants was excellent, with mean inter-rater agreements for all still images and videos of 91–99% and mean Light’s κ values of 0.83–0.98 for inter-rater reliability (table 2), depending on the lesions and sites assessed. Intra-rater reliability was also excellent, with a mean agreement of 91–99% and mean Cohen’s κ values of 0.83–0.98 (table 3). Both inter-rater and intra-rater reliabilities yielded κ >0.8 irrespective of the view (longitudinal or transverse), the format (still images or videos) or the anatomical segment.
Many previous studies have investigated US as a diagnostic tool for GCA using different definitions for normal and abnormal findings. This study now provides expert consensus-based definitions for US in LVV that can be applied in future studies. The consensus-based definitions revealed excellent inter-rater and intra-rater reliabilities when tested on images and videos of patients.
Although we included all types of LVV as possible search terms in the SLR, the Delphi as well as reliability exercise was focused on GCA only, as the SLR revealed insufficient data to provide a solid basis for the consensus process. It is, however, the clinical experience of the experts that US abnormalities in patients with TAK look similar. Future US studies in TAK and idiopathic aortitis are necessary to gather more data on US key lesions also in these LVV entities.
The OMERACT Group agreed that ‘halo’ sign and ‘compression’ sign should be regarded as the primary elementary US signs of cranial and/or LV-GCA without including stenosis or occlusion. The ‘halo’ sign has been applied in most published studies.4 9 23–44 46–57 The ‘compression’ sign was only addressed by two studies from one research group so far.9 45 However, it has shown good diagnostic performance and is feasible in daily practice. It is a method to better visualise the ‘halo’ sign. In early studies, the presence of stenosis helped to increase the sensitivity of temporal artery US.10 23 On the other hand, many sonographers feel that stenosis may reduce the specificity of the examination.2 Furthermore, due to far higher resolution of modern US equipment, a ‘halo’ sign can now usually be visualised in stenotic vessel areas, and temporal artery occlusions in GCA usually occur together with the non-compressible ‘halo’ sign.5
It was also agreed not to include the measurement of IMT for the definition of the ‘halo’ sign, as at the time of the Delphi process only proposals for cut-off values but no studies for validating cut-off values were available. Several previous studies had proposed a wide range of cut-off values for the diameter of a halo sign, for example, 0.3–1 mm for temporal arteries and 1.3–2 mm for extracranial large arteries.4 27 38 39 43 50 53 54 57 A study investigating patients with newly diagnosed active GCA and healthy controls has been recently performed by members of the group (WAS and VSS) for calculating IMT cut-off values in normal temporal and axillary arteries.60 The role of IMT measurements for diagnosis and monitoring is yet uncertain and needs to be addressed by future studies.
The inter-rater and intra-rater agreements of the web-based exercise were excellent. Images and videos were submitted by participating experts as in previous OMERACT-related US exercises.14–16 61 Images and videos for the present web-based exercise were taken from patients with newly diagnosed active GCA since US signs in patients with established disease resolve rapidly with treatment.40 Reliability data for 12 sonographers reading videos from the international multicentre TABUL study have now been published.2 Videos from that study were randomly chosen from all stored videos of the study, irrespective of their quality, whereas the quality of images and videos in the OMERACT study may have been better as the members submitted material which they deemed representative. Sonographers of the TABUL study were less experienced than sonographers of the present OMERACT study. Kappa values for the intra-rater reliability in the TABUL study were 0.69–0.81. Inter-rater reliability was only provided as intraclass correlation coefficients (ICCs). Notably, the reliability of 14 pathologists reading temporal artery biopsy specimens was similar when compared with the 12 sonographers (ICC 0.61 vs 0.62).
We asked the experts to submit images from GCA cases and from controls, which included patients with arteriosclerosis. A few of the control cases indeed had arteriosclerotic changes; however, we did not specifically ask raters to distinguish between arteriosclerotic and non-arteriosclerotic controls. We were therefore unable to conduct a separate analysis in this regard. We did not score images and videos with stenosis or occlusions.
In conclusion, an international expert consensus was reached using OMERACT methodology for the definitions of normal US appearance and abnormalities seen in the temporal and axillary arteries in GCA. This OMERACT exercise (along with the previously reported TABUL study) shows that images and videos of US scans of inflamed temporal and axillary arteries can reliably document the characteristic and diagnostic abnormalities in patients with suspected GCA. Our study supports the use of US abnormalities, including both images and videos, as an inclusion criterion for future GCA trials. Confidence is increasing in the use of US in mainstream clinical practice, and it may be incorporated into future guidelines for GCA diagnosis.
The next step in the OMERACT validation process is the inter-rater and intra-rater reliability test of these definitions for normal and vasculitic arteries in patient-based exercises.
SC and CDu contributed equally.
Contributors WAS, CDe, CDu and SC designed the study. CDe and CDu organised the Delphi exercise. WAS and SC organised the web-based exercise. SC built the database, collected all images and selected the study images. SC was responsible for data entry and data processing together with CAS, GC and SR who did the statistical analyses. WAS, CDu and SC wrote the manuscript. All authors contributed to the acquisition of data and have read and revised the manuscript.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Patient consent Not required.
Ethics approval Ethical approval for the web-based exercise was obtained from the Data Agency of the Southwest Hospital in Esbjerg, Denmark.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement There are no additional unpublished data from the study.