Characteristics of rheumatoid arthritis and its association with major comorbid conditions: cross-sectional study of 502 649 UK Biobank participants

Objectives To characterise the detailed phenotypic and comorbid characteristics of participants with rheumatoid arthritis (RA) in the large population-based UK Biobank, thereby enabling future longitudinal analyses.
Methods We undertook a cross-sectional study using baseline data from the UK Biobank resource (n=502 649). RA was based on self-report, and type of medication was used as a proxy measure of valid diagnosis. Participants with and without RA were compared in terms of sociodemographic, lifestyle and other disease-related risk factors. Logistic regression models were used to determine whether participants with RA were more likely to report comorbid conditions, and whether this varied by RA severity. The models were adjusted for potential confounders and lifestyle risk factors.
Results At baseline, 5657 (1.13%) eligible UK Biobank participants reported RA, of whom 2849 (0.57%) had medically treated RA (median duration=10 years). Prevalence was significantly higher among female, South Asian and socioeconomically deprived participants. Participants with RA were significantly more likely to report diabetes (covariate-adjusted OR 1.18, 95% CI 1.06 to 1.32, p<0.01), hypertension (OR 1.19, 95% CI 1.12 to 1.27, p<0.001) and cardiovascular disease (OR 1.52, 95% CI 1.39 to 1.67, p<0.001).
Conclusions UK Biobank provides extensive data concerning RA population-level comorbidity and risk factors. The frequency, distribution and characteristics of participants reporting RA in UK Biobank are largely consistent with other studies. UK Biobank provides a unique opportunity to interrogate biomarkers, genetic data, detailed imaging and linkage to clinical records at the population level across primary and secondary care.


INTRODUCTION
Rheumatoid arthritis (RA) is a chronic inflammatory syndrome that causes pain, swelling and, if untreated, progressive damage to joints. The UK prevalence of RA has been estimated as 0.8%, 1 which equates to ∼690 000 people. In addition to disability and poorer quality of life, RA is also associated with increased morbidity and mortality compared with the general population. 2 The leading cause of death among RA patients is cardiovascular disease, with risk 50% higher than the general population. 3 UK Biobank is a very large general population cohort study of middle-to-older-aged adults in the UK designed to be representative of the general population in terms of age, sex, socioeconomic status and ethnicity. It was created to provide a useful resource to study a wide range of important chronic conditions of adulthood, such as RA. Follow-up is being conducted via linkage to routine administrative data sources such as primary care attendances, hospital admissions, drug prescriptions and death certificates and will, in due course, provide information on incident cases of RA. Genetic, biomarker and imaging data will also become available in future, which will greatly enhance the studies that can be undertaken using the UK Biobank cohort. However, because of the age criteria used, large numbers of participants already have prevalent diseases such as RA at recruitment, permitting cross-sectional studies to be undertaken now. It is vital to determine the baseline characteristics of the participants with a specific condition such as RA before subsequent studies can be undertaken and their relevance and significance deduced. The aim of this study was to ascertain the frequency and distribution of reported RA within this very large general population cohort study, and the extent to which it was associated with comorbid conditions. Our work provides a comprehensive baseline evaluation on which many subsequent analyses of this uniquely phenotyped and genotyped cohort (genotype data available for around 150 000 participants, with the rest due in Q3 2016) can be performed.

Key messages
What is already known about this subject?
▸ Rheumatoid arthritis (RA) is associated with significant cardiometabolic and psychiatric comorbidities that require long-term studies to evaluate.
▸ UK Biobank is a large population-based cohort study that offers opportunities to study chronic diseases.

What does this study add?
▸ This is the first report of the baseline characteristics of the RA population in UK Biobank.
▸ The prevalence of RA in UK Biobank is 1.13% by self-report and 0.57% by medication.
▸ Patients with RA were significantly more likely to report diabetes, hypertension and cardiovascular disease.

How might this impact on clinical practice?
▸ UK Biobank emphasises the burden of RA and comorbidities, highlighting the need for ongoing vigilance and management.
▸ UK Biobank will be an invaluable resource for the longitudinal study of RA and related comorbidities, especially as genetic and other results become available.

METHODS
RA was defined as self-report of the condition; UK Biobank did not collect information on disease severity. Therefore, type and 'intensity' of medication were used as a proxy measure of valid diagnosis and 'severity', stratified into no RA medication; corticosteroids only; one synthetic disease-modifying antirheumatic drug (DMARD) such as methotrexate or sulfasalazine; more than one synthetic DMARD; and one or more biologic DMARD such as anti-tumour necrosis factor (TNF) therapy or rituximab.
The full list of eligible drugs recorded by UK Biobank is contained in the online supplementary table S1. We checked the UK Biobank medication list for drugs recorded under trade names; this was rare, but instances are indicated in the table. Participants on combination treatments were grouped according to the highest intensity of medication type based on the aforementioned list. Medications were self-reported.
Diabetes, hypertension, cardiovascular disease and depression were based on self-report. Participants self-reported use of lipid/cholesterol-lowering and antihypertensive medication. Participants who did not report hypertension but had a measured diastolic blood pressure above 90 mm Hg or systolic blood pressure above 140 mm Hg were also categorised as having hypertension. Cardiovascular disease included angina, myocardial infarction, arrhythmias, pericarditis, cardiac failure, valve disease and cardiomyopathy. Participants reported whether they had suffered pain for more than 3 months at a number of sites: no pain, headache, facial, neck or shoulder, back, stomach, hip, knee or pain all over. These data were used to categorise participants into no pain, 1 site, 2-3 sites, 4-7 sites or all over the body.
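The derived variables described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the `Participant` record and its field names are hypothetical, blood pressure is assumed to be in mm Hg, and a missing measurement (`None`) is assumed not to count towards hypertension.

```python
from dataclasses import dataclass
from typing import Optional, Set

# The seven localised pain sites named in the text.
PAIN_SITES = {"headache", "facial", "neck or shoulder", "back",
              "stomach", "hip", "knee"}


@dataclass
class Participant:
    # Hypothetical record layout, for illustration only.
    reported_hypertension: bool
    systolic_mmhg: Optional[float]   # measured value; None if missing
    diastolic_mmhg: Optional[float]  # measured value; None if missing
    pain_sites: Set[str]             # sites with pain for more than 3 months
    pain_all_over: bool


def has_hypertension(p: Participant) -> bool:
    """Self-reported, or measured systolic > 140 or diastolic > 90."""
    if p.reported_hypertension:
        return True
    high_sbp = p.systolic_mmhg is not None and p.systolic_mmhg > 140
    high_dbp = p.diastolic_mmhg is not None and p.diastolic_mmhg > 90
    return high_sbp or high_dbp


def pain_category(p: Participant) -> str:
    """Collapse reported pain sites into the five ordered categories."""
    if p.pain_all_over:
        return "all over"
    n = len(p.pain_sites & PAIN_SITES)
    if n == 0:
        return "no pain"
    if n == 1:
        return "1 site"
    if n <= 3:
        return "2-3 sites"
    return "4-7 sites"
```

Note that the thresholds are strict ("above"), so a reading of exactly 140/90 is not classified as hypertension in this sketch.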

Statistical analyses
Participants with and without RA were compared, in terms of their characteristics, lifestyle factors and presence of comorbidity, using χ² tests for categorical data, χ² tests for trend for ordinal data and Kruskal-Wallis tests for continuous data. We used a series of binary logistic regression analyses to examine the association between RA and comorbid conditions adjusted for the potential confounding effects of age, sex, ethnicity and socioeconomic deprivation quintile, as well as potential shared lifestyle risk factors of smoking status, alcohol intake and BMI.
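As a simplified illustration of the comparisons above, the sketch below computes a Pearson χ² test and an unadjusted odds ratio with a Wald 95% CI from a 2×2 table. It is not the study's analysis: the reported ORs come from covariate-adjusted logistic regression in Stata, whereas this example is unadjusted, and the function names and any counts passed to them are hypothetical.

```python
import math


def chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]], e.g. rows = RA / no RA, columns = condition / none."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_1df(chi2: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2))


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio with a Wald confidence interval.
    All four cells must be non-zero."""
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 80/1000 with RA and 500/10000 without RA
    # report the comorbid condition.
    stat = chi2_2x2(80, 920, 500, 9500)
    print(stat, p_value_1df(stat), odds_ratio_ci(80, 920, 500, 9500))
```

A useful sanity check on any reported estimate is that the point estimate lies inside its own confidence interval, which the Wald construction guarantees by design.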
Among those with RA, we used χ² tests for trend to examine whether the characteristics of participants and the prevalence of comorbidity varied according to the type of RA medication, a surrogate marker of RA severity. Participants were excluded from these analyses if their medication was unknown. We undertook generalised ordered logistic regression analyses (GOLOGIT2; http://www3.nd.edu/~rwilliam/gologit2/) to explore the relationship between severity of RA and pain, after adjustment for potential confounders and shared lifestyle risk factors. GOLOGIT2 has the advantage of not assuming a stable relationship between an ordered group variable and outcome. 7 We used the 'autofit' option to relax the parallel-lines constraint where appropriate.
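To make the parallel-lines constraint concrete, the sketch below computes an unadjusted odds ratio at each cut point of an ordinal outcome (for example, the pain categories). Under the parallel-lines (proportional odds) assumption these cumulative ORs should be roughly equal; clear divergence is the situation in which relaxing the constraint is appropriate. This is a crude descriptive check under stated assumptions, not a reimplementation of GOLOGIT2, and the function name and example counts are hypothetical.

```python
from typing import List, Tuple


def cumulative_odds_ratios(exposed: List[int],
                           unexposed: List[int]) -> List[Tuple[int, float]]:
    """Counts are given per ordered outcome category 0..K-1.
    For each cut point k = 1..K-1, return the odds ratio of being in
    category >= k (versus < k), comparing exposed with unexposed."""
    if len(exposed) != len(unexposed) or len(exposed) < 2:
        raise ValueError("need two count lists of equal length >= 2")
    results = []
    for k in range(1, len(exposed)):
        a, b = sum(exposed[k:]), sum(exposed[:k])      # exposed: >= k, < k
        c, d = sum(unexposed[k:]), sum(unexposed[:k])  # unexposed: >= k, < k
        if b == 0 or c == 0:
            raise ValueError(f"odds ratio undefined at cut point {k}")
        results.append((k, (a * d) / (b * c)))
    return results


if __name__ == "__main__":
    # Hypothetical counts across four ordered categories.
    for cut, or_ in cumulative_odds_ratios([20, 30, 30, 20], [40, 30, 20, 10]):
        print(f"category >= {cut}: OR = {or_:.2f}")
```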
As RA was defined on the basis of self-report, it is possible that some participants with joint symptoms may not have had RA or may have had another musculoskeletal condition that they erroneously labelled as RA. Therefore, we ran the analyses using two definitions of RA. First, we assumed that all participants who reported RA had the condition. Second, we used a much tighter definition which included only those participants who reported having RA and were also on relevant medication which rheumatologists would commonly use in RA (see online supplementary table S1). Participants who reported RA but were not recorded as taking relevant medication were excluded from the second set of analyses. In both sets of analyses, we compared participants with those who did not report RA. We also reran all the analyses excluding the participants who reported osteoarthritis and then also excluding the participants who reported other musculoskeletal conditions, including psoriatic arthritis, systemic lupus erythematosus, ankylosing spondylitis and polymyalgia rheumatica. Finally, we reran the analyses classifying as hypertensive only those participants who reported having the condition. The results of these rerun analyses were not meaningfully different from the final results which are shown. All analyses were conducted using Stata V.13.1 (StataCorp, College Station, Texas, USA).
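The two case definitions, together with the rule of grouping participants by their highest-intensity medication, can be sketched as below. The drug lookup is a tiny illustrative stand-in and is not supplementary table S1; `prednisolone` is an assumed example of a corticosteroid, and treating the corticosteroid-only group as 'relevant medication' for the narrow definition is likewise an assumption of this sketch. All names are hypothetical.

```python
from enum import IntEnum
from typing import Iterable, Optional


class MedGroup(IntEnum):
    """Ordered medication groups, lowest to highest intensity."""
    NONE = 0
    CORTICOSTEROID_ONLY = 1
    ONE_SYNTHETIC_DMARD = 2
    MULTIPLE_SYNTHETIC_DMARDS = 3
    BIOLOGIC_DMARD = 4


# Illustrative lookup only; NOT the study's supplementary table S1.
DRUG_TYPE = {
    "prednisolone": "corticosteroid",  # assumed example
    "methotrexate": "synthetic",
    "sulfasalazine": "synthetic",
    "rituximab": "biologic",
}


def medication_group(drugs: Iterable[str]) -> MedGroup:
    """Assign the highest-intensity group among the reported drugs."""
    names = {d.strip().lower() for d in drugs}  # de-duplicate repeats
    types = [DRUG_TYPE[n] for n in names if n in DRUG_TYPE]
    if "biologic" in types:
        return MedGroup.BIOLOGIC_DMARD
    n_synthetic = types.count("synthetic")
    if n_synthetic > 1:
        return MedGroup.MULTIPLE_SYNTHETIC_DMARDS
    if n_synthetic == 1:
        return MedGroup.ONE_SYNTHETIC_DMARD
    if "corticosteroid" in types:
        return MedGroup.CORTICOSTEROID_ONLY
    return MedGroup.NONE


def broad_status(reported_ra: bool) -> bool:
    """Broad definition: every participant who reported RA is a case."""
    return reported_ra


def narrow_status(reported_ra: bool, drugs: Iterable[str]) -> Optional[bool]:
    """Narrow definition: True = reported RA on relevant medication,
    False = comparison group (did not report RA),
    None = excluded (reported RA, no relevant medication recorded)."""
    if not reported_ra:
        return False
    return True if medication_group(drugs) > MedGroup.NONE else None
```

Returning `None` for untreated self-reported RA keeps those participants out of both the case and comparison groups, mirroring their exclusion from the second set of analyses.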

RESULTS
For men and women, we found significant associations between reporting RA and adiposity-related measures: higher BMI, waist circumference, percentage body fat and waist:hip ratio. As an additional analysis, we ran linear regressions of each measure on RA status, adjusted for the covariates of age, sex and smoking, and the associations remained. When we used the narrower definition of treated RA and compared this subgroup with participants who did not report RA, all of the associations remained statistically significant, except that there was no difference between groups in terms of physical activity score and BMI in men (table 2). Women who reported RA were significantly more likely to report use of hormone replacement therapy, having gone through the menopause, menopause before the age of 50 and having had a hysterectomy (table 2).
Participants with reported RA had lower grip strength (table 2). They were significantly more likely to report pain, and those with pain were more likely to report it at more than one site or all over the body (table 3). Among participants who reported RA, the number of pain sites increased with increasing RA severity as defined by medication type (table 4). These findings persisted after adjustment for age, sex, ethnicity, socioeconomic quintile, BMI and alcohol intake (table 4). Participants with reported RA had higher systolic blood pressure (table 2). The prevalence of hypertension was higher among those with reported RA (68.04% vs 60.30%; table 3). There was no clear pattern in relation to RA severity based on type of medication (table 5).
Overall, participants with reported RA had a higher prevalence of diabetes (7.87% vs 5.25%), and this difference persisted after adjustment for potential confounders (table 3). The prevalence of cardiovascular disease was significantly higher among those with RA, and rates increased with higher severity based on medication except for biologics (table 6). There was a significantly increased rate of depression in participants who reported RA, although this did not persist after adjustment for potential confounders and did not reach statistical significance using the tighter definition of treated RA (table 6).
The results were very similar when we reran the analyses excluding RA participants who reported other musculoskeletal conditions such as psoriatic arthritis (n=33), systemic lupus erythematosus (n=41), polymyalgia (n=35) and osteoarthritis (n=607), and also when we classified hypertension purely on the basis of self-report. Participants additionally self-reported use of nonsteroidal anti-inflammatory drugs (NSAIDs) and cyclooxygenase-2 (COX-2) inhibitors. We report usage rates in the online supplementary table S3, stratified by RA group. There were significantly increased rates of usage in the narrow and broad RA groups (both p<0.001).

DISCUSSION
Within UK Biobank, the overall prevalence of RA was 1.13% based on self-report and 0.57% using a narrower definition of treated RA. These are similar to population prevalence estimates based on strict American College of Rheumatology criteria. In a systematic review published in 2006, 19 studies had measured prevalence using these criteria. 8 Gabriel et al 9 reported a prevalence of 1.1% for the USA, and all the other studies produced a prevalence below 1%. Only two studies have been conducted in the British Isles, and none in the past 12 years. In 1999, Power et al 10 derived a prevalence of 0.5% for Dublin, Ireland, and, in 2002, Symmons et al 1 published a figure of 0.8% for Norfolk, England. Whereas these studies used clinician diagnosis of RA, UK Biobank relies on self-reported diagnosis of RA and use of relevant medication. It should be noted that we could not distinguish between participants who chose not to report (or did not remember) their diseases or medications and those who were genuinely healthy and had nothing to report. We assumed that participants who did not report RA or relevant medication did not in fact have RA; however, this may not necessarily be the case, resulting in a slight underestimation of RA in the sample. In this way, we believe our analysis comparing treated RA with the rest of the population is conservative and therefore robust. In UK Biobank, the female to male ratio of the prevalence of self-reported RA was 2.3:1. In Alamanos et al's 8 systematic review, all of the studies demonstrated a higher prevalence in women, with the female to male ratio ranging from 1.6:1 to 5.7:1. The only British study to report sex-specific prevalence rates demonstrated a female to male prevalence ratio of 2.5:1, which is very close to our findings in UK Biobank. 1 Ethnic differences were also observed in UK Biobank. Compared with white participants, the prevalence was higher in South Asian participants and lower in black and Chinese participants.
This is consistent with other studies that have reported a low prevalence among Chinese, 11 Taiwanese, 12 South Korean, 13 black 14 and black Caribbean communities, 15 ranging from 0.1 to 0.3 per 100 population in different Asian groups (eg, Far Eastern). 16 UK Biobank participants who reported RA were more likely to have cardiovascular disease, with an adjusted OR of 1.52 (vs non-RA participants). This finding is consistent with previous studies. Ogdie et al 17 conducted a general population cohort study using routine primary care data. They studied a composite outcome of myocardial infarction, stroke or cardiovascular death among 41 752 participants with RA. In comparison with participants without RA, those with RA were at increased risk irrespective of whether they were taking a DMARD (adjusted HR 1.58, 95% CI 1.46 to 1.70) or not (adjusted HR 1.39, 95% CI 1.28 to 1.50). Similarly, QRISK2 data, which included 531 family practitioners, 2.29 million participants and 140 115 cardiovascular events, 18 demonstrated that RA was a significant risk factor for stroke among men (OR 1.34, 95% CI 1.19 to 1.51) and women (OR 1.33, 95% CI 1.22 to 1.46). RA was also a risk factor for cardiovascular disease among men (OR 1.38, 95% CI 1.25 to 1.52) and women (OR 1.50, 95% CI 1.39 to 1.61). Hence, UK Biobank data on cardiovascular disease prevalence seem to be in broad agreement with other major datasets, lending external validity. RA was associated with significantly increased rates of NSAID/COX-2 inhibitor use. Interpretation of NSAID use is problematic because these are often taken on a pro re nata basis and NSAIDs such as ibuprofen are available as over-the-counter drugs.
It has been suggested that RA and cardiovascular disease may share a genetic predisposition mediated via common inflammatory and metabolic pathways. 19 20 However, common lifestyle risk factors may also play a role. In a systematic review, Sugiyama et al 21 demonstrated a significant association between current smoking and RA in men (OR 1.87, 95% CI 1.49 to 2.34) and women (OR 1.31, 95% CI 1.12 to 1.54). In our study, participants with RA had a higher prevalence of smoking, and adjustment for potential shared lifestyle risk factors, such as smoking status, somewhat attenuated the effect size of the association between RA and cardiovascular disease, suggesting that these factors may explain some of the association.
People with RA have a higher prevalence of some established risk factors for cardiovascular disease, but not others. 22 Boyer et al conducted a meta-analysis of 15 case-control studies, comprising a total of 2956 people with RA; notably, UK Biobank alone includes nearly the same number of participants in a single cohort, alongside a many times larger number of non-RA participants. Boyer et al reported a higher prevalence of smoking and diabetes, but not hypertension. We observed a significantly higher prevalence of diabetes in participants who reported RA. UK Biobank will report glycosylated haemoglobin (HbA1c) results on all participants in 2016, and these data will allow better assessment of the association of RA and its severity with glycaemic levels, and whether there is an increased risk of undiagnosed diabetes or impaired glucose tolerance. In contrast to Boyer et al, the prevalence of hypertension was significantly higher among UK Biobank participants with RA, and the subgroup on RA medication remained at significantly higher risk of hypertension after adjusting for potential measured confounders. This is similar to what has been reported by other studies. 23 Overall, after adjustment for potential confounders, there was no association between self-reported RA and depression. This finding is not consistent with previous studies and merits further investigation. In 2013, Matcham et al 24 published a meta-analysis of 72 studies comprising 13 189 participants. The pooled prevalence of depression was 16.8% (95% CI 10% to 24%), but the estimates of prevalence varied considerably according to case definition, up to a maximum of 38.8% based on studies using the Patient Health Questionnaire-9 (PHQ-9 25 ). Coexistence of depression and RA is associated with increased pain, fatigue, reduced health-related quality of life, increased physical disability and healthcare costs. 26 However, depression tends to be underdiagnosed by clinicians.
In a cohort study of more than 33 000 patients with RA, depressive symptoms were reported by 11.7% of patients but only identified by 1% of rheumatologists. 27 Similarly, the incidence of depressive symptoms was 7.8 per 100 patient-years based on patient report, but only 0.4 per 100 patient-years based on rheumatologist reports. The UK Biobank results are based on self-reported depression, and it is possible that people who participate in cohort studies like UK Biobank are unrepresentative of the general population and less likely to feel or report depression.
Of the participants who reported RA, about half (2808 of 5657; 49.6%) were not on relevant RA medication. Cohorts recruited from secondary care settings 28 report much higher prevalence (>90%) of DMARD use, but data from inception cohorts of patients presenting with inflammatory arthritis in the community 29 report lower rates of DMARD usage (<60%), consistent with our findings. It is not currently possible to formally confirm the diagnosis of RA or stratify RA by disease activity scores, inflammatory markers or self-reported severity in UK Biobank. Therefore, we used type and intensity of RA therapy as a proxy measure of disease severity, on the assumption that more potent RA therapies, such as combination DMARDs or biologic agents, would be used by participants with the most severe disease. This approach could be partially confounded by the more widespread use of treat-to-target strategies resulting in patients with more recently diagnosed RA receiving combination DMARDs earlier than patients with more long-standing disease. Patients who self-reported RA but were on no RA medication are the group in whom confirmation of the diagnosis, via biomarkers, imaging and clinical examination, is most warranted, whereas those on medications are highly likely to have genuine RA. The former group is likely to include people who do not have a true diagnosis of RA (eg, mislabelling of osteoarthritis), but may also include some people with long-standing RA who are unable to take or no longer require DMARD therapy. Among those on RA medications, more potent therapy was associated with a higher prevalence of pain over multiple sites of the body. Conversely, participants on more potent medication were less likely to have cardiovascular disease than those on just steroids. Care needs to be taken in interpreting these findings as genuine negative correlations between disease severity, treatment and comorbidity.
First, the number of participants who reported RA but were on steroid medication only was very low (n=12). Second, they may be taking steroids for other indications. Therefore, confirmation of their RA diagnosis via biomarkers and linkage to clinical records is warranted. Third, they may have severe RA but been unable to take more potent therapy due to contraindications, including comorbid conditions. Finally, the association with therapy may reflect reverse causation because patients only continue on these potent but expensive (biologics) and potentially toxic therapies (DMARDs and biologics) if they are responding well to them. Some therapies may also be contraindicated in people with significant existing cardiovascular and other comorbidities. It is also possible that patients with severe RA may decline more potent treatment or may be in a period of remission. The low prevalence of biologic drug treatment in patients with self-reported RA is noteworthy and suggests that the RA population in UK Biobank may be much milder than RA in the UK as a whole; patients with more severe RA appear, as anticipated given the method of population-based recruitment, under-represented in the UK Biobank cohort compared with RA patients in secondary care settings. Patients with less severe disease are more likely to participate in a study like UK Biobank. In terms of identifying the 'cleanest' RA phenotype, future studies may consider their primary analysis as treated RA on DMARDs, in the absence of other systemic rheumatic diseases (such as systemic lupus erythematosus (SLE) and osteoarthritis), although there are likely to be patients who do have RA and osteoarthritis, particularly with increasing age. There also exists the possibility of using future biomarker data (especially citrullinated peptide antibody) to better characterise the RA phenotype, as used in other cohorts such as the Women's Health Initiative. 30
UK Biobank is a very large cohort study that is representative of the general UK population in terms of breakdown by age, sex, ethnicity and socioeconomic status, within the age range recruited. However, in common with similar observational studies, participants are not necessarily representative of the general population in terms of other characteristics, such as lifestyle. Therefore, while UK Biobank is of value in identifying risk factors for the development and progression of disease, care is required in generalising other measures such as disease prevalence. Nonetheless, the prevalence of RA in UK Biobank is broadly in agreement with similar studies, taking into account that case definition is currently based solely on self-report, 31 and many other characteristics (higher cardiovascular disease prevalence, socioeconomic gradient, more diabetes, pain and lower grip strength) also match and extend prior observations. UK Biobank is currently conducting biomarker assays, imaging studies and linkage to clinical records, all of which will improve the case definition and stratification of RA, and therefore the utility of this resource for studying RA. The data reported in this manuscript provide a comprehensive cross-sectional baseline description of RA and relevant comorbidities in UK Biobank. Our results provide an initial analysis of a dataset that is likely to yield impactful findings in future. We did not exhaustively investigate all possible comorbidities at this stage, but instead focused on relatively common psychiatric/cardiometabolic diseases and sociodemographic/anthropometric traits. The follow-up information being collected via linkage to routine administrative data sources, including primary care consultations, hospital admissions, prescriptions and deaths, will provide useful information on the natural history of RA.
Finally, ongoing genotyping will enable researchers to explore genetic predisposition, gene-environment interactions and the extent to which the underlying mechanisms are common to other diseases. As these further data become available, UK Biobank will become an increasingly powerful resource to study RA and related comorbidities. We believe this present paper will serve as an excellent starting point for future researchers to interrogate many aspects of RA as UK Biobank data mature.
version to be published. DFM and DML had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. SS, DFM, DP, NS and JPP were involved in study conception and design. JPP was involved in acquisition of the data. DML, DFM, DP, IBM, NS, JPP and SS were involved in analysis and interpretation of the data.
Funding This research has been conducted using the UK Biobank resource; we are grateful to UK Biobank participants. UK Biobank was established by the Wellcome Trust medical charity, Medical Research Council, Department of Health, Scottish Government and the Northwest Regional Development Agency. It has also had funding from the Welsh Assembly Government and the British Heart Foundation.
Competing interests JPP is a member of the UK Biobank steering committee. This had no influence on the current study.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.
Open Access This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY 4.0) license, which permits others to distribute, remix, adapt and build upon this work, for commercial use, provided the original work is properly cited. See: http://creativecommons.org/licenses/by/4.0/