Objective To summarise the methodological aspects in studies with work participation (WP) as outcome domain in inflammatory arthritis (IA) and other chronic diseases.
Methods Two systematic literature reviews (SLRs) were conducted in key electronic databases (2014–2019): search 1 focused on longitudinal prospective studies in IA and search 2 on SLRs in other chronic diseases. Two reviewers independently identified eligible studies and extracted data covering pre-defined methodological areas.
Results In total, 58 studies in IA (22 randomised controlled trials, 36 longitudinal observational studies) and 24 SLRs in other chronic diseases were included. WP was the primary outcome in 26/58 (45%) studies. The methodological aspects least accounted for in IA studies were as follows (proportions of studies positively adhering to the topic are shown): aligning the studied population (16/58 (28%)) and sample size calculation (8/58 (14%)) with the work-related study objective; attribution of WP to overall health (28/58 (48%)); accounting for skewness of presenteeism/sick leave (10/52 (19%)); accounting for work-related contextual factors (25/58 (43%)); reporting attrition and its reasons (1/58 (2%)); reporting both aggregated results and proportions of individuals reaching predefined meaningful change or state (11/58 (19%)). SLRs in other chronic diseases confirmed heterogeneity and methodological flaws identified in IA studies without identifying new issues.
Conclusion High methodological heterogeneity was observed in studies with WP as outcome domain. Consensus around various methodological aspects specific to WP studies is needed to improve quality of future studies. This review informs the EULAR Points to Consider for conducting and reporting studies with WP as an outcome in IA.
- outcome assessment
- health care
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
What is already known about this subject?
Inflammatory arthritis (IA) has substantial impact on work participation (WP).
Previous systematic literature reviews of studies with WP as an outcome documented deficiencies in the study design, analysis and reporting of results, hampering interpretation, comparison and meta-analysis.
What does this study add?
This study provides a synthesis of the methodological choices and issues in studies with WP as an outcome domain in IA and in other chronic diseases.
Methodological heterogeneity and flaws were identified across four key areas of potential concern: (1) study design, (2) outcome domains and measurement instruments, (3) data analysis and (4) reporting of results.
How might this impact on clinical practice?
This study aims to inform the efforts to improve the methodological quality and homogeneity of future studies with WP as an outcome domain, and ultimately contribute to high-quality evidence on interventions to support endurable WP.
This review informs the EULAR Points to Consider when designing, analysing and reporting studies with WP as an outcome domain in IA.
Inflammatory arthritis (IA) encompasses a group of chronic diseases typically affecting adults of working age, and often leading to work disability with consequent loss of income for patients and high social expenditures for society.1 The treatment of IA aims at reaching remission or, at least, low disease activity in order to prevent structural damage and improve patients’ quality of life. Despite the proven efficacy of new therapies such as biologic (b) and targeted synthetic (ts) disease-modifying anti-rheumatic drugs (DMARDs), the burden of restricted participation in work remains high.
People living with IA have identified the ability to maintain a job and being productive while at work as a priority, ranked right after suppressing pain and improving physical function.2 Work participation (WP) is defined as an active engagement in the role of worker.3 In addition to the employment status (being employed or not), restrictions in work participation can be quantified using absenteeism (namely sick leave) and presenteeism.4 Absenteeism refers to the time missed from work due to health reasons and presenteeism refers to experienced restrictions or impaired productivity while at work due to health reasons.4 People can transition back and forth between not working, working with difficulty and working without difficulty.5
To ensure effective interventions to support endurable WP, high-quality evidence is required. However, several systematic literature reviews in IA showed inconclusive results that could be partially attributed to methodological issues in the study design, analysis and reporting of results hampering correct interpretation, comparison and meta-analysis of studies.6 7
WP is increasingly seen as an important outcome of interventions and thus as a target for improvement. During the past decade, the Outcome Measures in Rheumatology (OMERACT) Productivity Working Group focused its work on evaluating and improving the validity of outcomes and outcome measurement instruments of WP.4 8–10 Despite its continuous efforts to harmonise measurement of worker productivity loss across studies, valid instruments are not sufficient to ensure high-quality clinical studies.
The primary aim of this systematic literature review (SLR) was to inform the EULAR task force working on ‘points to consider when designing, analysing and reporting studies with WP as an outcome domain among patients with IA’. The specific objectives of the present work were (1) to summarise the methodological choices in studies with WP as an outcome domain in IA and (2) to identify the methodological issues reported in SLRs of studies with WP as an outcome domain in other chronic diseases.
Search strategy and eligibility criteria
The EULAR task force working on ‘points to consider when designing, analysing and reporting studies with WP as an outcome domain among patients with IA’ outlined the scope of the literature search and pre-identified 24 topics in seven main areas of potential concern: (1) study design, (2) outcome domains, (3) outcome measurement instruments, (4) contextual factors, (5) data analysis, (6) reporting of results and (7) estimating productivity costs. These topics were based on (a) knowledge of the literature and experience with conducting such studies and (b) the potential role of the issues in bias (selection, information and statistical bias). After a careful evaluation of the seven pre-defined areas and 24 topics, and to avoid redundancy, they were grouped into four main areas (study design, work outcome domains and instruments, data analysis and reporting of results) and 16 topics (figure 1).
For topics 3 and 9, some context is needed. The follow-up time for outcome assessment should be sufficient to capture changes in the work outcome of interest (topic 3). While for presenteeism and sick leave responsiveness was demonstrated at 24 weeks of follow-up,11 for work status, a follow-up of at least 1 year is preferred. In fact, changes in work status can only be detected over shorter follow-up periods of ≤6 months if large sample sizes are used. Work status change and, more precisely, transitions between employment and unemployment can be seen as formally the last step in a sequence of events that start with presenteeism and/or absenteeism.12 On the other hand, regarding the recall of the assessment instrument (topic 9), there is evidence that a recall period beyond 3 months for sick leave becomes inaccurate8 13 and that patients prefer a recall of 1 week for presenteeism (with maximal accuracy for a 4-week recall).14
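As an illustration only, the follow-up and recall considerations described above can be expressed as a small design check. This is a minimal sketch: the function name, outcome labels and the conversion of "3 months" to 13 weeks are assumptions made for the example, not part of the review or of any EULAR or OMERACT tool.

```python
# Thresholds paraphrased from the text above; names and unit conversions are hypothetical.
MIN_WORK_STATUS_FOLLOW_UP_WEEKS = 52        # "a follow-up of at least 1 year is preferred"
MAX_SICK_LEAVE_RECALL_WEEKS = 13            # "3 months", approximated as 13 weeks
PRESENTEEISM_RECALL_RANGE_WEEKS = (1, 4)    # 1 week preferred, up to a 4-week recall


def check_design(outcome, follow_up_weeks, recall_weeks=None):
    """Return a list of potential timing issues for one work outcome domain."""
    issues = []
    if outcome == "work_status":
        if follow_up_weeks < MIN_WORK_STATUS_FOLLOW_UP_WEEKS:
            issues.append("follow-up shorter than 1 year for work status")
    elif outcome == "sick_leave":
        if recall_weeks is not None and recall_weeks > MAX_SICK_LEAVE_RECALL_WEEKS:
            issues.append("sick-leave recall longer than 3 months")
    elif outcome == "presenteeism":
        low, high = PRESENTEEISM_RECALL_RANGE_WEEKS
        if recall_weeks is not None and not (low <= recall_weeks <= high):
            issues.append("presenteeism recall outside 1 to 4 weeks")
    else:
        raise ValueError(f"unknown outcome domain: {outcome!r}")
    return issues


# Hypothetical study: work status assessed after only 24 weeks.
short_follow_up = check_design("work_status", follow_up_weeks=24)
# Hypothetical study: sick leave with a 4-week recall over 1 year.
adequate_recall = check_design("sick_leave", follow_up_weeks=52, recall_weeks=4)
```

A short follow-up for work status is flagged rather than rejected, since the text notes that such changes can still be detected in ≤6 months when the sample is large.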
Two searches were conducted according to the PICOT (Population, Intervention, Comparator, Outcomes, Time of follow-up) framework—details are provided in online supplemental figure S1. Search 1 focused on studies with WP as outcome domain in IA, aiming at critically appraising methodological choices and heterogeneity across studies, and search 2 on SLRs of studies with WP as outcome domain in other chronic diseases, aiming to identify whether our pre-identified methodological issues in studies in IA were also recognised in other chronic diseases and/or new aspects were revealed.
For search 1, the following study designs were included: randomised controlled trials (RCTs), controlled clinical trials and prospective observational studies (including registries). Also, studies in IA assessing costs of changes in work participation were identified and included in order to assess whether volumes of work productivity (eg, days, hours) were reported as a separate step before converting volumes into costs.15 Other specific methodological aspects related to this particular type of study were considered beyond the scope for the current review. Exclusion criteria for both searches are provided in online supplemental text S1.
The search strategies were designed by an experienced librarian (LF). MEDLINE, EMBASE, CINAHL and the Cochrane Library were searched (details on search strategies in online supplemental text S2 and S3) between January 2009 and May 2019.
Study selection and data extraction
For both searches, references and abstracts were imported into the reference management software EndNote V.X7.0.2 and deduplicated.
As a high number of hits resulted from the initially defined broad timeframe (n=7715), it was decided to limit the review to recent studies published from January 2014 to April 2019 (n=5534). This decision was based on feasibility and with the rationale that the most recent studies would likely be of better methodological quality and better reflect current standards.
Two researchers (MLM and MMtW) independently screened all titles and abstracts. Next, full texts were reviewed to determine eligibility. Disagreements were resolved by discussion, and if necessary, the methodologists (SR and PP) were involved to make a final decision.
For both searches, study details and results of eligible studies were retrieved by two reviewers (MLM and AA) using a standardised data extraction sheet. Both reviewers (MLM and AA) retrieved data from a 20% random selection of all the included studies. Given an agreement of 89% and consensus on how to further avoid divergences in data extraction, reviewers continued to independently retrieve data of the remaining studies.
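For readers who want to reproduce this kind of check, raw percentage agreement between two reviewers can be computed as in the sketch below. The data are invented for illustration, and the review does not state whether raw agreement or a chance-corrected statistic was used, so the choice of raw agreement here is an assumption.

```python
def percent_agreement(reviewer_a, reviewer_b):
    """Percentage of items on which two reviewers extracted the same value."""
    if len(reviewer_a) != len(reviewer_b) or not reviewer_a:
        raise ValueError("both reviewers must rate the same non-empty set of items")
    matches = sum(1 for a, b in zip(reviewer_a, reviewer_b) if a == b)
    return 100 * matches / len(reviewer_a)


# Hypothetical extraction of one yes/no topic across nine studies.
reviewer_1 = ["yes", "no", "no", "yes", "no", "yes", "yes", "no", "no"]
reviewer_2 = ["yes", "no", "no", "yes", "no", "yes", "no", "no", "no"]
agreement = percent_agreement(reviewer_1, reviewer_2)  # 8 of 9 items match
```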
For studies in IA, general characteristics of the studies were first retrieved, such as the type of study (RCTs vs longitudinal observational studies), type of intervention (pharmacological intervention, non-pharmacological intervention and natural course of the disease), assessed WP outcome domain (work status and/or sick leave and/or presenteeism) and also if the WP outcome domain was assessed as primary or secondary outcome (online supplemental table S1). Then, the methodological choices regarding the 16 pre-defined topics (figure 1) were retrieved by area: study design (table 1), work outcome domains and instruments (table 2), data analysis (table 3) and reporting of results (table 4).
For SLRs in other chronic diseases, all the methodological issues, as reported by the authors of the SLRs, were retrieved and categorised into the 16 pre-defined topics (figure 1). The quality of the SLRs was not assessed as we were interested in reviewing which methodological flaws were reported in other chronic diseases, particularly focusing on new aspects not previously identified in IA studies. Both SLRs were registered in PROSPERO (CRD42020186798).
For the SLR in IA, the literature search yielded 7715 hits. After removing duplicates, conference abstracts and publications before 2014, 2427 articles remained for screening of titles and abstracts, leading to screening of 132 full-text articles. Twenty-three studies on costs of WP were cross-sectional or retrospective and therefore did not comply with inclusion criteria to assess general methodological choices. A total of 81 studies were included in our analysis (flowchart in online supplemental figure S2): 58 for extraction of general methodological choices,16–73 23 for outcome reporting studies on costs of work productivity16 74–96 and one providing information on both outcomes.16
The search for SLRs in other chronic diseases yielded 10 208 hits. After excluding duplicates and studies before 2014, 3547 titles and abstracts were screened, resulting in screening of 148 full-text articles, and finally 24 were included in the analysis (flowchart in online supplemental figure S3).97–120
General characteristics of the included studies
The 58 IA studies appraising general methodological issues comprised 36 longitudinal observational studies16 17 20 23 26 27 29–31 33 34 39–47 50 51 55–59 61 63 65 66 68 70–73 and 22 RCTs.18 19 21 22 24 25 28 32 35–38 48 49 52–54 60 62 64 67 69 The characteristics of included studies are provided in online supplemental table S1.
Most of the IA studies were on rheumatoid arthritis (RA) (n=33, 57%),16 17 21–24 27–29 31–34 36 37 39–41 44–47 49 52–54 56 57 61 64 68 69 72 followed by axial spondyloarthritis (axSpA) (n=16, 28%)18 19 30 38 50 51 59 60 62 63 65–67 70 71 73 and psoriatic arthritis (PsA) (n=6, 10%),25 35 43 48 55 58 and finally, two studies assessed two diagnostic groups: RA and axSpA,42 and axSpA and PsA.26
The type of intervention and WP outcome domain for each study is presented in online supplemental table S1, and the corresponding data grouped by type of study (RCTs vs longitudinal observational studies) is shown in online supplemental table S2. Work was assessed as a primary outcome in only 26/58 (45%) of the studies,16 17 21 23 26 27 29–31 41 42 44–47 50 56–58 61 63 64 66 68 71 rarely being the primary outcome in RCTs (n=2/22, 9%).21 64 The time horizon for the assessment of WP outcomes varied from 24 weeks to 12 years and its distribution, as well as the frequency of assessment by work outcome domain, are both provided in online supplemental table S3.
The general characteristics of included SLRs are presented in online supplemental table S4. Most studies focused on cancer (n=15; 63%),98 99 101 105–109 111 112 115 116 118–120 followed by stroke (n=3, 13%).97 113 117 The most frequently assessed work outcome was ‘return to work after a temporary absence’ (n=12, 50%).97 99 101 107–110 113 115–117 120
Table 1 provides an overview of methodological choices in the area of study design. The included population was aligned with the specific work-related study objective in only 16/58 (28%) IA studies,21–23 26 27 30 31 43–46 50 61 63 64 66 while the sample size calculation was performed solely in 8 (14%) studies.21 30 42 44 56 58 64 68
Large heterogeneity was observed in the follow-up time of the IA studies, although the majority of studies assessed changes in work status within a follow-up of >6 months. Of the five studies assessing changes in work status over an unrealistically short follow-up period of ≤6 months,21 30 52 53 67 two also assessed it after 12 months (online supplemental table S3).30 53
The frequency of assessment of sick leave in observational studies (excluding registries, n=8) was longer than 3 months in more than half of the studies (12/20 (60%))20 26 33 34 39 42 50 51 59 63 66 72; however, the other 8/20 (40%) had a frequency of assessment shorter than 3 months hampering correct aggregation into cumulative sick leave.31 43 55 56 58 68 71 73
While all RCTs had a comparator,18 19 21 22 24 25 28 32 35–38 48 49 52–54 60 62 64 67 69 only 8/36 (22%) observational studies had one.27 29 30 42 46 51 70 71
The general population, a meaningful benchmark in studies with work as an outcome, was used as a comparator solely in five observational studies.27 29 30 46 70
Regarding SLRs in other chronic diseases, similar issues were reported for all the topics of study design, with the most common flaw being the absence of a sample size calculation for work as an outcome, as reported in 21/24 (88%) SLRs.97–114 118–120
Work outcome domains and instruments
The methodological choices regarding the work outcome domains and instruments are presented in table 2. Among studies in IA, the definition of ‘work status’ was described in most studies (71%)21 23 27 30 40 41 46 50 53 57 62 70 and definitions showed large heterogeneity. Sick leave was defined in all studies assessing it,16 18–22 24–27 29 31–39 42–47 49–56 58–60 63 65–69 71–73 and all but one reported the definition of presenteeism.30
SLRs in other chronic diseases reported high variability in the definition of all WP outcomes in included studies precluding meta-analysis.97–99 102 105–107 109 111 113 114 117 120 In contrast, the majority of studies in IA assessed sick leave and presenteeism using validated instruments—91% and 88% of studies, respectively. The Work Productivity and Activity Impairment (WPAI) questionnaire was the outcome measurement instrument most frequently used (n=29).18–21 24–26 31–34 36–39 42 43 50–53 55 56 58 63 67 69 71 73
Overall, the work outcome domains’ attribution (to overall health, arthritis or no attribution) was heterogeneous across studies, with sick leave being the domain most frequently assessed in relation to overall health (23/46 (50%) studies).16 18 20 22 25 27 29 31 33 34 38 42 44–47 49 51 55 58 65 67 72
Reviews in other chronic diseases pointed out inconsistencies of the recall period (varying from 7 days to 7 years).102 103 115 On the contrary, in IA, the recall period of sick leave (excluding registries since recall is not applicable) was accurate8 13 (ie, ≤3 months—figure 1) in 34/37 (92%) studies,18–21 24–26 31–39 42 43 50–56 58–60 63 67–69 71 73 and the recall of presenteeism was reliable and in line with the face validity for patients14 (ie, between 7 days and 4 weeks—figure 1) in 34/40 studies (85%).18–21 24–26 31–39 42 43 50–56 58 60 62 63 67–69 71 73
Regarding the methodological choices in the area of data analysis (table 3), only 10/52 (19%) IA studies reported skewness of sick leave and/or presenteeism and accounted for the skewness in the analyses.20 22 45–47 50 58 59 63 65
Also, only 8/22 (36%) RCTs19 25 36 37 49 52 67 69 and 17/36 (47%) observational studies16 20 23 27 41 44 46 50 56–58 61 63 65 66 70 72 took contextual factors into account, most frequently demographic factors, such as age and gender, while other specific work-related contextual factors (eg, nature of work and workplace support) were less frequently accounted for.27 41 50 58 63 70 SLRs in other chronic diseases reported that adjustment for contextual factors/confounders in the included studies, if any, was performed for very few factors.101 105 107 109 110 112–117 120
The majority of studies in IA (n=49/52, 94%) took interdependence between work outcomes into account acknowledging that (1) data (over time) on sick leave are less meaningful without information on the proportion of persons employed (over time) in that specific population (sick leave cannot happen if the person is not employed) and/or (2) assessing presenteeism is less meaningful if information on sick leave is not provided (eg, presenteeism cannot happen on days a person is absent due to sick leave).18–22 24–27 29 31–39 42–47 50–56 58–60 62–69 71 73
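To make this interdependence concrete, the sketch below scores the three work-related WPAI outcomes from one respondent's answers, following the commonly published WPAI scoring rules (absenteeism as hours missed over scheduled hours, presenteeism as the 0 to 10 impairment rating divided by 10, and overall work impairment combining the two). The function and argument names are hypothetical, the handling of respondents who did not work at all is a choice made for this example, and the official scoring guide should be consulted before any real use.

```python
def wpai_scores(hours_missed_health, hours_worked, impairment_0_to_10):
    """Score WPAI absenteeism, presenteeism and overall work impairment as fractions."""
    if hours_missed_health < 0 or hours_worked < 0:
        raise ValueError("hours must be non-negative")
    if not 0 <= impairment_0_to_10 <= 10:
        raise ValueError("impairment rating must be between 0 and 10")
    scheduled = hours_missed_health + hours_worked
    if scheduled == 0:
        raise ValueError("no scheduled working hours in the recall period")

    absenteeism = hours_missed_health / scheduled
    if hours_worked == 0:
        # Presenteeism cannot occur when no hours were worked; in this sketch
        # presenteeism and the combined score are therefore left undefined.
        return {"absenteeism": absenteeism, "presenteeism": None,
                "overall_work_impairment": None}

    presenteeism = impairment_0_to_10 / 10
    # Presenteeism only applies to the share of time actually worked.
    overall = absenteeism + (1 - absenteeism) * presenteeism
    return {"absenteeism": absenteeism, "presenteeism": presenteeism,
            "overall_work_impairment": overall}


# Hypothetical respondent: 10 hours missed, 30 hours worked, impairment rated 4/10.
scores = wpai_scores(10, 30, 4)
```

The overall score weights presenteeism by the fraction of time worked, which is why reporting presenteeism without the accompanying sick-leave data loses information.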
Reporting of results
The methodological choices in IA studies as well as the issues raised in SLRs in other chronic diseases regarding the area of reporting are described in table 4.
The reporting of loss to follow-up and of the work-related reasons for drop-out was often neglected in IA studies, being reported in only one study.50 In other chronic diseases, this was also inconsistently reported.112 118
All IA studies reported the size and characteristics of the (sub)groups analysed.16–73
In IA studies, the choice on how to report study findings was heterogeneous, with only 11/58 (19%) studies presenting both aggregated results (mean/median) and percentages according to meaningful thresholds.19 30 35 42 51 56 59 62 65 66 71 This was also outlined by the SLRs in other chronic diseases where the lack of patient-level data was a barrier to study pooling and meta-analysis.104
Data on natural volumes (days/hours) used to calculate costs were presented in the majority of the studies reporting productivity costs (21/24, 88%).16 74–76 78–82 84–92 94–96
WP has been a frequently assessed endpoint in IA studies over the past 5 years; however, these studies revealed a high methodological heterogeneity and a number of important flaws. Several issues were detected in the areas of study design, work outcome definition and assessment, as well as in the analysis and reporting of the results. Review of SLRs in other chronic diseases revealed that observed methodological issues are not rheumatology specific as these are also common in studies of work outcomes in other clinical fields.
Different WP outcomes of interest apply to specific subpopulations (eg, employed/employable people) and need to be assessed in a sufficiently large group over a certain timeframe.4 Nevertheless, this was often neglected, particularly when WP was not the primary outcome, as occurred in the majority of RCTs. Thus, the studied population, the intermediate assessment time-points and overall follow-up time were tailored to the primary outcomes, hampering the power to detect statistically significant effects on WP outcomes and leading to follow-up times not adequate for some of the WP outcomes of interest. Moreover, even in RCTs with long-term extensions, WP outcome domains were not assessed across the extension study period as other outcomes were. This poses particular challenges in studies aiming to understand the impact of an intervention on long-term employment, work disability or prolonged sick leave (eg, assessing costs of productivity loss), as a time horizon of 6 months is not adequate.8 Remarkably, studies with WP as the primary outcome also had important flaws in this area; for example, the sample size calculation was often not reported.
Careful choice of which WP outcome to assess and which measurement instrument to use is of paramount importance, particularly when dealing with a comparison of interventions.8 9 As far as the definition of employment and work disability is concerned, clinical studies might want to align with definitions that are relevant for their administrative entities (eg, countries, regions, states, etc),8 thus likely contributing to heterogeneity in work status definitions as found in IA studies. In contrast, presenteeism and sick leave were often described in line with the frequent use of validated instruments (eg, WPAI) that include an appropriate definition for the work outcome domain.8 In this regard, stakeholders should strive to harmonise worldwide comparable and locally applicable definitions along with endorsing specific outcome measurement instruments, for example, as OMERACT is doing for presenteeism.8 9 Two other important methodological aspects, namely, disease attribution and recall, are relevant but not (yet) encompassed by the OMERACT framework. Regarding disease attribution, only half of the studies assessed the WP outcome domain in relation to overall health (more meaningful for benchmarking with the general population). This may be problematic since it is well established that patients have difficulties in distinguishing which restrictions can be attributable to IA, other specific health problems (eg, osteoarthritis) or overall health.5 Inconsistency of the recall period was often reported in studies of other chronic diseases, but was less evident in IA studies. This is likely due to the widespread use of validated instruments such as WPAI (past 7 days recall) and the Work Productivity Survey (WPS; past month recall) in the field of rheumatology.
WP, as any outcome, is subject to the effect of a number of variables, related to either the disease, the social environment or other aspects, which need to be considered in order to reliably assess the net change of the outcome. Contextual factors, defined by OMERACT, from a statistical viewpoint, as a “variable that is not an outcome of the study but needs to be recognized (and measured) to understand the study results”, include potential confounders and effect modifiers (https://omeract.org/handbook-resources/). The characterisation of core contextual factors (ie, when do they really matter to influence practice) remains a challenge, partially because the influence of most contextual factors tends to vary according to the setting.8 The International Classification of Functioning, Disability and Health (ICF) provided, in addition to the bio-psycho-social framework, a classification distinguishing personal and environmental factors, and this was the basis for a further grouping of contextual factors relevant for WP by the OMERACT work productivity group.10 Lack of accounting for contextual factors was common in IA and was also often reported by SLRs in other chronic diseases. Work-related contextual factors such as job type, adaptations at work and more personal aspects such as ability to cope and satisfaction were often neglected. This emphasises the urgent need for action to improve and implement feasible strategies to account for relevant work-related contextual factors.
Other methodological issues pertain to how data are analysed and reported. WP presents a continuum of subdomains which are (hierarchically) dependent on each other and/or can compete over time.5 The majority of studies assessing sick leave and presenteeism took interdependence between work outcomes into account, encompassing the widespread use of the WPAI, which already considers interdependence of sick leave and presenteeism (overall work impairment). SLRs in other chronic diseases reported that despite using the correct instrument (eg, WPAI), the studies frequently neglected some important subdomains.119 Indeed, to account for interdependence, WPAI must be comprehensively used, that is, assessing both presenteeism and sick leave plus the overall work impairment. Yet, consensus is needed on how to deal with such dependencies when instruments other than WPAI are used. It is known that distribution of presenteeism, and especially sick leave, may often be highly skewed (even zero inflated).6 7 Not accounting for this, as we observed in the majority of studies, may affect the robustness of conclusions.
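A quick descriptive check can reveal whether a sick-leave or presenteeism variable is skewed or zero inflated before a model is chosen. The sketch below is illustrative only: the data are invented, the function name is hypothetical, and the skewness measure used (the population Fisher-Pearson coefficient) is one common choice rather than a method prescribed by the review.

```python
from statistics import mean, median


def describe_skewed(values):
    """Summarise a possibly zero-inflated outcome such as sick-leave days."""
    n = len(values)
    m = mean(values)
    # Second and third central moments for the Fisher-Pearson coefficient g1.
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {
        "n": n,
        "mean": m,
        "median": median(values),
        "proportion_zero": sum(1 for v in values if v == 0) / n,
        "skewness": skewness,
    }


# Hypothetical sick-leave days over 3 months for ten employed patients.
sick_leave_days = [0, 0, 0, 0, 0, 0, 1, 2, 5, 42]
summary = describe_skewed(sick_leave_days)
```

In this invented sample most patients report no sick leave while one reports 42 days, so the mean (5 days) is far from the median (0 days). In such a case a mean-based comparison alone would be misleading, and approaches designed for skewed or zero-inflated data deserve consideration.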
Furthermore, drop-out may be related to underlying work context and thus not be at random, so the rates and reasons for drop-out should be carefully considered to ensure a correct interpretation of the impact of IA on WP outcomes over time. However, these were not reported in the majority of studies. Likewise, to enhance the insight into WP outcomes and to ensure more transparent interpretation of the differences between interventions, it is advisable to report the mean and median values of sick leave or presenteeism and also the proportion of patients attaining a specific meaningful (change in) outcome.8 In IA studies, the choice on how to report data on work outcome domains was heterogeneous, with only 19% of studies presenting both aggregated results and percentages according to meaningful thresholds. Choice of thresholds was not uniform across studies, highlighting the need for consensus in this respect.
This review has some limitations. Although we used a sensitive approach to identify studies with WP as an outcome domain in IA as well as SLRs in other chronic diseases, we cannot be sure that some relevant studies were not missed. While retrieving data from SLRs in other chronic diseases, only the reported issues were collected, as going through the primary studies was beyond the scope. This may have resulted in missing some relevant methodological aspects not captured by the SLR authors. The exclusion of studies published before 2014 for feasibility reasons implies that our summary is generalisable only to issues found in recent studies.
In conclusion, a high methodological heterogeneity and important flaws were detected among the included studies in the main areas of study design, work outcome definition and assessment, analysis and reporting of results. This SLR highlights the need to implement minimum quality standards around these key methodological aspects to homogenise and improve the quality of future studies in IA and likely in other chronic diseases. This review informs the EULAR Points to Consider for conducting, analysing and reporting studies with work as an outcome domain in IA.
The authors thank all EULAR task force members working on ‘points to consider when designing, analysing and reporting studies with WP as an outcome domain among patients with IA’ for defining the main focus of the literature search. The work on this manuscript was previously accepted as a conference abstract to the EULAR Congress 2020 and published in the corresponding supplement of Annals of the Rheumatic Diseases.
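The dual reporting recommended above can be sketched as follows. The example assumes per-patient change scores in which negative values mean improvement, and it uses a threshold of 20 points purely as a placeholder; no validated threshold is implied, and the function name and data are hypothetical.

```python
from statistics import mean, median


def report_change(changes, threshold):
    """Report aggregated change plus the proportion improving by at least `threshold`.

    `changes` holds per-patient change scores where negative values mean improvement.
    """
    if not changes:
        raise ValueError("at least one change score is required")
    n_reaching = sum(1 for c in changes if c <= -threshold)
    return {
        "n": len(changes),
        "mean_change": mean(changes),
        "median_change": median(changes),
        "n_reaching_threshold": n_reaching,
        "proportion_reaching_threshold": n_reaching / len(changes),
    }


# Hypothetical change in presenteeism (percentage points) for eight patients.
presenteeism_change = [-40, -30, -25, -10, 0, 0, 5, 10]
result = report_change(presenteeism_change, threshold=20)
```

Reporting both views shows, in this invented sample, a modest average improvement alongside the fact that only three of eight patients reached the placeholder threshold.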
MLM and AA contributed equally.
SR and PP contributed equally.
Contributors All coauthors contributed to the development of the study design and outline. LF has developed and run the library searches. MLM and MMtW screened all titles and abstracts and reviewed the full texts for inclusion. MLM and AA retrieved data using standardised data extraction sheets. MLM, AA, SR, PP and AB have analysed and synthesised the data. MLM and AA have drafted the first version of the manuscript, and all authors have critically reviewed and agreed with the final version of the manuscript.
Funding This study is part of the EULAR ‘points to consider when designing, analysing and reporting studies with WP as an outcome domain among patients with IA’ funded by EULAR, grant number EPI021.
Competing interests None declared.
Patient consent for publication Not required.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement All data relevant to the study are included in the article or uploaded as online supplemental information.