Original article
Imputing missing data of function and disease activity in rheumatoid arthritis registers: what is the best technique?
Denis Mongin (1), Kim Lauper (1), Carl Turesson (2,3), Merete Lund Hetland (4,5), Eirik Klami Kristianslund (6), Tore K Kvien (6), Maria Jose Santos (7), Karel Pavelka (8), Florenzo Iannone (9), Axel Finckh (1) and Delphine Sophie Courvoisier (1)
  1. Division of Rheumatology, Geneva University Hospitals, Geneva, Switzerland
  2. Department of Internal Medicine, Lund University, Lund, Sweden
  3. Department of Rheumatology, Skåne University Hospital, Malmö, Sweden
  4. Centre for Rheumatology and Spine Diseases, Rigshospitalet Glostrup, Glostrup, Denmark
  5. Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
  6. Department of Rheumatology, Diakonhjemmet Hospital, Oslo, Norway
  7. Department of Rheumatology, Hospital Garcia de Orta, Almada, Portugal
  8. Institute of Rheumatology and Clinic of Rheumatology, Charles University, Prague, Czech Republic
  9. Department of Emergency and Transplantation, Rheumatology Unit, GISEA, University Hospital of Bari, Bari, Italy
Correspondence to Dr Denis Mongin; denis.mongin{at}


Objective To compare several methods of missing data imputation for function (Health Assessment Questionnaire) and for disease activity (Disease Activity Score-28 and Clinical Disease Activity Index) in rheumatoid arthritis (RA) patients.

Methods One thousand RA patients from observational cohort studies with complete data for function and disease activity at baseline, 6, 12 and 24 months were selected to conduct a simulation study. Values were deleted at random or following a predicted attrition bias. Three types of imputation were performed: (1) methods imputing forward in time (last observation carried forward; linear forward extrapolation); (2) methods considering data both forward and backward in time (nearest available observation—NAO; linear extrapolation; polynomial extrapolation); and (3) methods using multi-individual models (linear mixed effects cubic regression—LME3; multiple imputation by chained equation—MICE). The performance of each estimation method was assessed using the difference between the mean outcome value, the remission and low disease activity rates after imputation of the missing values and the true value.
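To make the difference between the first two families of methods concrete, the following is a minimal sketch, not the authors' implementation, of last observation carried forward (LOCF) and nearest available observation (NAO) on a single patient trajectory, together with the bias measure described above (mean after imputation minus true mean). The function names, the example DAS28 values, the tie-breaking rule in NAO and the remission cut-off of DAS28 < 2.6 are assumptions introduced for illustration; none of them are specified in this abstract.

```python
from statistics import fmean

# Visit times in months, matching the study design (baseline, 6, 12, 24).
VISIT_MONTHS = [0, 6, 12, 24]


def locf(values):
    """Last observation carried forward.

    Each missing value (None) is replaced by the most recent earlier
    observation. A missing baseline stays missing, because there is
    nothing earlier to carry forward.
    """
    imputed, last = [], None
    for v in values:
        if v is not None:
            last = v
        imputed.append(last)
    return imputed


def nao(values, times):
    """Nearest available observation.

    Each missing value is replaced by the observed value closest in time,
    looking both backward and forward. Ties go to the earlier visit
    (an assumption made for this sketch).
    """
    observed = [(t, v) for t, v in zip(times, values) if v is not None]
    imputed = []
    for t, v in zip(times, values):
        if v is not None or not observed:
            imputed.append(v)
        else:
            nearest = min(observed, key=lambda tv: (abs(tv[0] - t), tv[0]))
            imputed.append(nearest[1])
    return imputed


def mean_bias(imputed, truth):
    """Mean of the imputed values minus mean of the true values."""
    return fmean(imputed) - fmean(truth)


def remission_rate(values, threshold=2.6):
    """Share of values below the remission cut-off (assumed DAS28 < 2.6)."""
    return sum(v < threshold for v in values) / len(values)


# Hypothetical DAS28 trajectory with the baseline value deleted.
missing_baseline = [None, 3.8, 3.0, 2.4]
locf_result = locf(missing_baseline)               # baseline remains None
nao_result = nao(missing_baseline, VISIT_MONTHS)   # baseline taken from month 6
```

In this sketch a forward-only method cannot fill a missing baseline, whereas NAO borrows the 6-month value. For a patient who improves over time, that borrowed value is lower than the true baseline, which is one way a baseline mean could end up underestimated.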

Results When imputing missing baseline values, all methods underestimated the true value to the same extent, but LME3 and MICE correctly estimated remission and low disease activity rates. When imputing missing follow-up values at 6, 12 or 24 months, NAO provided the least biased estimate of the mean disease activity and corresponding remission rate. These results were not affected by the presence of attrition bias.

Conclusion When imputing function and disease activity in large registers of active RA patients, researchers can consider the use of a simple method such as NAO for missing follow-up data, and the use of mixed-effects regression or multiple imputation for baseline data.

  • DAS28
  • rheumatoid arthritis
  • epidemiology
  • outcomes research
  • disease activity

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


  • Twitter @k_lauper, @delcourvoisier

  • Contributors DM and DC designed the analyses. DM performed the simulations and analysed the data. DM, KL, DC and AF drafted the manuscript. All authors critically appraised and approved the final version of the manuscript.

  • Funding The PANABA collaboration was funded by Bristol Myers-Squibb. Data on Czech patients from the ATTRA registry were obtained with the support of Ministry of Health grant 00023728.

  • Competing interests DM: none declared for this work. KL: none declared for this work. KP: received honoraria for lectures: AbbVie, Roche, MSD, UCB, Pfizer, Amgen, Egis, BMS. CT: none declared for this work. MJS: none declared for this work. EKK: none declared for this work. AF: none declared for this work. DC: none declared for this work.

  • Patient consent for publication Not required.

  • Ethics approval The collection of clinical data in each register was approved by the corresponding local ethical committee.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data may be obtained from a third party and are not publicly available.
