Original research
Four trajectories of 24-hour urine protein levels in real-world lupus nephritis cohorts
  1. Danting Zhang,
  2. Fangfang Sun,
  3. Jie Chen,
  4. Huihua Ding,
  5. Xiaodong Wang,
  6. Nan Shen,
  7. Ting Li and
  8. Shuang Ye
  1. Department of Rheumatology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  1. Correspondence to Dr Nan Shen; nanshensibs{at}; Dr Ting Li; leeting007{at}; Dr Shuang Ye; ye_shuang2000{at}


Objectives The 24-hour urine protein (24hUP) level is a key measurement in the management of lupus nephritis (LN); however, trajectories of 24hUP in LN are poorly defined.

Methods Two LN cohorts that underwent renal biopsies at Renji Hospital were included. Patients received standard of care in a real-world setting and 24hUP data were collected over time. Trajectory patterns of 24hUP were determined using latent class mixed modelling (LCMM). Baseline characteristics were compared among trajectories and multinomial logistic regression was used to determine independent risk factors. Optimal combinations of variables were identified for model construction and user-friendly nomograms were developed.

Results The derivation cohort comprised 194 patients with LN with 1479 study visits and a median follow-up of 17.5 (12.2–21.7) months. Four trajectories of 24hUP were identified, that is, Rapid Responders, Good Responders, Suboptimal Responders and Non-Responders, with KDIGO renal complete remission rates (time to complete remission, months) of 84.2% (4.19), 79.6% (7.94), 40.4% (not applicable) and 9.8% (not applicable), respectively (p<0.001). The ‘Rapid Responders’ were distinct from the other trajectories, and a nomogram composed of age, systemic lupus erythematosus duration, albumin and 24hUP yielded C-indices >0.85. Another nomogram to predict ‘Good Responders’, composed of gender, new-onset LN, glomerulosclerosis and partial remission within 6 months, yielded C-indices of 0.73–0.78. When applied to the validation cohort with 117 patients and 500 study visits, the nomograms effectively sorted out ‘Rapid Responders’ and ‘Good Responders’.

Conclusion Four trajectories of 24hUP shed light on the management of LN and on the design of future clinical trials.

  • Lupus Nephritis
  • Lupus Erythematosus, Systemic
  • Epidemiology

Data availability statement

Data are available upon reasonable request.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


  • Urine protein is strongly correlated with lupus nephritis (LN) prognosis and serves as a major parameter reflecting treatment response.


  • Four trajectories of 24-hour urine protein in patients with LN in a real-world setting were revealed. User-friendly nomograms were developed and validated to predict specific trajectories.


  • Identification of different disease behaviours might improve the management algorithm of LN.


Lupus nephritis (LN) is one of the most common and severe manifestations of systemic lupus erythematosus (SLE). Over half of individuals with SLE will develop LN, and approximately 10% will progress to end-stage renal disease (ESRD) within 5 years; the proportion with ESRD can reach 30% within 15 years.1 2 Despite advances in management, LN continues to carry high morbidity and mortality.

The change in 24-hour urine protein (24hUP) level is critical in LN management. A rapid decrease in proteinuria is the strongest early indicator of a favourable long-term renal outcome in LN.3–8 For instance, patients whose 24hUP decreased to less than 1 g/day in the early months without an increase in serum creatinine (sCr) from baseline had a better 10-year renal prognosis.6 Similarly, a reduction in urine protein of greater than 50% at 6 months is associated with better 15-year renal survival.7 Trialists have attempted to establish an appropriate cut-off value for predicting long-term renal outcome using data from the Euro-Lupus Nephritis cohort9 and the MAINTAIN Nephritis Trial,10 arriving at cut-offs of <0.8 g/day and 0.7 g/day at 1 year, respectively. A urine protein–creatinine ratio (uPCR) target below 0.5–0.7 g/g has been adopted as the treatment goal for LN in the European League Against Rheumatism and European Renal Association-European Dialysis and Transplant Association (EULAR/ERA–EDTA) recommendations,11 whereas the KDIGO guideline recommends a more stringent endpoint (<0.5 g/g).12

Given the importance of treatment response at 1 year for LN, prediction models have been extensively investigated. However, even with advanced urine biomarkers and machine learning analyses, the predictive accuracy of the latest models is only moderate, with areas under the curve (AUCs) below 0.80.13 14 A shortcoming of the outcome measure used in these studies is that participants who achieved a renal response once but relapsed within 1 year are misclassified as responders, which biases the prediction. Given the heterogeneity and complexity of urine protein kinetics in LN, studies using more pertinent methodology are needed. Latent class mixed modelling (LCMM), a novel method for identifying homogeneous subgroups in longitudinal data (trajectories), has been widely employed in psychiatry,15 16 nephrology17 18 and neurology19 20 to analyse illness development and response to therapy. It has also been used in rheumatic disorders such as rheumatoid arthritis,21 22 gout23 and SLE.24 However, LCMM has not been used to uncover trajectories of urine protein in LN.

The purpose of this study was to determine 24hUP trajectories over time in patients with LN under standard of care (SoC) in the real-world setting.


Patients and data collection

Between January 2013 and January 2021, patients admitted to the Department of Rheumatology at Renji Hospital South Campus for renal biopsy were included as the derivation cohort. An independent validation cohort, enrolled at Renji Hospital West Campus between January 2017 and August 2020, was also established. All patients fulfilled the 1997 American College of Rheumatology revised classification criteria for SLE25 and had renal biopsy-confirmed LN. Patients with less than 6 months of follow-up were excluded, including those who died, progressed rapidly to ESRD or were lost to follow-up within 6 months.

At baseline, the following data were collected: demographic information, medical history, laboratory tests and pathological parameters derived from kidney biopsy. Urine protein measurements from baseline to the last visit, up to 2 years, were subjected to trajectory analysis. Urine protein levels were recorded as 24hUP as standard, or using spot uPCR as a surrogate. Estimated glomerular filtration rate (eGFR) was calculated with the 2021 CKD-EPI Creatinine Equation. The eGFR slope was calculated using linear regression across the eGFR values over the 2 years of follow-up. To avoid acute effects, the first 90 days after biopsy were excluded from this calculation.26
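The eGFR calculation and slope described above can be sketched as follows. This is an illustrative Python sketch, not the authors' R code: the function names are hypothetical, serum creatinine is assumed to be in mg/dL, and the handling of the 90-day window (visits on or before day 90 are dropped) is an assumption about how the exclusion was applied.

```python
def egfr_ckd_epi_2021(scr_mg_dl: float, age_years: float, female: bool) -> float:
    """2021 CKD-EPI creatinine equation (race-free), in mL/min/1.73 m2."""
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = scr_mg_dl / kappa
    egfr = (142.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.200
            * 0.9938 ** age_years)
    return egfr * 1.012 if female else egfr


def egfr_slope_per_year(days, egfr_values, skip_days=90):
    """Ordinary least-squares slope of eGFR against time, in units per year.

    Visits within `skip_days` of biopsy are ignored (assumed rule).
    Returns None when fewer than two usable visits remain.
    """
    points = [(d / 365.25, e) for d, e in zip(days, egfr_values) if d > skip_days]
    if len(points) < 2:
        return None
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_e = sum(e for _, e in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    if sxx == 0:
        return None  # all usable visits on the same day
    sxy = sum((t - mean_t) * (e - mean_e) for t, e in points)
    return sxy / sxx
```

A positive slope indicates improving eGFR over follow-up; a patient with only one visit after day 90 yields no slope under this sketch.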


KDIGO definitions of complete and partial remission were applied, that is, complete remission (CR): proteinuria ≤0.5 g/day and eGFR >90 mL/min/1.73 m2; partial remission (PR): a ≥50% reduction in proteinuria and proteinuria <3 g/day for nephrotic range (24hUP >3.5 g/day) with stabilisation of sCr (±25%) or improvement of sCr but not to normal range.12 New-onset LN was defined as LN duration less than 3 months before enrollment/renal biopsy.27 ESRD is a diagnosis determined by the clinician initiating long-term renal replacement therapy.
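The remission definitions above can be read as a simple decision rule. The Python sketch below is one possible reading, with several assumptions: the function name and the "NR" label are hypothetical, the <3 g/day requirement is applied only when baseline 24hUP was in the nephrotic range (>3.5 g/day), and "stabilisation or improvement of sCr" is taken to mean that current sCr does not exceed 125% of baseline.

```python
def classify_response(baseline_up, current_up, egfr, baseline_scr, current_scr):
    """Classify a visit as 'CR', 'PR' or 'NR' (no response).

    baseline_up / current_up: proteinuria in g/day
    egfr: mL/min/1.73 m2
    baseline_scr / current_scr: serum creatinine, same units as each other
    """
    # Complete remission: proteinuria <=0.5 g/day and eGFR >90
    if current_up <= 0.5 and egfr > 90:
        return "CR"

    # Partial remission: >=50% reduction in proteinuria ...
    halved = current_up <= 0.5 * baseline_up
    # ... and <3 g/day if baseline was nephrotic range (assumed reading)
    below_target = baseline_up <= 3.5 or current_up < 3.0
    # ... with sCr stable (within +25%) or improved (assumed reading)
    scr_ok = current_scr <= 1.25 * baseline_scr

    if halved and below_target and scr_ok:
        return "PR"
    return "NR"
```

Under this reading, a patient whose 24hUP falls from 6.5 to 3.2 g/day has halved proteinuria but is not in PR, because the nephrotic-range target of <3 g/day is not met.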

Statistical analyses

The trajectories of urine protein were investigated using latent class mixture models (LCMMs). With 24hUP as the dependent variable, mixed effects models with random intercepts were fitted using the ‘lcmm’ R package.28 We constructed and interpreted latent class trajectory models using the eight-step methodology proposed by Lennon et al.29 We began with a scoping step, fitting three models and examining their residual profiles. Mixed-effect models with K=1–10 latent classes were then fitted using LCMM, and the optimal number of classes was determined by the Bayesian information criterion. Candidate models were then refined with respect to three settings (normalised or unnormalised 24hUP, spline, time). Model adequacy was measured with the average posterior probability of assignment and the odds of correct classification. Further details are available in the online supplemental materials.

Patients’ baseline characteristics were then compared between latent classes. Quantitative data were compared by non-parametric tests. Categorical data were compared by Pearson’s χ2 test with Yates’ continuity correction or Fisher’s exact test, as appropriate. For comparison between multiple groups, the Kruskal-Wallis rank-sum test was used and the Bonferroni method was applied to correct for multiple comparisons. Missing data were addressed by multiple imputation for regression. Multinomial logistic regression analysis was carried out to identify independent predictors for more than two groups, which were subjected to further model construction. The adjusted OR was calculated. Time to event (CR or first switch of immunosuppressant (IS) as induction) was considered not evaluable when fewer than 50% of individuals in a group experienced the event.

For model construction, patients in the derivation cohort were randomly divided into training and testing sets (sample size, training set:testing set=7:3). An all-subset regression was applied in the training set using the R package ‘leaps’ to identify the best combination of factors for a logistic regression model. A two-sided DeLong test was performed to compare the C-indices of models. The nomogram constructed from the derivation cohort was used to identify the specific trajectory in the validation cohort. The performance of the nomograms was evaluated and compared between the two cohorts. P<0.05 was considered statistically significant. Data analysis was carried out using the R programming language (V.3.6.3). All code is accessible on GitHub (GitHub, San Francisco, California, USA).
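To make the model selection and adequacy measures concrete, the following Python sketch computes the Bayesian information criterion, the average posterior probability of assignment (APPA) and the odds of correct classification (OCC) from a matrix of posterior class probabilities. It is an illustration under stated assumptions rather than the authors' procedure: the function names are hypothetical, the class proportion used in the OCC is taken as the proportion of patients assigned to each class, and the sample size entering the BIC is left to the caller.

```python
import math


def bic(log_likelihood, n_params, n):
    """Bayesian information criterion; lower values indicate a better trade-off."""
    return -2.0 * log_likelihood + n_params * math.log(n)


def class_adequacy(posteriors):
    """Return {class_index: (APPA, OCC)} from per-patient posterior probabilities.

    posteriors: list of rows, one per patient, each row holding the
    posterior probability of membership in every latent class.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    # Each patient is assigned to the class with the highest posterior
    assigned = [max(range(k), key=lambda c: row[c]) for row in posteriors]
    results = {}
    for c in range(k):
        members = [row[c] for row, a in zip(posteriors, assigned) if a == c]
        if not members:
            continue  # empty class: nothing to summarise
        appa = sum(members) / len(members)
        prior = len(members) / n  # assigned proportion (assumption)
        if appa >= 1.0 or prior >= 1.0:
            occ = math.inf
        else:
            occ = (appa / (1.0 - appa)) / (prior / (1.0 - prior))
        results[c] = (appa, occ)
    return results
```

Commonly quoted rules of thumb treat an APPA above 0.7 and an OCC above 5 for every class as signs of adequate separation; whether the authors applied these exact thresholds is not stated here.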
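The 7:3 split and the C-index used to compare models can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R workflow: the function names and the fixed seed are hypothetical. For a binary outcome the C-index equals the area under the ROC curve, that is, the proportion of case/non-case pairs in which the case receives the higher predicted score, with ties counted as one half.

```python
import random


def split_7_3(ids, seed=0):
    """Randomly split identifiers into training (70%) and testing (30%) sets."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def c_index(scores, labels):
    """Concordance index for a binary outcome (labels are 0 or 1)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(pos) * len(neg))
```

Applied to 194 patients, `split_7_3` returns 136 training and 58 testing identifiers. Comparing two C-indices formally requires a test such as DeLong's, which is not implemented in this sketch.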


Patient characteristics

Overall, 255 patients with LN underwent renal biopsy at the derivation campus. Of these, 48 patients (18.8%) without eligible follow-up of more than 6 months, along with 3 who rapidly progressed to ESRD and 10 who died, were excluded. The final derivation data set comprised 194 patients with LN (83.5% women) with 1479 visits and a median follow-up for urine protein of 17.5 (12.2–21.7) months. Patients’ age at enrollment was 35.0±12.0 years. According to the International Society of Nephrology (ISN)/Renal Pathology Society (RPS) LN pathology classification,30 3.1%, 17.5%, 28.9%, 35.1% and 15.5% of patients had class I/II, III, IV, mixed (III+V or IV+V) and V, respectively. The patients were under SoC, as evidenced by the majority (>85%) receiving either mycophenolate mofetil (MMF) or cyclophosphamide (CYC) as first-line induction therapy. In addition, hydroxychloroquine was prescribed to 92.3% of patients and an ACE inhibitor or angiotensin receptor antagonist to 84%. During follow-up, the KDIGO CR rate was 56.7%; four individuals died and four developed ESRD.

Four trajectories of 24hUP levels in LN revealed by LCMM

A four-group cubic spline model with good adequacy was chosen (online supplemental materials). Patients with LN in the four groups showed different response patterns (figure 1): the 24hUP of most patients in cluster 1 (n=57) rapidly reached 0.5 g/day; the 24hUP of cluster 2 patients (n=49) declined more slowly than in cluster 1, but the majority reached 0.5 g/day; by comparison, the 24hUP of most patients in cluster 3 (n=47) and cluster 4 (n=41) did not reach 0.5 g/day. The CR rates (time to CR, months) were 84.2% (4.19) in cluster 1, 79.6% (7.94) in cluster 2, 40.4% (not applicable) in cluster 3 and 9.8% (not applicable) in cluster 4. Therefore, according to their clinical behaviours, patients were labelled as ‘Rapid Responders’ (cluster 1), ‘Good Responders’ (cluster 2), ‘Suboptimal Responders’ (cluster 3) and ‘Non-Responders’ (cluster 4); clusters 3 and 4 together are referred to as ‘inadequate responders’.

Figure 1

Four trajectories of 24-hour urine protein levels in real-world lupus nephritis cohort. (A) Trajectory plots with average and 95% predictive intervals of 24hUP levels for each cluster. The x axis represents time, and the y axis represents the level of normalised urine protein converted by ‘lcmm’ package. (B) Individual level ‘spaghetti plots’ showed changes of original urine protein levels (uPCR as a surrogate <5%) of each patient in all clusters and in Cluster 1 (C), Cluster 2 (D), Cluster 3 (E) and Cluster 4 (F), defined as ‘Rapid Responders’, ‘Good Responders’, ‘Suboptimal Responders’ and ‘Non-Responders’. uPCR, urine protein–creatinine ratio; 24hUP, 24-hour urine protein.

Distinct baseline clinical features and treatment exposures in four latent clusters

Among four trajectories, ‘Rapid Responders’ had relatively milder disease (table 1, online supplemental table S4). They were older and had lower baseline 24hUP (1.65±1.24 g/day, p<0.001). They exhibited less class IV and mixed type LN (24.6% and 19.3%, p=0.002), as well as the lowest activity index (AI) in biopsy. Tubular atrophy was likewise the least observed. Fewer individuals had acute kidney injury (AKI) or hypertension. They were exposed to a lower cumulative number of IS with a higher frequency of MMF exposure (50.9%, p=0.001) and a lower initiating dosage of prednisone as induction (62.63±39.74 mg, p=0.004).

Table 1

Characteristics of patients in the ‘Rapid Responders’, ‘Good Responders’, ‘Suboptimal Responders’ or ‘Non-Responders’ latent classes in the derivation cohort

For ‘Good Responders’, the CR rate was similar to that of ‘Rapid Responders’ but the time to CR was longer. In terms of pathology, ‘Good Responders’ had a higher prevalence of class IV (34.7%) and mixed LN (34.7%) with higher AI. Of the ‘Good Responders’, 65.3% were exposed to CYC, and the average initial prednisone dosage was 160.73 mg. Renal function was stable, with only four patients having an increase in sCr of more than 25% from baseline. The mean eGFR slope was 9.86 mL/min/1.73 m2/year.

‘Suboptimal Responders’ had the greatest percentage of mixed type LN (48.9%). Additionally, AKI was more frequently observed (44.4%, p=0.004), and baseline sCr and brain natriuretic peptide levels were higher. As for ‘Non-Responders’, 29.3% of patients had an elevation of sCr of more than 25% from baseline during follow-up. Notably, no specific pathological features were detected among ‘Non-Responders’, except that the urinary red blood cell count per high-power field was the lowest.

Moreover, switches of IS for induction therapy were increasingly documented through cluster 1 to cluster 4 during a similar follow-up time. The proportions of patients with more than one IS for induction were 13%, 29.2%, 48.9% and 77.5% in the four respective clusters. Serology markers including anti-double-stranded DNA antibody, complement levels or antiphospholipid antibodies were not helpful to distinguish clusters.

Multinomial logistic regression models for trajectory recognition

To determine independent risk factors for each cluster, the 20 statistically significant features identified above were entered into the multinomial logistic regression. Several significant determinants of membership in clusters 2, 3 and 4, in comparison with ‘Rapid Responders’ (cluster 1), were identified (table 2). Age, gender, cell crescents, glomerulosclerosis, tubular atrophy, tubulointerstitial sclerosis, baseline 24hUP and albumin were shared independent indicators for non-‘Rapid Responders’. Multinomial logistic regression models using ‘Non-Responders’ (cluster 4) as the reference group, on the other hand, were unsatisfactory in terms of group discriminative capability (online supplemental table S5).

Table 2

Multinomial logistic regression models of baseline variables associated with being in either the ‘Good Responders’, ‘Suboptimal Responders’ or ‘Non-Responders’ latent class compared with ‘Rapid Responders’ using data from the whole cohort

Nomograms developed for predicting ‘Rapid Responders’ and ‘Good Responders’

Since ‘Rapid Responders’ (cluster 1) distinguished itself from the other three clusters, a nomogram was developed to facilitate its identification in clinical practice. We merged clusters other than cluster 1 as the comparator (online supplemental table S6). Twenty clinical features were subjected to all-subset regression to determine models combining three, four or five predictors with the greatest adjusted R2 (online supplemental figure S3). To assess the robustness of three logistic models, C-indices were generated in both training and testing sets. Model 2 was superior to Model 1 in both sets in terms of C-indices (0.87 and 0.88, p=0.048 and 0.001, DeLong test, respectively), and comparable with Model 3 (p=0.31 and 0.78, DeLong test). Considering both the feasibility and discriminative ability, Model 2 combining four variables (age, SLE duration, albumin and baseline 24hUP) was selected to identify ‘Rapid Responders’ (online supplemental table S7), plotted with nomogram in figure 2A.
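A nomogram is a graphical form of a fitted logistic regression model, so its arithmetic can be sketched directly. The Python example below shows how predictor values map to nomogram points and to a predicted probability. It is an illustration only: the function names are hypothetical, and any coefficients passed in are supplied by the caller. The fitted coefficients of Model 2 are reported in the supplemental material and are not reproduced here.

```python
import math


def predicted_probability(intercept, coefs, values):
    """Logistic model: probability from the linear predictor."""
    lp = intercept + sum(coefs[name] * values[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-lp))


def nomogram_points(coefs, ranges, values):
    """Convert predictor values to nomogram points (0 to 100 per axis at most).

    coefs:  {name: regression coefficient}
    ranges: {name: (min, max)} plotted range of each predictor
    values: {name: observed value for one patient}
    The predictor whose range moves the linear predictor most spans 100 points.
    """
    spans = {n: abs(coefs[n]) * (ranges[n][1] - ranges[n][0]) for n in coefs}
    widest = max(spans.values())
    points = {}
    for name, beta in coefs.items():
        lo, hi = ranges[name]
        # Zero points at the end of the range least favourable to the outcome
        ref = lo if beta >= 0 else hi
        points[name] = 100.0 * beta * (values[name] - ref) / widest
    return points
```

Summing the per-predictor points gives the total score read off the nomogram; because the points are a linear rescaling of the linear predictor, ranking patients by total points is equivalent to ranking them by predicted probability.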

Figure 2

Nomograms for distinguishing ‘Rapid Responders’ (A) and ‘Good Responders’ (B) in the derivation cohort. Univariable areas under the curve and cut-off values of each continuous item in the training set were listed. LN, lupus nephritis; new-onset LN, duration <3 months; PR, partial remission; SLE, systemic lupus erythematosus; 24hUP, 24-hour urine protein.

To take a step forward after removal of ‘Rapid Responders’, the clinical characteristics of ‘Good Responders’ (cluster 2) were compared with those of ‘inadequate responders’ (clusters 3 and 4). Another round of all-subset regression was performed (online supplemental table S8). The resulting models had unstable predictive accuracy, with C-indices <0.79 in the training set and <0.67 in the testing set. Therefore, we included time to PR to enhance model performance. The cut-off value of time to PR for discrimination was 6 months (online supplemental figure S4). Finally, a nomogram to predict ‘Good Responders’, composed of four factors (female gender, new-onset LN, proportion of glomerulosclerosis in pathology and PR within 6 months), showed fair robustness with C-indices of 0.78 and 0.73 (figure 2B, online supplemental figure S5 and table S9).

Validation cohort supported the effectiveness of two nomograms

One hundred and seventeen patients (93.2% women) were enrolled as an independent validation cohort with 500 study visits and a median follow-up for urine protein of 18.00 (12.30–23.70) months. A comparison between the derivation and validation cohorts was shown in online supplemental table S10. In the validation cohort, there were fewer male patients, and the baseline eGFR and albumin were higher. Other parameters including age, SLE duration, LN class, baseline 24hUP and the CR rates were comparable.

The nomograms sorted out 28 (23.9%) ‘Rapid Responders’ and 19 (16.2%) ‘Good Responders’ in the validation cohort. Baseline characteristics of these two clusters were compared between the two cohorts (table 3). For ‘Rapid Responders’, the CR rates in the validation and derivation cohorts were 64.3% versus 84.2% (p=0.1), and the time to CR was 4.86±3.09 versus 4.19±3.45 months (p=0.171), respectively. Numerical differences can be appreciated, yet without statistical significance. For ‘Good Responders’, the CR rates (68.4% vs 79.6%, p=0.512) and time to CR (6.62±4.27 vs 7.94±5.36 months, p=0.347) were similar between the two cohorts. Spaghetti plots for the two subsets in the validation cohort are presented in online supplemental figures S6 and S7.

Table 3

Comparisons of clinical and response features between ‘Rapid Responders’/‘Good Responders’ in the derivation cohort and those identified by nomogram models in the validation cohort


In our renal biopsy-proven LN cohorts, four trajectories of urine protein changes over time were revealed. This is the first study to use a semi-supervised machine learning approach to delineate LN behaviour under SoC in a real-world setting. Clinical patterns such as ‘Rapid Responders’ and ‘Good Responders’ and their indicators were identified. A set of nomograms was developed and validated for prediction.

The four trajectories are of clinical significance as they were closely correlated with KDIGO-defined endpoints. First of all, they provide an important perspective for understanding the discrepancy in outcomes between real-world studies and clinical trials. Indeed, our real-world data displayed a much higher overall CR rate than the placebo arm (with SoC) in multiple clinical trials. This is likely because ‘Rapid Responders’ are a sizeable yet distinct group of patients who have largely been excluded from most recent trials. To be more specific, according to our ‘Rapid Responders’ nomogram, a baseline 24hUP exceeding 2.4 g/day had a high predictive value (94%) for non-‘Rapid Responders’ (AUC of 0.80, online supplemental table S7). The average levels of baseline urine protein in recent trials were well above this level, that is, BLISS-LN (2.9 g/day),31 NOBILITY (3.5 g/day),32 AURORA (3.8 g/day)33 and LUNAR (4.2 g/day)34 trials. Together with a time span between renal biopsy and trial recruitment of 6 months to 2 years, these ‘Rapid Responders’, and probably ‘Good Responders’ as well, were unlikely to be eligible to enter these trials. From another practical perspective, for new-onset patients with LN subject to clinical or trial assignment decision-making, evaluation for possible ‘Rapid Responders’ should be emphasised. Patients with a lower baseline urine protein, a higher serum albumin, older age and a shorter duration of SLE were more likely to be ‘Rapid Responders’ to SoC, and aggressive treatment might not be appropriate with regard to the risk/benefit or cost-effectiveness ratio.

Distinguishing ‘Good Responders’ (cluster 2) from those ‘inadequate responders’ (clusters 3 and 4) is also of great clinical interest; however, the initial attempt turned out to have limited predictive value with only baseline factors. Our data suggested that a 6-month follow-up under SoC provides both logical and practical ‘trial and error’ feedback. A sketch of ‘Good Responders’ was presented by the second nomogram: female with new-onset LN, low proportion of glomerulosclerosis on pathology and PR achieved within 6 months; these are clues to justify a reversible disease under SoC induction therapy.

Individual baseline parameters in line with ours have been reported previously for prediction of 1-year LN outcomes. Baseline uPCR/24hUP is a well-recognised factor in predicting 1-year renal response, although its predictive value alone is only moderate, with AUCs of 0.50–0.65.13 14 35 36 Age and gender have also been recognised as predictors of early renal response.14 36 Consistent with our result, a longer interval between the onset of LN and renal biopsy (baseline) was associated with a decreased likelihood of achieving CR thereafter.36 Glomerulosclerosis is a known risk factor for renal failure.37 It is noteworthy that our work helps to shape the global concept of LN management in an integrated fashion. We intended to build a nomogram-based provisional treatment algorithm (online supplemental figure S8), but more external validation is necessary to prove its reproducibility and generalisability. When patients with LN are encountered, the first step is to sort out the ‘Rapid Responders’ and ‘Good Responders’ to SoC within a 6-month time frame; for these patients, SoC is suggested. Managing the remaining ‘inadequate responders’ to SoC is still an open question. Our data suggest that attempts at switching conventional IS might not change the overall trajectory pattern. In line with current evidence,38 add-on options including anti-BLyS therapy (belimumab), a B cell-depleting agent (obinutuzumab) or a calcineurin inhibitor (voclosporin or tacrolimus) may be considered.

There are some limitations. First, the exclusion criteria filtered out patients with less than 6 months of follow-up data, including those who rapidly advanced to ESRD, died or were lost to follow-up, which introduced a certain bias and hampered analysis of the full spectrum of disease behaviour in LN. Second, spot uPCR was used as a surrogate in this study, which may not reflect the exact level of 24hUP.39 However, baseline proteinuria was always measured with 24hUP, and uPCR was used as a surrogate in <5% of records. Third, the average follow-up time was too short to address long-term outcomes or to capture the flare pattern. Finally, a larger multicentre cohort with different ethnicities is warranted to address generalisability. A more comprehensive multiomics evaluation of renal, urinary and peripheral biomarkers might enable true discrimination of different LN patterns and guide targeted treatment, which deserves further exploration.

Ethics statements

Patient consent for publication

Ethics approval

The retrospective study was approved by the ethics committees of Renji Hospital, Shanghai Jiao Tong University School of Medicine (KY2021-059-B). Participants gave informed consent to participate in the study before taking part.


Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.


  • DZ and FS contributed equally.

  • NS, TL and SY contributed equally.

  • Contributors DZ: Writing—original draft; Methodology; Visualisation; Software. FS: Conceptualisation; Funding acquisition; Writing—original draft; Data curation; Methodology. JC: Conceptualisation; Data curation. HD: Data curation; Validation. XW: Supervision; Resources. NS: Funding acquisition; Supervision; Resources; Validation. TL: Conceptualisation; Supervision; Writing—review and editing; Resources. SY: Guarantor; Conceptualisation; Funding acquisition; Supervision; Resources; Writing—review and editing.

  • Funding This research is supported by grants from the Clinical Research Plan of Shanghai Hospital Development Center (Project No. SHDC2020CR1015B & SHDC2020CR6026) and Shanghai Municipal Health Commission (No. 202040291).

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.