Results
Meta-analysis of the ACLF studies
For the meta-analysis, 32 cohort studies were included, with a total of 13 939 patients with ACLF, of whom 6277 (45.03%) died during the follow-up period. The study characteristics are listed in online supplemental table 2. NOS scores are given in online supplemental table 3; all included studies had a quality score >7.
Besides the MELD score, the meta-analysis identified 12 predictive factors related to ACLF outcome: age, international normalised ratio (INR), infection, serum sodium, white blood cell count (WBC), serum creatinine, neutrophils, total bilirubin (TBiL), gastrointestinal haemorrhage, ascites, hepatic encephalopathy (HE) and liver cirrhosis. All 13 factors (online supplemental figure 3) were associated with the 90-day outcomes. The pooled effects of the risk factors were as follows (HR with 95% CI): age (1.03 (1.02 to 1.03)), INR (1.67 (1.41 to 1.97)), creatinine (2.18 (1.04 to 4.00)), TBiL (1.03 (1.02 to 1.04)), MELD score (1.12 (1.08 to 1.16)), WBC (1.05 (1.02 to 1.07)), neutrophils (1.07 (1.03 to 1.12)), HE (2.09 (1.78 to 2.47)), infection (1.90 (1.42 to 2.53)), sodium (0.96 (0.92 to 0.99)), gastrointestinal haemorrhage (1.73 (1.30 to 2.29)), ascites (1.82 (1.38 to 2.42)) and liver cirrhosis (1.46 (1.23 to 1.73)). Ten of the 13 factors (online supplemental figure 2) were significantly associated with the 28-day outcome: age (1.04 (1.02 to 1.05)), INR (1.75 (1.48 to 2.07)), TBiL (1.04 (1.02 to 1.05)), MELD score (1.07 (1.05 to 1.09)), WBC (1.13 (1.06 to 1.20)), neutrophils (1.06 (1.02 to 1.11)), infection (2.03 (1.36 to 3.04)), HE (2.69 (2.27 to 3.17)), sodium (0.96 (0.93 to 0.99)) and gastrointestinal haemorrhage (1.84 (1.25 to 2.72)). The pooled HRs with 95% CIs are illustrated in the forest plot (online supplemental figure 1), and the detailed results of data synthesis are presented in online supplemental figures 4 and 5.
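As a reading aid, the 90-day pooled estimates above can be checked programmatically. The sketch below is illustrative only: it assumes that each 95% CI is symmetric on the log scale (a Wald-type interval), which the text does not state, and the function names are hypothetical. It recovers the log hazard ratio and its implied standard error, and confirms that every reported interval excludes 1.

```python
import math

# 90-day pooled estimates as reported in the text: (HR, lower 95% CI, upper 95% CI).
POOLED_90D = {
    "age": (1.03, 1.02, 1.03),
    "INR": (1.67, 1.41, 1.97),
    "creatinine": (2.18, 1.04, 4.00),
    "TBiL": (1.03, 1.02, 1.04),
    "MELD score": (1.12, 1.08, 1.16),
    "WBC": (1.05, 1.02, 1.07),
    "neutrophils": (1.07, 1.03, 1.12),
    "HE": (2.09, 1.78, 2.47),
    "infection": (1.90, 1.42, 2.53),
    "sodium": (0.96, 0.92, 0.99),
    "gastrointestinal haemorrhage": (1.73, 1.30, 2.29),
    "ascites": (1.82, 1.38, 2.42),
    "liver cirrhosis": (1.46, 1.23, 1.73),
}


def log_hr_and_se(hr, lower, upper, z=1.96):
    """Return ln(HR) and the standard error implied by a 95% CI.

    Assumption (not from the source): the CI is symmetric on the log scale.
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return log_hr, se


def ci_excludes_one(lower, upper):
    """True when the whole interval lies above or below HR = 1."""
    return lower > 1.0 or upper < 1.0


def direction(hr):
    """Describe whether a higher value of the factor raises or lowers the hazard."""
    return "higher risk" if hr > 1.0 else "lower risk"


if __name__ == "__main__":
    for name, (hr, lower, upper) in POOLED_90D.items():
        log_hr, se = log_hr_and_se(hr, lower, upper)
        print(f"{name}: ln(HR)={log_hr:.3f}, SE={se:.3f}, {direction(hr)}")
```

Serum sodium is the only factor with an HR below 1, meaning that higher sodium is associated with a lower hazard of death.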
Patient characteristics of the studied cohorts
Among the 751 patients with ACLF enrolled from the CATCH-LIFE cohort, 649 (86.4%) had hepatitis B virus (HBV) infection, 51 (6.8%) had alcohol abuse and 51 (6.8%) had other aetiologies. The mean age of the patients was 48±12 years, and most were males (n=662, 82.8%) (figure 1 and online supplemental table 4). Ascites was the most frequently observed complication, affecting 78.7% of patients, followed by infection (38.2%), HE (17.3%), hepatorenal syndrome (9.2%) and gastrointestinal haemorrhage (4.3%). Liver failure (73.4%) was the most frequent type of organ failure, followed by coagulation failure (32.1%), renal failure (5.7%) and brain failure (4.7%). The disease severity scores were as follows: 6.7 (1.4) for COSSH-ACLF IIs, 7.0 (1.9) for COSSH-ACLFs, 40.1 (9.6) for CLIF-C ACLFs, 26.3 (6.5) for MELD score and 28.0 (9.5) for MELD-Na score. The LT-free mortality rates were 23.40% at 28 days and 68.46% at 90 days.
Development of a new CATCH-LIFE-MELD score
The candidate variables were subjected to Fine-Gray competing risk regression analysis in the derivation cohort. After variables with significant collinearity were excluded, age (28 days: HR 1.029 (95% CI 1.016 to 1.041), p<0.001; 90 days: 1.025 (1.016 to 1.035), p<0.001), HE grade (28 days: 1.350 (1.176 to 1.550), p<0.001; 90 days: 1.302 (1.156 to 1.465), p<0.001) and neutrophil count (28 days: 1.040 (1.011 to 1.069), p=0.006; 90 days: 1.030 (1.006 to 1.055), p=0.013) were identified as additional parameters for a new prognostic model of ACLF outcome, alongside the MELD score (28 days: 1.082 (1.064 to 1.101), p<0.001; 90 days: 1.084 (1.068 to 1.101), p<0.001) (online supplemental figure 6). The CATCH-LIFE-MELD score was developed as follows: R = 0.028 × age + 0.3 × HE grade + 0.039 × neutrophil count + 0.079 × MELD score.
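The formula above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation. The function name, argument names and example patient are hypothetical, and the units (age in years, neutrophil count in 10^9/L, HE grade as an integer grade) are assumptions that the text does not specify.

```python
import math

# Coefficients of the CATCH-LIFE-MELD score as reported in the text.
COEFFICIENTS = {
    "age": 0.028,
    "he_grade": 0.3,
    "neutrophil_count": 0.039,
    "meld": 0.079,
}

# 28-day hazard ratios reported for the same four variables.
HR_28D = {
    "age": 1.029,
    "he_grade": 1.350,
    "neutrophil_count": 1.040,
    "meld": 1.082,
}


def catch_life_meld(age, he_grade, neutrophil_count, meld):
    """Return R = 0.028*age + 0.3*HE grade + 0.039*neutrophils + 0.079*MELD.

    Units are assumptions (not given in the source): age in years,
    neutrophil count in 10^9/L, HE grade as an integer grade.
    """
    return (
        COEFFICIENTS["age"] * age
        + COEFFICIENTS["he_grade"] * he_grade
        + COEFFICIENTS["neutrophil_count"] * neutrophil_count
        + COEFFICIENTS["meld"] * meld
    )


# Hypothetical patient, for illustration only (not study data).
example_score = catch_life_meld(age=50, he_grade=1, neutrophil_count=5.0, meld=25)

if __name__ == "__main__":
    print(f"Example score: {example_score:.2f}")
    for name, hr in HR_28D.items():
        print(f"{name}: ln(HR)={math.log(hr):.3f}, coefficient={COEFFICIENTS[name]}")
```

Each coefficient lies within 0.001 of the natural logarithm of the corresponding 28-day HR, which is what one would expect if the weights were taken from the regression coefficients. This is an observation from the reported numbers, not a statement made in the text.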
Model performance
Model discrimination was assessed using AUROC, C-index, probability density function (PDF) analysis, NRI and IDI. As demonstrated in figure 2 and online supplemental table 5, the CATCH-LIFE-MELD model yielded AUROCs of 0.823 and 0.848 in predicting the 28-day and 90-day outcomes of ACLF, respectively. The 28-day AUROC was 8.4%, 10.5%, 9.3%, 10.3% and 12.9% higher in relative terms than those of COSSH-ACLF IIs (0.759, p<0.001), COSSH-ACLFs (0.745, p<0.001), CLIF-C ACLFs (0.753, p<0.001), MELD score (0.746, p<0.001) and MELD-Na score (0.729, p<0.001), respectively. The C-index of the CATCH-LIFE-MELD score for ACLF outcome (0.791 for 28 days and 0.788 for 90 days) was higher than that of COSSH-ACLF IIs (0.741 and 0.707), COSSH-ACLFs (0.729 and 0.706), CLIF-C ACLFs (0.731 and 0.690), MELD score (0.727 and 0.689) and MELD-Na score (0.712 and 0.669), with p<0.001 for each comparison (figure 2 and online supplemental table 6). We also compared the NRI and IDI of the different models. As illustrated in online supplemental tables 7 and 8 and figure 2, the CATCH-LIFE-MELD score showed significant improvements in NRI and IDI compared with the five other scores. PDF analysis (figure 3) showed significantly lower overlapping coefficients for the CATCH-LIFE-MELD score (52.78%/47.91% for 28-day and 90-day outcomes) than for the COSSH-ACLF IIs (62.07%/66.43%), COSSH-ACLFs (66.01%/66.41%), CLIF-C ACLFs (62.83%/69.36%), MELD score (60.59%/67.42%) and MELD-Na score (62.86%/71.25%), indicating better discrimination. The calibration plots indicated good agreement between the observed mortality and the predicted probability of death at 28 and 90 days (online supplemental figure 7). The H-L test gave a consistent result (28 days: χ²=5.779, p=0.672; 90 days: χ²=5.786, p=0.661) (figure 4 and online supplemental table 9).
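The percentage gains in AUROC quoted above are consistent with a relative improvement of the 28-day AUROC, that is, (new − old)/old. This reading is inferred from the reported numbers rather than stated in the text. A short sketch, with hypothetical variable and function names, reproduces the values to one decimal place:

```python
# 28-day AUROC values as reported in the text.
NEW_AUROC_28D = 0.823
COMPARATOR_AUROC_28D = {
    "COSSH-ACLF IIs": 0.759,
    "COSSH-ACLFs": 0.745,
    "CLIF-C ACLFs": 0.753,
    "MELD score": 0.746,
    "MELD-Na score": 0.729,
}


def relative_improvement(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100


improvements = {
    name: round(relative_improvement(NEW_AUROC_28D, auroc), 1)
    for name, auroc in COMPARATOR_AUROC_28D.items()
}

if __name__ == "__main__":
    for name, pct in improvements.items():
        print(f"{name}: +{pct}%")
```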
As depicted in figure 4 and online supplemental table 10, the CATCH-LIFE-MELD score had the largest Nagelkerke’s R² (28 days: 0.318; 90 days: 0.411) and the lowest Brier score (28 days: 0.135; 90 days: 0.157) in predicting the 28-day/90-day outcomes of ACLF, indicating better calibration. Finally, as illustrated by DCA (figure 5), the new model outperformed the other five models in predicting the 28-day/90-day outcomes.
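For readers less familiar with the Brier score, it is the mean squared difference between the predicted probability of an event and the observed binary outcome, so lower values indicate better predictions. The sketch below uses toy numbers that are purely illustrative and are not study data. The function name is hypothetical, and the sketch does not show how the score was mapped to a probability in the study.

```python
def brier_score(predicted_probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(predicted_probs) != len(outcomes):
        raise ValueError("predicted_probs and outcomes must have the same length")
    if not outcomes:
        raise ValueError("at least one observation is required")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted_probs, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Toy example (not study data): predicted probability of death and observed death (1) or survival (0).
toy_probs = [0.1, 0.8, 0.3, 0.6]
toy_outcomes = [0, 1, 0, 1]
toy_brier = brier_score(toy_probs, toy_outcomes)

if __name__ == "__main__":
    print(f"Toy Brier score: {toy_brier:.3f}")
```

A model that always predicts 0.5 scores 0.25 regardless of the outcomes, which gives a useful reference point when reading the reported values.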
Figure 2 Discrimination of the prognostic scores for prediction of 28-day/90-day mortality. (A) Area under the receiver operating characteristic curve (AUROC) analysis and (B) concordance index (C-index) of prognostic scores for predicting 28-day/90-day mortality. (C) Net reclassification improvement (NRI) and (D) integrated discrimination improvement (IDI) of the Chinese Acute-on-Chronic Liver Failure Consortium-model for end-stage liver disease score (CATCH-LIFE-MELDs) compared with those of five other scores (MELD score (MELDs), MELD with sodium score (MELD-Nas), Chronic Liver Failure Consortium acute-on-chronic liver failure score (CLIF-C ACLFs), Chinese Group on the Study of Severe Hepatitis B-ACLF score (COSSH-ACLFs) and COSSH-ACLF II score (COSSH-ACLF IIs)).
Figure 3 Probability density function of the prognostic scores for the 28-day/90-day prognosis of surviving and non-surviving patients in the derivation and validation groups. ACLF, acute-on-chronic liver failure; CATCH-LIFE-MELDs, Chinese Acute-on-Chronic Liver Failure Consortium-MELD score; CLIF-C ACLFs, Chronic Liver Failure Consortium ACLF score; COSSH ACLFs, Chinese Group on the Study of Severe Hepatitis B ACLF score; COSSH-ACLF IIs, COSSH-ACLF II score; MELDs, model for end-stage liver disease score; MELD-Nas, MELD with sodium score.
Figure 4 (A) Nagelkerke’s R², (B) Brier score and (C) Hosmer-Lemeshow (H-L) test of prognostic scores for prediction of 28-day/90-day mortality. For the H-L test, a smaller χ² corresponds to a larger p value and a better goodness of fit. Suitable calibration is indicated by an H-L p value ≥ 0.05. The ordinate represents χ². *: 0.01 ≤ p value < 0.05; **: 0.001 ≤ p value < 0.01. ACLF, acute-on-chronic liver failure; CATCH-LIFE-MELDs, Chinese Acute-on-Chronic Liver Failure Consortium-MELD score; CLIF-C ACLFs, Chronic Liver Failure Consortium ACLF score; COSSH ACLFs, Chinese Group on the Study of Severe Hepatitis B ACLF score; COSSH-ACLF IIs, COSSH-ACLF II score; MELDs, model for end-stage liver disease score; MELD-Nas, MELD with sodium score.
Model validation
The performance of the CATCH-LIFE-MELD score was confirmed in an independent cohort of 414 patients with ACLF. A comparison of clinical features between the derivation and validation cohorts can be found in online supplemental table 4.
Consistently, the C-index of the new score (0.805/0.778 for 28-day and 90-day, respectively) was superior to those of the MELD score (0.644/0.647, p<0.001/p<0.001), MELD-Na score (0.628/0.629, p<0.001/p<0.001), CLIF-C ACLFs (0.761/0.726, p<0.001/p<0.001), COSSH-ACLFs (0.762/0.747, p=0.014/p=0.034) and COSSH-ACLF IIs (0.763/0.744, p=0.009/p=0.014) (online supplemental table 11 and figure 2). In the validation cohort, the AUROCs of the model were similar to those in the derivation cohort (online supplemental table 12 and figure 2). Furthermore, the overlapping coefficient of CATCH-LIFE-MELD score between survivors and non-survivors in the validation cohort was reduced in the PDF analysis (CATCH-LIFE-MELD score: 54.68%/59.65%; COSSH-ACLF IIs: 61.88%/62.94%; COSSH-ACLFs: 60.18%/62.10%; CLIF-C ACLFs: 60.81%/65.57%; MELD score: 71.35%/74.76% and MELD-Na score: 73.01%/76.15%, figure 3). CATCH-LIFE-MELD score displayed a slight improvement compared with COSSH-ACLF IIs (NRI: 13.3%/1.1%; IDI: 9.2%/3.8%), COSSH-ACLFs (NRI: 23.3%/12.3%; IDI: 10.7%/5.4%) and CLIF-C ACLFs (NRI: 13.2%/15.9%; IDI: 10.5%/7.4%). A significant improvement in NRI and IDI for 28-day/90-day mortality was observed in comparison with MELD score (NRI: 36.7%/25.0%; IDI: 18.4%/13.2%) and MELD-Na score (NRI: 44.6%/34.4%; IDI: 19.9%/14.6%) (online supplemental tables 13 and 14 and figure 2).
The calibration analysis of the CATCH-LIFE-MELD score showed a good fit in the validation cohort (28 days: χ²=4.682, p=0.666; 90 days: χ²=4.299, p=0.797; online supplemental table 9, online supplemental figures 3 and 7). As in the derivation cohort, the CATCH-LIFE-MELD score had the largest Nagelkerke’s R² (28 days: 0.344; 90 days: 0.308) and the lowest Brier score (28 days: 0.128; 90 days: 0.169) in predicting outcomes at both 28 and 90 days (online supplemental table 10 and figure 4).
Finally, DCA was performed in the validation cohort. As indicated in figure 5, the new model performed better than the other five models in predicting 28-day and 90-day mortality, indicating that it provided good clinical benefit.
Figure 5 Decision curve analysis for predicting the 28-day/90-day prognosis of patients with acute-on-chronic liver failure (ACLF). CATCH-LIFE-MELDs, Chinese Acute-on-Chronic Liver Failure Consortium-MELD score; CLIF-C ACLFs, Chronic Liver Failure Consortium ACLF score; COSSH ACLFs, Chinese Group on the Study of Severe Hepatitis B ACLF score; COSSH-ACLF IIs, COSSH-ACLF II score; MELDs, model for end-stage liver disease score; MELD-Nas, MELD with sodium score.
Risk stratification by CATCH-LIFE-MELD score
An X-tile plot was used for the risk stratification of the CATCH-LIFE-MELD score. Patients with ACLF were categorised into three risk strata of death determined by two optimal cut-off points (3.09 and 5.04): low risk (<3.09), intermediate risk (3.09–5.04) and high risk (>5.04). The 28-day and 90-day mortality differed significantly among the groups (figure 6). The HRs for 28-day/90-day death were 5.23/4.61 (p<0.001) in the intermediate-risk group and 11.84/7.61 (p<0.001) in the high-risk group compared with the low-risk group. Moreover, the same risk stratification applied to the validation cohort showed a separation similar to that in the derivation cohort (28/90 days, intermediate-risk group: 4.30/2.88, p<0.001; high-risk group: 7.79/4.38, p<0.001). These results indicated that the CATCH-LIFE-MELD score is a better tool for risk stratification in patients with ACLF than other prognostic scores.
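The classification rule described above maps directly to a small function. The sketch below is illustrative; the function and constant names are hypothetical, and only the two cut-off values and the three strata come from the text.

```python
# Cut-off points reported in the text (derived from the X-tile analysis).
LOW_CUTOFF = 3.09
HIGH_CUTOFF = 5.04


def risk_stratum(score):
    """Classify a CATCH-LIFE-MELD score as low, intermediate or high risk.

    low: score < 3.09; intermediate: 3.09 to 5.04 inclusive; high: score > 5.04.
    """
    if score < LOW_CUTOFF:
        return "low"
    if score <= HIGH_CUTOFF:
        return "intermediate"
    return "high"


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    for s in (2.50, 3.09, 4.20, 5.04, 6.10):
        print(f"score {s:.2f}: {risk_stratum(s)} risk")
```

Both boundary values fall in the intermediate stratum, matching the ranges given in the text (<3.09, 3.09–5.04, >5.04).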
Figure 6 Risk stratification of patients with acute-on-chronic liver failure (ACLF) by the Chinese Acute-on-Chronic Liver Failure Consortium-model for end-stage liver disease score (CATCH-LIFE-MELDs). (A) Derivation cohort. (B) Validation cohort. The cumulative incidence of death at 28/90 days was stratified according to the CATCH-LIFE-MELDs classification rule (low risk/intermediate risk/high risk: CATCH-LIFE-MELDs <3.09/3.09–5.04/>5.04). P<0.001 (log-rank test) for comparisons of survival probability among the three risk strata.
Performance of CATCH-LIFE-MELD score in specific subgroups of ACLF
As illustrated in figure 7 and online supplemental tables 15–18, the prediction efficiency (C-index) of the CATCH-LIFE-MELD score in both HBV-ACLF and non-HBV-ACLF was >0.75, significantly better than that of the other traditional models. The predictive efficiency of all models was reduced in the presence of liver cirrhosis. Nevertheless, the new model had a C-index >0.750 in patients with liver cirrhosis, which still significantly surpassed that of the other models. Moreover, the performance of the CATCH-LIFE-MELD score was only mildly affected by liver failure. In contrast, the prediction performance of the other traditional models was significantly worse in patients with ACLF without liver failure. We further divided patients with cirrhosis into liver failure and non-liver failure subgroups. The C-index of the new model in the cirrhosis-liver failure subgroup decreased by only 2.05% (28 days) and 2.46% (90 days) compared with the cirrhosis-non-liver failure subgroup, whereas the other traditional models showed a marked decline (COSSH-ACLF IIs: 14.13%/12.09%; COSSH-ACLFs: 19.28%/11.76%; CLIF-C ACLFs: 14.83%/10.66%; MELD score: 23.59%/20.87%; MELD-Na score: 17.47%/15.24%). In conclusion, the CATCH-LIFE-MELD score was more stable and accurate in predicting short-term outcomes in different subgroups of patients with ACLF.
Figure 7 Concordance index (C-index) of prognostic scores for predicting 28-day/90-day mortality in patients with acute-on-chronic liver failure (ACLF) with or without hepatitis B virus (HBV) infection (A1 and A2), with cirrhosis or without cirrhosis (B1 and B2), with liver failure or without liver failure (C1 and C2) and cirrhosis subgroup with liver failure or without liver failure (D1 and D2). Cut-off line y=0.75. The broken line represents the rate of change of the C-index between the cirrhosis-liver failure subgroup and the cirrhosis-non-liver failure group. CATCH-LIFE-MELDs, Chinese Acute-on-Chronic Liver Failure Consortium-MELD score; CLIF-C ACLFs, Chronic Liver Failure Consortium ACLF score; COSSH ACLFs, Chinese Group on the Study of Severe Hepatitis B ACLF score; COSSH-ACLF IIs, COSSH-ACLF II score; MELDs, model for end-stage liver disease score; MELD-Nas, MELD with sodium score.
Validation of the CATCH-LIFE-MELD score under COSSH criteria and EASL criteria
We further validated the performance of the CATCH-LIFE-MELD score under the COSSH criteria and the European Association for the Study of the Liver (EASL) criteria. As illustrated in online supplemental tables 19–22 and figure 8, the C-index of the CATCH-LIFE-MELD score for 28-day/90-day outcomes (derivation cohort: COSSH criteria, 0.805/0.801; EASL criteria, 0.718/0.712; validation cohort: COSSH criteria, 0.793/0.775; EASL criteria, 0.758/0.737) was significantly higher than those of the other scoring systems under both the COSSH and the EASL criteria.
Figure 8 Concordance index (C-index) of prognostic scores for predicting 28-day/90-day mortality under (A) Chinese Group on the Study of Severe Hepatitis B (COSSH) diagnostic criteria and (B) European Association for the Study of the Liver (EASL) diagnostic criteria. ACLF, acute-on-chronic liver failure; CATCH-LIFE-MELDs, Chinese Acute-on-Chronic Liver Failure Consortium-MELD score; CLIF-C ACLFs, Chronic Liver Failure Consortium ACLF score; COSSH ACLFs, COSSH-ACLF score; COSSH-ACLF IIs, COSSH-ACLF II score; MELDs, model for end-stage liver disease score; MELD-Nas, MELD with sodium score.