Novel metabolic classification for extrahepatic complication of metabolic associated fatty liver disease: A data-driven cluster analysis with international validation
1 These authors contribute equally to this article.
Junzhao Ye
Affiliations
Department of Gastroenterology, The First Affiliated Hospital, Sun Yat-sen University, No. 58 Zhongshan II Road, Yuexiu District, Guangzhou, 510080, China
Department of Gastroenterology, Affiliated Dongguan Hospital, Southern Medical University (Dongguan People's Hospital), Dongguan, Guangdong 516000, China
Department of Ultrasound, The First Affiliated Hospital, Sun Yat-sen University, No. 58 Zhongshan II Road, Yuexiu District, Guangzhou, Guangdong 510080, China
Department of Radiology, The First Affiliated Hospital, Sun Yat-sen University, No. 58 Zhongshan II Road, Yuexiu District, Guangzhou, Guangdong 510080, China
Correspondence to: T. Wu, Department of Gastroenterology, The Eighth Clinical Medical College, Guangzhou University of Traditional Chinese Medicine, No. 6 Qinren Road, Chancheng District, Foshan 528000, China.
Department of Gastroenterology, Guangzhou University of Chinese Medicine Affiliated Foshan Hospital of Traditional Chinese Medicine, No. 6 Qinren Road, Chancheng District, Foshan 528000, China
• A data-driven cluster analysis with simple variables at diagnosis reclassifies MAFLD subtypes.
• Baseline clusters facilitate prediction of the risk of long-term extrahepatic complications.
• Novel MAFLD subtypes could be replicated in the international cohorts.
Abstract
Background
Traditional classification systems of metabolic-associated fatty liver disease (MAFLD) do not account for the high rate of extrahepatic complications. This study aimed to create a new classification of MAFLD based on metabolic parameters to identify the risks of complications more accurately.
Methods
This retrospective study included MAFLD patients from the First Affiliated Hospital of Sun Yat-sen University for model development; the model was then validated separately in a Chinese cohort and in the UK Biobank database. A k-means cluster analysis was built using age, body mass index (BMI), glycosylated hemoglobin (HbA1c), total cholesterol/high-density lipoprotein cholesterol (TC/HDL-C) ratio, triglyceride, and lipoprotein(a) [Lp(a)] levels. Cox regression models were used to compare the risks of type 2 diabetes (T2DM), coronary heart disease (CHD), stroke, and mortality between the clusters.
Results
A total of 1038 MAFLD patients from a cross-sectional population were recruited for model derivation, and the model was validated in 10,451 cases (33.4 % with MAFLD) from the Chinese cohort and 304,141 cases (34.9 % with MAFLD; 1010 cases with magnetic resonance imaging proton density fat fraction [MRI-PDFF] measurement) from the international cohort. Five replicable clusters of MAFLD patients were identified: Cluster 1 (mild obesity and dyslipidemia-related), Cluster 2 (age-related), Cluster 3 (severe insulin resistance-related), Cluster 4 (high Lp(a)-related), and Cluster 5 (severe mixed hyperlipidemia-related). Patients in different clusters differed in the development of T2DM, CHD, stroke, and all-cause mortality. Patients in Cluster 3 had significantly worse survival outcomes and higher risks of T2DM and cardiovascular disease (CVD) than those in the other clusters.
Conclusion
The novel classification offers improved discrimination of new-onset MAFLD patients with different metabolic complications.
Metabolic associated fatty liver disease (MAFLD), formerly named nonalcoholic fatty liver disease (NAFLD), is the most common chronic liver disease and affects approximately 30 % of the general population globally [
The Asian Pacific Association for the Study of the Liver clinical practice guidelines for the diagnosis and management of metabolic associated fatty liver disease.
]. Widespread adoption of the name and definition of MAFLD allows greater standardization across the spectrum of disease and provides a cogent, coherent, and logical framework to understand, diagnose, and treat this common disease [
]. Population-based follow-up studies and analyses of the causes of death of patients with MAFLD have indicated that approximately half of the deaths are related to complications, including T2DM and atherosclerosis [
]. Thus, methods for identifying MAFLD and its progression early are needed to help improve outcomes.
Currently, 2 phenotyping systems are used in the management of MAFLD. The first is based on pathological type and intrahepatic disease presentation, and relies on histological grading of steatosis, inflammation, and fibrosis. The second, introduced with the updated MAFLD nomenclature, stratifies patients by body mass index (BMI), the presence of T2DM, and metabolic disorders in the lean/normal-weight population [
The Asian Pacific Association for the Study of the Liver clinical practice guidelines for the diagnosis and management of metabolic associated fatty liver disease.
]. This method aims to classify subtypes in order to lessen disease heterogeneity. However, both classification methods have shortcomings. First, there is conflicting evidence as to whether an individual with lean MAFLD has a more benign disease course than a person with overweight/obese MAFLD [
]. More importantly, hepatic fibrosis severity has been identified as the strongest predictor of the development of T2DM and arteriosclerotic cardiovascular disease (ASCVD); however, only a minority of patients exhibit significant fibrosis at any time, and thus fibrosis alone may not explain the development of T2DM and ASCVD [
]. In addition, even though there is convincing evidence that steatohepatitis or fibrosis progression is associated with poor prognosis, there appears to be interpatient variability with respect to the speed of progression of these conditions [
Fibrosis severity as a determinant of cause-specific mortality in patients with advanced nonalcoholic fatty liver disease: a multi-national cohort Study.
There is growing evidence that the heterogeneity of MAFLD is likely influenced by multiple factors, including age, sex, metabolic status, diet, genetic background, and the gut microbiota [
]. Circulating metabolites such as lipids and glucose may be involved directly in, or contribute to, the pathogenesis of extrahepatic complications of MAFLD [
]. Dyslipidemia, characterized by abnormal levels of total cholesterol (TC), low-density lipoprotein cholesterol (LDL-C), HDL-C, triglycerides, and free fatty acids (FFAs) and by abnormal apolipoprotein profiles of apolipoprotein (Apo) A1, ApoB, and lipoprotein (a) [Lp(a)], is associated with the development of ASCVD. It damages vascular endothelial cells and promotes the deposition of pro-inflammatory immune cells in multiple organs, which induces the release of cytokines; this process has been identified as a key factor in the development of other extrahepatic diseases [
The Asian Pacific Association for the Study of the Liver clinical practice guidelines for the diagnosis and management of metabolic associated fatty liver disease.
]. This makes it difficult for clinicians to optimize the treatment of patients who share the same histological phenotype through early control of metabolic parameters. A refined classification system based on metabolic parameters might help identify patients at greatest risk of complications and thus allow early treatment to reduce those risks.
This study aimed to develop a new classification system for MAFLD that includes serum metabolic markers in order to identify patients at greatest risk of complications, and to validate the prognostic value of the classification system in two independent cohorts.
2. Materials and methods
2.1 Study design and population
This study was implemented in 2 phases. The first phase was designed to identify subtypes of MAFLD using a data-driven approach based on comprehensive, non-invasive assessments, including anthropometric measurements, glucose, lipid, and apolipoprotein profiles, and the fat content of the liver and pancreas and abdominal fat thickness measured by magnetic resonance imaging proton density fat fraction (MRI-PDFF). The final variables in the model were selected based on their discrimination abilities, ease of availability in different clinical settings without the need for individual interpretation, and the minimum number of variables needed to provide adequate prognostic value. The second phase followed a longitudinal design to validate the prognostic value of the different subtypes of MAFLD identified in phase 1.
Patients in the first phase formed a cross-sectional population derived from consecutive patients with new-onset MAFLD who underwent MRI-PDFF and were admitted to the fatty liver center of the First Affiliated Hospital of Sun Yat-sen University from January 2015 to January 2022. All patients in phase 1 were Han Chinese from Guangdong Province and the nearby area, and were highly homogeneous in genetic background. The second-phase validation cohort comprised the datasets of 2 populations. The Chinese validation dataset consisted of consecutive patients admitted to 3 tertiary-care, university-affiliated medical centers in China (the First Affiliated Hospital of Sun Yat-sen University, the First Affiliated Hospital of Guangzhou Medical University, and the Affiliated Dongguan Hospital of Southern Medical University) from January 2004 to December 2015 and followed up until December 2018. This cohort included a total of 10,451 patients who underwent regular clinical visits, including collection of a detailed medical history and alcohol consumption information, anthropometric measurements, steatosis assessment with ultrasonography, and serological examination of biochemical and metabolic parameters, as in our previous report [
]. The UK Biobank database was used as the second, independent external validation cohort, with the fatty liver index (FLI) as the method to rule in steatosis; participants with MRI-PDFF data were analyzed as a subgroup. UK Biobank is a large prospective cohort of middle-aged adults designed to support biomedical analysis focused on improving the prevention, diagnosis, and treatment of chronic disease. The project recruited 502,628 participants (5.5 % response rate, the majority of whom were aged 40–70 years) from the general population between April 2007 and December 2010; participants attended 1 of 22 assessment centers across England, Wales, and Scotland, completed a touch-screen questionnaire, had physical measurements taken, and provided biological samples. Inclusion in the study required complete demographic, laboratory, and imaging data. Patients were 18 to 65 years of age in the cross-sectional population but over 30 years old in the Chinese cohort. All patients in the cross-sectional population were naïve to treatment for any metabolic disease and were newly diagnosed with MAFLD.
The exclusion criteria for patients from the cross-sectional population, the Chinese cohort, and the international validation cohort were: 1) imaging evidence of hepatocellular carcinoma (computed tomography [CT] or MR scan of the abdomen) and an elevated alpha-fetoprotein level; 2) preexisting coronary heart disease (CHD), heart failure, stroke, chronic kidney disease, or malignancies; 3) excessive alcohol intake (>30 g/day in men and >20 g/day in women); 4) end-stage liver disease; 5) hepatitis B (positive for hepatitis B surface antigen for over 6 months) or hepatitis C virus infection (positive test for antibody against the hepatitis C virus); 6) autoimmune liver disease (positive test for anti-nuclear antibody, anti-smooth muscle antibody, or anti-mitochondrial antibody); 7) secondary causes of fatty liver, such as long-term use of steroids, amiodarone, tamoxifen, or methotrexate; and 8) pregnancy or breastfeeding.
2.2 Clinical and metabolic evaluation
In the cross-sectional population and the Chinese cohort, demographic data (age and gender), social, medical, and medication history, and nicotine and alcohol consumption were collected by questionnaire interview. All patients underwent anthropometric measurements, including body weight, body height, waist circumference, and hip circumference, and blood pressure was measured. Venous blood samples were collected after fasting for a minimum of 8 h. The insulin resistance (IR) index was calculated using the homeostasis model assessment of IR (HOMA-IR) according to the following equation: HOMA-IR = fasting insulin (FINS) (μU/mL) × fasting blood glucose (FBG) (mmol/L) / 22.5 [
Insulin resistance exhibits varied metabolic abnormalities in nonalcoholic fatty liver disease, chronic hepatitis B and the combination of the two: a cross-sectional study.
The Asian Pacific Association for the Study of the Liver clinical practice guidelines for the diagnosis and management of metabolic associated fatty liver disease.
]. Fatty liver in the cross-sectional population and in the international cohort participants who underwent MRI-PDFF was diagnosed on the basis of MRI-PDFF. Fatty liver in the Chinese validation cohort was diagnosed by ultrasonography and defined as the presence of liver and kidney echo discrepancies, with or without posterior attenuation of the ultrasound beam, vessel blurring, and difficult visualization of the gallbladder wall and the diaphragm. Fatty liver in the international cohort was diagnosed according to the fatty liver index (FLI) algorithm with an established cut-off value of ≥60 [
All radiographic studies were completed within 2 weeks of the clinical assessment. High-resolution B-mode ultrasonography was used by experienced radiologists to evaluate fat deposition in the liver and to examine the right and left common carotid arteries. The findings on the original images were further assessed by 2 experienced investigators at each center who were blinded to the aim of the study and to patient characteristics. Fatty liver was evaluated in all cases using the criteria of parenchymal brightness, increased liver-to-kidney echo contrast, posterior attenuation of the ultrasound beam, vessel blurring, and difficult visualization of the gallbladder wall or the diaphragm. Carotid atherosclerosis was defined as a focal intima-media thickening encroaching into the arterial lumen by > 0.5 mm or by 50 % of the surrounding intima-media thickness. Liver fat content, pancreas fat content, and abdominal subcutaneous fat thickness (ASFT) were assessed using MRI-PDFF with 2-point Dixon fat-water separation MRI at 3.0 Tesla (Siemens 3.0 T Magnetom Verio), in accordance with our previous study [
]. Liver stiffness measurement (LSM) was obtained by real-time shear wave elastography (SuperSonic Imagine, Aix-en-Provence, France), and the LSM cut-off values for discriminating the severity of hepatic fibrosis were defined as follows: significant fibrosis, 6.3–8.9 kPa; advanced fibrosis, 9.0–10.9 kPa; and cirrhosis, >11.0 kPa [
The ability of lipoprotein (a) level to predict early carotid atherosclerosis is impaired in patients with advanced liver fibrosis related to metabolic-associated fatty liver disease.
]. All measurements were obtained by the same 2 experienced physicians; details can be found in the Supplementary instructions and methods.
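The derived indices used in this section can be illustrated with a minimal Python sketch. The HOMA-IR equation and the LSM cut-offs are taken from the text above. The FLI coefficients are not given in this article; they are the commonly published fatty liver index formula and are included here as an assumption, as is the mmol/L to mg/dL conversion factor for triglycerides. Function names are hypothetical, and the handling of LSM values between 10.9 and 11.0 kPa (which the stated ranges leave unassigned) is also an assumption.

```python
import math


def homa_ir(fins_uU_per_ml, fbg_mmol_per_l):
    """HOMA-IR = FINS (uU/mL) x FBG (mmol/L) / 22.5, as stated in the text."""
    return fins_uU_per_ml * fbg_mmol_per_l / 22.5


# Assumption: the published FLI formula expects triglycerides in mg/dL,
# so values in mmol/L are converted with the usual factor of 88.57.
TG_MMOL_TO_MG_DL = 88.57


def fatty_liver_index(tg_mmol_per_l, bmi, ggt_u_per_l, waist_cm):
    """Fatty liver index on a 0-100 scale.

    Coefficients follow the commonly published FLI formula; they are
    NOT given in this article and are an assumption of this sketch.
    """
    x = (0.953 * math.log(tg_mmol_per_l * TG_MMOL_TO_MG_DL)
         + 0.139 * bmi
         + 0.718 * math.log(ggt_u_per_l)
         + 0.053 * waist_cm
         - 15.745)
    return 100.0 / (1.0 + math.exp(-x))


def steatosis_by_fli(fli):
    """Rule in steatosis with the cut-off of >= 60 used in the text."""
    return fli >= 60


def fibrosis_category(lsm_kpa):
    """Map liver stiffness (kPa) to the categories listed in the text.

    Assumption: values above 10.9 and up to 11.0 kPa, which the stated
    ranges leave unassigned, are treated as advanced fibrosis.
    """
    if lsm_kpa > 11.0:
        return "cirrhosis"
    if lsm_kpa >= 9.0:
        return "advanced fibrosis"
    if lsm_kpa >= 6.3:
        return "significant fibrosis"
    return "no significant fibrosis"


# Illustrative inputs only (not patient data).
example_homa = homa_ir(12.0, 5.1)
example_fli = fatty_liver_index(2.40, 31.3, 54, 102)
example_stage = fibrosis_category(6.5)
```

Because the FLI is a logistic transform, it is bounded between 0 and 100, which is why a single cut-off such as 60 can be applied across cohorts with different units of measurement once triglycerides are converted.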
2.5 Outcomes
T2DM was defined as FBG ≥ 7.0 mmol/L, a 2-hour plasma glucose value of ≥11.1 mmol/L on an oral glucose tolerance test, HbA1c > 6.5 %, or current use of hypoglycemic drugs or insulin [
]. CHD was defined as an acute myocardial infarction (MI) diagnosed by abnormal biochemical indicators indicating myocardial ischemic necrosis, accompanied by symptoms of myocardial ischemia, or an electrocardiogram suggesting pathological Q wave or ST segment elevation/depression, or a final coronary intervention, angina symptoms requiring hospitalization, or heart failure due to ischemic cardiomyopathy [
Predicting the 10-year risks of atherosclerotic cardiovascular disease in Chinese population: the China-PAR project (Prediction for ASCVD risk in China).
Guidelines for the early management of patients with acute ischemic stroke: 2019 update to the 2018 guidelines for the early management of acute ischemic stroke: a guideline for healthcare professionals from the American Heart Association/American Stroke Association.
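The T2DM outcome definition given in this section is a simple "any criterion met" rule. A minimal sketch in Python, using the thresholds stated in the text, might look as follows; the function and parameter names are hypothetical, and treating a missing measurement as "criterion not met" is an assumption of this sketch rather than something the article specifies.

```python
def meets_t2dm_definition(fbg_mmol_per_l=None,
                          ogtt_2h_mmol_per_l=None,
                          hba1c_percent=None,
                          on_glucose_lowering_therapy=False):
    """Return True if any T2DM criterion from the text is met.

    Criteria: FBG >= 7.0 mmol/L, 2-hour OGTT glucose >= 11.1 mmol/L,
    HbA1c > 6.5 %, or current use of hypoglycemic drugs or insulin.
    Assumption: a measurement passed as None is treated as not met.
    """
    if on_glucose_lowering_therapy:
        return True
    if fbg_mmol_per_l is not None and fbg_mmol_per_l >= 7.0:
        return True
    if ogtt_2h_mmol_per_l is not None and ogtt_2h_mmol_per_l >= 11.1:
        return True
    if hba1c_percent is not None and hba1c_percent > 6.5:
        return True
    return False
```

Note that the HbA1c criterion is a strict inequality in the text (> 6.5 %), whereas the glucose criteria are inclusive, and the sketch preserves that difference.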
A detailed description of the clustering methods can be found in the Supplementary instructions and methods. For cluster analysis, we first used TwoStep k-means clustering, latent class analysis, and Kohonen's neural networks to explore clustering in the cross-sectional population, which had the most comprehensive data (the detailed flow and algorithms are presented in Supplementary Fig. 1, detailed information on variable selection and model inputs is shown in Supplementary Table 1, and the original data can be found in Supplementary material 2). We then evaluated the stability of the different cluster results (Supplementary Fig. 2) and finally chose TwoStep k-means clustering [
]. Cluster analysis was performed on values standardized to a mean of 0 and a standard deviation (SD) of 1. Males and females were clustered separately to avoid stratification due to sex-dependent differences in the cluster variables, and to provide separate cohorts for validation of results. Results were verified using k-means clustering, which produced cluster distributions similar to those of TwoStep clustering. Finally, analysis showed that including baseline age, BMI, HbA1c, TC/HDL-C ratio, and triglyceride and Lp(a) levels resulted in better cluster quality, both statistically and clinically. We used the t-distributed stochastic neighbor embedding (t-SNE) visualization technique to reduce the normalized data to a 2-dimensional space. Radar plots were drawn using standardized z-values for each variable in each cluster, calculated by adjusting the cluster mean for each variable to the whole-cohort mean and SD for that variable.
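The standardization, k-means, and radar z-value steps described above can be sketched in plain Python. This is an illustrative toy only: it uses a basic Lloyd-style k-means rather than the TwoStep procedure the authors ran in SPSS, the six-column rows are invented values (age, BMI, HbA1c, TC/HDL-C, triglycerides, Lp(a)), k is set to 2 instead of 5, the population SD is assumed for standardization, and all function names are hypothetical.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Scale each column to mean 0 and SD 1 (population SD is an assumption)."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, seed=0, max_iter=100):
    """Basic Lloyd-style k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [mean(col) for col in zip(*members)]
    return labels


def radar_profile(z_rows, labels, cluster):
    """Mean z-value per variable within one cluster.

    This equals (cluster mean - cohort mean) / cohort SD for each variable,
    which is the quantity plotted on the radar charts.
    """
    members = [r for r, lab in zip(z_rows, labels) if lab == cluster]
    return [mean(col) for col in zip(*members)]


# Invented toy rows: age, BMI, HbA1c (%), TC/HDL-C, TG (mmol/L), Lp(a) (mg/L).
TOY_ROWS = [
    [35, 24.0, 5.2, 3.5, 1.2, 100],
    [38, 25.0, 5.3, 3.6, 1.3, 120],
    [36, 24.5, 5.1, 3.4, 1.1, 110],
    [60, 32.0, 7.5, 6.0, 3.0, 400],
    [62, 33.0, 7.8, 6.2, 3.2, 420],
    [61, 32.5, 7.6, 6.1, 3.1, 410],
]

z_rows = standardize(TOY_ROWS)
labels = kmeans(z_rows, k=2)
profile_of_first_patient_cluster = radar_profile(z_rows, labels, labels[0])
```

Standardizing first matters because k-means uses Euclidean distance: without it, a variable measured on a large scale such as Lp(a) in mg/L would dominate variables such as HbA1c.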
Normally distributed data were presented as mean (standard deviation). Differences between groups were determined using Student's t-test, ANOVA, and Pearson's chi-squared test with Bonferroni corrections. The prognosis of the subgroups was estimated by the Kaplan–Meier method and compared by the log-rank test. Cox regression models with forward selection (likelihood ratio) were used to estimate hazard ratios (HRs) and 95 % confidence intervals (CI) for the different outcomes. Adjustments were made for potential confounders, including age, sex, BMI, waist circumference, waist-to-hip ratio, current smoking, history of hypertension and drug use, and ALT, GGT, and uric acid levels. A two-sided p value < 0.05 was considered statistically significant. All statistical analyses were performed using SPSS version 20.0 statistical software (SPSS Inc., Chicago, IL, USA), while t-SNE analysis was conducted and the plot visualized with R version 4.0.6 software.
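As an illustration of the Kaplan–Meier method mentioned above, the following is a minimal product-limit estimator in plain Python. It is a sketch under stated assumptions, not the SPSS procedure the authors used: the function name and the toy follow-up data are hypothetical, and the log-rank test and Cox regression are not shown.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each subject
    events : 1 if the outcome occurred at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving  # events and censored subjects leave the risk set
    return curve


# Invented toy data: five subjects, two of them censored.
toy_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored subjects contribute to the risk set up to their last follow-up but never reduce the survival estimate themselves, which is how the method accommodates the unequal follow-up times reported for the validation cohorts.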
3. Results
3.1 Characteristics of patients in the different population
A total of 1038 newly diagnosed patients with MAFLD (mean age 41.3 years) and 143 control patients (mean age 41.6 years) from the cross-sectional population who received MRI-PDFF assessments were included in the analysis (enrollment details in Supplementary Fig. 4). Male sex and T2DM were more frequent, BMI, waist and hip circumferences, levels of ALT, AST, GGT, triglycerides, LDL-C, and ApoB, and HOMA-IR were higher, and HDL-C was lower in the MAFLD group than in the control group (all, p < 0.05) (Table 1).
Table 1 Baseline characteristics of the study population.
Follow-up time, median (minimum to maximum): 10.0(2.0–16.0) | 9.7(1.8–16.0) | 9.0(0.1–11.4) | 8.0(0.2–11.5) | 9.0(7.0–10.1) | 9.0(5.3–11.0)
Current smoker, n(%): 157(15.1 %)a | 14(14.1 %) | 1015(28.8 %) | 1520(21.9 %)
Hypertension, n(%): 115(11.1 %)abc | 19(12.9 %) | 1077(30.8 %)d | 1099(15.8 %) | 35,593(33.4 %) | 36,641(18.5 %) | 337(33.4 %) | 642(17.8 %)
Type 2 diabetes, n(%): 72(6.9 %)abc | 6(4.5 %)bc | 973(27.7 %)de | 969(13.9 %)de | 1793(1.7 %) | 1640(0.8 %) | 22(2.2 %) | 29(0.8 %)
BMI (kg/m2): 26.8 ± 3.6abc | 24.8 ± 2.8ac | 25.9 ± 3.7de | 22.6 ± 3.4de | 31.3 ± 4.4f | 24.9 ± 2.8f | 29.3 ± 4.3 | 25.9 ± 3.9
Waist circumference (cm): 90.1 ± 8.9abc | 85.1 ± 9.6b | 89.8 ± 9.0de | 80.1 ± 9.8e | 102 ± 9.7f | 82.7 ± 8.9f | 96.5 ± 11.4 | 85.7 ± 11.9
Hip circumference (cm): 100.5 ± 7.0a | 96.9 ± 6.2 | 100.2 ± 6.9 | 96.0 ± 7.8
WHR: 0.90 ± 0.05abc | 0.88 ± 0.06abc | 0.89 ± 0.05de | 0.82 ± 0.05d | 0.93 ± 0.07f | 0.83 ± 0.07f | 0.91 ± 0.08 | 0.84 ± 0.08
SBP (mmHg): 131 ± 16a | 129 ± 18a | 134 ± 19 | 127 ± 18
DBP (mmHg): 86 ± 12a | 85 ± 11a | 82 ± 11 | 76 ± 12
ALT (U/L): 55 ± 46abc | 35 ± 29abc | 28 ± 14 | 21 ± 12de | 30 ± 17 | 19 ± 10f | 30 ± 17 | 21 ± 13
AST (U/L): 38 ± 31bc | 32 ± 22abc | 27 ± 22de | 27 ± 22d | 29 ± 12 | 24 ± 8f | 28 ± 11 | 25 ± 14
GGT (U/L): 62 ± 74abc | 48 ± 52bc | 77 ± 109de | 46 ± 78de | 54 ± 54f | 26 ± 19f | 45 ± 41 | 32 ± 32
ALP (U/L): 81 ± 40 | 81 ± 40 | 82 ± 37 | 79 ± 29
Hs-CRP (mg/L): 1.8 ± 0.4 | 1.4 ± 0.3
FBG (mmol/L): 5.1 ± 1.1ab | 5.0 ± 1.1a | 5.8 ± 2.1de | 5.5 ± 1.6de | 5.3 ± 1.4f | 5.0 ± 0.9f | 5.2 ± 1.3 | 4.9 ± 1.0
FINS (μU/mL): 12.0 ± 9.4a | 9.4 ± 10.6 | 11.2 ± 6.4 | 9.6 ± 13.4
HOMA-IR: 2.8 ± 2.4a | 2.1 ± 2.1 | 2.7 ± 1.8 | 2.1 ± 2.9
HbA1c (%): 5.9 ± 1.0abc | 4.7 ± 0.7bc | 4.8 ± 0.7de | 4.9 ± 2.1de | 5.6 ± 0.7f | 5.3 ± 0.6f | 5.5 ± 2.0 | 5.1 ± 0.9
Uric acid (μmol/L): 414 ± 104a | 368 ± 90a | 409 ± 90 | 361 ± 94
TC (mmol/L): 5.1 ± 1.1abc | 5.0 ± 1.1abc | 5.3 ± 1.1de | 4.7 ± 1.0de | 5.8 ± 1.2 | 5.7 ± 1.1 | 5.8 ± 1.2 | 5.7 ± 1.1
TG (mmol/L): 1.85 ± 1.25abc | 1.34 ± 0.65c | 2.18 ± 1.71de | 1.26 ± 0.50de | 2.40 ± 1.10f | 1.35 ± 0.62f | 2.12 ± 1.22 | 1.51 ± 0.85
HDL-c (mmol/L): 1.15 ± 0.29abc | 1.21 ± 0.28abc | 1.12 ± 0.33de | 1.26 ± 0.49de | 1.26 ± 0.29f | 1.56 ± 0.36f | 1.29 ± 0.3 | 1.51 ± 0.37
LDL-c (mmol/L): 3.25 ± 0.79abc | 3.11 ± 0.75bc | 3.37 ± 1.14de | 2.95 ± 1.31de | 3.73 ± 0.87f | 3.51 ± 0.82f | 3.67 ± 0.87 | 3.54 ± 0.81
FFA (mmol/L): 562 ± 205 | 554 ± 267a | 610 ± 321 | 607 ± 283
ApoA1 (g/L): 1.26 ± 0.22abc | 1.26 ± 0.22abc | 1.24 ± 0.33de | 1.26 ± 0.34de | 1.44 ± 0.23f | 1.60 ± 0.27f | 1.45 ± 0.24 | 1.57 ± 0.26
ApoB (g/L): 0.95 ± 0.22abc | 0.86 ± 0.19abc | 0.98 ± 0.27de | 0.89 ± 0.27de | 1.10 ± 0.25f | 1.00 ± 0.22f | 1.08 ± 0.24 | 1.02 ± 0.22
ApoE (mg/L): 45 ± 16a | 43 ± 16a | 47 ± 17 | 41 ± 11
Lp(a) (mg/L): 164 ± 218abc | 177 ± 199bc | 267 ± 288de | 227 ± 206de | 435 ± 488 | 453 ± 432 | 391 ± 453 | 451 ± 433
Fatty liver index: 54.4 ± 25.1bc | 35.2 ± 25.1abc | 55.6 ± 24.3de | 29.9 ± 22.5de | 80.5 ± 11.4f | 27.6 ± 16.9 | 66.1 ± 23.7 | 36.4 ± 27.1
Image examination
Liver fat content (%): 13.9 ± 7.7d | 3.6 ± 1.0d | 10.8 ± 5.7 | 2.0 ± 1.1
Pancreas fat content (%): 2.7 ± 2.2 | 2.7 ± 3.8
Abdominal wall fat thickness (cm): 23.3 ± 8.1 | 22.6 ± 7.4
Liver stiffness (kPa): 6.5 ± 3.2 | 6.6 ± 3.1
Carotid atherosclerosis, n(%): 329(31.7 %) | 29(20.1 %)
Medication history, n(%)
Anti-hypertension drugs, n(%): 1404(39.8 %)de | 1634(23.5 %)de | 16,280(15.3 %) | 15,819(8.0 %)f | 141(14.0 %) | 217(6.1 %)
Lipid-lowering drugs, n(%): 614(17.5 %)de | 1341(19.6 %)de | 18,728(17.6 %) | 16,807(8.5 %) | 179(17.7 %) | 281(7.8 %)
Aspirin, n(%): 280(8.0 %)de | 400(5.7 %)de | 15,748(14.8 %) | 21,751(11.0 %) | 149(14.8 %) | 375(10.4 %)
Comparison p < 0.05 between a: cross-sectional population with MRI-PDFF and the Chinese cohort; b: cross-sectional population with MRI-PDFF and the international cohort; c: cross-sectional population with MRI-PDFF and the international cohort with MRI-PDFF; d: the Chinese cohort and the international cohort; e: the Chinese cohort and the international cohort with MRI-PDFF; f: the international cohort and the international cohort with MRI-PDFF.
All comparisons were performed within MAFLD or within non-MAFLD patients.
The Chinese validation cohort included a total of 10,451 persons (27.2 % with MAFLD, 3494 patients) with a mean age of 42.9 years and a median follow-up of 9.7 years. The international cohort included 304,141 participants (106,768 MAFLD patients and 197,733 non-MAFLD controls), including a subgroup of 4611 patients who received MRI-PDFF, with a median follow-up time of 9.0 years (Supplementary Fig. 3). Differences in clinical characteristics between MAFLD and non-MAFLD patients were similar to those in the cross-sectional population with MRI-PDFF (Table 1). At baseline, MAFLD patients in the international cohort were the oldest, had the highest BMI and waist circumference, and included the lowest proportion of males as compared to the Chinese cohort and the cross-sectional population. Comparisons of demographic, clinical, and biochemical data and outcomes among the cross-sectional population and the validation cohorts are summarized in Table 1 and Supplementary Fig. 4.
3.2 Cluster analysis of the new-onset MAFLD cohort
Cluster analysis of the cross-sectional population identified 5 clusters of patients with homogeneous phenotypes, and t-SNE analysis displayed and verified the polarized distributions of patient characteristics across the clusters (Fig. 1A). The proportions of each cluster across the cross-sectional population and the validation cohorts are shown in Fig. 1B–E and Supplementary Fig. 3. Cluster characteristics of liver enzyme levels, liver stiffness, and the fat content of the liver, pancreas, and abdominal wall are shown in Fig. 3. Results of the analyses based on sex are summarized in Supplementary Figs. 4 and 5.
Fig. 1 Visualization of the clustering and patient distribution according to method of classification. (A) t-SNE visualization of the clustering data of the cross-sectional population in the two-dimensional space. All data points are labelled with k-means clustering. Distribution of patients in (B) the cross-sectional population, (C) the Chinese cohort, (D) the international cohort and (E) the international cohort with MRI-PDFF.
Cluster 1 included approximately 40.0 % of the clustered individuals and was characterized by slightly higher BMI, serum triglycerides, TC/HDL-C ratio, and Lp(a) level than those of the controls. Cluster 1 was named mild obesity and dyslipidemia-related MAFLD (MODM). Cluster 2 (28.2 % of the included subjects) had the greatest mean age and a relatively higher triglyceride level, TC/HDL-C ratio, and HOMA-IR than Cluster 1, and was termed age-related MAFLD (AM). Cluster 3 included 173 (16.7 %) patients, was characterized by severe obesity and an extremely high HOMA-IR, and was named severe insulin resistance-related MAFLD (SIRM). Cluster 4 (9.4 % of all patients) was characterized by an extremely high Lp(a) level but relatively lower triglyceride level, TC/HDL-C ratio, and HOMA-IR than the other clusters, and was named simple high Lp(a)-related MAFLD (HLPAM). Cluster 5 (5.7 % of the total population) was characterized by an extremely high TC/HDL-C ratio and triglyceride level, and relatively high Lp(a) level, BMI, and HOMA-IR. It was named severe mixed hyperlipidemia-related MAFLD (SMHLM) (Fig. 1A and Fig. 2).
Fig. 2 Cluster characteristics of patients from the cross-sectional population. Distributions of age, BMI, triglyceride, TC/HDL-c, Lp(a) and HbA1c (%) for each cluster according to method of classification. BMI, body mass index; TC, total cholesterol; HDL-c, high density lipoprotein cholesterol; Lp(a), lipoprotein (a).
A hexagonal radar chart of the corresponding characteristics of patients from the cross-sectional population that received MRI-PDFF, which provides a more visible comparison, is shown in Fig. 3 and Supplementary Table 2. The data included fat accumulation assessments (waist circumference, waist-to-hip ratio, pancreas fat content, abdominal fat thickness, and liver fat content), liver enzymes, and lipid and glucose levels. The adjusted cohort means were set as references. The patients in Cluster 3 had significantly higher levels of ALT, AST, and GGT, indicating more severe liver injury than in the other clusters (Fig. 3 and Supplementary Tables 2–5). Notably, Cluster 3 also had the highest levels of FFAs, LDL-C, ApoA1, and ApoE, the highest liver and pancreas fat content, and the greatest waist circumference. Although the levels of liver enzymes in Cluster 2 were lower than those in Cluster 3, liver stiffness in Cluster 2 was the highest of all the clusters. Cluster 5 also had relatively serious metabolic disorders, which manifested as higher levels of GGT, uric acid, and ApoE, and greater liver fat content and abdominal fat thickness as compared to Cluster 1. In Cluster 2 and Cluster 4, there were no marked abnormalities of ALT, AST, FFA, or ApoA1 levels, or of pancreatic fat content. Similar trends were also present in the two validation cohorts at baseline (Supplementary Figs. 7–17).
Fig. 3 Profiles of the five clusters from the cross-sectional population. (A) Cluster 1: mild obesity and dyslipidemia related MAFLD; (B) Cluster 2: age related MAFLD; (C) Cluster 3: severe insulin resistance related MAFLD; (D) Cluster 4: high Lp(a) related MAFLD; (E) Cluster 5: severe mixed hyperlipidemia related MAFLD. Radar plots were drawn for each cluster by using z-values, which were calculated by adjusting the cluster mean for each variable to the cohort mean and SD for that variable. We then compared the radar plots visually and described the particular characteristics of each cluster.
To verify whether the classification is specific to MAFLD, we also used the same cluster analysis method to group non-MAFLD patients. The results are shown in a t-SNE plot (Supplementary Fig. 6).
3.3 Comparisons of risk of complications and mortality among the clusters
During follow-up, we compared MAFLD progression and the development of extrahepatic complications (T2DM, CHD, stroke, and overall mortality) between clusters in the Chinese and international validation cohorts, as well as in the subgroup of the latter that received MRI-PDFF (Fig. 4, Fig. 5). In the Chinese validation cohort, patients in Cluster 3 had the highest risk of developing T2DM over a median follow-up time of 9.7 years, while patients in Cluster 1 had the lowest risks of complications, and the frequencies of the complications investigated were similar to those of the control group (Fig. 4A). These results were in accordance with those of the international validation cohort and the subgroup that received MRI-PDFF, with Cluster 3 having the worst outcomes and Cluster 5 having a better outcome with respect to the occurrence of T2DM (Fig. 5A).
Fig. 4 Prognosis of disease over time by clusters in the Chinese cohort. Prognosis of (A) type-2 diabetes, (B) CHD and (C) stroke and (D) time to death.
Fig. 5 Prognosis of disease over time by clusters in the international cohort and the international cohort with MRI-PDFF. Prognosis of (A) type-2 diabetes, (B) CHD and (C) stroke and (D) time to death.
The associations between the outcomes and the different clusters were examined using Cox regression models, with non-MAFLD patients as the reference. Multivariate adjusted Cox proportional hazards models demonstrated that Cluster 3 was at a significantly higher risk of developing T2DM (HR = 11.78, 95 % CI: 5.13–27.02; p < 0.0001) than the other clusters or the controls in the Chinese validation cohort (Supplementary Tables 6–11). A similar finding was observed in the international cohort (HR = 7.54, 95 % CI: 6.91–8.22; p < 0.0001, Supplementary Tables 10–11).
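The reported hazard ratios and confidence intervals can be sanity-checked with a short calculation. Assuming the intervals are Wald intervals that are symmetric on the log scale (an assumption; the article does not state how they were computed), the standard error of the log hazard ratio can be recovered from the interval limits, and the geometric midpoint of the interval should land close to the reported HR. The function names below are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95 % normal quantile


def log_hr_se_from_ci(lower, upper):
    """Standard error of log(HR) implied by a 95 % Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def ci_midpoint_hr(lower, upper):
    """Geometric midpoint of the interval; close to the HR if the CI is symmetric on the log scale."""
    return math.exp((math.log(lower) + math.log(upper)) / 2)


def wald_p_value(hr, lower, upper):
    """Two-sided p value from the Wald z statistic log(HR) / SE."""
    z = math.log(hr) / log_hr_se_from_ci(lower, upper)
    return math.erfc(abs(z) / math.sqrt(2))


# Values reported in the text for Cluster 3 and T2DM.
chinese = (11.78, 5.13, 27.02)        # HR, lower, upper (Chinese cohort)
international = (7.54, 6.91, 8.22)    # HR, lower, upper (international cohort)

p_chinese = wald_p_value(*chinese)
p_international = wald_p_value(*international)
```

The much narrower interval in the international cohort reflects its far larger sample size: the implied standard error of log(HR) is several times smaller than in the Chinese cohort even though the point estimate is lower.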
Cluster 1 had the lowest risk of developing CHD, while the risk was markedly greater for patients in Clusters 2, 3, 4, and 5 across the 2 cohorts and the MRI-PDFF subgroup (Figs. 4B, 5A, B). Kaplan–Meier analysis of the international cohort revealed that the cumulative CHD rate was lowest in Cluster 1 and increased stepwise across Clusters 2, 3, 4, and 5, all of which were higher than in Cluster 1 and non-MAFLD patients (log rank: p < 0.001) (Fig. 5A). No significant difference was observed between Clusters 2 and 4 (Supplementary Tables 6–11). Notably, among patients who received MRI-PDFF, worse outcomes with respect to CHD development were found in Clusters 2 and 5 as compared to Cluster 1 and the non-MAFLD controls. The difference in CHD development rate between Cluster 3 and Cluster 1 was borderline significant; however, further adjusted Cox models showed that Cluster 3 was associated with the highest risk of CHD, while Clusters 2, 4, and 5 had an intermediately higher risk. However, there was only a statistically significant difference between Cluster 5 and the MRI-PDFF subgroup.
With respect to stroke, all clusters in the international cohort except Cluster 5 had an increased risk as compared to non-MAFLD patients (all, p < 0.001), but no significant difference was seen among clusters in those with MRI-PDFF measurements (Supplementary Table 11). Meanwhile, Clusters 3 and 4 also had increased HRs (HR = 1.52, 95 % CI: 1.24–1.86; p < 0.0001 and HR = 1.19, 95 % CI: 1.01–1.40; p = 0.033, respectively) on multivariate analysis.
With respect to all-cause mortality, the risk of death was higher in Clusters 2, 3 and 4 than in the control group, whereas Cluster 5 showed similar survival and Cluster 1 showed better survival in the international cohort (Fig. 4, Fig. 5). However, there was no significant difference in the MRI-PDFF subgroup. Adjusted Cox regression analysis produced similar results.
In addition, by comparison with the cumulative CHD, T2DM, stroke and all-cause mortality rates of MAFLD patients, we found that the associations with cluster characteristics ceased to exist in non-MAFLD patients (Supplementary Figs. 18–21). We also applied the same clustering algorithm to the non-MAFLD patients of the international cohort, and the incidence and risk of the different outcomes in each cluster were assessed by Cox regression for both MAFLD and non-MAFLD patients (Supplementary Table 12). We also found that the performance of each variable used in the clustering algorithm varied in predicting different outcomes (Supplementary Table 13).
4. Discussion
MAFLD has been identified as a driving force for the increasing rates of T2DM and CHD, as well as mortality associated with these conditions [
]. The lack of adequate treatments has been ascribed in large part to the high heterogeneity of NAFLD; therefore, a new classification system was developed and the condition was renamed MAFLD [
The Asian Pacific Association for the Study of the Liver clinical practice guidelines for the diagnosis and management of metabolic associated fatty liver disease.
]. However, the classification is too simple, and is based on international expert consensus rather than data. To address this issue, the current study first identified 5 novel subgroups of MAFLD patients through data-driven cluster analysis using demographic and metabolic data, including age and HbA1c, TC, triglyceride, HDL-C and Lp(a) levels. Further analysis of the novel subgroups (clusters) found that the new classification into 5 clusters is superior to the classic MAFLD classification, as it provides prognostic information on the risks of complications in each of the 5 clusters. The findings were then validated in cohorts different from those used to build the models.
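As a rough illustration of what data-driven clustering on these six variables involves, the sketch below standardizes each variable and runs a plain k-means (Lloyd's algorithm). This is an assumption for illustration only: this section does not state which clustering algorithm, scaling or initialization the authors used. The function names, the naive evenly spaced initialization and the toy values are all hypothetical, and the toy example uses k = 2 rather than 5 because it contains only six rows.

```python
import math

def standardize(rows):
    """Z-score each column so that no single variable dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]

def kmeans(rows, k, iters=100):
    """Lloyd's algorithm; returns a cluster label for each row.

    Initial centers are evenly spaced rows (a simplification; real
    analyses typically use several random or k-means++ starts).
    """
    step = len(rows) // k
    centers = [list(rows[i * step]) for i in range(k)]
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: math.dist(r, centers[j])) for r in rows
        ]
        if new_labels == labels:
            break                      # assignments stable: converged
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:                # keep the old center if a cluster empties
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels

# Toy rows (hypothetical): age, HbA1c, TC, triglyceride, HDL-C, Lp(a).
rows = [
    [35, 5.2, 4.5, 1.2, 1.4, 100],
    [38, 5.4, 4.7, 1.3, 1.3, 120],
    [40, 5.3, 4.6, 1.1, 1.5, 110],
    [62, 7.8, 6.5, 3.2, 0.9, 400],
    [65, 8.1, 6.8, 3.5, 0.8, 450],
    [60, 7.5, 6.3, 3.0, 1.0, 420],
]
labels = kmeans(standardize(rows), k=2)
print(labels)
```

Standardizing first matters because the variables are on very different scales; without it, the variable with the largest numeric range would dominate the Euclidean distance.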
Insulin resistance (IR) occurs when a normal amount of insulin cannot efficiently promote glucose metabolism, and is a critical pathophysiological feature of MAFLD and metabolic syndrome [
]. IR is compensated for by hyperinsulinemia, which promotes increased release of FFAs from adipocytes and consequently a higher rate of FFA influx into the liver and pancreas, as well as hepatic de novo fat synthesis. This accelerates the development of glycemic dysregulation, including hepatic IR and diabetes, and also results in bidirectional positive feedback loops that escalate metabolic abnormalities [
]. A retrospective study involving 4010 participants, including 2220 NAFLD patients, suggested that serum FFA levels in NAFLD patients were associated with the risk of IR (OR 9.24, 95 % CI 6.43–13.36), prediabetes (OR 10.48, 95 % CI 5.66–19.39) and T2DM (OR 19.43, 95 % CI 12.75–29.81) [
]. A cohort study based on Mendelian randomization found that, after adjusting for multiple confounding factors, the degree of hepatic steatosis was correlated with IR (β: 0.25, 95 % CI 0.20–0.31, p < 0.001) and the severity of hepatic fibrosis (β: 0.35, 95 % CI 0.30–0.39) [
]. Our study showed that patients with severe IR can be clustered as a distinct group with high BMI, FFAs and liver fat content and relatively higher liver stiffness, indicating that these patients will likely benefit from targeted treatment of IR and the prevention of liver-related adverse outcomes.
Mixed hyperlipidemia, including hypertriglyceridemia and hypercholesterolemia, and an atherogenic lipoprotein profile are major contributing factors in the development of cardiovascular disease (CVD). Patients with MAFLD commonly present with increased very low-density lipoprotein (VLDL) secretion [
Atherogenic dyslipidemia, but not hyperglycemia, is an independent factor associated with liver fibrosis in subjects with type 2 diabetes and NAFLD: a population-based study.
]. A retrospective cohort study of 7077 initially NAFLD-free participants reported that, after 7 years of follow-up, elevated serum ApoB levels independently predicted an increased risk of developing NAFLD [
]. Another cohort study, which recruited 32,121 Chinese subjects, suggested that the risk of NAFLD development had a non-linear positive relationship with the TC/HDL-C ratio [
]. In our study, patients in Cluster 5, characterized by extremely high serum levels of various lipids, had a higher risk of developing CHD and T2DM than non-MAFLD patients and patients in Cluster 1, even after adjusting for a history of lipid-lowering drug use. This suggests that MAFLD patients with severe lipid abnormalities (Cluster 5) should receive more intensive lipid control.
Lp(a) is an LDL-like particle composed of an ApoB-100 molecule covalently linked to an apolipoprotein(a) [apo(a)] molecule [
]. A cross-sectional study enrolling 2242 Korean patients reported a dose-response relationship between serum Lp(a) level and the presence of NAFLD (p for trend <0.001) [
]. Our study suggested that, in MAFLD, elevated serum Lp(a) levels were associated with a higher risk of CHD development, even when hepatic steatosis was milder. It can be inferred that Lp(a) may be particularly useful for predicting the development of CVD in MAFLD.
The liver is not merely the hepatic manifestation of metabolic syndrome; it is also the metabolic factory that becomes unbalanced by metabolic changes [
]. Our classification underlines the role of the liver in the pathogenesis of metabolic derangement in MAFLD. Since liver biopsy was not available and liver disease outcomes were not included in the analysis, the classification proposed in our study applies only to extrahepatic complications. We do not claim that the new clusters represent different etiologies of MAFLD, nor that this clustering is the best classification of MAFLD subtypes. Interestingly, the variables with the best discrimination ability and stability were metabolic factors rather than liver-specific factors such as markers of liver injury or fibrosis. Because the study subjects were new-onset MAFLD patients, fewer fibrosis events may have been captured, as fibrosis takes longer to develop. Therefore, it is unclear whether the clusters, which may reflect the driving or major metabolic dysfunction at disease onset, are in any way specific to fibrosis progression in MAFLD. Moreover, whether patients can move from one cluster to another needs to be investigated in future prospective follow-up studies with larger sample sizes. In our study, inclusion of MAFLD patients in the cross-sectional (phase 1) study was based on MRI-PDFF, whereas inclusion in the Chinese validation cohort was based on ultrasound and inclusion in the international cohort was based on the FLI, which is itself calculated from some of the variables included in the clustering algorithm. These differences in inclusion criteria may introduce some bias in cluster grouping.
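To make the overlap concrete: the Fatty Liver Index, as originally published by Bedogni et al. (2006), is a logistic function of triglycerides, BMI, gamma-glutamyl transferase (GGT) and waist circumference, so triglycerides enter both the FLI-based inclusion criterion and the clustering variables. The sketch below implements that published formula. The function name and the example values are hypothetical, and the units (triglycerides in mg/dL, GGT in U/L, waist in cm) follow the original FLI definition rather than anything stated in this section.

```python
import math

def fatty_liver_index(tg_mg_dl, bmi, ggt_u_l, waist_cm):
    """Fatty Liver Index on a 0-100 scale (published FLI coefficients)."""
    linear = (
        0.953 * math.log(tg_mg_dl)
        + 0.139 * bmi
        + 0.718 * math.log(ggt_u_l)
        + 0.053 * waist_cm
        - 15.745
    )
    # Logistic transform of the linear predictor, scaled to 0-100.
    return 100 / (1 + math.exp(-linear))

# Hypothetical example values, not taken from the study.
fli = fatty_liver_index(tg_mg_dl=150, bmi=28, ggt_u_l=40, waist_cm=95)
print(f"FLI = {fli:.1f}")
```

Because higher triglycerides raise the FLI directly, a cohort selected by FLI is likely to be enriched for high-triglyceride patients, which is one way the inclusion criterion could influence cluster membership.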
Liver fibrosis stage and its clinical course have been identified as key factors in the development of T2DM and CVD; however, our cluster model was built using data from patients with new-onset MAFLD, in whom fibrosis would be expected to be a stable and only mildly abnormal variable (cluster-center coordinates in the cross-sectional population are shown in Supplementary Table 14). Fibrosis might therefore be more appropriate as a monitoring parameter to refine risk stratification. The bias from medication history also needs consideration. In both the Chinese cohort and the international cohort, only MODM patients were less likely to be taking lipid-lowering drugs and aspirin than those in other clusters (Supplementary Tables 3–5), indicating that medication had little influence on the results. The value of using additional cluster variables such as genetic variants deserves further study. Lastly, data on other known risk factors for MAFLD complications, such as C-peptide, inflammatory cytokines, and biopsy-based histological category, were not available and thus were not included in the analysis.
5. Conclusions
The novel classification method developed in the study based on lipid levels, age and HbA1c level offers improved discrimination of new-onset MAFLD patients with different metabolic complications, especially for the long-term risks of development of T2DM and CHD. The classification of MAFLD patients into 5 categories may assist in developing more individualized management strategies based on targeting the underlying metabolic dysfunction.
Data sharing
• Only data that do not involve the individual participants' personal privacy will be available to others who are interested in our study.
• Data that do not contain any identifying information will be made available.
• Additional, related documents will be available.
• The data will be available with publication.
• The data will be made available on request by emailing the corresponding author.
• Sharing of our data requires a signed data access agreement.
Funding
This study was funded by the National Natural Science Foundation of China (82100648, 81870404, 81670518), the Guangdong Basic and Applied Basic Research Foundation (2022A151501225369) and the China Postdoctoral Science Foundation (2020M683128).
Ethics declaration
The protocol of the study was approved by the Institutional and Regional Medical Ethics committees of the hospitals participating in the study (ethics code: [2020]187), and registered in the Chinese Clinical Trial Register (ChiCTR-DCD-2000034197). Written consent from patients in the first validation cohort was waived because their data were retrospectively and anonymously extracted from the electronic Hospital Information Management System. All participants in the international validation cohort provided written informed consent, and the study was approved by the NHS National Research Ethics Service. This research has been conducted using the UK Biobank Resource under Application Number 56925.
Data availability statement
All data relevant to the study are included in the article or uploaded as supplementary information.
We are grateful to Professor Aihua Lin from the Department of Medical Statistics, School of Public Health, Sun Yat-sen University and Yang Jhih-Jyun from QuickyDen Co, Ltd., Guangzhou, China, who provided assistance with the statistical analysis in this study.
The Asian Pacific Association for the Study of the Liver clinical practice guidelines for the diagnosis and management of metabolic associated fatty liver disease.
Fibrosis severity as a determinant of cause-specific mortality in patients with advanced nonalcoholic fatty liver disease: a multi-national cohort Study.
Insulin resistance exhibits varied metabolic abnormalities in nonalcoholic fatty liver disease, chronic hepatitis B and the combination of the two: a cross-sectional study.
The ability of lipoprotein (a) level to predict early carotid atherosclerosis is impaired in patients with advanced liver fibrosis related to metabolic-associated fatty liver disease.
Predicting the 10-year risks of atherosclerotic cardiovascular disease in Chinese population: the China-PAR project (Prediction for ASCVD risk in China).
Guidelines for the early management of patients with acute ischemic stroke: 2019 update to the 2018 guidelines for the early management of acute ischemic stroke: a guideline for healthcare professionals from the American Heart Association/American Stroke Association.
Atherogenic dyslipidemia, but not hyperglycemia, is an independent factor associated with liver fibrosis in subjects with type 2 diabetes and NAFLD: a population-based study.