DeepKEGG: a multi-omics data integration framework with biological insights for cancer recurrence prediction and biomarker discovery

journal contribution
posted on 2024-05-21, 02:51, authored by W Lan, H Liao, Qingfeng Chen, L Zhu, Y Pan, Yi-Ping Phoebe Chen

Abstract: Deep learning-based multi-omics data integration methods have the capability to reveal the mechanisms of cancer development, discover cancer biomarkers and identify pathogenic targets. However, current methods ignore the potential correlations between samples in integrating multi-omics data. In addition, providing accurate biological explanations still poses significant challenges due to the complexity of deep learning models. Therefore, there is an urgent need for a deep learning-based multi-omics integration method to explore the potential correlations between samples and provide model interpretability. Herein, we propose a novel interpretable multi-omics data integration method (DeepKEGG) for cancer recurrence prediction and biomarker discovery. In DeepKEGG, a biological hierarchical module is designed for local connections of neuron nodes and model interpretability based on the biological relationship between genes/miRNAs and pathways. In addition, a pathway self-attention module is constructed to explore the correlation between different samples and generate the potential pathway feature representation for enhancing the prediction performance of the model. Lastly, an attribution-based feature importance calculation method is utilized to discover biomarkers related to cancer recurrence and provide a biological interpretation of the model. Experimental results demonstrate that DeepKEGG outperforms other state-of-the-art methods in 5-fold cross validation. Furthermore, case studies also indicate that DeepKEGG serves as an effective tool for biomarker discovery. The code is available at https://github.com/lanbiolab/DeepKEGG.
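The two architectural ideas named in the abstract, pathway-restricted local connections and self-attention across samples, can be illustrated with a minimal sketch. This is not the authors' implementation (that is in the repository linked above); the function names, the toy gene-to-pathway mask, the tanh activation and the use of identity query/key/value projections are all assumptions made here for illustration, and the attribution step is not shown.

```python
import math


def masked_pathway_layer(x, weights, mask):
    """Map gene-level features to pathway-level activations.

    x       : list of G gene feature values for one sample
    weights : P x G weight matrix (hypothetical, not learned here)
    mask    : P x G 0/1 matrix; mask[p][g] == 1 only if gene g is
              annotated to pathway p, so other connections are cut
    """
    return [
        math.tanh(sum(w * m * v for w, m, v in zip(w_row, m_row, x)))
        for w_row, m_row in zip(weights, mask)
    ]


def sample_self_attention(h):
    """Scaled dot-product self-attention across samples.

    h : N x P matrix of pathway features, one row per sample.
    Identity projections are used for queries, keys and values
    (an assumption for brevity), so each output row is a
    softmax-weighted average of all samples' pathway features.
    """
    n, d = len(h), len(h[0])
    out = []
    for i in range(n):
        scores = [
            sum(a * b for a, b in zip(h[i], h[j])) / math.sqrt(d)
            for j in range(n)
        ]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        attn = [e / total for e in exps]
        out.append([sum(attn[j] * h[j][k] for j in range(n)) for k in range(d)])
    return out


# Toy data: 4 genes, 2 pathways; genes 0-1 belong to pathway 0, genes 2-3 to pathway 1.
MASK = [[1, 1, 0, 0], [0, 0, 1, 1]]
WEIGHTS = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
SAMPLES = [
    [0.5, 0.5, 9.0, 9.0],
    [0.5, 0.5, -3.0, 2.0],
    [1.0, -1.0, 0.2, 0.1],
]

pathway_features = [masked_pathway_layer(s, WEIGHTS, MASK) for s in SAMPLES]
attended = sample_self_attention(pathway_features)
```

In this toy setup the first two samples share the same values for genes 0 and 1, so their pathway-0 activations coincide regardless of the other genes; that is the sense in which the mask keeps each pathway node tied to its own members and therefore interpretable.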

Funding

This work was partially supported by the National Natural Science Foundation of China (Nos. 62072124 and 62172158), the Natural Science Foundation of Guangxi (No. 2023JJG170006), the Natural Science Foundation of Hunan Province (No. 2023JJ50117), the Natural Science and Technology Innovation Development Foundation of Guangxi University (No. 2022BZRC009), the CAAI-Huawei MindSpore Open Fund (No. CAAIXSJLJJ-2022-022A), the Project of Guangxi Key Laboratory of Eye Health (No. GXYJK-202407), and the Project of Guangxi Health Commission eye and related diseases artificial intelligence screen technology key laboratory (No. GXYAI-202402).

History

Publication Date

2024-03-27

Journal

Briefings in Bioinformatics

Volume

25

Issue

3

Article Number

bbae185

Pagination

16p.

Publisher

Oxford University Press (OUP)

ISSN

1467-5463

Rights Statement

© The Author(s) 2024. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.