Identification of ulcerative colitis and Crohn's disease based on spatial and bilinear attention network

Cite this article

QI Jing, RUAN Guangcong, YANG Yi, WU Yi, CAO Qian, WEI Yanling, NIAN Yongjian. Identification of ulcerative colitis and Crohn's disease based on spatial and bilinear attention network[J]. Journal of Army Medical University, 2023, 45(3): 227-234, 242. DOI: 10.16016/j.2097-0927.202209187

This open-access article is distributed under the CC BY license.

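QI Jing¹, RUAN Guangcong², YANG Yi¹, WU Yi¹, CAO Qian³, WEI Yanling², NIAN Yongjian¹

1. Department of Digital Medicine, School of Biomedical Engineering and Imaging Medicine, Army Medical University (Third Military Medical University), Chongqing, 400038;
2. Department of Gastroenterology, Army Medical Center of PLA, Chongqing, 400042;
3. Department of Gastroenterology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province, 310016, China

Received: 2022-09-21; Revised: 2022-10-18

Supported by the Project of Postgraduate Scientific Research Innovation of Chongqing (CYS22746)

Corresponding author: NIAN Yongjian, E-mail: yjnian@tmmu.edu.cn

[Abstract] Objective To use deep learning to assist endoscopists in identifying ulcerative colitis (UC) and Crohn's disease (CD). Methods Endoscopic images of 1 576 subjects were collected from the Department of Gastroenterology of Army Medical Center of PLA and the Department of Gastroenterology of Sir Run Run Shaw Hospital between January 2018 and November 2020, comprising 34 300 images in three classes (CD, UC and normal). The images were randomly divided into a training set and a test set at a ratio of 9:1 to train and test the network. A novel spatial and bilinear deep network (SABA-ResNet) was built on ResNet50. A spatial attention mechanism was introduced in which dilated convolution enlarges the receptive field to exploit contextual information and works together with the local inductive bias of standard convolution to focus adaptively on lesion regions. Bilinear attention was used to improve the feature representation ability of the network by weighting the channel information of the feature maps with second-order information, thereby improving classification performance. Results On the test set, the overall accuracy of SABA-ResNet for recognizing CD, UC and normal images was 92.67% (95%CI: 91.91~93.37). For CD, UC and normal, respectively, the AUC was 0.978 (95%CI: 0.972~0.983), 0.977 (95%CI: 0.971~0.982) and 0.999 (95%CI: 0.998~1.000), the sensitivity was 88.40%, 89.07% and 98.89%, the specificity was 95.49%, 94.88% and 98.93%, and the F1 score was 88.80%, 89.01% and 98.60%. Ablation experiments and class activation maps indicated that spatial attention and bilinear attention help the model capture more features of lesion regions. Conclusion The proposed network combines spatial attention with bilinear attention, performs well in recognizing CD, UC and normal images, and can effectively assist endoscopists in identifying UC and CD.

[Key words] inflammatory bowel disease; deep learning; ulcerative colitis; Crohn's disease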

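Inflammatory bowel disease (IBD), which comprises ulcerative colitis (UC) and Crohn's disease (CD), is a chronic inflammatory disease of the gastrointestinal tract caused by the interaction of genetic, environmental, immune and microbial factors. IBD has a hereditary predisposition, is incurable, and increases the risk of colorectal cancer^[1]. As a multisystem disease, IBD also affects the musculoskeletal, ocular and cutaneous systems in addition to the gastrointestinal tract^[2], seriously impairing patients' quality of life. Since the beginning of the 21st century, the incidence and prevalence of IBD in Asian countries have gradually increased under the influence of environmental factors^[3]. Because UC and CD differ considerably in treatment strategy, prognosis, comprehensive assessment and clinical care, accurately distinguishing the two is important for clinical treatment decisions and for improving prognosis.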

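Endoscopy plays a vital role in the diagnosis, treatment and follow-up monitoring of IBD. It can be used to differentiate UC from CD and to determine the severity and extent of lesions^[4]. However, the differentiation relies mainly on the experience and skill of the endoscopist, and CD is easily mistaken for UC when the lesions involve only the colon. In addition, fatigue from long working hours affects, to some extent, the accuracy with which endoscopists identify UC and CD.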

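The development of artificial intelligence has changed the way clinicians and researchers process and analyze large datasets. Several studies have explored artificial intelligence for IBD subtype classification. In a cross-sectional study of 59 CD patients, 26 UC patients and 42 healthy controls, SMOLANDER et al^[5] applied deep belief networks and support vector machines to gene expression datasets and reached an overall accuracy of (95.71±0.21)% for CD and UC. In a retrospective cohort study, TONG et al^[6] collected textual descriptions of endoscopic images from 875 CD patients, 5 128 UC patients and 396 patients with intestinal tuberculosis and classified them with random forest (RF) and convolutional neural networks (CNNs); the precision of RF and CNNs for diagnosing UC/CD was 0.97/0.65 and 0.99/0.87, respectively. Compared with gene expression and text, images show lesion characteristics more directly, offer essentially unlimited granularity, and carry richer semantic information. WANG et al^[7] developed a ResNeXt101 network to recognize three kinds of endoscopic images (CD, UC and normal) with an overall accuracy of 92.04%, outperforming most clinicians and demonstrating the potential of deep learning for assisting the diagnosis of IBD subtypes.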

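Lesions of CD and UC appear diffuse on endoscopic images and are often highly similar. Making a deep learning model focus on the task-relevant, high-contribution regions while avoiding the influence of background noise is therefore key to improving recognition performance. Building on ResNet50, this study proposes SABA-ResNet. A spatial attention mechanism is introduced to strengthen feature extraction from lesion regions, and a bilinear attention module is introduced that captures second-order statistics while modeling the interdependence between second-order local and global features. Together these strengthen the network's ability to recognize key features and help improve classification performance.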

1 Materials and methods

1.1 Dataset construction

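A total of 34 300 endoscopic images from 1 576 subjects were collected between January 2018 and November 2020 from the Department of Gastroenterology of Army Medical Center of PLA and the Department of Gastroenterology of Sir Run Run Shaw Hospital, Zhejiang University. The subjects comprised 556 CD patients, 658 UC patients and 362 normal subjects. The study was approved by the Ethics Committee of Army Medical Center of PLA (2021-285).

The dataset was randomly divided into a training set and a test set at a ratio of 9:1, containing 29 414 and 4 886 endoscopic images, respectively. The data sources and the split are detailed in Table 1.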

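Table 1 Data sources and dataset split (number of images)

Source	CD	UC	Normal	Total
Army Medical Center of PLA	1 093	3 379	12 689	17 161
Sir Run Run Shaw Hospital	9 670	7 469	0	17 139
Training set	9 315	9 302	10 797	29 414
Test set	1 448	1 546	1 892	4 886
Total	10 763	10 848	12 689	34 300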
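
The figures in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary and function names are made up for this example, and the image-level test share it prints is a derived quantity rather than a figure reported by the authors.

```python
# Image counts transcribed from Table 1 (columns: CD, UC, normal).
sources = {
    "Army Medical Center of PLA": (1093, 3379, 12689),
    "Sir Run Run Shaw Hospital": (9670, 7469, 0),
}
splits = {
    "train": (9315, 9302, 10797),
    "test": (1448, 1546, 1892),
}


def column_sums(rows):
    """Sum each class column over the given rows."""
    return tuple(sum(col) for col in zip(*rows.values()))


per_class_by_source = column_sums(sources)
per_class_by_split = column_sums(splits)
total_images = sum(per_class_by_split)

# Share of all images, and of each class, that ends up in the test set.
test_fraction = sum(splits["test"]) / total_images
test_share_per_class = tuple(
    test / total for test, total in zip(splits["test"], per_class_by_split)
)

print(per_class_by_source)   # class totals counted by hospital
print(per_class_by_split)    # class totals counted by train/test split
print(total_images)
print(round(test_fraction, 4))
print([round(share, 4) for share in test_share_per_class])
```

Both ways of counting give the same class totals and 34 300 images overall. At image level the held-out share works out to roughly 14%. The text does not say whether the 9:1 split was applied per image or per subject.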

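1.2 Overall network architecture

The structure of SABA-ResNet is shown in Figure 1. The backbone is ResNet50, which consists of a stem and four stages. The stem contains a 7×7 convolution layer (Convolution), a batch normalization layer (BatchNorm), a ReLU activation and a max pooling layer (Maxpool). Each stage contains one downsampling block (Bottleneck Downsample) and several residual blocks (Bottleneck). An SA module is inserted after the output of each of the first three stages, and a BA module is inserted after the output of the last stage. The result then passes through an average pooling layer (Avgpool) and a fully connected layer (FC) to output the probability scores of the three classes.

Figure 1 Overall network architecture of SABA-ResNet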
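
To make the placement of the attention modules concrete, the following sketch traces feature-map shapes through a standard ResNet50 layout. Several things here are assumptions rather than details from the paper: the 224×224 input size, the helper names `conv_out` and `trace_shapes`, and the premise that the SA and BA modules leave the shape of the feature map unchanged. The channel widths are those of the usual ResNet50.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(input_size=224):
    """Return (name, channels, height/width) after the stem and each stage.

    input_size=224 is an assumption; the source does not give the resolution.
    """
    shapes = []
    size = conv_out(input_size, kernel=7, stride=2, padding=3)  # stem 7x7 convolution
    size = conv_out(size, kernel=3, stride=2, padding=1)        # stem max pooling
    shapes.append(("stem", 64, size))

    # Usual ResNet50 stage widths; stages 2 to 4 halve the resolution
    # in their downsampling block.
    stage_channels = (256, 512, 1024, 2048)
    for index, channels in enumerate(stage_channels, start=1):
        if index > 1:
            size = conv_out(size, kernel=3, stride=2, padding=1)
        # SA after each of the first three stages, BA after the last one,
        # as described in the text; both assumed to be shape-preserving.
        attention = "SA" if index <= 3 else "BA"
        shapes.append((f"stage{index}+{attention}", channels, size))
    return shapes


for name, channels, size in trace_shapes():
    print(f"{name:10s} {channels:5d} x {size} x {size}")
```

Under these assumptions the SA modules operate on 56×56, 28×28 and 14×14 maps, where spatial detail is still available, while the BA module sees the 2048-channel 7×7 map just before average pooling.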

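1.3 Spatial attention module

The structure of the spatial attention module is shown in Figure 2. Let the input feature of the spatial attention module be F_input^SA ∈ ℝ^(C_i×H_i×W_i), where C_i, H_i and W_i are the number of channels, height and width at stage i. F_input^SA first passes through a 1×1 convolution that reduces the channel dimension to cut computation, giving an intermediate feature of size (C_i/r)×H_i×W_i, where r is the reduction factor, set to 4 in the experiments. A standard 3×3 convolution and a 3×3 dilated convolution then encode spatial information jointly. Dilated convolution^[8] has a larger receptive field, exploits contextual information effectively and discriminates diffuse lesions better, whereas standard convolution is compact, captures local detail and adapts better to scattered lesions. In each branch a 3×3 convolution then compresses the channel dimension to 1, yielding the spatial maps M₁ ∈ ℝ^(1×H_i×W_i) and M₂ ∈ ℝ^(1×H_i×W_i), computed as follows: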

(1)
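
As a rough illustration of why the two branches see different amounts of context, the sketch below applies a 3×3 kernel to a single bright pixel with and without dilation. It is not the authors' module: it is single-channel and written in plain Python, the function names are invented for this example, and the dilation rate of 2 is an assumption because the text here does not state the rate used.

```python
def effective_kernel(kernel, dilation):
    """Spatial extent covered by a dilated kernel along one axis."""
    return kernel + (kernel - 1) * (dilation - 1)


def conv2d_same(image, kernel, dilation=1):
    """Single-channel 2-D cross-correlation, zero padded, output same size as input."""
    height, width = len(image), len(image[0])
    k = len(kernel)
    half = (k // 2) * dilation
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    yy = y + i * dilation - half
                    xx = x + j * dilation - half
                    if 0 <= yy < height and 0 <= xx < width:
                        total += kernel[i][j] * image[yy][xx]
            out[y][x] = total
    return out


def nonzero_positions(response):
    """Coordinates where the response is non-zero."""
    return [(y, x) for y, row in enumerate(response)
            for x, value in enumerate(row) if value != 0.0]


# One bright pixel in a 7x7 image: the response marks every output
# position whose kernel window reaches that pixel.
impulse = [[0.0] * 7 for _ in range(7)]
impulse[3][3] = 1.0
ones = [[1.0] * 3 for _ in range(3)]

standard = nonzero_positions(conv2d_same(impulse, ones, dilation=1))
dilated = nonzero_positions(conv2d_same(impulse, ones, dilation=2))  # assumed rate

print(effective_kernel(3, 1), effective_kernel(3, 2))
print(standard)
print(dilated)
```

Both kernels have nine weights, but the dilated one spreads them over a 5×5 window instead of 3×3, so it relates pixels that are further apart at no extra parameter cost. That wider reach is the property the text attributes to the dilated branch.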

Metric (%)	VGG19	ResNet34	InceptionV4	SE-ResNet50	DenseNet121	SABA-ResNet
Overall ACC	90.79	91.67	92.18	92.24	92.26	92.67
CD
ACC	91.71	92.51	93.02	92.92	93.02	93.39
SEN	87.02	87.78	87.91	88.12	89.30	88.40
SPE	93.69	94.50	95.17	94.94	94.59	95.49
PPV	85.31	87.05	88.46	88.00	87.42	89.20
NPV	94.49	94.83	94.92	94.99	95.45	95.13
F1	86.15	87.41	88.19	88.06	88.35	88.80
UC
ACC	91.47	92.10	92.82	92.73	92.82	93.04
SEN	85.58	87.26	88.42	88.16	87.13	89.07
SPE	94.19	94.34	94.85	94.85	95.45	94.88
PPV	87.21	87.71	88.82	88.79	89.86	88.95
NPV	93.38	94.12	94.65	94.54	94.12	94.94
F1	86.39	87.48	88.62	88.48	88.47	89.01
Normal
ACC	98.40	98.73	98.53	98.83	98.69	98.92
SEN	97.94	98.26	98.52	98.73	98.73	98.89
SPE	98.70	99.03	98.53	98.90	98.66	98.93
PPV	97.94	98.46	97.69	98.26	97.90	98.32
NPV	98.70	98.90	99.06	99.20	99.19	99.30
F1	97.94	98.36	98.11	98.50	98.32	98.60
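
The SABA-ResNet column can be tied back to the test-set sizes in Table 1 with a short calculation. The sketch below recovers the number of correctly classified images per class from the reported sensitivities, recomputes overall accuracy and F1, and evaluates a Wilson score interval for the accuracy. The paper does not state how its confidence intervals were obtained, so the Wilson interval is an assumption made for this example, as are all variable and function names.

```python
import math

# Test-set class sizes (Table 1) and SABA-ResNet values from the table above (%).
class_size = {"CD": 1448, "UC": 1546, "Normal": 1892}
sensitivity = {"CD": 88.40, "UC": 89.07, "Normal": 98.89}
ppv = {"CD": 89.20, "UC": 88.95, "Normal": 98.32}

# Sensitivity = TP / (images of that class), so TP follows by rounding.
true_positives = {c: round(sensitivity[c] / 100 * class_size[c]) for c in class_size}
n_total = sum(class_size.values())
n_correct = sum(true_positives.values())
overall_acc = n_correct / n_total


def f1_score(precision, recall):
    """Harmonic mean of precision (PPV) and recall (sensitivity)."""
    return 2 * precision * recall / (precision + recall)


f1 = {c: round(f1_score(ppv[c], sensitivity[c]), 2) for c in class_size}


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


low, high = wilson_interval(n_correct, n_total)

print(true_positives)
print(round(overall_acc * 100, 2))
print(f1)
print(round(low * 100, 2), round(high * 100, 2))
```

With these inputs the recomputed overall accuracy and F1 values agree with the table to two decimals, and the Wilson bounds come out at about 91.91 and 93.37, in line with the interval quoted in the abstract.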

Model	Overall ACC	Class	ACC	SEN	SPE	PPV	NPV	F1
ResNet50	91.94	CD	92.90	87.78	95.06	88.20	94.86	87.99
		UC	92.55	87.84	94.73	88.53	94.39	88.18
		Normal	98.42	98.47	98.40	97.49	99.03	97.98
ResNet50+BA	92.30	CD	93.12	88.74	94.97	88.13	95.25	88.44
		UC	92.84	87.84	95.15	89.34	94.41	88.58
		Normal	98.65	98.68	98.63	97.85	99.16	98.26
ResNet50+SA	92.39	CD	93.21	88.88	95.03	88.27	95.30	88.58
		UC	92.98	88.16	95.21	89.49	94.56	88.82
		Normal	98.59	98.52	98.63	97.85	99.06	98.18
SABA-ResNet	92.67	CD	93.39	88.40	95.49	89.20	95.13	88.80
		UC	93.04	89.07	94.88	88.95	94.94	89.01
		Normal	98.92	98.89	98.93	98.32	99.30	98.60
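
One simple way to read the ablation table is as gains in overall accuracy over the plain ResNet50 baseline. The sketch below does that subtraction. The names are illustrative, and the additive estimate is only a reference point for comparison, not a claim about how the two modules interact or about statistical significance.

```python
# Overall accuracy (%) transcribed from the ablation table.
overall_acc = {
    "ResNet50": 91.94,
    "ResNet50+BA": 92.30,
    "ResNet50+SA": 92.39,
    "SABA-ResNet": 92.67,
}

baseline = overall_acc["ResNet50"]
gain = {
    name: round(acc - baseline, 2)
    for name, acc in overall_acc.items()
    if name != "ResNet50"
}

# If the two modules contributed independently, their gains would simply add.
additive_estimate = round(gain["ResNet50+BA"] + gain["ResNet50+SA"], 2)

print(gain)
print(additive_estimate)
```

The combined model gains 0.73 percentage points, somewhat less than the 0.81 points obtained by adding the two individual gains, which would be consistent with the two attention modules partly capturing the same information.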

[1]	MCDOWELL C, FAROOQ U, HASEEB M. Inflammatory bowel disease[M]. Treasure Island: StatPearls Publishing, 2022.
[2]	MALIK T F, AURELIO D M. Extraintestinal manifestations of inflammatory bowel disease[M]. Treasure Island: StatPearls Publishing, 2022.
[3]	PARK J, CHEON J H. Incidence and prevalence of inflammatory bowel disease across Asia[J]. Yonsei Med J, 2021, 62(2): 99-108.
[4]	FLYNN S, EISENSTEIN S. Inflammatory bowel disease presentation and diagnosis[J]. Surg Clin North Am, 2019, 99(6): 1051-1062.
[5]	SMOLANDER J, DEHMER M, EMMERT-STREIB F. Comparing deep belief networks with support vector machines for classifying gene expression data from complex disorders[J]. FEBS Open Bio, 2019, 9(7): 1232-1248.
[6]	TONG Y R, LU K M, YANG Y Y, et al. Can natural language processing help differentiate inflammatory intestinal diseases in China? Models applying random forest and convolutional neural network approaches[J]. BMC Med Inform Decis Mak, 2020, 20(1): 248.
[7]	WANG L J, CHEN L P, WANG X Y, et al. Development of a convolutional neural network-based colonoscopy image assessment model for differentiating Crohn's disease and ulcerative colitis[J]. Front Med, 2022, 9: 789862.
[8]	YU F, KOLTUN V. Multi-scale context aggregation by dilated convolutions[EB/OL]. 2015, arXiv: 1511.07122. https://arxiv.org/abs/1511.07122.
[9]	SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization[C]. IEEE International Conference on Computer Vision, 2017: 618-626.
[10]	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. New York: ACM, 2017: 6000-6010. DOI: 10.5555/3295222.3295349.
[11]	GUO M H, XU T X, LIU J J, et al. Attention mechanisms in computer vision: a survey[J]. Comput Vis Media, 2022, 8(3): 331-368.
[12]	HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2018: 7132-7141. DOI: 10.1109/CVPR.2018.00745.
[13]	JADERBERG M, SIMONYAN K, ZISSERMAN A, et al. Spatial transformer networks[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems. New York: ACM, 2015: 2017-2025. DOI: 10.5555/2969442.2969465.
[14]	WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[EB/OL]. arXiv, 2018: 1807.06521. https://arxiv.org/abs/1807.06521.
[15]	GAO Z L, XIE J T, WANG Q L, et al. Global second-order pooling convolutional networks[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2020: 3019-3028. DOI: 10.1109/CVPR.2019.00314.
[16]	QIN Z Q, ZHANG P Y, WU F, et al. FcaNet: frequency channel attention networks[C]//2021 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2022: 763-772. DOI: 10.1109/ICCV48922.2021.00082.
[17]	LIN T Y, ROYCHOWDHURY A, MAJI S. Bilinear CNN models for fine-grained visual recognition[C]//2015 IEEE International Conference on Computer Vision (ICCV). IEEE, 2016: 1449-1457. DOI: 10.1109/ICCV.2015.170.
[18]	KIM J H, JUN J, ZHANG B T. Bilinear attention networks[C]//Proceedings of the 32nd International Conference on Neural Information Processing Systems. New York: ACM, 2018: 1571-1581. DOI: 10.5555/3326943.3327087.
[19]	FANG P F, ZHOU J M, ROY S, et al. Bilinear attention networks for person retrieval[C]//2019 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2020: 8029-8038. DOI: 10.1109/ICCV.2019.00812.
[20]	XIAO B, YANG Z Y, QIU X M, et al. PAM-DenseNet: a deep convolutional neural network for computer-aided COVID-19 diagnosis[J]. IEEE Trans Cybern, 2022, 52(11): 12163-12174.
[21]	ZHANG J P, XIE Y T, XIA Y, et al. Attention residual learning for skin lesion classification[J]. IEEE Trans Med Imaging, 2019, 38(9): 2092-2103.
[22]	XING X H, YUAN Y X, MENG M Q H. Zoom in lesions for better diagnosis: attention guided deformation network for WCE image classification[J]. IEEE Trans Med Imaging, 2020, 39(12): 4047-4059.
[23]	SEYEDIAN S S, NOKHOSTIN F, MALAMIR M D. A review of the diagnosis, prevention, and treatment methods of inflammatory bowel disease[J]. J Med Life, 2019, 12(2): 113-122.
[24]	RUAN G C, QI J, CHENG Y, et al. Development and validation of a deep neural network for accurate identification of endoscopic images from patients with ulcerative colitis and Crohn's disease[J]. Front Med (Lausanne), 2022, 9(8): 854677.
[25]	LI P H, XIE J T, WANG Q L, et al. Towards faster training of global covariance pooling networks by iterative matrix square root normalization[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2018: 947-955. DOI: 10.1109/CVPR.2018.00105.
