
Performance Evaluation of Vision Transformer-based Pneumonia Detection Model using Chest X-ray Images

Korean title: 흉부 X-선 영상을 이용한 Vision transformer 기반 폐렴 진단 모델의 성능 평가
Publisher: 한국방사선학회
Authors: 장준용 (Junyong Chang), 최용은 (Youngeun Choi), 이승완 (Seungwan Lee)
Publication: 『한국방사선학회 논문지』, Vol. 18, No. 5, pp. 541-549 (9 pages)
Subject classification: Engineering > Other engineering
File format: PDF
Publication date: 2024.10.31


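Korean Abstract (translated)

Various artificial neural networks, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), are being studied and serve as the base structure of other AI-based models. Among them, transformer-based neural networks have proven their performance in natural language processing and are under active research. Recently, the Vision transformer (ViT) model, which can process images, was developed by modifying the internal structure of transformer-based networks. The accuracy and performance of ViT models in vision image processing have been demonstrated in various studies. In this study, we developed ViT-based models that diagnose pneumonia from chest X-ray images and quantitatively evaluated their training efficiency and performance. The ViT-based models were designed with different numbers of encoder blocks, and different patch sizes were used during training. To validate the developed ViT-based models, their performance was compared with that of existing CNN-based models: VGGNet, GoogLeNet, and ResNet. The results showed that the training efficiency and performance of the ViT-based models varied with the number of encoder blocks and the training patch size, with F1 scores ranging from a minimum of 0.875 to a maximum of 0.919. The training efficiency of the ViT-based model trained with 32 × 32 patches was superior to that of the existing CNN-based models, and all ViT-based models designed in this study diagnosed pneumonia more accurately than VGGNet. In conclusion, the ViT-based models developed in this study can potentially be used for pneumonia diagnosis from chest X-ray images, and this study may improve the clinical applicability of ViT-based models.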
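The abstract reports that patch size affects training efficiency but does not say why. One factor that applies to standard ViTs in general is that larger patches produce fewer tokens, and self-attention cost grows with the square of the token count. The sketch below illustrates that relationship only. The 224 × 224 input size and the 16 × 16 comparison patch are assumptions made for illustration; the abstract states neither, and `num_patches` is a hypothetical helper, not code from the paper.

```python
def num_patches(image_size: int, patch_size: int) -> int:
    """Count the non-overlapping square patches (tokens) in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


# Assumed 224 x 224 input; the abstract does not give the image size.
IMAGE_SIZE = 224

# 32 is the patch size named in the abstract; 16 is a hypothetical comparison.
for patch in (16, 32):
    tokens = num_patches(IMAGE_SIZE, patch)
    # Each attention head compares every token with every token.
    pairs = tokens * tokens
    print(f"patch {patch}x{patch}: {tokens} tokens, {pairs} attention pairs per head")
```

Under these assumed sizes, doubling the patch side cuts the token count by a factor of four and the pairwise attention work by a factor of sixteen, which is one plausible contributor to the efficiency difference the authors measured.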

English Abstract

Various artificial neural network structures, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have been extensively studied and have served as the backbone of numerous models. Among these, the transformer architecture has demonstrated its potential for natural language processing and has become a subject of in-depth research. Recently, modifications to its internal structure have adapted the architecture for image processing, leading to the development of Vision transformer (ViT) models. ViTs have shown high accuracy and performance with large datasets. This study aims to develop a ViT-based model for detecting pneumonia using chest X-ray images and to quantitatively evaluate its performance. Various architectures of the ViT-based model were constructed by varying the number of encoder blocks, and different patch sizes were applied for network training. The performance of the ViT-based model was also compared with that of CNN-based models: VGGNet, GoogLeNet, and ResNet. The results showed that the training efficiency and accuracy of the ViT-based model depended on the number of encoder blocks and the patch size, and the F1 scores of the ViT-based model ranged from 0.875 to 0.919. The training efficiency of the ViT-based model with a large patch size was superior to that of the CNN-based models, and the pneumonia detection accuracy of the ViT-based model was higher than that of VGGNet. In conclusion, the ViT-based model can potentially be used for pneumonia detection using chest X-ray images, and this study may improve the clinical applicability of ViT-based models.
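For readers unfamiliar with the reported metric, the F1 score is the harmonic mean of precision and recall. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function name `f1_score` and the confusion counts in the usage lines are hypothetical and are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 score from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        # No positive predictions were correct; define F1 as 0 here.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a pneumonia-positive class (illustration only).
score = f1_score(tp=90, fp=10, fn=10)
print(f"F1 = {score:.3f}")
```

Because F1 ignores true negatives, it is often preferred over plain accuracy when the positive and negative classes are imbalanced.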

Keywords

Deep learning, Pneumonia detection, Chest X-ray image, Vision transformer
