Computed Tomography (CT) Diagnosis of COVID-19 using Supervised Learning

Ruthvik Raja M.V
3 min read · Apr 28, 2021

During the coronavirus outbreak, computed tomography (CT) has been widely used to diagnose COVID-19 patients. Because of privacy concerns, CT images are rarely made public, which hinders research and development of machine learning and deep learning algorithms for classifying them. To address this problem, researchers created an open-source dataset, COVID-CT, which consists of 349 CT images from 216 patients diagnosed with COVID-19 and 397 non-COVID-19 CT images. The utility of this dataset was confirmed by a senior radiologist who has been diagnosing and treating COVID-19 patients since the outbreak of the novel coronavirus. We applied machine learning models (K-Nearest Neighbours, Support Vector Machine, Logistic Regression) and a deep learning technique, the Convolutional Neural Network (CNN), to this dataset to diagnose COVID-19.

Only 349 COVID and 397 non-COVID CT images were publicly available. Many more images exist in various online repositories, but restrictions imposed by hospitals prevented us from retrieving them. To train the CNN and the machine learning models, we therefore used data augmentation to generate additional images from a pre-defined training set. Augmentation helps the model avoid overfitting and reach a higher accuracy score by exposing it to a wider range of variations during training.

We first tried machine learning models such as K-Nearest Neighbours (KNN), Support Vector Machines (SVM), Logistic Regression and Decision Trees, but these models failed to produce high accuracy even after hyperparameter tuning. The KNN model reached its highest accuracy score of 64% with k=7 and the Manhattan distance metric. Logistic Regression reached its highest accuracy score of 60% with the number of iterations set to 1000. The SVM achieved an accuracy score of 63%. The Decision Tree model failed to finish in the allowed time: each input image is 480x480 and there are nearly 2000 images, so the trees could not be built within the time limit.
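
To make the KNN settings above concrete, here is a minimal, dependency-free sketch of a K-Nearest Neighbours classifier with k=7 and the Manhattan distance. It is not the code from the project: the helper names (`manhattan`, `flatten`, `knn_predict`) and the tiny 2x2 "images" are hypothetical and only illustrate the idea of voting among the nearest flattened pixel vectors.

```python
from collections import Counter


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length pixel vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def flatten(image):
    """Turn a 2-D grayscale image (list of rows) into one flat pixel vector."""
    return [pixel for row in image for pixel in row]


def knn_predict(train_x, train_y, query, k=7):
    """Return the majority label among the k training vectors nearest to query."""
    # Rank training samples by their Manhattan distance to the query vector.
    ranked = sorted(range(len(train_x)), key=lambda i: manhattan(train_x[i], query))
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy stand-ins for real CT scans (assumption): five dark images labelled 0
# (non-COVID) and five bright images labelled 1 (COVID).
dark = [[[10 + i, 12], [11, 9 + i]] for i in range(5)]
bright = [[[200 - i, 210], [205, 199 + i]] for i in range(5)]

train_x = [flatten(img) for img in dark + bright]
train_y = [0] * 5 + [1] * 5

prediction = knn_predict(train_x, train_y, flatten([[190, 205], [198, 201]]), k=7)
print(prediction)
```

In practice a library would replace the hand-written loop; in scikit-learn, for example, the same settings correspond to `KNeighborsClassifier(n_neighbors=7, metric="manhattan")`.
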
Finally, we implemented a CNN model. We loaded all the original and generated images, rescaled them to the same shape because their sizes differ, and attached a label to each image (0 for non-COVID and 1 for COVID). After labelling, we shuffled the images and split them into a training set (80%) and a test set (20%) for the CNN model.
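
The preparation steps described above (rescale, label, shuffle, 80/20 split) can be sketched as follows. This is an illustrative outline under stated assumptions, not the project's code: `resize_nearest` and `build_splits` are hypothetical helpers, images are plain nested lists, nearest-neighbour sampling stands in for whatever resizing method was actually used, and the toy target size is 4x4 rather than the 480x480 mentioned earlier.

```python
import random


def resize_nearest(image, height, width):
    """Rescale a 2-D grayscale image (list of rows) by nearest-neighbour sampling."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[r * src_h // height][c * src_w // width] for c in range(width)]
        for r in range(height)
    ]


def build_splits(non_covid, covid, size, train_fraction=0.8, seed=0):
    """Resize every image, attach its label, shuffle, and split into train/test."""
    height, width = size
    # Label convention from the article: 0 for non-COVID, 1 for COVID.
    labelled = [(resize_nearest(img, height, width), 0) for img in non_covid]
    labelled += [(resize_nearest(img, height, width), 1) for img in covid]
    # A fixed seed keeps the shuffle reproducible (assumption for this sketch).
    random.Random(seed).shuffle(labelled)
    cut = int(len(labelled) * train_fraction)
    return labelled[:cut], labelled[cut:]


# Toy images of different shapes standing in for real CT scans (assumption).
non_covid = [[[i] * 6 for _ in range(3)] for i in range(5)]  # five 3x6 images
covid = [[[i] * 2 for _ in range(8)] for i in range(5)]      # five 8x2 images

train, test = build_splits(non_covid, covid, size=(4, 4))
print(len(train), len(test))
```

Shuffling before the split matters here because the labelled list is built class by class; without it, the test set would contain only one class.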

Complete details of the project and the dataset can be found at the following link: https://github.com/ruthvikraja/COVID-19-CT

The Final Results are as follows:

The accuracy score of the CNN model could be increased further by applying transfer learning and hyperparameter tuning…

Originally published at https://dev.to on April 28, 2021. The Python code for this project can be found at the GitHub link above.

Written by Ruthvik Raja M.V

Coding, Data Science and Business Management. Languages: Python, R. DEV Community: https://dev.to/ruthvikraja_mv, GitHub: https://github.com/ruthvikraja
