Pneumonia prediction on chest x-ray images using deep learning approach

ABSTRACT


INTRODUCTION
Coronavirus disease 2019 (COVID-19) is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). This virus poses a difficult challenge and deeply affects many people around the world [1]. SARS-CoV-2 has increased the number of deaths and has also had negative economic and social impacts [2]. Countries around the world are concerned because COVID-19 affects people regardless of age. A person is considered infected with SARS-CoV-2 if they show clinical symptoms such as cough, fever, pneumonia, dyspnea (shortness of breath), and acute respiratory distress syndrome (ARDS) [3]. So far, the most common indication for hospital admission is pneumonia caused by the coronavirus [4].
Pneumonia is an inflammation or infection of the lungs that can be caused by bacteria, viruses, or fungi. It is an acute infectious disease and can be life-threatening for anyone [5]. One of the viruses that causes pneumonia is the coronavirus. This is in line with Gilani et al., who state that pneumonia is caused by viruses, fungi, and bacteria that infect the lungs [6]. Like COVID-19, pneumonia is a particular threat to people with pre-existing chronic illnesses. Pneumonia lowers oxygen levels in the body and makes breathing difficult. Patients in serious condition usually must be hospitalized, and in very acute cases the patient needs ventilator support to breathe [7]. Therefore, research on pneumonia detection using chest x-ray images is needed. Chest x-ray examination has become the main reference for determining abnormalities in the chest cavity. A chest x-ray is used to determine whether a person's lungs are normal or show pneumonia caused by the coronavirus.
In deep learning, the computer itself identifies which features are useful for the model, learning them from the raw data rather than relying on manually refined information. A key feature of deep learning methods is their focus on learning data representations. Visual geometry group 16 (VGG16) and dense convolutional network 121 (DenseNet121) are two deep learning methods. DenseNet121 is a method that produces good accuracy [8]. It is one of the DenseNet variants and is used for classification. DenseNet makes it practical to train even deeper networks, and it is also very effective. Layers in this model are connected not only to the immediately following layer but also to deeper, non-adjacent layers. In other words, the first layer is connected to the second, third, and fourth layers, the second layer is connected to the third, fourth, and fifth layers, and so on [9]. Besides DenseNet121, there is also VGG16. In 2014, Simonyan and Zisserman proposed a model called VGG16, which is one of the VGGNet networks. This model builds on the AlexNet network and can classify and identify images more accurately [10]. The VGG16 model has the advantage of being accurate, but it also has weaknesses. For example, deepening the network structure increases the number of parameters and therefore the training time, which lowers time efficiency when using this model. DenseNet121 and VGG16 are suitable for research on image prediction [11]. Based on this background, this research focuses on implementing deep learning with the DenseNet121 and VGG16 methods to classify chest x-rays as showing pneumonia caused by the coronavirus or not.
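The connection pattern described for DenseNet can be made concrete with a small sketch. It assumes the standard formulation in which, inside one dense block, every layer receives the outputs of all earlier layers; the function names are hypothetical and are not taken from the study's code.

```python
def dense_block_inputs(num_layers):
    """Map each layer (1-based) to the earlier outputs it receives.

    Index 0 stands for the block input. Layer l receives the
    concatenation of outputs 0 .. l-1 (assumed standard dense block).
    """
    return {layer: list(range(layer)) for layer in range(1, num_layers + 1)}


def count_connections(num_layers):
    """Number of direct connections in one dense block: L * (L + 1) / 2."""
    return sum(len(sources) for sources in dense_block_inputs(num_layers).values())


if __name__ == "__main__":
    # Show which earlier outputs feed each layer of a 4-layer block.
    for layer, sources in dense_block_inputs(4).items():
        print(f"layer {layer} receives outputs of {sources}")
    print("direct connections:", count_connections(4))
```

A plain feed-forward chain of the same depth would have only one incoming connection per layer, which is why the dense pattern is said to connect each layer to deeper, non-adjacent layers.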

RELATED WORKS
In this section, the researchers review the international journal literature related to the research topic. The review aims to gather references on pneumonia and to help the researchers find solutions to the research problem. The findings from related works on pneumonia are shown in Table 1, and the findings from related works on the deep learning approach are shown in Table 2.

CNN and DenseNet121
Understanding the business, understanding the data, preparing the data, modelling, and evaluation.
This research achieves an accuracy of 86.6% with DenseNet121 at an image size of 128×128. The accuracy then increases by 16.4% at a magnification level of 100X.

THEORY AND METHODS
The researchers use the VGG16 and DenseNet121 methods, and this section explains the theory behind them. Subsections 3.1 and 3.2 describe the deep learning approach used in this research.

VGG16
VGG16 is commonly applied through transfer learning. In transfer learning, a model trained on one problem is modified and reused for another problem: some layers of the trained model are carried over into the new model, which can reduce the training time needed to optimize the neural network. Usually, when a pre-trained model is used in this way, some of its layers are frozen [27]. VGG16 is a network proposed by the Visual Geometry Group, with 16 weight layers. Alongside these 16 layers, the network contains other layers, such as max-pooling layers, which have no trainable parameters [28]. This is in line with Ayan, who describes VGG16 as having 16 layers with a small receptive field of 3×3, five max-pooling layers of size 2×2, three fully connected layers, and 144 million parameters. The last layer of VGG16 uses the softmax activation function [29]. Figure 1 illustrates the architecture of VGG16: the convolutional layers use 3×3 filters, there are five 2×2 max-pooling layers, and three fully connected layers follow the last max-pooling layer.
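The layer counts above can be cross-checked with a short calculation. The sketch below assumes the standard ImageNet configuration of VGG16 (224×224 RGB input, 1,000 output classes); the names `VGG16_CONV`, `VGG16_FC`, and `vgg16_summary` are hypothetical. Under these assumptions the count comes to roughly 138 million parameters, which differs from the 144 million figure quoted above, so the result should be read as a sanity check rather than a correction of the cited source.

```python
# Assumed standard VGG16 layout: numbers are the output channels of
# 3x3 convolutions, "M" marks a 2x2 max-pooling layer.
VGG16_CONV = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]
# Fully connected layer sizes for 1,000 ImageNet classes (assumption).
VGG16_FC = [4096, 4096, 1000]


def vgg16_summary(in_channels=3, image_size=224):
    """Count weight layers, pooling layers, and trainable parameters."""
    conv_layers = pool_layers = params = 0
    channels, size = in_channels, image_size
    for item in VGG16_CONV:
        if item == "M":
            pool_layers += 1      # pooling has no trainable parameters
            size //= 2            # 2x2 pooling halves height and width
        else:
            conv_layers += 1
            params += 3 * 3 * channels * item + item  # weights + biases
            channels = item
    features = channels * size * size  # flattened input to the first FC layer
    for out_features in VGG16_FC:
        params += features * out_features + out_features
        features = out_features
    return {
        "weight_layers": conv_layers + len(VGG16_FC),
        "pool_layers": pool_layers,
        "parameters": params,
    }


if __name__ == "__main__":
    print(vgg16_summary())
```

The 16 weight layers split into 13 convolutional layers and 3 fully connected layers, and most of the parameters sit in the first fully connected layer.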

DenseNet121
DenseNet stands for densely connected convolutional network. It builds on the idea of dense connections, connecting each layer to every prior layer, and achieves high accuracy, especially in medical image classification. This is in line with the statement that DenseNet121 is a convolutional neural network (CNN) model that has been applied to the diagnosis of illness. DenseNet121 has 121 layers, including 116 convolution layers, and is organized into four dense blocks, three transition layers, four pooling layers, and one classification layer [30]. Figure 3 illustrates the architecture of DenseNet121.
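The figure of 121 layers can be reproduced from the block structure. The sketch below assumes the commonly published DenseNet121 configuration (dense blocks of 6, 12, 24, and 16 layers, a growth rate of 32, and 64 channels after the initial convolution); these values and all names are assumptions for illustration and are not stated in this paper.

```python
BLOCK_CONFIG = (6, 12, 24, 16)  # dense layers per block (assumed)
GROWTH_RATE = 32                # channels added by each dense layer (assumed)
INIT_CHANNELS = 64              # channels after the initial convolution (assumed)


def densenet121_summary():
    """Count weight layers and track channel width through the network."""
    # Each dense layer holds a 1x1 and a 3x3 convolution.
    dense_convs = 2 * sum(BLOCK_CONFIG)
    transitions = len(BLOCK_CONFIG) - 1  # one 1x1 convolution each
    # Initial convolution + dense-block convolutions + transitions + classifier.
    total_layers = 1 + dense_convs + transitions + 1

    channels = INIT_CHANNELS
    for index, num_layers in enumerate(BLOCK_CONFIG):
        channels += num_layers * GROWTH_RATE  # concatenation widens the features
        if index < len(BLOCK_CONFIG) - 1:
            channels //= 2                    # each transition halves the channels
    return {
        "dense_block_convs": dense_convs,
        "transition_layers": transitions,
        "total_layers": total_layers,
        "final_channels": channels,
    }


if __name__ == "__main__":
    print(densenet121_summary())
```

Under these assumptions, the 116 convolution layers are the ones inside the four dense blocks, and the remaining layers are the initial convolution, the three transition convolutions, and the classification layer.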

RESEARCH METHODOLOGY
The research framework describes, as a flowchart from start to finish, how the research problem is solved. It helps the research run in a systematic and structured manner. The research framework for implementing pneumonia prediction with the deep learning approach is shown in Figure 4. Based on Figure 4, the first step is a literature review, which is required to identify the research problem. The literature reviewed consists of international journals on deep learning approaches. From the results of the literature review, the researchers decided to use the VGG16 and DenseNet121 methods for predicting pneumonia from chest x-ray images, because these methods are widely used and have been reported to produce high accuracy. The next step is identifying the problem addressed in the research. After that come the stages of collecting and exploring the data.
ISSN: 2252-8938. Pneumonia prediction on chest x-ray images using deep learning approach (Rani Puspita)
The research then proceeds with experiments using the VGG16 method, followed by evaluation. After VGG16, the next step is experimenting with DenseNet121, again followed by evaluation. Once the best accuracy has been identified from the experimental results, the research problem is resolved and the goal is achieved.

PROPOSED METHODS
The purpose of the proposed method is to explain the solution to the problem. In this section, the researchers design the proposed method, which is illustrated in Figure 5. The first stage is data preparation, which includes data collection and data exploration. For data collection, the researchers describe where the data comes from and how much data is used for training, testing, and validation. Data exploration is then used to prepare the data; describing and visualizing the data is key at this stage. After the data has been properly prepared, the next step is modeling using the deep learning approach with the VGG16 and DenseNet121 methods, which yields training and testing results for both models. The programming language used is Python, and the framework is PyTorch, which is based on the Torch library. The first step in pneumonia prediction is importing the required libraries and training the models on the data. Training is followed by evaluation. Once the evaluation results are available, the accuracy of the two methods is compared to determine which method performs best. In this way, the researchers also achieve the research goal of obtaining accurate predictions of pneumonia from chest x-ray images.

Data collection and exploration
The data used in this research are normal and pneumonia chest x-ray images taken from the kaggle.com website. Data collection is followed by data exploration, a step for studying the data before conducting experiments. The data is divided into three parts: 5,216 training images, 624 testing images, and 16 validation images. The training data contains 1,341 normal images and 3,875 pneumonia images. The testing data contains 234 normal images and 390 pneumonia images, and the validation data contains 8 normal images and 8 pneumonia images.
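The split sizes above can be explored with a few lines of arithmetic. The sketch below uses only the counts reported in this section. The majority-class baseline and the inverse-frequency class weights are illustrative additions for judging class imbalance, and are not steps the study reports having used.

```python
# Image counts per split, as reported in this section.
SPLITS = {
    "train": {"normal": 1341, "pneumonia": 3875},
    "test": {"normal": 234, "pneumonia": 390},
    "val": {"normal": 8, "pneumonia": 8},
}


def split_summary(counts):
    """Summarize one split: size, majority-class baseline, class weights."""
    total = sum(counts.values())
    return {
        "total": total,
        # Accuracy obtained by always predicting the most frequent class.
        "majority_baseline": max(counts.values()) / total,
        # Inverse-frequency weights: total / (number of classes * class count).
        "class_weights": {
            name: total / (len(counts) * n) for name, n in counts.items()
        },
    }


if __name__ == "__main__":
    for name, counts in SPLITS.items():
        info = split_summary(counts)
        print(f"{name}: {info['total']} images, "
              f"baseline accuracy {info['majority_baseline']:.3f}")
```

The training split is imbalanced toward pneumonia, and a classifier that always predicts pneumonia would already be correct on 390 of the 624 test images, which is useful context when reading the accuracy figures.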

Summary prediction results
This section explains the prediction results in detail, including the method, training accuracy, and testing accuracy. Table 3 lists the parameters of VGG16, Table 4 lists the parameters of DenseNet121, and Table 5 summarizes the training and testing evaluation of this research. Two CNN models were built, DenseNet121 and VGG16. The VGG16 model obtained a training accuracy of 93% and a testing accuracy of 90%, while the DenseNet121 model obtained a training accuracy of 92% and a testing accuracy of 88%. According to Table 5, VGG16 performs better than DenseNet121. The chosen parameters have a large influence on these results.
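The comparison can be written out as a short calculation. The sketch below uses the accuracies reported in this section and the test-set size of 624 images from the data description. Because the reported accuracies are rounded to whole percentages, the counts of correctly classified images are approximations, and the names `RESULTS` and `compare` are hypothetical.

```python
# Accuracies as reported in this section (rounded to whole percentages).
RESULTS = {
    "VGG16": {"train_acc": 0.93, "test_acc": 0.90},
    "DenseNet121": {"train_acc": 0.92, "test_acc": 0.88},
}
TEST_IMAGES = 624  # size of the testing split


def compare(results, n_test):
    """Return the best model by test accuracy and per-model details."""
    rows = {}
    for name, acc in results.items():
        rows[name] = {
            # Gap between training and testing accuracy.
            "gap": round(acc["train_acc"] - acc["test_acc"], 2),
            # Approximate number of correctly classified test images.
            "approx_correct": round(acc["test_acc"] * n_test),
        }
    best = max(results, key=lambda name: results[name]["test_acc"])
    return best, rows


if __name__ == "__main__":
    best, rows = compare(RESULTS, TEST_IMAGES)
    print("best model by test accuracy:", best)
    for name, row in rows.items():
        print(name, row)
```

On these figures the two-point difference in test accuracy corresponds to roughly a dozen test images, and both models show a small gap between training and testing accuracy.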

CONCLUSION
Pneumonia prediction was carried out using two methods, VGG16 and DenseNet121. The data used are chest x-ray images, divided into training, testing, and validation sets. The best method for this research case is VGG16, with 93% training accuracy and 90% testing accuracy. DenseNet121 obtained lower accuracy than VGG16, namely 92% training accuracy and 88% testing accuracy. Parameters have a large influence on the accuracy of each model, and with the parameters used here, VGG16 achieves high accuracy and can be used to predict pneumonia from chest x-ray images of patients.

Figure 2 is an illustration of the block diagram of VGG16.

Table 1. The related works to pneumonia

Table 2. The related works to deep learning approach

Table 3. Parameter of VGG16

Table 5. Evaluation result for the research