Electrocardiogram signals classification using discrete wavelet transform and support vector machine classifier

ABSTRACT


INTRODUCTION
Cardiovascular disease is a group of disorders affecting the heart and is one of the leading causes of death worldwide. According to the World Health Organization, it caused approximately 17.9 million deaths in 2016, which represents 31% of deaths worldwide [1]. The lives of people with cardiovascular diseases are in constant danger, and quick, effective diagnosis can save many lives. Several techniques are used in the medical field to diagnose cardiovascular disease, such as blood tests, coronary angiography, cardiac MRI, X-ray and electrocardiography. However, most of these techniques require experienced medical personnel, who are not always available, given that almost 30% of cases occur in poor countries. Electrocardiography is a non-invasive detection technique based on recording the electrical activity of the heart over time [2]; the signal obtained during this recording is called an electrocardiogram (ECG). The ECG is among the most widely used biomedical signals for detecting heart problems. It carries a large amount of information of interest for the detection and diagnosis of many heart diseases [3], which appear as distortions of the signal shape. Despite these advantages, the technique has limitations: the difficulty of interpreting the signal and the lack of experienced personnel are among the constraints most often encountered when identifying ECG signals, and ECG signals also contain various unwanted noises that prevent the correct extraction of the information needed for classification [2], [4]. Hence, finding solutions to these problems has become a necessity.
The field of medical engineering takes up this challenge by developing models capable of processing ECG signals and identifying any abnormalities present in the signal [5]. Signal processing is therefore required, since its techniques efficiently analyze various kinds of signals, ECG signals in particular. The aim of ECG signal processing is to extract features that distinguish normal signals from those representing abnormalities. Several signal processing techniques have been developed over the years for ECG signals. Ahlstrom and Tompkins used digital filters for real-time ECG signal processing [6] to denoise the signal and detect the QRS complex. Hargittai [7] used Savitzky-Golay least-squares polynomial filters to preserve the details of the signal, Francisco et al. [8] processed the ECG signal using principal component analysis (PCA), Haque et al. [9] used adaptive filtering algorithms, and Ahmed et al. [10] used a method based on cross-correlation theory. Gustavo et al. [11] compared several methods for removing baseline wander. The choice of features and of the method used to extract them also directly affects the quality of signal characterization. Some techniques are based on the extraction of morphological features, such as the detection of the QRS complex proposed by Jiapu [12] and the calculation of R-R intervals and peak detection in the works of Shanti Chandra et al. [13] and Priyanka [14]. Other methods are based on the extraction of statistical features, on a mixture of DWT and statistical features (Abdullah et al. [15]), or on morphological and statistical features (Sahoo et al. [16]).
The discrete wavelet transform (DWT) is a very powerful tool in the field of signal processing [17]-[20]. It gives satisfying results in handling the noise that affects the signal, which allows a denoised signal to be reconstructed, and it also allows the extraction of different features that characterize the signal. These properties motivate us to exploit the DWT in the processing of ECG signals [21], [22]. In addition, machine learning techniques are constantly evolving and are now used in several areas [23], in particular in the classification and identification of signals. The combination of signal processing techniques and machine learning models gives promising results, and the performance of each model depends on the chosen algorithm. The purpose of this work is to establish a characterization model of ECG signals that is able to differentiate between normal and abnormal signals. The adopted model extracts statistical features from the approximation coefficients obtained by the wavelet decomposition of the signal and classifies these features using an SVM classifier. The model was retained after selecting the wavelet and the scale that give the best accuracy. The ECG recordings used are taken from the MIT-BIH arrhythmia database [24], [25], which is composed of 48 recordings of different patients; these patients are classified into two categories, healthy and sick. After the processing of the ECG signals and the extraction of features, the patients must be classified as healthy or sick. To do this we have chosen a support vector machine (SVM) classifier, which seems the most appropriate for this task [26]-[28].

PRELIMINARIES

Wavelet analysis
Wavelet analysis is one of the most powerful tools in signal processing; it is a technique designed to handle non-stationary signals. The notion was introduced in the 20th century by Haar, who constructed the simplest wavelet, and was then developed in the 1980s through the research work of Mallat [18], [19], [29]. One of the strengths of this approach is that it allows a signal to be analyzed in both time and frequency, which makes it very useful for extracting the various information contained in that signal.
The wavelet transform decomposes a signal x(t) using a family of wavelets $\psi_{a,b}$ derived from a mother wavelet $\psi$ by translation in time and by dilation or compression (scaling). This family is defined as [30]:

$$\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{t-b}{a}\right) \qquad (1)$$

where a and b are respectively the scaling and the translation coefficients, and $\psi$ is the mother wavelet, which must verify the following condition [17]:

$$\int_{-\infty}^{+\infty} \psi(t)\,dt = 0 \qquad (2)$$

The wavelet transform exists in two principal forms, the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT). The CWT of a signal x(t) is defined as [18]:

$$W_x(a,b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{+\infty} x(t)\,\psi^{*}\!\left(\frac{t-b}{a}\right) dt \qquad (3)$$

Computing the continuous wavelet transform can be difficult and time consuming; the discrete wavelet transform gives a solution to this problem. The discrete wavelet coefficients are calculated with the algorithm proposed by Mallat [18]. This decomposition passes the signal, stage by stage, through a succession of complementary low-pass h[n] and high-pass g[n] filters. The low-frequency components resulting from the low-pass filter h[n] are the approximation coefficients, and those from the high-pass filter g[n] are called the detail coefficients [31], as shown in Figure 1.
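The filter-bank decomposition described above can be sketched in a few lines. This is a minimal illustration rather than the implementation used in this work: it assumes the two-tap Haar filters for h[n] and g[n] because they are the shortest, and the names `dwt_stage` and `decompose` are hypothetical.

```python
import math

# Haar analysis filters, assumed here for brevity; other wavelet families
# use longer filters but follow the same filter-and-downsample scheme.
H = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # low-pass  h[n]
G = [1 / math.sqrt(2), -1 / math.sqrt(2)]  # high-pass g[n]


def dwt_stage(x):
    """One stage: filter with h and g, keeping one output per pair of samples."""
    if len(x) % 2:
        raise ValueError("signal length must be even")
    approx = [H[0] * x[i] + H[1] * x[i + 1] for i in range(0, len(x), 2)]
    detail = [G[0] * x[i] + G[1] * x[i + 1] for i in range(0, len(x), 2)]
    return approx, detail


def decompose(x, levels):
    """Repeat the stage on the approximation branch only."""
    approximations, details = [], []
    current = list(x)
    for _ in range(levels):
        current, d = dwt_stage(current)
        approximations.append(current)  # a1, a2, ...
        details.append(d)               # d1, d2, ...
    return approximations, details


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approximations, details = decompose(signal, levels=3)
```

Each stage halves the number of coefficients, so an 8-sample signal yields approximations of length 4, 2 and 1. With orthonormal filters such as these, the energy of the signal equals the energy of the last approximation plus that of all the details.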

Features extraction
The choice of the nature and number of features, and of the method used to extract them, is one of the decisive steps in the characterization of a signal, since each of these choices carries information that influences the efficiency of classification. The literature offers many features and methods for characterizing a signal, especially ECG signals. There are morphological features such as the PR, RR and QRS intervals, which are determined essentially by the detection of the QRS complex [12]-[14]. There are also statistical features such as the maximum, minimum, mean, variance and standard deviation [15], [16], [32], [33].
In this work we focus on the following features: the mean value (4), the root mean square (5), the variance (6), the standard deviation (7), the kurtosis (8) and the wavelet entropy (9), where N is the length of the signal x.
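As a hedged illustration, the six features listed above can be computed as follows. The exact forms of (4)-(9) are not reproduced in this text, so the sketch makes three assumptions: the variance is normalized by N, the kurtosis is the non-excess fourth standardized moment, and the wavelet entropy is the Shannon entropy of the normalized coefficient energies. The function name `statistical_features` is hypothetical.

```python
import math


def statistical_features(x):
    """Return the six features for a sequence of coefficients x of length N."""
    n = len(x)
    mean = sum(x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    # Assumption: population variance (divide by N, not N - 1).
    variance = sum((v - mean) ** 2 for v in x) / n
    std = math.sqrt(variance)
    # Assumption: non-excess kurtosis, m4 / sigma^4 (equals 3 for a Gaussian).
    if variance > 0:
        kurtosis = (sum((v - mean) ** 4 for v in x) / n) / variance ** 2
    else:
        kurtosis = float("nan")
    # Assumption: Shannon entropy of the normalized energies p_i = x_i^2 / sum(x^2).
    energy = sum(v * v for v in x)
    if energy > 0:
        probs = [v * v / energy for v in x if v != 0.0]
        entropy = -sum(p * math.log(p) for p in probs)
    else:
        entropy = 0.0
    return {
        "mean": mean,
        "rms": rms,
        "variance": variance,
        "std": std,
        "kurtosis": kurtosis,
        "entropy": entropy,
    }


features = statistical_features([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

Applied to the approximation coefficients of one record at one scale, the function returns one row of the corresponding dataset.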

Classification
The chosen classification technique is the support vector machine (SVM), a machine learning technique that performs binary classification. The purpose of this method is to separate the features of the signal into different categories [27], [28]. It is based on the construction of a separating region between the classes of the learning set $\{x_i, y_i\}$, with $y_i \in \{-1, 1\}$ and $i = 1, 2, \ldots, n$. The boundaries that delimit this region are hyperplanes, and the separating hyperplane is defined by [28], [29]:

$$w \cdot x + b = 0$$

where $w$ is a normal vector to the hyperplane and $b$ is the bias.
Thus, $y_i = -1$ implies that

$$w \cdot x_i + b \le -1$$

Similarly, for $y_i = +1$ we have

$$w \cdot x_i + b \ge +1$$

In all cases we have

$$y_i\,(w \cdot x_i + b) \ge 1$$

Good separation aims at increasing the width of the margin between the hyperplanes, $2/\|w\|$, which is equivalent to minimizing $\frac{1}{2}\|w\|^2$. If the classes cannot be separated linearly, the optimization problem becomes [26]:

$$\min_{w,\,b,\,\xi}\ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{subject to} \quad y_i\,(w \cdot x_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0$$

where the $\xi_i$ are slack variables that allow some points to violate the margin and $C > 0$ is the penalty applied to these violations.
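To make the soft-margin problem concrete, the sketch below minimizes the equivalent unconstrained objective, 0.5·||w||² + C·Σ max(0, 1 − y_i(w·x_i + b)), by plain subgradient descent on a toy two-dimensional set. It is an illustration only: the solver, the step size, the number of epochs and the names `train_linear_svm` and `predict` are assumptions, and dedicated SVM libraries use more efficient optimizers.

```python
def train_linear_svm(X, y, C=1.0, lr=0.001, epochs=20000):
    """Subgradient descent on 0.5*||w||^2 + C * sum(hinge losses)."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)   # gradient of the 0.5 * ||w||^2 term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:   # hinge term is active for this point
                for j in range(n_features):
                    grad_w[j] -= C * yi * xi[j]
                grad_b -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy, linearly separable training set (illustrative values only).
X = [[2.0, 2.0], [3.0, 3.0], [2.0, 3.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
```

In this sketch a point contributes to the gradient only while it lies inside the margin, which mirrors the role of the slack variables: points that satisfy the margin constraint have no influence on the final hyperplane.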

METHODOLOGY
In this work we want to establish a simple model that allows:
- ECG signal processing.
- Extraction of the features that characterize the signal; these features are calculated by applying (4)-(9).
- Classification of the processed signals into normal signals (healthy patients) and abnormal signals (sick patients) using a linear SVM classifier.
Processing techniques and extraction methods are based on discrete wavelet analysis; Figure 2 shows the process of identifying ECG signals.

Database
The ECG signals are taken from the MIT-BIH database on PhysioNet [24], [25]. This database contains 48 recordings, each half an hour long and sampled at 360 Hz. In this study we analyzed a one-minute segment of each recording.

ECG signal processing
ECG recordings are not free from noise, and this noise causes significant perturbations during classification and diagnosis. ECG noise has different sources: some are of technical origin, such as powerline interference (50 or 60 Hz noise from the mains supply) or bad wiring, and others are of physiological origin, generated by the activity of the body, such as electromyographic (EMG) noise and baseline wander (low frequencies), which are considered among the most important [34]. The steps of signal denoising are shown in Figure 3.
In this stage of processing, we tried to eliminate these noises using the discrete wavelet decomposition. After decomposing the signal into eight levels, we notice that the high-frequency noise is located in the detail coefficients d1 and d2. For the baseline wander, we notice that the approximation coefficient A8 corresponds to the frequency interval of these fluctuations, 0-0.5 Hz [20]. Figure 4 shows the detail and approximation coefficients resulting from the wavelet decomposition of recording 121m, and Figure 5 shows part of the same recording before denoising, Figure 5(a), and after denoising, Figure 5(b). The mother wavelet used in the decomposition is Symlet [35].
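The denoising idea can be sketched as follows, under two explicit assumptions: the coefficients d1, d2 and A8 are simply set to zero before reconstruction (Figure 3 may describe a different rule, such as thresholding), and the Haar wavelet stands in for the Symlet used in this work so that the example needs no external library. For a 360 Hz signal, the nominal dyadic bands are about 90-180 Hz for d1, 45-90 Hz for d2 and 0-0.7 Hz for A8. All function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(x):
    """One analysis stage (Haar stand-in for the Symlet used in the paper)."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """One synthesis stage: exact inverse of haar_dwt."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x


def wavedec(x, levels):
    """Return [A_levels, d_levels, ..., d2, d1]."""
    if len(x) % (2 ** levels):
        raise ValueError("length must be a multiple of 2**levels")
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    return [approx] + details[::-1]


def waverec(coeffs):
    """Rebuild the signal from [A_levels, d_levels, ..., d1]."""
    approx = coeffs[0]
    for d in coeffs[1:]:
        approx = haar_idwt(approx, d)
    return approx


def denoise(x, levels=8):
    """Assumed rule: zero A8 (baseline wander) and d1, d2 (high-frequency noise)."""
    coeffs = wavedec(x, levels)
    coeffs[0] = [0.0] * len(coeffs[0])     # A8
    coeffs[-1] = [0.0] * len(coeffs[-1])   # d1
    coeffs[-2] = [0.0] * len(coeffs[-2])   # d2
    return waverec(coeffs)


# Synthetic check: constant baseline + mid-frequency square wave + fast noise.
n = 512
square = [1.0 if (i // 8) % 2 == 0 else -1.0 for i in range(n)]
noisy = [5.0 + s + (0.3 if i % 2 == 0 else -0.3) for i, s in enumerate(square)]
denoised = denoise(noisy)
```

In this synthetic case the constant baseline falls entirely into A8 and the alternating noise entirely into d1, so the reconstruction returns the square wave alone. Real ECG noise does not separate this cleanly, which is one reason the choice of wavelet matters.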

Features extraction
After denoising the ECG signal, we decompose the denoised signal with the DWT using different types of wavelets, and the approximation coefficients are calculated up to scale 8, as shown in Figure 6. The features mentioned above (4)-(8) are then calculated from each of the approximation coefficients a1, a2, a3, a4, a5, a6, a7, a8, so that for each level we create a dataset containing the features for all the records.

Classification
The resulting dataset is fed into an SVM classifier and divided into two subsets: the first is used to train the model, while the second is used to test its performance. The choice of the training and test subsets is crucial, since a poor choice can cause problems such as overfitting and underfitting, which reduce the effectiveness of the model [36]. To limit these problems we use k-fold cross validation to select the subsets [37], [38]. This method divides the dataset into k parts; one of these parts is chosen as the test subset and the other k − 1 parts as the training subset. The process is repeated k times, and each time a different part is used for testing. The accuracy of the model is obtained by averaging the accuracy of each iteration [39]-[41]. In this work we took k = 8, that is to say that 7/8 of the dataset is used for training and 1/8 for testing. The next step is devoted to measuring the performance of the model for the different types of mother wavelets at different scales.

Figure 6. Approximation coefficients a1, a2, a3, a4, a5, a6, a7, a8

Table 1, shown in the appendix, summarizes the results obtained for different wavelet families at the first eight scales of the approximation coefficients. As we can see, the best accuracy is generally obtained at the fourth scale.
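The 8-fold procedure described above can be sketched as follows. This is a minimal illustration, assuming one feature row per recording (48 rows) and contiguous, unshuffled folds; the names `k_fold_indices` and `cross_val_accuracy` are hypothetical, and `fit_and_score` stands for any function that trains on the training indices and returns the accuracy on the test indices.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def cross_val_accuracy(n_samples, k, fit_and_score):
    """Average the accuracy returned by fit_and_score over the k folds."""
    scores = [fit_and_score(train, test)
              for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


folds = list(k_fold_indices(48, 8))
```

With 48 rows and k = 8, each iteration trains on 42 rows and tests on 6. In practice the rows are usually shuffled, and often stratified by class, before being split.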
The study carried out by Siti [42], which computes statistical features from the detail coefficients at the 4th, 5th and 6th levels and uses KNN as the classification method, reaches an accuracy of 71% with the sym7 wavelet. In this work we reach an accuracy of 81.67% for the same wavelet using only the statistical features extracted from the approximation coefficient a4 and an SVM classifier. The study of Siti [42] also reaches 85% as its best accuracy by calculating the MFCC coefficients, while in this work we obtain a best accuracy of 87.50% by adopting only the coif5 wavelet as the mother wavelet, as shown in Table 2.

CONCLUSION
In this article we focused on the classification of ECG signals and established a model that processes and classifies these signals. The ECG signals were processed using discrete wavelet analysis: the discrete wavelet transform gives the approximation coefficients, from which the features are extracted at different scales. The dataset formed by these features is fed into an SVM classifier with cross validation in order to distinguish between normal and abnormal signals. The model was tested with different mother wavelets and at different scales, and reached a best accuracy of 87.50%. To conclude, the choice of the mother wavelet, the scale of the decomposition and the size of the training and test sets have a considerable influence on the accuracy of the model.