Classification of Cardiovascular Disease INFO1010

A Novel Framework for Classification of Cardiovascular Disease
ELECTROCARDIOGRAM Signals Images Using a Machine Learning Techniques
Bachelor’s degree in Computing Information system
SRN : 18035 10
Student Name: Hakeem Mothe
A project submitted in partial fulfillment of the requirements for the award of a
bachelor’s degree in computing and informatics of the University of Seychelles
… Month ………….., 2022
Declaration
This thesis is my original work, except where otherwise stated, and has not been presented
to any other university or for any other award.
NAME ………………………… ………………………………
SRN: 180351310 Date
Abstract
Cardiovascular disease is currently one of the most dangerous diseases. It has a high death
rate, especially from sudden death. According to the World Health Organization, almost
18 million people die from it each year. Electrocardiogram signals are used early on to
make a diagnosis. Many specialists have worked on deep learning neural networks,
convolutional neural networks, and artificial neural networks. Here, the classification of
distinct electrocardiogram signals is used to aid the early detection of cardiovascular
disease. Pre-trained deep learning is used in conjunction with Boost to achieve the most
accurate result. The project thus uses machine learning to improve prediction of the
disease. Patient records stored in an Oracle database are used to train the model, which
predicts the disease by analyzing image data.
Acknowledgments
First and foremost, I would like to acknowledge the blessing of the almighty Lord for the
continued mercies and good health he has bestowed upon me. I also thank my lovely parents,
who have supported me throughout the project and helped me in implementing this software.
I am indebted to all the people who encouraged me to accomplish this project.
In a special way, I would like to thank Mangalyaan Uruguayan for his thorough and continued
support and criticism, which made this thesis more effective and useful in a real-life
situation. I would also like to thank my sponsor for their continued support in paying my
fees this far.
Table of contents
Contents
Declaration
Abstract
Acknowledgments
Table of contents
CHAPTER ONE
1.0 Introduction.
1.2 Problem statement
1.3 Proposed solution
1.4 Aims and Objectives.
1.5 Objectives of the project.
1.6 The purpose of the study
1.8 Project hypothesis
1.8.1 Null hypothesis
1.8.2 Alternate hypothesis.
1.9 Limitations of the study.
CHAPTER TWO.
2.0 Literature review
2.1 Background information
Project methodology
3.1 Introduction.
3.2 Requirement analysis.
3.2.1 Observation
3.2.2 Interviews
3.2.3 User stories.
3.2.4 Using case studies of similar convolutional neural network frameworks and machine learning.
3.3 Data collection methods
3.4 Hardware and software requirements.
3.3.1 Software requirements.
3.3.2 Hardware requirements.
CHAPTER FOUR
Data Analysis and Findings.
4.1 Introduction to data analysis.
4.2 Dataset
4.3 The experimental setup.
4.4 Performance metrics
4.5 Data preparation
4.6 Dataset classification using convolutional neural networks.
4.7 The training data regime
4.8 Convolutional neural networks prediction.
5.1 Accuracy of the model
5.2 Prediction explanation.
5.3 Convolutional neural networks prediction approach
5.4 Conclusion and future work.
5.5 Future work
7.0 Appendices
Appendix 1: Gantt chart
Appendix 2: Project timeline
Appendix 3
Appendix 4
CHAPTER ONE
1.0 Introduction.
Cardiovascular diseases are among the most devastating diseases in the world, and they can
kill within a short period. Seychelles is no exception: the disease has killed many people,
from children to the elderly. It is documented that 85% of these deaths are attributed to
cardiovascular disease (Kirkham et al., 2020).
Notionally, it is assumed that most cardiovascular diseases are associated with low- and
middle-income earners. In terms of population, people aged about 70 are the most affected by
cardiovascular diseases. In statistical terms, about 38% of the affected population was about
70 years of age (Williamson et al., 2020).
This has become a huge challenge, especially for Seychelles, an island country with limited
technological capacity to acquire the resources needed to improve its health care system.
Most cardiovascular diseases are caused by coronary artery disease (CAD), which causes many
deaths through the formation of atherosclerotic plaque. The plaque blocks the blood supply to
the heart, which results in a heart attack, clinically referred to as myocardial infarction.
To mitigate this, the disease must be tested for in advance so that it can be treated as soon
as it is detected. Detection methods are either invasive or non-invasive. Our focus in this
project is to use electrocardiogram images and to apply convolutional neural networks and
machine learning techniques to classify the kind of cardiovascular disease at hand.
Previously, coronary angiography (CA), an invasive method, has been regarded as the standard
mechanism for the diagnosis of heart disease. This operation is expensive and complex, and it
requires highly specialized expertise. Moreover, despite being the standard procedure, it
carries risks such as kidney problems and paralysis (Tupe et al., 2020).
1.2 Problem statement
Cardiac disease is counted among the worst killer diseases globally. If it is not detected in
advance, its effects are catastrophic. Because of its cost and its links to many other
diseases, it requires an appropriate detection mechanism. Many institutes and laboratories
have investigated heart attack diagnosis models for early diagnosis, along with many other
clinical mechanisms. The literature reviewed for this project identifies models such as
logistic regression, support vector machines, decision trees, and Random Forest, which have
been used for early diagnosis of the disease.
Despite the application of those models, statistics still show that a significant number of
people are dying in different parts of Seychelles, which raises the question of what the
problem is. Additionally, it was reported that this disease is associated with the income
levels of individuals; the reports also classified that in 2008 only in different parts of
Seychelles (Tupe et al., 2020).
1.3 Proposed solution
The proposed solution is a novel framework for classifying cardiovascular diseases from
electrocardiogram signal images using a convolutional neural network and machine learning
techniques (Jamil et al., 2022). This combination is appropriate for predicting the kind of
cardiovascular disease because previous models such as Random Forest, decision tree, and
logistic regression were not effective in predicting the actual disease. Once an
electrocardiogram signal image is presented to the model, it is classified and the underlying
condition is identified.
Figure 1: The proposed model for predicting cardiovascular disease (Williamson et al., 2020)
Step one is a very important aspect of the success of this project. If the electrocardiogram
input signal images are not well prepared, step four will struggle to achieve accurate feature
extraction (Kłosowski et al., 2020). In the denoising step, noise is removed from the images,
since noise makes it difficult to extract the correct features.
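The denoising step can be illustrated with a small sketch. The text does not say which filter
the framework uses, so the median filter below is an assumption chosen only because it is a
common way to suppress isolated spikes; the function name `median_filter`, the window size,
and the sample values are hypothetical.

```python
from statistics import median


def median_filter(signal, window=3):
    """Replace each sample with the median of its neighbourhood.

    Assumed denoising method for illustration; the neighbourhood is
    simply truncated at the two ends of the signal.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(signal)
    filtered = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        filtered.append(median(signal[lo:hi]))
    return filtered


# Made-up trace with two isolated spikes (5.0 and -4.0).
noisy = [0.1, 0.1, 5.0, 0.1, 0.2, 0.2, -4.0, 0.2, 0.2]
clean = median_filter(noisy, window=3)
```

A wider window removes longer bursts of noise but also flattens narrow peaks in the trace, so
the window size would need to be chosen against real electrocardiogram images.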
1.4 Aims and Objectives.
The aims of a project refer to the overarching purpose of the work. In this project our aims
include, and are limited to, the following:
(i) To classify electrocardiogram signals in order to detect cardiovascular disease by
applying supervised learning, specifically convolutional neural networks.
(ii) To target only patients in Seychelles; its small population is easier to reach and to
manage in a large project.
(iii) To apply various machine learning (supervised learning) methods and validate their
accuracy and speed in classifying electrocardiogram signal images.
(iv) To predict various cardiovascular diseases from electrocardiogram signals by applying
AlexNet and other machine learning techniques, such as the error matrix, to reveal which
disease is most prevalent among the many cardiovascular diseases.
(v) To predict other opportunistic diseases that arise as a result of cardiovascular
disease, for example kidney failure and paralysis.
1.5 Objectives of the project.
Objectives act as the roadmap and guiding principles that the project should follow.
These objectives include, but are not limited to:
(i) To determine which machine learning approaches are most suited to the
classification of electrocardiogram signals.
(ii) To investigate the accuracy of convolutional neural networks and machine learning
approaches, and to determine which methods are the most accurate.
1.6 The purpose of the study
The purpose of this study is to establish the causes of cardiovascular disease and the
appropriate mechanisms for addressing it. Convolutional neural networks and machine learning
are deployed to ensure that accurate predictions are made (Sambasivam et al., 2021). Put
plainly, our models need to predict with a high level of accuracy.
This project will complement existing mechanisms so that detection is fruitful and accurate.
We therefore develop a model, as demonstrated below, for detecting cardiovascular disease.
Figure 2: The framework for detection (Tupe et al., 2020).
1.8 Project hypothesis
The project hypothesis states what the analysis must prove or disprove. This is done in
chapter four of this project, where the findings and analysis of the experiment are presented.
In this project, a null and an alternative hypothesis are applied.
1.8.1 Null hypothesis
The null hypothesis is a statistical statement that there is no relationship between two
observed variables in a measured phenomenon.
Null hypothesis: Most cardiovascular disease is not associated with old age.
1.8.2 Alternate hypothesis.
Most cardiovascular diseases are caused by the age factor. This anticipates that an aging
population is more susceptible to most cardiovascular diseases.
1.9 Limitations of the study.
Study limitations are things that could not be done in the course of the project due to one or
more factors. For instance, the study was limited to Seychelles, and only a small population
was sampled for cardiovascular imaging. The outcome of the project may therefore include many
false positives, because it may not be a true reflection of what happens in other countries.
The project is limited to cardiovascular disease and does not set out its symptoms and root
causes (Han et al., 2019). If this is not fully addressed, models will be based on symptoms
rather than on the type of cardiovascular disease. After training, the model should be able to
learn the symptoms of various cardiovascular diseases and make predictions. If this is not
achieved, the implementation will not be reliable. We therefore recommend that the model be
able to learn and make appropriate predictions.
CHAPTER TWO.
2.0 Literature review
2.1 Background information
A great effort has been put forth to develop a method for detecting cardiovascular
disease early using a variety of clinical approaches.
Monogram et al. (2018) modeled a system to categorize heart diseases using a decision tree and
a support vector machine. The scholars concluded that a well-modeled decision tree classifier
can be more accurate when tested using a confusion matrix.
R.W. Jones and colleagues devised an approach based on convolutional neural networks to
forecast heart illness. The algorithm was trained and tested using self-administered,
open-ended questions. Notably, the algorithm had 3 hidden layers, which were trained and
tested using the backpropagation algorithm.
Anita yewan et al. (2021) argued that, when the performance of the genetic algorithm was
compared, it exhibited high performance and small error on a small dataset. To compare the
performance of the genetic components, the scholars used backpropagation, one of the training
algorithms for neural networks (Chung, 2020). When the same experiment was performed with
voluminous data, the backpropagation algorithm performed better. To achieve this, the dataset
should not contain any inconsistencies and no more than one inaccuracy.
Pourghasemi et al. trained a model referred to as a learning vector quantization neural
network. In this technique, 13 clinical characteristics, among others, were identified. When
the algorithm was trained, the accuracy of the model was 80% on the data used for training.
Manzai et al. suggested a hidden-layer ANN that uses radial basis functions and a large number
of neurons. When these functions were evaluated for effectiveness, the prediction accuracy was
roughly 97% (Chung, 2018).
El-Hasnony et al. presented a hybrid framework for predicting cardiovascular diseases that
incorporated a number of algorithms to improve accuracy. These models had a sensitivity of
74%, an accuracy of 82%, and a specificity of 93% (Bashir et al., 2020).
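The accuracy, sensitivity, and specificity figures quoted in this review all derive from the
four counts of a binary confusion matrix. The sketch below shows the standard definitions; the
function name `binary_metrics` and the counts passed to it are made up for illustration and do
not come from any of the cited studies.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, and specificity from confusion matrix counts.

    tp/fn: diseased cases predicted correctly/incorrectly.
    tn/fp: healthy cases predicted correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,     # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),     # share of diseased cases that are detected
        "specificity": tn / (tn + fp),     # share of healthy cases that are cleared
    }


# Hypothetical counts for 100 test cases.
metrics = binary_metrics(tp=40, fp=5, tn=45, fn=10)
```

Sensitivity is also called recall, so a study that reports both is reporting the same quantity
under two names.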
Ramprakaz et al. presented a deep neural network used to identify and select features. This
entailed the application of network-based algorithms and two statistical models.
In this feature selection approach, appropriate techniques and strategies were needed to avoid
overfitting and underfitting. The model proved viable, achieving an accuracy of 94%, a
sensitivity of 93%, and a recall of 94% (Bashir et al., 2020).
In 2017, a seminar and conference organized the Automatic Cardiac Diagnosis Challenge, which
evaluated various automatic methods for classification in areas such as health,
Antiretroviral, and machine learning. The challenge involved 5 categories and thematic areas
of particular concern. Several experts were involved in addressing these areas, and the
challenge found that appropriate tests need to be cross-examined so that the real cause of the
problem can be identified.
Pontone et al. (2020) studied the detection of coronary artery disease using non-invasive
imaging. This assessment provided very promising results. We adopted machine learning
techniques that could be used to improve the assessment of myocardial ischemia, which helps
reduce unnecessary downstream testing for this cardiovascular disease.
Figure 3: Structure of a forward pass convolutional neural network with SSD detector.
This figure demonstrates how cardiovascular disease is detected by an SSD network. A
convolutional neural network is trained, and the outcome is predicted from the selected
features. The input in this context is the electrocardiogram signal images taken from various
individuals. We expect the selected features to be the predisposing factors of various
cardiovascular diseases.
Recent research has shown that state-of-the-art machine learning has potential for growth,
with promising outcomes in the detection and classification of selected features in the
clinical environment through imaging. The majority of investigations treated multidimensional
electrocardiogram signal images as a number of time series. The classification used deep
machine learning as the key concept. Pericardiograms and electrocardiograms have long been
used to diagnose cardiovascular illness.
Convolutional neural networks are a cutting-edge technique for detecting and classifying
electrocardiogram signal images, and they have been examined in a variety of forms, including
1-dimensional, 2-dimensional, and a hybrid of the two. Noman et al. established a model that
learns features directly from raw cardiac signals using a 1-dimensional convolutional neural
network, and a 2-dimensional convolutional neural network using a 2-dimensional time-frequency
saliency map. A model for automatic wearable electrocardiogram categorization was suggested by
Xia et al. Huang et al. converted five different forms of cardiac signals into time-frequency
spectrograms and then trained a 2-dimensional convolutional neural network to classify
different types of arrhythmias. By connecting the dots of 1D signals, Lu et al. turned 1D
signals into 2D pictures. Ji et al. similarly took a one-dimensional signal and transformed
all data into two dimensions.
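The idea of turning a 1D signal into a 2D picture by connecting its samples can be sketched as
follows. This is an illustration of the general idea only, not the procedure of Lu et al. or
Ji et al.; the function name `signal_to_image`, the grid height, and the sample values are
assumptions.

```python
def signal_to_image(signal, height=8):
    """Rasterize a 1D signal into a binary image (list of rows, row 0 at the top).

    Each sample becomes one column; consecutive samples are joined by
    filling the rows between them, so the trace is a connected line.
    """
    lowest, highest = min(signal), max(signal)
    span = (highest - lowest) or 1.0  # avoid division by zero for a flat signal
    rows = [round((highest - s) / span * (height - 1)) for s in signal]
    image = [[0] * len(signal) for _ in range(height)]
    previous = rows[0]
    for col, row in enumerate(rows):
        for r in range(min(previous, row), max(previous, row) + 1):
            image[r][col] = 1
        previous = row
    return image


# Made-up four-sample signal drawn on a 5-row grid.
image = signal_to_image([0.0, 1.0, 0.0, 0.5], height=5)
```

The resulting grid can be fed to a 2-dimensional convolutional neural network in the same way
as any other single-channel image.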
Project methodology
3.1 Introduction.
Project methods are the procedures and strategies used to collect material facts (evidence) in
order to test the hypothesis put forward and to uncover new information or create a better
understanding of the topic. Research helps reveal the information that is a prerequisite for
uncovering ideas in the areas of concern. Additionally, tools and instruments are used to
analyze the effectiveness of the proposed model (Zivkovic et al., 2022). In essence, machine
learning approaches are designed to ensure that the work is executed and that better feature
selection is made.
The data to be extracted for feature selection must revolve around the predisposing factors of
cardiovascular diseases. Research can be either qualitative or quantitative. This project
focuses on quantitative research.
3.2 Requirement analysis.
Requirement analysis is one of the most important aspects of developing any kind of system.
Requirement gathering needs a disciplined approach so that the main area of concern is
articulated without problems (Gstrein, 2022). Requirement analysis requires one to formulate
smart objectives and requirements that are achievable.
The objectives set need to be achieved so that we can ascertain that our project is
achievable. The research gap, which is part of the requirements, is identified through a
thorough analysis of earlier work. This means that we carry out our analysis from the rich
literature available online. To perform these requirement processes, one needs requirement
gathering tools, which include, but are not limited to, the following:
3.2.1 Observation
Observation is one of the main methods of gathering requirements in this project, for example
for feature selection and for the challenges of overfitting and underfitting. If we critically
observe the underlying conditions, we can establish whether the features taken for training
and testing match our required criteria. Observation is therefore a very important method of
data collection in this project. Various cardiovascular disease symptoms are mapped and
matched clinically. In this context, the developer of the framework must collaborate with
doctors to gain a better understanding of clinical symptoms and to confirm whether the model
is working well. This includes the value of the predictions.
3.2.2 Interviews
The interview is another method of requirement gathering. It can take place as an oral
interview, in which one asks questions and receives feedback instantly while recording the
responses (Falb et al., 2019). The second approach involves administering structured and
unstructured questions so that respondents can give considered answers. In the interview
process, various clinics were visited so that specialists dealing with cardiovascular disease
could be interviewed and give a brief history of the predisposing factors.
Other stakeholders consulted included nurses and clinicians who are involved in taking care of
and monitoring patients with cardiovascular disease. When these questions are discussed with
these professionals, the proper requirements can be articulated.
3.2.3 User stories.
User stories are an important factor in requirement gathering that cannot be ignored. The
stakeholders are the ones who give a brief history and information on the real causes and
symptoms of this kind of cardiovascular disease (Sato et al., 2021): what they think causes it
and what measures have been taken to address it. When a broad perspective is taken in this
way, many requirements can be collected and refined.
3.2.4 Using case studies of similar convolutional neural network frameworks and machine
learning.
To identify whether our models are likely to work and produce effective results, case studies
of similar models need to be undertaken. This is also part of the literature review, which
needs to be analyzed critically (Rehman et al., 2020). It involves carrying out thorough
research on whether another kind of disease predicted using the same model worked or not. It
is valuable to identify and ascertain the key components of the environment. The information
collected helps ensure that every task required in the analysis has been taken care of. This
is the method adopted in the project for acquiring requirements, together with case studies of
similar systems.
3.3 Data collection methods.
Questionnaires and oral interviews were the methods used to collect the data. The
questionnaire is attached in the appendix to illustrate the questions posed to the doctors and
to the patients. These questions were sampled to give some insight and to enhance the findings
of our report.
3.4 Hardware and software requirements.
Software and hardware requirements are vital in system development, as they specify what is
required to support development of the system from inception to completion. If the hardware
requirements are not well configured, the system will run slowly. More memory, well managed,
allows the system to be configured effectively and to execute faster.
3.3.1 Software requirements.
Jupyter Notebook: This is an interactive web-based framework for Python that assists in
computing. It is used by a range of individuals, from scientists to engineers. This framework
will be used to develop this project.
Database: This is the repository in which numerous symptoms of cardiovascular disease will be
stored. Other attributes to be stored include patient particulars and the illness they are
likely to be suffering from. Oracle SQL will be used as the database to store the patient
records used for training and prediction.
AlexNet: This is the architecture that will be used by the convolutional neural network to
enhance the detection of various cardiovascular diseases.
Figure 4: AlexNet performance (Tan, 2019)
3.3.2 Hardware requirements.
4GB RAM: This is a very important hardware component; it determines the speed of execution of
the application. A small random access memory makes the system run slowly, as processes may
stall because of the problem of storing the intermediate variables of our programs during
execution.
The central processing unit (CPU): The processor executes the threads and processes scheduled
by the operating system. The speed of the processor determines how quickly each process runs.
In this project, the processor is a Core i5. The processor is the most important component for
processing data; since the software will be heavy, we need a fast processor.
500GB hard disk: This is where all our data and applications will reside. Data servers will
require this kind of storage in order to hold doctors’ data. As this is a real-time project,
it requires a fast machine with enough storage for the space requirements of the algorithms
being executed.
Artificial intelligence hardware: This is hardware used to link the software and the hardware
together so they can perform specific functions.
Input devices: These are devices that facilitate capturing data into our system, namely the
keyboard and mouse.
Output devices: These are the devices through which we receive output from our systems. They
include, but are not limited to, monitors and printers.
CHAPTER FOUR
Data Analysis and Findings.
4.1 Introduction to data analysis.
Data analysis is the process of carrying out experimental analysis to establish the validity
of the hypothesis we proposed. The data therefore needs to be critically analyzed and
processed effectively (Ghorbanian, 2020). Data analysis is regarded as an important part of
any kind of research, as it gives meaning to the data collected (Roberts et al., 2019). The
data is verified through screening, analysis, and interpretation. This gives appropriate
reasoning for the data collected and helps to predict relationships and trends in it.
4.2 Dataset
This refers to the data available for analysis. Without appropriate and well-selected data,
the accuracy of the result can be compromised. The features for this process need to be
extracted from well-sanitized data (Michel, 2022). All data processing should ensure that the
images used in the experiment are free from noise. That is why the model incorporates a
denoising step, so that images are not affected by salt noise. The dataset is available at the
link below.
Link: https://www.kaggle.com/datasets/shayanfazeli/heartbeat?select=mitbih_test.csv
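Reading the dataset file can be sketched as follows. The sketch assumes, as is the case for
the published mitbih_test.csv, that the file has no header row and that each row holds the
signal samples followed by a numeric class label in the last column; the function name
`load_beats` and the two short rows used here are made up so that the example runs without the
real file.

```python
import csv
import io


def load_beats(handle):
    """Split each CSV row into signal samples and a trailing integer class label."""
    signals, labels = [], []
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        values = [float(v) for v in row]
        signals.append(values[:-1])
        labels.append(int(values[-1]))
    return signals, labels


# Two shortened, made-up rows standing in for the real file.
sample = io.StringIO("0.9,0.5,0.1,0.0,0.0\n0.8,0.4,0.2,0.0,2.0\n")
signals, labels = load_beats(sample)
```

With the real file, the same function would be called on `open("mitbih_test.csv", newline="")`.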
4.3 The experimental setup.
This refers to the process of setting up the facilities used in the experiment. They need to
be precise, with few inaccuracies, so that the sensitivity and precision of the experiment
remain high.
To perform this experiment, we set up the following environment on our personal computers.
Our workstation was equipped as follows.
(i) Intel quad-core i5 processor. This ensures that the simulations on the data we are
training run correctly, with sufficient processing resources available. This is
sufficient to produce the required results.
(ii) 500 GB hard disk, used to store the records of the features that need to be
extracted.
Figure 5: Predicting model (Sato, 2021)
Convolutional neural network (CNN) models are used extensively in a variety of applications,
including self-driving cars, computer vision, and face and object recognition
(Lapsiya et al., 2021). The model used here is a deep convolutional neural network that can
handle a wide range of inputs, including 1D and 2D signals (Wu, 2018). Its layers are, in
order: the input layer, the convolutional layer, the rectified linear unit (ReLU) layer, the
fully connected layer, the classification layer, and finally the output layer.
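The layer sequence described above can be illustrated with a small forward pass written in
plain Python. This is a sketch only: the kernel, weights, and input values are made up for
illustration, a softmax stands in for the classification layer, and a real model would learn
its parameters during training rather than have them written by hand.

```python
import math


def conv1d(signal, kernel):
    """Convolutional layer: slide the kernel over the signal (no padding)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """ReLU layer: negative activations are set to zero."""
    return [max(0.0, v) for v in values]


def dense(values, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [
        sum(w * v for w, v in zip(row, values)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(scores):
    """Classification layer: turn scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Input layer: a made-up five-sample signal.
signal = [0.0, 1.0, 0.0, -1.0, 0.0]
features = relu(conv1d(signal, kernel=[1.0, 0.0, -1.0]))
scores = dense(features, weights=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], biases=[0.0, 0.0])
probabilities = softmax(scores)
# Output layer: the index of the most probable class.
predicted_class = probabilities.index(max(probabilities))
```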
A convolutional neural network learns feature extraction directly from the data, replacing
manual feature extraction. The results obtained with this approach have been good
(Min et al., 2020). It can be trained to carry out well-known tasks and can build on existing
networks. It is largely used in computer vision, which has a variety of applications.
A convolutional neural network is made up of several separate layers, combined with layers of
activation functions. It is an important class of neural networks (Tupe, 2020) and is applied
to a wide range of computer vision problems. In 2-7, there are both nonlinear and multilinear
layers. Its data and image feature extraction methods are becoming increasingly popular and
well known (Nieto et al., 2019). Every layer of the network picks up on a key feature of the
data. Handcrafted, traditional filters and texture patterns have now been replaced by the
sophisticated learning algorithms of convolutional neural networks.
Figure 6: Mindray image
List of incorporated data in the data set parameters
Attribute       Description                                        Possible values
Age             The exact age of the person, captured at triage    Whole numbers
Sex             -                                                  Boy or Girl
Name            Name of the client whose ECG appears               -
PR interval     Pulse rate interval                                130 ms
QRS interval    Represents time for ventricular depolarization     72 ms
Status          Normal or abnormal ECG                             1 refers to true, 0 refers to false
4.4 Performance metrics
All artificial intelligence algorithms need to be maintainable and accurate as they execute
their task. For instance, classifiers such as SVM, decision tree, random forest, and naive
Bayes are expected to deliver a high level of accuracy and precision on the tasks they are
applied to (Shah et al., 2020). Performance metrics are therefore a critical part of
evaluating classifiers and cannot be left out of the discussion.
In machine learning, several criteria are considered when evaluating the performance of a
classifier, including but not limited to the following.
(i) Precision: This measures the exactness of a classifier, that is, the fraction of its
positive predictions that are correct (Appavu, 2019). A high precision means that the
model has produced few false positives.
\[ \text{Precision} = \frac{TP}{TP + FP} \tag{i} \]
where TP is the number of true positives and FP is the number of false positives.
(ii) Recall: This measures the completeness of a classifier, that is, the fraction of the
actual positive cases that it identifies.
\[ \text{Recall} = \frac{TP}{TP + FN} \tag{ii} \]
where TP is the number of true positives and FN is the number of false negatives.
(iii) F-score: This combines precision and recall into a single value, their harmonic mean.
It is computed as follows.
\[ \text{F-score} = \frac{2 \times \text{Recall} \times \text{Precision}}{\text{Recall} + \text{Precision}} \]
A high F-score indicates that the classifier achieves both high precision and high recall.
Ideally, when all these metrics are high, the model will predict the required classes
reliably (Chekroud et al., 2021). In short, the accuracy of the values a classifier produces
depends on the quality of the input information it has been given.
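As an illustration of the metrics in this section, the sketch below computes precision,
recall, and the F-score from counts of true positives, false positives, and false negatives.
It uses the standard F1 definition (the harmonic mean of precision and recall). The function
name and the counts are made up for illustration and are not results from this project.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion counts.

    A ratio whose denominator is zero is reported as 0.0 by convention.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 90 true positives, 10 false positives, 30 false negatives.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
```

With these counts the precision is 0.9 and the recall is 0.75, so the F-score falls between
the two and closer to the lower value, which is the intended behaviour of a harmonic mean.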
4.5 Data preparation
Data preparation involves making sure that the electrocardiogram signal images available are
clear enough for classification (Butt, 2021). This involves accessing the data stored on the
local system and calling the external libraries responsible for carrying out the prediction.
As stated earlier, the programming language used here is Python, chosen for its rich machine
learning libraries.
It is worth noting that well-cleaned data is very important, as it improves the reliability
of the data. The steps represented below help to ensure that appropriate information is
available at the time of model prediction.
Figure 6: The model in feature extraction and prediction (Pontone, 2020).
This is the original model that cardiologists in Seychelles still use today. Early discovery
of the disease is critical, which suggests that further research is needed before proposing a
novel design and tactics for its early detection. A cardiologist at Dr. Jivan Clinic
Seychelles uses this machine model. Dr. Jivan Clinic Seychelles receives donations from all
around the world, so the machines meet EU requirements. This machine will remain in use for
many years, until Seychelles receives a contribution for a new machine.
4.6 Dataset classification using convolutional neural networks.
The image data was gathered and stored safely so that analysis could be performed. The link
below points to the dataset that was used.
Link: https://www.kaggle.com/datasets/shayanfazeli/heartbeat?select=mitbih_test.csv
Figure 7: Showing the trained data (Bakirtzis, 2021).
The output data will be used for training.
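As a sketch of how the linked file could be read, the helper below parses comma-separated
rows into signals and labels using only the Python standard library. It assumes that each
row holds the signal samples followed by the class label in the last column and that there
is no header row; the function name load_heartbeats is hypothetical, and the in-memory sample
uses three values per row purely to keep the example short.

```python
import csv
import io


def load_heartbeats(csv_file):
    """Read rows of numbers from an open CSV file; return (signals, labels).

    Assumption: the last value in each row is the class label and the
    preceding values are the signal samples. No header row is expected.
    """
    signals, labels = [], []
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip blank lines
        values = [float(v) for v in row]
        signals.append(values[:-1])
        labels.append(int(values[-1]))
    return signals, labels


# Tiny in-memory stand-in for the real file.
sample = io.StringIO("0.9,0.5,0.1,0.0\n0.8,0.4,0.2,1.0\n")
signals, labels = load_heartbeats(sample)
```

For the downloaded file, the same function could be called as
load_heartbeats(open("mitbih_test.csv", newline="")), assuming the file sits in the working
directory.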
4.7 The training data regime
Figure 8: trained output
As the figure above shows, the model was trained successfully using a convolutional neural
network. Training ensures that all beneficial features are selected so that accuracy can be
enhanced. In this step, cleaned data that is free from null values is used
(Al Awar et al., 2018). Python provides mechanisms for training and for translating data into
an acceptable form. The data is first split into two sets: one for training and the other for
testing.
We train on the data so as to achieve accurate results. After training, the output is
displayed so that predictions can be assessed. The following is the output of our
convolutional neural network model after training.
Counting the values involved in training, we have a total of 187 characteristics per record
that are used to train the model. The records are divided into 5 categories, which are used
for training and prediction.
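The split into training and testing data, and the count of records per category, can be
sketched as follows. The 20% test fraction, the fixed random seed, and the made-up records
are assumptions for illustration; the chapter does not state the proportions that were
actually used.

```python
import random
from collections import Counter


def split_train_test(signals, labels, test_fraction=0.2, seed=0):
    """Shuffle record indices and split them into training and testing sets."""
    indices = list(range(len(signals)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [signals[i] for i in train_idx]
    y_train = [labels[i] for i in train_idx]
    x_test = [signals[i] for i in test_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, y_train, x_test, y_test


# Made-up records: 10 short signals with labels drawn from 5 categories (0 to 4).
signals = [[float(i), float(i) + 0.5] for i in range(10)]
labels = [i % 5 for i in range(10)]
x_train, y_train, x_test, y_test = split_train_test(signals, labels)
class_counts = Counter(labels)  # number of records in each category
```

Checking class_counts before training is a quick way to see whether some categories are much
rarer than others, which affects how precision and recall should be read.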
4.8 Convolutional neural network prediction.
This is the output of our trained model. We fit the model to our attributes x_train and
y_train using the function model.fit. The model then predicts the likelihood of certain
cardiovascular diseases. Holding other factors constant, it is evident that the model had a
reduced rate of 0.00000000000037997961.
CHAPTER FIVE
5.0 Discussion of the findings.
The experiment is validated at this point. In essence, this is the validation of the
hypothesis suggested at earlier stages of the project. The analysis itself is critical and
has been addressed in the previous chapter. Some of the concerns arising from it are raised
here.
5.1 Accuracy of the model
The accuracy of the model has been achieved by applying vital techniques such as machine
learning to predict cardiovascular diseases that might be present (Tengnah et al., 2019). The
model's predictions indicate that machine learning is very effective in predicting the
disease in question. Once precision and F1-score have been identified, the accuracy of the
model can be established.
The classifier performance metrics obtained from the convolutional neural network are as
follows.
Performance metric    Identified figure
Precision             0.8701
Recall                0.8824
F-score               0.8672
Accuracy              0.8972
5.2 prediction explanation.
The figure above is the result of a prediction from our model. It shows the accuracy of our
model as predicted using the values on which it was trained (Khemasuwan et al., 2020). With
the heartbeat rhythm graph, we can identify the movement of the heart and predict which of
the already identified diseases may be present. With this kind of model in place, we can
therefore expect reduced levels of cardiovascular disease (Siontis et al., 2021). As can be
seen, the alternative hypothesis has been validated, as these cardiovascular diseases are not
associated with an aging population. They arise from rhythmic changes or weakness in a
certain part of the heart.
5.3 Convolutional neural network prediction approach
An intriguing aspect is the ability to extract interesting feature images from the filters of
multiple convolutional kernels (Bragman et al., 2019). In the convolutional layer, there are
many kernels of comparable size. In addition, the convolution layer employs filter groups.
The convolutional features are mapped in this case. A filter in this layer is used to replace
a filter that is no longer in use.
Compared to other tools, this approach comes out ahead. It is capable of resolving a variety
of image classification challenges. The RGB picture size is 256 by 256 pixels (Tan, 2019). As
a result, all images for testing and training must be that size. Input of this size is not
always available, so an image is cropped and resized to get the desired image size.
Furthermore, the two layers are incompatible. ReLU models are models that are based on ReLU.
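The chapter does not say how images are cropped before resizing. As one possible approach,
assuming a center crop to the largest square followed by scaling to 256 by 256 pixels, the
crop box and scale factor can be computed as below. Both function names are hypothetical.

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def resize_scale(width, height, target=256):
    """Scale factor that maps the cropped square onto target x target pixels."""
    return target / min(width, height)


# Made-up example: a 640 by 480 image.
box = center_crop_box(640, 480)
scale = resize_scale(640, 480)
```

With an imaging library such as Pillow, a box of this form could be passed to Image.crop and
the result to Image.resize((256, 256)); that pairing is a suggestion, not the method used in
this project.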
There are a dozen leads in a standard electrocardiogram (ECG). Half of them are referred to
as "limb leads" since they are worn on patients' legs and arms. The limb leads are named I,
II, III, aVL, aVR, and aVF. The others are referred to as precordial leads, because they are
placed on the precordium, or torso.
5.4 Conclusion and future work.
Prediction of heart disease may prevent a number of deaths that would otherwise take place.
In fact, the model provides information on cases that will require a specialist to analyze
the images and make a prediction. This also makes the approach cost-effective, as the
framework acts as a recommender system (Barkadehi et al., 2018). Whoever uses the model has
to provide the system with the required inputs; the model, which has already learned from the
data, then automatically produces output whose accuracy depends on the kind of input
provided.
Incorrect input implies that the prediction will not be accurate at all. What needs to be
understood is that no matter how accurate our models are, we need to provide quality input to
ensure that the results are correct.
In evaluating the model, it is evident that prediction of cardiovascular disease has been
made faster, because we made sure that our data set exhibited cardiovascular test parameters
as well as habits of human beings (Bakirtzis, 2021). It is also evident that the model's
accuracy exceeds that of models established in previous works.
5.5 Future work
Future work gives us an intuition of what ought to happen in order to further enhance the
accuracy of our work, because not everything could be achieved within the set time or under
the financial constraints. Our future model will be extended to detect other diseases, such
as cancer and brain diseases. The mechanism will also be fully integrated to ensure that it
works well.
References
Al Awar, N., Zhu, S., Biros, G. and Gligoric, M., 2021, June. A performance portability
framework for Python. In Proceedings of the ACM International Conference on
Supercomputing (pp. 467-478).
Appavu alias Balamurugan, S. and Salomi, M., 2021. A predictive risk level classification of
diabetic patients using deep learning modified neural network. Journal of Ambient
Intelligence and Humanized Computing, 12(7), pp.7703-7713.
Butt, F.S., Blunda, L.L., Wagner, M.F., Schäfer, J., Medina-Bulo, I. and Gómez-Ullate, D.,
2021. Fall detection from electrocardiogram (ECG) signals and classification by deep
transfer learning. Information, 12(2), p.63.
Barkadehi, M.H., Nilashi, M., Ibrahim, O., Fardi, A.Z. and Samad, S., 2018. Authentication
systems: A literature review and classification. Telematics and Informatics, 35(5),
pp.1491-1511.
Bakirtzis, G., Genovese, F. and Fleming, C.H., 2021. Yoneda hacking: The algebra of
attacker actions. arXiv preprint arXiv:2103.00044.
Bashir, D., Montanez, G.D., Sehra, S., Segura, P.S. and Lauw, J., 2020, November. An
information-theoretic perspective on overfitting and underfitting. In Australasian Joint
Conference on Artificial Intelligence (pp. 347-358). Springer, Cham.
Bragman, F.J., Tanno, R., Ourselin, S., Alexander, D.C. and Cardoso, J., 2019. Stochastic
filter groups for multi-task CNNs: Learning specialist and generalist convolution kernels.
In Proceedings of the IEEE/CVF International Conference on Computer Vision
(pp. 1385-1394).
Chung, H. and Shin, K.S., 2020. Genetic algorithm-optimized multi-channel convolutional
neural network for stock market prediction. Neural Computing and Applications, 32(12),
pp.7897-7914.
Chekroud, A.M., Bondar, J., Delgadillo, J., Doherty, G., Wasil, A., Fokkema, M., Cohen, Z.,
Belgrave, D., DeRubeis, R., Iniesta, R. and Dwyer, D., 2021. The promise of machine
learning in predicting treatment outcomes in psychiatry. World Psychiatry, 20(2),
pp.154-170.
El-Hasnony, I.M., Elzeki, O.M., Alshehri, A. and Salem, H., 2022. Multi-label active
learning-based machine learning model for heart disease prediction. Sensors, 22(3), p.1184.
Falb, K., Laird, B., Ratnayake, R., Rodrigues, K. and Annan, J., 2019. The ethical contours of
research in crisis settings: five practical considerations for academic institutional review
boards and researchers. Disasters, 43(4), pp.711-726.
Gong, Q., Zhang, P., Wang, J., Ma, J., An, Y., Chen, Y., Zhang, B., Feng, X., Li, H., Chen,
X. and Cheng, Y.J., 2019. Morbidity and mortality after lifestyle intervention for people with
impaired glucose tolerance: 30-year results of the Da Qing Diabetes Prevention Outcome
Study. The Lancet Diabetes & Endocrinology, 7(6), pp.452-461.
Gstrein, O.J. and Beaulieu, A., 2022. How to protect privacy in a datafied society? A
presentation of multiple legal and conceptual approaches. Philosophy & Technology, 35(1),
pp.1-38.
Ghorbanian, A., Kakooei, M., Amani, M., Mahdavi, S., Mohammadzadeh, A. and Hasanlou,
M., 2020. Improved land cover map of Iran using Sentinel imagery within Google Earth
Engine and a novel automatic workflow for land cover classification using migrated training
samples. ISPRS Journal of Photogrammetry and Remote Sensing, 167, pp.276-288.
Han, Y., Xie, H., Liu, Y., Gao, P., Yang, X. and Shen, Z., 2019. Effect of metformin on
all-cause and cardiovascular mortality in patients with coronary artery diseases: a systematic
review and an updated meta-analysis. Cardiovascular Diabetology, 18(1), pp.1-16.
Jamil, S. and Rahman, M., 2022. A Novel Deep-Learning-Based Framework for the
Classification of Cardiac Arrhythmia. Journal of Imaging, 8(3), p.70.
Khemasuwan, D., Sorensen, J.S. and Colt, H.G., 2020. Artificial intelligence in pulmonary
medicine: computer vision, predictive model and COVID-19. European Respiratory
Review, 29(157).
Kłosowski, G., Rymarczyk, T., Wójcik, D., Skowron, S., Cieplak, T. and Adamkiewicz, P.,
2020. The use of time-frequency moments as inputs of LSTM network for electrocardiogram
signal classification. Electronics, 9(9), p.1452.
Lapsiya, Y., Jain, D., Shah, P. and Kachare, A., 2021, August. Analysis Of Various Object
Detection Techniques for Self-Driving Cars. In 2021 Asian Conference on Innovation in
Technology (ASIANCON) (pp. 1-6). IEEE.
Manogaran, G., Varatharajan, R. and Priyan, M.K., 2018. Hybrid recommendation system for
heart disease diagnosis based on multiple kernel learning with adaptive neuro-fuzzy inference
system. Multimedia Tools and Applications, 77(4), pp.4379-4399.
Michel, F., O'Neill, T., Rynk, R., Bryant-Brown, M., Calvez, V., Li, J. and Paul, J., 2022.
Contained and in-vessel composting methods and methods summary. In The Composting
Handbook (pp. 271-305). Academic Press.
Min, H., Wilson, D., Huang, Y., Liu, S., Crozier, S., Bradley, A.P. and Chandra, S.S., 2020,
April. Fully automatic computer-aided mass detection and segmentation via pseudo-color
mammograms and mask R-CNN. In 2020 IEEE 17th International Symposium on Biomedical
Imaging (ISBI) (pp. 1111-1115). IEEE.
Nieto, Y., García-Díaz, V., Montenegro, C., González, C.C. and Crespo, R.G., 2019. Usage of
machine learning for strategic decision making at higher educational institutions. IEEE
Access, 7, pp.75007-75017.
Pontone, G., Guaricci, A.I., Palmer, S.C., Andreini, D., Verdecchia, M., Fusini, L.,
Lorenzoni, V., Guglielmo, M., Muscogiuri, G., Baggiano, A. and Rabbat, M.G., 2020.
Diagnostic performance of non-invasive imaging for stable coronary artery disease: A
meta-analysis. International Journal of Cardiology, 300, pp.276-281.
Pourghasemi, H.R., Gayen, A., Lasaponara, R. and Tiefenbacher, J.P., 2020. Application of
learning vector quantization and different machine learning techniques to assessing forest fire
influence factors and spatial modelling. Environmental Research, 184, p.109321.
Roberts, K., Dowell, A. and Nie, J.B., 2019. Attempting rigour and replicability in thematic
analysis of qualitative research data; a case study of codebook development. BMC Medical
Research Methodology, 19(1), pp.1-8.
Rehman, A., Naz, S., Razzak, M.I., Akram, F. and Imran, M., 2020. A deep learning-based
framework for automatic brain tumors classification using transfer learning. Circuits,
Systems, and Signal Processing, 39(2), pp.757-775.
Siontis, K.C., Noseworthy, P.A., Attia, Z.I. and Friedman, P.A., 2021. Artificial
intelligence-enhanced electrocardiography in cardiovascular disease management. Nature
Reviews Cardiology, 18(7), pp.465-478.
Shah, K., Patel, H., Sanghvi, D. and Shah, M., 2020. A comparative analysis of logistic
regression, random forest and KNN models for the text classification. Augmented Human
Research, 5(1), pp.1-16.
Sambasivam, G. and Opiyo, G.D., 2021. A predictive machine learning application in
agriculture: Cassava disease detection and classification with imbalanced dataset using
convolutional neural networks. Egyptian Informatics Journal, 22(1), pp.27-34.
Sato, E. and Chen, J.C., 2021. Rise to the occasion: The trajectory of a novice Japanese
teacher's first online teaching through action research. Language Teaching Research, 25(2),
pp.306-329.
Tan, M. and Le, Q.V., 2019. Mixconv: Mixed depthwise convolutional kernels. arXiv
preprint arXiv:1907.09595.
Tupe, P.R., Vibhute, P.M. and Sayyad, M.A., 2020, December. An Architecture Combining
Convolutional Neural Network (CNN) with Batch Normalization for Apparel Image
Classification. In 2020 IEEE International Symposium on Sustainable Energy, Signal
Processing and Cyber Security (iSSSC) (pp. 1-6). IEEE.
Tengnah, M.A.J., Sooklall, R. and Nagowah, S.D., 2019. A predictive model for hypertension
diagnosis using machine learning techniques. In Telemedicine Technologies (pp. 139-152).
Academic Press.
Williamson, K., Nimegeer, A. and Lean, M., 2020. Rising prevalence of BMI ≥ 40 kg/m2: A
high-demand epidemic needing better documentation. Obesity Reviews, 21(4), p.e12986.
Wu, H. and Zhao, J., 2018. Deep convolutional neural network model based chemical process
fault diagnosis. Computers & Chemical Engineering, 115, pp.185-197.
Zivkovic, M., Stoean, C., Chhabra, A., Budimirovic, N., Petrovic, A. and Bacanin, N., 2022.
Novel Improved Salp Swarm Algorithm: An Application for Feature
Selection. Sensors, 22(5), p.1711.
7.0 Appendices
Appendix 1: Gantt chart
Gantt chart columns: Task, Week 1 through Week 15
Tasks: Chapter one, Literature review, Methodology, Modelling, Results, Appendix,
Presentation
Appendix 2: Project timeline
SR.   List of Milestones                Present status
1     Chapter one: Background study     Complete
2     Literature Review                 Complete
3     Project methodology               Complete
4     Data analysis                     Complete
5     Discussion                        Complete
6     Conclusion                        Complete
9     References and Appendices         Complete
Appendix 3
Questionnaire
Questions for doctors
(i) What kind of patient needs an electrocardiogram (ECG)?
You should consider having an ECG if you experience symptoms such as high blood
pressure or symptoms of heart disease.
(ii) What problems can an ECG detect?
Arrhythmias: This is where the heart beats too slowly, too quickly, or in an
irregular manner.
(iii) What do you look for in an ECG?
You look at the ECG (EKG) to check whether the heart rate is regular or irregular.
(iv) When do you do an ECG?
When you have any condition that affects the activity of the heart.
(v) How often do you find a problem?
You should be checked regularly if you have a heart condition, and less regularly if
you do not.
(vi) Who leads an ECG?
A cardiologist leads this, working closely alongside a general practitioner.
Questions of a patient to a doctor
(vii) Why do we need an ECG?
To check the condition of our hearts at all times. Heart diseases are serious,
unpredictable, and very expensive to treat.
(viii) Is there any other diagnostic machine other than ECG to detect my problem?
Yes, such as computerized tomography (CT).
(ix) How long does the test take?
5-10 minutes.
Appendix 4
Letter
