classification of leukemia image using genetic based k ... · in an image processing, the k-nearest...

5
Asian Journal of Computer Science and Technology ISSN: 2249-0701 Vol.7 No.2, 2018, pp.113-117 © The Research Publication, www.trp.org.in Classification of Leukemia Image Using Genetic Based K-Nearest Neighbor (G-KNN) M. Bennet Rajesh 1 and S. Sathiamoorthy 2* 1&2 Assistant Professor, Division of Computer and Information Science Annamalai University, Annamalai Nagar, Tamil Nadu, India E-Mail:[email protected] * Corresponding Author Abstract - In the medical diagnostic system, categorization of blood cell is more vital to analyze and detect the disease. The diseases that are associated with blood can be divided only after the categorization of blood cell. Leukemia is a blood cancer that starts with bone marrow. Therefore, it must be treated at the initial stage and makes to death if left unprocessed. This paper introduces a new Genetic based KNN pre-processing approach for removing the noise in Leukemia image without affecting the accuracy of an image. This paper integrates the Genetic algorithm and KNN for noise removal and pre-processing of Leukemia image datasets. Keywords: Leukemia, K-Nearest Neighbor, Genetic Algorithm (GA), Pre-Processing, Noise Removal, Median Filter approach 1. INTRODUCTION Leukemia is a kind of cancer [1] that marks the bone marrow causing the better fabrication of white blood cells which inflow the bloodstream which is more essential, infection-fighting part of your immune system, made in bone marrow. Hence, diagnose myelogenous leukemia has perfectly analyzed a physician will edict a comprehensive blood count bone mellow biopsy and cytogenetic tests. Leukemia is a malignancy (cancer) of blood cells. In leukemia, atypical blood cells have formed in the bone marrow. White Blood Cells (WBC) are the cells accountable for fighting infections. Leukaemia comprises the fabrication of the abnormal WBC. II. RELATED WORKS The following table 1 represents the literature survey on the Image Processing techniques used for the detection of the Leukemia. Niranjan Chatap, Sini Shibu in [2], analyzed that Classical watershed-based algorithm overcomes the problem of Cell extraction and segmentation from massive, noisy images. A. Arputha Regina in [3], suggested that K- Means Clustering, Hausdorff Dimension, Local Binary Pattern for detecting the leukemia using segmentation, Feature Extraction, and Classification. Tejashri G. Patil, V. B. Raskar in [4] used Contour Signature, Otsu‟s Segmentation for detecting the acute leukemia using feature extraction. C.Vidhya, P.Saravana Kumar, K. Keerthika, C.Nagalakshmi, B.Medona devi in [5] used K-Means Clustering, SVM Classifiers for input image is resized and CIELAB color conversion. Extraction of nuclei from the leukocytes. Mrs. Trupti A. Kulkarni-Joshi, Prof. Dilip S. Bhosale in [6] suggested Otsu‟s Thresholding Algorithm used for segmentation of Acute Lymphoblastic Leukemia. Himali P. Vaghela, Hardik Modi, Manoj Pandya, Manoj Pandya in [7] used K-Means Clustering, Histogram for shape-based feature finding is more accurate than watershed transform. Sonal G. Deore Prof. Neeta Nemade in [8] Adaptive Pre- Filtering and Segmentation Algorithms for extracting morphological indexes from those cells, and finally, it classifies the presence of leukemia. Subrajeet Mohapatra, Dipti Patra and Sanghamitra Satpathy in [9] used K-Means, KMedoid, Fuzzy C-Means (FCM), Gustafson Kessel (GK), Fuzzy Possibilistic C Means, Hausdorff Dimension, Contour Signature, SVM Classifier for WBC nucleus segmentation of stained blood smear images followed by relevant feature extraction for leukemia detection. Sos Agaian, Monica Madhukar, Anthony T. Chronopoulos in [10] utilized K-Means Clustering for the segmentation to extract the leukocytes' nuclei using color-based clustering. Emad A. Mohammed, Mostafa M.A.Mohammed, Christopher Naugler, Behrouz H. Far in [11] used Otsu‟s Method, Canny Edge Detector for Isolated pixels are removed, and a segmented nucleus mask is obtained for the detection of leukemia. Ms. Minal D. Joshi, Prof. Atul H. Karode, Prof. S.R.Suralkar in [12] suggested Histogram, Global Thresholding Otsu Method, KNN for feature extracted, classifier classifies the affected cells or healthy cells and with the accuracy of 93%. kNN classification is used. Khot S.T, Sneha Bhalekar, Divya Jaggi and Dolly Rani in [13] used Support Vector Machine for the features extracted from the images and applied to classifier to detect the leukemia cell. 113 AJCST Vol.7 No.2 July-September 2018

Upload: others

Post on 21-May-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Classification of Leukemia Image Using Genetic Based K ... · In an image processing, the K-Nearest Neighbor algorithm (K-NN) is a non-parametric method used for classification and

Asian Journal of Computer Science and Technology ISSN: 2249-0701 Vol.7 No.2, 2018, pp.113-117

© The Research Publication, www.trp.org.in

Classification of Leukemia Image Using Genetic Based K-Nearest

Neighbor (G-KNN)

M. Bennet Rajesh1 and S. Sathiamoorthy

2*

1&2Assistant Professor, Division of Computer and Information Science Annamalai University, Annamalai Nagar, Tamil Nadu, India

E-Mail:[email protected] *Corresponding Author

Abstract - In the medical diagnostic system, categorization of

blood cell is more vital to analyze and detect the disease. The

diseases that are associated with blood can be divided only

after the categorization of blood cell. Leukemia is a blood

cancer that starts with bone marrow. Therefore, it must be

treated at the initial stage and makes to death if left

unprocessed. This paper introduces a new Genetic based KNN

pre-processing approach for removing the noise in Leukemia

image without affecting the accuracy of an image. This paper

integrates the Genetic algorithm and KNN for noise removal

and pre-processing of Leukemia image datasets.

Keywords: Leukemia, K-Nearest Neighbor, Genetic Algorithm

(GA), Pre-Processing, Noise Removal, Median Filter approach

1. INTRODUCTION Leukemia is a kind of cancer [1] that marks the bone marrow causing the better fabrication of white blood cells which inflow the bloodstream which is more essential, infection-fighting part of your immune system, made in bone marrow. Hence, diagnose myelogenous leukemia has perfectly analyzed a physician will edict a comprehensive blood count bone mellow biopsy and cytogenetic tests. Leukemia is a malignancy (cancer) of blood cells. In leukemia, atypical blood cells have formed in the bone marrow. White Blood Cells (WBC) are the cells accountable for fighting infections. Leukaemia comprises the fabrication of the abnormal WBC.

II. RELATED WORKS The following table 1 represents the literature survey on the Image Processing techniques used for the detection of the Leukemia. Niranjan Chatap, Sini Shibu in [2], analyzed that Classical watershed-based algorithm overcomes the problem of Cell extraction and segmentation from massive, noisy images. A. Arputha Regina in [3], suggested that K- Means Clustering, Hausdorff Dimension, Local Binary Pattern for detecting the leukemia using segmentation, Feature Extraction, and Classification. Tejashri G. Patil, V. B. Raskar in [4] used Contour Signature, Otsu‟s Segmentation for detecting the acute leukemia using feature extraction.

C.Vidhya, P.Saravana Kumar, K. Keerthika, C.Nagalakshmi, B.Medona devi in [5] used K-Means Clustering, SVM Classifiers for input image is resized and CIELAB color conversion. Extraction of nuclei from the leukocytes. Mrs. Trupti A. Kulkarni-Joshi, Prof. Dilip S. Bhosale in [6] suggested Otsu‟s Thresholding Algorithm used for segmentation of Acute Lymphoblastic Leukemia. Himali P. Vaghela, Hardik Modi, Manoj Pandya, Manoj Pandya in [7] used K-Means Clustering, Histogram for shape-based feature finding is more accurate than watershed transform. Sonal G. Deore Prof. Neeta Nemade in [8] Adaptive Pre-Filtering and Segmentation Algorithms for extracting morphological indexes from those cells, and finally, it classifies the presence of leukemia. Subrajeet Mohapatra, Dipti Patra and Sanghamitra Satpathy in [9] used K-Means, KMedoid, Fuzzy C-Means (FCM), Gustafson Kessel (GK), Fuzzy Possibilistic C Means, Hausdorff Dimension, Contour Signature, SVM Classifier for WBC nucleus segmentation of stained blood smear images followed by relevant feature extraction for leukemia detection. Sos Agaian, Monica Madhukar, Anthony T. Chronopoulos in [10] utilized K-Means Clustering for the segmentation to extract the leukocytes' nuclei using color-based clustering. Emad A. Mohammed, Mostafa M.A.Mohammed, Christopher Naugler, Behrouz H. Far in [11] used Otsu‟s Method, Canny Edge Detector for Isolated pixels are removed, and a segmented nucleus mask is obtained for the detection of leukemia. Ms. Minal D. Joshi, Prof. Atul H. Karode, Prof. S.R.Suralkar in [12] suggested Histogram, Global Thresholding Otsu Method, KNN for feature extracted, classifier classifies the affected cells or healthy cells and with the accuracy of 93%. kNN classification is used. Khot S.T, Sneha Bhalekar, Divya Jaggi and Dolly Rani in [13] used Support Vector Machine for the features extracted from the images and applied to classifier to detect the leukemia cell.

113 AJCST Vol.7 No.2 July-September 2018

Page 2: Classification of Leukemia Image Using Genetic Based K ... · In an image processing, the K-Nearest Neighbor algorithm (K-NN) is a non-parametric method used for classification and

Subhan, Ms. Parminder Kaur in [14] suggested that K-NN, Hough Transform for the detection of abnormal cells that cause leukemia. Monika Mogra, Vivek Srivastava in [15] used Watershed Transform and Morphological Image Processing Technique to identify the affected white blood cells. Ms. Sneha Dhakne, Ms. Kumudini K. Borkute, Ms. Priyanka Ikhar in [16] utilized Watershed Algorithm and Clustering Algorithm for the determination of the cells.

III. PROPOSED GENETIC BASED KNN PRE-

PROCESSING APPROACH FOR LEUKEMIA

IMAGE DATASET

A. Filter based Noise Removal The best-known order-statistic filter in digital image processing is the median filter. It is a useful tool for reducing salt-and- pepper noise in an image. The median filter plays a key role in image processing and vision. In median filter, the pixel value of a point p is replaced by the median of pixel value of 8-neighborhood of a point „p‟. The operation of this filter can be expressed as:

( ) * ( ) ( )+ The median filter is popular because of its demonstrated ability to reduce random impulsive noise without blurring edges as much as a comparable linear low pass filter. However, it often fails to perform well as linear filters in providing sufficient smoothing of no impulsive noise components such as additive Gaussian noise. One of the main disadvantages of the basic median filter is that it is location-invariant in nature, and thus also tends to alter the pixels not disturbed by noise.

B. K-Nearest Neighbor Approach

In an image processing, the K-Nearest Neighbor algorithm (K-NN) is a non-parametric method used for classification and regression. In both cases, the input consists of the K closest training examples in the feature space. K-NN is a type of instance-based learning. In K-NN Classification, the output is a class membership. Classification is done by a majority vote of neighbours. The shortest distance between any two neighbours is always a straight line and the distance is known as Euclidean distance. The limitation of the K-NN algorithm is it‟s sensitive to the local configuration of the data. The process of transforming the input data to a set of features is known as Feature extraction. In Feature space, extraction is taken place on raw data before applying K-NN algorithm. The Figure 1 narrates the steps involved in a K-NN algorithm. C. Genetic Algorithm

In the computer science field of artificial intelligence, a genetic algorithm (GA) is a search heuristic that mimics the process of natural evolution. This heuristic is routinely used

to generate useful solutions to optimization and search problems. Genetic algorithms belong to the larger class of evolutionary algorithms (EA), which generate solutions to optimization problems using techniques inspired by natural evolution, such as inheritance, mutation, selection, and crossover. GAs is inspired by Darwin‟s Theory about Evolution “Survival of Fittest”. GAs is adaptive heuristic search based on the evolutionary ideas of natural selection and genetics. GAs simulate the survival of the fittest among individuals over consecutive generation for solving a problem. Each generation consists of a population of character strings that are analogous to the chromosome that we see in our DNA. Each individual represents a point in a search space and a possible solution. The individuals in the population are then made to go through a process of evolution.

Fig. 1 KNN Classification Algorithm

The GA maintains a population of n chromosomes (solutions) with associated fitness values. Parents are selected to mate, on the basis of their fitness, producing offspring via a reproductive plan. Consequently highly fit solutions are given more opportunities to reproduce, so that offspring inherit characteristics from each parent. As parents mate and produce offspring, room must be made for the new arrivals since the population is kept at a static size.

114AJCST Vol.7 No.2 July-September 2018

M. Bennet Rajesh and S. Sathiamoorthy

Page 3: Classification of Leukemia Image Using Genetic Based K ... · In an image processing, the K-Nearest Neighbor algorithm (K-NN) is a non-parametric method used for classification and

Individuals in the population die and are replaced by the new solutions, eventually creating a new generation once all mating opportunities in the old population have been exhausted. In this way it is hoped that over successive generations better solutions will thrive while the least fit solutions die out New generations of solutions are produced containing, on average, better genes than a typical solution in a previous generation. Eventually, once the population has converged and is not producing offspring noticeably different from those in previous generations, the algorithm itself is said to have converged to a set of solutions to the problem at hand.

D. Proposed Genetic based KNN Pre-Processing

Framework for Leukemia Image Dataset This algorithm is proposed to select the best k-value with the minimum misclassification rate. The K-NN algorithm is one of the best and commonly used for clustering and classification. In this method, each sample value fit to the test dataset and it is classified according to the nearest k sample based on the training data. The class numbers values obtained from k sample values, the maximum number is determined from class samples. The distance measurement calculated by using Euclidean distance measure. The distance of nearest neighbors is calculated based on the test image using distance weighted formula as.

distance (x) = √∑ ( )

where x and y are two images, n is the number of features. Subsequently, the unidentified sample value is selected most relevant to the class from k nearest neighbor algorithm and it is used to find the real value from unidentified sample values. In this proposed method, k value is obtained by Genetic Algorithm. Pseudo code for Proposed Genetic based KNN pre-processing approach Step 1: Choose k number of samples from the training set to generate the initial population (p1). Step 2: Calculate the distance between the training samples in each chromosome and the testing samples, as fitness value. Step 3: Choose the chromosome with highest fitness value and store it as global maximum (Gmax). Step 3.1: For i = 1 to L do Step 3.2: Perform Reproduction Step 3.3: Apply the Crossover operator Step 3.4: Perform mutation and get the new population. (p2) Step 3.5: Calculate the local maximum (Lmax). Step 3.6: If Gmax < Lmax then Step 3.6.1: Gmax = Lmax; Step 3.6.2: p1 = p2; Step 3.6.3: Repeat

Step 4: Output – The chromosome which obtains Gmax and has the optimum k-neighbors.

Fig. 2 Proposed Genetic based KNN Pre-processing framework

IV. RESULT AND DISCUSSION

The performance of Genetic based KNN pre-processing method is calculated based on the classification accuracy, sensitivity, specificity, finding minimum distance and processing time. The classification accuracy, specificity and sensitivity calculated by using accuracy formula:

Here: True Positive (TP) = accurately classified positive values True Negative (TN) = accurately classified negative cases False Positive (FP) = inaccurately classified negative values False Negative (FN) = inaccurately classified negative values

TABLE 1 PERFORMANCE ANALYSIS OF THE PROPOSED GENETIC BASED

KNN WITH EXISTING METHODS

Techniques Classification Accuracy of

Finding Leukemia

MLPNN 90%

SVM 87% Proposed G-KNN pre-processing Framework 96.57%

115 AJCST Vol.7 No.2 July-September 2018

Classification of Leukemia Image Using Genetic Based K-Nearest Neighbor (G-KNN)

Page 4: Classification of Leukemia Image Using Genetic Based K ... · In an image processing, the K-Nearest Neighbor algorithm (K-NN) is a non-parametric method used for classification and

Fig. 3 Graphical Representation of the performance analysis of the proposed Genetic based K-Nearest Neighbor pre-processing framework with existing classification Multi-Layer Perceptron Neural Network (MLPNN) and Support Vector Machine (SVM)

TABLE II PERFORMANCE ANALYSIS OF THE PROPOSED GENETIC BASED K-

NEAREST NEIGHBOR PRE-PROCESSING FRAMEWORK AGAINST CLASSIFICATION ACCURACY, PROCESSING TIME AND MISCLASSIFICATION

RATE

Methodology Classification

accuracy

Processing

Time

Misclassification

Rate

MLPNN 90% 3 sec 3%

SVM 87% 3.5 sec 1.73% Proposed G-KNN pre-processing Framework

96.57% 2.9 sec 1.5%

TABLE III PERFORMANCE ANALYSIS OF THE PROPOSED GENETIC BASED K-NEAREST NEIGHBOR PRE-PROCESSING FRAMEWORK AGAINST ACCURACY,

SENSITIVITY, SPECIFICITY, PRECISION AND ERROR RATE

Classificati

on

Methods

Performance Analysis in %

Accura

cy

Sensitivi

ty

Specifici

ty

Precisi

on

Erro

r

MLPNN 90% 55 72 89 20

SVM 87% 82 82 89.9 10 Proposed G-KNN pre-processing Framework

96.57% 94 97 98 3.5

Fig. 4 Graphical Representation of the performance analysis of the proposed Genetic based K-Nearest Neighbor pre-processing framework with existing classification Multi-Layer Perceptron Neural Network (MLPNN) and Support Vector Machine (SVM) against Sensitivity

Fig. 5 Graphical Representation of the performance analysis of the proposed Genetic based K-Nearest Neighbor pre-processing framework with existing classification Multi-Layer Perceptron Neural Network (MLPNN) and Support Vector Machine (SVM) against Specificity

Fig. 6 Graphical Representation of the performance analysis of the proposed Genetic based K-Nearest Neighbor pre-processing framework with existing classification Multi-Layer Perceptron Neural Network (MLPNN) and Support Vector Machine (SVM) against Precision

Fig. 7 Graphical Representation of the performance analysis of the proposed Genetic based K-Nearest Neighbor pre-processing framework with existing classification Multi-Layer Perceptron Neural Network (MLPNN) and Support Vector Machine (SVM) against Error rate From the above figures 3-7, it is clear that the proposed Genetic based K-Nearest Neighbor Pre-processing framework gives better result than the existing methods like Support Vector Machine (SVM) and Multi-Layer Perceptron Neural Network (MLPNN). The proposed G-KNN pre-processing framework composed two major process, Image denoising by median filter approach and

116AJCST Vol.7 No.2 July-September 2018

M. Bennet Rajesh and S. Sathiamoorthy

Page 5: Classification of Leukemia Image Using Genetic Based K ... · In an image processing, the K-Nearest Neighbor algorithm (K-NN) is a non-parametric method used for classification and

enhancing the image by using G-KNN method in the pre-processing stage for the better classification accuracy of Leukemia image dataset.

V. CONCLUSION In this paper, the proposed G-KNN pre-processing framework composed two major processes, Image denoising by median filter approach and enhancing the image by using G-KNN method in the pre-processing stage of Leukemia image dataset. In this proposed G-KNN pre-processing method, K-NN is combined with Genetic Algorithm to enhance the classification accuracy of the lung cancer identification.

REFERENCES

[1] “Understanding Leukemia”, Leukemia and Lymphoma Society Fighting Blood Cancers.

[2] Chatap, Niranjan and Sini Shibu. "Analysis of blood samples for counting leukemia cells using Support vector machine and nearest neighbour." IOSR Journal of Computer Engineering (IOSR-JCE), Vol. 16, No. 5, pp. 79-87, 2014.

[3] Arputha Regina, “Detection of Leukemia with Blood Microscopic Images,” IJIRCCE, Vol.3, Special Issue 3, April 2015.

[4] Tejashri G. Patil and V. B. Raskar., “Blood microscopic image segmentation & acute leukemia detection,” IJERMT, Vol. 4, No. 9, Sep, 2015.

[5] C. Vidhya, P. Saravana Kumar, K. Keerthika, C. Nagalakshmi, and B.Medona devi., “Classification of acute lymphoblastic leukemia in blood microscopic images using SVM", ICETSH-2015.

[6] Trupti, M., A. Kulkarni-Joshi, and P. D. S. Bhosale. "A fast segmentation scheme for acute lymphoblastic leukemia detection." International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, Vol. 3, No. 2, pp. 7253-7258, 2014.

[7] Himali P. Vaghela, et al. "Leukemia detection using digital image processing techniques." Leukemia, Vol. 10, No. 1, pp. 43-51, 2015.

[8] Deore, Sonal G., and Neeta Nemade. "Image analysis framework for automatic extraction of the progress of an infection." International

Journal of Advanced Research in Computer Science and Software, 2013.

[9] Mohapatra, Subrajeet, Dipti Patra, and Sanghamitra Satpathy. "Unsupervised blood microscopic image segmentation and leukemia detection using color based clustering." International Journal of

Computer Information Systems and Industrial Management

Applications, Vol. 4, pp. 477-485, 2012. [10] Agaian, Sos, Monica Madhukar, and Anthony T. Chronopoulos.

"Automated screening system for acute myelogenous leukemia detection in blood microscopic images." IEEE Systems journal Vol. 8, No. 3, pp. 995-1004, 2014.

[11] Emad A.Mohammed, et al. "Chronic lymphocytic leukemia cell segmentation from microscopic blood images using watershed algorithm and optimal thresholding." Electrical and Computer

Engineering (CCECE), 2013 26th Annual IEEE Canadian

Conference on. IEEE, 2013. [12] Minal D. Joshi, Atul H. Karode and S. R. Suralkar. "White blood

cells segmentation and classification to detect acute leukemia." International Journal of Emerging Trends & Technology in Computer Science (IJETTCS) Vol. 2, No. 3, pp. 147-151, 2013.

[13] S. T. Khot, et al. "An innovative approach in myelogenous leukemia detection using attribute analysis." IJAEEE, pp. 2278-8948, 2013.

[14] M. S. Subhan and M. P. Kaur. "Significant analysis of leukemic cells extraction and detection using knn and hough transform algorithm." International Journal of Computer Science Trends and

Technology (IJCST), Vol. 3, No. 1, 2015. [15] M. Mogra and V. Srivastava, "A comprehensive review of analysis of

counting blood cells using different image processing algorithms." Int. J. Eng. Sci. Invention, Vol. 3, No. 6,pp. 29-31, 2014.

[16] Ms. Sneha Dhakne, Ms. Kumudini K. Borkute, Ms. Priyanka Ikhar., "Detection of a cancer cell in blood samples using an Effective algorithm", IJCESR, Vol. 2, No. 6, 2015.

117 AJCST Vol.7 No.2 July-September 2018

Classification of Leukemia Image Using Genetic Based K-Nearest Neighbor (G-KNN)