
Modular logic of authentication using dynamic keystroke pattern analysis. Tapalina Bhattasali, Piotr Panasiuk, Khalid Saeed, Nabendu Chaki, and Rituparna Chaki. Citation: AIP Conference Proceedings 1738, 180012 (2016); doi: 10.1063/1.4951959. Published by AIP Publishing.


Modular Logic of Authentication Using Dynamic Keystroke Pattern Analysis

Tapalina Bhattasali1,a), Piotr Panasiuk2, Khalid Saeed2,3, Nabendu Chaki1,b) and Rituparna Chaki1,c)

1University of Calcutta, Kolkata, India

2Warsaw University of Technology, Warsaw, Poland

3Bialystok University of Technology, Bialystok, Poland

1,a) [email protected], 1,b)[email protected], 1,c)[email protected] [email protected]

3[corresponding author][email protected]

Abstract. Continual user authentication has become critical for a wide range of applications in pervasive computing and the Internet of Things (IoT). It is also widely accepted that authentication based on biometric features is often more efficient than traditional password-based authentication. However, many existing biometric techniques, such as iris or fingerprint recognition, are effective only when the person to be authenticated or verified is physically accessible. Such technologies suit applications like passport control, but fall short of the requirements of IoT applications such as integrated remote healthcare, where different types of users (doctors, patients, hospitals, insurance companies, other caregivers and even authorized civic-body administrators) must be continually authenticated from remote locations. It is important to ensure that the desired services are accessed only by legitimate users. In this paper, we address these issues by proposing a modular solution using keystroke-based biometrics.

Keywords: access control, authentication, biometrics, keystroke dynamics, computer security, modular authentication

PACS: 89.20.Ff

INTRODUCTION

The trustworthiness of authentication can be increased by analyzing the typing patterns of users [1]. Typing is a behavioral trait that can be captured from the way an individual types on a keyboard [2]. An individual's typing patterns are stored as a template [3]. Timing vectors are mainly used to classify keystroke patterns as valid or invalid. In general, the parameters considered for keystrokes include dwell time (the duration for which a key is pressed), flight time (the time between two successive key presses), overlap time (the duration for which more than one key is pressed at the same time), typing speed and typing errors. Keystroke analysis can be implemented at a lower cost than other biometrics. Typing patterns can be analyzed using either fixed text [4] or free text [5]. Fixed-text analysis is based on predefined content and is effective for static authentication. Free text [6], on the other hand, can be used for dynamic monitoring that checks whether an impostor is attempting false authentication.

We observe from the literature [7, 8] that the larger the sample size, the better the accuracy of free-text authentication. However, thanks to excellent tools for building graphical user interfaces (GUIs), the majority of applications these days hardly require end-users to enter long free texts. It has also been found [9, 10] that most keystroke authentication mechanisms consider only dwell time and flight time when generating user data. Global parameters such as typing sequences, the count of typing errors, typing habits and stylometry need to be considered for better user pattern generation. The flexibility and adaptability of dynamic keystroke analysis is another issue to be considered. Finally, the accuracy of verification is fundamentally crucial.

The main focus of this paper is to propose a modular, semi-continuous authentication mechanism using dynamic keystroke pattern analysis that can be applied to any web application [11]. The work aims to modularize functionalities in an unambiguous way, to ensure flexibility, reduced space complexity and fast response in addition to higher accuracy. We propose a modular logic for keystroke-based authentication to guarantee valid access to sensitive data. The typing pattern of long free text is analyzed along with a fixed-text password. In this novel authentication mechanism, module 1 is used for data collection, module 2 for data storage and module 3 for data analysis. The user receives a trust score from a match module, and a decision maker takes the decision based on the trust score and other typing behavior of the user. Performance analysis of the proposed mechanism shows that it distinguishes valid users from impostors [12] efficiently and in a less complex way.

International Conference of Numerical Analysis and Applied Mathematics 2015 (ICNAAM 2015), AIP Conf. Proc. 1738, 180012-1–180012-4; doi: 10.1063/1.4951959. Published by AIP Publishing. 978-0-7354-1392-4/$30.00

PROPOSED WORK SPECIFICATION

The proposed authentication mechanism takes behavioral biometric features [11] as input and produces a conclusion based on pattern matching. The proposed keystroke analysis follows a modular logic with three modules: module 1 is used for data collection, module 2 for data storage and module 3 for data analysis. Figure 1 shows the activities at each module of the keystroke analysis mechanism.

FIGURE 1. Modules of Dynamic Keystroke Pattern Analysis

Module 1 is used for raw data collection, feature selection and feature extraction. The application collects temporal and global features of both the fixed-text password and free text. The collected data are: user-id, session time, key, key code, key time (time of the key event), duration of fixed-text typing, time per key press for fixed text, length of free text, free-text typing duration, time per key press for free text, free-text typing speed, and counts of frequently used patterns, rare patterns, alphabetic characters, spaces and shift key presses.
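As an illustration of the temporal features collected in module 1, the following minimal sketch derives dwell, flight and overlap times from a list of key events. The `KeyEvent` record and all function names are hypothetical and not taken from the paper; flight time is assumed here to be measured from one key press to the next, following the definition given in the introduction.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KeyEvent:
    """Hypothetical record for one keystroke (not the paper's schema)."""
    key: str           # key label, e.g. "a"
    press_ms: float    # key-down timestamp in milliseconds
    release_ms: float  # key-up timestamp in milliseconds


def dwell_times(events: List[KeyEvent]) -> List[float]:
    # Dwell time: how long each key is held down.
    return [e.release_ms - e.press_ms for e in events]


def flight_times(events: List[KeyEvent]) -> List[float]:
    # Flight time, assumed press-to-press between successive keys.
    return [b.press_ms - a.press_ms for a, b in zip(events, events[1:])]


def overlap_times(events: List[KeyEvent]) -> List[float]:
    # Overlap time: duration for which two successive keys are both down
    # (0 when the first key is released before the second is pressed).
    return [
        max(0.0, min(a.release_ms, b.release_ms) - b.press_ms)
        for a, b in zip(events, events[1:])
    ]


def time_per_key(events: List[KeyEvent]) -> float:
    # Total typing duration divided by the number of keys typed.
    total = events[-1].release_ms - events[0].press_ms
    return total / len(events)


sample = [
    KeyEvent("a", 0.0, 100.0),
    KeyEvent("b", 80.0, 150.0),
    KeyEvent("c", 300.0, 380.0),
]
```

For the three sample events, the second key is pressed 20 ms before the first is released, so only the first key pair yields a non-zero overlap.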

Module 2 is used for cluster formation from sample data (data processing), outlier detection and filtering, normalization, training, and template generation and storage. This layer groups the filtered data to reduce the sample size and to make searching faster. The proposed cluster formation differs from the k-means algorithm in two major ways. First, the number of clusters does not need to be specified in advance. Second, indexing is used to reduce the time complexity of the k-means [13] algorithm: an index database is used to locate a sample faster. Clustering of a user's data is based on the temporal information of frequently used and rare data. Each cluster is partitioned into sub-clusters according to the number of typable characters within a string. The intra-cluster difference (difference within the same category) and the inter-cluster difference (difference between different categories) are calculated to differentiate between valid and invalid data samples. The intra-cluster difference measures the compactness of the clusters; the inter-cluster difference measures their separation. Outlying data are filtered out at module 2. The distance of each data point from the cluster centre (CC) is computed as:

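$$d = \sum_{j} \left| x_j - CC_j \right|$$

where, in the notation assumed here, $x_j$ is the $j$-th component of a data point and $CC_j$ is the corresponding component of the cluster centre. Data are normalized and changed into a form recognizable by the Back Propagation Neural Network classifier.

The mean value of timing vector $T_i(U_k)$ over $n$ samples is computed as

$$\text{mean\_}T_i(U_k) = \frac{1}{n} \sum_{j=1}^{n} T_{i,j}(U_k)$$

Normalized mean value,

$$\text{norm\_mean\_}T_i(U_k) = \frac{\text{mean\_}T_i(U_k) - minval}{maxval - minval}\,(maxval_{new} - minval_{new}) + minval_{new}$$

where $minval$ and $maxval$ are the smallest and largest mean values before normalization. norm_mean_Ti(Uk) within the range [minvalnew = 0.0, maxvalnew = 1.0] is

$$\text{norm\_mean\_}T_i(U_k)_{[0.0,\,1.0]} = \frac{\text{mean\_}T_i(U_k) - minval}{maxval - minval}$$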
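The distance, mean and normalization steps of module 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the distance is taken to be a sum of absolute component differences, normalization is standard min-max scaling, and the `max_distance` outlier cut-off and all function names are hypothetical rather than values or names given in the paper.

```python
from typing import List, Sequence


def distance_to_centre(point: Sequence[float], centre: Sequence[float]) -> float:
    # Sum of absolute differences between a data point and the cluster centre.
    return sum(abs(x - c) for x, c in zip(point, centre))


def mean_timing(values: Sequence[float]) -> float:
    # Mean value of one timing vector over its n samples.
    return sum(values) / len(values)


def min_max_normalize(value: float, minval: float, maxval: float,
                      new_min: float = 0.0, new_max: float = 1.0) -> float:
    # Min-max scaling into [new_min, new_max]; a degenerate range
    # (maxval == minval) is mapped to new_min by assumption.
    if maxval == minval:
        return new_min
    return (value - minval) / (maxval - minval) * (new_max - new_min) + new_min


def filter_outliers(points: List[Sequence[float]], centre: Sequence[float],
                    max_distance: float) -> List[Sequence[float]]:
    # Keep only points within a (hypothetical) distance cut-off of the centre.
    return [p for p in points if distance_to_centre(p, centre) <= max_distance]
```

With the default target range of [0.0, 1.0], the scaling reduces to (value - minval) / (maxval - minval), which matches the special case stated above.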

In order to reduce complexity, the mean values of the timing vectors are considered in the range [0.0, 1.0]. A back propagation neural network model is used for training on the normalized data. The number of processing units in the input layer (Pin) is set to the number of selected features of the timing vectors, and the number of processing units in the output layer (Pout) is set to 1. The number of processing units in the hidden layer (Phi) is set to (Pin + Pout)/2.

The training configuration includes the number of layers (input, hidden, output), the activation function (sigmoid), the initial weights and the termination condition.

Errors at the output layer are back-propagated to the earlier layers, allowing incoming weights to be updated until all training data are used. If training of the model is successful, the testing phase is more accurate. The testing phase checks whether a submitted input pattern matches the previously stored data of the claimed user. The network model is updated repeatedly until the error is reduced to a negligible amount.
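The network layout and training loop described above can be sketched in plain Python. This is a minimal illustration, not the authors' MATLAB implementation: the class name `TinyBPNN`, the learning rate, the weight initialization range, the squared-error loss, the stopping tolerance and the rounding up of (Pin + Pout)/2 are all assumptions made here.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_units(p_in, p_out=1):
    # Phi = (Pin + Pout) / 2; rounding up is an assumption for odd sums.
    return max(1, math.ceil((p_in + p_out) / 2))


class TinyBPNN:
    """One hidden layer, one sigmoid output unit (Pout = 1)."""

    def __init__(self, p_in, seed=0):
        rng = random.Random(seed)
        self.p_in = p_in
        self.p_hi = hidden_units(p_in)
        # Each weight row ends with a bias term.
        self.w_hi = [[rng.uniform(-0.5, 0.5) for _ in range(p_in + 1)]
                     for _ in range(self.p_hi)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(self.p_hi + 1)]

    def forward(self, x):
        # zip() stops at len(x), so the trailing bias is added separately.
        h = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1])
             for row in self.w_hi]
        y = sigmoid(sum(w * v for w, v in zip(self.w_out, h)) + self.w_out[-1])
        return h, y

    def train_step(self, x, target, lr=0.5):
        h, y = self.forward(x)
        # Output error term, then hidden error terms (computed before
        # the output weights are changed).
        delta_out = (y - target) * y * (1.0 - y)
        delta_hi = [delta_out * self.w_out[j] * h[j] * (1.0 - h[j])
                    for j in range(self.p_hi)]
        for j in range(self.p_hi):
            self.w_out[j] -= lr * delta_out * h[j]
        self.w_out[-1] -= lr * delta_out
        for j, row in enumerate(self.w_hi):
            for i in range(self.p_in):
                row[i] -= lr * delta_hi[j] * x[i]
            row[-1] -= lr * delta_hi[j]
        return 0.5 * (y - target) ** 2

    def train(self, samples, epochs=2000, lr=0.5, tol=1e-3):
        # Repeat passes over the training data until the summed error
        # falls below the tolerance or the epoch budget is spent.
        err = float("inf")
        for _ in range(epochs):
            err = sum(self.train_step(x, t, lr) for x, t in samples)
            if err < tol:
                break
        return err


# Hypothetical normalized timing vectors: target 1 = valid user, 0 = impostor.
samples = [
    ([0.10, 0.20, 0.10], 1.0),
    ([0.15, 0.25, 0.10], 1.0),
    ([0.80, 0.90, 0.70], 0.0),
    ([0.90, 0.80, 0.85], 0.0),
]
net = TinyBPNN(p_in=3)
initial_error = sum(0.5 * (net.forward(x)[1] - t) ** 2 for x, t in samples)
final_error = net.train(samples)
```

With three input features the sketch uses two hidden units; the toy samples are invented only to exercise the loop and say nothing about real typing data.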

Module 3 is used for analysing data, testing data during verification, generating the trust score through the match module and taking the final decision. During testing, the collected input patterns are passed to the pre-trained network model. Because a sigmoid function is used, the network model may not output exactly 0 or 1. With an allowable tolerance of 0.1, an output less than or equal to 0.1 is treated as 0, and an output greater than or equal to 0.9 is treated as 1. According to this classification, the test data of the claimed user are treated as valid (1) or invalid (0). The average deviation represents the similarity of the user's behavior in a session to the valid user's behavior; a lower value represents higher confidence of similarity.
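The tolerance rule for the sigmoid output can be sketched as below. The paper does not say how outputs strictly between 0.1 and 0.9 are handled, nor how the average deviation is calculated, so returning `None` for the middle band and using a mean absolute difference are assumptions of this sketch; the function names are hypothetical.

```python
from typing import Optional, Sequence


def classify_output(y: float, lower: float = 0.1,
                    upper: float = 0.9) -> Optional[int]:
    # Outputs at or below 0.1 count as 0 (invalid), at or above 0.9 as 1 (valid).
    if y <= lower:
        return 0
    if y >= upper:
        return 1
    # Undecided band: handling is not specified in the paper (assumption).
    return None


def average_deviation(session: Sequence[float],
                      template: Sequence[float]) -> float:
    # Assumed form: mean absolute difference between the session's
    # normalized features and the stored template; lower means more similar.
    return sum(abs(s - t) for s, t in zip(session, template)) / len(template)
```

A caller would typically treat an undecided output as a failed match or request more keystrokes; which of the two fits the proposed mechanism is left open here.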

The decision logic finally takes the decision based on the trust score and other enrolled temporal and global parameters, such as the user's nature of typing, to verify the claimed identity. The decision maker checks the other parameters by executing an optimized query on the database holding the user profile. If all conditions are satisfied, a positive decision (True) is generated to validate the claimed identity; otherwise a negative decision (False) is generated. A user can try two times consecutively before being blocked by the system [14].
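The decision step and the two-attempt limit can be sketched as follows. The `DecisionMaker` class, the `trust_threshold` value, the convention that a higher trust score is better, and the representation of profile checks as a dictionary of booleans are all hypothetical; the paper only states that every condition must hold and that a user is blocked after two consecutive failed attempts.

```python
from dataclasses import dataclass, field
from typing import Dict

MAX_ATTEMPTS = 2  # consecutive failed attempts allowed before blocking


@dataclass
class DecisionMaker:
    trust_threshold: float = 0.9  # hypothetical cut-off, not from the paper
    failures: Dict[str, int] = field(default_factory=dict)

    def is_blocked(self, user_id: str) -> bool:
        return self.failures.get(user_id, 0) >= MAX_ATTEMPTS

    def decide(self, user_id: str, trust_score: float,
               profile_checks: Dict[str, bool]) -> bool:
        # A blocked user is rejected regardless of the current input.
        if self.is_blocked(user_id):
            return False
        # Positive decision only when the trust score and every other
        # enrolled parameter check are satisfied.
        accepted = (trust_score >= self.trust_threshold
                    and all(profile_checks.values()))
        if accepted:
            self.failures[user_id] = 0
        else:
            self.failures[user_id] = self.failures.get(user_id, 0) + 1
        return accepted
```

Resetting the failure counter on success keeps the limit tied to consecutive failures, in line with the wording above.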

PERFORMANCE ANALYSIS

FIGURE 2. FAR (%) vs. FRR (%) against Threshold Level

The performance of the proposed work is analyzed in MATLAB R2012b. At present, the analysis is done in a controlled environment. The dataset contains temporal data of 15 users, each of whom typed five times in different sessions. False acceptance rate (FAR) and false rejection rate (FRR) are plotted against the threshold value and compared with the existing work [6] in a supervised, controlled environment. The equal error rate (EER) of Ahmed et al.'s work [6] without digraph approximation is higher than the EER observed for the proposed mechanism. For the proposed work, the FAR and FRR curves intersect at the minimum threshold level, which indicates higher accuracy in a homogeneous environment.
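For readers unfamiliar with these metrics, the sketch below shows one common way to compute FAR and FRR at a given threshold and to approximate the EER as the point where the two rates are closest. The scores are invented for illustration and are not the paper's data; the convention that an attempt is accepted when its score is at or above the threshold is an assumption.

```python
from typing import List, Sequence, Tuple


def far_frr(genuine: Sequence[float], impostor: Sequence[float],
            threshold: float) -> Tuple[float, float]:
    # FAR: fraction of impostor attempts accepted (score >= threshold).
    far = sum(s >= threshold for s in impostor) / len(impostor)
    # FRR: fraction of genuine attempts rejected (score < threshold).
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def approximate_eer(genuine: Sequence[float], impostor: Sequence[float],
                    thresholds: List[float]) -> Tuple[float, float]:
    # Pick the threshold where |FAR - FRR| is smallest and report the
    # average of the two rates there as the approximate EER.
    def gap(t: float) -> float:
        far, frr = far_frr(genuine, impostor, t)
        return abs(far - frr)

    best = min(thresholds, key=gap)
    far, frr = far_frr(genuine, impostor, best)
    return best, (far + frr) / 2


# Hypothetical match scores, for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
best_threshold, eer = approximate_eer(
    genuine_scores, impostor_scores, [0.0, 0.25, 0.5, 0.75, 1.0]
)
```

On these toy scores the two rates meet at a threshold of 0.5, where one of four impostor attempts is accepted and one of four genuine attempts is rejected.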



CONCLUSION

We infer that the proposed work captures the user's typing nature dynamically in addition to considering temporal data of typing patterns. The results show that the accuracy level is enhanced in the proposed low-cost solution, which also has lower complexity. The modular authentication mechanism has the potential to give high performance in a distributed IoT environment when it is tested with a standard dataset.

ACKNOWLEDGMENTS

This work is partially supported by the Dean of Faculty of Mathematics and Information Sciences, Warsaw University of Technology.

REFERENCES

1. M. Karnan, M. Akila, N. Krishnaraj, “Biometric Personal Authentication Using Keystroke Dynamics: A Review”, in Elsevier Journal of Applied Soft Computing 11(2): 1515–1573 (2011).

2. P. H. Pisani, A. C. Lorena, “A Systematic Review on Keystroke Dynamics”, in Journal of the Brazilian Computer Society 19(4): 573–587 (2013). doi: 10.1007/s13173-013-0117-7

3. S. P. Banerjee, D. L. Woodard, “Biometric Authentication and Identification Using Keystroke Dynamics: A Survey”, in Journal of Pattern Recognition Research 7(1): 116–139 (2012). doi: 10.13176/11.427

4. K. S. Balagani, V. V. Phoha, A. Ray, S. Phoha, “On the Discriminability of Keystroke Feature Vectors Used in Fixed Text Keystroke Authentication”, in Elsevier Pattern Recognition Letters 32(7): 1070–1080 (2011). doi: 10.1016/j.patrec.2011.02.014

5. D. Gunetti, C. Picardi, “Keystroke Analysis of Free Text”, in ACM Transactions on Information and System Security 8(3): 312–347 (2005). doi: 10.1145/1085126.1085129

6. A. A. Ahmed, I. Traore, “Biometric Recognition Based on Free-Text Keystroke Dynamics”, in IEEE Transactions on Cybernetics 44(4): 458–472 (2014). doi: 10.1109/TCYB.2013.2257745

7. T. Shimshon, R. Moskovitch, L. Rokach, Y. Elovici, “Clustering Di-Graphs for Continuously Verifying Users According to their Typing Patterns”, in: Proceedings of IEEE Convention of Electrical and Electronics Engineers in Israel, 445–449 (2010).

8. K. S. Killourhy, R. A. Maxion, “Comparing Anomaly-Detection Algorithms for Keystroke Dynamics”, in: Proceedings of IEEE/IFIP International Conference Dependable Systems & Networks, 125–134 (2009).

9. L. C. F. Araujo, L. H. R. Jr. Sucupira, M. G. Lizarraga, L. L. Ling, J. B. T. Yabu-Uti, “User Authentication Through Typing Biometrics Features”, in IEEE Transaction on Signal Processing 53(2): 851–855 (2005). doi: 10.1109/TSP.2004.839903

10. K. S. Killourhy, S. Kevin, R. A. Maxion, A. Roy, “Free vs. Transcribed Text for Keystroke-Dynamics Evaluations”, in: Proceedings of Workshop: Learning from Authoritative Security Experiment Results, 1–8 (2012).

11. T. Bhattasali, K. Saeed, N. Chaki, R. Chaki, “Bio-Authentication For Layered Remote Health Monitor Framework”, in Journal of Medical Informatics and Technologies 23(2014): 131–140 (2014).

12. A. K. Jain, A. Ross, S. Pankanti, “Biometrics: A Tool for Information Security”, in IEEE Transactions on Information Forensics and Security 1(2): 125–143 (2006). doi: 10.1109/TIFS.2006.873653

13. B. Kao, S. D. Lee, P. K. F. Lee, D. W. Cheung, W. S. Ho, “Clustering Uncertain Data using Voronoi Diagrams and R-Tree Index”, in IEEE Transactions on Knowledge and Data Engineering 22(9): 1219–1233 (2010). doi: 10.1109/TKDE.2010.82

14. T. Bhattasali, K. Saeed, “Two Factor Remote Authentication in Healthcare”, in: Proceedings of IEEE International Conference on Advances in Computing, Communications and Informatics, 380–381 (2014).
