International Journal of Computer Engineering and Applications,
Volume XII, Special Issue, May 18, www.ijcea.com ISSN 2321-3469
Kritika Sharma and Deepali D. Londhe 1
IOT BASED SENSOR DATA ANALYSIS USING MACHINE
LEARNING
Kritika Sharma (1), Deepali Londhe (2)
(1), (2) Department of Information Technology, Pune Institute of Computer Technology, Pune, India
ABSTRACT:
The internet of things (IOT) is attracting growing attention as the number of connected devices per person increases. Researchers and engineers are putting these devices to use in human safety, health tracking, automation, and other areas. Human safety in particular is served by human safety devices (HSD), which are now widely available in the market. The core of a human safety device is human activity recognition (HAR), which is achieved using sensors. The sensors detect changes in the temperature, pressure, heartbeat, or orientation of the human body, and machine learning algorithms then analyse these readings to determine whether they are usual or unusual. The proposed system uses a temperature sensor, a pulse sensor, a pressure sensor, and accelerometers to measure the values that are directly affected by an unusual event such as falling (detected by the accelerometers), being frightened (pulse rate increases), or being grabbed by someone (pressure increases). The sensors continuously track the heart rate, the pressure in one specific area, body activity, and so on, and generate values that are processed and then classified as usual or unusual activity by a machine learning algorithm. Sensor data analysis using machine learning offers great opportunities in the field of human safety.
Keywords: IOT, Human Safety, Human Activity Recognition, Sensors, Sensor Data Analysis, Machine Learning
[1] INTRODUCTION
Human safety has become an issue of utmost importance because it is compromised so often nowadays.
The market today offers a wide variety of devices that promise human safety, and considerable
research is under way to make these devices more responsive and better equipped. Human activity
recognition (HAR) is at the heart of human safety because the activities we perform reveal a lot
about us. HAR is the process of identifying the activities of a person with the help of devices such
as sensors and cameras that capture human motion. HAR is used extensively in areas such as virtual
reality, remote health care, elderly health care, security, and studies of human activities, and the
sensors available today have made this possible.
One of the techniques used for HAR is wearable sensor technology, which has evolved considerably in
the last decade and now influences many other technologies. The devices around us have become smart
enough to outperform humans in some tasks. This intelligence is provided to sensor-based devices by
machine learning (ML). ML is popular in varied fields such as human-computer interaction, computer
vision, healthcare, virtual reality, video games, and object identification, and it has become one
of the most widely used techniques among researchers for sensor data analysis. ML algorithms such as
support vector machine (SVM), K-means clustering, random forest, and hidden Markov model (HMM) are
used extensively for sensor data analysis because they cope well with the distortion and noise that
environmental conditions introduce into sensor data.
These safety devices have improved considerably, but they do not yet provide the level of security
we demand from them. This paper proposes a human safety system that tracks human activities using
sensors and then performs sensor data analysis on the received readings, using a machine learning
algorithm to classify them as usual or unusual. Applying machine learning to the sensor data gives
these devices insight about the user and makes retraining them much easier. The rest of the paper is
divided into four sections. Section 2 is a literature survey of the algorithms used for sensor data
analysis in human safety. Section 3 describes the proposed system, Section 4 gives a brief
introduction to the machine learning algorithms used for sensor data analysis, and Section 5
presents the conclusion.
[2] LITERATURE SURVEY
Buettner et al. [1] explore a dense sensing approach that uses radio frequency identification
(RFID) sensor network technology to recognize human activities in a closed environment. Their
activity detection system has three main parts: the wireless identification and sensing platform
(WISP), the RFID sensor network (RSN), and an inference engine based on a hidden Markov model (HMM).
The WISPs transmit unique identifiers along with their most recent accelerometer readings, which are
collected by RFID readers placed on the ceiling of the apartment. The RSN acts as the data source
for the inference engine, whose HMM takes as input the IDs of the objects moved around in the
apartment and analyses them to identify the activity of the person in the apartment. Their results
show a precision of 90% and a recall of 91%, compared with a precision of 95% and a recall of 60%
when an iBracelet is used instead of the RSN.
Roy et al. [2] use the internet of things (IOT) and ML to ensure human safety. They propose the
design of a human safety band consisting of multiple sensors: a breath rate sensor, a heart rate
sensor, a glucometer, a sweat sensor, and an optical blood flow sensor. These sensors collect their
respective readings and send them to a mobile application (app) created by the authors. The app
compares the new readings with previously stored values to see whether they have crossed a certain
threshold or are normal, and takes suitable action accordingly. If the readings are normal, the
system continues to work normally; if the values are greater than the predefined threshold,
immediate actions are taken, such as informing friends and family members, the nearest police
station, or nearby hospitals. The emergency steps are chosen by a decision tree algorithm running in
the background of the app, which decides on the basis of the readings of the various sensors. For
example, if the breath rate, the heart rate, and the blood flow rate are all above their thresholds,
the decision tree considers the activity abnormal and informs friends, the police, and the nearest
hospital.
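The threshold example above can be sketched as a small hand-written rule. This is only an illustration of the idea, not the decision tree of [2]: the field names, the threshold values, and the `decide` function are assumptions made for this sketch, and only the single branch described in the text (all three readings above threshold) is modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reading:
    breath_rate: float  # breaths per minute
    heart_rate: float   # beats per minute
    blood_flow: float   # arbitrary units


# Hypothetical thresholds, chosen only for illustration.
LIMITS = Reading(breath_rate=25.0, heart_rate=120.0, blood_flow=1.5)


def decide(reading: Reading, limits: Reading = LIMITS) -> List[str]:
    """Return the parties to notify; an empty list means 'normal'."""
    abnormal = (
        reading.breath_rate > limits.breath_rate
        and reading.heart_rate > limits.heart_rate
        and reading.blood_flow > limits.blood_flow
    )
    # Only the branch described in the text is modelled here.
    return ["friends", "police", "hospital"] if abnormal else []


alerts = decide(Reading(breath_rate=30.0, heart_rate=130.0, blood_flow=2.0))
calm = decide(Reading(breath_rate=30.0, heart_rate=100.0, blood_flow=2.0))
```

A learned decision tree would derive such splits from labelled data rather than from fixed constants.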
Leightley et al. [3] use the Microsoft Kinect One depth sensor to identify human mobility
impairments. Their framework involves a two-step process. The first step captures the movements of a
healthy person (without mobility issues) using the Kinect One sensor. The second step compares and
evaluates the motion sequence of a test participant against the statistical mobility model generated
from the data of the healthy person. They also propose an automatic method that enables a fairer,
unbiased approach to labelling the motion capture data. They use automatic K-selection and K-means
clustering for feature reduction and selection. The K-selection process first reduces the number of
clusters from the main data, and an automated clustering technique is then used to identify the
optimum number of clusters. Their framework successfully identifies the mobility concerns of a
person with mobility issues, and it is free from human intervention or bias in the process of
decision making.
Lee et al. [4] suggest an approach for reducing road accidents by analysing the negative emotional
response of a person while driving. They consider the negative emotional response to be one of the
major indicators of the likelihood of an accident while driving a vehicle. These responses are
divided into three categories, namely fatigue, stress and relaxation. They consider stress and
relaxation as negative emotional responses. The responses are captured using three sensors: an
electromyography (EMG) sensor, a photoplethysmography (PPG) sensor, and an inertial sensor. The
sensors are connected to a microcontroller unit equipped with a Bluetooth low energy module, which
transmits the collected sensor readings to a mobile phone. The phone extracts the information from
the readings and determines the driver's current emotion using a trained SVM classifier. The SVM is
trained on the data of eight of the nine subjects, and the remaining subject is held out for testing
the trained SVM. Their results show an accuracy of 99.52%.
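The train/test protocol described above (train on eight subjects, test on the ninth) can be sketched as a subject-wise split. The sample layout `(subject_id, features, label)`, the function name, and the toy values below are assumptions for illustration; they are not the data or code of [4].

```python
def leave_one_subject_out(samples, held_out):
    """Split samples so that one subject's data is used only for testing.

    Each sample is a tuple (subject_id, features, label).
    """
    train = [s for s in samples if s[0] != held_out]
    test = [s for s in samples if s[0] == held_out]
    return train, test


# Toy data: 9 subjects with 3 samples each (values are placeholders).
samples = [
    (subject, [0.1 * subject, 0.2 * k], subject % 2)
    for subject in range(1, 10)
    for k in range(3)
]

train, test = leave_one_subject_out(samples, held_out=9)
```

Splitting by subject rather than by sample keeps readings from the test subject out of the training set, so the reported accuracy reflects performance on an unseen person.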
Tariq et al. [5] test the performance of various ML classifiers offered by the ML tool WEKA for
indoor human localization. They use capacitive sensors in a 3 m x 3 m room to compare and analyse
the performance of the classifiers available in WEKA. The capacitive sensors are placed on the four
walls of the room in order to detect human motion. The sensors are used in load mode: the sensor is
connected to one plate of the capacitor, while the other plate is formed by the environment and the
human body, so that a change in the position of the person can be detected. After the data is
gathered, it is analysed with the WEKA tool, which has a collection of ML classifiers such as SVM,
Random Forest, LogitBoost and BayesNet. The gathered data is intentionally left noisy in order to
see how well the WEKA classifiers perform on noisy data. Their results show that the Random Forest
algorithm outperforms the other WEKA classifiers in detecting the position of the person in the
room, even on noisy data.
Jahanjoo et al. [6] stress the importance of the safety of elderly people living alone at home, who
are at a higher risk of severe harm because of poor systems for notifying caregivers and providing
care at healthcare centres. They treat this issue as a fall detection problem and use the 3-axis
accelerometer sensor of a Samsung Galaxy S3 phone to capture activities that are classified into two
categories: falls and daily living activities. Falls include activities such as lying, front knee
lying, sideward lying and back sitting chair. Daily living activities include standing, walking,
jogging, moving down stairs, sitting on a chair, and stepping in or out of a car. After the sensor
readings are collected, principal component analysis (PCA) is used to reduce the dimensionality of
the data, keeping a few meaningful features that represent the original data effectively. They then
use a hybrid neuro-fuzzy algorithm (MLF), which combines the benefits of fuzzy logic and neural
networks, to achieve higher accuracy in fall detection. They detect the fall activities with a
sensitivity of 97.29% and a specificity of 98.70% on a public dataset.
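For reference, sensitivity and specificity are computed from the counts of a binary confusion matrix as shown in the short sketch below. The counts used here are made up for illustration and are not the counts reported in [6].

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of actual falls that were detected: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of daily living activities correctly not flagged: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical counts: 100 falls and 100 daily living activities.
sens = sensitivity(tp=90, fn=10)
spec = specificity(tn=95, fp=5)
```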
Liu et al. [7] focus on hand gesture recognition and record signals of eight kinds of hand
movements (up, down, left, right, clockwise circle, counter-clockwise circle, turn left, and turn
right) with a nine-axis motion sensor consisting of an accelerometer, a gyroscope and a
magnetometer. The data of all nine axes is recorded, but only the accelerometer and gyroscope data
is kept for further processing because the magnetometer data is irregular due to noise. They then
use PCA to reduce the dimension of the data, after which feature extraction is carried out using
linear discriminant analysis (LDA). Finally, SVM is applied to classify the different hand
movements. Table 1 summarizes the algorithms used in the papers discussed above.
Table 1. Summary of the algorithms used in the literature survey
Sr. No. | Paper name | Algorithm
1. | Recognizing daily activities with RFID-based sensors [1] | Hidden Markov model (HMM) is used to perform sensor data analysis on the IDs generated by the objects moved around in the apartment.
2. | MoveFree: A ubiquitous system to provide women safety [2] | Decision tree algorithm helps in taking decisions depending on the readings of the various sensors.
3. | Automated analysis and quantification of human mobility using a depth sensor [3] | Automatic K-selection and K-means clustering are used for feature reduction and selection respectively.
4. | Wearable mobile-based emotional response-monitoring system for drivers [4] | SVM classifier is used on the collected sensor readings on a mobile phone; it extracts the information from the sensors and determines the driver's current emotion.
5. | Performance of machine learning classifiers for indoor person localization with capacitive sensors [5] | ML classifiers such as SVM, Random Forest, LogitBoost and BayesNet offered by the ML tool WEKA are used for indoor human localization.
6. | Accurate fall detection using 3-axis accelerometer sensor and MLF algorithm [6] | Principal component analysis (PCA) is used to perform dimensionality reduction on the data, and then a hybrid neuro-fuzzy algorithm (MLF) is used to achieve higher accuracy in fall detection.
7. | Gesture recognition with wearable 9-axis sensors [7] | PCA is used to perform dimension reduction on the data, after which feature extraction is carried out using linear discriminant analysis (LDA). Finally, SVM is applied to classify the different hand movements.
[3] ARCHITECTURE OF THE HUMAN SAFETY DEVICE
The general architecture of a HAR system for human safety devices is shown as a block diagram [8]
in Figure 1. The first block represents the sensors that track human activity. Sensors such as an
accelerometer, a temperature sensor, or a heart rate sensor can be used depending on the monitoring
task. The sensor readings are then passed to a microcontroller. For simple tasks, such as tracking
jogging or running, that do not require analysis of the collected sensor data, the microcontroller
alone is sufficient. For more complex tasks, such as remote health monitoring or security, sensor
data analysis is performed.
Figure 1. General Architecture for a Human Safety Device [8]
The data collected from the sensors is sent to a mobile phone running an application that uses a
machine learning algorithm in the background to perform sensor data analysis. Algorithms such as
SVM and K-means clustering are used to classify the activities as usual or unusual.
[4] ALGORITHMS FOR SENSOR DATA ANALYSIS
Sensor analytics refers to the processing of data collected by wired or wireless sensors. The
insights gained from analysing sensor data are helpful in many situations, such as detecting failed
equipment in a power plant or monitoring the health of a patient remotely. Because sensors are
switched on most of the time, the amount of data they generate can be too large to analyse directly.
Sensor data analysis addresses this problem by providing a system to manage and process the large
amount of data collected by the sensors. Such a system has three parts: the sensors for monitoring,
a data store, and an analytics engine. The analytics engine generally contains machine learning
algorithms that help in processing the collected data. This section discusses some of the machine
learning algorithms used for classification and dimensionality reduction, with the main focus on the
support vector machine (SVM), which is the algorithm we will use for sensor data analysis.
Principal component analysis (PCA) reduces the dimensionality of data that have many correlated
variables while preserving as much of the variation in the dataset as possible. It does so by
transforming the variables into a new set of uncorrelated variables, called the principal
components, which are ordered so that the amount of the original variation they retain decreases
down the order; the first principal component therefore retains the maximum variation present in
the original variables. The principal components are the eigenvectors of the covariance matrix of
the data, and hence they are orthogonal. In sensor data analysis, PCA is used as a preprocessing
step to reduce the dimensionality of large datasets.
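As a minimal sketch of the idea, the code below performs PCA on two-dimensional readings using only the Python standard library. It relies on the closed-form eigenvalues of a symmetric 2x2 covariance matrix, so it is limited to two variables; the function name `pca_2d` and the sample readings are assumptions made for illustration, and a real system would more likely use a numerical library.

```python
import math


def pca_2d(points):
    """Return (first principal axis, 1-D scores, explained variance ratio).

    Assumes at least two points that are not all identical.
    """
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Sample covariance matrix [[a, b], [b, c]].
    a = sum(dx * dx for dx, _ in centered) / (n - 1)
    c = sum(dy * dy for _, dy in centered) / (n - 1)
    b = sum(dx * dy for dx, dy in centered) / (n - 1)

    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    lam1, lam2 = half_trace + radius, half_trace - radius

    # Eigenvector belonging to the larger eigenvalue lam1.
    if b != 0:
        vx, vy = lam1 - c, b
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    axis = (vx / norm, vy / norm)

    # Project the centered data onto the first principal component.
    scores = [dx * axis[0] + dy * axis[1] for dx, dy in centered]
    return axis, scores, lam1 / (lam1 + lam2)


# Hypothetical, strongly correlated readings from two sensor channels.
readings = [(2.0, 1.9), (4.0, 4.1), (6.0, 5.9), (8.0, 8.1)]
axis, scores, ratio = pca_2d(readings)
```

Because the two channels are almost perfectly correlated, the first component retains nearly all of the variance, so the second dimension can be dropped with little loss.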
K-means clustering is another machine learning algorithm and is popular for cluster analysis. Its
aim is to partition the observations into k clusters in which each observation belongs to the
cluster with the nearest mean, and that mean (the centroid) serves as the prototype of the cluster.
The algorithm alternates between assigning each observation to its nearest centroid and recomputing
each centroid as the mean of the observations assigned to it. It repeats these two steps until the
assignments no longer change, so that similar readings are grouped together and the clusters are
clearly distinct from each other.
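The assign-and-update loop of K-means can be sketched in plain Python as follows. The initial centroids are passed in explicitly to keep the example deterministic; the function name, the sample points (pulse rate, acceleration magnitude), and the choice of initial centroids are assumptions for illustration only.

```python
import math


def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm: returns (centroids, assignments)."""
    centroids = [tuple(c) for c in initial_centroids]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: each centroid becomes the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, assignments


# Hypothetical (pulse rate, acceleration magnitude) readings.
points = [
    (70.0, 1.0), (72.0, 1.1), (68.0, 0.9),
    (130.0, 2.9), (128.0, 3.1), (132.0, 3.0),
]
centroids, labels = kmeans(points, initial_centroids=[points[0], points[3]])
```

K-means is unsupervised, so the two resulting groups still have to be interpreted (for example as usual and unusual readings) by some other means.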
SVM is a supervised machine learning model used on data for classification and regression
analysis. Given a set of training examples, each marked as belonging to one or the other of two
categories, an SVM training algorithm builds a model that assigns new examples to one category or
the other, making it a non-probabilistic binary linear classifier.
In the linear SVM algorithm, assume that the training data consist of n points $(x_i, y_i)$,
$i = 1, \dots, n$, where $y_i \in \{+1, -1\}$ indicates the class to which the point $x_i$ belongs.
The goal is to find the maximum-margin hyperplane that separates the points $x_i$ for which
$y_i = -1$ from the points for which $y_i = +1$, defined so that the distance between the hyperplane
and the nearest point from either group is maximized.
Any hyperplane can be written as the set of points $x$ satisfying
$w \cdot x - b = 0$ ... (1)
Here, $w$ is the normal vector of the hyperplane, and $\frac{b}{\|w\|}$ is the offset of the
hyperplane from the origin along the normal $w$.
If the training data is linearly separable, we can select two parallel hyperplanes that separate the
two classes of data so that the distance between them is as large as possible. These two hyperplanes
are
$w \cdot x - b = +1$ and $w \cdot x - b = -1$ ... (2)
and the distance between them is $\frac{2}{\|w\|}$, so $\|w\|$ must be minimized to increase the
distance between the two hyperplanes. Each data point must also lie on the correct side of the
margin, which gives the constraint
$y_i (w \cdot x_i - b) \geq 1$, for all $1 \leq i \leq n$ ... (3)
The optimization problem is therefore: minimize $\|w\|$ subject to $y_i (w \cdot x_i - b) \geq 1$
for all $1 \leq i \leq n$. The $w$ and $b$ that solve this problem determine the classifier
$x \mapsto \operatorname{sgn}(w \cdot x - b)$. The maximum-margin hyperplane is completely
determined by the points $x_i$ that lie nearest to it, which are called the support vectors.
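The quantities in equations (1) to (3) can be checked numerically on a toy dataset. The sketch below does not train an SVM; it takes a hand-picked $w$ and $b$ (which happen to be the maximum-margin solution for these four points) and evaluates the constraint, the margin width $2/\|w\|$, and the classifier $\operatorname{sgn}(w \cdot x - b)$. The data, the function names, and the values of `w` and `b` are assumptions for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def decision(w, b, x):
    """Signed score w.x - b from equation (1)."""
    return dot(w, x) - b


def classify(w, b, x):
    """Classifier sgn(w.x - b), returning +1 or -1."""
    return 1 if decision(w, b, x) >= 0 else -1


def margin_width(w):
    """Distance 2 / ||w|| between the two hyperplanes of equation (2)."""
    return 2 / math.sqrt(dot(w, w))


def satisfies_constraints(w, b, data):
    """Check constraint (3): y_i (w.x_i - b) >= 1 for every point."""
    return all(y * decision(w, b, x) >= 1 for x, y in data)


def support_vectors(w, b, data, tol=1e-9):
    """Points lying exactly on one of the two margin hyperplanes."""
    return [x for x, y in data if abs(y * decision(w, b, x) - 1) <= tol]


# Toy, linearly separable data: (point, label).
data = [((3.0, 0.0), 1), ((4.0, 1.0), 1), ((1.0, 0.0), -1), ((0.0, 1.0), -1)]
w, b = (1.0, 0.0), 2.0  # hand-picked separating hyperplane x1 = 2

feasible = satisfies_constraints(w, b, data)
width = margin_width(w)
sv = support_vectors(w, b, data)
```

In practice $w$ and $b$ are found by a quadratic programming solver or a library implementation rather than by hand.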
[5] CONCLUSION
HAR is the core of any safety device designed for humans, as our activities reveal a lot about our
well-being. The task of human safety involves three main steps: sensing the human activities,
processing the collected data, and taking appropriate decisions based on the analysis. HAR has
become a common solution to problems in many domains, such as remote healthcare and e-healthcare,
human security, virtual reality, video games, and road safety. While IOT has opened up many
possibilities for fields including human activity recognition, ML has provided the intelligence
these devices needed. Sensors such as the accelerometer, heart rate sensor, and pressure sensor
capture the human activity as readings, which are forwarded to the analytics engine with the help of
the microcontroller. This analytics engine contains machine learning algorithms such as PCA for
dimensionality reduction, K-means for clustering, and SVM for classification. Together, these
technologies have supported the development of lighter, higher-performance, and more comfortable
wearable devices suited to the challenges of human safety and human activity recognition. ML
algorithms such as SVM improve sensor data analysis and, with it, the overall accuracy of such
systems. The hardware part of the project, which involves interfacing the sensors with the
microcontroller, is complete; the sensor data analysis using machine learning remains to be
completed.
REFERENCES
[1] Michael Buettner, Richa Prasad, Matthai Philipose, David Wetherall, “Recognizing daily
activities with RFID-based sensors”, in ACM 11th International Conference on Ubiquitous
Computing, Florida, USA, Oct. 2009, pp. 51-60.
[2] Sohini Roy, Abhijit Sharma, Uma Bhattacharya, “MoveFree: A ubiquitous system to provide
women safety”, in Third International Symposium on Women in Computing and Informatics,
Kochi, India, Aug. 2015, pp. 545-552.
[3] Daniel Leightley, Jamie S. McPhee, Moi Hoon Yap, “Automated analysis and quantification of
human mobility using a depth sensor”, IEEE Journal of Biomedical and Health Informatics,
vol. 21, no. 4, pp. 939-948, Jun. 2017.
[4] Boon Giin Lee, Teak Wei Chong, Boon Leng Lee, Hee Joon Park, Yoon Nyun Kim, Beomjoon
Kim, “Wearable mobile-based emotional response-monitoring system for drivers”, IEEE
Transactions on Human-Machine Systems, vol. 47, no. 5, pp. 636-649, Feb. 2017.
[5] Osama Bin Tariq, Mihai Teodor Lazarescu, Javed Iqbal, Luciano Lavagno, “Performance of
machine learning classifiers for indoor person localization with capacitive sensors”, IEEE
Access, vol. 5, pp. 12913-12926, Jul. 2017.
[6] Anice Jahanjoo, Marjan Naderan Tahan, Mohammad Javad Rashti, “Accurate fall detection
using 3-axis accelerometer sensor and MLF algorithm”, in IEEE 3rd International Conference on
Pattern Recognition and Image Analysis, Shahrekord, Iran, Apr. 2017, pp. 90-95.
[7] Fang-Ting Liu, Yong-Ting Wang, Hsi-Pin Ma, “Gesture recognition with wearable 9-axis
sensors”, in IEEE International Conference on Communications, Paris, France, May 2017.
[8] Subhas C. Mukhopadhyay, “Wearable sensors for human activity monitoring: A review”, IEEE
Sensors Journal, vol. 15, no. 3, pp. 1321-1330, Mar. 2015.