
International Journal of Computer Engineering and Applications,

Volume XII, Special Issue, May 18, www.ijcea.com ISSN 2321-3469

Kritika Sharma and Deepali D. Londhe 1

IOT BASED SENSOR DATA ANALYSIS USING MACHINE LEARNING

Kritika Sharma¹, Deepali Londhe²

¹Department of Information Technology, ²Department of Information Technology

Pune Institute of computer technology, Pune, India

ABSTRACT:

Internet of Things (IOT) technology is attracting growing attention as the number of connected devices per person increases. Researchers and engineers are putting these devices to use for human safety, health tracking, automation, and other applications. Human safety is one of the most sought-after of these applications, and a growing number of human safety devices (HSD) are entering the market. The essence of a human safety device lies in human activity recognition (HAR), which is achieved using sensors. The sensors detect changes in the temperature, pressure, heartbeat, or orientation of the human body, and machine learning algorithms then analyse these readings to check whether they are usual or unusual. The proposed system uses a temperature sensor, a pulse sensor, a pressure sensor, and accelerometers to measure the critical values that are directly affected by unusual activity such as falling (detected by accelerometers), getting scared (pulse rate increases), or being grabbed by someone (pressure increases). The sensors continuously track heart rate, pressure in one specific area, body activity, etc., and generate values which are processed and then classified as usual or unusual activity with the help of a machine learning algorithm. Sensor data analysis using machine learning offers great opportunities in the field of human safety.

Keywords: IOT, Human Safety, Human Activity Recognition, Sensors, Sensor Data Analysis, Machine Learning



[1] INTRODUCTION

Human safety has become an issue of utmost importance because it is compromised so often nowadays. The market is overflowing with a wide variety of devices that offer human safety, and a lot of research is underway to make these devices more responsive and better equipped. Human activity recognition (HAR) is at the heart of human safety, as the activities we perform tell a lot about us. HAR is the process of identifying the activities of a person with the help of devices such as sensors and cameras that capture human motion. HAR is used extensively in areas such as virtual reality, remote health care, elderly health care, security, and studies aimed at understanding human activities, and the sensors available today have made this possible.

One of the techniques used for HAR is wearable sensor technology, which has evolved considerably in the last decade. The devices around us have become so smart that they outperform humans in some fields. This smartness is provided to sensor-based devices by machine learning (ML). Owing to its popularity in varied fields such as human-computer interaction, computer vision, healthcare, virtual reality, video games, and object identification, ML has become one of the most popular techniques among researchers for sensor data analysis. ML algorithms such as support vector machine (SVM), K-means clustering, random forest, and hidden Markov model (HMM) are used extensively for sensor data analysis because they cope effectively with the distortion and noise that environmental conditions introduce into sensor data.

These safety devices have undoubtedly improved a lot, but there is still a long way to go before they achieve the level of security we demand from them. This paper proposes a human safety system which tracks human activities using sensors and then, by performing sensor data analysis on the received readings with a machine learning algorithm, classifies the values as usual or unusual. Applying machine learning to the sensor data gives these devices insight about the user and makes retraining them much easier. The rest of the paper is divided into four sections. Section 2 is the literature survey on the algorithms used in sensor data analysis for human safety. Section 3 describes the architecture of the proposed system, Section 4 gives a brief introduction to the machine learning algorithms used for sensor data analysis, and Section 5 presents the conclusion.

[2] LITERATURE SURVEY

Buettner et al. [1] explore a dense sensing approach which uses radio frequency identification (RFID) sensor network technology to recognize human activities in a closed environment. Their activity detection system is divided into three main parts: the wireless identification and sensing platform (WISP), the RFID sensor network (RSN), and an inference engine consisting of a hidden Markov model (HMM). The WISPs transmit unique identifiers along with their most recent accelerometer readings, which are collected by RFID readers placed on the ceiling of the apartment. The RSN acts as the data source for the inference engine, whose HMM takes as input the IDs of the objects moved around in the apartment and performs sensor data analysis on them to identify the activity of the person in the apartment. Their results show a precision of 90% and a recall of 91% with the RSN, compared to a precision of 95% and a recall of 60% when using the iBracelet instead.

Roy et al. [2] use the Internet of Things (IOT) and ML to ensure human safety. They propose the design of a human safety band consisting of multiple sensors: a breath rate sensor, heart rate sensor, glucometer, sweat sensor, and optical blood flow sensor. These sensors collect their respective readings and send them to a mobile application (app) created by the authors. The app takes suitable action depending on the sensor readings, which are compared with previously stored values to see whether the new readings have crossed a certain threshold. If the readings are normal, the system continues to work normally; if the values exceed the predefined threshold, immediate actions are taken, such as informing friends and family members, the nearest police station, or nearby hospitals. The emergency steps are chosen by a decision tree algorithm running in the back end of the app. For example, if the breath rate, the heart rate, and the blood flow rate are all above their thresholds, then according to the decision tree the activity is considered abnormal and requires informing friends, police, and the nearest hospital.
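The threshold logic described above can be sketched as a small hand-written decision tree. This is only an illustration: the threshold values, the dictionary keys, and the function name `classify_reading` are hypothetical, and treating every branch other than "all three readings above threshold" as normal is an assumption, since the survey describes only that one branch.

```python
# Hypothetical thresholds for illustration only; the surveyed paper's
# actual values are not given in this survey.
THRESHOLDS = {"breath_rate": 25.0, "heart_rate": 120.0, "blood_flow": 1.5}


def classify_reading(reading, thresholds=THRESHOLDS):
    """Return (label, contacts) by walking a small decision tree.

    Each level of the tree tests one sensor against its threshold. Only the
    branch where all three readings exceed their thresholds is described in
    the text; treating every other branch as "normal" is an assumption.
    """
    for sensor in ("breath_rate", "heart_rate", "blood_flow"):
        if reading[sensor] <= thresholds[sensor]:
            return "normal", []
    return "abnormal", ["friends", "police", "hospital"]


calm = {"breath_rate": 16.0, "heart_rate": 72.0, "blood_flow": 1.0}
alarm = {"breath_rate": 32.0, "heart_rate": 150.0, "blood_flow": 2.1}
print(classify_reading(calm))
print(classify_reading(alarm))
```

A learned decision tree would choose the split order and thresholds from training data rather than from constants like these.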

Leightley et al. [3] use the Microsoft Kinect One depth sensor to identify human mobility impairments. Their framework is a two-step process: the first step captures the movements of a healthy person (without mobility issues) using the Kinect One sensor, and the second step compares and evaluates a test participant’s motion sequence against the statistical mobility model generated from the healthy person’s data. They also propose an automatic method to enable a fairer, unbiased approach to labelling the motion capture data. They use automatic K-selection and K-means clustering for feature reduction and selection: K-selection first reduces the number of clusters from the main data, and an automated clustering technique is then used to identify the optimum number of clusters. Their framework successfully identifies the mobility concerns of a person with mobility issues and is free from human intervention or bias in the decision-making process.

Lee et al. [4] suggest an approach for reducing road accidents by analysing the negative emotional responses of a person while driving. They consider the negative emotional response to be one of the major indicators of the likelihood of an accident. These responses are divided into three categories: fatigue, stress, and relaxation. They consider stress and relaxation as negative emotional responses. The responses are captured using three sensors: an electromyography (EMG) sensor, a photoplethysmography (PPG) sensor, and an inertial sensor. These sensors are connected to a microcontroller unit equipped with a Bluetooth-enabled low-energy module, which transmits the collected sensor readings to a mobile phone; the phone then extracts the information from the readings and determines the driver’s current emotion using a trained SVM classifier. The SVM is trained on data from eight of the nine subjects, and the remaining subject is held out for testing. Their results show an accuracy of 99.52%.
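The eight-versus-one subject split described above is one fold of what is commonly called leave-one-subject-out evaluation. The sketch below generates such splits by index; the function name, the subject numbering, and the two-samples-per-subject layout are hypothetical, and the survey does not say whether the held-out subject was rotated, so rotating through all nine is an assumption.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_indices, test_indices) for each subject in turn."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical layout: 9 subjects with 2 recorded samples each.
subject_ids = [subject for subject in range(1, 10) for _ in range(2)]
splits = list(leave_one_subject_out(subject_ids))

# The last split holds out subject 9 and trains on the other eight.
held_out, train, test = splits[-1]
print(held_out, len(train), len(test))
```

Splitting by subject rather than by sample keeps all readings from the test subject out of training, so the reported accuracy reflects performance on an unseen person.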

Tariq et al. [5] test the performance of various ML classifiers offered by the ML tool WEKA for indoor human localization. They use capacitive sensors in a 3 m × 3 m room to compare and analyse the performance of the classifiers available in WEKA. The capacitive sensors are placed on the four walls of the room to detect human motion. The sensors are used in load mode: the sensor is connected to one plate of the capacitor, while the other plate is formed by the environment and the human body, so that a change in the person’s position can be detected. After the data is gathered, it is analysed using WEKA, which has a collection of ML classifiers such as SVM, Random Forest, Logit Boost, and Bayes Net. The data is intentionally left noisy to see how well the WEKA classifiers perform on noisy data. Their results show that the Random Forest algorithm outperforms the other WEKA classifiers in detecting the position of the person in the room, even on noisy data.

Jahanjoo et al. [6] stress the importance of safety for elderly people living alone at home, who are at higher risk of severe harm because of poor systems for notifying caregivers and providing care at healthcare centres. They frame this issue as a fall detection problem and use a 3-axis accelerometer on a Samsung Galaxy S3 phone to capture activities, which are classified into two categories: falls and daily living activities. Falls include activities such as lying, front knee lying, sideward lying, and back sitting chair. Daily living activities include standing, walking, jogging, moving down stairs, sitting on a chair, and stepping in or out of a car. After the sensor readings are collected, Principal Component Analysis (PCA) is used to reduce the dimensionality of the data, keeping a few meaningful features that represent the original data effectively. They then use a hybrid neuro-fuzzy algorithm (MLF), which combines the benefits of fuzzy logic and neural networks, to achieve higher accuracy in fall detection. They detect fall activities with a sensitivity of 97.29% and a specificity of 98.70% on a public dataset.
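Sensitivity is the fraction of actual falls that are detected, and specificity is the fraction of daily living activities correctly left unflagged. The following sketch computes both from label lists; the labels and predictions are invented for illustration and are not the data behind the figures quoted above.

```python
def sensitivity_specificity(y_true, y_pred, positive="fall"):
    """Return (sensitivity, specificity) treating `positive` as the fall class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up labels: 4 falls and 5 activities of daily living ("adl").
y_true = ["fall"] * 4 + ["adl"] * 5
y_pred = ["fall", "fall", "fall", "adl", "adl", "adl", "adl", "adl", "fall"]
sensitivity, specificity = sensitivity_specificity(y_true, y_pred)
print(sensitivity, specificity)
```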

Liu et al. [7] focus on hand gesture recognition and record signals of eight kinds of hand movements (up, down, left, right, clockwise circle, counter-clockwise circle, turn left, and turn right) with a nine-axis motion sensor consisting of an accelerometer, a gyroscope, and a magnetometer. Data from all nine axes is recorded, but only the accelerometer and gyroscope data is kept for further processing, as the magnetometer data is irregular due to noise. They then use PCA to perform dimension reduction on the data, followed by feature extraction using Linear Discriminant Analysis (LDA). Lastly, SVM is applied to classify the different hand movements. Table 1 summarizes the algorithms used in the above-mentioned papers.

Table 1. Summary of the algorithms used in the literature survey

SR. No. | Paper name | Algorithm

1. | Recognizing daily activities with RFID based sensors [1] | Hidden Markov Model (HMM) is used to perform sensor data analysis on the IDs generated by the objects moved around in the apartment.

2. | Move Free: A ubiquitous system to provide women safety [2] | Decision tree algorithm helps in taking decisions depending upon the readings of various sensors.

3. | Automated analysis and quantification of human mobility using a depth sensor [3] | Automatic K-selection and K-means clustering are used for feature reduction and selection respectively.

4. | Wearable mobile based emotional response-monitoring system for drivers [4] | SVM classifier is used on the collected sensor readings on a mobile phone; it extracts the information from the sensors and determines the driver’s current emotion.

5. | Performance of machine learning classifiers for indoor person localization with capacitive sensors [5] | ML classifiers such as SVM, Random Forest, Logit Boost and Bayes Net offered by the ML tool WEKA are used for the purpose of indoor human localization.

6. | Accurate fall detection using 3-axis accelerometer sensor and MLF algorithm [6] | Principal Component Analysis (PCA) is used to perform dimensionality reduction on the data and then a hybrid neuro-fuzzy algorithm (MLF) is used to achieve higher accuracy in fall detection.

7. | Gesture recognition with wearable 9-axis sensors [7] | PCA is used to perform dimension reduction on the data and after that feature extraction is carried out using Linear Discriminant Analysis (LDA). Lastly SVM is applied for classifying the different hand movements.

[3] ARCHITECTURE OF THE HUMAN SAFETY DEVICE

The general architecture of a HAR system for human safety devices is represented by the block diagram [8] in Figure 1. The first block represents the sensors, which are responsible for tracking the human activity. Sensors such as an accelerometer, temperature sensor, or heart rate sensor can be used depending on the monitoring task. The sensor readings are then passed to a microcontroller. For simple tasks, such as tracking jogging or running, that do not involve further processing of the collected sensor data, only a microcontroller is used. For more complex tasks, such as remote health monitoring or security devices, sensor data analysis is performed.

Figure 1. General Architecture for a Human Safety Device [8]

The data collected from the sensors is sent to a mobile phone running an application with a machine learning algorithm in its back end to perform sensor data analysis. This analysis uses machine learning algorithms such as SVM and K-means clustering to classify the activities as usual or unusual.

[4] ALGORITHMS FOR SENSOR DATA ANALYSIS

Sensor analytics refers to the processing of data collected by sensors, which may be wired or wireless. The insights gained from analysing sensor data are helpful in many situations, such as detecting failed equipment in a power plant or monitoring the health of a patient remotely. As sensors are switched on most of the time, the amount of data they generate can be too large to analyse directly. Sensor data analysis addresses this problem by providing a system to manage and process the large amount of data collected by the sensors. Such a system has three parts: the sensors for monitoring, a data store, and an analytics engine. The analytics engine generally contains machine learning algorithms that help in processing the collected data. We discuss some of the machine learning algorithms used for classification and dimensionality reduction here, with the main focus on the support vector machine (SVM), which is the one we will use for sensor data analysis.
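One common way to keep a continuous sensor stream manageable is to hold only a fixed-size window of recent readings and summarise it into a few features for the analytics engine. The sketch below illustrates that idea; the class name `WindowStore`, the window size, and the heart-rate values are hypothetical and are not part of the proposed system.

```python
from collections import deque
from statistics import mean


class WindowStore:
    """Bounded in-memory data store that keeps only the latest `size` readings."""

    def __init__(self, size):
        # A deque with maxlen silently drops the oldest reading when full.
        self.readings = deque(maxlen=size)

    def add(self, value):
        self.readings.append(value)

    def features(self):
        """Summarise the current window as a small feature dictionary."""
        values = list(self.readings)
        return {"mean": mean(values), "min": min(values), "max": max(values)}


# Hypothetical heart-rate stream in beats per minute.
store = WindowStore(size=4)
for bpm in [70, 72, 71, 69, 110, 115]:
    store.add(bpm)
window_features = store.features()
print(window_features)
```

Feature dictionaries like this one, rather than every raw sample, would then be handed to a classifier.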

Principal component analysis (PCA) reduces the dimensionality of data that have many correlated variables, whether heavily or lightly correlated, while preserving as much of the variation in the dataset as possible. It does so by transforming the variables into a new set of orthogonal variables called principal components, ordered so that the amount of original variation each one retains decreases down the order; the first principal component therefore retains the maximum variation present in the original variables. The principal components are the eigenvectors of the covariance matrix, and hence they are orthogonal. In sensor data analysis, PCA is used as a preprocessing step to reduce the dimensionality of large datasets.
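For two features, the eigenvectors of the covariance matrix can be written in closed form, which makes the idea easy to check by hand. The sketch below is a minimal two-dimensional PCA under that restriction; the function name `pca_2d` and the sample points are hypothetical, and real sensor data with more features would need a general eigen-decomposition routine instead.

```python
import math


def pca_2d(points):
    """Return (eigenvalues, (pc1, pc2), scores) for a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # Closed-form eigenvalues of a symmetric 2x2 matrix, largest first.
    half_trace = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    eigenvalues = (half_trace + spread, half_trace - spread)

    # Direction of the largest-variance axis; pc2 is perpendicular to it.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    pc1 = (math.cos(angle), math.sin(angle))
    pc2 = (-math.sin(angle), math.cos(angle))

    # Project the centered data onto the first principal component.
    scores = [x * pc1[0] + y * pc1[1] for x, y in centered]
    return eigenvalues, (pc1, pc2), scores


# Perfectly correlated toy data: all variation lies along one direction.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
eigenvalues, components, scores = pca_2d(points)
print(eigenvalues, components[0], scores)
```

Because the toy points lie on a straight line, the second eigenvalue is zero and the one-dimensional scores lose no information, which is the ideal case for dimensionality reduction.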

K-means clustering is another machine learning algorithm and is popular for cluster analysis in data analysis tasks. The aim of this clustering algorithm is to partition the observations into k clusters in which each observation belongs to the cluster with the nearest mean, which serves as the prototype of that cluster. The algorithm groups together similar data by repeatedly assigning each observation to its nearest cluster mean and then recomputing the means, until the assignments stop changing and the clusters are clearly distinct from each other.
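The assign-then-update loop can be sketched in a few lines of plain Python. This is a minimal version of the standard (Lloyd's) iteration with caller-supplied starting centroids so that the run is deterministic; the function name, the toy points, and the starting centroids are all hypothetical.

```python
def kmeans(points, centroids, max_iter=100):
    """Cluster `points` (tuples) around the given initial `centroids`."""
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            distances = [
                sum((a - b) ** 2 for a, b in zip(point, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(point)

        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(p[d] for p in cluster) / len(cluster)
                          for d in range(len(old)))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid

        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    return centroids, clusters


# Two well-separated toy groups of two-feature readings.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(centroids)
print(clusters)
```

In practice the starting centroids are usually chosen at random or with a seeding heuristic, and the result can depend on that choice.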

SVM is a supervised machine learning model used for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier.

In the linear SVM algorithm, assume that the training data consists of n points $(x_i, y_i)$, $i = 1, \dots, n$, where $y_i \in \{+1, -1\}$ indicates the class to which the point $x_i$ belongs. The goal of the SVM is to find the maximum-margin hyperplane that divides the points $x_i$ for which $y_i = -1$ from the points for which $y_i = +1$, defined so that the distance between the hyperplane and the nearest point from either group is maximized.

Any hyperplane can be written as the set of points $x$ satisfying

$w \cdot x - b = 0$ … (1)

Here, $w$ is the normal vector of the hyperplane, and $\frac{b}{\|w\|}$ is the offset of the hyperplane from the origin along the normal $w$.

If the training data is linearly separable, we can select two parallel hyperplanes that separate the two classes of data so that the distance between them is as large as possible. These two hyperplanes are

$w \cdot x - b = +1$ and $w \cdot x - b = -1$ … (2)

and the distance between them is $\frac{2}{\|w\|}$, so $\|w\|$ must be minimized to increase the distance between the two hyperplanes.
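The distance between the two hyperplanes in (2) follows from a short calculation, sketched here in the same notation; the point $x_0$ and the step length $d$ are symbols introduced only for this derivation.

```latex
% x_0 is any point on the hyperplane w . x - b = -1
w \cdot x_0 - b = -1
% step a distance d along the unit normal w / \|w\| to reach w . x - b = +1
w \cdot \left( x_0 + d \, \frac{w}{\|w\|} \right) - b = +1
% expand using w . w = \|w\|^2 and substitute the first line
-1 + d \, \|w\| = +1 \quad \Longrightarrow \quad d = \frac{2}{\|w\|}
```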

To keep every training point outside the margin, we require

$y_i (w \cdot x_i - b) \ge 1$, for all $1 \le i \le n$ … (3)

This constraint says that each data point must lie on the correct side of the margin. The optimization problem is therefore: minimize $\|w\|$ subject to $y_i (w \cdot x_i - b) \ge 1$ for all $1 \le i \le n$. The $w$ and $b$ that solve this problem determine our classifier, $x \mapsto \operatorname{sgn}(w \cdot x - b)$. The maximum-margin hyperplane is completely determined by those points $x_i$ which lie nearest to it.
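The quantities in this section can be checked numerically on a toy example. The sketch below does not train an SVM; it takes a hand-picked hyperplane and verifies constraint (3), the margin width $2/\|w\|$, and the sign-based classifier. The sample points, the values of `w` and `b`, and the reading of +1 as "unusual" and −1 as "usual" are assumptions made for illustration.

```python
import math


def decision_value(w, b, x):
    """Compute w . x - b for one sample x."""
    return sum(wi * xi for wi, xi in zip(w, x)) - b


def classify(w, b, x):
    """Sign classifier; a value of exactly 0 is treated as +1 by convention."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def satisfies_margin(w, b, samples):
    """Check constraint (3): y_i (w . x_i - b) >= 1 for every sample."""
    return all(y * decision_value(w, b, x) >= 1 for x, y in samples)


def margin_width(w):
    """Distance between the two hyperplanes in (2): 2 / ||w||."""
    return 2 / math.sqrt(sum(wi * wi for wi in w))


# Toy two-feature samples: +1 = unusual, -1 = usual (assumed labelling).
samples = [((3.0, 3.0), 1), ((4.0, 3.0), 1), ((1.0, 1.0), -1), ((0.0, 1.0), -1)]
w, b = (0.5, 0.5), 2.0  # hand-picked hyperplane, not the output of a solver

print(satisfies_margin(w, b, samples))
print(margin_width(w))
print(classify(w, b, (5.0, 4.0)), classify(w, b, (0.5, 0.5)))
```

Here the points (3, 3) and (1, 1) lie exactly on the two hyperplanes in (2), so they play the role of the nearest points that determine the maximum-margin hyperplane.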

[5] CONCLUSION

HAR is the core of any safety device designed for humans, as our activities tell a lot about our well-being. The task of human safety involves three main steps: sensing the human activities, processing the collected data, and taking appropriate decisions based on the analysis. Human activity recognition has become a solution to problems from many domains, such as remote healthcare and e-healthcare, human security, virtual reality, video games, and road safety. While IOT has opened up immense possibilities for many fields including human activity recognition, ML has provided the long-awaited intelligence to these devices. Sensors such as accelerometers, heart rate sensors, and pressure sensors capture the human activity as readings, which are then forwarded to the analytics engine with the help of the microcontroller. This analytics engine has machine learning algorithms such as PCA for dimensionality reduction, K-means for clustering, and SVM for classification. Together, these technologies have boosted the development of more lightweight, high-performance, and comfortable wearable devices suited to the challenges of human safety and human activity recognition. Using ML algorithms such as SVM provides better sensor data analysis, improving the overall accuracy of such systems. The hardware part of the project, which involves interfacing the sensors with the microcontroller, is complete; the sensor data analysis using machine learning remains to be completed.



REFERENCES

[1] Michael Buettner, Richa Prasad, Matthai Philipose, David Wetherall, “Recognizing daily activities with RFID-based sensors”, in ACM 11th International Conference on Ubiquitous Computing, Florida, USA, Oct. 2009, pp. 51-60.

[2] Sohini Roy, Abhijit Sharma, Uma Bhattacharya, “MoveFree: A ubiquitous system to provide women safety”, in Third International Symposium on Women in Computing and Informatics, Kochi, India, Aug. 2015, pp. 545-552.

[3] Daniel Leightley, Jamie S. McPhee, Moi Hoon Yap, “Automated Analysis and Quantification of Human Mobility Using a Depth Sensor”, IEEE Journal of Biomedical and Health Informatics, vol. 21, no. 4, pp. 939-948, Jun. 2017.

[4] Boon Giin Lee, Teak Wei Chong, Boon Leng Lee, Hee Joon Park, Yoon Nyun Kim, Beomjoon Kim, “Wearable Mobile-Based Emotional Response-Monitoring System for Drivers”, IEEE Transactions on Human-Machine Systems, vol. 47, no. 5, pp. 636-649, Feb. 2017.

[5] Osama Bin Tariq, Mihai Teodor Lazarescu, Javed Iqbal, Luciano Lavagno, “Performance of machine learning classifiers for indoor person localization with capacitive sensors”, IEEE Access, vol. 5, pp. 12913-12926, Jul. 2017.

[6] Anice Jahanjoo, Marjan Naderan Tahan, Mohammad Javad Rashti, “Accurate fall detection using 3-axis accelerometer sensor and MLF algorithm”, in IEEE 3rd International Conference on Pattern Recognition and Image Analysis, Shahrekord, Iran, Apr. 2017, pp. 90-95.

[7] Fang-Ting Liu, Yong-Ting Wang, Hsi-Pin Ma, “Gesture recognition with wearable 9-axis sensors”, in IEEE International Conference on Communications, Paris, France, May 2017.

[8] Subhas C. Mukhopadhyay, “Wearable sensors for human activity monitoring: A review”, IEEE Sensors Journal, vol. 15, no. 3, pp. 1321-1330, Mar. 2015.
