inferringtiesinsocialiotusinglocation-basednetworksand ...3 jp]v3 ]4jjv3 brightkite and gowalla are...

16
Research Article Inferring Ties in Social IoT Using Location-Based Networks and Identification of Hidden Suspicious Ties Nauman Ali Khan , 1 Sihai Zhang , 1 Wuyang Zhou , 1 Ahmad Almogren , 2 Ikram Ud Din , 3 and Muhammad Asif 1 1 Key Laboratory of Wireless-Optical Communication, University of Science and Technology China, Hefei 230027, China 2 Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh 11633, Saudi Arabia 3 Department of Information Technology, e University of Haripur, Haripur 22620, Pakistan Correspondence should be addressed to Wuyang Zhou; [email protected] Received 21 October 2020; Revised 5 November 2020; Accepted 16 November 2020; Published 1 December 2020 Academic Editor: Habib Ullah Khan Copyright © 2020 Nauman Ali Khan et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Stochastic Internet of ings (IoT)-based communication behavior of the progressing world is tremendously impacting social networks. e growth of social networks helps to quantify the effect on the Social Internet of ings (SIoT). Multiple existences of two persons at several geographical locations in different time frames hint to predict the social connection. We investigate the extent to which social ties between people can be inferred by critically reviewing the social networks. Our study used Chinese telecommunication-based anonymized caller data records (CDRs) and two openly available location-based social network data sets, Brightkite and Gowalla. Our research identified social ties based on mobile communication data and further exploits communication reasons based on geographical location. is paper presents an inference framework that predicts the missing ties as suspicious social connections using pipe and filter architecture-based inference framework. It highlights the secret relationship of users, which does not exist in real data. e proposed framework consists of two major parts. Firstly, users’ cooccurrence based on the mutual location in a specific time frame is computed and inferred as social ties. Results are investigated based upon the cooccurrence count, the gap time threshold values, and mutual friend count values. Secondly, the detail about direct connections is collected and cross-related to the inferred results using Precision and Recall evaluation measures. In the later part of the research, we examine the false-positive results methodically by studying the human cooccurrence patterns to identify hidden relationships using a social activity. e outcomes indicate that the proposed approach achieves comprehensive results that further support the theory of suspicious ties. 1. Introduction A social network is a web of social ties among individuals. Social ties are the kind of one-to-one communication links among humans or Social Internet of ings [1, 2]. Formation of social ties depends upon many attributes, such as the location of living, personality, age, gender, workplace, ac- tivities, and many more [3, 4]. ese ties are built based on some needs or relationships. People use many mediums for communication such as calls, chatting on social networking websites, reading and writing comments to person, and reviewing and suggesting some purchasing mobile appli- cations [5]. Investigating human behavior and machine performance, how they react to and participate in social networks remained a center of attention for researchers [2, 6]. Social network analysis is the computer science field that quantifies, evaluates, and analyzes human behavior [7, 8]. A concept of social communication links was proposed by Granovetter [9]. According to their research finding, the communication link between two people is considered as the social tie. Also, each communication link’s strength was Hindawi Scientific Programming Volume 2020, Article ID 6667610, 16 pages https://doi.org/10.1155/2020/6667610

Upload: others

Post on 19-Dec-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: InferringTiesinSocialIoTUsingLocation-BasedNetworksand ...3 JP]V3 ]4JJV3 Brightkite and Gowalla are openly available location-basedsocialnetworkdatasets[13,48].Bothdatasetsare gathered

Research ArticleInferring Ties in Social IoT Using Location-Based Networks andIdentification of Hidden Suspicious Ties

Nauman Ali Khan 1 Sihai Zhang 1 Wuyang Zhou 1 Ahmad Almogren 2

Ikram Ud Din 3 and Muhammad Asif 1

1Key Laboratory of Wireless-Optical Communication University of Science and Technology China Hefei 230027 China2Department of Computer Science College of Computer and Information Sciences King Saud UniversityRiyadh 11633 Saudi Arabia3Department of Information Technology e University of Haripur Haripur 22620 Pakistan

Correspondence should be addressed to Wuyang Zhou wyzhouustceducn

Received 21 October 2020 Revised 5 November 2020 Accepted 16 November 2020 Published 1 December 2020

Academic Editor Habib Ullah Khan

Copyright copy 2020 Nauman Ali Khan et al +is is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work isproperly cited

Stochastic Internet of +ings (IoT)-based communication behavior of the progressing world is tremendously impacting socialnetworks +e growth of social networks helps to quantify the effect on the Social Internet of+ings (SIoT) Multiple existences oftwo persons at several geographical locations in different time frames hint to predict the social connection We investigate theextent to which social ties between people can be inferred by critically reviewing the social networks Our study used Chinesetelecommunication-based anonymized caller data records (CDRs) and two openly available location-based social network datasets Brightkite and Gowalla Our research identified social ties based on mobile communication data and further exploitscommunication reasons based on geographical location+is paper presents an inference framework that predicts the missing tiesas suspicious social connections using pipe and filter architecture-based inference framework It highlights the secret relationshipof users which does not exist in real data +e proposed framework consists of two major parts Firstly usersrsquo cooccurrence basedon the mutual location in a specific time frame is computed and inferred as social ties Results are investigated based upon thecooccurrence count the gap time threshold values and mutual friend count values Secondly the detail about direct connectionsis collected and cross-related to the inferred results using Precision and Recall evaluation measures In the later part of theresearch we examine the false-positive results methodically by studying the human cooccurrence patterns to identify hiddenrelationships using a social activity +e outcomes indicate that the proposed approach achieves comprehensive results thatfurther support the theory of suspicious ties

1 Introduction

A social network is a web of social ties among individualsSocial ties are the kind of one-to-one communication linksamong humans or Social Internet of+ings [1 2] Formationof social ties depends upon many attributes such as thelocation of living personality age gender workplace ac-tivities and many more [3 4] +ese ties are built based onsome needs or relationships People use many mediums forcommunication such as calls chatting on social networkingwebsites reading and writing comments to person and

reviewing and suggesting some purchasing mobile appli-cations [5] Investigating human behavior and machineperformance how they react to and participate in socialnetworks remained a center of attention for researchers[2 6] Social network analysis is the computer science fieldthat quantifies evaluates and analyzes human behavior[7 8]

A concept of social communication links was proposedby Granovetter [9] According to their research finding thecommunication link between two people is considered as thesocial tie Also each communication linkrsquos strength was

HindawiScientific ProgrammingVolume 2020 Article ID 6667610 16 pageshttpsdoiorg10115520206667610

further classified as strong or weak depending upon thefrequency of communication number of times emotionalattachment number of mutual ties relationship actions anda combination of these mentioned parameters [5] Equation(1) quantifies the strength of social ties which is denoted byweight such that higher weight tells stronger ties and viceversa [10] wAB represents the weight of social tie betweenNode A and Node B while CA and CB represent the degreeof Node A and Node B cAB is the number of mutual nodesbetween A and B Community structures were also foundone of the main reasons for social tie strength [11] it wasfound that people from the same communities have strongties as compared to different communities [5]

wAB cAB

CA minus 1 + CB minus 1 minus cAB (1)

Various models and techniques are developed to inferthe social network based on inadequate aspects [7 12] Oneof the specific categories belonging to such inference de-termines the cooccurrence based on time and locationDespite encountering many measures there remains a de-ficiency in acquiring precise and accurate inferences In ourresearch we consider several threshold parameters toquantify more precise inferences We also develop aframework that infers existing social ties and the hiddenrelationships in a social network

Initially in our research we present the inference ofsocial ties among people by correlating to their physicalpresence at several sites and their direct connections Wedefine a social connection if two individuals X and Y

cooccur in a s cell within t hr time frame such that X calls toperson R while connected to a b1 base station and in thesame time frame Y calls a person S from the same b1 basestation Furthermore we counted the number of cooccur-rence of X and Y Firstly we find social ties depending uponthe number of direct calls between two people To ensure thecorrect social connection we state a threshold such that thecount of direct calls is more than the threshold Secondly weevaluate the social relationship between two people bycounting the number of calls by X and Y in a specific timeframe Figure 1 states an example that explains the proce-dure of quantification Each hexagon represents an area of asingle base station X and Y are together 6 times in variousbase stations and there is a variation in the gap of calls Inthe first part of research we use the CDR data set providedby the telecommunication company and two openly avail-able location-based social network data sets ie Brightkiteand Gowalla [13] All data sets resemble to the stated ex-ample in Figure 1 We counted the number of concurrencesbased on multiple gap time frame thresholds and mutualfriends Furthermore we correlate results with the directcalls based on social connection using Precision and Recallevaluation measures

In the second phase of research we explore the false-positive results formed by the CDR-based social tie inferencemodel We state a missing tie as a suspicious tie between twopeople if they do not have any direct calls but are foundtogether numerous times Also they have a certain number

of mutual friends In the literature missing ties are definedas either nonresponse or absent ties [12] In an activity anactor does not give any information about a tie considered asnonresponse [14] while an absent tie means when an actordoes not give any indication about the tie detail A surveywas conducted to monitor the social behavior of the boysrsquoand girlsrsquo liking pattern +at was limited to binary datasuch that one represents a tie while zero represents no tieFigure 2 shows visual representation (block modeling) ofadjacency matrix made according to survey data [14] InFigure 2 green-filled slots represent the existence of tieregardless of its strength while white slots indicate eitherabsent ties or nonresponse ties In our research we exploreand classify a subset of missing ties as suspect ties Weconduct a social activity and simulation that generates a dataset the same as the CDR data supporting this conceptFurthermore we correlate the CDR-based social tie infer-ence modelrsquos false-positive results with the activity andsimulation results

+e contributions of this paper are as follows

(1) We developed an inference model and a classifierthat identifies location-based social ties +e infer-ence model is tested on the CDR-based social net-work Brightkite and Gowalla using Precision andRecall measures

(2) We identify a class of suspect ties by examining thesocial tie inference modelrsquos false-positive results

(3) We conducted an activity-based survey and a sim-ulation that demonstrates and evaluates the suspectsrsquoties

+e rest of this article is organized as follows Section 2describes the literature review Section 3 presents the de-scriptions of cooccurrence count normalization inferencealgorithm and social tie inference Brief concepts about thehidden relationship and suspicious links are described inSection 4 +e proposed framework and an algorithm toinfer suspicious relationships are given in Section 5 Section6 describes the data set description results and analysisFinally the conclusion of this article is presented in Section7

2 Related Work

+e physical world social network is represented as a graphwhere nodes are treated as people and edges are representedas the social tie between two people [15] In the literatureedge weight is represented as the strength of that particularsocial tie [10 16] A social network such as Twitter forms abidirectional graph eg a fan follows a celebrity but thecelebrity hardly ever follows back Usage of bidirectionalgraphs investigates influential networks and most inflec-tional people [17ndash19] Recommendation and targetedmarketing are some of the essential objectives of exploringsocial ties [20 21] +eme-based model adopts dynamicprogramming to explore critical factors for example playingand dating are the kind of themes [22] Social ties couplingand predicting the mobility of users were researched by

2 Scientific Programming

seeing the physical and network properties (geosocialproperties) [23 24] An effective prediction technique wasproposed to find the typical patterns of two users bycomparing the check-in details [24 25] Area significance ismeasured using a weight-assigningmethod by incorporatingtwo usersrsquo cooccurrence for a specific area A coffee shop is amore significant area as compared to an ordinary place [26]+e scoring mechanism helps in categorizing and labellingof social ties [10] Inference about the any social network isincomplete if associated features are neglected +e baselineof any social network is the single social connection betweentwo people In the state of the art social ties are generallycategorized as (1) strong ties (2) weak ties and (3) absent

ties [9 17] whereas the strength of tie depends upon (a)amount of time (b) emotional intensity (c) intimacy var-iables and (d) social distance [3 10] +e repeated presenceof two individuals in a specific geographical location within alimited time also infers a social connection [7] +e strengthof the social tie is directly proportional to the happening ofsuch high cooccurred events

IoT has emerged as one of the most powerful and im-pressive technological research domains [2] IoT presents anovel connectivity concept where machines can equallycollaborate with humans based on actuators and sensors[27 28] One research forecasts that smart devices such aselectronic medical kits and smart watches will reach up tothe worth of USD 160 billion by the end of 2026 [28] +ecommunication network between smart devices and humanforms Social Internet of +ings (SIoT) and further opens upnew research challenges for researchers Managing problemssuch as data scalability velocity and variety are few of theemerging issues in SIoT [29] Understanding social tiesamong human-to-human human-to-machines and ma-chines-to-machines helps to quantify the network perfor-mance issue [30]

Social ties are the backbone of any social network [31]Formation and deformation of social ties affect communitiesin a network [32] Besides social tie strength factors such aslocation emotion situation age gender religion person-ality and many more have a substantial impact on the socialconnection [10 33] Granovetter highlighted the strongconnection between weak social connections and findingjobs [34] In the literature sources of data commonly usedfor social analysis are call logs [35 36] emails and social-networking websites [5] In the literature extensive chal-lenges associated with the integration of visible and invisiblenetworks are highlighted [37] Investigating criminal socialnetworks using limited clues is one of the emerging researchareas of social network analysis (SNA) [38 39]

+X 440 pm Feb 11+X 221 pm Feb 18+Y 752 pm Feb 18

+X 944 pm Feb 28+Y 358 pm Feb 28

+X 732 pm Feb 12+Y 742 pm Feb 12

+X 630 pm Feb 19

+X 1117 pm Feb 7+Y 1002 pm Feb 7

+Y 122 pm Feb 11 +Y 813 pm Feb 11

+X 740 pm Feb 6+Y 122 pm Feb 6+X 307 pm Feb 9+Y 908 pm Feb 9

Figure 1 Illustration of social tie example for two individuals that have cooccurrence while connected to various base stations of differenttimes

Absent tie Nonresponse

B1

B2

B3

B4

B5

B6

G1

G2

G3

G4

G5

B1 B2 B3 B4 B5 B6 G1 G2 G3 G4 G5

Figure 2 Boys and girls liking social network (absent tie andnonresponse)

Scientific Programming 3

Statistically there are always some hidden or visibleassociated parameters among cooccurred events Socialnetwork analysis is performed to explore such intriguingknowledge In the physical world social network analysis isutilized in job searching [4] studying urban life psychologyinvestigation of guilt association [12] finding communities[40] spreading of news [41] and influential networks[18 42] In the recent era of information and technologiesmassive logs are generating for each person eg call recordsbank transactions online purchase records daily emailsCCTV cameras and much more mediums [7 43 44] Incontrast to the physical world suchmediums further concisethe accuracy of results by highlighting such associatedfeatures Despite numerous data sources there is no optimalprocedure to quantify stochastic human nature and socialnetwork evolution [45]

+e grouping method identifies hidden social groupswhich further explores the friend circles and focuses underhigh privacy settings [46] Another research explores thehidden social ties using respondent sampling [47] In theliterature hidden social ties refer to that population which ishard to access +e population that tries to hide from thesocial network is hidden in a network [47] In our researchsuspect ties mean actors in a social network that are presentand accessible but they try to hide their social connectionsOur second part of the research explores the suspicious tieswithin the existing network instead of a hidden populationin a social network

3 Data Set Characteristics andEvaluation Measure

31 Data Set Descriptions In our research we incorporatedthree large location-based data sets ie CDR Brightkiteand Gowalla [13] +e CDR large data set used in this studywas provided by one of the Chinese mobile telecommu-nication operator companies +e data set contains 202000subscribes along with user demographic informationCalling detailed records contain six months (June2014mdashDecember 2014) and calling detailed records con-tain these 202 000 subscribes which have 221 451 169records Each record of the data set is represented in thefollowing format

Caller ID Call Type Callee ID Time

Duration LAC ID CELL ID

Brightkite and Gowalla are openly available location-based social network data sets [13 48] Both data sets aregathered using the online social-networking websitesWebsites maintain user check-in data by fetching mobileGPS location data +ese services use to help people infinding the nearby users and to build social connectionBrightkite contains 58228 nodes and 214078 edges andGowalla contains 196591 nodes and 950327 edges Otherthan social network data both data sets also contain directsocial tie data

32 Abbreviations and Evaluation Measures Figure 3 statesthe example of the social network having a case of suspectactors and their hidden ties Actors with several mutualfriends but do not have direct connection may have a secretconnection +is information helps in identifying them as asuspect tie +e social network evolves and new connectionsexpand the scope of the social network One social networkis a combination of multiple social networks involvingdifferent individuals [31] A social network can be slicedbased on starting and ending time Social networks can alsobe divided into subsocial networks monthwise if it has beendeveloped over one year [49 50] Social network slicinghelps our research further to explore the missing ties be-tween friends of friend relationships

+e following list of abbreviations is used for thequantification processes of Precision and Recall which willalso be used in several parts of the paper

SK calling record+e calling records represent the actual number ofdirect calls that occur between two users +e value ofSK is counted to identify the social tie between twousersCK times of cooccurrenceTime cooccurrence represents the presence of two usersin the range of a common base station We counted CKwhen two users were connected to a common basestation and they called any other userG time minus frame gap value+e time-frame gap value represents the time intervalbetween two usersrsquo calls while connected to a specificbase station For example user X calls someone at 2 pmand user Y calls someone else at 4 pm in this case thegap between the calls is 2 hr To quantify the results and

Actor

Suspect actor

Mutual actor

TieHidden tie

Figure 3 Hidden social ties referring suspects

4 Scientific Programming

evaluate Precision and Recall values we experimentedon the following set of gap valuesG 30mins 1 hr 2 hr 6 hr 12 hr and 24 hrSNK threshold for direct calls+e threshold for direct calls represents the set ofthreshold values for assessing direct calls between twousers We evaluated Precision and Recall curves on thebases of the following set of threshold valuesSNK 2 5 10 and 15

CTK threshold for cooccurrence+e threshold for cooccurrence represents the set ofthreshold values for the two usersrsquo presence in a specificbase station We tested the performance on the CTKthreshold values ranges from 1 to 40MuF threshold formutual friend

+e threshold for mutual friend represents the set ofthreshold values for the two usersrsquo mutual friend We testedthe performance on the MuF threshold values ranges from 1to 100

Precision TP

TP + FP (2)

Recall TP

TP + FN (3)

where as

TP (SNKge SK andCTKgeCK)

FP (SNKlt SK andCTKgeCK)

FN (SNKlt SK andCTKltCK)

TN (SNKge SK andCTKltCK)

⎧⎪⎪⎪⎪⎨

⎪⎪⎪⎪⎩

(4)

F1 score 2 timesPrecision times RecallPrecision + Recall

(5)

33 Cooccurrence Count Normalization MeasureCooccurrence count value CV tells the presence of two usersin the region of one base station An issue related to CVcounting is explained and resolved using an example for thetwo users X and Y shown in Figure 4 We counted CVwhentwo users were connected to a common base station andthey called any other user in a specific time frame +eexample is shown in Figure 4 states the call log details ofusers X and Y gathered in a time frame T

Let X x1 x2 x3 x4 x5 x61113864 1113865 and Y y1 y2 y31113864 1113865 then

CV α1 α2 α3 αi1113864 1113865 (6)

where αi TimeGap(xi Y) i 1 to 6 j 1 to 3

TimeGap xi Y( 1113857 min xi y1( 1113857 xi y2( 1113857 xi yj1113872 11138731113966 1113967

(7)

In Figure 4 x1 and x2 call times have the closest calltime to y1 call In this case a count value of CV can becalculated as 2 However such counting may lead to awrong inference It is the same as if one person calls onceand another person calls n-times within a specific timeframe equals n as the count value To resolve this issue wepropose a normalization equation that decreases the countvalue periodically We introduce Beta (β) value as a pe-riodic normalizing factor

Let X denotes a set of calls by user X and Y denotes a setof calls by user Yif (XgeY) thenX times Y elseY times X

According to the example stated in Figure 4 we assumedβ for set Y (y1β y2β y3β)

For first match value of β 1For the second match values of β β2Likewise for the nth match value of β βn

x1 y1( 1113857 times y1β( 1113857( 1113857 + x2 y1( 1113857 timesy1β2

1113888 11138891113888 11138891113890 1113891

+ x3 y2( 1113857 times y2β( 1113857( 1113857 + x4 y2( 1113857 timesy2β2

1113888 11138891113888 1113889 + x5 y2( 1113857 timesy2β3

1113888 11138891113888 11138891113890 1113891

+ x6 y3( 1113857 times y3β( 11138571113858 1113859

(8)

1113944mk

m11113944

nk

n1xn ym( 1113857 times

ymβn

1113888 11138891113888 1113889 (9)

In equation (9) mk refers to the total number of callsmade by user X to Y while nk refers to the total number ofcalls made by user Y to X equation (9) finds intermediatevalue for CV ie 433 instead of maximum 6 or mini-mum 3 values

4 Social Tie Inference

We initially investigated direct social ties formed by CDRdata sets and compared them to the indirect social tiesformed based on common location using Algorithm 1 By

Scientific Programming 5

direct ties we mean calling or direct connection For ex-ample person A calls person B refers to a direct tie betweenA and B Algorithm 1 takes GV SNK and CDR data sets(social network) as inputs Furthermore the algorithm hastwo parts initially it finds the direct ties between two in-dividuals depending upon the SNK threshold value Sec-ondly it counts the presence of two individuals based onseveral parameters +e Calculate Cooccurrence Count()function finds the number of cooccurrences using equation(9) explained in the previous section Infer Social Ties()function finds the social connections depending upon CTKDT and SNK and inferred them as the social ties

41 CDR-Based Social Tie Inference A social tie is inferredbetween two persons if they are found together at severalsites numerous times +e inference algorithm identifies twosets of results ie direct social ties and inferred social tiesFor the cross-validation of results we correlate the direct tieresults with the inferred ones Precision and Recall evalu-ation measures are used to examine the results We tested allrecords based on threshold values K is the direct calls M isthe times of cooccurrence and G is the time frame gap valueWhile N as direct call count shows the degree of friendship

more value of N indicates the friendship strength Figure 5shows the Precision graph which contains four setsFigures 5(a)ndash5(d) +e whole data set is examined based onK and the value of M and v

In Figure 5(a) the value of K is 15 which represents theusers with direct calls between each other equals to or greaterthan 15 +e value of M is the number of cooccurrence fortwo different users +e Precision values are comparativelysignificantly less for M in the range of 0 to 10 In contrastthe value of Precision increases exponentially for the value ofM in the range of 10 to 30 +e higher value of M indicateshigher cooccurrence of users A positive correlation can beobserved between the values of M and Precision It infersthat cooccurrence is a significant attribute that affectspositively in identifying social ties All graphs in Figure 5have six different lines each line represents the different timegap ranges It can also be seen that the values of gap value 30minutes are having more Precision while the rest lines of 1hour 2 hours 6 hours 12 hours and 24 hours are having lessPrecision +is also clues that the strength of ties has aspecific effect on Precision Users having strong socialconnections most of the time are found together in certainareas +is pattern is explicitly observed by looking cooc-currence value M (20 to 40) and gap time frame G 30

y1

y2

y3

User X User Y

x1

x2

x3

x4

x5

x6

Figure 4 User X and user Y cooccurrence count comparison case

RequireSN Social NetworkGVTime Gap +reshold ValueSNK 2 5 10 15

EnsureDTDirect Social TiesIST Infer Social Ties

(1) while SNne 0 do(2) DTDirect Ties (SNK)(3) end while(4) while SNne 0 do(5) CTKCalculate Cooccurrence Count (DT GV)(6) IST Infer Social Ties (CTK DT SNK)(7) end while

ALGORITHM 1 Social tie inference

6 Scientific Programming

minutes Another positive correlation is found between thedegree of friendship and physical presence at a specific place

To see the effect of friendship strength we evaluatedresults for the four different K values ie 2 5 10 and 15 Atypical pattern is found in all the graphs shown in Figure 5 Itshows that Precision is less for people whose mutualpresence is less at different sites Also people with strongsocial ties spent less than 1 hr time together at a specificlocation To understand the graphrsquos actual meaning wequantify and reconcile with the actual direct social ties It isobserved that a positive correlation in results infers that

people with strong social connections often visit placestogether

Figure 6(a) represents the Recall results We tested andevaluated Recall based on the same measures as Precisionie direct calls cooccurrence and gap time frameFigure 6(a) states that the value of Recall is at a maximumwhen we consider a less number of commonplaces Aninverse trend is observed between the values of Recall andM specifically for the M value ranging between 0 and 3+esame as Precision Recall is also evaluated based on sixdifferent time gap values

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

M

Prec

ision

K = 15

(a)

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

M

Prec

ision

K = 10

(b)

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

M

Prec

ision

K = 5

(c)

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

M

Prec

ision

K = 2

(d)

Figure 5 +e Precision formed between two callers due to social contact depending on the number of times they have connected throughthe same base station within a specific time frame Figure 5(a) shows the Precision for the six different time frame lines whereas M is thenumber of cooccurrences and K is the number of direct calls (a) K 15 (b) K 10 (c) K 5 and (d) K 2

Scientific Programming 7

+is part of the research finds peoplersquos cooccurrencebased on the same base station connectivity in a specific timeframe and infers them as social ties Furthermore it crossrelates the inferred results with direct call results

42 Brightkite- and Gowalla-Based Social Tie InferenceBrightkite and Gowalla data sets contain direct social ties aswell as the check-in information of each user In our studywe investigated both data sets based on several dimensionsand found some of the very interesting facts During theanalysis we observed positive correlations between mutualfriend count values MuF and user check-in details InBrightkite MuF ranging 6 to 40 and in Gowalla MuF

ranging 25 to 90 shows the positive corelation with Preci-sion We also measure the effect of gap time value on thePrecision and Recall and evaluated results based on severalgap time values

Figures 7(a) and 7(c) of line graph show the relationbetween mutual friend count values MuF and Precisionwhile Figures 7(b) and 7(d) show the relation betweenmutual friend count values MuF and Recall In Figure 7results shows that people having a certain number ofmutual friends use to visit place together or with little gapof interval

+e social tie inference framework infers some of theabsent ties as social ties in the form of false-positive resultsTo further extract the actual meaning of such incorrect

0

01

02

03

04

05

06

07

08

09

1Re

call

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 40M

K = 15

(a)

0

01

02

03

04

05

06

07

08

09

1

Reca

ll

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 40M

K = 10

(b)

0

01

02

03

04

05

06

07

08

09

1

Reca

ll

0 5 10 15 20 25 30 35 40

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

M

K = 5

(c)

0

01

02

03

04

05

06

07

08

09

1Re

call

0 5 10 15 20 25 30 35 40

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

M

K = 2

(d)

Figure 6+e Recall formed between two callers due to social contact depending upon the number of times they have connected through thesame base station within a specific time frame Figure 6(a) shows the Recall for the six different time frame lines whereas M is the number ofcooccurrences and K is the number of direct calls (a) K 15 (b) K 10 (c) K 5 and (d) K 2

8 Scientific Programming

inferences we conducted a social activity and simulation+e false-positive results of the first part of the research serve asthe foundation for the second part Activity under the first partdata set is conducted and the false-positive results are examinedby studying the human cooccurrence patterns described in thenext section Suspicious Ties+is stage of research gave us a clueto further exploit the category of missing ties

5 Suspicious Ties

An absent tie can be inferred as a suspect tie if it satisfies thefollowing properties

(1) M number of mutual friends(2) C number of cooccurrence on different sites

In the CDR data set each cell is treated as a single cellof the base station Our model adopts the followingfour features for evaluation

(1) Base station(2) User ID(3) Gap time threshold(4) Call time stamp

+e following mathematical model explains theproblem and its formulationLet BS denotes a set bsi1113864 1113865 of bsn points and bsi iscalled as the distinct base station cellLet DU denotes a set dui1113864 1113865 of dun points and dui iscalled as the distinct user

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 420001020304050607080910

MuF

Prec

ision

(a)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 42MuF

0001020304050607080910

Reca

ll

(b)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Prec

ision

1 10 20 30 40 50 60 70 80 90 100

(c)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Reca

ll

1 10 20 30 40 50 60 70 80 90 100

(d)

Figure 7 +e Precision and Recall formed between two users due to social contact depending on the number of times they have checked into certain location within a specific time frame Figure 7(a) shows the Precision for the six different time frame lines whereas MuF is thenumber of mutual friend count (a) Brightkite Precision (b) Brightkite Recall (c) Gowalla Precision and (d) Gowalla Recall

Scientific Programming 9

Let TH denotes a set thi1113864 1113865 of thn points and thi iscalled as the threshold value for timeframeLet C denotes a set ci1113864 1113865 of cn points and ci is called asthe call information

ci (id ct) whereid as call id

ct as outgoing call time1113896 (10)

Let R denotes a set ri1113864 1113865 of rn points then

ri dui ci(id)( 1113857 times dui ci(id)( 11138571113858 1113859anddui ne dui (11)

Let RT denotes a set rti1113864 1113865 of rtn points then

rti dui ci(ct)( 1113857 minus dui cti(ct)( 11138571113858 1113859anddui ne dui (12)

Let S denotes a set si1113864 1113865 of sn points then

si ri isin bsi( 1113857and rti le thi( 11138571113858 1113859 (13)

wheresi is a set of elements that identifies distinct callersbased on the same base station connectivity and a definitenumber of calls in a specific time frame

6 Suspect Inference Framework

We studied the pattern of exceptional cases belong to thefalse-positive set and described a subset of the false-positive set as suspect ties Physical activity was designedand conducted to investigate the formation of suspectsocial ties Activity consisting of 50 people and a data setwas generated within almost 4-5 hours A basketballcourt was utilized for the activity Nine circles weredrawn physically on the basketball court assumed as thebase station cell Out of 50 people nine were directed toact as a base station +e boundary of each circle wasconsidered as a range of the base station cell Rests of the41 persons were directed to perform the following twosteps

61 Selection Step Initially each person from 41 people wasasked to choose two sets of friends One set as obviousfriends and the second set of hidden friends such that thesize of the hidden friend set should be at most 15 of theobvious friend set size eg if one person has five people inapparent friend set he can have no more than one hiddenfriend After the selection of both sets by each individualinformation was shared with one of our representatives

62 Operation Step In this phase each person was directedto follow the following rules

(1) You should not call your hidden friend

(2) You should call all of your obvious friends at leastonce

(3) You should conduct a maximum number of calls toyour closest obvious friend and second maximum toa second level obvious friend and likewise to the leastfriend

(4) You must try to meet your hidden friend as much aspossible physically

+e method of calling is like if person A wants to callperson B from base station B1 the person has to go to thebase station B1 and register a call with a person acting andstanding in base station B1 Respective base station personwill write and make an entry with five parameters ie CallerName Callee Name Time From Base Station Name and ToBase Station Name +e data set was gathered in the fol-lowing format An example is given below

CallerName

CalleeName Time Stamp

FromBase

Station

ToBase

Station

User 1 User 12 02032019101508 BS 7 BS 2

For the understanding of the variations and patterns thesame activity was also designed using simulation +e wholesimulation followed the same conditions and another dataset was generated using a random function Based on theactivity a framework is designed to separate a class ofsuspect ties Proposed inference framework work is designedand implemented using pipe and filter architecture shownin Figure 8 Algorithm 2 shows the implementation ofsuspicious tie inference framework explained in Figure 8+e framework takes the social network matrix countthreshold value gap time value and mutual friend countvalue as inputs and filters the result Initially CalculateLevels() function finds the five level depth information foreach distinct user Let us say if A calls B B calls C C calls Dand D calls E it implies that A 0 B 1 C 2 D 3 andE 4 represent five levels +is step ensures that all the levelshave distinct users Secondly Find Suspects() functionselects only those sets of users from level 1 and level 3 that donot have any direct calls and M number of mutual friendsFurthermore Calculate Subsocial Network() functiongenerates subgraphs using level 1 and level 3 detailsdepending upon the gap time value Results are filtered onthese bases of time gap value eg two users called someother user while connected to the same base station withinthe given time frame explained in the previous section Afterthat Infer Hidden Ties() function uses the proposed nor-malization method to find the number of the count definedin equation (9) and then all results are filtered according tothe cooccurrence count CV and the mutual friend thresholdvalue M Based on mentioned parameters and thresholdsAlgorithm 2 significantly identifies the subclass of missingties as suspicious ties

+e results of the activity and simulation are computedand evaluated using Precision and Recall measures

10 Scientific Programming

RequireSN Social NetworkGVTime Gap+reshold ValueCVCooccurrence Count +reshold ValueMMutual Friend Count

EnsureST Inferred Social Suspect Ties

(1) while SNne 0 do(2) Levels [5 n]Calculate Levels()(3) Suspects Find Suspects (Level 1 Level 3)(4) SGCalculate Subsocial Network (Suspects GV)(5) ST Infer Suspect Ties (SG CV M) using equation (9)(6) end while

ALGORITHM 2 Suspicious hidden social tie inference

Table 1 Evaluation of simulation and social activity

Gap timeMge 2 Mge 5

Simulation Social activity Simulation Social activityP R F1 P R F1 P R F1 P R F1

CVge 5

GVlt 10 012 068 020 012 100 021 013 062 021 012 079 021GVlt 20 010 042 016 015 079 025 015 031 020 013 042 020GVlt 30 021 022 021 071 059 064 025 018 021 023 028 025GV ge 30 034 023 027 100 012 021 030 010 015 032 019 024

CVge 10

GVlt 10 010 033 015 012 048 019 009 030 014 011 061 019GVlt 20 011 028 016 027 050 021 009 018 012 010 039 016GVlt 30 013 018 015 029 030 029 009 010 009 009 020 012GV ge 30 013 008 010 026 005 008 010 004 006 013 007 009

CVge 20

GVlt 10 004 018 007 008 023 012 005 029 009 011 039 017GVlt 20 007 008 007 005 024 008 004 011 006 010 028 015GVlt 30 004 006 005 009 009 009 003 005 004 009 019 012GV ge 30 002 001 001 001 001 001 001 001 001 001 005 002

lowastP Precision R Recall andF1 F1 Score

Social network

Count number of co-occurtence

Suspect social tiesMutual friend threshold

value

Count threshold value

Calculate sub social networks via

time gap threshold value

Calculate 0 1 2 3 4 level ties

Highlight missing ties between

level 1 and level 3

((xn ym) times (ymβn))mm=1

nn=1=

Figure 8 Suspect social tie inference framework

Scientific Programming 11

Evaluation results of simulation and social activity con-ducted are shown in Table 1 Precision Recall and F1 Scoremeasures are used to evaluate the framework that is furthercalculated based on the cooccurrence count CV mutualfriend count M and gap time GV parameter setting valuesF1 Score is calculated using equation (5) Definitions of therelated parameters are given as follows

TP Which were hidden friends and system infer ashidden friends

TN Which were not hidden friends and system inferas not hidden friends

FP Which were not hidden friends and system inferas hidden friends

FN Which were hidden friends and system infer asnot hidden friends

63 Results and Discussion We tested and evaluated allrecords based on the cooccurrence count value CV mutualfriend M and gap time value GV as the gap timeFigures 9(a)ndash9(d) show the evaluation of results for theactivity based on three different values of the cooccurrencecount value CV ie CV 5 10 and 15 and two different

values of the mutual friend M ie Mge 2 and ge5 LikewiseFigures 10(a)ndash10(d) show the evaluation of results for thesimulation data set along with mentioned CV and M valuesResults were generated on four values of gap time ie ge 30lt 30 lt 20 and lt 10

In Figures 9(a) and 9(b) Recall is maximum and Pre-cision is less where the value of GVge 30 CVge 5 and Mge 2+e system obtains a maximum number of relevant hiddenties along with false-positive results By GVge 30 CVge 5 andMge 2 it means that the gap between the two calls is 30minsor more while the count of cooccurred events is keptminimum five and mutual friend count as two or less +esystemrsquos performance drops when gap time is reduced tolt 30 lt 20 and lt 10 Even though there is a drop in true-positive values a significant drop in the false-positive valuescan be seen Results become concise with the variation inboth values of GV and M +e limited data set collectedusing activity highlights the occurrences of hidden tiesWhole activity and simulation were designed to get similarfields of data as CDR so that the proposed framework iscompatible with the CDR data set

Results shown in Figures 9 and 10 exhibit the existence ofhidden relationships +e parameter such as the gap timevalue GV helps to identify the time frame selection such that

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1Pr

ecisi

onM ge 2

(a)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

(b)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Prec

ision

M ge 5

(c)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 9 Evaluation of social activity using Precision and Recall

12 Scientific Programming

what gap value is suitable to get optimal performanceLikewise Figures 10(a) to 10(d) represent the Precision andRecall curves for the simulation data set Results are gatheredby setting the same values for the threshold parameters CVGV and M Figures 10(a) and 10(d) give the least values forPrecision and Recall which tells that if threshold values aretightened so does the count relevant retrieve results get lowOut of all the results Precision and Recall values for bothsimulation and activity data set showmaximum if thresholdvalues GVge 30 CVge 5 and Mge 2 are set We found someexciting dissimilarities in the results between simulation andactivity data set results during the complete evaluationprocess +e simulation results do not give higher Recall andPrecision value compared to activity data results We con-cluded that the data set of simulation is generated usingrandom function while the activity had the human hidingpatterns +ese results also help to infer human psychologyin building hidden ties +e simulation data set generatorworks based on the same constraints mentioned as rules ofthe activity However the key difference between simulationand activity is the selection of friends hidden friend and the

pattern of calling While conducting the simulation trivialvariation in results was observed as the random selection hasrandom patterns According to our findings simulation andactivity both exhibit patterns of hidden ties However ac-tivity results are more pronounced and significantly iden-tifying the hidden relationships

It is essential to highlight some critical questions anddependent variables that help to find hidden social tiesbetween two people for example why two people hide theirsocial ties Is this deliberate or unintentional action What ifthey are deliberately hiding their social tie for a purpose Insuch a scenario extracting a social tie for two people is a kindof intense problem It is a kind of investigation processwhich explores a clue to draw some relationship betweentwo people Investigation designates if two people areposting a picture on social media to be more cautious aboutnot identifying their social ties While if they are doing someprivate activity they will be less careful for example postinga picture on Facebook or Flickr compared to calling to aperson from a specific location Although extracting hiddensocial ties include various privacy issues We designed

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Prec

ision

M ge 2

CV ge 5CV ge 10CV ge 15

(a)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

CV ge 5CV ge 10CV ge 15

(b)

Prec

ision

0010203040506070809

1M ge 5

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

(c)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

0010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 10 Evaluation of simulation using Precision and Recall

Scientific Programming 13

activity and simulation that generated the data set by thecaller data record (CDR) data set We thoroughly investi-gated the patterns of connectivity and established aframework to infer social ties +is research has opened up anew direction further to explore connectivity in the SocialInternet of +ings (SIoT) specifically machine-to-machinedirect communication and machine-to-human or human-to-machine hidden relationships In many cases severalmachines work together but they rarely have a directconnection

7 Conclusion and Future Work

In this research we have examined the developments ofsocial connection patterns based on physical gathering Inthe first part of our research we explored the correlationbetween direct communication and two individualsrsquo phys-ical presence To check the systemrsquos performance andevaluation we examined all results by utilizing Precision andRecall evaluation measures We also present a periodicnormalization equation for the cooccurrence count In thesecond phase we propose the suspect tie inference frame-work False-positive results of the first part of the researchare the ground to the second part of the study +e proposedframework adopts pipe and filter architecture where thethreshold values control each filter +e frameworkrsquos fun-damental objective is to take the data set such as CDR (callerdata record) and infer suspect social ties depending uponthe specified threshold values Analyzing the results criti-cally we propose a theory that identifies suspect social tiesBesides this for comparison and evaluation we conductedreal-time human-based activity and simulation Keeping inmind the structure of the actual CDR data set the wholeactivity was designed and evaluated In contrast to existingwork our research focus is on hidden ties instead of thehidden actors In the future we are aiming to explore thehomophilic nature of suspect ties within the Social Internetof +ings (SIoT)

Data Availability

+e data used can be found at httpsnapstanfordedudataindexhtmllocnet

Conflicts of Interest

+e authors declare that they have no conflicts of interestregarding the publication of this work

Acknowledgments

+is work was supported by King Saud University SaudiArabia through research supporting project number RSP-2020184 Nauman Ali Khan acknowledges the support ofthe Chinese Government and Chinese Scholarship Council(CSC) for his PhD studies at the University of Science andTechnology China +is research work was partially sup-ported by Key Program of the National Natural ScienceFoundation of China (grant number 61631018)

References

[1] L Weng M Karsai N Perra F Menczer and A FlamminildquoAttention on weak ties in social and communication net-worksrdquo in Complex Spreading Phenomena in Social Systemspp 213ndash228 Springer Berlin Germany 2018

[2] A M Ortiz D Hussein S Park S N Han and N Crespildquo+e cluster between internet of things and social networksreview and research challengesrdquo IEEE Internet of ingsJournal vol 1 no 3 pp 206ndash215 2014

[3] P Luarn and Y-P Chiu ldquoKey variables to predict tie strengthon social network sitesrdquo Internet Research vol 25 no 2pp 218ndash238 2015

[4] D Goel and K Lang ldquoSocial ties and the job search of recentimmigrantsrdquo ILR Review vol 72 no 2 pp 355ndash381 2019

[5] H Kizgin A Jamal N Rana Y Dwivedi and VWeerakkodyldquo+e impact of social networking sites on socialization andpolitical engagement role of acculturationrdquo TechnologicalForecasting and Social Change vol 145 pp 503ndash512 2019

[6] P Suebvises ldquoSocial capital citizen participation in publicadministration and public sector performance in +ailandrdquoWorld Development vol 109 pp 236ndash248 2018

[7] Z Liu Y Qiao S TaoW Lin and J Yang ldquoAnalyzing humanmobility and social relationships from cellular network datardquoin Proceedings of the 2017 13th International Conference onNetwork and Service Management (CNSM) pp 1ndash6 TokyoJapan November 2017

[8] L Matosas-Lopez and A Romero-Ania ldquo+e efficiency ofsocial network services management in organizations an in-depth analysis applying machine learning algorithms andmultiple linear regressionsrdquo Applied Sciences vol 10 no 15p 5167 2020

[9] M S Granovetter ldquo+e strength of weak tiesrdquo in SocialNetworks pp 347ndash367 Elsevier Amsterdam Netherlands1977

[10] T Nguyen L Zhang and A Culotta ldquoEstimating tie strengthin follower networks to measure brand perceptionsrdquo inProceedings of the 2019 IEEEACM International Conferenceon Advances in Social Networks Analysis and Mining (ASO-NAM) pp 779ndash786 Vancouver Canada August 2019

[11] H Mattie K Engoslash-Monsen R Ling and J-P OnnelaldquoUnderstanding tie strength in social networks using a localbow tie frameworkrdquo Scientific Reports vol 8 no 1 p 93492018

[12] D J Crandall L Backstrom D Cosley S SuriD Huttenlocher and J Kleinberg ldquoInferring social ties fromgeographic coincidencesrdquo Proceedings of the NationalAcademy of Sciences vol 107 no 52 pp 22436ndash24441 2010

[13] E Cho S A Myers and J Leskovec ldquoFriendship and mo-bility user movement in location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Con-ference on Knowledge Discovery and Data Mining pp 1082ndash1090 San Diego CA USA August 2011

[14] A Znidarsic P Doreian and A Ferligoj ldquoAbsent ties in socialnetworks their treatments and blockmodeling outcomesrdquoAdvances in Methodology amp StatisticsMetodoloski Zvezkivol 9 no 2 2012

[15] S A Myers A Sharma P Gupta and J Lin ldquoInformationnetwork or social network the structure of the twitter followgraphrdquo in Proceedings of the 23rd International Conference onWorld Wide Web pp 493ndash498 Seoul Republic of KoreaApril 2014

14 Scientific Programming

[16] X Zuo J Blackburn N Kourtellis J Skvoretz andA Iamnitchi ldquo+e power of indirect social tiesrdquo 2014 httparxivorgabs14014234

[17] M A Brandatildeo and M M Moro ldquoTie strength in co-au-thorship social networks analyses metrics and a new com-putational modelrdquo in Proceedings of the Anais Estendidos doXXV Simposio Brasileiro de Sistemas Multimıdia e Webpp 17ndash20 Rio de Janeiro Brazil 2019

[18] H A Khanday A H Ganai and R Hashmy ldquoA comparativeanalysis of identifying influential users in online social net-worksrdquo in Proceedings of the 2018 International Conference onSoft-Computing and Network Security (ICSNS) pp 1ndash6Coimbatore India February 2018

[19] R Garcia-Guzman Y A Andrade-Ambriz M-A Ibarra-Manzano S Ledesma J C Gomez and D-L Almanza-Ojeda ldquoTrend-based categories recommendations and age-gender prediction for pinterest and twitter usersrdquo AppliedSciences vol 10 no 17 p 5957 2020

[20] A Talukder M G R Alam N H Tran D Niyato andC S Hong ldquoKnapsack-based reverse influence maximizationfor target marketing in social networksrdquo IEEE Access vol 7pp 44182ndash44198 2019

[21] N Ampazis T Emmanouilidis and F Sakketou ldquoA matrixfactorization algorithm for efficient recommendations insocial rating networks using constrained optimizationrdquoMachine Learning and Knowledge Extraction vol 1 no 3p 928 2019

[22] N Zhou X Zhang and S Wang ldquo+eme-aware socialstrength inference from spatiotemporal datardquo in Proceedingsof the International Conference on Web-Age InformationManagement pp 498ndash509 Macau China June 2014

[23] P A Grabowicz J J Ramasco B Gonccedilalves andV M Eguıluz ldquoEntangling mobility and interactions in socialmediardquo PLoS One vol 9 no 3 Article ID e92196 2014

[24] D Yang B Qu J Yang and P Cudre-Mauroux ldquoRevisitinguser mobility and social relationships in LBSNs a hypergraphembedding approachrdquo in Proceedings of the World Wide WebConference pp 2147ndash2157 San Francisco NY USA May2019

[25] R-H Li J Liu J X Yu H Chen and H Kitagawa ldquoCo-occurrence prediction in a large location-based social net-workrdquo Frontiers of Computer Science vol 7 no 2pp 185ndash194 2013

[26] V Jayadevan K Bharadwaj A Kumar and P KhandelwalldquoDiscovering local social groups using mobility datardquo Inter-national Journal of Computer Applications vol 120 no 212015

[27] A J Jara Y Bocchi and D Genoud ldquoSocial internet of thingsthe potential of the internet of things for defining humanbehavioursrdquo in Proceedings of the 2014 International Con-ference on Intelligent Networking and Collaborative Systemspp 581ndash585 Salerno Italy September 2014

[28] W A D M Jayathilaka K Qi Y Qin et al ldquoSignificance ofnanomaterials in wearables a review on wearable actuatorsand sensorsrdquo Advanced Materials vol 31 no 7 Article ID1805921 2019

[29] J Ren H Guo C Xu and Y Zhang ldquoServing at the edge ascalable iot architecture based on transparent computingrdquoIEEE Network vol 31 no 5 pp 96ndash105 2017

[30] I Seeber E Bittner R O Briggs et al ldquoMachines as team-mates a research agenda on ai in team collaborationrdquo In-formation amp Management vol 57 no 2 Article ID 1031742020

[31] T Pi L Cao P Lv Z Ye and H Wang ldquoInferring implicitsocial ties in mobile social networksrdquo in Proceedings of the2018 IEEE Wireless Communications and Networking Con-ference (WCNC) pp 1ndash6 Barcelona Spain April 2018

[32] N Tahir A Hassan M Asif and S Ahmad ldquoMCD mutuallyconnected community detection using clustering coefficientapproach in social networksrdquo in Proceedings of the 2019 2ndInternational Conference on Communication Computing andDigital Systems (C-CODE) pp 160ndash165 Islamabad PakistanMarch 2019

[33] V A Lewis C A MacGregor and R D Putnam ldquoReligionnetworks and neighborliness the impact of religious socialnetworks on civic engagementrdquo Social Science Researchvol 42 no 2 pp 331ndash346 2013

[34] M S Granovetter ldquo+e strength of weak ties a networktheory revisitedrdquo Sociologicaleory vol 1 pp 201ndash233 1983

[35] J Gui Z Zheng X Zhao and Z Qin ldquoStatistical propertiesand temporal properties of calling behaviorrdquo in Proceedings ofthe 2019 IEEE 5th International Conference on Computer andCommunications (ICCC) pp 1940ndash1944 Chengdu ChinaDecember 2019

[36] I H Sarker ldquoA machine learning based robust predictionmodel for real-life mobile phone datardquo Internet of ingsvol 5 pp 180ndash193 2019

[37] M S Bernstein E Bakshy M Burke and B KarrerldquoQuantifying the invisible audience in social networksrdquo inProceedings of the SIGCHI Conference on Human Factors inComputing Systems pp 21ndash30 Paris France April 2013

[38] R Montasari R Hill V Carpenter and F Montaseri ldquoDigitalforensic investigation of social media acquisition and analysisof digital evidencerdquo International Journal of Strategic Engi-neering vol 2 no 1 pp 52ndash60 2019

[39] M Yip N Shadbolt and C Webber ldquoStructural analysis ofonline criminal social networksrdquo in Proceedings of the 2012IEEE International Conference on Intelligence and SecurityInformatics pp 60ndash65 Arlington VA USA June 2012

[40] Z Zhao C Li X Zhang F Chiclana and E H Viedma ldquoAnincremental method to detect communities in dynamicevolving social networksrdquo Knowledge-Based Systems vol 163pp 404ndash415 2019

[41] S Vosoughi D Roy and S Aral ldquo+e spread of true and falsenews onlinerdquo Science vol 359 no 6380 pp 1146ndash1151 2018

[42] Y Fang J Gao Z Liu and C Huang ldquoDetecting cyber threatevent from twitter using IDCNN and BILSTMrdquo AppliedSciences vol 10 no 17 p 5922 2020

[43] P Lu L Zhang X Liu J Yao and Z Zhu ldquoHighly efficientdata migration and backup for big data applications in elasticoptical inter-data-center networksrdquo IEEE Network vol 29no 5 pp 36ndash42 2015

[44] O Sharif M Hoque A Kayes R Nowrozy and I SarkerldquoDetecting suspicious texts using machine learning tech-niquesrdquo Applied Sciences vol 10 no 18 p 6527 2020

[45] K C Roy M Cebrian and S Hasan ldquoQuantifying humanmobility resilience to extreme events using geo-located socialmedia datardquo EPJ Data Science vol 8 no 1 p 18 2019

[46] A Squicciarini S Karumanchi D Lin and N DesistoldquoIdentifying hidden social circles for advanced privacy con-figurationrdquo Computers amp Security vol 41 pp 40ndash51 2014

[47] L Chen F W Crawford and A Karbasi ldquoSeeing the unseennetwork Inferring hidden social ties from respondent-drivensamplingrdquo in Proceedings of the irtieth AAAI Conference onArtificial Intelligence pp 1174ndash1180 Phoenix AZ USAFebruary 2016

Scientific Programming 15

[48] J Leskovec and A Krevl ldquoSNAP datasets stanford largenetwork dataset collectionrdquo 2014 httpsnapstanfordedudata

[49] X Zheng Y Wang and M A Orgun ldquoContextual sub-network extraction in contextual social networksrdquo in Pro-ceedings of the 2015 IEEE TrustcomBigDataSEISPApp 119ndash126 Helsinki Finland August 2015

[50] A Madani and M Marjan ldquoMining social networks to dis-cover ego sub-networksrdquo in Proceedings of the 2016 3rd MECInternational Conference on Big Data and Smart City(ICBDSC) pp 1ndash5 Muscat Oman March 2016

16 Scientific Programming

Page 2: InferringTiesinSocialIoTUsingLocation-BasedNetworksand ...3 JP]V3 ]4JJV3 Brightkite and Gowalla are openly available location-basedsocialnetworkdatasets[13,48].Bothdatasetsare gathered

further classified as strong or weak depending upon thefrequency of communication number of times emotionalattachment number of mutual ties relationship actions anda combination of these mentioned parameters [5] Equation(1) quantifies the strength of social ties which is denoted byweight such that higher weight tells stronger ties and viceversa [10] wAB represents the weight of social tie betweenNode A and Node B while CA and CB represent the degreeof Node A and Node B cAB is the number of mutual nodesbetween A and B Community structures were also foundone of the main reasons for social tie strength [11] it wasfound that people from the same communities have strongties as compared to different communities [5]

wAB cAB

CA minus 1 + CB minus 1 minus cAB (1)

Various models and techniques are developed to inferthe social network based on inadequate aspects [7 12] Oneof the specific categories belonging to such inference de-termines the cooccurrence based on time and locationDespite encountering many measures there remains a de-ficiency in acquiring precise and accurate inferences In ourresearch we consider several threshold parameters toquantify more precise inferences We also develop aframework that infers existing social ties and the hiddenrelationships in a social network

Initially in our research we present the inference ofsocial ties among people by correlating to their physicalpresence at several sites and their direct connections Wedefine a social connection if two individuals X and Y

cooccur in a s cell within t hr time frame such that X calls toperson R while connected to a b1 base station and in thesame time frame Y calls a person S from the same b1 basestation Furthermore we counted the number of cooccur-rence of X and Y Firstly we find social ties depending uponthe number of direct calls between two people To ensure thecorrect social connection we state a threshold such that thecount of direct calls is more than the threshold Secondly weevaluate the social relationship between two people bycounting the number of calls by X and Y in a specific timeframe Figure 1 states an example that explains the proce-dure of quantification Each hexagon represents an area of asingle base station X and Y are together 6 times in variousbase stations and there is a variation in the gap of calls Inthe first part of research we use the CDR data set providedby the telecommunication company and two openly avail-able location-based social network data sets ie Brightkiteand Gowalla [13] All data sets resemble to the stated ex-ample in Figure 1 We counted the number of concurrencesbased on multiple gap time frame thresholds and mutualfriends Furthermore we correlate results with the directcalls based on social connection using Precision and Recallevaluation measures

In the second phase of research we explore the false-positive results formed by the CDR-based social tie inferencemodel We state a missing tie as a suspicious tie between twopeople if they do not have any direct calls but are foundtogether numerous times Also they have a certain number

of mutual friends In the literature missing ties are definedas either nonresponse or absent ties [12] In an activity anactor does not give any information about a tie considered asnonresponse [14] while an absent tie means when an actordoes not give any indication about the tie detail A surveywas conducted to monitor the social behavior of the boysrsquoand girlsrsquo liking pattern +at was limited to binary datasuch that one represents a tie while zero represents no tieFigure 2 shows visual representation (block modeling) ofadjacency matrix made according to survey data [14] InFigure 2 green-filled slots represent the existence of tieregardless of its strength while white slots indicate eitherabsent ties or nonresponse ties In our research we exploreand classify a subset of missing ties as suspect ties Weconduct a social activity and simulation that generates a dataset the same as the CDR data supporting this conceptFurthermore we correlate the CDR-based social tie infer-ence modelrsquos false-positive results with the activity andsimulation results

+e contributions of this paper are as follows

(1) We developed an inference model and a classifierthat identifies location-based social ties +e infer-ence model is tested on the CDR-based social net-work Brightkite and Gowalla using Precision andRecall measures

(2) We identify a class of suspect ties by examining thesocial tie inference modelrsquos false-positive results

(3) We conducted an activity-based survey and a sim-ulation that demonstrates and evaluates the suspectsrsquoties

+e rest of this article is organized as follows Section 2describes the literature review Section 3 presents the de-scriptions of cooccurrence count normalization inferencealgorithm and social tie inference Brief concepts about thehidden relationship and suspicious links are described inSection 4 +e proposed framework and an algorithm toinfer suspicious relationships are given in Section 5 Section6 describes the data set description results and analysisFinally the conclusion of this article is presented in Section7

2 Related Work

+e physical world social network is represented as a graphwhere nodes are treated as people and edges are representedas the social tie between two people [15] In the literatureedge weight is represented as the strength of that particularsocial tie [10 16] A social network such as Twitter forms abidirectional graph eg a fan follows a celebrity but thecelebrity hardly ever follows back Usage of bidirectionalgraphs investigates influential networks and most inflec-tional people [17ndash19] Recommendation and targetedmarketing are some of the essential objectives of exploringsocial ties [20 21] +eme-based model adopts dynamicprogramming to explore critical factors for example playingand dating are the kind of themes [22] Social ties couplingand predicting the mobility of users were researched by

2 Scientific Programming

seeing the physical and network properties (geosocialproperties) [23 24] An effective prediction technique wasproposed to find the typical patterns of two users bycomparing the check-in details [24 25] Area significance ismeasured using a weight-assigningmethod by incorporatingtwo usersrsquo cooccurrence for a specific area A coffee shop is amore significant area as compared to an ordinary place [26]+e scoring mechanism helps in categorizing and labellingof social ties [10] Inference about the any social network isincomplete if associated features are neglected +e baselineof any social network is the single social connection betweentwo people In the state of the art social ties are generallycategorized as (1) strong ties (2) weak ties and (3) absent

ties [9 17] whereas the strength of tie depends upon (a)amount of time (b) emotional intensity (c) intimacy var-iables and (d) social distance [3 10] +e repeated presenceof two individuals in a specific geographical location within alimited time also infers a social connection [7] +e strengthof the social tie is directly proportional to the happening ofsuch high cooccurred events

IoT has emerged as one of the most powerful and im-pressive technological research domains [2] IoT presents anovel connectivity concept where machines can equallycollaborate with humans based on actuators and sensors[27 28] One research forecasts that smart devices such aselectronic medical kits and smart watches will reach up tothe worth of USD 160 billion by the end of 2026 [28] +ecommunication network between smart devices and humanforms Social Internet of +ings (SIoT) and further opens upnew research challenges for researchers Managing problemssuch as data scalability velocity and variety are few of theemerging issues in SIoT [29] Understanding social tiesamong human-to-human human-to-machines and ma-chines-to-machines helps to quantify the network perfor-mance issue [30]

Social ties are the backbone of any social network [31]Formation and deformation of social ties affect communitiesin a network [32] Besides social tie strength factors such aslocation emotion situation age gender religion person-ality and many more have a substantial impact on the socialconnection [10 33] Granovetter highlighted the strongconnection between weak social connections and findingjobs [34] In the literature sources of data commonly usedfor social analysis are call logs [35 36] emails and social-networking websites [5] In the literature extensive chal-lenges associated with the integration of visible and invisiblenetworks are highlighted [37] Investigating criminal socialnetworks using limited clues is one of the emerging researchareas of social network analysis (SNA) [38 39]

+X 440 pm Feb 11+X 221 pm Feb 18+Y 752 pm Feb 18

+X 944 pm Feb 28+Y 358 pm Feb 28

+X 732 pm Feb 12+Y 742 pm Feb 12

+X 630 pm Feb 19

+X 1117 pm Feb 7+Y 1002 pm Feb 7

+Y 122 pm Feb 11 +Y 813 pm Feb 11

+X 740 pm Feb 6+Y 122 pm Feb 6+X 307 pm Feb 9+Y 908 pm Feb 9

Figure 1 Illustration of social tie example for two individuals that have cooccurrence while connected to various base stations of differenttimes

Absent tie Nonresponse

B1

B2

B3

B4

B5

B6

G1

G2

G3

G4

G5

B1 B2 B3 B4 B5 B6 G1 G2 G3 G4 G5

Figure 2 Boys and girls liking social network (absent tie andnonresponse)

Scientific Programming 3

Statistically there are always some hidden or visibleassociated parameters among cooccurred events Socialnetwork analysis is performed to explore such intriguingknowledge In the physical world social network analysis isutilized in job searching [4] studying urban life psychologyinvestigation of guilt association [12] finding communities[40] spreading of news [41] and influential networks[18 42] In the recent era of information and technologiesmassive logs are generating for each person eg call recordsbank transactions online purchase records daily emailsCCTV cameras and much more mediums [7 43 44] Incontrast to the physical world suchmediums further concisethe accuracy of results by highlighting such associatedfeatures Despite numerous data sources there is no optimalprocedure to quantify stochastic human nature and socialnetwork evolution [45]

+e grouping method identifies hidden social groupswhich further explores the friend circles and focuses underhigh privacy settings [46] Another research explores thehidden social ties using respondent sampling [47] In theliterature hidden social ties refer to that population which ishard to access +e population that tries to hide from thesocial network is hidden in a network [47] In our researchsuspect ties mean actors in a social network that are presentand accessible but they try to hide their social connectionsOur second part of the research explores the suspicious tieswithin the existing network instead of a hidden populationin a social network

3 Data Set Characteristics andEvaluation Measure

31 Data Set Descriptions In our research we incorporatedthree large location-based data sets ie CDR Brightkiteand Gowalla [13] +e CDR large data set used in this studywas provided by one of the Chinese mobile telecommu-nication operator companies +e data set contains 202000subscribes along with user demographic informationCalling detailed records contain six months (June2014mdashDecember 2014) and calling detailed records con-tain these 202 000 subscribes which have 221 451 169records Each record of the data set is represented in thefollowing format

Caller ID Call Type Callee ID Time

Duration LAC ID CELL ID

Brightkite and Gowalla are openly available location-based social network data sets [13 48] Both data sets aregathered using the online social-networking websitesWebsites maintain user check-in data by fetching mobileGPS location data +ese services use to help people infinding the nearby users and to build social connectionBrightkite contains 58228 nodes and 214078 edges andGowalla contains 196591 nodes and 950327 edges Otherthan social network data both data sets also contain directsocial tie data

32 Abbreviations and Evaluation Measures Figure 3 statesthe example of the social network having a case of suspectactors and their hidden ties Actors with several mutualfriends but do not have direct connection may have a secretconnection +is information helps in identifying them as asuspect tie +e social network evolves and new connectionsexpand the scope of the social network One social networkis a combination of multiple social networks involvingdifferent individuals [31] A social network can be slicedbased on starting and ending time Social networks can alsobe divided into subsocial networks monthwise if it has beendeveloped over one year [49 50] Social network slicinghelps our research further to explore the missing ties be-tween friends of friend relationships

+e following list of abbreviations is used for thequantification processes of Precision and Recall which willalso be used in several parts of the paper

SK calling record+e calling records represent the actual number ofdirect calls that occur between two users +e value ofSK is counted to identify the social tie between twousersCK times of cooccurrenceTime cooccurrence represents the presence of two usersin the range of a common base station We counted CKwhen two users were connected to a common basestation and they called any other userG time minus frame gap value+e time-frame gap value represents the time intervalbetween two usersrsquo calls while connected to a specificbase station For example user X calls someone at 2 pmand user Y calls someone else at 4 pm in this case thegap between the calls is 2 hr To quantify the results and

Actor

Suspect actor

Mutual actor

TieHidden tie

Figure 3 Hidden social ties referring suspects

4 Scientific Programming

evaluate Precision and Recall values we experimentedon the following set of gap valuesG 30mins 1 hr 2 hr 6 hr 12 hr and 24 hrSNK threshold for direct calls+e threshold for direct calls represents the set ofthreshold values for assessing direct calls between twousers We evaluated Precision and Recall curves on thebases of the following set of threshold valuesSNK 2 5 10 and 15

CTK threshold for cooccurrence+e threshold for cooccurrence represents the set ofthreshold values for the two usersrsquo presence in a specificbase station We tested the performance on the CTKthreshold values ranges from 1 to 40MuF threshold formutual friend

+e threshold for mutual friend represents the set ofthreshold values for the two usersrsquo mutual friend We testedthe performance on the MuF threshold values ranges from 1to 100

Precision TP

TP + FP (2)

Recall TP

TP + FN (3)

where as

TP (SNKge SK andCTKgeCK)

FP (SNKlt SK andCTKgeCK)

FN (SNKlt SK andCTKltCK)

TN (SNKge SK andCTKltCK)

⎧⎪⎪⎪⎪⎨

⎪⎪⎪⎪⎩

(4)

F1 score 2 timesPrecision times RecallPrecision + Recall

(5)

33 Cooccurrence Count Normalization MeasureCooccurrence count value CV tells the presence of two usersin the region of one base station An issue related to CVcounting is explained and resolved using an example for thetwo users X and Y shown in Figure 4 We counted CVwhentwo users were connected to a common base station andthey called any other user in a specific time frame +eexample is shown in Figure 4 states the call log details ofusers X and Y gathered in a time frame T

Let X x1 x2 x3 x4 x5 x61113864 1113865 and Y y1 y2 y31113864 1113865 then

CV α1 α2 α3 αi1113864 1113865 (6)

where αi TimeGap(xi Y) i 1 to 6 j 1 to 3

TimeGap xi Y( 1113857 min xi y1( 1113857 xi y2( 1113857 xi yj1113872 11138731113966 1113967

(7)

In Figure 4 x1 and x2 call times have the closest calltime to y1 call In this case a count value of CV can becalculated as 2 However such counting may lead to awrong inference It is the same as if one person calls onceand another person calls n-times within a specific timeframe equals n as the count value To resolve this issue wepropose a normalization equation that decreases the countvalue periodically We introduce Beta (β) value as a pe-riodic normalizing factor

Let X denotes a set of calls by user X and Y denotes a setof calls by user Yif (XgeY) thenX times Y elseY times X

According to the example stated in Figure 4 we assumedβ for set Y (y1β y2β y3β)

For first match value of β 1For the second match values of β β2Likewise for the nth match value of β βn

x1 y1( 1113857 times y1β( 1113857( 1113857 + x2 y1( 1113857 timesy1β2

1113888 11138891113888 11138891113890 1113891

+ x3 y2( 1113857 times y2β( 1113857( 1113857 + x4 y2( 1113857 timesy2β2

1113888 11138891113888 1113889 + x5 y2( 1113857 timesy2β3

1113888 11138891113888 11138891113890 1113891

+ x6 y3( 1113857 times y3β( 11138571113858 1113859

(8)

1113944mk

m11113944

nk

n1xn ym( 1113857 times

ymβn

1113888 11138891113888 1113889 (9)

In equation (9) mk refers to the total number of callsmade by user X to Y while nk refers to the total number ofcalls made by user Y to X equation (9) finds intermediatevalue for CV ie 433 instead of maximum 6 or mini-mum 3 values

4 Social Tie Inference

We initially investigated direct social ties formed by CDRdata sets and compared them to the indirect social tiesformed based on common location using Algorithm 1 By

Scientific Programming 5

direct ties we mean calling or direct connection For ex-ample person A calls person B refers to a direct tie betweenA and B Algorithm 1 takes GV SNK and CDR data sets(social network) as inputs Furthermore the algorithm hastwo parts initially it finds the direct ties between two in-dividuals depending upon the SNK threshold value Sec-ondly it counts the presence of two individuals based onseveral parameters +e Calculate Cooccurrence Count()function finds the number of cooccurrences using equation(9) explained in the previous section Infer Social Ties()function finds the social connections depending upon CTKDT and SNK and inferred them as the social ties

41 CDR-Based Social Tie Inference A social tie is inferredbetween two persons if they are found together at severalsites numerous times +e inference algorithm identifies twosets of results ie direct social ties and inferred social tiesFor the cross-validation of results we correlate the direct tieresults with the inferred ones Precision and Recall evalu-ation measures are used to examine the results We tested allrecords based on threshold values K is the direct calls M isthe times of cooccurrence and G is the time frame gap valueWhile N as direct call count shows the degree of friendship

more value of N indicates the friendship strength Figure 5shows the Precision graph which contains four setsFigures 5(a)ndash5(d) +e whole data set is examined based onK and the value of M and v

In Figure 5(a) the value of K is 15 which represents theusers with direct calls between each other equals to or greaterthan 15 +e value of M is the number of cooccurrence fortwo different users +e Precision values are comparativelysignificantly less for M in the range of 0 to 10 In contrastthe value of Precision increases exponentially for the value ofM in the range of 10 to 30 +e higher value of M indicateshigher cooccurrence of users A positive correlation can beobserved between the values of M and Precision It infersthat cooccurrence is a significant attribute that affectspositively in identifying social ties All graphs in Figure 5have six different lines each line represents the different timegap ranges It can also be seen that the values of gap value 30minutes are having more Precision while the rest lines of 1hour 2 hours 6 hours 12 hours and 24 hours are having lessPrecision +is also clues that the strength of ties has aspecific effect on Precision Users having strong socialconnections most of the time are found together in certainareas +is pattern is explicitly observed by looking cooc-currence value M (20 to 40) and gap time frame G 30

y1

y2

y3

User X User Y

x1

x2

x3

x4

x5

x6

Figure 4 User X and user Y cooccurrence count comparison case

RequireSN Social NetworkGVTime Gap +reshold ValueSNK 2 5 10 15

EnsureDTDirect Social TiesIST Infer Social Ties

(1) while SNne 0 do(2) DTDirect Ties (SNK)(3) end while(4) while SNne 0 do(5) CTKCalculate Cooccurrence Count (DT GV)(6) IST Infer Social Ties (CTK DT SNK)(7) end while

ALGORITHM 1 Social tie inference

6 Scientific Programming

minutes Another positive correlation is found between thedegree of friendship and physical presence at a specific place

To see the effect of friendship strength we evaluatedresults for the four different K values ie 2 5 10 and 15 Atypical pattern is found in all the graphs shown in Figure 5 Itshows that Precision is less for people whose mutualpresence is less at different sites Also people with strongsocial ties spent less than 1 hr time together at a specificlocation To understand the graphrsquos actual meaning wequantify and reconcile with the actual direct social ties It isobserved that a positive correlation in results infers that

people with strong social connections often visit placestogether

Figure 6(a) represents the Recall results We tested andevaluated Recall based on the same measures as Precisionie direct calls cooccurrence and gap time frameFigure 6(a) states that the value of Recall is at a maximumwhen we consider a less number of commonplaces Aninverse trend is observed between the values of Recall andM specifically for the M value ranging between 0 and 3+esame as Precision Recall is also evaluated based on sixdifferent time gap values

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

M

Prec

ision

K = 15

(a)

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

M

Prec

ision

K = 10

(b)

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

M

Prec

ision

K = 5

(c)

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

M

Prec

ision

K = 2

(d)

Figure 5 +e Precision formed between two callers due to social contact depending on the number of times they have connected throughthe same base station within a specific time frame Figure 5(a) shows the Precision for the six different time frame lines whereas M is thenumber of cooccurrences and K is the number of direct calls (a) K 15 (b) K 10 (c) K 5 and (d) K 2

Scientific Programming 7

+is part of the research finds peoplersquos cooccurrencebased on the same base station connectivity in a specific timeframe and infers them as social ties Furthermore it crossrelates the inferred results with direct call results

42 Brightkite- and Gowalla-Based Social Tie InferenceBrightkite and Gowalla data sets contain direct social ties aswell as the check-in information of each user In our studywe investigated both data sets based on several dimensionsand found some of the very interesting facts During theanalysis we observed positive correlations between mutualfriend count values MuF and user check-in details InBrightkite MuF ranging 6 to 40 and in Gowalla MuF

ranging 25 to 90 shows the positive corelation with Preci-sion We also measure the effect of gap time value on thePrecision and Recall and evaluated results based on severalgap time values

Figures 7(a) and 7(c) of line graph show the relationbetween mutual friend count values MuF and Precisionwhile Figures 7(b) and 7(d) show the relation betweenmutual friend count values MuF and Recall In Figure 7results shows that people having a certain number ofmutual friends use to visit place together or with little gapof interval

+e social tie inference framework infers some of theabsent ties as social ties in the form of false-positive resultsTo further extract the actual meaning of such incorrect

0

01

02

03

04

05

06

07

08

09

1Re

call

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 40M

K = 15

(a)

0

01

02

03

04

05

06

07

08

09

1

Reca

ll

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 40M

K = 10

(b)

0

01

02

03

04

05

06

07

08

09

1

Reca

ll

0 5 10 15 20 25 30 35 40

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

M

K = 5

(c)

0

01

02

03

04

05

06

07

08

09

1Re

call

0 5 10 15 20 25 30 35 40

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

M

K = 2

(d)

Figure 6+e Recall formed between two callers due to social contact depending upon the number of times they have connected through thesame base station within a specific time frame Figure 6(a) shows the Recall for the six different time frame lines whereas M is the number ofcooccurrences and K is the number of direct calls (a) K 15 (b) K 10 (c) K 5 and (d) K 2

8 Scientific Programming

inferences we conducted a social activity and simulation+e false-positive results of the first part of the research serve asthe foundation for the second part Activity under the first partdata set is conducted and the false-positive results are examinedby studying the human cooccurrence patterns described in thenext section Suspicious Ties+is stage of research gave us a clueto further exploit the category of missing ties

5 Suspicious Ties

An absent tie can be inferred as a suspect tie if it satisfies thefollowing properties

(1) M number of mutual friends(2) C number of cooccurrence on different sites

In the CDR data set each cell is treated as a single cellof the base station Our model adopts the followingfour features for evaluation

(1) Base station(2) User ID(3) Gap time threshold(4) Call time stamp

+e following mathematical model explains theproblem and its formulationLet BS denotes a set bsi1113864 1113865 of bsn points and bsi iscalled as the distinct base station cellLet DU denotes a set dui1113864 1113865 of dun points and dui iscalled as the distinct user

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 420001020304050607080910

MuF

Prec

ision

(a)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 42MuF

0001020304050607080910

Reca

ll

(b)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Prec

ision

1 10 20 30 40 50 60 70 80 90 100

(c)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Reca

ll

1 10 20 30 40 50 60 70 80 90 100

(d)

Figure 7 +e Precision and Recall formed between two users due to social contact depending on the number of times they have checked into certain location within a specific time frame Figure 7(a) shows the Precision for the six different time frame lines whereas MuF is thenumber of mutual friend count (a) Brightkite Precision (b) Brightkite Recall (c) Gowalla Precision and (d) Gowalla Recall

Scientific Programming 9

Let TH denotes a set thi1113864 1113865 of thn points and thi iscalled as the threshold value for timeframeLet C denotes a set ci1113864 1113865 of cn points and ci is called asthe call information

ci (id ct) whereid as call id

ct as outgoing call time1113896 (10)

Let R denotes a set ri1113864 1113865 of rn points then

ri dui ci(id)( 1113857 times dui ci(id)( 11138571113858 1113859anddui ne dui (11)

Let RT denotes a set rti1113864 1113865 of rtn points then

rti dui ci(ct)( 1113857 minus dui cti(ct)( 11138571113858 1113859anddui ne dui (12)

Let S denotes a set si1113864 1113865 of sn points then

si ri isin bsi( 1113857and rti le thi( 11138571113858 1113859 (13)

wheresi is a set of elements that identifies distinct callersbased on the same base station connectivity and a definitenumber of calls in a specific time frame

6 Suspect Inference Framework

We studied the pattern of exceptional cases belong to thefalse-positive set and described a subset of the false-positive set as suspect ties Physical activity was designedand conducted to investigate the formation of suspectsocial ties Activity consisting of 50 people and a data setwas generated within almost 4-5 hours A basketballcourt was utilized for the activity Nine circles weredrawn physically on the basketball court assumed as thebase station cell Out of 50 people nine were directed toact as a base station +e boundary of each circle wasconsidered as a range of the base station cell Rests of the41 persons were directed to perform the following twosteps

61 Selection Step Initially each person from 41 people wasasked to choose two sets of friends One set as obviousfriends and the second set of hidden friends such that thesize of the hidden friend set should be at most 15 of theobvious friend set size eg if one person has five people inapparent friend set he can have no more than one hiddenfriend After the selection of both sets by each individualinformation was shared with one of our representatives

62 Operation Step In this phase each person was directedto follow the following rules

(1) You should not call your hidden friend

(2) You should call all of your obvious friends at leastonce

(3) You should conduct a maximum number of calls toyour closest obvious friend and second maximum toa second level obvious friend and likewise to the leastfriend

(4) You must try to meet your hidden friend as much aspossible physically

+e method of calling is like if person A wants to callperson B from base station B1 the person has to go to thebase station B1 and register a call with a person acting andstanding in base station B1 Respective base station personwill write and make an entry with five parameters ie CallerName Callee Name Time From Base Station Name and ToBase Station Name +e data set was gathered in the fol-lowing format An example is given below

CallerName

CalleeName Time Stamp

FromBase

Station

ToBase

Station

User 1 User 12 02032019101508 BS 7 BS 2

For the understanding of the variations and patterns thesame activity was also designed using simulation +e wholesimulation followed the same conditions and another dataset was generated using a random function Based on theactivity a framework is designed to separate a class ofsuspect ties Proposed inference framework work is designedand implemented using pipe and filter architecture shownin Figure 8 Algorithm 2 shows the implementation ofsuspicious tie inference framework explained in Figure 8+e framework takes the social network matrix countthreshold value gap time value and mutual friend countvalue as inputs and filters the result Initially CalculateLevels() function finds the five level depth information foreach distinct user Let us say if A calls B B calls C C calls Dand D calls E it implies that A 0 B 1 C 2 D 3 andE 4 represent five levels +is step ensures that all the levelshave distinct users Secondly Find Suspects() functionselects only those sets of users from level 1 and level 3 that donot have any direct calls and M number of mutual friendsFurthermore Calculate Subsocial Network() functiongenerates subgraphs using level 1 and level 3 detailsdepending upon the gap time value Results are filtered onthese bases of time gap value eg two users called someother user while connected to the same base station withinthe given time frame explained in the previous section Afterthat Infer Hidden Ties() function uses the proposed nor-malization method to find the number of the count definedin equation (9) and then all results are filtered according tothe cooccurrence count CV and the mutual friend thresholdvalue M Based on mentioned parameters and thresholdsAlgorithm 2 significantly identifies the subclass of missingties as suspicious ties

+e results of the activity and simulation are computedand evaluated using Precision and Recall measures

10 Scientific Programming

RequireSN Social NetworkGVTime Gap+reshold ValueCVCooccurrence Count +reshold ValueMMutual Friend Count

EnsureST Inferred Social Suspect Ties

(1) while SNne 0 do(2) Levels [5 n]Calculate Levels()(3) Suspects Find Suspects (Level 1 Level 3)(4) SGCalculate Subsocial Network (Suspects GV)(5) ST Infer Suspect Ties (SG CV M) using equation (9)(6) end while

ALGORITHM 2 Suspicious hidden social tie inference

Table 1 Evaluation of simulation and social activity

Gap timeMge 2 Mge 5

Simulation Social activity Simulation Social activityP R F1 P R F1 P R F1 P R F1

CVge 5

GVlt 10 012 068 020 012 100 021 013 062 021 012 079 021GVlt 20 010 042 016 015 079 025 015 031 020 013 042 020GVlt 30 021 022 021 071 059 064 025 018 021 023 028 025GV ge 30 034 023 027 100 012 021 030 010 015 032 019 024

CVge 10

GVlt 10 010 033 015 012 048 019 009 030 014 011 061 019GVlt 20 011 028 016 027 050 021 009 018 012 010 039 016GVlt 30 013 018 015 029 030 029 009 010 009 009 020 012GV ge 30 013 008 010 026 005 008 010 004 006 013 007 009

CVge 20

GVlt 10 004 018 007 008 023 012 005 029 009 011 039 017GVlt 20 007 008 007 005 024 008 004 011 006 010 028 015GVlt 30 004 006 005 009 009 009 003 005 004 009 019 012GV ge 30 002 001 001 001 001 001 001 001 001 001 005 002

lowastP Precision R Recall andF1 F1 Score

Social network

Count number of co-occurtence

Suspect social tiesMutual friend threshold

value

Count threshold value

Calculate sub social networks via

time gap threshold value

Calculate 0 1 2 3 4 level ties

Highlight missing ties between

level 1 and level 3

((xn ym) times (ymβn))mm=1

nn=1=

Figure 8 Suspect social tie inference framework

Scientific Programming 11

Evaluation results of simulation and social activity con-ducted are shown in Table 1 Precision Recall and F1 Scoremeasures are used to evaluate the framework that is furthercalculated based on the cooccurrence count CV mutualfriend count M and gap time GV parameter setting valuesF1 Score is calculated using equation (5) Definitions of therelated parameters are given as follows

TP Which were hidden friends and system infer ashidden friends

TN Which were not hidden friends and system inferas not hidden friends

FP Which were not hidden friends and system inferas hidden friends

FN Which were hidden friends and system infer asnot hidden friends

63 Results and Discussion We tested and evaluated allrecords based on the cooccurrence count value CV mutualfriend M and gap time value GV as the gap timeFigures 9(a)ndash9(d) show the evaluation of results for theactivity based on three different values of the cooccurrencecount value CV ie CV 5 10 and 15 and two different

values of the mutual friend M ie Mge 2 and ge5 LikewiseFigures 10(a)ndash10(d) show the evaluation of results for thesimulation data set along with mentioned CV and M valuesResults were generated on four values of gap time ie ge 30lt 30 lt 20 and lt 10

In Figures 9(a) and 9(b) Recall is maximum and Pre-cision is less where the value of GVge 30 CVge 5 and Mge 2+e system obtains a maximum number of relevant hiddenties along with false-positive results By GVge 30 CVge 5 andMge 2 it means that the gap between the two calls is 30minsor more while the count of cooccurred events is keptminimum five and mutual friend count as two or less +esystemrsquos performance drops when gap time is reduced tolt 30 lt 20 and lt 10 Even though there is a drop in true-positive values a significant drop in the false-positive valuescan be seen Results become concise with the variation inboth values of GV and M +e limited data set collectedusing activity highlights the occurrences of hidden tiesWhole activity and simulation were designed to get similarfields of data as CDR so that the proposed framework iscompatible with the CDR data set

Results shown in Figures 9 and 10 exhibit the existence ofhidden relationships +e parameter such as the gap timevalue GV helps to identify the time frame selection such that

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1Pr

ecisi

onM ge 2

(a)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

(b)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Prec

ision

M ge 5

(c)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 9 Evaluation of social activity using Precision and Recall

12 Scientific Programming

what gap value is suitable to get optimal performanceLikewise Figures 10(a) to 10(d) represent the Precision andRecall curves for the simulation data set Results are gatheredby setting the same values for the threshold parameters CVGV and M Figures 10(a) and 10(d) give the least values forPrecision and Recall which tells that if threshold values aretightened so does the count relevant retrieve results get lowOut of all the results Precision and Recall values for bothsimulation and activity data set showmaximum if thresholdvalues GVge 30 CVge 5 and Mge 2 are set We found someexciting dissimilarities in the results between simulation andactivity data set results during the complete evaluationprocess +e simulation results do not give higher Recall andPrecision value compared to activity data results We con-cluded that the data set of simulation is generated usingrandom function while the activity had the human hidingpatterns +ese results also help to infer human psychologyin building hidden ties +e simulation data set generatorworks based on the same constraints mentioned as rules ofthe activity However the key difference between simulationand activity is the selection of friends hidden friend and the

pattern of calling While conducting the simulation trivialvariation in results was observed as the random selection hasrandom patterns According to our findings simulation andactivity both exhibit patterns of hidden ties However ac-tivity results are more pronounced and significantly iden-tifying the hidden relationships

It is essential to highlight some critical questions anddependent variables that help to find hidden social tiesbetween two people for example why two people hide theirsocial ties Is this deliberate or unintentional action What ifthey are deliberately hiding their social tie for a purpose Insuch a scenario extracting a social tie for two people is a kindof intense problem It is a kind of investigation processwhich explores a clue to draw some relationship betweentwo people Investigation designates if two people areposting a picture on social media to be more cautious aboutnot identifying their social ties While if they are doing someprivate activity they will be less careful for example postinga picture on Facebook or Flickr compared to calling to aperson from a specific location Although extracting hiddensocial ties include various privacy issues We designed

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Prec

ision

M ge 2

CV ge 5CV ge 10CV ge 15

(a)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

CV ge 5CV ge 10CV ge 15

(b)

Prec

ision

0010203040506070809

1M ge 5

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

(c)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

0010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 10 Evaluation of simulation using Precision and Recall

Scientific Programming 13

activity and simulation that generated the data set by thecaller data record (CDR) data set We thoroughly investi-gated the patterns of connectivity and established aframework to infer social ties +is research has opened up anew direction further to explore connectivity in the SocialInternet of +ings (SIoT) specifically machine-to-machinedirect communication and machine-to-human or human-to-machine hidden relationships In many cases severalmachines work together but they rarely have a directconnection

7 Conclusion and Future Work

In this research we have examined the developments ofsocial connection patterns based on physical gathering Inthe first part of our research we explored the correlationbetween direct communication and two individualsrsquo phys-ical presence To check the systemrsquos performance andevaluation we examined all results by utilizing Precision andRecall evaluation measures We also present a periodicnormalization equation for the cooccurrence count In thesecond phase we propose the suspect tie inference frame-work False-positive results of the first part of the researchare the ground to the second part of the study +e proposedframework adopts pipe and filter architecture where thethreshold values control each filter +e frameworkrsquos fun-damental objective is to take the data set such as CDR (callerdata record) and infer suspect social ties depending uponthe specified threshold values Analyzing the results criti-cally we propose a theory that identifies suspect social tiesBesides this for comparison and evaluation we conductedreal-time human-based activity and simulation Keeping inmind the structure of the actual CDR data set the wholeactivity was designed and evaluated In contrast to existingwork our research focus is on hidden ties instead of thehidden actors In the future we are aiming to explore thehomophilic nature of suspect ties within the Social Internetof +ings (SIoT)

Data Availability

+e data used can be found at httpsnapstanfordedudataindexhtmllocnet

Conflicts of Interest

+e authors declare that they have no conflicts of interestregarding the publication of this work

Acknowledgments

+is work was supported by King Saud University SaudiArabia through research supporting project number RSP-2020184 Nauman Ali Khan acknowledges the support ofthe Chinese Government and Chinese Scholarship Council(CSC) for his PhD studies at the University of Science andTechnology China +is research work was partially sup-ported by Key Program of the National Natural ScienceFoundation of China (grant number 61631018)

References

[1] L Weng M Karsai N Perra F Menczer and A FlamminildquoAttention on weak ties in social and communication net-worksrdquo in Complex Spreading Phenomena in Social Systemspp 213ndash228 Springer Berlin Germany 2018

[2] A M Ortiz D Hussein S Park S N Han and N Crespildquo+e cluster between internet of things and social networksreview and research challengesrdquo IEEE Internet of ingsJournal vol 1 no 3 pp 206ndash215 2014

[3] P Luarn and Y-P Chiu ldquoKey variables to predict tie strengthon social network sitesrdquo Internet Research vol 25 no 2pp 218ndash238 2015

[4] D Goel and K Lang ldquoSocial ties and the job search of recentimmigrantsrdquo ILR Review vol 72 no 2 pp 355ndash381 2019

[5] H Kizgin A Jamal N Rana Y Dwivedi and VWeerakkodyldquo+e impact of social networking sites on socialization andpolitical engagement role of acculturationrdquo TechnologicalForecasting and Social Change vol 145 pp 503ndash512 2019

[6] P Suebvises ldquoSocial capital citizen participation in publicadministration and public sector performance in +ailandrdquoWorld Development vol 109 pp 236ndash248 2018

[7] Z Liu Y Qiao S TaoW Lin and J Yang ldquoAnalyzing humanmobility and social relationships from cellular network datardquoin Proceedings of the 2017 13th International Conference onNetwork and Service Management (CNSM) pp 1ndash6 TokyoJapan November 2017

[8] L Matosas-Lopez and A Romero-Ania ldquo+e efficiency ofsocial network services management in organizations an in-depth analysis applying machine learning algorithms andmultiple linear regressionsrdquo Applied Sciences vol 10 no 15p 5167 2020

[9] M S Granovetter ldquo+e strength of weak tiesrdquo in SocialNetworks pp 347ndash367 Elsevier Amsterdam Netherlands1977

[10] T Nguyen L Zhang and A Culotta ldquoEstimating tie strengthin follower networks to measure brand perceptionsrdquo inProceedings of the 2019 IEEEACM International Conferenceon Advances in Social Networks Analysis and Mining (ASO-NAM) pp 779ndash786 Vancouver Canada August 2019

[11] H Mattie K Engoslash-Monsen R Ling and J-P OnnelaldquoUnderstanding tie strength in social networks using a localbow tie frameworkrdquo Scientific Reports vol 8 no 1 p 93492018

[12] D J Crandall L Backstrom D Cosley S SuriD Huttenlocher and J Kleinberg ldquoInferring social ties fromgeographic coincidencesrdquo Proceedings of the NationalAcademy of Sciences vol 107 no 52 pp 22436ndash24441 2010

[13] E Cho S A Myers and J Leskovec ldquoFriendship and mo-bility user movement in location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Con-ference on Knowledge Discovery and Data Mining pp 1082ndash1090 San Diego CA USA August 2011

[14] A Znidarsic P Doreian and A Ferligoj ldquoAbsent ties in socialnetworks their treatments and blockmodeling outcomesrdquoAdvances in Methodology amp StatisticsMetodoloski Zvezkivol 9 no 2 2012

[15] S A Myers A Sharma P Gupta and J Lin ldquoInformationnetwork or social network the structure of the twitter followgraphrdquo in Proceedings of the 23rd International Conference onWorld Wide Web pp 493ndash498 Seoul Republic of KoreaApril 2014

14 Scientific Programming

[16] X Zuo J Blackburn N Kourtellis J Skvoretz andA Iamnitchi ldquo+e power of indirect social tiesrdquo 2014 httparxivorgabs14014234

[17] M A Brandatildeo and M M Moro ldquoTie strength in co-au-thorship social networks analyses metrics and a new com-putational modelrdquo in Proceedings of the Anais Estendidos doXXV Simposio Brasileiro de Sistemas Multimıdia e Webpp 17ndash20 Rio de Janeiro Brazil 2019

[18] H A Khanday A H Ganai and R Hashmy ldquoA comparativeanalysis of identifying influential users in online social net-worksrdquo in Proceedings of the 2018 International Conference onSoft-Computing and Network Security (ICSNS) pp 1ndash6Coimbatore India February 2018

[19] R Garcia-Guzman Y A Andrade-Ambriz M-A Ibarra-Manzano S Ledesma J C Gomez and D-L Almanza-Ojeda ldquoTrend-based categories recommendations and age-gender prediction for pinterest and twitter usersrdquo AppliedSciences vol 10 no 17 p 5957 2020

[20] A Talukder M G R Alam N H Tran D Niyato andC S Hong ldquoKnapsack-based reverse influence maximizationfor target marketing in social networksrdquo IEEE Access vol 7pp 44182ndash44198 2019

[21] N Ampazis T Emmanouilidis and F Sakketou ldquoA matrixfactorization algorithm for efficient recommendations insocial rating networks using constrained optimizationrdquoMachine Learning and Knowledge Extraction vol 1 no 3p 928 2019

[22] N Zhou X Zhang and S Wang ldquo+eme-aware socialstrength inference from spatiotemporal datardquo in Proceedingsof the International Conference on Web-Age InformationManagement pp 498ndash509 Macau China June 2014

[23] P A Grabowicz J J Ramasco B Gonccedilalves andV M Eguıluz ldquoEntangling mobility and interactions in socialmediardquo PLoS One vol 9 no 3 Article ID e92196 2014

[24] D Yang B Qu J Yang and P Cudre-Mauroux ldquoRevisitinguser mobility and social relationships in LBSNs a hypergraphembedding approachrdquo in Proceedings of the World Wide WebConference pp 2147ndash2157 San Francisco NY USA May2019

[25] R-H Li J Liu J X Yu H Chen and H Kitagawa ldquoCo-occurrence prediction in a large location-based social net-workrdquo Frontiers of Computer Science vol 7 no 2pp 185ndash194 2013

[26] V Jayadevan K Bharadwaj A Kumar and P KhandelwalldquoDiscovering local social groups using mobility datardquo Inter-national Journal of Computer Applications vol 120 no 212015

[27] A J Jara Y Bocchi and D Genoud ldquoSocial internet of thingsthe potential of the internet of things for defining humanbehavioursrdquo in Proceedings of the 2014 International Con-ference on Intelligent Networking and Collaborative Systemspp 581ndash585 Salerno Italy September 2014

[28] W A D M Jayathilaka K Qi Y Qin et al ldquoSignificance ofnanomaterials in wearables a review on wearable actuatorsand sensorsrdquo Advanced Materials vol 31 no 7 Article ID1805921 2019

[29] J Ren H Guo C Xu and Y Zhang ldquoServing at the edge ascalable iot architecture based on transparent computingrdquoIEEE Network vol 31 no 5 pp 96ndash105 2017

[30] I Seeber E Bittner R O Briggs et al ldquoMachines as team-mates a research agenda on ai in team collaborationrdquo In-formation amp Management vol 57 no 2 Article ID 1031742020

[31] T Pi L Cao P Lv Z Ye and H Wang ldquoInferring implicitsocial ties in mobile social networksrdquo in Proceedings of the2018 IEEE Wireless Communications and Networking Con-ference (WCNC) pp 1ndash6 Barcelona Spain April 2018

[32] N Tahir A Hassan M Asif and S Ahmad ldquoMCD mutuallyconnected community detection using clustering coefficientapproach in social networksrdquo in Proceedings of the 2019 2ndInternational Conference on Communication Computing andDigital Systems (C-CODE) pp 160ndash165 Islamabad PakistanMarch 2019

[33] V A Lewis C A MacGregor and R D Putnam ldquoReligionnetworks and neighborliness the impact of religious socialnetworks on civic engagementrdquo Social Science Researchvol 42 no 2 pp 331ndash346 2013

[34] M S Granovetter ldquo+e strength of weak ties a networktheory revisitedrdquo Sociologicaleory vol 1 pp 201ndash233 1983

[35] J Gui Z Zheng X Zhao and Z Qin ldquoStatistical propertiesand temporal properties of calling behaviorrdquo in Proceedings ofthe 2019 IEEE 5th International Conference on Computer andCommunications (ICCC) pp 1940ndash1944 Chengdu ChinaDecember 2019

[36] I H Sarker ldquoA machine learning based robust predictionmodel for real-life mobile phone datardquo Internet of ingsvol 5 pp 180ndash193 2019

[37] M S Bernstein E Bakshy M Burke and B KarrerldquoQuantifying the invisible audience in social networksrdquo inProceedings of the SIGCHI Conference on Human Factors inComputing Systems pp 21ndash30 Paris France April 2013

[38] R Montasari R Hill V Carpenter and F Montaseri ldquoDigitalforensic investigation of social media acquisition and analysisof digital evidencerdquo International Journal of Strategic Engi-neering vol 2 no 1 pp 52ndash60 2019

[39] M Yip N Shadbolt and C Webber ldquoStructural analysis ofonline criminal social networksrdquo in Proceedings of the 2012IEEE International Conference on Intelligence and SecurityInformatics pp 60ndash65 Arlington VA USA June 2012

[40] Z Zhao C Li X Zhang F Chiclana and E H Viedma ldquoAnincremental method to detect communities in dynamicevolving social networksrdquo Knowledge-Based Systems vol 163pp 404ndash415 2019

[41] S Vosoughi D Roy and S Aral ldquo+e spread of true and falsenews onlinerdquo Science vol 359 no 6380 pp 1146ndash1151 2018

[42] Y Fang J Gao Z Liu and C Huang ldquoDetecting cyber threatevent from twitter using IDCNN and BILSTMrdquo AppliedSciences vol 10 no 17 p 5922 2020

[43] P Lu L Zhang X Liu J Yao and Z Zhu ldquoHighly efficientdata migration and backup for big data applications in elasticoptical inter-data-center networksrdquo IEEE Network vol 29no 5 pp 36ndash42 2015

[44] O Sharif M Hoque A Kayes R Nowrozy and I SarkerldquoDetecting suspicious texts using machine learning tech-niquesrdquo Applied Sciences vol 10 no 18 p 6527 2020

[45] K C Roy M Cebrian and S Hasan ldquoQuantifying humanmobility resilience to extreme events using geo-located socialmedia datardquo EPJ Data Science vol 8 no 1 p 18 2019

[46] A Squicciarini S Karumanchi D Lin and N DesistoldquoIdentifying hidden social circles for advanced privacy con-figurationrdquo Computers amp Security vol 41 pp 40ndash51 2014

[47] L Chen F W Crawford and A Karbasi ldquoSeeing the unseennetwork Inferring hidden social ties from respondent-drivensamplingrdquo in Proceedings of the irtieth AAAI Conference onArtificial Intelligence pp 1174ndash1180 Phoenix AZ USAFebruary 2016

Scientific Programming 15

[48] J Leskovec and A Krevl ldquoSNAP datasets stanford largenetwork dataset collectionrdquo 2014 httpsnapstanfordedudata

[49] X Zheng Y Wang and M A Orgun ldquoContextual sub-network extraction in contextual social networksrdquo in Pro-ceedings of the 2015 IEEE TrustcomBigDataSEISPApp 119ndash126 Helsinki Finland August 2015

[50] A Madani and M Marjan ldquoMining social networks to dis-cover ego sub-networksrdquo in Proceedings of the 2016 3rd MECInternational Conference on Big Data and Smart City(ICBDSC) pp 1ndash5 Muscat Oman March 2016

16 Scientific Programming

Page 3: InferringTiesinSocialIoTUsingLocation-BasedNetworksand ...3 JP]V3 ]4JJV3 Brightkite and Gowalla are openly available location-basedsocialnetworkdatasets[13,48].Bothdatasetsare gathered

seeing the physical and network properties (geosocialproperties) [23 24] An effective prediction technique wasproposed to find the typical patterns of two users bycomparing the check-in details [24 25] Area significance ismeasured using a weight-assigningmethod by incorporatingtwo usersrsquo cooccurrence for a specific area A coffee shop is amore significant area as compared to an ordinary place [26]+e scoring mechanism helps in categorizing and labellingof social ties [10] Inference about the any social network isincomplete if associated features are neglected +e baselineof any social network is the single social connection betweentwo people In the state of the art social ties are generallycategorized as (1) strong ties (2) weak ties and (3) absent

ties [9 17] whereas the strength of tie depends upon (a)amount of time (b) emotional intensity (c) intimacy var-iables and (d) social distance [3 10] +e repeated presenceof two individuals in a specific geographical location within alimited time also infers a social connection [7] +e strengthof the social tie is directly proportional to the happening ofsuch high cooccurred events

IoT has emerged as one of the most powerful and im-pressive technological research domains [2] IoT presents anovel connectivity concept where machines can equallycollaborate with humans based on actuators and sensors[27 28] One research forecasts that smart devices such aselectronic medical kits and smart watches will reach up tothe worth of USD 160 billion by the end of 2026 [28] +ecommunication network between smart devices and humanforms Social Internet of +ings (SIoT) and further opens upnew research challenges for researchers Managing problemssuch as data scalability velocity and variety are few of theemerging issues in SIoT [29] Understanding social tiesamong human-to-human human-to-machines and ma-chines-to-machines helps to quantify the network perfor-mance issue [30]

Social ties are the backbone of any social network [31]Formation and deformation of social ties affect communitiesin a network [32] Besides social tie strength factors such aslocation emotion situation age gender religion person-ality and many more have a substantial impact on the socialconnection [10 33] Granovetter highlighted the strongconnection between weak social connections and findingjobs [34] In the literature sources of data commonly usedfor social analysis are call logs [35 36] emails and social-networking websites [5] In the literature extensive chal-lenges associated with the integration of visible and invisiblenetworks are highlighted [37] Investigating criminal socialnetworks using limited clues is one of the emerging researchareas of social network analysis (SNA) [38 39]

+X 440 pm Feb 11+X 221 pm Feb 18+Y 752 pm Feb 18

+X 944 pm Feb 28+Y 358 pm Feb 28

+X 732 pm Feb 12+Y 742 pm Feb 12

+X 630 pm Feb 19

+X 1117 pm Feb 7+Y 1002 pm Feb 7

+Y 122 pm Feb 11 +Y 813 pm Feb 11

+X 740 pm Feb 6+Y 122 pm Feb 6+X 307 pm Feb 9+Y 908 pm Feb 9

Figure 1 Illustration of social tie example for two individuals that have cooccurrence while connected to various base stations of differenttimes

Absent tie Nonresponse

B1

B2

B3

B4

B5

B6

G1

G2

G3

G4

G5

B1 B2 B3 B4 B5 B6 G1 G2 G3 G4 G5

Figure 2 Boys and girls liking social network (absent tie andnonresponse)

Scientific Programming 3

Statistically there are always some hidden or visibleassociated parameters among cooccurred events Socialnetwork analysis is performed to explore such intriguingknowledge In the physical world social network analysis isutilized in job searching [4] studying urban life psychologyinvestigation of guilt association [12] finding communities[40] spreading of news [41] and influential networks[18 42] In the recent era of information and technologiesmassive logs are generating for each person eg call recordsbank transactions online purchase records daily emailsCCTV cameras and much more mediums [7 43 44] Incontrast to the physical world suchmediums further concisethe accuracy of results by highlighting such associatedfeatures Despite numerous data sources there is no optimalprocedure to quantify stochastic human nature and socialnetwork evolution [45]

+e grouping method identifies hidden social groupswhich further explores the friend circles and focuses underhigh privacy settings [46] Another research explores thehidden social ties using respondent sampling [47] In theliterature hidden social ties refer to that population which ishard to access +e population that tries to hide from thesocial network is hidden in a network [47] In our researchsuspect ties mean actors in a social network that are presentand accessible but they try to hide their social connectionsOur second part of the research explores the suspicious tieswithin the existing network instead of a hidden populationin a social network

3 Data Set Characteristics andEvaluation Measure

31 Data Set Descriptions In our research we incorporatedthree large location-based data sets ie CDR Brightkiteand Gowalla [13] +e CDR large data set used in this studywas provided by one of the Chinese mobile telecommu-nication operator companies +e data set contains 202000subscribes along with user demographic informationCalling detailed records contain six months (June2014mdashDecember 2014) and calling detailed records con-tain these 202 000 subscribes which have 221 451 169records Each record of the data set is represented in thefollowing format

Caller ID Call Type Callee ID Time

Duration LAC ID CELL ID

Brightkite and Gowalla are openly available location-based social network data sets [13 48] Both data sets aregathered using the online social-networking websitesWebsites maintain user check-in data by fetching mobileGPS location data +ese services use to help people infinding the nearby users and to build social connectionBrightkite contains 58228 nodes and 214078 edges andGowalla contains 196591 nodes and 950327 edges Otherthan social network data both data sets also contain directsocial tie data

32 Abbreviations and Evaluation Measures Figure 3 statesthe example of the social network having a case of suspectactors and their hidden ties Actors with several mutualfriends but do not have direct connection may have a secretconnection +is information helps in identifying them as asuspect tie +e social network evolves and new connectionsexpand the scope of the social network One social networkis a combination of multiple social networks involvingdifferent individuals [31] A social network can be slicedbased on starting and ending time Social networks can alsobe divided into subsocial networks monthwise if it has beendeveloped over one year [49 50] Social network slicinghelps our research further to explore the missing ties be-tween friends of friend relationships

+e following list of abbreviations is used for thequantification processes of Precision and Recall which willalso be used in several parts of the paper

SK calling record+e calling records represent the actual number ofdirect calls that occur between two users +e value ofSK is counted to identify the social tie between twousersCK times of cooccurrenceTime cooccurrence represents the presence of two usersin the range of a common base station We counted CKwhen two users were connected to a common basestation and they called any other userG time minus frame gap value+e time-frame gap value represents the time intervalbetween two usersrsquo calls while connected to a specificbase station For example user X calls someone at 2 pmand user Y calls someone else at 4 pm in this case thegap between the calls is 2 hr To quantify the results and

Actor

Suspect actor

Mutual actor

TieHidden tie

Figure 3 Hidden social ties referring suspects

4 Scientific Programming

evaluate Precision and Recall values we experimentedon the following set of gap valuesG 30mins 1 hr 2 hr 6 hr 12 hr and 24 hrSNK threshold for direct calls+e threshold for direct calls represents the set ofthreshold values for assessing direct calls between twousers We evaluated Precision and Recall curves on thebases of the following set of threshold valuesSNK 2 5 10 and 15

CTK threshold for cooccurrence+e threshold for cooccurrence represents the set ofthreshold values for the two usersrsquo presence in a specificbase station We tested the performance on the CTKthreshold values ranges from 1 to 40MuF threshold formutual friend

+e threshold for mutual friend represents the set ofthreshold values for the two usersrsquo mutual friend We testedthe performance on the MuF threshold values ranges from 1to 100

Precision TP

TP + FP (2)

Recall TP

TP + FN (3)

where as

TP (SNKge SK andCTKgeCK)

FP (SNKlt SK andCTKgeCK)

FN (SNKlt SK andCTKltCK)

TN (SNKge SK andCTKltCK)

⎧⎪⎪⎪⎪⎨

⎪⎪⎪⎪⎩

(4)

F1 score 2 timesPrecision times RecallPrecision + Recall

(5)

33 Cooccurrence Count Normalization MeasureCooccurrence count value CV tells the presence of two usersin the region of one base station An issue related to CVcounting is explained and resolved using an example for thetwo users X and Y shown in Figure 4 We counted CVwhentwo users were connected to a common base station andthey called any other user in a specific time frame +eexample is shown in Figure 4 states the call log details ofusers X and Y gathered in a time frame T

Let X x1 x2 x3 x4 x5 x61113864 1113865 and Y y1 y2 y31113864 1113865 then

CV α1 α2 α3 αi1113864 1113865 (6)

where αi TimeGap(xi Y) i 1 to 6 j 1 to 3

TimeGap xi Y( 1113857 min xi y1( 1113857 xi y2( 1113857 xi yj1113872 11138731113966 1113967

(7)

In Figure 4 x1 and x2 call times have the closest calltime to y1 call In this case a count value of CV can becalculated as 2 However such counting may lead to awrong inference It is the same as if one person calls onceand another person calls n-times within a specific timeframe equals n as the count value To resolve this issue wepropose a normalization equation that decreases the countvalue periodically We introduce Beta (β) value as a pe-riodic normalizing factor

Let X denotes a set of calls by user X and Y denotes a setof calls by user Yif (XgeY) thenX times Y elseY times X

According to the example stated in Figure 4 we assumedβ for set Y (y1β y2β y3β)

For first match value of β 1For the second match values of β β2Likewise for the nth match value of β βn

x1 y1( 1113857 times y1β( 1113857( 1113857 + x2 y1( 1113857 timesy1β2

1113888 11138891113888 11138891113890 1113891

+ x3 y2( 1113857 times y2β( 1113857( 1113857 + x4 y2( 1113857 timesy2β2

1113888 11138891113888 1113889 + x5 y2( 1113857 timesy2β3

1113888 11138891113888 11138891113890 1113891

+ x6 y3( 1113857 times y3β( 11138571113858 1113859

(8)

1113944mk

m11113944

nk

n1xn ym( 1113857 times

ymβn

1113888 11138891113888 1113889 (9)

In equation (9) mk refers to the total number of callsmade by user X to Y while nk refers to the total number ofcalls made by user Y to X equation (9) finds intermediatevalue for CV ie 433 instead of maximum 6 or mini-mum 3 values

4 Social Tie Inference

We initially investigated direct social ties formed by CDRdata sets and compared them to the indirect social tiesformed based on common location using Algorithm 1 By

Scientific Programming 5

direct ties we mean calling or direct connection For ex-ample person A calls person B refers to a direct tie betweenA and B Algorithm 1 takes GV SNK and CDR data sets(social network) as inputs Furthermore the algorithm hastwo parts initially it finds the direct ties between two in-dividuals depending upon the SNK threshold value Sec-ondly it counts the presence of two individuals based onseveral parameters +e Calculate Cooccurrence Count()function finds the number of cooccurrences using equation(9) explained in the previous section Infer Social Ties()function finds the social connections depending upon CTKDT and SNK and inferred them as the social ties

41 CDR-Based Social Tie Inference A social tie is inferredbetween two persons if they are found together at severalsites numerous times +e inference algorithm identifies twosets of results ie direct social ties and inferred social tiesFor the cross-validation of results we correlate the direct tieresults with the inferred ones Precision and Recall evalu-ation measures are used to examine the results We tested allrecords based on threshold values K is the direct calls M isthe times of cooccurrence and G is the time frame gap valueWhile N as direct call count shows the degree of friendship

more value of N indicates the friendship strength Figure 5shows the Precision graph which contains four setsFigures 5(a)ndash5(d) +e whole data set is examined based onK and the value of M and v

In Figure 5(a) the value of K is 15 which represents theusers with direct calls between each other equals to or greaterthan 15 +e value of M is the number of cooccurrence fortwo different users +e Precision values are comparativelysignificantly less for M in the range of 0 to 10 In contrastthe value of Precision increases exponentially for the value ofM in the range of 10 to 30 +e higher value of M indicateshigher cooccurrence of users A positive correlation can beobserved between the values of M and Precision It infersthat cooccurrence is a significant attribute that affectspositively in identifying social ties All graphs in Figure 5have six different lines each line represents the different timegap ranges It can also be seen that the values of gap value 30minutes are having more Precision while the rest lines of 1hour 2 hours 6 hours 12 hours and 24 hours are having lessPrecision +is also clues that the strength of ties has aspecific effect on Precision Users having strong socialconnections most of the time are found together in certainareas +is pattern is explicitly observed by looking cooc-currence value M (20 to 40) and gap time frame G 30

y1

y2

y3

User X User Y

x1

x2

x3

x4

x5

x6

Figure 4 User X and user Y cooccurrence count comparison case

RequireSN Social NetworkGVTime Gap +reshold ValueSNK 2 5 10 15

EnsureDTDirect Social TiesIST Infer Social Ties

(1) while SNne 0 do(2) DTDirect Ties (SNK)(3) end while(4) while SNne 0 do(5) CTKCalculate Cooccurrence Count (DT GV)(6) IST Infer Social Ties (CTK DT SNK)(7) end while

ALGORITHM 1 Social tie inference

6 Scientific Programming

minutes Another positive correlation is found between thedegree of friendship and physical presence at a specific place

To see the effect of friendship strength we evaluatedresults for the four different K values ie 2 5 10 and 15 Atypical pattern is found in all the graphs shown in Figure 5 Itshows that Precision is less for people whose mutualpresence is less at different sites Also people with strongsocial ties spent less than 1 hr time together at a specificlocation To understand the graphrsquos actual meaning wequantify and reconcile with the actual direct social ties It isobserved that a positive correlation in results infers that

people with strong social connections often visit placestogether

Figure 6(a) represents the Recall results We tested andevaluated Recall based on the same measures as Precisionie direct calls cooccurrence and gap time frameFigure 6(a) states that the value of Recall is at a maximumwhen we consider a less number of commonplaces Aninverse trend is observed between the values of Recall andM specifically for the M value ranging between 0 and 3+esame as Precision Recall is also evaluated based on sixdifferent time gap values

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

M

Prec

ision

K = 15

(a)

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

M

Prec

ision

K = 10

(b)

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

M

Prec

ision

K = 5

(c)

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

M

Prec

ision

K = 2

(d)

Figure 5 +e Precision formed between two callers due to social contact depending on the number of times they have connected throughthe same base station within a specific time frame Figure 5(a) shows the Precision for the six different time frame lines whereas M is thenumber of cooccurrences and K is the number of direct calls (a) K 15 (b) K 10 (c) K 5 and (d) K 2

Scientific Programming 7

+is part of the research finds peoplersquos cooccurrencebased on the same base station connectivity in a specific timeframe and infers them as social ties Furthermore it crossrelates the inferred results with direct call results

42 Brightkite- and Gowalla-Based Social Tie InferenceBrightkite and Gowalla data sets contain direct social ties aswell as the check-in information of each user In our studywe investigated both data sets based on several dimensionsand found some of the very interesting facts During theanalysis we observed positive correlations between mutualfriend count values MuF and user check-in details InBrightkite MuF ranging 6 to 40 and in Gowalla MuF

ranging 25 to 90 shows the positive corelation with Preci-sion We also measure the effect of gap time value on thePrecision and Recall and evaluated results based on severalgap time values

Figures 7(a) and 7(c) of line graph show the relationbetween mutual friend count values MuF and Precisionwhile Figures 7(b) and 7(d) show the relation betweenmutual friend count values MuF and Recall In Figure 7results shows that people having a certain number ofmutual friends use to visit place together or with little gapof interval

+e social tie inference framework infers some of theabsent ties as social ties in the form of false-positive resultsTo further extract the actual meaning of such incorrect

0

01

02

03

04

05

06

07

08

09

1Re

call

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 40M

K = 15

(a)

0

01

02

03

04

05

06

07

08

09

1

Reca

ll

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 40M

K = 10

(b)

0

01

02

03

04

05

06

07

08

09

1

Reca

ll

0 5 10 15 20 25 30 35 40

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

M

K = 5

(c)

0

01

02

03

04

05

06

07

08

09

1Re

call

0 5 10 15 20 25 30 35 40

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

M

K = 2

(d)

Figure 6+e Recall formed between two callers due to social contact depending upon the number of times they have connected through thesame base station within a specific time frame Figure 6(a) shows the Recall for the six different time frame lines whereas M is the number ofcooccurrences and K is the number of direct calls (a) K 15 (b) K 10 (c) K 5 and (d) K 2

8 Scientific Programming

inferences we conducted a social activity and simulation+e false-positive results of the first part of the research serve asthe foundation for the second part Activity under the first partdata set is conducted and the false-positive results are examinedby studying the human cooccurrence patterns described in thenext section Suspicious Ties+is stage of research gave us a clueto further exploit the category of missing ties

5 Suspicious Ties

An absent tie can be inferred as a suspect tie if it satisfies thefollowing properties

(1) M number of mutual friends(2) C number of cooccurrence on different sites

In the CDR data set each cell is treated as a single cellof the base station Our model adopts the followingfour features for evaluation

(1) Base station(2) User ID(3) Gap time threshold(4) Call time stamp

+e following mathematical model explains theproblem and its formulationLet BS denotes a set bsi1113864 1113865 of bsn points and bsi iscalled as the distinct base station cellLet DU denotes a set dui1113864 1113865 of dun points and dui iscalled as the distinct user

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 420001020304050607080910

MuF

Prec

ision

(a)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 42MuF

0001020304050607080910

Reca

ll

(b)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Prec

ision

1 10 20 30 40 50 60 70 80 90 100

(c)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Reca

ll

1 10 20 30 40 50 60 70 80 90 100

(d)

Figure 7 +e Precision and Recall formed between two users due to social contact depending on the number of times they have checked into certain location within a specific time frame Figure 7(a) shows the Precision for the six different time frame lines whereas MuF is thenumber of mutual friend count (a) Brightkite Precision (b) Brightkite Recall (c) Gowalla Precision and (d) Gowalla Recall

Scientific Programming 9

Let TH denotes a set thi1113864 1113865 of thn points and thi iscalled as the threshold value for timeframeLet C denotes a set ci1113864 1113865 of cn points and ci is called asthe call information

ci (id ct) whereid as call id

ct as outgoing call time1113896 (10)

Let R denotes a set ri1113864 1113865 of rn points then

ri dui ci(id)( 1113857 times dui ci(id)( 11138571113858 1113859anddui ne dui (11)

Let RT denotes a set rti1113864 1113865 of rtn points then

rti dui ci(ct)( 1113857 minus dui cti(ct)( 11138571113858 1113859anddui ne dui (12)

Let S denotes a set si1113864 1113865 of sn points then

si ri isin bsi( 1113857and rti le thi( 11138571113858 1113859 (13)

wheresi is a set of elements that identifies distinct callersbased on the same base station connectivity and a definitenumber of calls in a specific time frame

6 Suspect Inference Framework

We studied the pattern of exceptional cases belong to thefalse-positive set and described a subset of the false-positive set as suspect ties Physical activity was designedand conducted to investigate the formation of suspectsocial ties Activity consisting of 50 people and a data setwas generated within almost 4-5 hours A basketballcourt was utilized for the activity Nine circles weredrawn physically on the basketball court assumed as thebase station cell Out of 50 people nine were directed toact as a base station +e boundary of each circle wasconsidered as a range of the base station cell Rests of the41 persons were directed to perform the following twosteps

61 Selection Step Initially each person from 41 people wasasked to choose two sets of friends One set as obviousfriends and the second set of hidden friends such that thesize of the hidden friend set should be at most 15 of theobvious friend set size eg if one person has five people inapparent friend set he can have no more than one hiddenfriend After the selection of both sets by each individualinformation was shared with one of our representatives

62 Operation Step In this phase each person was directedto follow the following rules

(1) You should not call your hidden friend

(2) You should call all of your obvious friends at leastonce

(3) You should conduct a maximum number of calls toyour closest obvious friend and second maximum toa second level obvious friend and likewise to the leastfriend

(4) You must try to meet your hidden friend as much aspossible physically

+e method of calling is like if person A wants to callperson B from base station B1 the person has to go to thebase station B1 and register a call with a person acting andstanding in base station B1 Respective base station personwill write and make an entry with five parameters ie CallerName Callee Name Time From Base Station Name and ToBase Station Name +e data set was gathered in the fol-lowing format An example is given below

CallerName

CalleeName Time Stamp

FromBase

Station

ToBase

Station

User 1 User 12 02032019101508 BS 7 BS 2

For the understanding of the variations and patterns thesame activity was also designed using simulation +e wholesimulation followed the same conditions and another dataset was generated using a random function Based on theactivity a framework is designed to separate a class ofsuspect ties Proposed inference framework work is designedand implemented using pipe and filter architecture shownin Figure 8 Algorithm 2 shows the implementation ofsuspicious tie inference framework explained in Figure 8+e framework takes the social network matrix countthreshold value gap time value and mutual friend countvalue as inputs and filters the result Initially CalculateLevels() function finds the five level depth information foreach distinct user Let us say if A calls B B calls C C calls Dand D calls E it implies that A 0 B 1 C 2 D 3 andE 4 represent five levels +is step ensures that all the levelshave distinct users Secondly Find Suspects() functionselects only those sets of users from level 1 and level 3 that donot have any direct calls and M number of mutual friendsFurthermore Calculate Subsocial Network() functiongenerates subgraphs using level 1 and level 3 detailsdepending upon the gap time value Results are filtered onthese bases of time gap value eg two users called someother user while connected to the same base station withinthe given time frame explained in the previous section Afterthat Infer Hidden Ties() function uses the proposed nor-malization method to find the number of the count definedin equation (9) and then all results are filtered according tothe cooccurrence count CV and the mutual friend thresholdvalue M Based on mentioned parameters and thresholdsAlgorithm 2 significantly identifies the subclass of missingties as suspicious ties

+e results of the activity and simulation are computedand evaluated using Precision and Recall measures

10 Scientific Programming

RequireSN Social NetworkGVTime Gap+reshold ValueCVCooccurrence Count +reshold ValueMMutual Friend Count

EnsureST Inferred Social Suspect Ties

(1) while SNne 0 do(2) Levels [5 n]Calculate Levels()(3) Suspects Find Suspects (Level 1 Level 3)(4) SGCalculate Subsocial Network (Suspects GV)(5) ST Infer Suspect Ties (SG CV M) using equation (9)(6) end while

ALGORITHM 2 Suspicious hidden social tie inference

Table 1 Evaluation of simulation and social activity

Gap timeMge 2 Mge 5

Simulation Social activity Simulation Social activityP R F1 P R F1 P R F1 P R F1

CVge 5

GVlt 10 012 068 020 012 100 021 013 062 021 012 079 021GVlt 20 010 042 016 015 079 025 015 031 020 013 042 020GVlt 30 021 022 021 071 059 064 025 018 021 023 028 025GV ge 30 034 023 027 100 012 021 030 010 015 032 019 024

CVge 10

GVlt 10 010 033 015 012 048 019 009 030 014 011 061 019GVlt 20 011 028 016 027 050 021 009 018 012 010 039 016GVlt 30 013 018 015 029 030 029 009 010 009 009 020 012GV ge 30 013 008 010 026 005 008 010 004 006 013 007 009

CVge 20

GVlt 10 004 018 007 008 023 012 005 029 009 011 039 017GVlt 20 007 008 007 005 024 008 004 011 006 010 028 015GVlt 30 004 006 005 009 009 009 003 005 004 009 019 012GV ge 30 002 001 001 001 001 001 001 001 001 001 005 002

lowastP Precision R Recall andF1 F1 Score

Social network

Count number of co-occurtence

Suspect social tiesMutual friend threshold

value

Count threshold value

Calculate sub social networks via

time gap threshold value

Calculate 0 1 2 3 4 level ties

Highlight missing ties between

level 1 and level 3

((xn ym) times (ymβn))mm=1

nn=1=

Figure 8 Suspect social tie inference framework

Scientific Programming 11

Evaluation results of simulation and social activity con-ducted are shown in Table 1 Precision Recall and F1 Scoremeasures are used to evaluate the framework that is furthercalculated based on the cooccurrence count CV mutualfriend count M and gap time GV parameter setting valuesF1 Score is calculated using equation (5) Definitions of therelated parameters are given as follows

TP Which were hidden friends and system infer ashidden friends

TN Which were not hidden friends and system inferas not hidden friends

FP Which were not hidden friends and system inferas hidden friends

FN Which were hidden friends and system infer asnot hidden friends

63 Results and Discussion We tested and evaluated allrecords based on the cooccurrence count value CV mutualfriend M and gap time value GV as the gap timeFigures 9(a)ndash9(d) show the evaluation of results for theactivity based on three different values of the cooccurrencecount value CV ie CV 5 10 and 15 and two different

values of the mutual friend M ie Mge 2 and ge5 LikewiseFigures 10(a)ndash10(d) show the evaluation of results for thesimulation data set along with mentioned CV and M valuesResults were generated on four values of gap time ie ge 30lt 30 lt 20 and lt 10

In Figures 9(a) and 9(b) Recall is maximum and Pre-cision is less where the value of GVge 30 CVge 5 and Mge 2+e system obtains a maximum number of relevant hiddenties along with false-positive results By GVge 30 CVge 5 andMge 2 it means that the gap between the two calls is 30minsor more while the count of cooccurred events is keptminimum five and mutual friend count as two or less +esystemrsquos performance drops when gap time is reduced tolt 30 lt 20 and lt 10 Even though there is a drop in true-positive values a significant drop in the false-positive valuescan be seen Results become concise with the variation inboth values of GV and M +e limited data set collectedusing activity highlights the occurrences of hidden tiesWhole activity and simulation were designed to get similarfields of data as CDR so that the proposed framework iscompatible with the CDR data set

Results shown in Figures 9 and 10 exhibit the existence ofhidden relationships +e parameter such as the gap timevalue GV helps to identify the time frame selection such that

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1Pr

ecisi

onM ge 2

(a)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

(b)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Prec

ision

M ge 5

(c)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 9 Evaluation of social activity using Precision and Recall

12 Scientific Programming

what gap value is suitable to get optimal performanceLikewise Figures 10(a) to 10(d) represent the Precision andRecall curves for the simulation data set Results are gatheredby setting the same values for the threshold parameters CVGV and M Figures 10(a) and 10(d) give the least values forPrecision and Recall which tells that if threshold values aretightened so does the count relevant retrieve results get lowOut of all the results Precision and Recall values for bothsimulation and activity data set showmaximum if thresholdvalues GVge 30 CVge 5 and Mge 2 are set We found someexciting dissimilarities in the results between simulation andactivity data set results during the complete evaluationprocess +e simulation results do not give higher Recall andPrecision value compared to activity data results We con-cluded that the data set of simulation is generated usingrandom function while the activity had the human hidingpatterns +ese results also help to infer human psychologyin building hidden ties +e simulation data set generatorworks based on the same constraints mentioned as rules ofthe activity However the key difference between simulationand activity is the selection of friends hidden friend and the

pattern of calling While conducting the simulation trivialvariation in results was observed as the random selection hasrandom patterns According to our findings simulation andactivity both exhibit patterns of hidden ties However ac-tivity results are more pronounced and significantly iden-tifying the hidden relationships

It is essential to highlight some critical questions anddependent variables that help to find hidden social tiesbetween two people for example why two people hide theirsocial ties Is this deliberate or unintentional action What ifthey are deliberately hiding their social tie for a purpose Insuch a scenario extracting a social tie for two people is a kindof intense problem It is a kind of investigation processwhich explores a clue to draw some relationship betweentwo people Investigation designates if two people areposting a picture on social media to be more cautious aboutnot identifying their social ties While if they are doing someprivate activity they will be less careful for example postinga picture on Facebook or Flickr compared to calling to aperson from a specific location Although extracting hiddensocial ties include various privacy issues We designed

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Prec

ision

M ge 2

CV ge 5CV ge 10CV ge 15

(a)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

CV ge 5CV ge 10CV ge 15

(b)

Prec

ision

0010203040506070809

1M ge 5

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

(c)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

0010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 10 Evaluation of simulation using Precision and Recall

Scientific Programming 13

activity and simulation that generated the data set by thecaller data record (CDR) data set We thoroughly investi-gated the patterns of connectivity and established aframework to infer social ties +is research has opened up anew direction further to explore connectivity in the SocialInternet of +ings (SIoT) specifically machine-to-machinedirect communication and machine-to-human or human-to-machine hidden relationships In many cases severalmachines work together but they rarely have a directconnection

7 Conclusion and Future Work

In this research we have examined the developments ofsocial connection patterns based on physical gathering Inthe first part of our research we explored the correlationbetween direct communication and two individualsrsquo phys-ical presence To check the systemrsquos performance andevaluation we examined all results by utilizing Precision andRecall evaluation measures We also present a periodicnormalization equation for the cooccurrence count In thesecond phase we propose the suspect tie inference frame-work False-positive results of the first part of the researchare the ground to the second part of the study +e proposedframework adopts pipe and filter architecture where thethreshold values control each filter +e frameworkrsquos fun-damental objective is to take the data set such as CDR (callerdata record) and infer suspect social ties depending uponthe specified threshold values Analyzing the results criti-cally we propose a theory that identifies suspect social tiesBesides this for comparison and evaluation we conductedreal-time human-based activity and simulation Keeping inmind the structure of the actual CDR data set the wholeactivity was designed and evaluated In contrast to existingwork our research focus is on hidden ties instead of thehidden actors In the future we are aiming to explore thehomophilic nature of suspect ties within the Social Internetof +ings (SIoT)

Data Availability

+e data used can be found at httpsnapstanfordedudataindexhtmllocnet

Conflicts of Interest

+e authors declare that they have no conflicts of interestregarding the publication of this work

Acknowledgments

+is work was supported by King Saud University SaudiArabia through research supporting project number RSP-2020184 Nauman Ali Khan acknowledges the support ofthe Chinese Government and Chinese Scholarship Council(CSC) for his PhD studies at the University of Science andTechnology China +is research work was partially sup-ported by Key Program of the National Natural ScienceFoundation of China (grant number 61631018)

References

[1] L Weng M Karsai N Perra F Menczer and A FlamminildquoAttention on weak ties in social and communication net-worksrdquo in Complex Spreading Phenomena in Social Systemspp 213ndash228 Springer Berlin Germany 2018

[2] A M Ortiz D Hussein S Park S N Han and N Crespildquo+e cluster between internet of things and social networksreview and research challengesrdquo IEEE Internet of ingsJournal vol 1 no 3 pp 206ndash215 2014

[3] P Luarn and Y-P Chiu ldquoKey variables to predict tie strengthon social network sitesrdquo Internet Research vol 25 no 2pp 218ndash238 2015

[4] D Goel and K Lang ldquoSocial ties and the job search of recentimmigrantsrdquo ILR Review vol 72 no 2 pp 355ndash381 2019

[5] H Kizgin A Jamal N Rana Y Dwivedi and VWeerakkodyldquo+e impact of social networking sites on socialization andpolitical engagement role of acculturationrdquo TechnologicalForecasting and Social Change vol 145 pp 503ndash512 2019

[6] P Suebvises ldquoSocial capital citizen participation in publicadministration and public sector performance in +ailandrdquoWorld Development vol 109 pp 236ndash248 2018

[7] Z Liu Y Qiao S TaoW Lin and J Yang ldquoAnalyzing humanmobility and social relationships from cellular network datardquoin Proceedings of the 2017 13th International Conference onNetwork and Service Management (CNSM) pp 1ndash6 TokyoJapan November 2017

[8] L Matosas-Lopez and A Romero-Ania ldquo+e efficiency ofsocial network services management in organizations an in-depth analysis applying machine learning algorithms andmultiple linear regressionsrdquo Applied Sciences vol 10 no 15p 5167 2020

[9] M S Granovetter ldquo+e strength of weak tiesrdquo in SocialNetworks pp 347ndash367 Elsevier Amsterdam Netherlands1977

[10] T Nguyen L Zhang and A Culotta ldquoEstimating tie strengthin follower networks to measure brand perceptionsrdquo inProceedings of the 2019 IEEEACM International Conferenceon Advances in Social Networks Analysis and Mining (ASO-NAM) pp 779ndash786 Vancouver Canada August 2019

[11] H Mattie K Engoslash-Monsen R Ling and J-P OnnelaldquoUnderstanding tie strength in social networks using a localbow tie frameworkrdquo Scientific Reports vol 8 no 1 p 93492018

[12] D J Crandall L Backstrom D Cosley S SuriD Huttenlocher and J Kleinberg ldquoInferring social ties fromgeographic coincidencesrdquo Proceedings of the NationalAcademy of Sciences vol 107 no 52 pp 22436ndash24441 2010

[13] E Cho S A Myers and J Leskovec ldquoFriendship and mo-bility user movement in location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Con-ference on Knowledge Discovery and Data Mining pp 1082ndash1090 San Diego CA USA August 2011

[14] A Znidarsic P Doreian and A Ferligoj ldquoAbsent ties in socialnetworks their treatments and blockmodeling outcomesrdquoAdvances in Methodology amp StatisticsMetodoloski Zvezkivol 9 no 2 2012

[15] S A Myers A Sharma P Gupta and J Lin ldquoInformationnetwork or social network the structure of the twitter followgraphrdquo in Proceedings of the 23rd International Conference onWorld Wide Web pp 493ndash498 Seoul Republic of KoreaApril 2014

14 Scientific Programming

[16] X Zuo J Blackburn N Kourtellis J Skvoretz andA Iamnitchi ldquo+e power of indirect social tiesrdquo 2014 httparxivorgabs14014234

[17] M A Brandatildeo and M M Moro ldquoTie strength in co-au-thorship social networks analyses metrics and a new com-putational modelrdquo in Proceedings of the Anais Estendidos doXXV Simposio Brasileiro de Sistemas Multimıdia e Webpp 17ndash20 Rio de Janeiro Brazil 2019

[18] H A Khanday A H Ganai and R Hashmy ldquoA comparativeanalysis of identifying influential users in online social net-worksrdquo in Proceedings of the 2018 International Conference onSoft-Computing and Network Security (ICSNS) pp 1ndash6Coimbatore India February 2018

[19] R Garcia-Guzman Y A Andrade-Ambriz M-A Ibarra-Manzano S Ledesma J C Gomez and D-L Almanza-Ojeda ldquoTrend-based categories recommendations and age-gender prediction for pinterest and twitter usersrdquo AppliedSciences vol 10 no 17 p 5957 2020

[20] A Talukder M G R Alam N H Tran D Niyato andC S Hong ldquoKnapsack-based reverse influence maximizationfor target marketing in social networksrdquo IEEE Access vol 7pp 44182ndash44198 2019

[21] N Ampazis T Emmanouilidis and F Sakketou ldquoA matrixfactorization algorithm for efficient recommendations insocial rating networks using constrained optimizationrdquoMachine Learning and Knowledge Extraction vol 1 no 3p 928 2019

[22] N Zhou X Zhang and S Wang ldquo+eme-aware socialstrength inference from spatiotemporal datardquo in Proceedingsof the International Conference on Web-Age InformationManagement pp 498ndash509 Macau China June 2014

[23] P A Grabowicz J J Ramasco B Gonccedilalves andV M Eguıluz ldquoEntangling mobility and interactions in socialmediardquo PLoS One vol 9 no 3 Article ID e92196 2014

[24] D Yang B Qu J Yang and P Cudre-Mauroux ldquoRevisitinguser mobility and social relationships in LBSNs a hypergraphembedding approachrdquo in Proceedings of the World Wide WebConference pp 2147ndash2157 San Francisco NY USA May2019

[25] R-H Li J Liu J X Yu H Chen and H Kitagawa ldquoCo-occurrence prediction in a large location-based social net-workrdquo Frontiers of Computer Science vol 7 no 2pp 185ndash194 2013

[26] V Jayadevan K Bharadwaj A Kumar and P KhandelwalldquoDiscovering local social groups using mobility datardquo Inter-national Journal of Computer Applications vol 120 no 212015

[27] A J Jara Y Bocchi and D Genoud ldquoSocial internet of thingsthe potential of the internet of things for defining humanbehavioursrdquo in Proceedings of the 2014 International Con-ference on Intelligent Networking and Collaborative Systemspp 581ndash585 Salerno Italy September 2014

[28] W A D M Jayathilaka K Qi Y Qin et al ldquoSignificance ofnanomaterials in wearables a review on wearable actuatorsand sensorsrdquo Advanced Materials vol 31 no 7 Article ID1805921 2019

[29] J Ren H Guo C Xu and Y Zhang ldquoServing at the edge ascalable iot architecture based on transparent computingrdquoIEEE Network vol 31 no 5 pp 96ndash105 2017

[30] I Seeber E Bittner R O Briggs et al ldquoMachines as team-mates a research agenda on ai in team collaborationrdquo In-formation amp Management vol 57 no 2 Article ID 1031742020

[31] T Pi L Cao P Lv Z Ye and H Wang ldquoInferring implicitsocial ties in mobile social networksrdquo in Proceedings of the2018 IEEE Wireless Communications and Networking Con-ference (WCNC) pp 1ndash6 Barcelona Spain April 2018

[32] N Tahir A Hassan M Asif and S Ahmad ldquoMCD mutuallyconnected community detection using clustering coefficientapproach in social networksrdquo in Proceedings of the 2019 2ndInternational Conference on Communication Computing andDigital Systems (C-CODE) pp 160ndash165 Islamabad PakistanMarch 2019

[33] V A Lewis C A MacGregor and R D Putnam ldquoReligionnetworks and neighborliness the impact of religious socialnetworks on civic engagementrdquo Social Science Researchvol 42 no 2 pp 331ndash346 2013

[34] M S Granovetter ldquo+e strength of weak ties a networktheory revisitedrdquo Sociologicaleory vol 1 pp 201ndash233 1983

[35] J Gui Z Zheng X Zhao and Z Qin ldquoStatistical propertiesand temporal properties of calling behaviorrdquo in Proceedings ofthe 2019 IEEE 5th International Conference on Computer andCommunications (ICCC) pp 1940ndash1944 Chengdu ChinaDecember 2019

[36] I H Sarker ldquoA machine learning based robust predictionmodel for real-life mobile phone datardquo Internet of ingsvol 5 pp 180ndash193 2019

[37] M S Bernstein E Bakshy M Burke and B KarrerldquoQuantifying the invisible audience in social networksrdquo inProceedings of the SIGCHI Conference on Human Factors inComputing Systems pp 21ndash30 Paris France April 2013

[38] R Montasari R Hill V Carpenter and F Montaseri ldquoDigitalforensic investigation of social media acquisition and analysisof digital evidencerdquo International Journal of Strategic Engi-neering vol 2 no 1 pp 52ndash60 2019

[39] M Yip N Shadbolt and C Webber ldquoStructural analysis ofonline criminal social networksrdquo in Proceedings of the 2012IEEE International Conference on Intelligence and SecurityInformatics pp 60ndash65 Arlington VA USA June 2012

[40] Z Zhao C Li X Zhang F Chiclana and E H Viedma ldquoAnincremental method to detect communities in dynamicevolving social networksrdquo Knowledge-Based Systems vol 163pp 404ndash415 2019

[41] S Vosoughi D Roy and S Aral ldquo+e spread of true and falsenews onlinerdquo Science vol 359 no 6380 pp 1146ndash1151 2018

[42] Y Fang J Gao Z Liu and C Huang ldquoDetecting cyber threatevent from twitter using IDCNN and BILSTMrdquo AppliedSciences vol 10 no 17 p 5922 2020

[43] P Lu L Zhang X Liu J Yao and Z Zhu ldquoHighly efficientdata migration and backup for big data applications in elasticoptical inter-data-center networksrdquo IEEE Network vol 29no 5 pp 36ndash42 2015

[44] O Sharif M Hoque A Kayes R Nowrozy and I SarkerldquoDetecting suspicious texts using machine learning tech-niquesrdquo Applied Sciences vol 10 no 18 p 6527 2020

[45] K C Roy M Cebrian and S Hasan ldquoQuantifying humanmobility resilience to extreme events using geo-located socialmedia datardquo EPJ Data Science vol 8 no 1 p 18 2019

[46] A Squicciarini S Karumanchi D Lin and N DesistoldquoIdentifying hidden social circles for advanced privacy con-figurationrdquo Computers amp Security vol 41 pp 40ndash51 2014

[47] L Chen F W Crawford and A Karbasi ldquoSeeing the unseennetwork Inferring hidden social ties from respondent-drivensamplingrdquo in Proceedings of the irtieth AAAI Conference onArtificial Intelligence pp 1174ndash1180 Phoenix AZ USAFebruary 2016

Scientific Programming 15

[48] J Leskovec and A Krevl ldquoSNAP datasets stanford largenetwork dataset collectionrdquo 2014 httpsnapstanfordedudata

[49] X Zheng Y Wang and M A Orgun ldquoContextual sub-network extraction in contextual social networksrdquo in Pro-ceedings of the 2015 IEEE TrustcomBigDataSEISPApp 119ndash126 Helsinki Finland August 2015

[50] A Madani and M Marjan ldquoMining social networks to dis-cover ego sub-networksrdquo in Proceedings of the 2016 3rd MECInternational Conference on Big Data and Smart City(ICBDSC) pp 1ndash5 Muscat Oman March 2016

16 Scientific Programming

Page 4: InferringTiesinSocialIoTUsingLocation-BasedNetworksand ...3 JP]V3 ]4JJV3 Brightkite and Gowalla are openly available location-basedsocialnetworkdatasets[13,48].Bothdatasetsare gathered

Statistically there are always some hidden or visibleassociated parameters among cooccurred events Socialnetwork analysis is performed to explore such intriguingknowledge In the physical world social network analysis isutilized in job searching [4] studying urban life psychologyinvestigation of guilt association [12] finding communities[40] spreading of news [41] and influential networks[18 42] In the recent era of information and technologiesmassive logs are generating for each person eg call recordsbank transactions online purchase records daily emailsCCTV cameras and much more mediums [7 43 44] Incontrast to the physical world suchmediums further concisethe accuracy of results by highlighting such associatedfeatures Despite numerous data sources there is no optimalprocedure to quantify stochastic human nature and socialnetwork evolution [45]

+e grouping method identifies hidden social groupswhich further explores the friend circles and focuses underhigh privacy settings [46] Another research explores thehidden social ties using respondent sampling [47] In theliterature hidden social ties refer to that population which ishard to access +e population that tries to hide from thesocial network is hidden in a network [47] In our researchsuspect ties mean actors in a social network that are presentand accessible but they try to hide their social connectionsOur second part of the research explores the suspicious tieswithin the existing network instead of a hidden populationin a social network

3 Data Set Characteristics andEvaluation Measure

31 Data Set Descriptions In our research we incorporatedthree large location-based data sets ie CDR Brightkiteand Gowalla [13] +e CDR large data set used in this studywas provided by one of the Chinese mobile telecommu-nication operator companies +e data set contains 202000subscribes along with user demographic informationCalling detailed records contain six months (June2014mdashDecember 2014) and calling detailed records con-tain these 202 000 subscribes which have 221 451 169records Each record of the data set is represented in thefollowing format

Caller ID Call Type Callee ID Time

Duration LAC ID CELL ID

Brightkite and Gowalla are openly available location-based social network data sets [13 48] Both data sets aregathered using the online social-networking websitesWebsites maintain user check-in data by fetching mobileGPS location data +ese services use to help people infinding the nearby users and to build social connectionBrightkite contains 58228 nodes and 214078 edges andGowalla contains 196591 nodes and 950327 edges Otherthan social network data both data sets also contain directsocial tie data

32 Abbreviations and Evaluation Measures Figure 3 statesthe example of the social network having a case of suspectactors and their hidden ties Actors with several mutualfriends but do not have direct connection may have a secretconnection +is information helps in identifying them as asuspect tie +e social network evolves and new connectionsexpand the scope of the social network One social networkis a combination of multiple social networks involvingdifferent individuals [31] A social network can be slicedbased on starting and ending time Social networks can alsobe divided into subsocial networks monthwise if it has beendeveloped over one year [49 50] Social network slicinghelps our research further to explore the missing ties be-tween friends of friend relationships

+e following list of abbreviations is used for thequantification processes of Precision and Recall which willalso be used in several parts of the paper

SK calling record+e calling records represent the actual number ofdirect calls that occur between two users +e value ofSK is counted to identify the social tie between twousersCK times of cooccurrenceTime cooccurrence represents the presence of two usersin the range of a common base station We counted CKwhen two users were connected to a common basestation and they called any other userG time minus frame gap value+e time-frame gap value represents the time intervalbetween two usersrsquo calls while connected to a specificbase station For example user X calls someone at 2 pmand user Y calls someone else at 4 pm in this case thegap between the calls is 2 hr To quantify the results and

Actor

Suspect actor

Mutual actor

TieHidden tie

Figure 3 Hidden social ties referring suspects

4 Scientific Programming

evaluate Precision and Recall values we experimentedon the following set of gap valuesG 30mins 1 hr 2 hr 6 hr 12 hr and 24 hrSNK threshold for direct calls+e threshold for direct calls represents the set ofthreshold values for assessing direct calls between twousers We evaluated Precision and Recall curves on thebases of the following set of threshold valuesSNK 2 5 10 and 15

CTK threshold for cooccurrence+e threshold for cooccurrence represents the set ofthreshold values for the two usersrsquo presence in a specificbase station We tested the performance on the CTKthreshold values ranges from 1 to 40MuF threshold formutual friend

+e threshold for mutual friend represents the set ofthreshold values for the two usersrsquo mutual friend We testedthe performance on the MuF threshold values ranges from 1to 100

Precision TP

TP + FP (2)

Recall TP

TP + FN (3)

where as

TP (SNKge SK andCTKgeCK)

FP (SNKlt SK andCTKgeCK)

FN (SNKlt SK andCTKltCK)

TN (SNKge SK andCTKltCK)

⎧⎪⎪⎪⎪⎨

⎪⎪⎪⎪⎩

(4)

F1 score 2 timesPrecision times RecallPrecision + Recall

(5)

33 Cooccurrence Count Normalization MeasureCooccurrence count value CV tells the presence of two usersin the region of one base station An issue related to CVcounting is explained and resolved using an example for thetwo users X and Y shown in Figure 4 We counted CVwhentwo users were connected to a common base station andthey called any other user in a specific time frame +eexample is shown in Figure 4 states the call log details ofusers X and Y gathered in a time frame T

Let X x1 x2 x3 x4 x5 x61113864 1113865 and Y y1 y2 y31113864 1113865 then

CV α1 α2 α3 αi1113864 1113865 (6)

where αi TimeGap(xi Y) i 1 to 6 j 1 to 3

TimeGap xi Y( 1113857 min xi y1( 1113857 xi y2( 1113857 xi yj1113872 11138731113966 1113967

(7)

In Figure 4 x1 and x2 call times have the closest calltime to y1 call In this case a count value of CV can becalculated as 2 However such counting may lead to awrong inference It is the same as if one person calls onceand another person calls n-times within a specific timeframe equals n as the count value To resolve this issue wepropose a normalization equation that decreases the countvalue periodically We introduce Beta (β) value as a pe-riodic normalizing factor

Let X denotes a set of calls by user X and Y denotes a setof calls by user Yif (XgeY) thenX times Y elseY times X

According to the example stated in Figure 4 we assumedβ for set Y (y1β y2β y3β)

For first match value of β 1For the second match values of β β2Likewise for the nth match value of β βn

x1 y1( 1113857 times y1β( 1113857( 1113857 + x2 y1( 1113857 timesy1β2

1113888 11138891113888 11138891113890 1113891

+ x3 y2( 1113857 times y2β( 1113857( 1113857 + x4 y2( 1113857 timesy2β2

1113888 11138891113888 1113889 + x5 y2( 1113857 timesy2β3

1113888 11138891113888 11138891113890 1113891

+ x6 y3( 1113857 times y3β( 11138571113858 1113859

(8)

1113944mk

m11113944

nk

n1xn ym( 1113857 times

ymβn

1113888 11138891113888 1113889 (9)

In equation (9) mk refers to the total number of callsmade by user X to Y while nk refers to the total number ofcalls made by user Y to X equation (9) finds intermediatevalue for CV ie 433 instead of maximum 6 or mini-mum 3 values

4 Social Tie Inference

We initially investigated direct social ties formed by CDRdata sets and compared them to the indirect social tiesformed based on common location using Algorithm 1 By

Scientific Programming 5

direct ties we mean calling or direct connection For ex-ample person A calls person B refers to a direct tie betweenA and B Algorithm 1 takes GV SNK and CDR data sets(social network) as inputs Furthermore the algorithm hastwo parts initially it finds the direct ties between two in-dividuals depending upon the SNK threshold value Sec-ondly it counts the presence of two individuals based onseveral parameters +e Calculate Cooccurrence Count()function finds the number of cooccurrences using equation(9) explained in the previous section Infer Social Ties()function finds the social connections depending upon CTKDT and SNK and inferred them as the social ties

41 CDR-Based Social Tie Inference A social tie is inferredbetween two persons if they are found together at severalsites numerous times +e inference algorithm identifies twosets of results ie direct social ties and inferred social tiesFor the cross-validation of results we correlate the direct tieresults with the inferred ones Precision and Recall evalu-ation measures are used to examine the results We tested allrecords based on threshold values K is the direct calls M isthe times of cooccurrence and G is the time frame gap valueWhile N as direct call count shows the degree of friendship

more value of N indicates the friendship strength Figure 5shows the Precision graph which contains four setsFigures 5(a)ndash5(d) +e whole data set is examined based onK and the value of M and v

In Figure 5(a) the value of K is 15 which represents theusers with direct calls between each other equals to or greaterthan 15 +e value of M is the number of cooccurrence fortwo different users +e Precision values are comparativelysignificantly less for M in the range of 0 to 10 In contrastthe value of Precision increases exponentially for the value ofM in the range of 10 to 30 +e higher value of M indicateshigher cooccurrence of users A positive correlation can beobserved between the values of M and Precision It infersthat cooccurrence is a significant attribute that affectspositively in identifying social ties All graphs in Figure 5have six different lines each line represents the different timegap ranges It can also be seen that the values of gap value 30minutes are having more Precision while the rest lines of 1hour 2 hours 6 hours 12 hours and 24 hours are having lessPrecision +is also clues that the strength of ties has aspecific effect on Precision Users having strong socialconnections most of the time are found together in certainareas +is pattern is explicitly observed by looking cooc-currence value M (20 to 40) and gap time frame G 30

y1

y2

y3

User X User Y

x1

x2

x3

x4

x5

x6

Figure 4 User X and user Y cooccurrence count comparison case

RequireSN Social NetworkGVTime Gap +reshold ValueSNK 2 5 10 15

EnsureDTDirect Social TiesIST Infer Social Ties

(1) while SNne 0 do(2) DTDirect Ties (SNK)(3) end while(4) while SNne 0 do(5) CTKCalculate Cooccurrence Count (DT GV)(6) IST Infer Social Ties (CTK DT SNK)(7) end while

ALGORITHM 1 Social tie inference

6 Scientific Programming

minutes Another positive correlation is found between thedegree of friendship and physical presence at a specific place

To see the effect of friendship strength we evaluatedresults for the four different K values ie 2 5 10 and 15 Atypical pattern is found in all the graphs shown in Figure 5 Itshows that Precision is less for people whose mutualpresence is less at different sites Also people with strongsocial ties spent less than 1 hr time together at a specificlocation To understand the graphrsquos actual meaning wequantify and reconcile with the actual direct social ties It isobserved that a positive correlation in results infers that

people with strong social connections often visit placestogether

Figure 6(a) represents the Recall results We tested andevaluated Recall based on the same measures as Precisionie direct calls cooccurrence and gap time frameFigure 6(a) states that the value of Recall is at a maximumwhen we consider a less number of commonplaces Aninverse trend is observed between the values of Recall andM specifically for the M value ranging between 0 and 3+esame as Precision Recall is also evaluated based on sixdifferent time gap values

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

M

Prec

ision

K = 15

(a)

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

M

Prec

ision

K = 10

(b)

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

M

Prec

ision

K = 5

(c)

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

M

Prec

ision

K = 2

(d)

Figure 5 +e Precision formed between two callers due to social contact depending on the number of times they have connected throughthe same base station within a specific time frame Figure 5(a) shows the Precision for the six different time frame lines whereas M is thenumber of cooccurrences and K is the number of direct calls (a) K 15 (b) K 10 (c) K 5 and (d) K 2

Scientific Programming 7

+is part of the research finds peoplersquos cooccurrencebased on the same base station connectivity in a specific timeframe and infers them as social ties Furthermore it crossrelates the inferred results with direct call results

42 Brightkite- and Gowalla-Based Social Tie InferenceBrightkite and Gowalla data sets contain direct social ties aswell as the check-in information of each user In our studywe investigated both data sets based on several dimensionsand found some of the very interesting facts During theanalysis we observed positive correlations between mutualfriend count values MuF and user check-in details InBrightkite MuF ranging 6 to 40 and in Gowalla MuF

ranging 25 to 90 shows the positive corelation with Preci-sion We also measure the effect of gap time value on thePrecision and Recall and evaluated results based on severalgap time values

Figures 7(a) and 7(c) of line graph show the relationbetween mutual friend count values MuF and Precisionwhile Figures 7(b) and 7(d) show the relation betweenmutual friend count values MuF and Recall In Figure 7results shows that people having a certain number ofmutual friends use to visit place together or with little gapof interval

+e social tie inference framework infers some of theabsent ties as social ties in the form of false-positive resultsTo further extract the actual meaning of such incorrect

0

01

02

03

04

05

06

07

08

09

1Re

call

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 40M

K = 15

(a)

0

01

02

03

04

05

06

07

08

09

1

Reca

ll

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 40M

K = 10

(b)

0

01

02

03

04

05

06

07

08

09

1

Reca

ll

0 5 10 15 20 25 30 35 40

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

M

K = 5

(c)

0

01

02

03

04

05

06

07

08

09

1Re

call

0 5 10 15 20 25 30 35 40

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

M

K = 2

(d)

Figure 6+e Recall formed between two callers due to social contact depending upon the number of times they have connected through thesame base station within a specific time frame Figure 6(a) shows the Recall for the six different time frame lines whereas M is the number ofcooccurrences and K is the number of direct calls (a) K 15 (b) K 10 (c) K 5 and (d) K 2

8 Scientific Programming

inferences we conducted a social activity and simulation+e false-positive results of the first part of the research serve asthe foundation for the second part Activity under the first partdata set is conducted and the false-positive results are examinedby studying the human cooccurrence patterns described in thenext section Suspicious Ties+is stage of research gave us a clueto further exploit the category of missing ties

5 Suspicious Ties

An absent tie can be inferred as a suspect tie if it satisfies thefollowing properties

(1) M number of mutual friends(2) C number of cooccurrence on different sites

In the CDR data set each cell is treated as a single cellof the base station Our model adopts the followingfour features for evaluation

(1) Base station(2) User ID(3) Gap time threshold(4) Call time stamp

+e following mathematical model explains theproblem and its formulationLet BS denotes a set bsi1113864 1113865 of bsn points and bsi iscalled as the distinct base station cellLet DU denotes a set dui1113864 1113865 of dun points and dui iscalled as the distinct user

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 420001020304050607080910

MuF

Prec

ision

(a)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 42MuF

0001020304050607080910

Reca

ll

(b)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Prec

ision

1 10 20 30 40 50 60 70 80 90 100

(c)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Reca

ll

1 10 20 30 40 50 60 70 80 90 100

(d)

Figure 7 +e Precision and Recall formed between two users due to social contact depending on the number of times they have checked into certain location within a specific time frame Figure 7(a) shows the Precision for the six different time frame lines whereas MuF is thenumber of mutual friend count (a) Brightkite Precision (b) Brightkite Recall (c) Gowalla Precision and (d) Gowalla Recall

Scientific Programming 9

Let TH denotes a set thi1113864 1113865 of thn points and thi iscalled as the threshold value for timeframeLet C denotes a set ci1113864 1113865 of cn points and ci is called asthe call information

ci (id ct) whereid as call id

ct as outgoing call time1113896 (10)

Let R denotes a set ri1113864 1113865 of rn points then

ri dui ci(id)( 1113857 times dui ci(id)( 11138571113858 1113859anddui ne dui (11)

Let RT denotes a set rti1113864 1113865 of rtn points then

rti dui ci(ct)( 1113857 minus dui cti(ct)( 11138571113858 1113859anddui ne dui (12)

Let S denotes a set si1113864 1113865 of sn points then

si ri isin bsi( 1113857and rti le thi( 11138571113858 1113859 (13)

wheresi is a set of elements that identifies distinct callersbased on the same base station connectivity and a definitenumber of calls in a specific time frame

6 Suspect Inference Framework

We studied the pattern of exceptional cases belong to thefalse-positive set and described a subset of the false-positive set as suspect ties Physical activity was designedand conducted to investigate the formation of suspectsocial ties Activity consisting of 50 people and a data setwas generated within almost 4-5 hours A basketballcourt was utilized for the activity Nine circles weredrawn physically on the basketball court assumed as thebase station cell Out of 50 people nine were directed toact as a base station +e boundary of each circle wasconsidered as a range of the base station cell Rests of the41 persons were directed to perform the following twosteps

61 Selection Step Initially each person from 41 people wasasked to choose two sets of friends One set as obviousfriends and the second set of hidden friends such that thesize of the hidden friend set should be at most 15 of theobvious friend set size eg if one person has five people inapparent friend set he can have no more than one hiddenfriend After the selection of both sets by each individualinformation was shared with one of our representatives

62 Operation Step In this phase each person was directedto follow the following rules

(1) You should not call your hidden friend

(2) You should call all of your obvious friends at leastonce

(3) You should conduct a maximum number of calls toyour closest obvious friend and second maximum toa second level obvious friend and likewise to the leastfriend

(4) You must try to meet your hidden friend as much aspossible physically

+e method of calling is like if person A wants to callperson B from base station B1 the person has to go to thebase station B1 and register a call with a person acting andstanding in base station B1 Respective base station personwill write and make an entry with five parameters ie CallerName Callee Name Time From Base Station Name and ToBase Station Name +e data set was gathered in the fol-lowing format An example is given below

CallerName

CalleeName Time Stamp

FromBase

Station

ToBase

Station

User 1 User 12 02032019101508 BS 7 BS 2

For the understanding of the variations and patterns thesame activity was also designed using simulation +e wholesimulation followed the same conditions and another dataset was generated using a random function Based on theactivity a framework is designed to separate a class ofsuspect ties Proposed inference framework work is designedand implemented using pipe and filter architecture shownin Figure 8 Algorithm 2 shows the implementation ofsuspicious tie inference framework explained in Figure 8+e framework takes the social network matrix countthreshold value gap time value and mutual friend countvalue as inputs and filters the result Initially CalculateLevels() function finds the five level depth information foreach distinct user Let us say if A calls B B calls C C calls Dand D calls E it implies that A 0 B 1 C 2 D 3 andE 4 represent five levels +is step ensures that all the levelshave distinct users Secondly Find Suspects() functionselects only those sets of users from level 1 and level 3 that donot have any direct calls and M number of mutual friendsFurthermore Calculate Subsocial Network() functiongenerates subgraphs using level 1 and level 3 detailsdepending upon the gap time value Results are filtered onthese bases of time gap value eg two users called someother user while connected to the same base station withinthe given time frame explained in the previous section Afterthat Infer Hidden Ties() function uses the proposed nor-malization method to find the number of the count definedin equation (9) and then all results are filtered according tothe cooccurrence count CV and the mutual friend thresholdvalue M Based on mentioned parameters and thresholdsAlgorithm 2 significantly identifies the subclass of missingties as suspicious ties

+e results of the activity and simulation are computedand evaluated using Precision and Recall measures

10 Scientific Programming

RequireSN Social NetworkGVTime Gap+reshold ValueCVCooccurrence Count +reshold ValueMMutual Friend Count

EnsureST Inferred Social Suspect Ties

(1) while SNne 0 do(2) Levels [5 n]Calculate Levels()(3) Suspects Find Suspects (Level 1 Level 3)(4) SGCalculate Subsocial Network (Suspects GV)(5) ST Infer Suspect Ties (SG CV M) using equation (9)(6) end while

ALGORITHM 2 Suspicious hidden social tie inference

Table 1 Evaluation of simulation and social activity

Gap timeMge 2 Mge 5

Simulation Social activity Simulation Social activityP R F1 P R F1 P R F1 P R F1

CVge 5

GVlt 10 012 068 020 012 100 021 013 062 021 012 079 021GVlt 20 010 042 016 015 079 025 015 031 020 013 042 020GVlt 30 021 022 021 071 059 064 025 018 021 023 028 025GV ge 30 034 023 027 100 012 021 030 010 015 032 019 024

CVge 10

GVlt 10 010 033 015 012 048 019 009 030 014 011 061 019GVlt 20 011 028 016 027 050 021 009 018 012 010 039 016GVlt 30 013 018 015 029 030 029 009 010 009 009 020 012GV ge 30 013 008 010 026 005 008 010 004 006 013 007 009

CVge 20

GVlt 10 004 018 007 008 023 012 005 029 009 011 039 017GVlt 20 007 008 007 005 024 008 004 011 006 010 028 015GVlt 30 004 006 005 009 009 009 003 005 004 009 019 012GV ge 30 002 001 001 001 001 001 001 001 001 001 005 002

lowastP Precision R Recall andF1 F1 Score

Social network

Count number of co-occurtence

Suspect social tiesMutual friend threshold

value

Count threshold value

Calculate sub social networks via

time gap threshold value

Calculate 0 1 2 3 4 level ties

Highlight missing ties between

level 1 and level 3

((xn ym) times (ymβn))mm=1

nn=1=

Figure 8 Suspect social tie inference framework

Scientific Programming 11

Evaluation results of simulation and social activity con-ducted are shown in Table 1 Precision Recall and F1 Scoremeasures are used to evaluate the framework that is furthercalculated based on the cooccurrence count CV mutualfriend count M and gap time GV parameter setting valuesF1 Score is calculated using equation (5) Definitions of therelated parameters are given as follows

TP Which were hidden friends and system infer ashidden friends

TN Which were not hidden friends and system inferas not hidden friends

FP Which were not hidden friends and system inferas hidden friends

FN Which were hidden friends and system infer asnot hidden friends

63 Results and Discussion We tested and evaluated allrecords based on the cooccurrence count value CV mutualfriend M and gap time value GV as the gap timeFigures 9(a)ndash9(d) show the evaluation of results for theactivity based on three different values of the cooccurrencecount value CV ie CV 5 10 and 15 and two different

values of the mutual friend M ie Mge 2 and ge5 LikewiseFigures 10(a)ndash10(d) show the evaluation of results for thesimulation data set along with mentioned CV and M valuesResults were generated on four values of gap time ie ge 30lt 30 lt 20 and lt 10

In Figures 9(a) and 9(b) Recall is maximum and Pre-cision is less where the value of GVge 30 CVge 5 and Mge 2+e system obtains a maximum number of relevant hiddenties along with false-positive results By GVge 30 CVge 5 andMge 2 it means that the gap between the two calls is 30minsor more while the count of cooccurred events is keptminimum five and mutual friend count as two or less +esystemrsquos performance drops when gap time is reduced tolt 30 lt 20 and lt 10 Even though there is a drop in true-positive values a significant drop in the false-positive valuescan be seen Results become concise with the variation inboth values of GV and M +e limited data set collectedusing activity highlights the occurrences of hidden tiesWhole activity and simulation were designed to get similarfields of data as CDR so that the proposed framework iscompatible with the CDR data set

Results shown in Figures 9 and 10 exhibit the existence ofhidden relationships +e parameter such as the gap timevalue GV helps to identify the time frame selection such that

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1Pr

ecisi

onM ge 2

(a)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

(b)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Prec

ision

M ge 5

(c)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 9 Evaluation of social activity using Precision and Recall

12 Scientific Programming

what gap value is suitable to get optimal performanceLikewise Figures 10(a) to 10(d) represent the Precision andRecall curves for the simulation data set Results are gatheredby setting the same values for the threshold parameters CVGV and M Figures 10(a) and 10(d) give the least values forPrecision and Recall which tells that if threshold values aretightened so does the count relevant retrieve results get lowOut of all the results Precision and Recall values for bothsimulation and activity data set showmaximum if thresholdvalues GVge 30 CVge 5 and Mge 2 are set We found someexciting dissimilarities in the results between simulation andactivity data set results during the complete evaluationprocess +e simulation results do not give higher Recall andPrecision value compared to activity data results We con-cluded that the data set of simulation is generated usingrandom function while the activity had the human hidingpatterns +ese results also help to infer human psychologyin building hidden ties +e simulation data set generatorworks based on the same constraints mentioned as rules ofthe activity However the key difference between simulationand activity is the selection of friends hidden friend and the

pattern of calling While conducting the simulation trivialvariation in results was observed as the random selection hasrandom patterns According to our findings simulation andactivity both exhibit patterns of hidden ties However ac-tivity results are more pronounced and significantly iden-tifying the hidden relationships

It is essential to highlight some critical questions anddependent variables that help to find hidden social tiesbetween two people for example why two people hide theirsocial ties Is this deliberate or unintentional action What ifthey are deliberately hiding their social tie for a purpose Insuch a scenario extracting a social tie for two people is a kindof intense problem It is a kind of investigation processwhich explores a clue to draw some relationship betweentwo people Investigation designates if two people areposting a picture on social media to be more cautious aboutnot identifying their social ties While if they are doing someprivate activity they will be less careful for example postinga picture on Facebook or Flickr compared to calling to aperson from a specific location Although extracting hiddensocial ties include various privacy issues We designed

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Prec

ision

M ge 2

CV ge 5CV ge 10CV ge 15

(a)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

CV ge 5CV ge 10CV ge 15

(b)

Prec

ision

0010203040506070809

1M ge 5

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

(c)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

0010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 10 Evaluation of simulation using Precision and Recall

Scientific Programming 13

activity and simulation that generated the data set by thecaller data record (CDR) data set We thoroughly investi-gated the patterns of connectivity and established aframework to infer social ties +is research has opened up anew direction further to explore connectivity in the SocialInternet of +ings (SIoT) specifically machine-to-machinedirect communication and machine-to-human or human-to-machine hidden relationships In many cases severalmachines work together but they rarely have a directconnection

7 Conclusion and Future Work

In this research we have examined the developments ofsocial connection patterns based on physical gathering Inthe first part of our research we explored the correlationbetween direct communication and two individualsrsquo phys-ical presence To check the systemrsquos performance andevaluation we examined all results by utilizing Precision andRecall evaluation measures We also present a periodicnormalization equation for the cooccurrence count In thesecond phase we propose the suspect tie inference frame-work False-positive results of the first part of the researchare the ground to the second part of the study +e proposedframework adopts pipe and filter architecture where thethreshold values control each filter +e frameworkrsquos fun-damental objective is to take the data set such as CDR (callerdata record) and infer suspect social ties depending uponthe specified threshold values Analyzing the results criti-cally we propose a theory that identifies suspect social tiesBesides this for comparison and evaluation we conductedreal-time human-based activity and simulation Keeping inmind the structure of the actual CDR data set the wholeactivity was designed and evaluated In contrast to existingwork our research focus is on hidden ties instead of thehidden actors In the future we are aiming to explore thehomophilic nature of suspect ties within the Social Internetof +ings (SIoT)

Data Availability

+e data used can be found at httpsnapstanfordedudataindexhtmllocnet

Conflicts of Interest

+e authors declare that they have no conflicts of interestregarding the publication of this work

Acknowledgments

+is work was supported by King Saud University SaudiArabia through research supporting project number RSP-2020184 Nauman Ali Khan acknowledges the support ofthe Chinese Government and Chinese Scholarship Council(CSC) for his PhD studies at the University of Science andTechnology China +is research work was partially sup-ported by Key Program of the National Natural ScienceFoundation of China (grant number 61631018)

References

[1] L Weng M Karsai N Perra F Menczer and A FlamminildquoAttention on weak ties in social and communication net-worksrdquo in Complex Spreading Phenomena in Social Systemspp 213ndash228 Springer Berlin Germany 2018

[2] A M Ortiz D Hussein S Park S N Han and N Crespildquo+e cluster between internet of things and social networksreview and research challengesrdquo IEEE Internet of ingsJournal vol 1 no 3 pp 206ndash215 2014

[3] P Luarn and Y-P Chiu ldquoKey variables to predict tie strengthon social network sitesrdquo Internet Research vol 25 no 2pp 218ndash238 2015

[4] D Goel and K Lang ldquoSocial ties and the job search of recentimmigrantsrdquo ILR Review vol 72 no 2 pp 355ndash381 2019

[5] H Kizgin A Jamal N Rana Y Dwivedi and VWeerakkodyldquo+e impact of social networking sites on socialization andpolitical engagement role of acculturationrdquo TechnologicalForecasting and Social Change vol 145 pp 503ndash512 2019

[6] P Suebvises ldquoSocial capital citizen participation in publicadministration and public sector performance in +ailandrdquoWorld Development vol 109 pp 236ndash248 2018

[7] Z Liu Y Qiao S TaoW Lin and J Yang ldquoAnalyzing humanmobility and social relationships from cellular network datardquoin Proceedings of the 2017 13th International Conference onNetwork and Service Management (CNSM) pp 1ndash6 TokyoJapan November 2017

[8] L Matosas-Lopez and A Romero-Ania ldquo+e efficiency ofsocial network services management in organizations an in-depth analysis applying machine learning algorithms andmultiple linear regressionsrdquo Applied Sciences vol 10 no 15p 5167 2020

[9] M S Granovetter ldquo+e strength of weak tiesrdquo in SocialNetworks pp 347ndash367 Elsevier Amsterdam Netherlands1977

[10] T Nguyen L Zhang and A Culotta ldquoEstimating tie strengthin follower networks to measure brand perceptionsrdquo inProceedings of the 2019 IEEEACM International Conferenceon Advances in Social Networks Analysis and Mining (ASO-NAM) pp 779ndash786 Vancouver Canada August 2019

[11] H Mattie K Engoslash-Monsen R Ling and J-P OnnelaldquoUnderstanding tie strength in social networks using a localbow tie frameworkrdquo Scientific Reports vol 8 no 1 p 93492018

[12] D J Crandall L Backstrom D Cosley S SuriD Huttenlocher and J Kleinberg ldquoInferring social ties fromgeographic coincidencesrdquo Proceedings of the NationalAcademy of Sciences vol 107 no 52 pp 22436ndash24441 2010

[13] E Cho S A Myers and J Leskovec ldquoFriendship and mo-bility user movement in location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Con-ference on Knowledge Discovery and Data Mining pp 1082ndash1090 San Diego CA USA August 2011

[14] A Znidarsic P Doreian and A Ferligoj ldquoAbsent ties in socialnetworks their treatments and blockmodeling outcomesrdquoAdvances in Methodology amp StatisticsMetodoloski Zvezkivol 9 no 2 2012

[15] S A Myers A Sharma P Gupta and J Lin ldquoInformationnetwork or social network the structure of the twitter followgraphrdquo in Proceedings of the 23rd International Conference onWorld Wide Web pp 493ndash498 Seoul Republic of KoreaApril 2014

14 Scientific Programming

[16] X Zuo J Blackburn N Kourtellis J Skvoretz andA Iamnitchi ldquo+e power of indirect social tiesrdquo 2014 httparxivorgabs14014234

[17] M A Brandatildeo and M M Moro ldquoTie strength in co-au-thorship social networks analyses metrics and a new com-putational modelrdquo in Proceedings of the Anais Estendidos doXXV Simposio Brasileiro de Sistemas Multimıdia e Webpp 17ndash20 Rio de Janeiro Brazil 2019

[18] H A Khanday A H Ganai and R Hashmy ldquoA comparativeanalysis of identifying influential users in online social net-worksrdquo in Proceedings of the 2018 International Conference onSoft-Computing and Network Security (ICSNS) pp 1ndash6Coimbatore India February 2018

[19] R Garcia-Guzman Y A Andrade-Ambriz M-A Ibarra-Manzano S Ledesma J C Gomez and D-L Almanza-Ojeda ldquoTrend-based categories recommendations and age-gender prediction for pinterest and twitter usersrdquo AppliedSciences vol 10 no 17 p 5957 2020

[20] A Talukder M G R Alam N H Tran D Niyato andC S Hong ldquoKnapsack-based reverse influence maximizationfor target marketing in social networksrdquo IEEE Access vol 7pp 44182ndash44198 2019

[21] N Ampazis T Emmanouilidis and F Sakketou ldquoA matrixfactorization algorithm for efficient recommendations insocial rating networks using constrained optimizationrdquoMachine Learning and Knowledge Extraction vol 1 no 3p 928 2019

[22] N Zhou X Zhang and S Wang ldquo+eme-aware socialstrength inference from spatiotemporal datardquo in Proceedingsof the International Conference on Web-Age InformationManagement pp 498ndash509 Macau China June 2014

[23] P A Grabowicz J J Ramasco B Gonccedilalves andV M Eguıluz ldquoEntangling mobility and interactions in socialmediardquo PLoS One vol 9 no 3 Article ID e92196 2014

[24] D Yang B Qu J Yang and P Cudre-Mauroux ldquoRevisitinguser mobility and social relationships in LBSNs a hypergraphembedding approachrdquo in Proceedings of the World Wide WebConference pp 2147ndash2157 San Francisco NY USA May2019

[25] R-H Li J Liu J X Yu H Chen and H Kitagawa ldquoCo-occurrence prediction in a large location-based social net-workrdquo Frontiers of Computer Science vol 7 no 2pp 185ndash194 2013

[26] V Jayadevan K Bharadwaj A Kumar and P KhandelwalldquoDiscovering local social groups using mobility datardquo Inter-national Journal of Computer Applications vol 120 no 212015

[27] A J Jara Y Bocchi and D Genoud ldquoSocial internet of thingsthe potential of the internet of things for defining humanbehavioursrdquo in Proceedings of the 2014 International Con-ference on Intelligent Networking and Collaborative Systemspp 581ndash585 Salerno Italy September 2014

[28] W A D M Jayathilaka K Qi Y Qin et al ldquoSignificance ofnanomaterials in wearables a review on wearable actuatorsand sensorsrdquo Advanced Materials vol 31 no 7 Article ID1805921 2019

[29] J Ren H Guo C Xu and Y Zhang ldquoServing at the edge ascalable iot architecture based on transparent computingrdquoIEEE Network vol 31 no 5 pp 96ndash105 2017

[30] I Seeber E Bittner R O Briggs et al ldquoMachines as team-mates a research agenda on ai in team collaborationrdquo In-formation amp Management vol 57 no 2 Article ID 1031742020

[31] T Pi L Cao P Lv Z Ye and H Wang ldquoInferring implicitsocial ties in mobile social networksrdquo in Proceedings of the2018 IEEE Wireless Communications and Networking Con-ference (WCNC) pp 1ndash6 Barcelona Spain April 2018

[32] N Tahir A Hassan M Asif and S Ahmad ldquoMCD mutuallyconnected community detection using clustering coefficientapproach in social networksrdquo in Proceedings of the 2019 2ndInternational Conference on Communication Computing andDigital Systems (C-CODE) pp 160ndash165 Islamabad PakistanMarch 2019

[33] V A Lewis C A MacGregor and R D Putnam ldquoReligionnetworks and neighborliness the impact of religious socialnetworks on civic engagementrdquo Social Science Researchvol 42 no 2 pp 331ndash346 2013

[34] M S Granovetter ldquo+e strength of weak ties a networktheory revisitedrdquo Sociologicaleory vol 1 pp 201ndash233 1983

[35] J Gui Z Zheng X Zhao and Z Qin ldquoStatistical propertiesand temporal properties of calling behaviorrdquo in Proceedings ofthe 2019 IEEE 5th International Conference on Computer andCommunications (ICCC) pp 1940ndash1944 Chengdu ChinaDecember 2019

[36] I H Sarker ldquoA machine learning based robust predictionmodel for real-life mobile phone datardquo Internet of ingsvol 5 pp 180ndash193 2019

[37] M S Bernstein E Bakshy M Burke and B KarrerldquoQuantifying the invisible audience in social networksrdquo inProceedings of the SIGCHI Conference on Human Factors inComputing Systems pp 21ndash30 Paris France April 2013

[38] R Montasari R Hill V Carpenter and F Montaseri ldquoDigitalforensic investigation of social media acquisition and analysisof digital evidencerdquo International Journal of Strategic Engi-neering vol 2 no 1 pp 52ndash60 2019

[39] M Yip N Shadbolt and C Webber ldquoStructural analysis ofonline criminal social networksrdquo in Proceedings of the 2012IEEE International Conference on Intelligence and SecurityInformatics pp 60ndash65 Arlington VA USA June 2012

[40] Z Zhao C Li X Zhang F Chiclana and E H Viedma ldquoAnincremental method to detect communities in dynamicevolving social networksrdquo Knowledge-Based Systems vol 163pp 404ndash415 2019

[41] S Vosoughi D Roy and S Aral ldquo+e spread of true and falsenews onlinerdquo Science vol 359 no 6380 pp 1146ndash1151 2018

[42] Y Fang J Gao Z Liu and C Huang ldquoDetecting cyber threatevent from twitter using IDCNN and BILSTMrdquo AppliedSciences vol 10 no 17 p 5922 2020

[43] P Lu L Zhang X Liu J Yao and Z Zhu ldquoHighly efficientdata migration and backup for big data applications in elasticoptical inter-data-center networksrdquo IEEE Network vol 29no 5 pp 36ndash42 2015

[44] O Sharif M Hoque A Kayes R Nowrozy and I SarkerldquoDetecting suspicious texts using machine learning tech-niquesrdquo Applied Sciences vol 10 no 18 p 6527 2020

[45] K C Roy M Cebrian and S Hasan ldquoQuantifying humanmobility resilience to extreme events using geo-located socialmedia datardquo EPJ Data Science vol 8 no 1 p 18 2019

[46] A Squicciarini S Karumanchi D Lin and N DesistoldquoIdentifying hidden social circles for advanced privacy con-figurationrdquo Computers amp Security vol 41 pp 40ndash51 2014

[47] L Chen F W Crawford and A Karbasi ldquoSeeing the unseennetwork Inferring hidden social ties from respondent-drivensamplingrdquo in Proceedings of the irtieth AAAI Conference onArtificial Intelligence pp 1174ndash1180 Phoenix AZ USAFebruary 2016

Scientific Programming 15

[48] J Leskovec and A Krevl ldquoSNAP datasets stanford largenetwork dataset collectionrdquo 2014 httpsnapstanfordedudata

[49] X Zheng Y Wang and M A Orgun ldquoContextual sub-network extraction in contextual social networksrdquo in Pro-ceedings of the 2015 IEEE TrustcomBigDataSEISPApp 119ndash126 Helsinki Finland August 2015

[50] A Madani and M Marjan ldquoMining social networks to dis-cover ego sub-networksrdquo in Proceedings of the 2016 3rd MECInternational Conference on Big Data and Smart City(ICBDSC) pp 1ndash5 Muscat Oman March 2016

16 Scientific Programming

Page 5: InferringTiesinSocialIoTUsingLocation-BasedNetworksand ...3 JP]V3 ]4JJV3 Brightkite and Gowalla are openly available location-basedsocialnetworkdatasets[13,48].Bothdatasetsare gathered

evaluate Precision and Recall values we experimentedon the following set of gap valuesG 30mins 1 hr 2 hr 6 hr 12 hr and 24 hrSNK threshold for direct calls+e threshold for direct calls represents the set ofthreshold values for assessing direct calls between twousers We evaluated Precision and Recall curves on thebases of the following set of threshold valuesSNK 2 5 10 and 15

CTK threshold for cooccurrence+e threshold for cooccurrence represents the set ofthreshold values for the two usersrsquo presence in a specificbase station We tested the performance on the CTKthreshold values ranges from 1 to 40MuF threshold formutual friend

+e threshold for mutual friend represents the set ofthreshold values for the two usersrsquo mutual friend We testedthe performance on the MuF threshold values ranges from 1to 100

Precision TP

TP + FP (2)

Recall TP

TP + FN (3)

where as

TP (SNKge SK andCTKgeCK)

FP (SNKlt SK andCTKgeCK)

FN (SNKlt SK andCTKltCK)

TN (SNKge SK andCTKltCK)

⎧⎪⎪⎪⎪⎨

⎪⎪⎪⎪⎩

(4)

F1 score 2 timesPrecision times RecallPrecision + Recall

(5)

33 Cooccurrence Count Normalization MeasureCooccurrence count value CV tells the presence of two usersin the region of one base station An issue related to CVcounting is explained and resolved using an example for thetwo users X and Y shown in Figure 4 We counted CVwhentwo users were connected to a common base station andthey called any other user in a specific time frame +eexample is shown in Figure 4 states the call log details ofusers X and Y gathered in a time frame T

Let X x1 x2 x3 x4 x5 x61113864 1113865 and Y y1 y2 y31113864 1113865 then

CV α1 α2 α3 αi1113864 1113865 (6)

where αi TimeGap(xi Y) i 1 to 6 j 1 to 3

TimeGap xi Y( 1113857 min xi y1( 1113857 xi y2( 1113857 xi yj1113872 11138731113966 1113967

(7)

In Figure 4 x1 and x2 call times have the closest calltime to y1 call In this case a count value of CV can becalculated as 2 However such counting may lead to awrong inference It is the same as if one person calls onceand another person calls n-times within a specific timeframe equals n as the count value To resolve this issue wepropose a normalization equation that decreases the countvalue periodically We introduce Beta (β) value as a pe-riodic normalizing factor

Let X denotes a set of calls by user X and Y denotes a setof calls by user Yif (XgeY) thenX times Y elseY times X

According to the example stated in Figure 4 we assumedβ for set Y (y1β y2β y3β)

For first match value of β 1For the second match values of β β2Likewise for the nth match value of β βn

x1 y1( 1113857 times y1β( 1113857( 1113857 + x2 y1( 1113857 timesy1β2

1113888 11138891113888 11138891113890 1113891

+ x3 y2( 1113857 times y2β( 1113857( 1113857 + x4 y2( 1113857 timesy2β2

1113888 11138891113888 1113889 + x5 y2( 1113857 timesy2β3

1113888 11138891113888 11138891113890 1113891

+ x6 y3( 1113857 times y3β( 11138571113858 1113859

(8)

1113944mk

m11113944

nk

n1xn ym( 1113857 times

ymβn

1113888 11138891113888 1113889 (9)

In equation (9) mk refers to the total number of callsmade by user X to Y while nk refers to the total number ofcalls made by user Y to X equation (9) finds intermediatevalue for CV ie 433 instead of maximum 6 or mini-mum 3 values

4 Social Tie Inference

We initially investigated direct social ties formed by CDRdata sets and compared them to the indirect social tiesformed based on common location using Algorithm 1 By

Scientific Programming 5

direct ties we mean calling or direct connection For ex-ample person A calls person B refers to a direct tie betweenA and B Algorithm 1 takes GV SNK and CDR data sets(social network) as inputs Furthermore the algorithm hastwo parts initially it finds the direct ties between two in-dividuals depending upon the SNK threshold value Sec-ondly it counts the presence of two individuals based onseveral parameters +e Calculate Cooccurrence Count()function finds the number of cooccurrences using equation(9) explained in the previous section Infer Social Ties()function finds the social connections depending upon CTKDT and SNK and inferred them as the social ties

41 CDR-Based Social Tie Inference A social tie is inferredbetween two persons if they are found together at severalsites numerous times +e inference algorithm identifies twosets of results ie direct social ties and inferred social tiesFor the cross-validation of results we correlate the direct tieresults with the inferred ones Precision and Recall evalu-ation measures are used to examine the results We tested allrecords based on threshold values K is the direct calls M isthe times of cooccurrence and G is the time frame gap valueWhile N as direct call count shows the degree of friendship

more value of N indicates the friendship strength Figure 5shows the Precision graph which contains four setsFigures 5(a)ndash5(d) +e whole data set is examined based onK and the value of M and v

In Figure 5(a) the value of K is 15 which represents theusers with direct calls between each other equals to or greaterthan 15 +e value of M is the number of cooccurrence fortwo different users +e Precision values are comparativelysignificantly less for M in the range of 0 to 10 In contrastthe value of Precision increases exponentially for the value ofM in the range of 10 to 30 +e higher value of M indicateshigher cooccurrence of users A positive correlation can beobserved between the values of M and Precision It infersthat cooccurrence is a significant attribute that affectspositively in identifying social ties All graphs in Figure 5have six different lines each line represents the different timegap ranges It can also be seen that the values of gap value 30minutes are having more Precision while the rest lines of 1hour 2 hours 6 hours 12 hours and 24 hours are having lessPrecision +is also clues that the strength of ties has aspecific effect on Precision Users having strong socialconnections most of the time are found together in certainareas +is pattern is explicitly observed by looking cooc-currence value M (20 to 40) and gap time frame G 30

y1

y2

y3

User X User Y

x1

x2

x3

x4

x5

x6

Figure 4 User X and user Y cooccurrence count comparison case

RequireSN Social NetworkGVTime Gap +reshold ValueSNK 2 5 10 15

EnsureDTDirect Social TiesIST Infer Social Ties

(1) while SNne 0 do(2) DTDirect Ties (SNK)(3) end while(4) while SNne 0 do(5) CTKCalculate Cooccurrence Count (DT GV)(6) IST Infer Social Ties (CTK DT SNK)(7) end while

ALGORITHM 1 Social tie inference

6 Scientific Programming

minutes Another positive correlation is found between thedegree of friendship and physical presence at a specific place

To see the effect of friendship strength we evaluatedresults for the four different K values ie 2 5 10 and 15 Atypical pattern is found in all the graphs shown in Figure 5 Itshows that Precision is less for people whose mutualpresence is less at different sites Also people with strongsocial ties spent less than 1 hr time together at a specificlocation To understand the graphrsquos actual meaning wequantify and reconcile with the actual direct social ties It isobserved that a positive correlation in results infers that

people with strong social connections often visit placestogether

Figure 6(a) represents the Recall results We tested andevaluated Recall based on the same measures as Precisionie direct calls cooccurrence and gap time frameFigure 6(a) states that the value of Recall is at a maximumwhen we consider a less number of commonplaces Aninverse trend is observed between the values of Recall andM specifically for the M value ranging between 0 and 3+esame as Precision Recall is also evaluated based on sixdifferent time gap values

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

M

Prec

ision

K = 15

(a)

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

M

Prec

ision

K = 10

(b)

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

M

Prec

ision

K = 5

(c)

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

M

Prec

ision

K = 2

(d)

Figure 5 +e Precision formed between two callers due to social contact depending on the number of times they have connected throughthe same base station within a specific time frame Figure 5(a) shows the Precision for the six different time frame lines whereas M is thenumber of cooccurrences and K is the number of direct calls (a) K 15 (b) K 10 (c) K 5 and (d) K 2

Scientific Programming 7

+is part of the research finds peoplersquos cooccurrencebased on the same base station connectivity in a specific timeframe and infers them as social ties Furthermore it crossrelates the inferred results with direct call results

42 Brightkite- and Gowalla-Based Social Tie InferenceBrightkite and Gowalla data sets contain direct social ties aswell as the check-in information of each user In our studywe investigated both data sets based on several dimensionsand found some of the very interesting facts During theanalysis we observed positive correlations between mutualfriend count values MuF and user check-in details InBrightkite MuF ranging 6 to 40 and in Gowalla MuF

ranging 25 to 90 shows the positive corelation with Preci-sion We also measure the effect of gap time value on thePrecision and Recall and evaluated results based on severalgap time values

Figures 7(a) and 7(c) of line graph show the relationbetween mutual friend count values MuF and Precisionwhile Figures 7(b) and 7(d) show the relation betweenmutual friend count values MuF and Recall In Figure 7results shows that people having a certain number ofmutual friends use to visit place together or with little gapof interval

+e social tie inference framework infers some of theabsent ties as social ties in the form of false-positive resultsTo further extract the actual meaning of such incorrect

0

01

02

03

04

05

06

07

08

09

1Re

call

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 40M

K = 15

(a)

0

01

02

03

04

05

06

07

08

09

1

Reca

ll

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 40M

K = 10

(b)

0

01

02

03

04

05

06

07

08

09

1

Reca

ll

0 5 10 15 20 25 30 35 40

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

M

K = 5

(c)

0

01

02

03

04

05

06

07

08

09

1Re

call

0 5 10 15 20 25 30 35 40

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

M

K = 2

(d)

Figure 6+e Recall formed between two callers due to social contact depending upon the number of times they have connected through thesame base station within a specific time frame Figure 6(a) shows the Recall for the six different time frame lines whereas M is the number ofcooccurrences and K is the number of direct calls (a) K 15 (b) K 10 (c) K 5 and (d) K 2

8 Scientific Programming

inferences we conducted a social activity and simulation+e false-positive results of the first part of the research serve asthe foundation for the second part Activity under the first partdata set is conducted and the false-positive results are examinedby studying the human cooccurrence patterns described in thenext section Suspicious Ties+is stage of research gave us a clueto further exploit the category of missing ties

5 Suspicious Ties

An absent tie can be inferred as a suspect tie if it satisfies thefollowing properties

(1) M number of mutual friends(2) C number of cooccurrence on different sites

In the CDR data set each cell is treated as a single cellof the base station Our model adopts the followingfour features for evaluation

(1) Base station(2) User ID(3) Gap time threshold(4) Call time stamp

+e following mathematical model explains theproblem and its formulationLet BS denotes a set bsi1113864 1113865 of bsn points and bsi iscalled as the distinct base station cellLet DU denotes a set dui1113864 1113865 of dun points and dui iscalled as the distinct user

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 420001020304050607080910

MuF

Prec

ision

(a)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 42MuF

0001020304050607080910

Reca

ll

(b)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Prec

ision

1 10 20 30 40 50 60 70 80 90 100

(c)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Reca

ll

1 10 20 30 40 50 60 70 80 90 100

(d)

Figure 7 +e Precision and Recall formed between two users due to social contact depending on the number of times they have checked into certain location within a specific time frame Figure 7(a) shows the Precision for the six different time frame lines whereas MuF is thenumber of mutual friend count (a) Brightkite Precision (b) Brightkite Recall (c) Gowalla Precision and (d) Gowalla Recall

Scientific Programming 9

Let TH denotes a set thi1113864 1113865 of thn points and thi iscalled as the threshold value for timeframeLet C denotes a set ci1113864 1113865 of cn points and ci is called asthe call information

ci (id ct) whereid as call id

ct as outgoing call time1113896 (10)

Let R denotes a set ri1113864 1113865 of rn points then

ri dui ci(id)( 1113857 times dui ci(id)( 11138571113858 1113859anddui ne dui (11)

Let RT denotes a set rti1113864 1113865 of rtn points then

rti dui ci(ct)( 1113857 minus dui cti(ct)( 11138571113858 1113859anddui ne dui (12)

Let S denotes a set si1113864 1113865 of sn points then

si ri isin bsi( 1113857and rti le thi( 11138571113858 1113859 (13)

wheresi is a set of elements that identifies distinct callersbased on the same base station connectivity and a definitenumber of calls in a specific time frame

6 Suspect Inference Framework

We studied the pattern of exceptional cases belong to thefalse-positive set and described a subset of the false-positive set as suspect ties Physical activity was designedand conducted to investigate the formation of suspectsocial ties Activity consisting of 50 people and a data setwas generated within almost 4-5 hours A basketballcourt was utilized for the activity Nine circles weredrawn physically on the basketball court assumed as thebase station cell Out of 50 people nine were directed toact as a base station +e boundary of each circle wasconsidered as a range of the base station cell Rests of the41 persons were directed to perform the following twosteps

61 Selection Step Initially each person from 41 people wasasked to choose two sets of friends One set as obviousfriends and the second set of hidden friends such that thesize of the hidden friend set should be at most 15 of theobvious friend set size eg if one person has five people inapparent friend set he can have no more than one hiddenfriend After the selection of both sets by each individualinformation was shared with one of our representatives

62 Operation Step In this phase each person was directedto follow the following rules

(1) You should not call your hidden friend

(2) You should call all of your obvious friends at leastonce

(3) You should conduct a maximum number of calls toyour closest obvious friend and second maximum toa second level obvious friend and likewise to the leastfriend

(4) You must try to meet your hidden friend as much aspossible physically

+e method of calling is like if person A wants to callperson B from base station B1 the person has to go to thebase station B1 and register a call with a person acting andstanding in base station B1 Respective base station personwill write and make an entry with five parameters ie CallerName Callee Name Time From Base Station Name and ToBase Station Name +e data set was gathered in the fol-lowing format An example is given below

CallerName

CalleeName Time Stamp

FromBase

Station

ToBase

Station

User 1 User 12 02032019101508 BS 7 BS 2

For the understanding of the variations and patterns thesame activity was also designed using simulation +e wholesimulation followed the same conditions and another dataset was generated using a random function Based on theactivity a framework is designed to separate a class ofsuspect ties Proposed inference framework work is designedand implemented using pipe and filter architecture shownin Figure 8 Algorithm 2 shows the implementation ofsuspicious tie inference framework explained in Figure 8+e framework takes the social network matrix countthreshold value gap time value and mutual friend countvalue as inputs and filters the result Initially CalculateLevels() function finds the five level depth information foreach distinct user Let us say if A calls B B calls C C calls Dand D calls E it implies that A 0 B 1 C 2 D 3 andE 4 represent five levels +is step ensures that all the levelshave distinct users Secondly Find Suspects() functionselects only those sets of users from level 1 and level 3 that donot have any direct calls and M number of mutual friendsFurthermore Calculate Subsocial Network() functiongenerates subgraphs using level 1 and level 3 detailsdepending upon the gap time value Results are filtered onthese bases of time gap value eg two users called someother user while connected to the same base station withinthe given time frame explained in the previous section Afterthat Infer Hidden Ties() function uses the proposed nor-malization method to find the number of the count definedin equation (9) and then all results are filtered according tothe cooccurrence count CV and the mutual friend thresholdvalue M Based on mentioned parameters and thresholdsAlgorithm 2 significantly identifies the subclass of missingties as suspicious ties

+e results of the activity and simulation are computedand evaluated using Precision and Recall measures

10 Scientific Programming

RequireSN Social NetworkGVTime Gap+reshold ValueCVCooccurrence Count +reshold ValueMMutual Friend Count

EnsureST Inferred Social Suspect Ties

(1) while SNne 0 do(2) Levels [5 n]Calculate Levels()(3) Suspects Find Suspects (Level 1 Level 3)(4) SGCalculate Subsocial Network (Suspects GV)(5) ST Infer Suspect Ties (SG CV M) using equation (9)(6) end while

ALGORITHM 2 Suspicious hidden social tie inference

Table 1 Evaluation of simulation and social activity

Gap timeMge 2 Mge 5

Simulation Social activity Simulation Social activityP R F1 P R F1 P R F1 P R F1

CVge 5

GVlt 10 012 068 020 012 100 021 013 062 021 012 079 021GVlt 20 010 042 016 015 079 025 015 031 020 013 042 020GVlt 30 021 022 021 071 059 064 025 018 021 023 028 025GV ge 30 034 023 027 100 012 021 030 010 015 032 019 024

CVge 10

GVlt 10 010 033 015 012 048 019 009 030 014 011 061 019GVlt 20 011 028 016 027 050 021 009 018 012 010 039 016GVlt 30 013 018 015 029 030 029 009 010 009 009 020 012GV ge 30 013 008 010 026 005 008 010 004 006 013 007 009

CVge 20

GVlt 10 004 018 007 008 023 012 005 029 009 011 039 017GVlt 20 007 008 007 005 024 008 004 011 006 010 028 015GVlt 30 004 006 005 009 009 009 003 005 004 009 019 012GV ge 30 002 001 001 001 001 001 001 001 001 001 005 002

lowastP Precision R Recall andF1 F1 Score

Social network

Count number of co-occurtence

Suspect social tiesMutual friend threshold

value

Count threshold value

Calculate sub social networks via

time gap threshold value

Calculate 0 1 2 3 4 level ties

Highlight missing ties between

level 1 and level 3

((xn ym) times (ymβn))mm=1

nn=1=

Figure 8 Suspect social tie inference framework

Scientific Programming 11

Evaluation results of simulation and social activity con-ducted are shown in Table 1 Precision Recall and F1 Scoremeasures are used to evaluate the framework that is furthercalculated based on the cooccurrence count CV mutualfriend count M and gap time GV parameter setting valuesF1 Score is calculated using equation (5) Definitions of therelated parameters are given as follows

TP Which were hidden friends and system infer ashidden friends

TN Which were not hidden friends and system inferas not hidden friends

FP Which were not hidden friends and system inferas hidden friends

FN Which were hidden friends and system infer asnot hidden friends

63 Results and Discussion We tested and evaluated allrecords based on the cooccurrence count value CV mutualfriend M and gap time value GV as the gap timeFigures 9(a)ndash9(d) show the evaluation of results for theactivity based on three different values of the cooccurrencecount value CV ie CV 5 10 and 15 and two different

values of the mutual friend M ie Mge 2 and ge5 LikewiseFigures 10(a)ndash10(d) show the evaluation of results for thesimulation data set along with mentioned CV and M valuesResults were generated on four values of gap time ie ge 30lt 30 lt 20 and lt 10

In Figures 9(a) and 9(b) Recall is maximum and Pre-cision is less where the value of GVge 30 CVge 5 and Mge 2+e system obtains a maximum number of relevant hiddenties along with false-positive results By GVge 30 CVge 5 andMge 2 it means that the gap between the two calls is 30minsor more while the count of cooccurred events is keptminimum five and mutual friend count as two or less +esystemrsquos performance drops when gap time is reduced tolt 30 lt 20 and lt 10 Even though there is a drop in true-positive values a significant drop in the false-positive valuescan be seen Results become concise with the variation inboth values of GV and M +e limited data set collectedusing activity highlights the occurrences of hidden tiesWhole activity and simulation were designed to get similarfields of data as CDR so that the proposed framework iscompatible with the CDR data set

Results shown in Figures 9 and 10 exhibit the existence ofhidden relationships +e parameter such as the gap timevalue GV helps to identify the time frame selection such that

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1Pr

ecisi

onM ge 2

(a)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

(b)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Prec

ision

M ge 5

(c)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 9 Evaluation of social activity using Precision and Recall

12 Scientific Programming

what gap value is suitable to get optimal performanceLikewise Figures 10(a) to 10(d) represent the Precision andRecall curves for the simulation data set Results are gatheredby setting the same values for the threshold parameters CVGV and M Figures 10(a) and 10(d) give the least values forPrecision and Recall which tells that if threshold values aretightened so does the count relevant retrieve results get lowOut of all the results Precision and Recall values for bothsimulation and activity data set showmaximum if thresholdvalues GVge 30 CVge 5 and Mge 2 are set We found someexciting dissimilarities in the results between simulation andactivity data set results during the complete evaluationprocess +e simulation results do not give higher Recall andPrecision value compared to activity data results We con-cluded that the data set of simulation is generated usingrandom function while the activity had the human hidingpatterns +ese results also help to infer human psychologyin building hidden ties +e simulation data set generatorworks based on the same constraints mentioned as rules ofthe activity However the key difference between simulationand activity is the selection of friends hidden friend and the

pattern of calling While conducting the simulation trivialvariation in results was observed as the random selection hasrandom patterns According to our findings simulation andactivity both exhibit patterns of hidden ties However ac-tivity results are more pronounced and significantly iden-tifying the hidden relationships

It is essential to highlight some critical questions anddependent variables that help to find hidden social tiesbetween two people for example why two people hide theirsocial ties Is this deliberate or unintentional action What ifthey are deliberately hiding their social tie for a purpose Insuch a scenario extracting a social tie for two people is a kindof intense problem It is a kind of investigation processwhich explores a clue to draw some relationship betweentwo people Investigation designates if two people areposting a picture on social media to be more cautious aboutnot identifying their social ties While if they are doing someprivate activity they will be less careful for example postinga picture on Facebook or Flickr compared to calling to aperson from a specific location Although extracting hiddensocial ties include various privacy issues We designed

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Prec

ision

M ge 2

CV ge 5CV ge 10CV ge 15

(a)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

CV ge 5CV ge 10CV ge 15

(b)

Prec

ision

0010203040506070809

1M ge 5

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

(c)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

0010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 10 Evaluation of simulation using Precision and Recall

Scientific Programming 13

activity and simulation that generated the data set by thecaller data record (CDR) data set We thoroughly investi-gated the patterns of connectivity and established aframework to infer social ties +is research has opened up anew direction further to explore connectivity in the SocialInternet of +ings (SIoT) specifically machine-to-machinedirect communication and machine-to-human or human-to-machine hidden relationships In many cases severalmachines work together but they rarely have a directconnection

7 Conclusion and Future Work

In this research we have examined the developments ofsocial connection patterns based on physical gathering Inthe first part of our research we explored the correlationbetween direct communication and two individualsrsquo phys-ical presence To check the systemrsquos performance andevaluation we examined all results by utilizing Precision andRecall evaluation measures We also present a periodicnormalization equation for the cooccurrence count In thesecond phase we propose the suspect tie inference frame-work False-positive results of the first part of the researchare the ground to the second part of the study +e proposedframework adopts pipe and filter architecture where thethreshold values control each filter +e frameworkrsquos fun-damental objective is to take the data set such as CDR (callerdata record) and infer suspect social ties depending uponthe specified threshold values Analyzing the results criti-cally we propose a theory that identifies suspect social tiesBesides this for comparison and evaluation we conductedreal-time human-based activity and simulation Keeping inmind the structure of the actual CDR data set the wholeactivity was designed and evaluated In contrast to existingwork our research focus is on hidden ties instead of thehidden actors In the future we are aiming to explore thehomophilic nature of suspect ties within the Social Internetof +ings (SIoT)

Data Availability

+e data used can be found at httpsnapstanfordedudataindexhtmllocnet

Conflicts of Interest

+e authors declare that they have no conflicts of interestregarding the publication of this work

Acknowledgments

+is work was supported by King Saud University SaudiArabia through research supporting project number RSP-2020184 Nauman Ali Khan acknowledges the support ofthe Chinese Government and Chinese Scholarship Council(CSC) for his PhD studies at the University of Science andTechnology China +is research work was partially sup-ported by Key Program of the National Natural ScienceFoundation of China (grant number 61631018)

References

[1] L Weng M Karsai N Perra F Menczer and A FlamminildquoAttention on weak ties in social and communication net-worksrdquo in Complex Spreading Phenomena in Social Systemspp 213ndash228 Springer Berlin Germany 2018

[2] A M Ortiz D Hussein S Park S N Han and N Crespildquo+e cluster between internet of things and social networksreview and research challengesrdquo IEEE Internet of ingsJournal vol 1 no 3 pp 206ndash215 2014

[3] P Luarn and Y-P Chiu ldquoKey variables to predict tie strengthon social network sitesrdquo Internet Research vol 25 no 2pp 218ndash238 2015

[4] D Goel and K Lang ldquoSocial ties and the job search of recentimmigrantsrdquo ILR Review vol 72 no 2 pp 355ndash381 2019

[5] H Kizgin A Jamal N Rana Y Dwivedi and VWeerakkodyldquo+e impact of social networking sites on socialization andpolitical engagement role of acculturationrdquo TechnologicalForecasting and Social Change vol 145 pp 503ndash512 2019

[6] P Suebvises ldquoSocial capital citizen participation in publicadministration and public sector performance in +ailandrdquoWorld Development vol 109 pp 236ndash248 2018

[7] Z Liu Y Qiao S TaoW Lin and J Yang ldquoAnalyzing humanmobility and social relationships from cellular network datardquoin Proceedings of the 2017 13th International Conference onNetwork and Service Management (CNSM) pp 1ndash6 TokyoJapan November 2017

[8] L Matosas-Lopez and A Romero-Ania ldquo+e efficiency ofsocial network services management in organizations an in-depth analysis applying machine learning algorithms andmultiple linear regressionsrdquo Applied Sciences vol 10 no 15p 5167 2020

[9] M S Granovetter ldquo+e strength of weak tiesrdquo in SocialNetworks pp 347ndash367 Elsevier Amsterdam Netherlands1977

[10] T Nguyen L Zhang and A Culotta ldquoEstimating tie strengthin follower networks to measure brand perceptionsrdquo inProceedings of the 2019 IEEEACM International Conferenceon Advances in Social Networks Analysis and Mining (ASO-NAM) pp 779ndash786 Vancouver Canada August 2019

[11] H Mattie K Engoslash-Monsen R Ling and J-P OnnelaldquoUnderstanding tie strength in social networks using a localbow tie frameworkrdquo Scientific Reports vol 8 no 1 p 93492018

[12] D J Crandall L Backstrom D Cosley S SuriD Huttenlocher and J Kleinberg ldquoInferring social ties fromgeographic coincidencesrdquo Proceedings of the NationalAcademy of Sciences vol 107 no 52 pp 22436ndash24441 2010

[13] E Cho S A Myers and J Leskovec ldquoFriendship and mo-bility user movement in location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Con-ference on Knowledge Discovery and Data Mining pp 1082ndash1090 San Diego CA USA August 2011

[14] A Znidarsic P Doreian and A Ferligoj ldquoAbsent ties in socialnetworks their treatments and blockmodeling outcomesrdquoAdvances in Methodology amp StatisticsMetodoloski Zvezkivol 9 no 2 2012

[15] S A Myers A Sharma P Gupta and J Lin ldquoInformationnetwork or social network the structure of the twitter followgraphrdquo in Proceedings of the 23rd International Conference onWorld Wide Web pp 493ndash498 Seoul Republic of KoreaApril 2014

14 Scientific Programming

[16] X Zuo J Blackburn N Kourtellis J Skvoretz andA Iamnitchi ldquo+e power of indirect social tiesrdquo 2014 httparxivorgabs14014234

[17] M A Brandatildeo and M M Moro ldquoTie strength in co-au-thorship social networks analyses metrics and a new com-putational modelrdquo in Proceedings of the Anais Estendidos doXXV Simposio Brasileiro de Sistemas Multimıdia e Webpp 17ndash20 Rio de Janeiro Brazil 2019

[18] H A Khanday A H Ganai and R Hashmy ldquoA comparativeanalysis of identifying influential users in online social net-worksrdquo in Proceedings of the 2018 International Conference onSoft-Computing and Network Security (ICSNS) pp 1ndash6Coimbatore India February 2018

[19] R Garcia-Guzman Y A Andrade-Ambriz M-A Ibarra-Manzano S Ledesma J C Gomez and D-L Almanza-Ojeda ldquoTrend-based categories recommendations and age-gender prediction for pinterest and twitter usersrdquo AppliedSciences vol 10 no 17 p 5957 2020

[20] A Talukder M G R Alam N H Tran D Niyato andC S Hong ldquoKnapsack-based reverse influence maximizationfor target marketing in social networksrdquo IEEE Access vol 7pp 44182ndash44198 2019

[21] N Ampazis T Emmanouilidis and F Sakketou ldquoA matrixfactorization algorithm for efficient recommendations insocial rating networks using constrained optimizationrdquoMachine Learning and Knowledge Extraction vol 1 no 3p 928 2019

[22] N Zhou X Zhang and S Wang ldquo+eme-aware socialstrength inference from spatiotemporal datardquo in Proceedingsof the International Conference on Web-Age InformationManagement pp 498ndash509 Macau China June 2014

[23] P A Grabowicz J J Ramasco B Gonccedilalves andV M Eguıluz ldquoEntangling mobility and interactions in socialmediardquo PLoS One vol 9 no 3 Article ID e92196 2014

[24] D Yang B Qu J Yang and P Cudre-Mauroux ldquoRevisitinguser mobility and social relationships in LBSNs a hypergraphembedding approachrdquo in Proceedings of the World Wide WebConference pp 2147ndash2157 San Francisco NY USA May2019

[25] R-H Li J Liu J X Yu H Chen and H Kitagawa ldquoCo-occurrence prediction in a large location-based social net-workrdquo Frontiers of Computer Science vol 7 no 2pp 185ndash194 2013

[26] V Jayadevan K Bharadwaj A Kumar and P KhandelwalldquoDiscovering local social groups using mobility datardquo Inter-national Journal of Computer Applications vol 120 no 212015

[27] A J Jara Y Bocchi and D Genoud ldquoSocial internet of thingsthe potential of the internet of things for defining humanbehavioursrdquo in Proceedings of the 2014 International Con-ference on Intelligent Networking and Collaborative Systemspp 581ndash585 Salerno Italy September 2014

[28] W A D M Jayathilaka K Qi Y Qin et al ldquoSignificance ofnanomaterials in wearables a review on wearable actuatorsand sensorsrdquo Advanced Materials vol 31 no 7 Article ID1805921 2019

[29] J Ren H Guo C Xu and Y Zhang ldquoServing at the edge ascalable iot architecture based on transparent computingrdquoIEEE Network vol 31 no 5 pp 96ndash105 2017

[30] I Seeber E Bittner R O Briggs et al ldquoMachines as team-mates a research agenda on ai in team collaborationrdquo In-formation amp Management vol 57 no 2 Article ID 1031742020

[31] T Pi L Cao P Lv Z Ye and H Wang ldquoInferring implicitsocial ties in mobile social networksrdquo in Proceedings of the2018 IEEE Wireless Communications and Networking Con-ference (WCNC) pp 1ndash6 Barcelona Spain April 2018

[32] N Tahir A Hassan M Asif and S Ahmad ldquoMCD mutuallyconnected community detection using clustering coefficientapproach in social networksrdquo in Proceedings of the 2019 2ndInternational Conference on Communication Computing andDigital Systems (C-CODE) pp 160ndash165 Islamabad PakistanMarch 2019

[33] V A Lewis C A MacGregor and R D Putnam ldquoReligionnetworks and neighborliness the impact of religious socialnetworks on civic engagementrdquo Social Science Researchvol 42 no 2 pp 331ndash346 2013

[34] M S Granovetter ldquo+e strength of weak ties a networktheory revisitedrdquo Sociologicaleory vol 1 pp 201ndash233 1983

[35] J Gui Z Zheng X Zhao and Z Qin ldquoStatistical propertiesand temporal properties of calling behaviorrdquo in Proceedings ofthe 2019 IEEE 5th International Conference on Computer andCommunications (ICCC) pp 1940ndash1944 Chengdu ChinaDecember 2019

[36] I H Sarker ldquoA machine learning based robust predictionmodel for real-life mobile phone datardquo Internet of ingsvol 5 pp 180ndash193 2019

[37] M S Bernstein E Bakshy M Burke and B KarrerldquoQuantifying the invisible audience in social networksrdquo inProceedings of the SIGCHI Conference on Human Factors inComputing Systems pp 21ndash30 Paris France April 2013

[38] R Montasari R Hill V Carpenter and F Montaseri ldquoDigitalforensic investigation of social media acquisition and analysisof digital evidencerdquo International Journal of Strategic Engi-neering vol 2 no 1 pp 52ndash60 2019

[39] M Yip N Shadbolt and C Webber ldquoStructural analysis ofonline criminal social networksrdquo in Proceedings of the 2012IEEE International Conference on Intelligence and SecurityInformatics pp 60ndash65 Arlington VA USA June 2012

[40] Z Zhao C Li X Zhang F Chiclana and E H Viedma ldquoAnincremental method to detect communities in dynamicevolving social networksrdquo Knowledge-Based Systems vol 163pp 404ndash415 2019

[41] S Vosoughi D Roy and S Aral ldquo+e spread of true and falsenews onlinerdquo Science vol 359 no 6380 pp 1146ndash1151 2018

[42] Y Fang J Gao Z Liu and C Huang ldquoDetecting cyber threatevent from twitter using IDCNN and BILSTMrdquo AppliedSciences vol 10 no 17 p 5922 2020

[43] P Lu L Zhang X Liu J Yao and Z Zhu ldquoHighly efficientdata migration and backup for big data applications in elasticoptical inter-data-center networksrdquo IEEE Network vol 29no 5 pp 36ndash42 2015

[44] O Sharif M Hoque A Kayes R Nowrozy and I SarkerldquoDetecting suspicious texts using machine learning tech-niquesrdquo Applied Sciences vol 10 no 18 p 6527 2020

[45] K C Roy M Cebrian and S Hasan ldquoQuantifying humanmobility resilience to extreme events using geo-located socialmedia datardquo EPJ Data Science vol 8 no 1 p 18 2019

[46] A Squicciarini S Karumanchi D Lin and N DesistoldquoIdentifying hidden social circles for advanced privacy con-figurationrdquo Computers amp Security vol 41 pp 40ndash51 2014

[47] L Chen F W Crawford and A Karbasi ldquoSeeing the unseennetwork Inferring hidden social ties from respondent-drivensamplingrdquo in Proceedings of the irtieth AAAI Conference onArtificial Intelligence pp 1174ndash1180 Phoenix AZ USAFebruary 2016

Scientific Programming 15

[48] J Leskovec and A Krevl ldquoSNAP datasets stanford largenetwork dataset collectionrdquo 2014 httpsnapstanfordedudata

[49] X Zheng Y Wang and M A Orgun ldquoContextual sub-network extraction in contextual social networksrdquo in Pro-ceedings of the 2015 IEEE TrustcomBigDataSEISPApp 119ndash126 Helsinki Finland August 2015

[50] A Madani and M Marjan ldquoMining social networks to dis-cover ego sub-networksrdquo in Proceedings of the 2016 3rd MECInternational Conference on Big Data and Smart City(ICBDSC) pp 1ndash5 Muscat Oman March 2016

16 Scientific Programming

Page 6: InferringTiesinSocialIoTUsingLocation-BasedNetworksand ...3 JP]V3 ]4JJV3 Brightkite and Gowalla are openly available location-basedsocialnetworkdatasets[13,48].Bothdatasetsare gathered

direct ties we mean calling or direct connection For ex-ample person A calls person B refers to a direct tie betweenA and B Algorithm 1 takes GV SNK and CDR data sets(social network) as inputs Furthermore the algorithm hastwo parts initially it finds the direct ties between two in-dividuals depending upon the SNK threshold value Sec-ondly it counts the presence of two individuals based onseveral parameters +e Calculate Cooccurrence Count()function finds the number of cooccurrences using equation(9) explained in the previous section Infer Social Ties()function finds the social connections depending upon CTKDT and SNK and inferred them as the social ties

41 CDR-Based Social Tie Inference A social tie is inferredbetween two persons if they are found together at severalsites numerous times +e inference algorithm identifies twosets of results ie direct social ties and inferred social tiesFor the cross-validation of results we correlate the direct tieresults with the inferred ones Precision and Recall evalu-ation measures are used to examine the results We tested allrecords based on threshold values K is the direct calls M isthe times of cooccurrence and G is the time frame gap valueWhile N as direct call count shows the degree of friendship

more value of N indicates the friendship strength Figure 5shows the Precision graph which contains four setsFigures 5(a)ndash5(d) +e whole data set is examined based onK and the value of M and v

In Figure 5(a) the value of K is 15 which represents theusers with direct calls between each other equals to or greaterthan 15 +e value of M is the number of cooccurrence fortwo different users +e Precision values are comparativelysignificantly less for M in the range of 0 to 10 In contrastthe value of Precision increases exponentially for the value ofM in the range of 10 to 30 +e higher value of M indicateshigher cooccurrence of users A positive correlation can beobserved between the values of M and Precision It infersthat cooccurrence is a significant attribute that affectspositively in identifying social ties All graphs in Figure 5have six different lines each line represents the different timegap ranges It can also be seen that the values of gap value 30minutes are having more Precision while the rest lines of 1hour 2 hours 6 hours 12 hours and 24 hours are having lessPrecision +is also clues that the strength of ties has aspecific effect on Precision Users having strong socialconnections most of the time are found together in certainareas +is pattern is explicitly observed by looking cooc-currence value M (20 to 40) and gap time frame G 30

y1

y2

y3

User X User Y

x1

x2

x3

x4

x5

x6

Figure 4 User X and user Y cooccurrence count comparison case

RequireSN Social NetworkGVTime Gap +reshold ValueSNK 2 5 10 15

EnsureDTDirect Social TiesIST Infer Social Ties

(1) while SNne 0 do(2) DTDirect Ties (SNK)(3) end while(4) while SNne 0 do(5) CTKCalculate Cooccurrence Count (DT GV)(6) IST Infer Social Ties (CTK DT SNK)(7) end while

ALGORITHM 1 Social tie inference

6 Scientific Programming

minutes Another positive correlation is found between thedegree of friendship and physical presence at a specific place

To see the effect of friendship strength we evaluatedresults for the four different K values ie 2 5 10 and 15 Atypical pattern is found in all the graphs shown in Figure 5 Itshows that Precision is less for people whose mutualpresence is less at different sites Also people with strongsocial ties spent less than 1 hr time together at a specificlocation To understand the graphrsquos actual meaning wequantify and reconcile with the actual direct social ties It isobserved that a positive correlation in results infers that

people with strong social connections often visit placestogether

Figure 6(a) represents the Recall results We tested andevaluated Recall based on the same measures as Precisionie direct calls cooccurrence and gap time frameFigure 6(a) states that the value of Recall is at a maximumwhen we consider a less number of commonplaces Aninverse trend is observed between the values of Recall andM specifically for the M value ranging between 0 and 3+esame as Precision Recall is also evaluated based on sixdifferent time gap values

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

M

Prec

ision

K = 15

(a)

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

M

Prec

ision

K = 10

(b)

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

M

Prec

ision

K = 5

(c)

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

M

Prec

ision

K = 2

(d)

Figure 5 +e Precision formed between two callers due to social contact depending on the number of times they have connected throughthe same base station within a specific time frame Figure 5(a) shows the Precision for the six different time frame lines whereas M is thenumber of cooccurrences and K is the number of direct calls (a) K 15 (b) K 10 (c) K 5 and (d) K 2

Scientific Programming 7

+is part of the research finds peoplersquos cooccurrencebased on the same base station connectivity in a specific timeframe and infers them as social ties Furthermore it crossrelates the inferred results with direct call results

42 Brightkite- and Gowalla-Based Social Tie InferenceBrightkite and Gowalla data sets contain direct social ties aswell as the check-in information of each user In our studywe investigated both data sets based on several dimensionsand found some of the very interesting facts During theanalysis we observed positive correlations between mutualfriend count values MuF and user check-in details InBrightkite MuF ranging 6 to 40 and in Gowalla MuF

ranging 25 to 90 shows the positive corelation with Preci-sion We also measure the effect of gap time value on thePrecision and Recall and evaluated results based on severalgap time values

Figures 7(a) and 7(c) of line graph show the relationbetween mutual friend count values MuF and Precisionwhile Figures 7(b) and 7(d) show the relation betweenmutual friend count values MuF and Recall In Figure 7results shows that people having a certain number ofmutual friends use to visit place together or with little gapof interval

+e social tie inference framework infers some of theabsent ties as social ties in the form of false-positive resultsTo further extract the actual meaning of such incorrect

0

01

02

03

04

05

06

07

08

09

1Re

call

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 40M

K = 15

(a)

0

01

02

03

04

05

06

07

08

09

1

Reca

ll

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 40M

K = 10

(b)

0

01

02

03

04

05

06

07

08

09

1

Reca

ll

0 5 10 15 20 25 30 35 40

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

M

K = 5

(c)

0

01

02

03

04

05

06

07

08

09

1Re

call

0 5 10 15 20 25 30 35 40

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

M

K = 2

(d)

Figure 6+e Recall formed between two callers due to social contact depending upon the number of times they have connected through thesame base station within a specific time frame Figure 6(a) shows the Recall for the six different time frame lines whereas M is the number ofcooccurrences and K is the number of direct calls (a) K 15 (b) K 10 (c) K 5 and (d) K 2

8 Scientific Programming

inferences we conducted a social activity and simulation+e false-positive results of the first part of the research serve asthe foundation for the second part Activity under the first partdata set is conducted and the false-positive results are examinedby studying the human cooccurrence patterns described in thenext section Suspicious Ties+is stage of research gave us a clueto further exploit the category of missing ties

5 Suspicious Ties

An absent tie can be inferred as a suspect tie if it satisfies thefollowing properties

(1) M number of mutual friends(2) C number of cooccurrence on different sites

In the CDR data set each cell is treated as a single cellof the base station Our model adopts the followingfour features for evaluation

(1) Base station(2) User ID(3) Gap time threshold(4) Call time stamp

+e following mathematical model explains theproblem and its formulationLet BS denotes a set bsi1113864 1113865 of bsn points and bsi iscalled as the distinct base station cellLet DU denotes a set dui1113864 1113865 of dun points and dui iscalled as the distinct user

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 420001020304050607080910

MuF

Prec

ision

(a)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 42MuF

0001020304050607080910

Reca

ll

(b)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Prec

ision

1 10 20 30 40 50 60 70 80 90 100

(c)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Reca

ll

1 10 20 30 40 50 60 70 80 90 100

(d)

Figure 7 +e Precision and Recall formed between two users due to social contact depending on the number of times they have checked into certain location within a specific time frame Figure 7(a) shows the Precision for the six different time frame lines whereas MuF is thenumber of mutual friend count (a) Brightkite Precision (b) Brightkite Recall (c) Gowalla Precision and (d) Gowalla Recall

Scientific Programming 9

Let TH denotes a set thi1113864 1113865 of thn points and thi iscalled as the threshold value for timeframeLet C denotes a set ci1113864 1113865 of cn points and ci is called asthe call information

ci (id ct) whereid as call id

ct as outgoing call time1113896 (10)

Let R denotes a set ri1113864 1113865 of rn points then

ri dui ci(id)( 1113857 times dui ci(id)( 11138571113858 1113859anddui ne dui (11)

Let RT denotes a set rti1113864 1113865 of rtn points then

rti dui ci(ct)( 1113857 minus dui cti(ct)( 11138571113858 1113859anddui ne dui (12)

Let S denotes a set si1113864 1113865 of sn points then

si ri isin bsi( 1113857and rti le thi( 11138571113858 1113859 (13)

wheresi is a set of elements that identifies distinct callersbased on the same base station connectivity and a definitenumber of calls in a specific time frame

6 Suspect Inference Framework

We studied the pattern of exceptional cases belong to thefalse-positive set and described a subset of the false-positive set as suspect ties Physical activity was designedand conducted to investigate the formation of suspectsocial ties Activity consisting of 50 people and a data setwas generated within almost 4-5 hours A basketballcourt was utilized for the activity Nine circles weredrawn physically on the basketball court assumed as thebase station cell Out of 50 people nine were directed toact as a base station +e boundary of each circle wasconsidered as a range of the base station cell Rests of the41 persons were directed to perform the following twosteps

61 Selection Step Initially each person from 41 people wasasked to choose two sets of friends One set as obviousfriends and the second set of hidden friends such that thesize of the hidden friend set should be at most 15 of theobvious friend set size eg if one person has five people inapparent friend set he can have no more than one hiddenfriend After the selection of both sets by each individualinformation was shared with one of our representatives

62 Operation Step In this phase each person was directedto follow the following rules

(1) You should not call your hidden friend

(2) You should call all of your obvious friends at leastonce

(3) You should conduct a maximum number of calls toyour closest obvious friend and second maximum toa second level obvious friend and likewise to the leastfriend

(4) You must try to meet your hidden friend as much aspossible physically

+e method of calling is like if person A wants to callperson B from base station B1 the person has to go to thebase station B1 and register a call with a person acting andstanding in base station B1 Respective base station personwill write and make an entry with five parameters ie CallerName Callee Name Time From Base Station Name and ToBase Station Name +e data set was gathered in the fol-lowing format An example is given below

CallerName

CalleeName Time Stamp

FromBase

Station

ToBase

Station

User 1 User 12 02032019101508 BS 7 BS 2

For the understanding of the variations and patterns thesame activity was also designed using simulation +e wholesimulation followed the same conditions and another dataset was generated using a random function Based on theactivity a framework is designed to separate a class ofsuspect ties Proposed inference framework work is designedand implemented using pipe and filter architecture shownin Figure 8 Algorithm 2 shows the implementation ofsuspicious tie inference framework explained in Figure 8+e framework takes the social network matrix countthreshold value gap time value and mutual friend countvalue as inputs and filters the result Initially CalculateLevels() function finds the five level depth information foreach distinct user Let us say if A calls B B calls C C calls Dand D calls E it implies that A 0 B 1 C 2 D 3 andE 4 represent five levels +is step ensures that all the levelshave distinct users Secondly Find Suspects() functionselects only those sets of users from level 1 and level 3 that donot have any direct calls and M number of mutual friendsFurthermore Calculate Subsocial Network() functiongenerates subgraphs using level 1 and level 3 detailsdepending upon the gap time value Results are filtered onthese bases of time gap value eg two users called someother user while connected to the same base station withinthe given time frame explained in the previous section Afterthat Infer Hidden Ties() function uses the proposed nor-malization method to find the number of the count definedin equation (9) and then all results are filtered according tothe cooccurrence count CV and the mutual friend thresholdvalue M Based on mentioned parameters and thresholdsAlgorithm 2 significantly identifies the subclass of missingties as suspicious ties

+e results of the activity and simulation are computedand evaluated using Precision and Recall measures

10 Scientific Programming

RequireSN Social NetworkGVTime Gap+reshold ValueCVCooccurrence Count +reshold ValueMMutual Friend Count

EnsureST Inferred Social Suspect Ties

(1) while SNne 0 do(2) Levels [5 n]Calculate Levels()(3) Suspects Find Suspects (Level 1 Level 3)(4) SGCalculate Subsocial Network (Suspects GV)(5) ST Infer Suspect Ties (SG CV M) using equation (9)(6) end while

ALGORITHM 2 Suspicious hidden social tie inference

Table 1 Evaluation of simulation and social activity

Gap timeMge 2 Mge 5

Simulation Social activity Simulation Social activityP R F1 P R F1 P R F1 P R F1

CVge 5

GVlt 10 012 068 020 012 100 021 013 062 021 012 079 021GVlt 20 010 042 016 015 079 025 015 031 020 013 042 020GVlt 30 021 022 021 071 059 064 025 018 021 023 028 025GV ge 30 034 023 027 100 012 021 030 010 015 032 019 024

CVge 10

GVlt 10 010 033 015 012 048 019 009 030 014 011 061 019GVlt 20 011 028 016 027 050 021 009 018 012 010 039 016GVlt 30 013 018 015 029 030 029 009 010 009 009 020 012GV ge 30 013 008 010 026 005 008 010 004 006 013 007 009

CVge 20

GVlt 10 004 018 007 008 023 012 005 029 009 011 039 017GVlt 20 007 008 007 005 024 008 004 011 006 010 028 015GVlt 30 004 006 005 009 009 009 003 005 004 009 019 012GV ge 30 002 001 001 001 001 001 001 001 001 001 005 002

lowastP Precision R Recall andF1 F1 Score

Social network

Count number of co-occurtence

Suspect social tiesMutual friend threshold

value

Count threshold value

Calculate sub social networks via

time gap threshold value

Calculate 0 1 2 3 4 level ties

Highlight missing ties between

level 1 and level 3

((xn ym) times (ymβn))mm=1

nn=1=

Figure 8 Suspect social tie inference framework

Scientific Programming 11

Evaluation results of simulation and social activity con-ducted are shown in Table 1 Precision Recall and F1 Scoremeasures are used to evaluate the framework that is furthercalculated based on the cooccurrence count CV mutualfriend count M and gap time GV parameter setting valuesF1 Score is calculated using equation (5) Definitions of therelated parameters are given as follows

TP Which were hidden friends and system infer ashidden friends

TN Which were not hidden friends and system inferas not hidden friends

FP Which were not hidden friends and system inferas hidden friends

FN Which were hidden friends and system infer asnot hidden friends

63 Results and Discussion We tested and evaluated allrecords based on the cooccurrence count value CV mutualfriend M and gap time value GV as the gap timeFigures 9(a)ndash9(d) show the evaluation of results for theactivity based on three different values of the cooccurrencecount value CV ie CV 5 10 and 15 and two different

values of the mutual friend M ie Mge 2 and ge5 LikewiseFigures 10(a)ndash10(d) show the evaluation of results for thesimulation data set along with mentioned CV and M valuesResults were generated on four values of gap time ie ge 30lt 30 lt 20 and lt 10

In Figures 9(a) and 9(b) Recall is maximum and Pre-cision is less where the value of GVge 30 CVge 5 and Mge 2+e system obtains a maximum number of relevant hiddenties along with false-positive results By GVge 30 CVge 5 andMge 2 it means that the gap between the two calls is 30minsor more while the count of cooccurred events is keptminimum five and mutual friend count as two or less +esystemrsquos performance drops when gap time is reduced tolt 30 lt 20 and lt 10 Even though there is a drop in true-positive values a significant drop in the false-positive valuescan be seen Results become concise with the variation inboth values of GV and M +e limited data set collectedusing activity highlights the occurrences of hidden tiesWhole activity and simulation were designed to get similarfields of data as CDR so that the proposed framework iscompatible with the CDR data set

Results shown in Figures 9 and 10 exhibit the existence ofhidden relationships +e parameter such as the gap timevalue GV helps to identify the time frame selection such that

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1Pr

ecisi

onM ge 2

(a)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

(b)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Prec

ision

M ge 5

(c)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 9 Evaluation of social activity using Precision and Recall

12 Scientific Programming

what gap value is suitable to get optimal performanceLikewise Figures 10(a) to 10(d) represent the Precision andRecall curves for the simulation data set Results are gatheredby setting the same values for the threshold parameters CVGV and M Figures 10(a) and 10(d) give the least values forPrecision and Recall which tells that if threshold values aretightened so does the count relevant retrieve results get lowOut of all the results Precision and Recall values for bothsimulation and activity data set showmaximum if thresholdvalues GVge 30 CVge 5 and Mge 2 are set We found someexciting dissimilarities in the results between simulation andactivity data set results during the complete evaluationprocess +e simulation results do not give higher Recall andPrecision value compared to activity data results We con-cluded that the data set of simulation is generated usingrandom function while the activity had the human hidingpatterns +ese results also help to infer human psychologyin building hidden ties +e simulation data set generatorworks based on the same constraints mentioned as rules ofthe activity However the key difference between simulationand activity is the selection of friends hidden friend and the

pattern of calling While conducting the simulation trivialvariation in results was observed as the random selection hasrandom patterns According to our findings simulation andactivity both exhibit patterns of hidden ties However ac-tivity results are more pronounced and significantly iden-tifying the hidden relationships

It is essential to highlight some critical questions anddependent variables that help to find hidden social tiesbetween two people for example why two people hide theirsocial ties Is this deliberate or unintentional action What ifthey are deliberately hiding their social tie for a purpose Insuch a scenario extracting a social tie for two people is a kindof intense problem It is a kind of investigation processwhich explores a clue to draw some relationship betweentwo people Investigation designates if two people areposting a picture on social media to be more cautious aboutnot identifying their social ties While if they are doing someprivate activity they will be less careful for example postinga picture on Facebook or Flickr compared to calling to aperson from a specific location Although extracting hiddensocial ties include various privacy issues We designed

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Prec

ision

M ge 2

CV ge 5CV ge 10CV ge 15

(a)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

CV ge 5CV ge 10CV ge 15

(b)

Prec

ision

0010203040506070809

1M ge 5

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

(c)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

0010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 10 Evaluation of simulation using Precision and Recall

Scientific Programming 13

activity and simulation that generated the data set by thecaller data record (CDR) data set We thoroughly investi-gated the patterns of connectivity and established aframework to infer social ties +is research has opened up anew direction further to explore connectivity in the SocialInternet of +ings (SIoT) specifically machine-to-machinedirect communication and machine-to-human or human-to-machine hidden relationships In many cases severalmachines work together but they rarely have a directconnection

7 Conclusion and Future Work

In this research we have examined the developments ofsocial connection patterns based on physical gathering Inthe first part of our research we explored the correlationbetween direct communication and two individualsrsquo phys-ical presence To check the systemrsquos performance andevaluation we examined all results by utilizing Precision andRecall evaluation measures We also present a periodicnormalization equation for the cooccurrence count In thesecond phase we propose the suspect tie inference frame-work False-positive results of the first part of the researchare the ground to the second part of the study +e proposedframework adopts pipe and filter architecture where thethreshold values control each filter +e frameworkrsquos fun-damental objective is to take the data set such as CDR (callerdata record) and infer suspect social ties depending uponthe specified threshold values Analyzing the results criti-cally we propose a theory that identifies suspect social tiesBesides this for comparison and evaluation we conductedreal-time human-based activity and simulation Keeping inmind the structure of the actual CDR data set the wholeactivity was designed and evaluated In contrast to existingwork our research focus is on hidden ties instead of thehidden actors In the future we are aiming to explore thehomophilic nature of suspect ties within the Social Internetof +ings (SIoT)

Data Availability

+e data used can be found at httpsnapstanfordedudataindexhtmllocnet

Conflicts of Interest

+e authors declare that they have no conflicts of interestregarding the publication of this work

Acknowledgments

+is work was supported by King Saud University SaudiArabia through research supporting project number RSP-2020184 Nauman Ali Khan acknowledges the support ofthe Chinese Government and Chinese Scholarship Council(CSC) for his PhD studies at the University of Science andTechnology China +is research work was partially sup-ported by Key Program of the National Natural ScienceFoundation of China (grant number 61631018)

References

[1] L Weng M Karsai N Perra F Menczer and A FlamminildquoAttention on weak ties in social and communication net-worksrdquo in Complex Spreading Phenomena in Social Systemspp 213ndash228 Springer Berlin Germany 2018

[2] A M Ortiz D Hussein S Park S N Han and N Crespildquo+e cluster between internet of things and social networksreview and research challengesrdquo IEEE Internet of ingsJournal vol 1 no 3 pp 206ndash215 2014

[3] P Luarn and Y-P Chiu ldquoKey variables to predict tie strengthon social network sitesrdquo Internet Research vol 25 no 2pp 218ndash238 2015

[4] D Goel and K Lang ldquoSocial ties and the job search of recentimmigrantsrdquo ILR Review vol 72 no 2 pp 355ndash381 2019

[5] H Kizgin A Jamal N Rana Y Dwivedi and VWeerakkodyldquo+e impact of social networking sites on socialization andpolitical engagement role of acculturationrdquo TechnologicalForecasting and Social Change vol 145 pp 503ndash512 2019

[6] P Suebvises ldquoSocial capital citizen participation in publicadministration and public sector performance in +ailandrdquoWorld Development vol 109 pp 236ndash248 2018

[7] Z Liu Y Qiao S TaoW Lin and J Yang ldquoAnalyzing humanmobility and social relationships from cellular network datardquoin Proceedings of the 2017 13th International Conference onNetwork and Service Management (CNSM) pp 1ndash6 TokyoJapan November 2017

[8] L Matosas-Lopez and A Romero-Ania ldquo+e efficiency ofsocial network services management in organizations an in-depth analysis applying machine learning algorithms andmultiple linear regressionsrdquo Applied Sciences vol 10 no 15p 5167 2020

[9] M S Granovetter ldquo+e strength of weak tiesrdquo in SocialNetworks pp 347ndash367 Elsevier Amsterdam Netherlands1977

[10] T Nguyen L Zhang and A Culotta ldquoEstimating tie strengthin follower networks to measure brand perceptionsrdquo inProceedings of the 2019 IEEEACM International Conferenceon Advances in Social Networks Analysis and Mining (ASO-NAM) pp 779ndash786 Vancouver Canada August 2019

[11] H Mattie K Engoslash-Monsen R Ling and J-P OnnelaldquoUnderstanding tie strength in social networks using a localbow tie frameworkrdquo Scientific Reports vol 8 no 1 p 93492018

[12] D J Crandall L Backstrom D Cosley S SuriD Huttenlocher and J Kleinberg ldquoInferring social ties fromgeographic coincidencesrdquo Proceedings of the NationalAcademy of Sciences vol 107 no 52 pp 22436ndash24441 2010

[13] E Cho S A Myers and J Leskovec ldquoFriendship and mo-bility user movement in location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Con-ference on Knowledge Discovery and Data Mining pp 1082ndash1090 San Diego CA USA August 2011

[14] A Znidarsic P Doreian and A Ferligoj ldquoAbsent ties in socialnetworks their treatments and blockmodeling outcomesrdquoAdvances in Methodology amp StatisticsMetodoloski Zvezkivol 9 no 2 2012

[15] S A Myers A Sharma P Gupta and J Lin ldquoInformationnetwork or social network the structure of the twitter followgraphrdquo in Proceedings of the 23rd International Conference onWorld Wide Web pp 493ndash498 Seoul Republic of KoreaApril 2014

14 Scientific Programming

[16] X Zuo J Blackburn N Kourtellis J Skvoretz andA Iamnitchi ldquo+e power of indirect social tiesrdquo 2014 httparxivorgabs14014234

[17] M A Brandatildeo and M M Moro ldquoTie strength in co-au-thorship social networks analyses metrics and a new com-putational modelrdquo in Proceedings of the Anais Estendidos doXXV Simposio Brasileiro de Sistemas Multimıdia e Webpp 17ndash20 Rio de Janeiro Brazil 2019

[18] H A Khanday A H Ganai and R Hashmy ldquoA comparativeanalysis of identifying influential users in online social net-worksrdquo in Proceedings of the 2018 International Conference onSoft-Computing and Network Security (ICSNS) pp 1ndash6Coimbatore India February 2018

[19] R Garcia-Guzman Y A Andrade-Ambriz M-A Ibarra-Manzano S Ledesma J C Gomez and D-L Almanza-Ojeda ldquoTrend-based categories recommendations and age-gender prediction for pinterest and twitter usersrdquo AppliedSciences vol 10 no 17 p 5957 2020

[20] A Talukder M G R Alam N H Tran D Niyato andC S Hong ldquoKnapsack-based reverse influence maximizationfor target marketing in social networksrdquo IEEE Access vol 7pp 44182ndash44198 2019

[21] N Ampazis T Emmanouilidis and F Sakketou ldquoA matrixfactorization algorithm for efficient recommendations insocial rating networks using constrained optimizationrdquoMachine Learning and Knowledge Extraction vol 1 no 3p 928 2019

[22] N Zhou X Zhang and S Wang ldquo+eme-aware socialstrength inference from spatiotemporal datardquo in Proceedingsof the International Conference on Web-Age InformationManagement pp 498ndash509 Macau China June 2014

[23] P A Grabowicz J J Ramasco B Gonccedilalves andV M Eguıluz ldquoEntangling mobility and interactions in socialmediardquo PLoS One vol 9 no 3 Article ID e92196 2014

[24] D Yang B Qu J Yang and P Cudre-Mauroux ldquoRevisitinguser mobility and social relationships in LBSNs a hypergraphembedding approachrdquo in Proceedings of the World Wide WebConference pp 2147ndash2157 San Francisco NY USA May2019

[25] R-H Li J Liu J X Yu H Chen and H Kitagawa ldquoCo-occurrence prediction in a large location-based social net-workrdquo Frontiers of Computer Science vol 7 no 2pp 185ndash194 2013

[26] V Jayadevan K Bharadwaj A Kumar and P KhandelwalldquoDiscovering local social groups using mobility datardquo Inter-national Journal of Computer Applications vol 120 no 212015

[27] A J Jara Y Bocchi and D Genoud ldquoSocial internet of thingsthe potential of the internet of things for defining humanbehavioursrdquo in Proceedings of the 2014 International Con-ference on Intelligent Networking and Collaborative Systemspp 581ndash585 Salerno Italy September 2014

[28] W A D M Jayathilaka K Qi Y Qin et al ldquoSignificance ofnanomaterials in wearables a review on wearable actuatorsand sensorsrdquo Advanced Materials vol 31 no 7 Article ID1805921 2019

[29] J Ren H Guo C Xu and Y Zhang ldquoServing at the edge ascalable iot architecture based on transparent computingrdquoIEEE Network vol 31 no 5 pp 96ndash105 2017

[30] I Seeber E Bittner R O Briggs et al ldquoMachines as team-mates a research agenda on ai in team collaborationrdquo In-formation amp Management vol 57 no 2 Article ID 1031742020

[31] T Pi L Cao P Lv Z Ye and H Wang ldquoInferring implicitsocial ties in mobile social networksrdquo in Proceedings of the2018 IEEE Wireless Communications and Networking Con-ference (WCNC) pp 1ndash6 Barcelona Spain April 2018

[32] N Tahir A Hassan M Asif and S Ahmad ldquoMCD mutuallyconnected community detection using clustering coefficientapproach in social networksrdquo in Proceedings of the 2019 2ndInternational Conference on Communication Computing andDigital Systems (C-CODE) pp 160ndash165 Islamabad PakistanMarch 2019

[33] V A Lewis C A MacGregor and R D Putnam ldquoReligionnetworks and neighborliness the impact of religious socialnetworks on civic engagementrdquo Social Science Researchvol 42 no 2 pp 331ndash346 2013

[34] M S Granovetter ldquo+e strength of weak ties a networktheory revisitedrdquo Sociologicaleory vol 1 pp 201ndash233 1983

[35] J Gui Z Zheng X Zhao and Z Qin ldquoStatistical propertiesand temporal properties of calling behaviorrdquo in Proceedings ofthe 2019 IEEE 5th International Conference on Computer andCommunications (ICCC) pp 1940ndash1944 Chengdu ChinaDecember 2019

[36] I H Sarker ldquoA machine learning based robust predictionmodel for real-life mobile phone datardquo Internet of ingsvol 5 pp 180ndash193 2019

[37] M S Bernstein E Bakshy M Burke and B KarrerldquoQuantifying the invisible audience in social networksrdquo inProceedings of the SIGCHI Conference on Human Factors inComputing Systems pp 21ndash30 Paris France April 2013

[38] R Montasari R Hill V Carpenter and F Montaseri ldquoDigitalforensic investigation of social media acquisition and analysisof digital evidencerdquo International Journal of Strategic Engi-neering vol 2 no 1 pp 52ndash60 2019

[39] M Yip N Shadbolt and C Webber ldquoStructural analysis ofonline criminal social networksrdquo in Proceedings of the 2012IEEE International Conference on Intelligence and SecurityInformatics pp 60ndash65 Arlington VA USA June 2012

[40] Z Zhao C Li X Zhang F Chiclana and E H Viedma ldquoAnincremental method to detect communities in dynamicevolving social networksrdquo Knowledge-Based Systems vol 163pp 404ndash415 2019

[41] S Vosoughi D Roy and S Aral ldquo+e spread of true and falsenews onlinerdquo Science vol 359 no 6380 pp 1146ndash1151 2018

[42] Y Fang J Gao Z Liu and C Huang ldquoDetecting cyber threatevent from twitter using IDCNN and BILSTMrdquo AppliedSciences vol 10 no 17 p 5922 2020

[43] P Lu L Zhang X Liu J Yao and Z Zhu ldquoHighly efficientdata migration and backup for big data applications in elasticoptical inter-data-center networksrdquo IEEE Network vol 29no 5 pp 36ndash42 2015

[44] O Sharif M Hoque A Kayes R Nowrozy and I SarkerldquoDetecting suspicious texts using machine learning tech-niquesrdquo Applied Sciences vol 10 no 18 p 6527 2020

[45] K C Roy M Cebrian and S Hasan ldquoQuantifying humanmobility resilience to extreme events using geo-located socialmedia datardquo EPJ Data Science vol 8 no 1 p 18 2019

[46] A Squicciarini S Karumanchi D Lin and N DesistoldquoIdentifying hidden social circles for advanced privacy con-figurationrdquo Computers amp Security vol 41 pp 40ndash51 2014

[47] L Chen F W Crawford and A Karbasi ldquoSeeing the unseennetwork Inferring hidden social ties from respondent-drivensamplingrdquo in Proceedings of the irtieth AAAI Conference onArtificial Intelligence pp 1174ndash1180 Phoenix AZ USAFebruary 2016

Scientific Programming 15

[48] J Leskovec and A Krevl ldquoSNAP datasets stanford largenetwork dataset collectionrdquo 2014 httpsnapstanfordedudata

[49] X Zheng Y Wang and M A Orgun ldquoContextual sub-network extraction in contextual social networksrdquo in Pro-ceedings of the 2015 IEEE TrustcomBigDataSEISPApp 119ndash126 Helsinki Finland August 2015

[50] A Madani and M Marjan ldquoMining social networks to dis-cover ego sub-networksrdquo in Proceedings of the 2016 3rd MECInternational Conference on Big Data and Smart City(ICBDSC) pp 1ndash5 Muscat Oman March 2016

16 Scientific Programming

Page 7: InferringTiesinSocialIoTUsingLocation-BasedNetworksand ...3 JP]V3 ]4JJV3 Brightkite and Gowalla are openly available location-basedsocialnetworkdatasets[13,48].Bothdatasetsare gathered

minutes Another positive correlation is found between thedegree of friendship and physical presence at a specific place

To see the effect of friendship strength we evaluatedresults for the four different K values ie 2 5 10 and 15 Atypical pattern is found in all the graphs shown in Figure 5 Itshows that Precision is less for people whose mutualpresence is less at different sites Also people with strongsocial ties spent less than 1 hr time together at a specificlocation To understand the graphrsquos actual meaning wequantify and reconcile with the actual direct social ties It isobserved that a positive correlation in results infers that

people with strong social connections often visit placestogether

Figure 6(a) represents the Recall results We tested andevaluated Recall based on the same measures as Precisionie direct calls cooccurrence and gap time frameFigure 6(a) states that the value of Recall is at a maximumwhen we consider a less number of commonplaces Aninverse trend is observed between the values of Recall andM specifically for the M value ranging between 0 and 3+esame as Precision Recall is also evaluated based on sixdifferent time gap values

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

M

Prec

ision

K = 15

(a)

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

M

Prec

ision

K = 10

(b)

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

M

Prec

ision

K = 5

(c)

0 5 10 15 20 25 30 35 400

01

02

03

04

05

06

07

08

09

1

T = 30minT = 1hT = 2h

T = 6hT = 12hT = 24h

M

Prec

ision

K = 2

(d)

Figure 5 +e Precision formed between two callers due to social contact depending on the number of times they have connected throughthe same base station within a specific time frame Figure 5(a) shows the Precision for the six different time frame lines whereas M is thenumber of cooccurrences and K is the number of direct calls (a) K 15 (b) K 10 (c) K 5 and (d) K 2

Scientific Programming 7

+is part of the research finds peoplersquos cooccurrencebased on the same base station connectivity in a specific timeframe and infers them as social ties Furthermore it crossrelates the inferred results with direct call results

42 Brightkite- and Gowalla-Based Social Tie InferenceBrightkite and Gowalla data sets contain direct social ties aswell as the check-in information of each user In our studywe investigated both data sets based on several dimensionsand found some of the very interesting facts During theanalysis we observed positive correlations between mutualfriend count values MuF and user check-in details InBrightkite MuF ranging 6 to 40 and in Gowalla MuF

ranging 25 to 90 shows the positive corelation with Preci-sion We also measure the effect of gap time value on thePrecision and Recall and evaluated results based on severalgap time values

Figures 7(a) and 7(c) of line graph show the relationbetween mutual friend count values MuF and Precisionwhile Figures 7(b) and 7(d) show the relation betweenmutual friend count values MuF and Recall In Figure 7results shows that people having a certain number ofmutual friends use to visit place together or with little gapof interval

+e social tie inference framework infers some of theabsent ties as social ties in the form of false-positive resultsTo further extract the actual meaning of such incorrect

0

01

02

03

04

05

06

07

08

09

1Re

call

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 40M

K = 15

(a)

0

01

02

03

04

05

06

07

08

09

1

Reca

ll

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 40M

K = 10

(b)

0

01

02

03

04

05

06

07

08

09

1

Reca

ll

0 5 10 15 20 25 30 35 40

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

M

K = 5

(c)

0

01

02

03

04

05

06

07

08

09

1Re

call

0 5 10 15 20 25 30 35 40

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

M

K = 2

(d)

Figure 6+e Recall formed between two callers due to social contact depending upon the number of times they have connected through thesame base station within a specific time frame Figure 6(a) shows the Recall for the six different time frame lines whereas M is the number ofcooccurrences and K is the number of direct calls (a) K 15 (b) K 10 (c) K 5 and (d) K 2

8 Scientific Programming

inferences we conducted a social activity and simulation+e false-positive results of the first part of the research serve asthe foundation for the second part Activity under the first partdata set is conducted and the false-positive results are examinedby studying the human cooccurrence patterns described in thenext section Suspicious Ties+is stage of research gave us a clueto further exploit the category of missing ties

5 Suspicious Ties

An absent tie can be inferred as a suspect tie if it satisfies thefollowing properties

(1) M number of mutual friends(2) C number of cooccurrence on different sites

In the CDR data set each cell is treated as a single cellof the base station Our model adopts the followingfour features for evaluation

(1) Base station(2) User ID(3) Gap time threshold(4) Call time stamp

+e following mathematical model explains theproblem and its formulationLet BS denotes a set bsi1113864 1113865 of bsn points and bsi iscalled as the distinct base station cellLet DU denotes a set dui1113864 1113865 of dun points and dui iscalled as the distinct user

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 420001020304050607080910

MuF

Prec

ision

(a)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 42MuF

0001020304050607080910

Reca

ll

(b)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Prec

ision

1 10 20 30 40 50 60 70 80 90 100

(c)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Reca

ll

1 10 20 30 40 50 60 70 80 90 100

(d)

Figure 7 +e Precision and Recall formed between two users due to social contact depending on the number of times they have checked into certain location within a specific time frame Figure 7(a) shows the Precision for the six different time frame lines whereas MuF is thenumber of mutual friend count (a) Brightkite Precision (b) Brightkite Recall (c) Gowalla Precision and (d) Gowalla Recall

Scientific Programming 9

Let TH denotes a set thi1113864 1113865 of thn points and thi iscalled as the threshold value for timeframeLet C denotes a set ci1113864 1113865 of cn points and ci is called asthe call information

ci (id ct) whereid as call id

ct as outgoing call time1113896 (10)

Let R denotes a set ri1113864 1113865 of rn points then

ri dui ci(id)( 1113857 times dui ci(id)( 11138571113858 1113859anddui ne dui (11)

Let RT denotes a set rti1113864 1113865 of rtn points then

rti dui ci(ct)( 1113857 minus dui cti(ct)( 11138571113858 1113859anddui ne dui (12)

Let S denotes a set si1113864 1113865 of sn points then

si ri isin bsi( 1113857and rti le thi( 11138571113858 1113859 (13)

wheresi is a set of elements that identifies distinct callersbased on the same base station connectivity and a definitenumber of calls in a specific time frame

6 Suspect Inference Framework

We studied the pattern of exceptional cases belong to thefalse-positive set and described a subset of the false-positive set as suspect ties Physical activity was designedand conducted to investigate the formation of suspectsocial ties Activity consisting of 50 people and a data setwas generated within almost 4-5 hours A basketballcourt was utilized for the activity Nine circles weredrawn physically on the basketball court assumed as thebase station cell Out of 50 people nine were directed toact as a base station +e boundary of each circle wasconsidered as a range of the base station cell Rests of the41 persons were directed to perform the following twosteps

61 Selection Step Initially each person from 41 people wasasked to choose two sets of friends One set as obviousfriends and the second set of hidden friends such that thesize of the hidden friend set should be at most 15 of theobvious friend set size eg if one person has five people inapparent friend set he can have no more than one hiddenfriend After the selection of both sets by each individualinformation was shared with one of our representatives

62 Operation Step In this phase each person was directedto follow the following rules

(1) You should not call your hidden friend

(2) You should call all of your obvious friends at leastonce

(3) You should conduct a maximum number of calls toyour closest obvious friend and second maximum toa second level obvious friend and likewise to the leastfriend

(4) You must try to meet your hidden friend as much aspossible physically

+e method of calling is like if person A wants to callperson B from base station B1 the person has to go to thebase station B1 and register a call with a person acting andstanding in base station B1 Respective base station personwill write and make an entry with five parameters ie CallerName Callee Name Time From Base Station Name and ToBase Station Name +e data set was gathered in the fol-lowing format An example is given below

CallerName

CalleeName Time Stamp

FromBase

Station

ToBase

Station

User 1 User 12 02032019101508 BS 7 BS 2

For the understanding of the variations and patterns thesame activity was also designed using simulation +e wholesimulation followed the same conditions and another dataset was generated using a random function Based on theactivity a framework is designed to separate a class ofsuspect ties Proposed inference framework work is designedand implemented using pipe and filter architecture shownin Figure 8 Algorithm 2 shows the implementation ofsuspicious tie inference framework explained in Figure 8+e framework takes the social network matrix countthreshold value gap time value and mutual friend countvalue as inputs and filters the result Initially CalculateLevels() function finds the five level depth information foreach distinct user Let us say if A calls B B calls C C calls Dand D calls E it implies that A 0 B 1 C 2 D 3 andE 4 represent five levels +is step ensures that all the levelshave distinct users Secondly Find Suspects() functionselects only those sets of users from level 1 and level 3 that donot have any direct calls and M number of mutual friendsFurthermore Calculate Subsocial Network() functiongenerates subgraphs using level 1 and level 3 detailsdepending upon the gap time value Results are filtered onthese bases of time gap value eg two users called someother user while connected to the same base station withinthe given time frame explained in the previous section Afterthat Infer Hidden Ties() function uses the proposed nor-malization method to find the number of the count definedin equation (9) and then all results are filtered according tothe cooccurrence count CV and the mutual friend thresholdvalue M Based on mentioned parameters and thresholdsAlgorithm 2 significantly identifies the subclass of missingties as suspicious ties

+e results of the activity and simulation are computedand evaluated using Precision and Recall measures

10 Scientific Programming

RequireSN Social NetworkGVTime Gap+reshold ValueCVCooccurrence Count +reshold ValueMMutual Friend Count

EnsureST Inferred Social Suspect Ties

(1) while SNne 0 do(2) Levels [5 n]Calculate Levels()(3) Suspects Find Suspects (Level 1 Level 3)(4) SGCalculate Subsocial Network (Suspects GV)(5) ST Infer Suspect Ties (SG CV M) using equation (9)(6) end while

ALGORITHM 2 Suspicious hidden social tie inference

Table 1 Evaluation of simulation and social activity

Gap timeMge 2 Mge 5

Simulation Social activity Simulation Social activityP R F1 P R F1 P R F1 P R F1

CVge 5

GVlt 10 012 068 020 012 100 021 013 062 021 012 079 021GVlt 20 010 042 016 015 079 025 015 031 020 013 042 020GVlt 30 021 022 021 071 059 064 025 018 021 023 028 025GV ge 30 034 023 027 100 012 021 030 010 015 032 019 024

CVge 10

GVlt 10 010 033 015 012 048 019 009 030 014 011 061 019GVlt 20 011 028 016 027 050 021 009 018 012 010 039 016GVlt 30 013 018 015 029 030 029 009 010 009 009 020 012GV ge 30 013 008 010 026 005 008 010 004 006 013 007 009

CVge 20

GVlt 10 004 018 007 008 023 012 005 029 009 011 039 017GVlt 20 007 008 007 005 024 008 004 011 006 010 028 015GVlt 30 004 006 005 009 009 009 003 005 004 009 019 012GV ge 30 002 001 001 001 001 001 001 001 001 001 005 002

lowastP Precision R Recall andF1 F1 Score

Social network

Count number of co-occurtence

Suspect social tiesMutual friend threshold

value

Count threshold value

Calculate sub social networks via

time gap threshold value

Calculate 0 1 2 3 4 level ties

Highlight missing ties between

level 1 and level 3

((xn ym) times (ymβn))mm=1

nn=1=

Figure 8 Suspect social tie inference framework

Scientific Programming 11

Evaluation results of simulation and social activity con-ducted are shown in Table 1 Precision Recall and F1 Scoremeasures are used to evaluate the framework that is furthercalculated based on the cooccurrence count CV mutualfriend count M and gap time GV parameter setting valuesF1 Score is calculated using equation (5) Definitions of therelated parameters are given as follows

TP Which were hidden friends and system infer ashidden friends

TN Which were not hidden friends and system inferas not hidden friends

FP Which were not hidden friends and system inferas hidden friends

FN Which were hidden friends and system infer asnot hidden friends

63 Results and Discussion We tested and evaluated allrecords based on the cooccurrence count value CV mutualfriend M and gap time value GV as the gap timeFigures 9(a)ndash9(d) show the evaluation of results for theactivity based on three different values of the cooccurrencecount value CV ie CV 5 10 and 15 and two different

values of the mutual friend M ie Mge 2 and ge5 LikewiseFigures 10(a)ndash10(d) show the evaluation of results for thesimulation data set along with mentioned CV and M valuesResults were generated on four values of gap time ie ge 30lt 30 lt 20 and lt 10

In Figures 9(a) and 9(b) Recall is maximum and Pre-cision is less where the value of GVge 30 CVge 5 and Mge 2+e system obtains a maximum number of relevant hiddenties along with false-positive results By GVge 30 CVge 5 andMge 2 it means that the gap between the two calls is 30minsor more while the count of cooccurred events is keptminimum five and mutual friend count as two or less +esystemrsquos performance drops when gap time is reduced tolt 30 lt 20 and lt 10 Even though there is a drop in true-positive values a significant drop in the false-positive valuescan be seen Results become concise with the variation inboth values of GV and M +e limited data set collectedusing activity highlights the occurrences of hidden tiesWhole activity and simulation were designed to get similarfields of data as CDR so that the proposed framework iscompatible with the CDR data set

Results shown in Figures 9 and 10 exhibit the existence ofhidden relationships +e parameter such as the gap timevalue GV helps to identify the time frame selection such that

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1Pr

ecisi

onM ge 2

(a)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

(b)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Prec

ision

M ge 5

(c)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 9 Evaluation of social activity using Precision and Recall

12 Scientific Programming

what gap value is suitable to get optimal performanceLikewise Figures 10(a) to 10(d) represent the Precision andRecall curves for the simulation data set Results are gatheredby setting the same values for the threshold parameters CVGV and M Figures 10(a) and 10(d) give the least values forPrecision and Recall which tells that if threshold values aretightened so does the count relevant retrieve results get lowOut of all the results Precision and Recall values for bothsimulation and activity data set showmaximum if thresholdvalues GVge 30 CVge 5 and Mge 2 are set We found someexciting dissimilarities in the results between simulation andactivity data set results during the complete evaluationprocess +e simulation results do not give higher Recall andPrecision value compared to activity data results We con-cluded that the data set of simulation is generated usingrandom function while the activity had the human hidingpatterns +ese results also help to infer human psychologyin building hidden ties +e simulation data set generatorworks based on the same constraints mentioned as rules ofthe activity However the key difference between simulationand activity is the selection of friends hidden friend and the

pattern of calling While conducting the simulation trivialvariation in results was observed as the random selection hasrandom patterns According to our findings simulation andactivity both exhibit patterns of hidden ties However ac-tivity results are more pronounced and significantly iden-tifying the hidden relationships

It is essential to highlight some critical questions anddependent variables that help to find hidden social tiesbetween two people for example why two people hide theirsocial ties Is this deliberate or unintentional action What ifthey are deliberately hiding their social tie for a purpose Insuch a scenario extracting a social tie for two people is a kindof intense problem It is a kind of investigation processwhich explores a clue to draw some relationship betweentwo people Investigation designates if two people areposting a picture on social media to be more cautious aboutnot identifying their social ties While if they are doing someprivate activity they will be less careful for example postinga picture on Facebook or Flickr compared to calling to aperson from a specific location Although extracting hiddensocial ties include various privacy issues We designed

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Prec

ision

M ge 2

CV ge 5CV ge 10CV ge 15

(a)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

CV ge 5CV ge 10CV ge 15

(b)

Prec

ision

0010203040506070809

1M ge 5

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

(c)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

0010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 10 Evaluation of simulation using Precision and Recall

Scientific Programming 13

activity and simulation that generated the data set by thecaller data record (CDR) data set We thoroughly investi-gated the patterns of connectivity and established aframework to infer social ties +is research has opened up anew direction further to explore connectivity in the SocialInternet of +ings (SIoT) specifically machine-to-machinedirect communication and machine-to-human or human-to-machine hidden relationships In many cases severalmachines work together but they rarely have a directconnection

7 Conclusion and Future Work

In this research we have examined the developments ofsocial connection patterns based on physical gathering Inthe first part of our research we explored the correlationbetween direct communication and two individualsrsquo phys-ical presence To check the systemrsquos performance andevaluation we examined all results by utilizing Precision andRecall evaluation measures We also present a periodicnormalization equation for the cooccurrence count In thesecond phase we propose the suspect tie inference frame-work False-positive results of the first part of the researchare the ground to the second part of the study +e proposedframework adopts pipe and filter architecture where thethreshold values control each filter +e frameworkrsquos fun-damental objective is to take the data set such as CDR (callerdata record) and infer suspect social ties depending uponthe specified threshold values Analyzing the results criti-cally we propose a theory that identifies suspect social tiesBesides this for comparison and evaluation we conductedreal-time human-based activity and simulation Keeping inmind the structure of the actual CDR data set the wholeactivity was designed and evaluated In contrast to existingwork our research focus is on hidden ties instead of thehidden actors In the future we are aiming to explore thehomophilic nature of suspect ties within the Social Internetof +ings (SIoT)

Data Availability

+e data used can be found at httpsnapstanfordedudataindexhtmllocnet

Conflicts of Interest

+e authors declare that they have no conflicts of interestregarding the publication of this work

Acknowledgments

+is work was supported by King Saud University SaudiArabia through research supporting project number RSP-2020184 Nauman Ali Khan acknowledges the support ofthe Chinese Government and Chinese Scholarship Council(CSC) for his PhD studies at the University of Science andTechnology China +is research work was partially sup-ported by Key Program of the National Natural ScienceFoundation of China (grant number 61631018)

References

[1] L Weng M Karsai N Perra F Menczer and A FlamminildquoAttention on weak ties in social and communication net-worksrdquo in Complex Spreading Phenomena in Social Systemspp 213ndash228 Springer Berlin Germany 2018

[2] A M Ortiz D Hussein S Park S N Han and N Crespildquo+e cluster between internet of things and social networksreview and research challengesrdquo IEEE Internet of ingsJournal vol 1 no 3 pp 206ndash215 2014

[3] P Luarn and Y-P Chiu ldquoKey variables to predict tie strengthon social network sitesrdquo Internet Research vol 25 no 2pp 218ndash238 2015

[4] D Goel and K Lang ldquoSocial ties and the job search of recentimmigrantsrdquo ILR Review vol 72 no 2 pp 355ndash381 2019

[5] H Kizgin A Jamal N Rana Y Dwivedi and VWeerakkodyldquo+e impact of social networking sites on socialization andpolitical engagement role of acculturationrdquo TechnologicalForecasting and Social Change vol 145 pp 503ndash512 2019

[6] P Suebvises ldquoSocial capital citizen participation in publicadministration and public sector performance in +ailandrdquoWorld Development vol 109 pp 236ndash248 2018

[7] Z Liu Y Qiao S TaoW Lin and J Yang ldquoAnalyzing humanmobility and social relationships from cellular network datardquoin Proceedings of the 2017 13th International Conference onNetwork and Service Management (CNSM) pp 1ndash6 TokyoJapan November 2017

[8] L Matosas-Lopez and A Romero-Ania ldquo+e efficiency ofsocial network services management in organizations an in-depth analysis applying machine learning algorithms andmultiple linear regressionsrdquo Applied Sciences vol 10 no 15p 5167 2020

[9] M S Granovetter ldquo+e strength of weak tiesrdquo in SocialNetworks pp 347ndash367 Elsevier Amsterdam Netherlands1977

[10] T Nguyen L Zhang and A Culotta ldquoEstimating tie strengthin follower networks to measure brand perceptionsrdquo inProceedings of the 2019 IEEEACM International Conferenceon Advances in Social Networks Analysis and Mining (ASO-NAM) pp 779ndash786 Vancouver Canada August 2019

[11] H Mattie K Engoslash-Monsen R Ling and J-P OnnelaldquoUnderstanding tie strength in social networks using a localbow tie frameworkrdquo Scientific Reports vol 8 no 1 p 93492018

[12] D J Crandall L Backstrom D Cosley S SuriD Huttenlocher and J Kleinberg ldquoInferring social ties fromgeographic coincidencesrdquo Proceedings of the NationalAcademy of Sciences vol 107 no 52 pp 22436ndash24441 2010

[13] E Cho S A Myers and J Leskovec ldquoFriendship and mo-bility user movement in location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Con-ference on Knowledge Discovery and Data Mining pp 1082ndash1090 San Diego CA USA August 2011

[14] A Znidarsic P Doreian and A Ferligoj ldquoAbsent ties in socialnetworks their treatments and blockmodeling outcomesrdquoAdvances in Methodology amp StatisticsMetodoloski Zvezkivol 9 no 2 2012

[15] S A Myers A Sharma P Gupta and J Lin ldquoInformationnetwork or social network the structure of the twitter followgraphrdquo in Proceedings of the 23rd International Conference onWorld Wide Web pp 493ndash498 Seoul Republic of KoreaApril 2014

14 Scientific Programming

[16] X Zuo J Blackburn N Kourtellis J Skvoretz andA Iamnitchi ldquo+e power of indirect social tiesrdquo 2014 httparxivorgabs14014234

[17] M A Brandatildeo and M M Moro ldquoTie strength in co-au-thorship social networks analyses metrics and a new com-putational modelrdquo in Proceedings of the Anais Estendidos doXXV Simposio Brasileiro de Sistemas Multimıdia e Webpp 17ndash20 Rio de Janeiro Brazil 2019

[18] H A Khanday A H Ganai and R Hashmy ldquoA comparativeanalysis of identifying influential users in online social net-worksrdquo in Proceedings of the 2018 International Conference onSoft-Computing and Network Security (ICSNS) pp 1ndash6Coimbatore India February 2018

[19] R Garcia-Guzman Y A Andrade-Ambriz M-A Ibarra-Manzano S Ledesma J C Gomez and D-L Almanza-Ojeda ldquoTrend-based categories recommendations and age-gender prediction for pinterest and twitter usersrdquo AppliedSciences vol 10 no 17 p 5957 2020

[20] A Talukder M G R Alam N H Tran D Niyato andC S Hong ldquoKnapsack-based reverse influence maximizationfor target marketing in social networksrdquo IEEE Access vol 7pp 44182ndash44198 2019

[21] N Ampazis T Emmanouilidis and F Sakketou ldquoA matrixfactorization algorithm for efficient recommendations insocial rating networks using constrained optimizationrdquoMachine Learning and Knowledge Extraction vol 1 no 3p 928 2019

[22] N Zhou X Zhang and S Wang ldquo+eme-aware socialstrength inference from spatiotemporal datardquo in Proceedingsof the International Conference on Web-Age InformationManagement pp 498ndash509 Macau China June 2014

[23] P A Grabowicz J J Ramasco B Gonccedilalves andV M Eguıluz ldquoEntangling mobility and interactions in socialmediardquo PLoS One vol 9 no 3 Article ID e92196 2014

[24] D Yang B Qu J Yang and P Cudre-Mauroux ldquoRevisitinguser mobility and social relationships in LBSNs a hypergraphembedding approachrdquo in Proceedings of the World Wide WebConference pp 2147ndash2157 San Francisco NY USA May2019

[25] R-H Li J Liu J X Yu H Chen and H Kitagawa ldquoCo-occurrence prediction in a large location-based social net-workrdquo Frontiers of Computer Science vol 7 no 2pp 185ndash194 2013

[26] V Jayadevan K Bharadwaj A Kumar and P KhandelwalldquoDiscovering local social groups using mobility datardquo Inter-national Journal of Computer Applications vol 120 no 212015

[27] A J Jara Y Bocchi and D Genoud ldquoSocial internet of thingsthe potential of the internet of things for defining humanbehavioursrdquo in Proceedings of the 2014 International Con-ference on Intelligent Networking and Collaborative Systemspp 581ndash585 Salerno Italy September 2014

[28] W A D M Jayathilaka K Qi Y Qin et al ldquoSignificance ofnanomaterials in wearables a review on wearable actuatorsand sensorsrdquo Advanced Materials vol 31 no 7 Article ID1805921 2019

[29] J Ren H Guo C Xu and Y Zhang ldquoServing at the edge ascalable iot architecture based on transparent computingrdquoIEEE Network vol 31 no 5 pp 96ndash105 2017

[30] I Seeber E Bittner R O Briggs et al ldquoMachines as team-mates a research agenda on ai in team collaborationrdquo In-formation amp Management vol 57 no 2 Article ID 1031742020

[31] T Pi L Cao P Lv Z Ye and H Wang ldquoInferring implicitsocial ties in mobile social networksrdquo in Proceedings of the2018 IEEE Wireless Communications and Networking Con-ference (WCNC) pp 1ndash6 Barcelona Spain April 2018

[32] N Tahir A Hassan M Asif and S Ahmad ldquoMCD mutuallyconnected community detection using clustering coefficientapproach in social networksrdquo in Proceedings of the 2019 2ndInternational Conference on Communication Computing andDigital Systems (C-CODE) pp 160ndash165 Islamabad PakistanMarch 2019

[33] V A Lewis C A MacGregor and R D Putnam ldquoReligionnetworks and neighborliness the impact of religious socialnetworks on civic engagementrdquo Social Science Researchvol 42 no 2 pp 331ndash346 2013

[34] M S Granovetter ldquo+e strength of weak ties a networktheory revisitedrdquo Sociologicaleory vol 1 pp 201ndash233 1983

[35] J Gui Z Zheng X Zhao and Z Qin ldquoStatistical propertiesand temporal properties of calling behaviorrdquo in Proceedings ofthe 2019 IEEE 5th International Conference on Computer andCommunications (ICCC) pp 1940ndash1944 Chengdu ChinaDecember 2019

[36] I H Sarker ldquoA machine learning based robust predictionmodel for real-life mobile phone datardquo Internet of ingsvol 5 pp 180ndash193 2019

[37] M S Bernstein E Bakshy M Burke and B KarrerldquoQuantifying the invisible audience in social networksrdquo inProceedings of the SIGCHI Conference on Human Factors inComputing Systems pp 21ndash30 Paris France April 2013

[38] R Montasari R Hill V Carpenter and F Montaseri ldquoDigitalforensic investigation of social media acquisition and analysisof digital evidencerdquo International Journal of Strategic Engi-neering vol 2 no 1 pp 52ndash60 2019

[39] M Yip N Shadbolt and C Webber ldquoStructural analysis ofonline criminal social networksrdquo in Proceedings of the 2012IEEE International Conference on Intelligence and SecurityInformatics pp 60ndash65 Arlington VA USA June 2012

[40] Z Zhao C Li X Zhang F Chiclana and E H Viedma ldquoAnincremental method to detect communities in dynamicevolving social networksrdquo Knowledge-Based Systems vol 163pp 404ndash415 2019

[41] S Vosoughi D Roy and S Aral ldquo+e spread of true and falsenews onlinerdquo Science vol 359 no 6380 pp 1146ndash1151 2018

[42] Y Fang J Gao Z Liu and C Huang ldquoDetecting cyber threatevent from twitter using IDCNN and BILSTMrdquo AppliedSciences vol 10 no 17 p 5922 2020

[43] P Lu L Zhang X Liu J Yao and Z Zhu ldquoHighly efficientdata migration and backup for big data applications in elasticoptical inter-data-center networksrdquo IEEE Network vol 29no 5 pp 36ndash42 2015

[44] O Sharif M Hoque A Kayes R Nowrozy and I SarkerldquoDetecting suspicious texts using machine learning tech-niquesrdquo Applied Sciences vol 10 no 18 p 6527 2020

[45] K C Roy M Cebrian and S Hasan ldquoQuantifying humanmobility resilience to extreme events using geo-located socialmedia datardquo EPJ Data Science vol 8 no 1 p 18 2019

[46] A Squicciarini S Karumanchi D Lin and N DesistoldquoIdentifying hidden social circles for advanced privacy con-figurationrdquo Computers amp Security vol 41 pp 40ndash51 2014

[47] L Chen F W Crawford and A Karbasi ldquoSeeing the unseennetwork Inferring hidden social ties from respondent-drivensamplingrdquo in Proceedings of the irtieth AAAI Conference onArtificial Intelligence pp 1174ndash1180 Phoenix AZ USAFebruary 2016

Scientific Programming 15

[48] J Leskovec and A Krevl ldquoSNAP datasets stanford largenetwork dataset collectionrdquo 2014 httpsnapstanfordedudata

[49] X Zheng Y Wang and M A Orgun ldquoContextual sub-network extraction in contextual social networksrdquo in Pro-ceedings of the 2015 IEEE TrustcomBigDataSEISPApp 119ndash126 Helsinki Finland August 2015

[50] A Madani and M Marjan ldquoMining social networks to dis-cover ego sub-networksrdquo in Proceedings of the 2016 3rd MECInternational Conference on Big Data and Smart City(ICBDSC) pp 1ndash5 Muscat Oman March 2016

16 Scientific Programming

Page 8: InferringTiesinSocialIoTUsingLocation-BasedNetworksand ...3 JP]V3 ]4JJV3 Brightkite and Gowalla are openly available location-basedsocialnetworkdatasets[13,48].Bothdatasetsare gathered

+is part of the research finds peoplersquos cooccurrencebased on the same base station connectivity in a specific timeframe and infers them as social ties Furthermore it crossrelates the inferred results with direct call results

42 Brightkite- and Gowalla-Based Social Tie InferenceBrightkite and Gowalla data sets contain direct social ties aswell as the check-in information of each user In our studywe investigated both data sets based on several dimensionsand found some of the very interesting facts During theanalysis we observed positive correlations between mutualfriend count values MuF and user check-in details InBrightkite MuF ranging 6 to 40 and in Gowalla MuF

ranging 25 to 90 shows the positive corelation with Preci-sion We also measure the effect of gap time value on thePrecision and Recall and evaluated results based on severalgap time values

Figures 7(a) and 7(c) of line graph show the relationbetween mutual friend count values MuF and Precisionwhile Figures 7(b) and 7(d) show the relation betweenmutual friend count values MuF and Recall In Figure 7results shows that people having a certain number ofmutual friends use to visit place together or with little gapof interval

+e social tie inference framework infers some of theabsent ties as social ties in the form of false-positive resultsTo further extract the actual meaning of such incorrect

0

01

02

03

04

05

06

07

08

09

1Re

call

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 40M

K = 15

(a)

0

01

02

03

04

05

06

07

08

09

1

Reca

ll

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

0 5 10 15 20 25 30 35 40M

K = 10

(b)

0

01

02

03

04

05

06

07

08

09

1

Reca

ll

0 5 10 15 20 25 30 35 40

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

M

K = 5

(c)

0

01

02

03

04

05

06

07

08

09

1Re

call

0 5 10 15 20 25 30 35 40

T = 30min

T = 2hT = 1h

T = 6hT = 12hT = 24h

M

K = 2

(d)

Figure 6+e Recall formed between two callers due to social contact depending upon the number of times they have connected through thesame base station within a specific time frame Figure 6(a) shows the Recall for the six different time frame lines whereas M is the number ofcooccurrences and K is the number of direct calls (a) K 15 (b) K 10 (c) K 5 and (d) K 2

8 Scientific Programming

inferences we conducted a social activity and simulation+e false-positive results of the first part of the research serve asthe foundation for the second part Activity under the first partdata set is conducted and the false-positive results are examinedby studying the human cooccurrence patterns described in thenext section Suspicious Ties+is stage of research gave us a clueto further exploit the category of missing ties

5 Suspicious Ties

An absent tie can be inferred as a suspect tie if it satisfies thefollowing properties

(1) M number of mutual friends(2) C number of cooccurrence on different sites

In the CDR data set each cell is treated as a single cellof the base station Our model adopts the followingfour features for evaluation

(1) Base station(2) User ID(3) Gap time threshold(4) Call time stamp

+e following mathematical model explains theproblem and its formulationLet BS denotes a set bsi1113864 1113865 of bsn points and bsi iscalled as the distinct base station cellLet DU denotes a set dui1113864 1113865 of dun points and dui iscalled as the distinct user

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 420001020304050607080910

MuF

Prec

ision

(a)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 42MuF

0001020304050607080910

Reca

ll

(b)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Prec

ision

1 10 20 30 40 50 60 70 80 90 100

(c)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Reca

ll

1 10 20 30 40 50 60 70 80 90 100

(d)

Figure 7 +e Precision and Recall formed between two users due to social contact depending on the number of times they have checked into certain location within a specific time frame Figure 7(a) shows the Precision for the six different time frame lines whereas MuF is thenumber of mutual friend count (a) Brightkite Precision (b) Brightkite Recall (c) Gowalla Precision and (d) Gowalla Recall

Scientific Programming 9

Let TH denotes a set thi1113864 1113865 of thn points and thi iscalled as the threshold value for timeframeLet C denotes a set ci1113864 1113865 of cn points and ci is called asthe call information

ci (id ct) whereid as call id

ct as outgoing call time1113896 (10)

Let R denotes a set ri1113864 1113865 of rn points then

ri dui ci(id)( 1113857 times dui ci(id)( 11138571113858 1113859anddui ne dui (11)

Let RT denotes a set rti1113864 1113865 of rtn points then

rti dui ci(ct)( 1113857 minus dui cti(ct)( 11138571113858 1113859anddui ne dui (12)

Let S denotes a set si1113864 1113865 of sn points then

si ri isin bsi( 1113857and rti le thi( 11138571113858 1113859 (13)

wheresi is a set of elements that identifies distinct callersbased on the same base station connectivity and a definitenumber of calls in a specific time frame

6 Suspect Inference Framework

We studied the pattern of exceptional cases belong to thefalse-positive set and described a subset of the false-positive set as suspect ties Physical activity was designedand conducted to investigate the formation of suspectsocial ties Activity consisting of 50 people and a data setwas generated within almost 4-5 hours A basketballcourt was utilized for the activity Nine circles weredrawn physically on the basketball court assumed as thebase station cell Out of 50 people nine were directed toact as a base station +e boundary of each circle wasconsidered as a range of the base station cell Rests of the41 persons were directed to perform the following twosteps

61 Selection Step Initially each person from 41 people wasasked to choose two sets of friends One set as obviousfriends and the second set of hidden friends such that thesize of the hidden friend set should be at most 15 of theobvious friend set size eg if one person has five people inapparent friend set he can have no more than one hiddenfriend After the selection of both sets by each individualinformation was shared with one of our representatives

62 Operation Step In this phase each person was directedto follow the following rules

(1) You should not call your hidden friend

(2) You should call all of your obvious friends at leastonce

(3) You should conduct a maximum number of calls toyour closest obvious friend and second maximum toa second level obvious friend and likewise to the leastfriend

(4) You must try to meet your hidden friend as much aspossible physically

+e method of calling is like if person A wants to callperson B from base station B1 the person has to go to thebase station B1 and register a call with a person acting andstanding in base station B1 Respective base station personwill write and make an entry with five parameters ie CallerName Callee Name Time From Base Station Name and ToBase Station Name +e data set was gathered in the fol-lowing format An example is given below

CallerName

CalleeName Time Stamp

FromBase

Station

ToBase

Station

User 1 User 12 02032019101508 BS 7 BS 2

For the understanding of the variations and patterns thesame activity was also designed using simulation +e wholesimulation followed the same conditions and another dataset was generated using a random function Based on theactivity a framework is designed to separate a class ofsuspect ties Proposed inference framework work is designedand implemented using pipe and filter architecture shownin Figure 8 Algorithm 2 shows the implementation ofsuspicious tie inference framework explained in Figure 8+e framework takes the social network matrix countthreshold value gap time value and mutual friend countvalue as inputs and filters the result Initially CalculateLevels() function finds the five level depth information foreach distinct user Let us say if A calls B B calls C C calls Dand D calls E it implies that A 0 B 1 C 2 D 3 andE 4 represent five levels +is step ensures that all the levelshave distinct users Secondly Find Suspects() functionselects only those sets of users from level 1 and level 3 that donot have any direct calls and M number of mutual friendsFurthermore Calculate Subsocial Network() functiongenerates subgraphs using level 1 and level 3 detailsdepending upon the gap time value Results are filtered onthese bases of time gap value eg two users called someother user while connected to the same base station withinthe given time frame explained in the previous section Afterthat Infer Hidden Ties() function uses the proposed nor-malization method to find the number of the count definedin equation (9) and then all results are filtered according tothe cooccurrence count CV and the mutual friend thresholdvalue M Based on mentioned parameters and thresholdsAlgorithm 2 significantly identifies the subclass of missingties as suspicious ties

+e results of the activity and simulation are computedand evaluated using Precision and Recall measures

10 Scientific Programming

RequireSN Social NetworkGVTime Gap+reshold ValueCVCooccurrence Count +reshold ValueMMutual Friend Count

EnsureST Inferred Social Suspect Ties

(1) while SNne 0 do(2) Levels [5 n]Calculate Levels()(3) Suspects Find Suspects (Level 1 Level 3)(4) SGCalculate Subsocial Network (Suspects GV)(5) ST Infer Suspect Ties (SG CV M) using equation (9)(6) end while

ALGORITHM 2 Suspicious hidden social tie inference

Table 1 Evaluation of simulation and social activity

Gap timeMge 2 Mge 5

Simulation Social activity Simulation Social activityP R F1 P R F1 P R F1 P R F1

CVge 5

GVlt 10 012 068 020 012 100 021 013 062 021 012 079 021GVlt 20 010 042 016 015 079 025 015 031 020 013 042 020GVlt 30 021 022 021 071 059 064 025 018 021 023 028 025GV ge 30 034 023 027 100 012 021 030 010 015 032 019 024

CVge 10

GVlt 10 010 033 015 012 048 019 009 030 014 011 061 019GVlt 20 011 028 016 027 050 021 009 018 012 010 039 016GVlt 30 013 018 015 029 030 029 009 010 009 009 020 012GV ge 30 013 008 010 026 005 008 010 004 006 013 007 009

CVge 20

GVlt 10 004 018 007 008 023 012 005 029 009 011 039 017GVlt 20 007 008 007 005 024 008 004 011 006 010 028 015GVlt 30 004 006 005 009 009 009 003 005 004 009 019 012GV ge 30 002 001 001 001 001 001 001 001 001 001 005 002

lowastP Precision R Recall andF1 F1 Score

Social network

Count number of co-occurtence

Suspect social tiesMutual friend threshold

value

Count threshold value

Calculate sub social networks via

time gap threshold value

Calculate 0 1 2 3 4 level ties

Highlight missing ties between

level 1 and level 3

((xn ym) times (ymβn))mm=1

nn=1=

Figure 8 Suspect social tie inference framework

Scientific Programming 11

Evaluation results of simulation and social activity con-ducted are shown in Table 1 Precision Recall and F1 Scoremeasures are used to evaluate the framework that is furthercalculated based on the cooccurrence count CV mutualfriend count M and gap time GV parameter setting valuesF1 Score is calculated using equation (5) Definitions of therelated parameters are given as follows

TP Which were hidden friends and system infer ashidden friends

TN Which were not hidden friends and system inferas not hidden friends

FP Which were not hidden friends and system inferas hidden friends

FN Which were hidden friends and system infer asnot hidden friends

63 Results and Discussion We tested and evaluated allrecords based on the cooccurrence count value CV mutualfriend M and gap time value GV as the gap timeFigures 9(a)ndash9(d) show the evaluation of results for theactivity based on three different values of the cooccurrencecount value CV ie CV 5 10 and 15 and two different

values of the mutual friend M ie Mge 2 and ge5 LikewiseFigures 10(a)ndash10(d) show the evaluation of results for thesimulation data set along with mentioned CV and M valuesResults were generated on four values of gap time ie ge 30lt 30 lt 20 and lt 10

In Figures 9(a) and 9(b) Recall is maximum and Pre-cision is less where the value of GVge 30 CVge 5 and Mge 2+e system obtains a maximum number of relevant hiddenties along with false-positive results By GVge 30 CVge 5 andMge 2 it means that the gap between the two calls is 30minsor more while the count of cooccurred events is keptminimum five and mutual friend count as two or less +esystemrsquos performance drops when gap time is reduced tolt 30 lt 20 and lt 10 Even though there is a drop in true-positive values a significant drop in the false-positive valuescan be seen Results become concise with the variation inboth values of GV and M +e limited data set collectedusing activity highlights the occurrences of hidden tiesWhole activity and simulation were designed to get similarfields of data as CDR so that the proposed framework iscompatible with the CDR data set

Results shown in Figures 9 and 10 exhibit the existence ofhidden relationships +e parameter such as the gap timevalue GV helps to identify the time frame selection such that

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1Pr

ecisi

onM ge 2

(a)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

(b)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Prec

ision

M ge 5

(c)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 9 Evaluation of social activity using Precision and Recall

12 Scientific Programming

what gap value is suitable to get optimal performanceLikewise Figures 10(a) to 10(d) represent the Precision andRecall curves for the simulation data set Results are gatheredby setting the same values for the threshold parameters CVGV and M Figures 10(a) and 10(d) give the least values forPrecision and Recall which tells that if threshold values aretightened so does the count relevant retrieve results get lowOut of all the results Precision and Recall values for bothsimulation and activity data set showmaximum if thresholdvalues GVge 30 CVge 5 and Mge 2 are set We found someexciting dissimilarities in the results between simulation andactivity data set results during the complete evaluationprocess +e simulation results do not give higher Recall andPrecision value compared to activity data results We con-cluded that the data set of simulation is generated usingrandom function while the activity had the human hidingpatterns +ese results also help to infer human psychologyin building hidden ties +e simulation data set generatorworks based on the same constraints mentioned as rules ofthe activity However the key difference between simulationand activity is the selection of friends hidden friend and the

pattern of calling While conducting the simulation trivialvariation in results was observed as the random selection hasrandom patterns According to our findings simulation andactivity both exhibit patterns of hidden ties However ac-tivity results are more pronounced and significantly iden-tifying the hidden relationships

It is essential to highlight some critical questions anddependent variables that help to find hidden social tiesbetween two people for example why two people hide theirsocial ties Is this deliberate or unintentional action What ifthey are deliberately hiding their social tie for a purpose Insuch a scenario extracting a social tie for two people is a kindof intense problem It is a kind of investigation processwhich explores a clue to draw some relationship betweentwo people Investigation designates if two people areposting a picture on social media to be more cautious aboutnot identifying their social ties While if they are doing someprivate activity they will be less careful for example postinga picture on Facebook or Flickr compared to calling to aperson from a specific location Although extracting hiddensocial ties include various privacy issues We designed

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Prec

ision

M ge 2

CV ge 5CV ge 10CV ge 15

(a)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

CV ge 5CV ge 10CV ge 15

(b)

Prec

ision

0010203040506070809

1M ge 5

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

(c)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

0010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 10 Evaluation of simulation using Precision and Recall

Scientific Programming 13

activity and simulation that generated the data set by thecaller data record (CDR) data set We thoroughly investi-gated the patterns of connectivity and established aframework to infer social ties +is research has opened up anew direction further to explore connectivity in the SocialInternet of +ings (SIoT) specifically machine-to-machinedirect communication and machine-to-human or human-to-machine hidden relationships In many cases severalmachines work together but they rarely have a directconnection

7 Conclusion and Future Work

In this research we have examined the developments ofsocial connection patterns based on physical gathering Inthe first part of our research we explored the correlationbetween direct communication and two individualsrsquo phys-ical presence To check the systemrsquos performance andevaluation we examined all results by utilizing Precision andRecall evaluation measures We also present a periodicnormalization equation for the cooccurrence count In thesecond phase we propose the suspect tie inference frame-work False-positive results of the first part of the researchare the ground to the second part of the study +e proposedframework adopts pipe and filter architecture where thethreshold values control each filter +e frameworkrsquos fun-damental objective is to take the data set such as CDR (callerdata record) and infer suspect social ties depending uponthe specified threshold values Analyzing the results criti-cally we propose a theory that identifies suspect social tiesBesides this for comparison and evaluation we conductedreal-time human-based activity and simulation Keeping inmind the structure of the actual CDR data set the wholeactivity was designed and evaluated In contrast to existingwork our research focus is on hidden ties instead of thehidden actors In the future we are aiming to explore thehomophilic nature of suspect ties within the Social Internetof +ings (SIoT)

Data Availability

+e data used can be found at httpsnapstanfordedudataindexhtmllocnet

Conflicts of Interest

+e authors declare that they have no conflicts of interestregarding the publication of this work

Acknowledgments

+is work was supported by King Saud University SaudiArabia through research supporting project number RSP-2020184 Nauman Ali Khan acknowledges the support ofthe Chinese Government and Chinese Scholarship Council(CSC) for his PhD studies at the University of Science andTechnology China +is research work was partially sup-ported by Key Program of the National Natural ScienceFoundation of China (grant number 61631018)

References

[1] L Weng M Karsai N Perra F Menczer and A FlamminildquoAttention on weak ties in social and communication net-worksrdquo in Complex Spreading Phenomena in Social Systemspp 213ndash228 Springer Berlin Germany 2018

[2] A M Ortiz D Hussein S Park S N Han and N Crespildquo+e cluster between internet of things and social networksreview and research challengesrdquo IEEE Internet of ingsJournal vol 1 no 3 pp 206ndash215 2014

[3] P Luarn and Y-P Chiu ldquoKey variables to predict tie strengthon social network sitesrdquo Internet Research vol 25 no 2pp 218ndash238 2015

[4] D Goel and K Lang ldquoSocial ties and the job search of recentimmigrantsrdquo ILR Review vol 72 no 2 pp 355ndash381 2019

[5] H Kizgin A Jamal N Rana Y Dwivedi and VWeerakkodyldquo+e impact of social networking sites on socialization andpolitical engagement role of acculturationrdquo TechnologicalForecasting and Social Change vol 145 pp 503ndash512 2019

[6] P Suebvises ldquoSocial capital citizen participation in publicadministration and public sector performance in +ailandrdquoWorld Development vol 109 pp 236ndash248 2018

[7] Z Liu Y Qiao S TaoW Lin and J Yang ldquoAnalyzing humanmobility and social relationships from cellular network datardquoin Proceedings of the 2017 13th International Conference onNetwork and Service Management (CNSM) pp 1ndash6 TokyoJapan November 2017

[8] L Matosas-Lopez and A Romero-Ania ldquo+e efficiency ofsocial network services management in organizations an in-depth analysis applying machine learning algorithms andmultiple linear regressionsrdquo Applied Sciences vol 10 no 15p 5167 2020

[9] M S Granovetter ldquo+e strength of weak tiesrdquo in SocialNetworks pp 347ndash367 Elsevier Amsterdam Netherlands1977

[10] T Nguyen L Zhang and A Culotta ldquoEstimating tie strengthin follower networks to measure brand perceptionsrdquo inProceedings of the 2019 IEEEACM International Conferenceon Advances in Social Networks Analysis and Mining (ASO-NAM) pp 779ndash786 Vancouver Canada August 2019

[11] H Mattie K Engoslash-Monsen R Ling and J-P OnnelaldquoUnderstanding tie strength in social networks using a localbow tie frameworkrdquo Scientific Reports vol 8 no 1 p 93492018

[12] D J Crandall L Backstrom D Cosley S SuriD Huttenlocher and J Kleinberg ldquoInferring social ties fromgeographic coincidencesrdquo Proceedings of the NationalAcademy of Sciences vol 107 no 52 pp 22436ndash24441 2010

[13] E Cho S A Myers and J Leskovec ldquoFriendship and mo-bility user movement in location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Con-ference on Knowledge Discovery and Data Mining pp 1082ndash1090 San Diego CA USA August 2011

[14] A Znidarsic P Doreian and A Ferligoj ldquoAbsent ties in socialnetworks their treatments and blockmodeling outcomesrdquoAdvances in Methodology amp StatisticsMetodoloski Zvezkivol 9 no 2 2012

[15] S A Myers A Sharma P Gupta and J Lin ldquoInformationnetwork or social network the structure of the twitter followgraphrdquo in Proceedings of the 23rd International Conference onWorld Wide Web pp 493ndash498 Seoul Republic of KoreaApril 2014

14 Scientific Programming

[16] X Zuo J Blackburn N Kourtellis J Skvoretz andA Iamnitchi ldquo+e power of indirect social tiesrdquo 2014 httparxivorgabs14014234

[17] M A Brandatildeo and M M Moro ldquoTie strength in co-au-thorship social networks analyses metrics and a new com-putational modelrdquo in Proceedings of the Anais Estendidos doXXV Simposio Brasileiro de Sistemas Multimıdia e Webpp 17ndash20 Rio de Janeiro Brazil 2019

[18] H A Khanday A H Ganai and R Hashmy ldquoA comparativeanalysis of identifying influential users in online social net-worksrdquo in Proceedings of the 2018 International Conference onSoft-Computing and Network Security (ICSNS) pp 1ndash6Coimbatore India February 2018

[19] R Garcia-Guzman Y A Andrade-Ambriz M-A Ibarra-Manzano S Ledesma J C Gomez and D-L Almanza-Ojeda ldquoTrend-based categories recommendations and age-gender prediction for pinterest and twitter usersrdquo AppliedSciences vol 10 no 17 p 5957 2020

[20] A Talukder M G R Alam N H Tran D Niyato andC S Hong ldquoKnapsack-based reverse influence maximizationfor target marketing in social networksrdquo IEEE Access vol 7pp 44182ndash44198 2019

[21] N Ampazis T Emmanouilidis and F Sakketou ldquoA matrixfactorization algorithm for efficient recommendations insocial rating networks using constrained optimizationrdquoMachine Learning and Knowledge Extraction vol 1 no 3p 928 2019

[22] N Zhou X Zhang and S Wang ldquo+eme-aware socialstrength inference from spatiotemporal datardquo in Proceedingsof the International Conference on Web-Age InformationManagement pp 498ndash509 Macau China June 2014

[23] P A Grabowicz J J Ramasco B Gonccedilalves andV M Eguıluz ldquoEntangling mobility and interactions in socialmediardquo PLoS One vol 9 no 3 Article ID e92196 2014

[24] D Yang B Qu J Yang and P Cudre-Mauroux ldquoRevisitinguser mobility and social relationships in LBSNs a hypergraphembedding approachrdquo in Proceedings of the World Wide WebConference pp 2147ndash2157 San Francisco NY USA May2019

[25] R-H Li J Liu J X Yu H Chen and H Kitagawa ldquoCo-occurrence prediction in a large location-based social net-workrdquo Frontiers of Computer Science vol 7 no 2pp 185ndash194 2013

[26] V Jayadevan K Bharadwaj A Kumar and P KhandelwalldquoDiscovering local social groups using mobility datardquo Inter-national Journal of Computer Applications vol 120 no 212015

[27] A J Jara Y Bocchi and D Genoud ldquoSocial internet of thingsthe potential of the internet of things for defining humanbehavioursrdquo in Proceedings of the 2014 International Con-ference on Intelligent Networking and Collaborative Systemspp 581ndash585 Salerno Italy September 2014

[28] W A D M Jayathilaka K Qi Y Qin et al ldquoSignificance ofnanomaterials in wearables a review on wearable actuatorsand sensorsrdquo Advanced Materials vol 31 no 7 Article ID1805921 2019

[29] J Ren H Guo C Xu and Y Zhang ldquoServing at the edge ascalable iot architecture based on transparent computingrdquoIEEE Network vol 31 no 5 pp 96ndash105 2017

[30] I Seeber E Bittner R O Briggs et al ldquoMachines as team-mates a research agenda on ai in team collaborationrdquo In-formation amp Management vol 57 no 2 Article ID 1031742020

[31] T Pi L Cao P Lv Z Ye and H Wang ldquoInferring implicitsocial ties in mobile social networksrdquo in Proceedings of the2018 IEEE Wireless Communications and Networking Con-ference (WCNC) pp 1ndash6 Barcelona Spain April 2018

[32] N Tahir A Hassan M Asif and S Ahmad ldquoMCD mutuallyconnected community detection using clustering coefficientapproach in social networksrdquo in Proceedings of the 2019 2ndInternational Conference on Communication Computing andDigital Systems (C-CODE) pp 160ndash165 Islamabad PakistanMarch 2019

[33] V A Lewis C A MacGregor and R D Putnam ldquoReligionnetworks and neighborliness the impact of religious socialnetworks on civic engagementrdquo Social Science Researchvol 42 no 2 pp 331ndash346 2013

[34] M S Granovetter ldquo+e strength of weak ties a networktheory revisitedrdquo Sociologicaleory vol 1 pp 201ndash233 1983

[35] J Gui Z Zheng X Zhao and Z Qin ldquoStatistical propertiesand temporal properties of calling behaviorrdquo in Proceedings ofthe 2019 IEEE 5th International Conference on Computer andCommunications (ICCC) pp 1940ndash1944 Chengdu ChinaDecember 2019

[36] I H Sarker ldquoA machine learning based robust predictionmodel for real-life mobile phone datardquo Internet of ingsvol 5 pp 180ndash193 2019

[37] M S Bernstein E Bakshy M Burke and B KarrerldquoQuantifying the invisible audience in social networksrdquo inProceedings of the SIGCHI Conference on Human Factors inComputing Systems pp 21ndash30 Paris France April 2013

[38] R Montasari R Hill V Carpenter and F Montaseri ldquoDigitalforensic investigation of social media acquisition and analysisof digital evidencerdquo International Journal of Strategic Engi-neering vol 2 no 1 pp 52ndash60 2019

[39] M Yip N Shadbolt and C Webber ldquoStructural analysis ofonline criminal social networksrdquo in Proceedings of the 2012IEEE International Conference on Intelligence and SecurityInformatics pp 60ndash65 Arlington VA USA June 2012

[40] Z Zhao C Li X Zhang F Chiclana and E H Viedma ldquoAnincremental method to detect communities in dynamicevolving social networksrdquo Knowledge-Based Systems vol 163pp 404ndash415 2019

[41] S Vosoughi D Roy and S Aral ldquo+e spread of true and falsenews onlinerdquo Science vol 359 no 6380 pp 1146ndash1151 2018

[42] Y Fang J Gao Z Liu and C Huang ldquoDetecting cyber threatevent from twitter using IDCNN and BILSTMrdquo AppliedSciences vol 10 no 17 p 5922 2020

[43] P Lu L Zhang X Liu J Yao and Z Zhu ldquoHighly efficientdata migration and backup for big data applications in elasticoptical inter-data-center networksrdquo IEEE Network vol 29no 5 pp 36ndash42 2015

[44] O Sharif M Hoque A Kayes R Nowrozy and I SarkerldquoDetecting suspicious texts using machine learning tech-niquesrdquo Applied Sciences vol 10 no 18 p 6527 2020

[45] K C Roy M Cebrian and S Hasan ldquoQuantifying humanmobility resilience to extreme events using geo-located socialmedia datardquo EPJ Data Science vol 8 no 1 p 18 2019

[46] A Squicciarini S Karumanchi D Lin and N DesistoldquoIdentifying hidden social circles for advanced privacy con-figurationrdquo Computers amp Security vol 41 pp 40ndash51 2014

[47] L Chen F W Crawford and A Karbasi ldquoSeeing the unseennetwork Inferring hidden social ties from respondent-drivensamplingrdquo in Proceedings of the irtieth AAAI Conference onArtificial Intelligence pp 1174ndash1180 Phoenix AZ USAFebruary 2016

Scientific Programming 15

[48] J Leskovec and A Krevl ldquoSNAP datasets stanford largenetwork dataset collectionrdquo 2014 httpsnapstanfordedudata

[49] X Zheng Y Wang and M A Orgun ldquoContextual sub-network extraction in contextual social networksrdquo in Pro-ceedings of the 2015 IEEE TrustcomBigDataSEISPApp 119ndash126 Helsinki Finland August 2015

[50] A Madani and M Marjan ldquoMining social networks to dis-cover ego sub-networksrdquo in Proceedings of the 2016 3rd MECInternational Conference on Big Data and Smart City(ICBDSC) pp 1ndash5 Muscat Oman March 2016

16 Scientific Programming

Page 9: InferringTiesinSocialIoTUsingLocation-BasedNetworksand ...3 JP]V3 ]4JJV3 Brightkite and Gowalla are openly available location-basedsocialnetworkdatasets[13,48].Bothdatasetsare gathered

inferences we conducted a social activity and simulation+e false-positive results of the first part of the research serve asthe foundation for the second part Activity under the first partdata set is conducted and the false-positive results are examinedby studying the human cooccurrence patterns described in thenext section Suspicious Ties+is stage of research gave us a clueto further exploit the category of missing ties

5 Suspicious Ties

An absent tie can be inferred as a suspect tie if it satisfies thefollowing properties

(1) M number of mutual friends(2) C number of cooccurrence on different sites

In the CDR data set each cell is treated as a single cellof the base station Our model adopts the followingfour features for evaluation

(1) Base station(2) User ID(3) Gap time threshold(4) Call time stamp

+e following mathematical model explains theproblem and its formulationLet BS denotes a set bsi1113864 1113865 of bsn points and bsi iscalled as the distinct base station cellLet DU denotes a set dui1113864 1113865 of dun points and dui iscalled as the distinct user

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 420001020304050607080910

MuF

Prec

ision

(a)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

2 6 10 14 18 22 26 30 34 38 42MuF

0001020304050607080910

Reca

ll

(b)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Prec

ision

1 10 20 30 40 50 60 70 80 90 100

(c)

T = 15minT = 30minT = 1h

T = 2hT = 6hT = 12h

MuF

0001020304050607080910

Reca

ll

1 10 20 30 40 50 60 70 80 90 100

(d)

Figure 7 +e Precision and Recall formed between two users due to social contact depending on the number of times they have checked into certain location within a specific time frame Figure 7(a) shows the Precision for the six different time frame lines whereas MuF is thenumber of mutual friend count (a) Brightkite Precision (b) Brightkite Recall (c) Gowalla Precision and (d) Gowalla Recall

Scientific Programming 9

Let TH denotes a set thi1113864 1113865 of thn points and thi iscalled as the threshold value for timeframeLet C denotes a set ci1113864 1113865 of cn points and ci is called asthe call information

ci (id ct) whereid as call id

ct as outgoing call time1113896 (10)

Let R denotes a set ri1113864 1113865 of rn points then

ri dui ci(id)( 1113857 times dui ci(id)( 11138571113858 1113859anddui ne dui (11)

Let RT denotes a set rti1113864 1113865 of rtn points then

rti dui ci(ct)( 1113857 minus dui cti(ct)( 11138571113858 1113859anddui ne dui (12)

Let S denotes a set si1113864 1113865 of sn points then

si ri isin bsi( 1113857and rti le thi( 11138571113858 1113859 (13)

wheresi is a set of elements that identifies distinct callersbased on the same base station connectivity and a definitenumber of calls in a specific time frame

6 Suspect Inference Framework

We studied the pattern of exceptional cases belong to thefalse-positive set and described a subset of the false-positive set as suspect ties Physical activity was designedand conducted to investigate the formation of suspectsocial ties Activity consisting of 50 people and a data setwas generated within almost 4-5 hours A basketballcourt was utilized for the activity Nine circles weredrawn physically on the basketball court assumed as thebase station cell Out of 50 people nine were directed toact as a base station +e boundary of each circle wasconsidered as a range of the base station cell Rests of the41 persons were directed to perform the following twosteps

61 Selection Step Initially each person from 41 people wasasked to choose two sets of friends One set as obviousfriends and the second set of hidden friends such that thesize of the hidden friend set should be at most 15 of theobvious friend set size eg if one person has five people inapparent friend set he can have no more than one hiddenfriend After the selection of both sets by each individualinformation was shared with one of our representatives

62 Operation Step In this phase each person was directedto follow the following rules

(1) You should not call your hidden friend

(2) You should call all of your obvious friends at leastonce

(3) You should conduct a maximum number of calls toyour closest obvious friend and second maximum toa second level obvious friend and likewise to the leastfriend

(4) You must try to meet your hidden friend as much aspossible physically

+e method of calling is like if person A wants to callperson B from base station B1 the person has to go to thebase station B1 and register a call with a person acting andstanding in base station B1 Respective base station personwill write and make an entry with five parameters ie CallerName Callee Name Time From Base Station Name and ToBase Station Name +e data set was gathered in the fol-lowing format An example is given below

CallerName

CalleeName Time Stamp

FromBase

Station

ToBase

Station

User 1 User 12 02032019101508 BS 7 BS 2

For the understanding of the variations and patterns thesame activity was also designed using simulation +e wholesimulation followed the same conditions and another dataset was generated using a random function Based on theactivity a framework is designed to separate a class ofsuspect ties Proposed inference framework work is designedand implemented using pipe and filter architecture shownin Figure 8 Algorithm 2 shows the implementation ofsuspicious tie inference framework explained in Figure 8+e framework takes the social network matrix countthreshold value gap time value and mutual friend countvalue as inputs and filters the result Initially CalculateLevels() function finds the five level depth information foreach distinct user Let us say if A calls B B calls C C calls Dand D calls E it implies that A 0 B 1 C 2 D 3 andE 4 represent five levels +is step ensures that all the levelshave distinct users Secondly Find Suspects() functionselects only those sets of users from level 1 and level 3 that donot have any direct calls and M number of mutual friendsFurthermore Calculate Subsocial Network() functiongenerates subgraphs using level 1 and level 3 detailsdepending upon the gap time value Results are filtered onthese bases of time gap value eg two users called someother user while connected to the same base station withinthe given time frame explained in the previous section Afterthat Infer Hidden Ties() function uses the proposed nor-malization method to find the number of the count definedin equation (9) and then all results are filtered according tothe cooccurrence count CV and the mutual friend thresholdvalue M Based on mentioned parameters and thresholdsAlgorithm 2 significantly identifies the subclass of missingties as suspicious ties

+e results of the activity and simulation are computedand evaluated using Precision and Recall measures

10 Scientific Programming

RequireSN Social NetworkGVTime Gap+reshold ValueCVCooccurrence Count +reshold ValueMMutual Friend Count

EnsureST Inferred Social Suspect Ties

(1) while SNne 0 do(2) Levels [5 n]Calculate Levels()(3) Suspects Find Suspects (Level 1 Level 3)(4) SGCalculate Subsocial Network (Suspects GV)(5) ST Infer Suspect Ties (SG CV M) using equation (9)(6) end while

ALGORITHM 2 Suspicious hidden social tie inference

Table 1 Evaluation of simulation and social activity

Gap timeMge 2 Mge 5

Simulation Social activity Simulation Social activityP R F1 P R F1 P R F1 P R F1

CVge 5

GVlt 10 012 068 020 012 100 021 013 062 021 012 079 021GVlt 20 010 042 016 015 079 025 015 031 020 013 042 020GVlt 30 021 022 021 071 059 064 025 018 021 023 028 025GV ge 30 034 023 027 100 012 021 030 010 015 032 019 024

CVge 10

GVlt 10 010 033 015 012 048 019 009 030 014 011 061 019GVlt 20 011 028 016 027 050 021 009 018 012 010 039 016GVlt 30 013 018 015 029 030 029 009 010 009 009 020 012GV ge 30 013 008 010 026 005 008 010 004 006 013 007 009

CVge 20

GVlt 10 004 018 007 008 023 012 005 029 009 011 039 017GVlt 20 007 008 007 005 024 008 004 011 006 010 028 015GVlt 30 004 006 005 009 009 009 003 005 004 009 019 012GV ge 30 002 001 001 001 001 001 001 001 001 001 005 002

lowastP Precision R Recall andF1 F1 Score

Social network

Count number of co-occurtence

Suspect social tiesMutual friend threshold

value

Count threshold value

Calculate sub social networks via

time gap threshold value

Calculate 0 1 2 3 4 level ties

Highlight missing ties between

level 1 and level 3

((xn ym) times (ymβn))mm=1

nn=1=

Figure 8 Suspect social tie inference framework

Scientific Programming 11

Evaluation results of simulation and social activity con-ducted are shown in Table 1 Precision Recall and F1 Scoremeasures are used to evaluate the framework that is furthercalculated based on the cooccurrence count CV mutualfriend count M and gap time GV parameter setting valuesF1 Score is calculated using equation (5) Definitions of therelated parameters are given as follows

TP Which were hidden friends and system infer ashidden friends

TN Which were not hidden friends and system inferas not hidden friends

FP Which were not hidden friends and system inferas hidden friends

FN Which were hidden friends and system infer asnot hidden friends

63 Results and Discussion We tested and evaluated allrecords based on the cooccurrence count value CV mutualfriend M and gap time value GV as the gap timeFigures 9(a)ndash9(d) show the evaluation of results for theactivity based on three different values of the cooccurrencecount value CV ie CV 5 10 and 15 and two different

values of the mutual friend M ie Mge 2 and ge5 LikewiseFigures 10(a)ndash10(d) show the evaluation of results for thesimulation data set along with mentioned CV and M valuesResults were generated on four values of gap time ie ge 30lt 30 lt 20 and lt 10

In Figures 9(a) and 9(b) Recall is maximum and Pre-cision is less where the value of GVge 30 CVge 5 and Mge 2+e system obtains a maximum number of relevant hiddenties along with false-positive results By GVge 30 CVge 5 andMge 2 it means that the gap between the two calls is 30minsor more while the count of cooccurred events is keptminimum five and mutual friend count as two or less +esystemrsquos performance drops when gap time is reduced tolt 30 lt 20 and lt 10 Even though there is a drop in true-positive values a significant drop in the false-positive valuescan be seen Results become concise with the variation inboth values of GV and M +e limited data set collectedusing activity highlights the occurrences of hidden tiesWhole activity and simulation were designed to get similarfields of data as CDR so that the proposed framework iscompatible with the CDR data set

Results shown in Figures 9 and 10 exhibit the existence ofhidden relationships +e parameter such as the gap timevalue GV helps to identify the time frame selection such that

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1Pr

ecisi

onM ge 2

(a)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

(b)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Prec

ision

M ge 5

(c)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 9 Evaluation of social activity using Precision and Recall

12 Scientific Programming

what gap value is suitable to get optimal performanceLikewise Figures 10(a) to 10(d) represent the Precision andRecall curves for the simulation data set Results are gatheredby setting the same values for the threshold parameters CVGV and M Figures 10(a) and 10(d) give the least values forPrecision and Recall which tells that if threshold values aretightened so does the count relevant retrieve results get lowOut of all the results Precision and Recall values for bothsimulation and activity data set showmaximum if thresholdvalues GVge 30 CVge 5 and Mge 2 are set We found someexciting dissimilarities in the results between simulation andactivity data set results during the complete evaluationprocess +e simulation results do not give higher Recall andPrecision value compared to activity data results We con-cluded that the data set of simulation is generated usingrandom function while the activity had the human hidingpatterns +ese results also help to infer human psychologyin building hidden ties +e simulation data set generatorworks based on the same constraints mentioned as rules ofthe activity However the key difference between simulationand activity is the selection of friends hidden friend and the

pattern of calling While conducting the simulation trivialvariation in results was observed as the random selection hasrandom patterns According to our findings simulation andactivity both exhibit patterns of hidden ties However ac-tivity results are more pronounced and significantly iden-tifying the hidden relationships

It is essential to highlight some critical questions anddependent variables that help to find hidden social tiesbetween two people for example why two people hide theirsocial ties Is this deliberate or unintentional action What ifthey are deliberately hiding their social tie for a purpose Insuch a scenario extracting a social tie for two people is a kindof intense problem It is a kind of investigation processwhich explores a clue to draw some relationship betweentwo people Investigation designates if two people areposting a picture on social media to be more cautious aboutnot identifying their social ties While if they are doing someprivate activity they will be less careful for example postinga picture on Facebook or Flickr compared to calling to aperson from a specific location Although extracting hiddensocial ties include various privacy issues We designed

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Prec

ision

M ge 2

CV ge 5CV ge 10CV ge 15

(a)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

CV ge 5CV ge 10CV ge 15

(b)

Prec

ision

0010203040506070809

1M ge 5

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

(c)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

0010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 10 Evaluation of simulation using Precision and Recall

Scientific Programming 13

activity and simulation that generated the data set by thecaller data record (CDR) data set We thoroughly investi-gated the patterns of connectivity and established aframework to infer social ties +is research has opened up anew direction further to explore connectivity in the SocialInternet of +ings (SIoT) specifically machine-to-machinedirect communication and machine-to-human or human-to-machine hidden relationships In many cases severalmachines work together but they rarely have a directconnection

7 Conclusion and Future Work

In this research we have examined the developments ofsocial connection patterns based on physical gathering Inthe first part of our research we explored the correlationbetween direct communication and two individualsrsquo phys-ical presence To check the systemrsquos performance andevaluation we examined all results by utilizing Precision andRecall evaluation measures We also present a periodicnormalization equation for the cooccurrence count In thesecond phase we propose the suspect tie inference frame-work False-positive results of the first part of the researchare the ground to the second part of the study +e proposedframework adopts pipe and filter architecture where thethreshold values control each filter +e frameworkrsquos fun-damental objective is to take the data set such as CDR (callerdata record) and infer suspect social ties depending uponthe specified threshold values Analyzing the results criti-cally we propose a theory that identifies suspect social tiesBesides this for comparison and evaluation we conductedreal-time human-based activity and simulation Keeping inmind the structure of the actual CDR data set the wholeactivity was designed and evaluated In contrast to existingwork our research focus is on hidden ties instead of thehidden actors In the future we are aiming to explore thehomophilic nature of suspect ties within the Social Internetof +ings (SIoT)

Data Availability

+e data used can be found at httpsnapstanfordedudataindexhtmllocnet

Conflicts of Interest

+e authors declare that they have no conflicts of interestregarding the publication of this work

Acknowledgments

+is work was supported by King Saud University SaudiArabia through research supporting project number RSP-2020184 Nauman Ali Khan acknowledges the support ofthe Chinese Government and Chinese Scholarship Council(CSC) for his PhD studies at the University of Science andTechnology China +is research work was partially sup-ported by Key Program of the National Natural ScienceFoundation of China (grant number 61631018)

References

[1] L Weng M Karsai N Perra F Menczer and A FlamminildquoAttention on weak ties in social and communication net-worksrdquo in Complex Spreading Phenomena in Social Systemspp 213ndash228 Springer Berlin Germany 2018

[2] A M Ortiz D Hussein S Park S N Han and N Crespildquo+e cluster between internet of things and social networksreview and research challengesrdquo IEEE Internet of ingsJournal vol 1 no 3 pp 206ndash215 2014

[3] P Luarn and Y-P Chiu ldquoKey variables to predict tie strengthon social network sitesrdquo Internet Research vol 25 no 2pp 218ndash238 2015

[4] D Goel and K Lang ldquoSocial ties and the job search of recentimmigrantsrdquo ILR Review vol 72 no 2 pp 355ndash381 2019

[5] H Kizgin A Jamal N Rana Y Dwivedi and VWeerakkodyldquo+e impact of social networking sites on socialization andpolitical engagement role of acculturationrdquo TechnologicalForecasting and Social Change vol 145 pp 503ndash512 2019

[6] P Suebvises ldquoSocial capital citizen participation in publicadministration and public sector performance in +ailandrdquoWorld Development vol 109 pp 236ndash248 2018

[7] Z Liu Y Qiao S TaoW Lin and J Yang ldquoAnalyzing humanmobility and social relationships from cellular network datardquoin Proceedings of the 2017 13th International Conference onNetwork and Service Management (CNSM) pp 1ndash6 TokyoJapan November 2017

[8] L Matosas-Lopez and A Romero-Ania ldquo+e efficiency ofsocial network services management in organizations an in-depth analysis applying machine learning algorithms andmultiple linear regressionsrdquo Applied Sciences vol 10 no 15p 5167 2020

[9] M S Granovetter ldquo+e strength of weak tiesrdquo in SocialNetworks pp 347ndash367 Elsevier Amsterdam Netherlands1977

[10] T Nguyen L Zhang and A Culotta ldquoEstimating tie strengthin follower networks to measure brand perceptionsrdquo inProceedings of the 2019 IEEEACM International Conferenceon Advances in Social Networks Analysis and Mining (ASO-NAM) pp 779ndash786 Vancouver Canada August 2019

[11] H Mattie K Engoslash-Monsen R Ling and J-P OnnelaldquoUnderstanding tie strength in social networks using a localbow tie frameworkrdquo Scientific Reports vol 8 no 1 p 93492018

[12] D J Crandall L Backstrom D Cosley S SuriD Huttenlocher and J Kleinberg ldquoInferring social ties fromgeographic coincidencesrdquo Proceedings of the NationalAcademy of Sciences vol 107 no 52 pp 22436ndash24441 2010

[13] E Cho S A Myers and J Leskovec ldquoFriendship and mo-bility user movement in location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Con-ference on Knowledge Discovery and Data Mining pp 1082ndash1090 San Diego CA USA August 2011

[14] A Znidarsic P Doreian and A Ferligoj ldquoAbsent ties in socialnetworks their treatments and blockmodeling outcomesrdquoAdvances in Methodology amp StatisticsMetodoloski Zvezkivol 9 no 2 2012

[15] S A Myers A Sharma P Gupta and J Lin ldquoInformationnetwork or social network the structure of the twitter followgraphrdquo in Proceedings of the 23rd International Conference onWorld Wide Web pp 493ndash498 Seoul Republic of KoreaApril 2014

14 Scientific Programming

[16] X Zuo J Blackburn N Kourtellis J Skvoretz andA Iamnitchi ldquo+e power of indirect social tiesrdquo 2014 httparxivorgabs14014234

[17] M A Brandatildeo and M M Moro ldquoTie strength in co-au-thorship social networks analyses metrics and a new com-putational modelrdquo in Proceedings of the Anais Estendidos doXXV Simposio Brasileiro de Sistemas Multimıdia e Webpp 17ndash20 Rio de Janeiro Brazil 2019

[18] H A Khanday A H Ganai and R Hashmy ldquoA comparativeanalysis of identifying influential users in online social net-worksrdquo in Proceedings of the 2018 International Conference onSoft-Computing and Network Security (ICSNS) pp 1ndash6Coimbatore India February 2018

[19] R Garcia-Guzman Y A Andrade-Ambriz M-A Ibarra-Manzano S Ledesma J C Gomez and D-L Almanza-Ojeda ldquoTrend-based categories recommendations and age-gender prediction for pinterest and twitter usersrdquo AppliedSciences vol 10 no 17 p 5957 2020

[20] A Talukder M G R Alam N H Tran D Niyato andC S Hong ldquoKnapsack-based reverse influence maximizationfor target marketing in social networksrdquo IEEE Access vol 7pp 44182ndash44198 2019

[21] N Ampazis T Emmanouilidis and F Sakketou ldquoA matrixfactorization algorithm for efficient recommendations insocial rating networks using constrained optimizationrdquoMachine Learning and Knowledge Extraction vol 1 no 3p 928 2019

[22] N Zhou X Zhang and S Wang ldquo+eme-aware socialstrength inference from spatiotemporal datardquo in Proceedingsof the International Conference on Web-Age InformationManagement pp 498ndash509 Macau China June 2014

[23] P A Grabowicz J J Ramasco B Gonccedilalves andV M Eguıluz ldquoEntangling mobility and interactions in socialmediardquo PLoS One vol 9 no 3 Article ID e92196 2014

[24] D Yang B Qu J Yang and P Cudre-Mauroux ldquoRevisitinguser mobility and social relationships in LBSNs a hypergraphembedding approachrdquo in Proceedings of the World Wide WebConference pp 2147ndash2157 San Francisco NY USA May2019

[25] R-H Li J Liu J X Yu H Chen and H Kitagawa ldquoCo-occurrence prediction in a large location-based social net-workrdquo Frontiers of Computer Science vol 7 no 2pp 185ndash194 2013

[26] V Jayadevan K Bharadwaj A Kumar and P KhandelwalldquoDiscovering local social groups using mobility datardquo Inter-national Journal of Computer Applications vol 120 no 212015

[27] A J Jara Y Bocchi and D Genoud ldquoSocial internet of thingsthe potential of the internet of things for defining humanbehavioursrdquo in Proceedings of the 2014 International Con-ference on Intelligent Networking and Collaborative Systemspp 581ndash585 Salerno Italy September 2014

[28] W A D M Jayathilaka K Qi Y Qin et al ldquoSignificance ofnanomaterials in wearables a review on wearable actuatorsand sensorsrdquo Advanced Materials vol 31 no 7 Article ID1805921 2019

[29] J Ren H Guo C Xu and Y Zhang ldquoServing at the edge ascalable iot architecture based on transparent computingrdquoIEEE Network vol 31 no 5 pp 96ndash105 2017

[30] I Seeber E Bittner R O Briggs et al ldquoMachines as team-mates a research agenda on ai in team collaborationrdquo In-formation amp Management vol 57 no 2 Article ID 1031742020

[31] T Pi L Cao P Lv Z Ye and H Wang ldquoInferring implicitsocial ties in mobile social networksrdquo in Proceedings of the2018 IEEE Wireless Communications and Networking Con-ference (WCNC) pp 1ndash6 Barcelona Spain April 2018

[32] N Tahir A Hassan M Asif and S Ahmad ldquoMCD mutuallyconnected community detection using clustering coefficientapproach in social networksrdquo in Proceedings of the 2019 2ndInternational Conference on Communication Computing andDigital Systems (C-CODE) pp 160ndash165 Islamabad PakistanMarch 2019

[33] V A Lewis C A MacGregor and R D Putnam ldquoReligionnetworks and neighborliness the impact of religious socialnetworks on civic engagementrdquo Social Science Researchvol 42 no 2 pp 331ndash346 2013

[34] M S Granovetter ldquo+e strength of weak ties a networktheory revisitedrdquo Sociologicaleory vol 1 pp 201ndash233 1983

[35] J Gui Z Zheng X Zhao and Z Qin ldquoStatistical propertiesand temporal properties of calling behaviorrdquo in Proceedings ofthe 2019 IEEE 5th International Conference on Computer andCommunications (ICCC) pp 1940ndash1944 Chengdu ChinaDecember 2019

[36] I H Sarker ldquoA machine learning based robust predictionmodel for real-life mobile phone datardquo Internet of ingsvol 5 pp 180ndash193 2019

[37] M S Bernstein E Bakshy M Burke and B KarrerldquoQuantifying the invisible audience in social networksrdquo inProceedings of the SIGCHI Conference on Human Factors inComputing Systems pp 21ndash30 Paris France April 2013

[38] R Montasari R Hill V Carpenter and F Montaseri ldquoDigitalforensic investigation of social media acquisition and analysisof digital evidencerdquo International Journal of Strategic Engi-neering vol 2 no 1 pp 52ndash60 2019

[39] M Yip N Shadbolt and C Webber ldquoStructural analysis ofonline criminal social networksrdquo in Proceedings of the 2012IEEE International Conference on Intelligence and SecurityInformatics pp 60ndash65 Arlington VA USA June 2012

[40] Z Zhao C Li X Zhang F Chiclana and E H Viedma ldquoAnincremental method to detect communities in dynamicevolving social networksrdquo Knowledge-Based Systems vol 163pp 404ndash415 2019

[41] S Vosoughi D Roy and S Aral ldquo+e spread of true and falsenews onlinerdquo Science vol 359 no 6380 pp 1146ndash1151 2018

[42] Y Fang J Gao Z Liu and C Huang ldquoDetecting cyber threatevent from twitter using IDCNN and BILSTMrdquo AppliedSciences vol 10 no 17 p 5922 2020

[43] P Lu L Zhang X Liu J Yao and Z Zhu ldquoHighly efficientdata migration and backup for big data applications in elasticoptical inter-data-center networksrdquo IEEE Network vol 29no 5 pp 36ndash42 2015

[44] O Sharif M Hoque A Kayes R Nowrozy and I SarkerldquoDetecting suspicious texts using machine learning tech-niquesrdquo Applied Sciences vol 10 no 18 p 6527 2020

[45] K C Roy M Cebrian and S Hasan ldquoQuantifying humanmobility resilience to extreme events using geo-located socialmedia datardquo EPJ Data Science vol 8 no 1 p 18 2019

[46] A Squicciarini S Karumanchi D Lin and N DesistoldquoIdentifying hidden social circles for advanced privacy con-figurationrdquo Computers amp Security vol 41 pp 40ndash51 2014

[47] L Chen F W Crawford and A Karbasi ldquoSeeing the unseennetwork Inferring hidden social ties from respondent-drivensamplingrdquo in Proceedings of the irtieth AAAI Conference onArtificial Intelligence pp 1174ndash1180 Phoenix AZ USAFebruary 2016

Scientific Programming 15

[48] J Leskovec and A Krevl ldquoSNAP datasets stanford largenetwork dataset collectionrdquo 2014 httpsnapstanfordedudata

[49] X Zheng Y Wang and M A Orgun ldquoContextual sub-network extraction in contextual social networksrdquo in Pro-ceedings of the 2015 IEEE TrustcomBigDataSEISPApp 119ndash126 Helsinki Finland August 2015

[50] A Madani and M Marjan ldquoMining social networks to dis-cover ego sub-networksrdquo in Proceedings of the 2016 3rd MECInternational Conference on Big Data and Smart City(ICBDSC) pp 1ndash5 Muscat Oman March 2016

16 Scientific Programming

Page 10: InferringTiesinSocialIoTUsingLocation-BasedNetworksand ...3 JP]V3 ]4JJV3 Brightkite and Gowalla are openly available location-basedsocialnetworkdatasets[13,48].Bothdatasetsare gathered

Let TH denotes a set thi1113864 1113865 of thn points and thi iscalled as the threshold value for timeframeLet C denotes a set ci1113864 1113865 of cn points and ci is called asthe call information

ci (id ct) whereid as call id

ct as outgoing call time1113896 (10)

Let R denotes a set ri1113864 1113865 of rn points then

ri dui ci(id)( 1113857 times dui ci(id)( 11138571113858 1113859anddui ne dui (11)

Let RT denotes a set rti1113864 1113865 of rtn points then

rti dui ci(ct)( 1113857 minus dui cti(ct)( 11138571113858 1113859anddui ne dui (12)

Let S denotes a set si1113864 1113865 of sn points then

si ri isin bsi( 1113857and rti le thi( 11138571113858 1113859 (13)

wheresi is a set of elements that identifies distinct callersbased on the same base station connectivity and a definitenumber of calls in a specific time frame

6 Suspect Inference Framework

We studied the pattern of exceptional cases belong to thefalse-positive set and described a subset of the false-positive set as suspect ties Physical activity was designedand conducted to investigate the formation of suspectsocial ties Activity consisting of 50 people and a data setwas generated within almost 4-5 hours A basketballcourt was utilized for the activity Nine circles weredrawn physically on the basketball court assumed as thebase station cell Out of 50 people nine were directed toact as a base station +e boundary of each circle wasconsidered as a range of the base station cell Rests of the41 persons were directed to perform the following twosteps

61 Selection Step Initially each person from 41 people wasasked to choose two sets of friends One set as obviousfriends and the second set of hidden friends such that thesize of the hidden friend set should be at most 15 of theobvious friend set size eg if one person has five people inapparent friend set he can have no more than one hiddenfriend After the selection of both sets by each individualinformation was shared with one of our representatives

62 Operation Step In this phase each person was directedto follow the following rules

(1) You should not call your hidden friend

(2) You should call all of your obvious friends at leastonce

(3) You should conduct a maximum number of calls toyour closest obvious friend and second maximum toa second level obvious friend and likewise to the leastfriend

(4) You must try to meet your hidden friend as much aspossible physically

+e method of calling is like if person A wants to callperson B from base station B1 the person has to go to thebase station B1 and register a call with a person acting andstanding in base station B1 Respective base station personwill write and make an entry with five parameters ie CallerName Callee Name Time From Base Station Name and ToBase Station Name +e data set was gathered in the fol-lowing format An example is given below

CallerName

CalleeName Time Stamp

FromBase

Station

ToBase

Station

User 1 User 12 02032019101508 BS 7 BS 2

For the understanding of the variations and patterns thesame activity was also designed using simulation +e wholesimulation followed the same conditions and another dataset was generated using a random function Based on theactivity a framework is designed to separate a class ofsuspect ties Proposed inference framework work is designedand implemented using pipe and filter architecture shownin Figure 8 Algorithm 2 shows the implementation ofsuspicious tie inference framework explained in Figure 8+e framework takes the social network matrix countthreshold value gap time value and mutual friend countvalue as inputs and filters the result Initially CalculateLevels() function finds the five level depth information foreach distinct user Let us say if A calls B B calls C C calls Dand D calls E it implies that A 0 B 1 C 2 D 3 andE 4 represent five levels +is step ensures that all the levelshave distinct users Secondly Find Suspects() functionselects only those sets of users from level 1 and level 3 that donot have any direct calls and M number of mutual friendsFurthermore Calculate Subsocial Network() functiongenerates subgraphs using level 1 and level 3 detailsdepending upon the gap time value Results are filtered onthese bases of time gap value eg two users called someother user while connected to the same base station withinthe given time frame explained in the previous section Afterthat Infer Hidden Ties() function uses the proposed nor-malization method to find the number of the count definedin equation (9) and then all results are filtered according tothe cooccurrence count CV and the mutual friend thresholdvalue M Based on mentioned parameters and thresholdsAlgorithm 2 significantly identifies the subclass of missingties as suspicious ties

+e results of the activity and simulation are computedand evaluated using Precision and Recall measures

10 Scientific Programming

RequireSN Social NetworkGVTime Gap+reshold ValueCVCooccurrence Count +reshold ValueMMutual Friend Count

EnsureST Inferred Social Suspect Ties

(1) while SNne 0 do(2) Levels [5 n]Calculate Levels()(3) Suspects Find Suspects (Level 1 Level 3)(4) SGCalculate Subsocial Network (Suspects GV)(5) ST Infer Suspect Ties (SG CV M) using equation (9)(6) end while

ALGORITHM 2 Suspicious hidden social tie inference

Table 1 Evaluation of simulation and social activity

Gap timeMge 2 Mge 5

Simulation Social activity Simulation Social activityP R F1 P R F1 P R F1 P R F1

CVge 5

GVlt 10 012 068 020 012 100 021 013 062 021 012 079 021GVlt 20 010 042 016 015 079 025 015 031 020 013 042 020GVlt 30 021 022 021 071 059 064 025 018 021 023 028 025GV ge 30 034 023 027 100 012 021 030 010 015 032 019 024

CVge 10

GVlt 10 010 033 015 012 048 019 009 030 014 011 061 019GVlt 20 011 028 016 027 050 021 009 018 012 010 039 016GVlt 30 013 018 015 029 030 029 009 010 009 009 020 012GV ge 30 013 008 010 026 005 008 010 004 006 013 007 009

CVge 20

GVlt 10 004 018 007 008 023 012 005 029 009 011 039 017GVlt 20 007 008 007 005 024 008 004 011 006 010 028 015GVlt 30 004 006 005 009 009 009 003 005 004 009 019 012GV ge 30 002 001 001 001 001 001 001 001 001 001 005 002

lowastP Precision R Recall andF1 F1 Score

Social network

Count number of co-occurtence

Suspect social tiesMutual friend threshold

value

Count threshold value

Calculate sub social networks via

time gap threshold value

Calculate 0 1 2 3 4 level ties

Highlight missing ties between

level 1 and level 3

((xn ym) times (ymβn))mm=1

nn=1=

Figure 8 Suspect social tie inference framework

Scientific Programming 11

Evaluation results of simulation and social activity con-ducted are shown in Table 1 Precision Recall and F1 Scoremeasures are used to evaluate the framework that is furthercalculated based on the cooccurrence count CV mutualfriend count M and gap time GV parameter setting valuesF1 Score is calculated using equation (5) Definitions of therelated parameters are given as follows

TP Which were hidden friends and system infer ashidden friends

TN Which were not hidden friends and system inferas not hidden friends

FP Which were not hidden friends and system inferas hidden friends

FN Which were hidden friends and system infer asnot hidden friends

63 Results and Discussion We tested and evaluated allrecords based on the cooccurrence count value CV mutualfriend M and gap time value GV as the gap timeFigures 9(a)ndash9(d) show the evaluation of results for theactivity based on three different values of the cooccurrencecount value CV ie CV 5 10 and 15 and two different

values of the mutual friend M ie Mge 2 and ge5 LikewiseFigures 10(a)ndash10(d) show the evaluation of results for thesimulation data set along with mentioned CV and M valuesResults were generated on four values of gap time ie ge 30lt 30 lt 20 and lt 10

In Figures 9(a) and 9(b) Recall is maximum and Pre-cision is less where the value of GVge 30 CVge 5 and Mge 2+e system obtains a maximum number of relevant hiddenties along with false-positive results By GVge 30 CVge 5 andMge 2 it means that the gap between the two calls is 30minsor more while the count of cooccurred events is keptminimum five and mutual friend count as two or less +esystemrsquos performance drops when gap time is reduced tolt 30 lt 20 and lt 10 Even though there is a drop in true-positive values a significant drop in the false-positive valuescan be seen Results become concise with the variation inboth values of GV and M +e limited data set collectedusing activity highlights the occurrences of hidden tiesWhole activity and simulation were designed to get similarfields of data as CDR so that the proposed framework iscompatible with the CDR data set

Results shown in Figures 9 and 10 exhibit the existence ofhidden relationships +e parameter such as the gap timevalue GV helps to identify the time frame selection such that

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1Pr

ecisi

onM ge 2

(a)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

(b)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Prec

ision

M ge 5

(c)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 9 Evaluation of social activity using Precision and Recall

12 Scientific Programming

what gap value is suitable to get optimal performanceLikewise Figures 10(a) to 10(d) represent the Precision andRecall curves for the simulation data set Results are gatheredby setting the same values for the threshold parameters CVGV and M Figures 10(a) and 10(d) give the least values forPrecision and Recall which tells that if threshold values aretightened so does the count relevant retrieve results get lowOut of all the results Precision and Recall values for bothsimulation and activity data set showmaximum if thresholdvalues GVge 30 CVge 5 and Mge 2 are set We found someexciting dissimilarities in the results between simulation andactivity data set results during the complete evaluationprocess +e simulation results do not give higher Recall andPrecision value compared to activity data results We con-cluded that the data set of simulation is generated usingrandom function while the activity had the human hidingpatterns +ese results also help to infer human psychologyin building hidden ties +e simulation data set generatorworks based on the same constraints mentioned as rules ofthe activity However the key difference between simulationand activity is the selection of friends hidden friend and the

pattern of calling While conducting the simulation trivialvariation in results was observed as the random selection hasrandom patterns According to our findings simulation andactivity both exhibit patterns of hidden ties However ac-tivity results are more pronounced and significantly iden-tifying the hidden relationships

It is essential to highlight some critical questions anddependent variables that help to find hidden social tiesbetween two people for example why two people hide theirsocial ties Is this deliberate or unintentional action What ifthey are deliberately hiding their social tie for a purpose Insuch a scenario extracting a social tie for two people is a kindof intense problem It is a kind of investigation processwhich explores a clue to draw some relationship betweentwo people Investigation designates if two people areposting a picture on social media to be more cautious aboutnot identifying their social ties While if they are doing someprivate activity they will be less careful for example postinga picture on Facebook or Flickr compared to calling to aperson from a specific location Although extracting hiddensocial ties include various privacy issues We designed

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Prec

ision

M ge 2

CV ge 5CV ge 10CV ge 15

(a)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

CV ge 5CV ge 10CV ge 15

(b)

Prec

ision

0010203040506070809

1M ge 5

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

(c)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

0010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 10 Evaluation of simulation using Precision and Recall

Scientific Programming 13

activity and simulation that generated the data set by thecaller data record (CDR) data set We thoroughly investi-gated the patterns of connectivity and established aframework to infer social ties +is research has opened up anew direction further to explore connectivity in the SocialInternet of +ings (SIoT) specifically machine-to-machinedirect communication and machine-to-human or human-to-machine hidden relationships In many cases severalmachines work together but they rarely have a directconnection

7 Conclusion and Future Work

In this research we have examined the developments ofsocial connection patterns based on physical gathering Inthe first part of our research we explored the correlationbetween direct communication and two individualsrsquo phys-ical presence To check the systemrsquos performance andevaluation we examined all results by utilizing Precision andRecall evaluation measures We also present a periodicnormalization equation for the cooccurrence count In thesecond phase we propose the suspect tie inference frame-work False-positive results of the first part of the researchare the ground to the second part of the study +e proposedframework adopts pipe and filter architecture where thethreshold values control each filter +e frameworkrsquos fun-damental objective is to take the data set such as CDR (callerdata record) and infer suspect social ties depending uponthe specified threshold values Analyzing the results criti-cally we propose a theory that identifies suspect social tiesBesides this for comparison and evaluation we conductedreal-time human-based activity and simulation Keeping inmind the structure of the actual CDR data set the wholeactivity was designed and evaluated In contrast to existingwork our research focus is on hidden ties instead of thehidden actors In the future we are aiming to explore thehomophilic nature of suspect ties within the Social Internetof +ings (SIoT)

Data Availability

+e data used can be found at httpsnapstanfordedudataindexhtmllocnet

Conflicts of Interest

+e authors declare that they have no conflicts of interestregarding the publication of this work

Acknowledgments

+is work was supported by King Saud University SaudiArabia through research supporting project number RSP-2020184 Nauman Ali Khan acknowledges the support ofthe Chinese Government and Chinese Scholarship Council(CSC) for his PhD studies at the University of Science andTechnology China +is research work was partially sup-ported by Key Program of the National Natural ScienceFoundation of China (grant number 61631018)

References

[1] L Weng M Karsai N Perra F Menczer and A FlamminildquoAttention on weak ties in social and communication net-worksrdquo in Complex Spreading Phenomena in Social Systemspp 213ndash228 Springer Berlin Germany 2018

[2] A M Ortiz D Hussein S Park S N Han and N Crespildquo+e cluster between internet of things and social networksreview and research challengesrdquo IEEE Internet of ingsJournal vol 1 no 3 pp 206ndash215 2014

[3] P Luarn and Y-P Chiu ldquoKey variables to predict tie strengthon social network sitesrdquo Internet Research vol 25 no 2pp 218ndash238 2015

[4] D Goel and K Lang ldquoSocial ties and the job search of recentimmigrantsrdquo ILR Review vol 72 no 2 pp 355ndash381 2019

[5] H Kizgin A Jamal N Rana Y Dwivedi and VWeerakkodyldquo+e impact of social networking sites on socialization andpolitical engagement role of acculturationrdquo TechnologicalForecasting and Social Change vol 145 pp 503ndash512 2019

[6] P Suebvises ldquoSocial capital citizen participation in publicadministration and public sector performance in +ailandrdquoWorld Development vol 109 pp 236ndash248 2018

[7] Z Liu Y Qiao S TaoW Lin and J Yang ldquoAnalyzing humanmobility and social relationships from cellular network datardquoin Proceedings of the 2017 13th International Conference onNetwork and Service Management (CNSM) pp 1ndash6 TokyoJapan November 2017

[8] L Matosas-Lopez and A Romero-Ania ldquo+e efficiency ofsocial network services management in organizations an in-depth analysis applying machine learning algorithms andmultiple linear regressionsrdquo Applied Sciences vol 10 no 15p 5167 2020

[9] M S Granovetter ldquo+e strength of weak tiesrdquo in SocialNetworks pp 347ndash367 Elsevier Amsterdam Netherlands1977

[10] T Nguyen L Zhang and A Culotta ldquoEstimating tie strengthin follower networks to measure brand perceptionsrdquo inProceedings of the 2019 IEEEACM International Conferenceon Advances in Social Networks Analysis and Mining (ASO-NAM) pp 779ndash786 Vancouver Canada August 2019

[11] H Mattie K Engoslash-Monsen R Ling and J-P OnnelaldquoUnderstanding tie strength in social networks using a localbow tie frameworkrdquo Scientific Reports vol 8 no 1 p 93492018

[12] D J Crandall L Backstrom D Cosley S SuriD Huttenlocher and J Kleinberg ldquoInferring social ties fromgeographic coincidencesrdquo Proceedings of the NationalAcademy of Sciences vol 107 no 52 pp 22436ndash24441 2010

[13] E Cho S A Myers and J Leskovec ldquoFriendship and mo-bility user movement in location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Con-ference on Knowledge Discovery and Data Mining pp 1082ndash1090 San Diego CA USA August 2011

[14] A Znidarsic P Doreian and A Ferligoj ldquoAbsent ties in socialnetworks their treatments and blockmodeling outcomesrdquoAdvances in Methodology amp StatisticsMetodoloski Zvezkivol 9 no 2 2012

[15] S A Myers A Sharma P Gupta and J Lin ldquoInformationnetwork or social network the structure of the twitter followgraphrdquo in Proceedings of the 23rd International Conference onWorld Wide Web pp 493ndash498 Seoul Republic of KoreaApril 2014

14 Scientific Programming

[16] X Zuo J Blackburn N Kourtellis J Skvoretz andA Iamnitchi ldquo+e power of indirect social tiesrdquo 2014 httparxivorgabs14014234

[17] M A Brandatildeo and M M Moro ldquoTie strength in co-au-thorship social networks analyses metrics and a new com-putational modelrdquo in Proceedings of the Anais Estendidos doXXV Simposio Brasileiro de Sistemas Multimıdia e Webpp 17ndash20 Rio de Janeiro Brazil 2019

[18] H A Khanday A H Ganai and R Hashmy ldquoA comparativeanalysis of identifying influential users in online social net-worksrdquo in Proceedings of the 2018 International Conference onSoft-Computing and Network Security (ICSNS) pp 1ndash6Coimbatore India February 2018

[19] R Garcia-Guzman Y A Andrade-Ambriz M-A Ibarra-Manzano S Ledesma J C Gomez and D-L Almanza-Ojeda ldquoTrend-based categories recommendations and age-gender prediction for pinterest and twitter usersrdquo AppliedSciences vol 10 no 17 p 5957 2020

[20] A Talukder M G R Alam N H Tran D Niyato andC S Hong ldquoKnapsack-based reverse influence maximizationfor target marketing in social networksrdquo IEEE Access vol 7pp 44182ndash44198 2019

[21] N Ampazis T Emmanouilidis and F Sakketou ldquoA matrixfactorization algorithm for efficient recommendations insocial rating networks using constrained optimizationrdquoMachine Learning and Knowledge Extraction vol 1 no 3p 928 2019

[22] N Zhou X Zhang and S Wang ldquo+eme-aware socialstrength inference from spatiotemporal datardquo in Proceedingsof the International Conference on Web-Age InformationManagement pp 498ndash509 Macau China June 2014

[23] P A Grabowicz J J Ramasco B Gonccedilalves andV M Eguıluz ldquoEntangling mobility and interactions in socialmediardquo PLoS One vol 9 no 3 Article ID e92196 2014

[24] D Yang B Qu J Yang and P Cudre-Mauroux ldquoRevisitinguser mobility and social relationships in LBSNs a hypergraphembedding approachrdquo in Proceedings of the World Wide WebConference pp 2147ndash2157 San Francisco NY USA May2019

[25] R-H Li J Liu J X Yu H Chen and H Kitagawa ldquoCo-occurrence prediction in a large location-based social net-workrdquo Frontiers of Computer Science vol 7 no 2pp 185ndash194 2013

[26] V Jayadevan K Bharadwaj A Kumar and P KhandelwalldquoDiscovering local social groups using mobility datardquo Inter-national Journal of Computer Applications vol 120 no 212015

[27] A J Jara Y Bocchi and D Genoud ldquoSocial internet of thingsthe potential of the internet of things for defining humanbehavioursrdquo in Proceedings of the 2014 International Con-ference on Intelligent Networking and Collaborative Systemspp 581ndash585 Salerno Italy September 2014

[28] W A D M Jayathilaka K Qi Y Qin et al ldquoSignificance ofnanomaterials in wearables a review on wearable actuatorsand sensorsrdquo Advanced Materials vol 31 no 7 Article ID1805921 2019

[29] J Ren H Guo C Xu and Y Zhang ldquoServing at the edge ascalable iot architecture based on transparent computingrdquoIEEE Network vol 31 no 5 pp 96ndash105 2017

[30] I Seeber E Bittner R O Briggs et al ldquoMachines as team-mates a research agenda on ai in team collaborationrdquo In-formation amp Management vol 57 no 2 Article ID 1031742020

[31] T Pi L Cao P Lv Z Ye and H Wang ldquoInferring implicitsocial ties in mobile social networksrdquo in Proceedings of the2018 IEEE Wireless Communications and Networking Con-ference (WCNC) pp 1ndash6 Barcelona Spain April 2018

[32] N Tahir A Hassan M Asif and S Ahmad ldquoMCD mutuallyconnected community detection using clustering coefficientapproach in social networksrdquo in Proceedings of the 2019 2ndInternational Conference on Communication Computing andDigital Systems (C-CODE) pp 160ndash165 Islamabad PakistanMarch 2019

[33] V A Lewis C A MacGregor and R D Putnam ldquoReligionnetworks and neighborliness the impact of religious socialnetworks on civic engagementrdquo Social Science Researchvol 42 no 2 pp 331ndash346 2013

[34] M S Granovetter ldquo+e strength of weak ties a networktheory revisitedrdquo Sociologicaleory vol 1 pp 201ndash233 1983

[35] J Gui Z Zheng X Zhao and Z Qin ldquoStatistical propertiesand temporal properties of calling behaviorrdquo in Proceedings ofthe 2019 IEEE 5th International Conference on Computer andCommunications (ICCC) pp 1940ndash1944 Chengdu ChinaDecember 2019

[36] I H Sarker ldquoA machine learning based robust predictionmodel for real-life mobile phone datardquo Internet of ingsvol 5 pp 180ndash193 2019

[37] M S Bernstein E Bakshy M Burke and B KarrerldquoQuantifying the invisible audience in social networksrdquo inProceedings of the SIGCHI Conference on Human Factors inComputing Systems pp 21ndash30 Paris France April 2013

[38] R Montasari R Hill V Carpenter and F Montaseri ldquoDigitalforensic investigation of social media acquisition and analysisof digital evidencerdquo International Journal of Strategic Engi-neering vol 2 no 1 pp 52ndash60 2019

[39] M Yip N Shadbolt and C Webber ldquoStructural analysis ofonline criminal social networksrdquo in Proceedings of the 2012IEEE International Conference on Intelligence and SecurityInformatics pp 60ndash65 Arlington VA USA June 2012

[40] Z Zhao C Li X Zhang F Chiclana and E H Viedma ldquoAnincremental method to detect communities in dynamicevolving social networksrdquo Knowledge-Based Systems vol 163pp 404ndash415 2019

[41] S Vosoughi D Roy and S Aral ldquo+e spread of true and falsenews onlinerdquo Science vol 359 no 6380 pp 1146ndash1151 2018

[42] Y Fang J Gao Z Liu and C Huang ldquoDetecting cyber threatevent from twitter using IDCNN and BILSTMrdquo AppliedSciences vol 10 no 17 p 5922 2020

[43] P Lu L Zhang X Liu J Yao and Z Zhu ldquoHighly efficientdata migration and backup for big data applications in elasticoptical inter-data-center networksrdquo IEEE Network vol 29no 5 pp 36ndash42 2015

[44] O Sharif M Hoque A Kayes R Nowrozy and I SarkerldquoDetecting suspicious texts using machine learning tech-niquesrdquo Applied Sciences vol 10 no 18 p 6527 2020

[45] K C Roy M Cebrian and S Hasan ldquoQuantifying humanmobility resilience to extreme events using geo-located socialmedia datardquo EPJ Data Science vol 8 no 1 p 18 2019

[46] A Squicciarini S Karumanchi D Lin and N DesistoldquoIdentifying hidden social circles for advanced privacy con-figurationrdquo Computers amp Security vol 41 pp 40ndash51 2014

[47] L Chen F W Crawford and A Karbasi ldquoSeeing the unseennetwork Inferring hidden social ties from respondent-drivensamplingrdquo in Proceedings of the irtieth AAAI Conference onArtificial Intelligence pp 1174ndash1180 Phoenix AZ USAFebruary 2016

Scientific Programming 15

[48] J Leskovec and A Krevl ldquoSNAP datasets stanford largenetwork dataset collectionrdquo 2014 httpsnapstanfordedudata

[49] X Zheng Y Wang and M A Orgun ldquoContextual sub-network extraction in contextual social networksrdquo in Pro-ceedings of the 2015 IEEE TrustcomBigDataSEISPApp 119ndash126 Helsinki Finland August 2015

[50] A Madani and M Marjan ldquoMining social networks to dis-cover ego sub-networksrdquo in Proceedings of the 2016 3rd MECInternational Conference on Big Data and Smart City(ICBDSC) pp 1ndash5 Muscat Oman March 2016

16 Scientific Programming

Page 11: InferringTiesinSocialIoTUsingLocation-BasedNetworksand ...3 JP]V3 ]4JJV3 Brightkite and Gowalla are openly available location-basedsocialnetworkdatasets[13,48].Bothdatasetsare gathered

RequireSN Social NetworkGVTime Gap+reshold ValueCVCooccurrence Count +reshold ValueMMutual Friend Count

EnsureST Inferred Social Suspect Ties

(1) while SNne 0 do(2) Levels [5 n]Calculate Levels()(3) Suspects Find Suspects (Level 1 Level 3)(4) SGCalculate Subsocial Network (Suspects GV)(5) ST Infer Suspect Ties (SG CV M) using equation (9)(6) end while

ALGORITHM 2 Suspicious hidden social tie inference

Table 1 Evaluation of simulation and social activity

Gap timeMge 2 Mge 5

Simulation Social activity Simulation Social activityP R F1 P R F1 P R F1 P R F1

CVge 5

GVlt 10 012 068 020 012 100 021 013 062 021 012 079 021GVlt 20 010 042 016 015 079 025 015 031 020 013 042 020GVlt 30 021 022 021 071 059 064 025 018 021 023 028 025GV ge 30 034 023 027 100 012 021 030 010 015 032 019 024

CVge 10

GVlt 10 010 033 015 012 048 019 009 030 014 011 061 019GVlt 20 011 028 016 027 050 021 009 018 012 010 039 016GVlt 30 013 018 015 029 030 029 009 010 009 009 020 012GV ge 30 013 008 010 026 005 008 010 004 006 013 007 009

CVge 20

GVlt 10 004 018 007 008 023 012 005 029 009 011 039 017GVlt 20 007 008 007 005 024 008 004 011 006 010 028 015GVlt 30 004 006 005 009 009 009 003 005 004 009 019 012GV ge 30 002 001 001 001 001 001 001 001 001 001 005 002

lowastP Precision R Recall andF1 F1 Score

Social network

Count number of co-occurtence

Suspect social tiesMutual friend threshold

value

Count threshold value

Calculate sub social networks via

time gap threshold value

Calculate 0 1 2 3 4 level ties

Highlight missing ties between

level 1 and level 3

((xn ym) times (ymβn))mm=1

nn=1=

Figure 8 Suspect social tie inference framework

Scientific Programming 11

Evaluation results of simulation and social activity con-ducted are shown in Table 1 Precision Recall and F1 Scoremeasures are used to evaluate the framework that is furthercalculated based on the cooccurrence count CV mutualfriend count M and gap time GV parameter setting valuesF1 Score is calculated using equation (5) Definitions of therelated parameters are given as follows

TP Which were hidden friends and system infer ashidden friends

TN Which were not hidden friends and system inferas not hidden friends

FP Which were not hidden friends and system inferas hidden friends

FN Which were hidden friends and system infer asnot hidden friends

63 Results and Discussion We tested and evaluated allrecords based on the cooccurrence count value CV mutualfriend M and gap time value GV as the gap timeFigures 9(a)ndash9(d) show the evaluation of results for theactivity based on three different values of the cooccurrencecount value CV ie CV 5 10 and 15 and two different

values of the mutual friend M ie Mge 2 and ge5 LikewiseFigures 10(a)ndash10(d) show the evaluation of results for thesimulation data set along with mentioned CV and M valuesResults were generated on four values of gap time ie ge 30lt 30 lt 20 and lt 10

In Figures 9(a) and 9(b) Recall is maximum and Pre-cision is less where the value of GVge 30 CVge 5 and Mge 2+e system obtains a maximum number of relevant hiddenties along with false-positive results By GVge 30 CVge 5 andMge 2 it means that the gap between the two calls is 30minsor more while the count of cooccurred events is keptminimum five and mutual friend count as two or less +esystemrsquos performance drops when gap time is reduced tolt 30 lt 20 and lt 10 Even though there is a drop in true-positive values a significant drop in the false-positive valuescan be seen Results become concise with the variation inboth values of GV and M +e limited data set collectedusing activity highlights the occurrences of hidden tiesWhole activity and simulation were designed to get similarfields of data as CDR so that the proposed framework iscompatible with the CDR data set

Results shown in Figures 9 and 10 exhibit the existence ofhidden relationships +e parameter such as the gap timevalue GV helps to identify the time frame selection such that

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1Pr

ecisi

onM ge 2

(a)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

(b)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Prec

ision

M ge 5

(c)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 9 Evaluation of social activity using Precision and Recall

12 Scientific Programming

what gap value is suitable to get optimal performanceLikewise Figures 10(a) to 10(d) represent the Precision andRecall curves for the simulation data set Results are gatheredby setting the same values for the threshold parameters CVGV and M Figures 10(a) and 10(d) give the least values forPrecision and Recall which tells that if threshold values aretightened so does the count relevant retrieve results get lowOut of all the results Precision and Recall values for bothsimulation and activity data set showmaximum if thresholdvalues GVge 30 CVge 5 and Mge 2 are set We found someexciting dissimilarities in the results between simulation andactivity data set results during the complete evaluationprocess +e simulation results do not give higher Recall andPrecision value compared to activity data results We con-cluded that the data set of simulation is generated usingrandom function while the activity had the human hidingpatterns +ese results also help to infer human psychologyin building hidden ties +e simulation data set generatorworks based on the same constraints mentioned as rules ofthe activity However the key difference between simulationand activity is the selection of friends hidden friend and the

pattern of calling While conducting the simulation trivialvariation in results was observed as the random selection hasrandom patterns According to our findings simulation andactivity both exhibit patterns of hidden ties However ac-tivity results are more pronounced and significantly iden-tifying the hidden relationships

It is essential to highlight some critical questions anddependent variables that help to find hidden social tiesbetween two people for example why two people hide theirsocial ties Is this deliberate or unintentional action What ifthey are deliberately hiding their social tie for a purpose Insuch a scenario extracting a social tie for two people is a kindof intense problem It is a kind of investigation processwhich explores a clue to draw some relationship betweentwo people Investigation designates if two people areposting a picture on social media to be more cautious aboutnot identifying their social ties While if they are doing someprivate activity they will be less careful for example postinga picture on Facebook or Flickr compared to calling to aperson from a specific location Although extracting hiddensocial ties include various privacy issues We designed

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Prec

ision

M ge 2

CV ge 5CV ge 10CV ge 15

(a)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

CV ge 5CV ge 10CV ge 15

(b)

Prec

ision

0010203040506070809

1M ge 5

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

(c)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

0010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 10 Evaluation of simulation using Precision and Recall

Scientific Programming 13

activity and simulation that generated the data set by thecaller data record (CDR) data set We thoroughly investi-gated the patterns of connectivity and established aframework to infer social ties +is research has opened up anew direction further to explore connectivity in the SocialInternet of +ings (SIoT) specifically machine-to-machinedirect communication and machine-to-human or human-to-machine hidden relationships In many cases severalmachines work together but they rarely have a directconnection

7 Conclusion and Future Work

In this research we have examined the developments ofsocial connection patterns based on physical gathering Inthe first part of our research we explored the correlationbetween direct communication and two individualsrsquo phys-ical presence To check the systemrsquos performance andevaluation we examined all results by utilizing Precision andRecall evaluation measures We also present a periodicnormalization equation for the cooccurrence count In thesecond phase we propose the suspect tie inference frame-work False-positive results of the first part of the researchare the ground to the second part of the study +e proposedframework adopts pipe and filter architecture where thethreshold values control each filter +e frameworkrsquos fun-damental objective is to take the data set such as CDR (callerdata record) and infer suspect social ties depending uponthe specified threshold values Analyzing the results criti-cally we propose a theory that identifies suspect social tiesBesides this for comparison and evaluation we conductedreal-time human-based activity and simulation Keeping inmind the structure of the actual CDR data set the wholeactivity was designed and evaluated In contrast to existingwork our research focus is on hidden ties instead of thehidden actors In the future we are aiming to explore thehomophilic nature of suspect ties within the Social Internetof +ings (SIoT)

Data Availability

+e data used can be found at httpsnapstanfordedudataindexhtmllocnet

Conflicts of Interest

+e authors declare that they have no conflicts of interestregarding the publication of this work

Acknowledgments

+is work was supported by King Saud University SaudiArabia through research supporting project number RSP-2020184 Nauman Ali Khan acknowledges the support ofthe Chinese Government and Chinese Scholarship Council(CSC) for his PhD studies at the University of Science andTechnology China +is research work was partially sup-ported by Key Program of the National Natural ScienceFoundation of China (grant number 61631018)

References

[1] L Weng M Karsai N Perra F Menczer and A FlamminildquoAttention on weak ties in social and communication net-worksrdquo in Complex Spreading Phenomena in Social Systemspp 213ndash228 Springer Berlin Germany 2018

[2] A M Ortiz D Hussein S Park S N Han and N Crespildquo+e cluster between internet of things and social networksreview and research challengesrdquo IEEE Internet of ingsJournal vol 1 no 3 pp 206ndash215 2014

[3] P Luarn and Y-P Chiu ldquoKey variables to predict tie strengthon social network sitesrdquo Internet Research vol 25 no 2pp 218ndash238 2015

[4] D Goel and K Lang ldquoSocial ties and the job search of recentimmigrantsrdquo ILR Review vol 72 no 2 pp 355ndash381 2019

[5] H Kizgin A Jamal N Rana Y Dwivedi and VWeerakkodyldquo+e impact of social networking sites on socialization andpolitical engagement role of acculturationrdquo TechnologicalForecasting and Social Change vol 145 pp 503ndash512 2019

[6] P Suebvises ldquoSocial capital citizen participation in publicadministration and public sector performance in +ailandrdquoWorld Development vol 109 pp 236ndash248 2018

[7] Z Liu Y Qiao S TaoW Lin and J Yang ldquoAnalyzing humanmobility and social relationships from cellular network datardquoin Proceedings of the 2017 13th International Conference onNetwork and Service Management (CNSM) pp 1ndash6 TokyoJapan November 2017

[8] L Matosas-Lopez and A Romero-Ania ldquo+e efficiency ofsocial network services management in organizations an in-depth analysis applying machine learning algorithms andmultiple linear regressionsrdquo Applied Sciences vol 10 no 15p 5167 2020

[9] M S Granovetter ldquo+e strength of weak tiesrdquo in SocialNetworks pp 347ndash367 Elsevier Amsterdam Netherlands1977

[10] T Nguyen L Zhang and A Culotta ldquoEstimating tie strengthin follower networks to measure brand perceptionsrdquo inProceedings of the 2019 IEEEACM International Conferenceon Advances in Social Networks Analysis and Mining (ASO-NAM) pp 779ndash786 Vancouver Canada August 2019

[11] H Mattie K Engoslash-Monsen R Ling and J-P OnnelaldquoUnderstanding tie strength in social networks using a localbow tie frameworkrdquo Scientific Reports vol 8 no 1 p 93492018

[12] D J Crandall L Backstrom D Cosley S SuriD Huttenlocher and J Kleinberg ldquoInferring social ties fromgeographic coincidencesrdquo Proceedings of the NationalAcademy of Sciences vol 107 no 52 pp 22436ndash24441 2010

[13] E Cho S A Myers and J Leskovec ldquoFriendship and mo-bility user movement in location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Con-ference on Knowledge Discovery and Data Mining pp 1082ndash1090 San Diego CA USA August 2011

[14] A Znidarsic P Doreian and A Ferligoj ldquoAbsent ties in socialnetworks their treatments and blockmodeling outcomesrdquoAdvances in Methodology amp StatisticsMetodoloski Zvezkivol 9 no 2 2012

[15] S A Myers A Sharma P Gupta and J Lin ldquoInformationnetwork or social network the structure of the twitter followgraphrdquo in Proceedings of the 23rd International Conference onWorld Wide Web pp 493ndash498 Seoul Republic of KoreaApril 2014

14 Scientific Programming

[16] X Zuo J Blackburn N Kourtellis J Skvoretz andA Iamnitchi ldquo+e power of indirect social tiesrdquo 2014 httparxivorgabs14014234

[17] M A Brandatildeo and M M Moro ldquoTie strength in co-au-thorship social networks analyses metrics and a new com-putational modelrdquo in Proceedings of the Anais Estendidos doXXV Simposio Brasileiro de Sistemas Multimıdia e Webpp 17ndash20 Rio de Janeiro Brazil 2019

[18] H A Khanday A H Ganai and R Hashmy ldquoA comparativeanalysis of identifying influential users in online social net-worksrdquo in Proceedings of the 2018 International Conference onSoft-Computing and Network Security (ICSNS) pp 1ndash6Coimbatore India February 2018

[19] R Garcia-Guzman Y A Andrade-Ambriz M-A Ibarra-Manzano S Ledesma J C Gomez and D-L Almanza-Ojeda ldquoTrend-based categories recommendations and age-gender prediction for pinterest and twitter usersrdquo AppliedSciences vol 10 no 17 p 5957 2020

[20] A Talukder M G R Alam N H Tran D Niyato andC S Hong ldquoKnapsack-based reverse influence maximizationfor target marketing in social networksrdquo IEEE Access vol 7pp 44182ndash44198 2019

[21] N Ampazis T Emmanouilidis and F Sakketou ldquoA matrixfactorization algorithm for efficient recommendations insocial rating networks using constrained optimizationrdquoMachine Learning and Knowledge Extraction vol 1 no 3p 928 2019

[22] N Zhou X Zhang and S Wang ldquo+eme-aware socialstrength inference from spatiotemporal datardquo in Proceedingsof the International Conference on Web-Age InformationManagement pp 498ndash509 Macau China June 2014

[23] P A Grabowicz J J Ramasco B Gonccedilalves andV M Eguıluz ldquoEntangling mobility and interactions in socialmediardquo PLoS One vol 9 no 3 Article ID e92196 2014

[24] D Yang B Qu J Yang and P Cudre-Mauroux ldquoRevisitinguser mobility and social relationships in LBSNs a hypergraphembedding approachrdquo in Proceedings of the World Wide WebConference pp 2147ndash2157 San Francisco NY USA May2019

[25] R-H Li J Liu J X Yu H Chen and H Kitagawa ldquoCo-occurrence prediction in a large location-based social net-workrdquo Frontiers of Computer Science vol 7 no 2pp 185ndash194 2013

[26] V Jayadevan K Bharadwaj A Kumar and P KhandelwalldquoDiscovering local social groups using mobility datardquo Inter-national Journal of Computer Applications vol 120 no 212015

[27] A J Jara Y Bocchi and D Genoud ldquoSocial internet of thingsthe potential of the internet of things for defining humanbehavioursrdquo in Proceedings of the 2014 International Con-ference on Intelligent Networking and Collaborative Systemspp 581ndash585 Salerno Italy September 2014

[28] W A D M Jayathilaka K Qi Y Qin et al ldquoSignificance ofnanomaterials in wearables a review on wearable actuatorsand sensorsrdquo Advanced Materials vol 31 no 7 Article ID1805921 2019

[29] J Ren H Guo C Xu and Y Zhang ldquoServing at the edge ascalable iot architecture based on transparent computingrdquoIEEE Network vol 31 no 5 pp 96ndash105 2017

[30] I Seeber E Bittner R O Briggs et al ldquoMachines as team-mates a research agenda on ai in team collaborationrdquo In-formation amp Management vol 57 no 2 Article ID 1031742020

[31] T Pi L Cao P Lv Z Ye and H Wang ldquoInferring implicitsocial ties in mobile social networksrdquo in Proceedings of the2018 IEEE Wireless Communications and Networking Con-ference (WCNC) pp 1ndash6 Barcelona Spain April 2018

[32] N Tahir A Hassan M Asif and S Ahmad ldquoMCD mutuallyconnected community detection using clustering coefficientapproach in social networksrdquo in Proceedings of the 2019 2ndInternational Conference on Communication Computing andDigital Systems (C-CODE) pp 160ndash165 Islamabad PakistanMarch 2019

[33] V A Lewis C A MacGregor and R D Putnam ldquoReligionnetworks and neighborliness the impact of religious socialnetworks on civic engagementrdquo Social Science Researchvol 42 no 2 pp 331ndash346 2013

[34] M S Granovetter ldquo+e strength of weak ties a networktheory revisitedrdquo Sociologicaleory vol 1 pp 201ndash233 1983

[35] J Gui Z Zheng X Zhao and Z Qin ldquoStatistical propertiesand temporal properties of calling behaviorrdquo in Proceedings ofthe 2019 IEEE 5th International Conference on Computer andCommunications (ICCC) pp 1940ndash1944 Chengdu ChinaDecember 2019

[36] I H Sarker ldquoA machine learning based robust predictionmodel for real-life mobile phone datardquo Internet of ingsvol 5 pp 180ndash193 2019

[37] M S Bernstein E Bakshy M Burke and B KarrerldquoQuantifying the invisible audience in social networksrdquo inProceedings of the SIGCHI Conference on Human Factors inComputing Systems pp 21ndash30 Paris France April 2013

[38] R Montasari R Hill V Carpenter and F Montaseri ldquoDigitalforensic investigation of social media acquisition and analysisof digital evidencerdquo International Journal of Strategic Engi-neering vol 2 no 1 pp 52ndash60 2019

[39] M Yip N Shadbolt and C Webber ldquoStructural analysis ofonline criminal social networksrdquo in Proceedings of the 2012IEEE International Conference on Intelligence and SecurityInformatics pp 60ndash65 Arlington VA USA June 2012

[40] Z Zhao C Li X Zhang F Chiclana and E H Viedma ldquoAnincremental method to detect communities in dynamicevolving social networksrdquo Knowledge-Based Systems vol 163pp 404ndash415 2019

[41] S Vosoughi D Roy and S Aral ldquo+e spread of true and falsenews onlinerdquo Science vol 359 no 6380 pp 1146ndash1151 2018

[42] Y Fang J Gao Z Liu and C Huang ldquoDetecting cyber threatevent from twitter using IDCNN and BILSTMrdquo AppliedSciences vol 10 no 17 p 5922 2020

[43] P Lu L Zhang X Liu J Yao and Z Zhu ldquoHighly efficientdata migration and backup for big data applications in elasticoptical inter-data-center networksrdquo IEEE Network vol 29no 5 pp 36ndash42 2015

[44] O Sharif M Hoque A Kayes R Nowrozy and I SarkerldquoDetecting suspicious texts using machine learning tech-niquesrdquo Applied Sciences vol 10 no 18 p 6527 2020

[45] K C Roy M Cebrian and S Hasan ldquoQuantifying humanmobility resilience to extreme events using geo-located socialmedia datardquo EPJ Data Science vol 8 no 1 p 18 2019

[46] A Squicciarini S Karumanchi D Lin and N DesistoldquoIdentifying hidden social circles for advanced privacy con-figurationrdquo Computers amp Security vol 41 pp 40ndash51 2014

[47] L Chen F W Crawford and A Karbasi ldquoSeeing the unseennetwork Inferring hidden social ties from respondent-drivensamplingrdquo in Proceedings of the irtieth AAAI Conference onArtificial Intelligence pp 1174ndash1180 Phoenix AZ USAFebruary 2016

Scientific Programming 15

[48] J Leskovec and A Krevl ldquoSNAP datasets stanford largenetwork dataset collectionrdquo 2014 httpsnapstanfordedudata

[49] X Zheng Y Wang and M A Orgun ldquoContextual sub-network extraction in contextual social networksrdquo in Pro-ceedings of the 2015 IEEE TrustcomBigDataSEISPApp 119ndash126 Helsinki Finland August 2015

[50] A Madani and M Marjan ldquoMining social networks to dis-cover ego sub-networksrdquo in Proceedings of the 2016 3rd MECInternational Conference on Big Data and Smart City(ICBDSC) pp 1ndash5 Muscat Oman March 2016

16 Scientific Programming

Page 12: InferringTiesinSocialIoTUsingLocation-BasedNetworksand ...3 JP]V3 ]4JJV3 Brightkite and Gowalla are openly available location-basedsocialnetworkdatasets[13,48].Bothdatasetsare gathered

Evaluation results of simulation and social activity con-ducted are shown in Table 1 Precision Recall and F1 Scoremeasures are used to evaluate the framework that is furthercalculated based on the cooccurrence count CV mutualfriend count M and gap time GV parameter setting valuesF1 Score is calculated using equation (5) Definitions of therelated parameters are given as follows

TP Which were hidden friends and system infer ashidden friends

TN Which were not hidden friends and system inferas not hidden friends

FP Which were not hidden friends and system inferas hidden friends

FN Which were hidden friends and system infer asnot hidden friends

63 Results and Discussion We tested and evaluated allrecords based on the cooccurrence count value CV mutualfriend M and gap time value GV as the gap timeFigures 9(a)ndash9(d) show the evaluation of results for theactivity based on three different values of the cooccurrencecount value CV ie CV 5 10 and 15 and two different

values of the mutual friend M ie Mge 2 and ge5 LikewiseFigures 10(a)ndash10(d) show the evaluation of results for thesimulation data set along with mentioned CV and M valuesResults were generated on four values of gap time ie ge 30lt 30 lt 20 and lt 10

In Figures 9(a) and 9(b) Recall is maximum and Pre-cision is less where the value of GVge 30 CVge 5 and Mge 2+e system obtains a maximum number of relevant hiddenties along with false-positive results By GVge 30 CVge 5 andMge 2 it means that the gap between the two calls is 30minsor more while the count of cooccurred events is keptminimum five and mutual friend count as two or less +esystemrsquos performance drops when gap time is reduced tolt 30 lt 20 and lt 10 Even though there is a drop in true-positive values a significant drop in the false-positive valuescan be seen Results become concise with the variation inboth values of GV and M +e limited data set collectedusing activity highlights the occurrences of hidden tiesWhole activity and simulation were designed to get similarfields of data as CDR so that the proposed framework iscompatible with the CDR data set

Results shown in Figures 9 and 10 exhibit the existence ofhidden relationships +e parameter such as the gap timevalue GV helps to identify the time frame selection such that

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1Pr

ecisi

onM ge 2

(a)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

(b)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Prec

ision

M ge 5

(c)

CV ge 5CV ge 10CV ge 15

Gap time lt 30Gap time ge 30 Gap time lt 20 Gap time lt 100

010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 9 Evaluation of social activity using Precision and Recall

12 Scientific Programming

what gap value is suitable to get optimal performanceLikewise Figures 10(a) to 10(d) represent the Precision andRecall curves for the simulation data set Results are gatheredby setting the same values for the threshold parameters CVGV and M Figures 10(a) and 10(d) give the least values forPrecision and Recall which tells that if threshold values aretightened so does the count relevant retrieve results get lowOut of all the results Precision and Recall values for bothsimulation and activity data set showmaximum if thresholdvalues GVge 30 CVge 5 and Mge 2 are set We found someexciting dissimilarities in the results between simulation andactivity data set results during the complete evaluationprocess +e simulation results do not give higher Recall andPrecision value compared to activity data results We con-cluded that the data set of simulation is generated usingrandom function while the activity had the human hidingpatterns +ese results also help to infer human psychologyin building hidden ties +e simulation data set generatorworks based on the same constraints mentioned as rules ofthe activity However the key difference between simulationand activity is the selection of friends hidden friend and the

pattern of calling While conducting the simulation trivialvariation in results was observed as the random selection hasrandom patterns According to our findings simulation andactivity both exhibit patterns of hidden ties However ac-tivity results are more pronounced and significantly iden-tifying the hidden relationships

It is essential to highlight some critical questions anddependent variables that help to find hidden social tiesbetween two people for example why two people hide theirsocial ties Is this deliberate or unintentional action What ifthey are deliberately hiding their social tie for a purpose Insuch a scenario extracting a social tie for two people is a kindof intense problem It is a kind of investigation processwhich explores a clue to draw some relationship betweentwo people Investigation designates if two people areposting a picture on social media to be more cautious aboutnot identifying their social ties While if they are doing someprivate activity they will be less careful for example postinga picture on Facebook or Flickr compared to calling to aperson from a specific location Although extracting hiddensocial ties include various privacy issues We designed

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Prec

ision

M ge 2

CV ge 5CV ge 10CV ge 15

(a)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

CV ge 5CV ge 10CV ge 15

(b)

Prec

ision

0010203040506070809

1M ge 5

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

(c)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

0010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 10 Evaluation of simulation using Precision and Recall

Scientific Programming 13

activity and simulation that generated the data set by thecaller data record (CDR) data set We thoroughly investi-gated the patterns of connectivity and established aframework to infer social ties +is research has opened up anew direction further to explore connectivity in the SocialInternet of +ings (SIoT) specifically machine-to-machinedirect communication and machine-to-human or human-to-machine hidden relationships In many cases severalmachines work together but they rarely have a directconnection

7 Conclusion and Future Work

In this research we have examined the developments ofsocial connection patterns based on physical gathering Inthe first part of our research we explored the correlationbetween direct communication and two individualsrsquo phys-ical presence To check the systemrsquos performance andevaluation we examined all results by utilizing Precision andRecall evaluation measures We also present a periodicnormalization equation for the cooccurrence count In thesecond phase we propose the suspect tie inference frame-work False-positive results of the first part of the researchare the ground to the second part of the study +e proposedframework adopts pipe and filter architecture where thethreshold values control each filter +e frameworkrsquos fun-damental objective is to take the data set such as CDR (callerdata record) and infer suspect social ties depending uponthe specified threshold values Analyzing the results criti-cally we propose a theory that identifies suspect social tiesBesides this for comparison and evaluation we conductedreal-time human-based activity and simulation Keeping inmind the structure of the actual CDR data set the wholeactivity was designed and evaluated In contrast to existingwork our research focus is on hidden ties instead of thehidden actors In the future we are aiming to explore thehomophilic nature of suspect ties within the Social Internetof +ings (SIoT)

Data Availability

+e data used can be found at httpsnapstanfordedudataindexhtmllocnet

Conflicts of Interest

+e authors declare that they have no conflicts of interestregarding the publication of this work

Acknowledgments

+is work was supported by King Saud University SaudiArabia through research supporting project number RSP-2020184 Nauman Ali Khan acknowledges the support ofthe Chinese Government and Chinese Scholarship Council(CSC) for his PhD studies at the University of Science andTechnology China +is research work was partially sup-ported by Key Program of the National Natural ScienceFoundation of China (grant number 61631018)

References

[1] L Weng M Karsai N Perra F Menczer and A FlamminildquoAttention on weak ties in social and communication net-worksrdquo in Complex Spreading Phenomena in Social Systemspp 213ndash228 Springer Berlin Germany 2018

[2] A M Ortiz D Hussein S Park S N Han and N Crespildquo+e cluster between internet of things and social networksreview and research challengesrdquo IEEE Internet of ingsJournal vol 1 no 3 pp 206ndash215 2014

[3] P Luarn and Y-P Chiu ldquoKey variables to predict tie strengthon social network sitesrdquo Internet Research vol 25 no 2pp 218ndash238 2015

[4] D Goel and K Lang ldquoSocial ties and the job search of recentimmigrantsrdquo ILR Review vol 72 no 2 pp 355ndash381 2019

[5] H Kizgin A Jamal N Rana Y Dwivedi and VWeerakkodyldquo+e impact of social networking sites on socialization andpolitical engagement role of acculturationrdquo TechnologicalForecasting and Social Change vol 145 pp 503ndash512 2019

[6] P Suebvises ldquoSocial capital citizen participation in publicadministration and public sector performance in +ailandrdquoWorld Development vol 109 pp 236ndash248 2018

[7] Z Liu Y Qiao S TaoW Lin and J Yang ldquoAnalyzing humanmobility and social relationships from cellular network datardquoin Proceedings of the 2017 13th International Conference onNetwork and Service Management (CNSM) pp 1ndash6 TokyoJapan November 2017

[8] L Matosas-Lopez and A Romero-Ania ldquo+e efficiency ofsocial network services management in organizations an in-depth analysis applying machine learning algorithms andmultiple linear regressionsrdquo Applied Sciences vol 10 no 15p 5167 2020

[9] M S Granovetter ldquo+e strength of weak tiesrdquo in SocialNetworks pp 347ndash367 Elsevier Amsterdam Netherlands1977

[10] T Nguyen L Zhang and A Culotta ldquoEstimating tie strengthin follower networks to measure brand perceptionsrdquo inProceedings of the 2019 IEEEACM International Conferenceon Advances in Social Networks Analysis and Mining (ASO-NAM) pp 779ndash786 Vancouver Canada August 2019

[11] H Mattie K Engoslash-Monsen R Ling and J-P OnnelaldquoUnderstanding tie strength in social networks using a localbow tie frameworkrdquo Scientific Reports vol 8 no 1 p 93492018

[12] D J Crandall L Backstrom D Cosley S SuriD Huttenlocher and J Kleinberg ldquoInferring social ties fromgeographic coincidencesrdquo Proceedings of the NationalAcademy of Sciences vol 107 no 52 pp 22436ndash24441 2010

[13] E Cho S A Myers and J Leskovec ldquoFriendship and mo-bility user movement in location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Con-ference on Knowledge Discovery and Data Mining pp 1082ndash1090 San Diego CA USA August 2011

[14] A Znidarsic P Doreian and A Ferligoj ldquoAbsent ties in socialnetworks their treatments and blockmodeling outcomesrdquoAdvances in Methodology amp StatisticsMetodoloski Zvezkivol 9 no 2 2012

[15] S A Myers A Sharma P Gupta and J Lin ldquoInformationnetwork or social network the structure of the twitter followgraphrdquo in Proceedings of the 23rd International Conference onWorld Wide Web pp 493ndash498 Seoul Republic of KoreaApril 2014

14 Scientific Programming

[16] X Zuo J Blackburn N Kourtellis J Skvoretz andA Iamnitchi ldquo+e power of indirect social tiesrdquo 2014 httparxivorgabs14014234

[17] M A Brandatildeo and M M Moro ldquoTie strength in co-au-thorship social networks analyses metrics and a new com-putational modelrdquo in Proceedings of the Anais Estendidos doXXV Simposio Brasileiro de Sistemas Multimıdia e Webpp 17ndash20 Rio de Janeiro Brazil 2019

[18] H A Khanday A H Ganai and R Hashmy ldquoA comparativeanalysis of identifying influential users in online social net-worksrdquo in Proceedings of the 2018 International Conference onSoft-Computing and Network Security (ICSNS) pp 1ndash6Coimbatore India February 2018

[19] R Garcia-Guzman Y A Andrade-Ambriz M-A Ibarra-Manzano S Ledesma J C Gomez and D-L Almanza-Ojeda ldquoTrend-based categories recommendations and age-gender prediction for pinterest and twitter usersrdquo AppliedSciences vol 10 no 17 p 5957 2020

[20] A Talukder M G R Alam N H Tran D Niyato andC S Hong ldquoKnapsack-based reverse influence maximizationfor target marketing in social networksrdquo IEEE Access vol 7pp 44182ndash44198 2019

[21] N Ampazis T Emmanouilidis and F Sakketou ldquoA matrixfactorization algorithm for efficient recommendations insocial rating networks using constrained optimizationrdquoMachine Learning and Knowledge Extraction vol 1 no 3p 928 2019

[22] N Zhou X Zhang and S Wang ldquo+eme-aware socialstrength inference from spatiotemporal datardquo in Proceedingsof the International Conference on Web-Age InformationManagement pp 498ndash509 Macau China June 2014

[23] P A Grabowicz J J Ramasco B Gonccedilalves andV M Eguıluz ldquoEntangling mobility and interactions in socialmediardquo PLoS One vol 9 no 3 Article ID e92196 2014

[24] D Yang B Qu J Yang and P Cudre-Mauroux ldquoRevisitinguser mobility and social relationships in LBSNs a hypergraphembedding approachrdquo in Proceedings of the World Wide WebConference pp 2147ndash2157 San Francisco NY USA May2019

[25] R-H Li J Liu J X Yu H Chen and H Kitagawa ldquoCo-occurrence prediction in a large location-based social net-workrdquo Frontiers of Computer Science vol 7 no 2pp 185ndash194 2013

[26] V Jayadevan K Bharadwaj A Kumar and P KhandelwalldquoDiscovering local social groups using mobility datardquo Inter-national Journal of Computer Applications vol 120 no 212015

[27] A J Jara Y Bocchi and D Genoud ldquoSocial internet of thingsthe potential of the internet of things for defining humanbehavioursrdquo in Proceedings of the 2014 International Con-ference on Intelligent Networking and Collaborative Systemspp 581ndash585 Salerno Italy September 2014

[28] W A D M Jayathilaka K Qi Y Qin et al ldquoSignificance ofnanomaterials in wearables a review on wearable actuatorsand sensorsrdquo Advanced Materials vol 31 no 7 Article ID1805921 2019

[29] J Ren H Guo C Xu and Y Zhang ldquoServing at the edge ascalable iot architecture based on transparent computingrdquoIEEE Network vol 31 no 5 pp 96ndash105 2017

[30] I Seeber E Bittner R O Briggs et al ldquoMachines as team-mates a research agenda on ai in team collaborationrdquo In-formation amp Management vol 57 no 2 Article ID 1031742020

[31] T Pi L Cao P Lv Z Ye and H Wang ldquoInferring implicitsocial ties in mobile social networksrdquo in Proceedings of the2018 IEEE Wireless Communications and Networking Con-ference (WCNC) pp 1ndash6 Barcelona Spain April 2018

[32] N Tahir A Hassan M Asif and S Ahmad ldquoMCD mutuallyconnected community detection using clustering coefficientapproach in social networksrdquo in Proceedings of the 2019 2ndInternational Conference on Communication Computing andDigital Systems (C-CODE) pp 160ndash165 Islamabad PakistanMarch 2019

[33] V A Lewis C A MacGregor and R D Putnam ldquoReligionnetworks and neighborliness the impact of religious socialnetworks on civic engagementrdquo Social Science Researchvol 42 no 2 pp 331ndash346 2013

[34] M S Granovetter ldquo+e strength of weak ties a networktheory revisitedrdquo Sociologicaleory vol 1 pp 201ndash233 1983

[35] J Gui Z Zheng X Zhao and Z Qin ldquoStatistical propertiesand temporal properties of calling behaviorrdquo in Proceedings ofthe 2019 IEEE 5th International Conference on Computer andCommunications (ICCC) pp 1940ndash1944 Chengdu ChinaDecember 2019

[36] I H Sarker ldquoA machine learning based robust predictionmodel for real-life mobile phone datardquo Internet of ingsvol 5 pp 180ndash193 2019

[37] M S Bernstein E Bakshy M Burke and B KarrerldquoQuantifying the invisible audience in social networksrdquo inProceedings of the SIGCHI Conference on Human Factors inComputing Systems pp 21ndash30 Paris France April 2013

[38] R Montasari R Hill V Carpenter and F Montaseri ldquoDigitalforensic investigation of social media acquisition and analysisof digital evidencerdquo International Journal of Strategic Engi-neering vol 2 no 1 pp 52ndash60 2019

[39] M Yip N Shadbolt and C Webber ldquoStructural analysis ofonline criminal social networksrdquo in Proceedings of the 2012IEEE International Conference on Intelligence and SecurityInformatics pp 60ndash65 Arlington VA USA June 2012

[40] Z Zhao C Li X Zhang F Chiclana and E H Viedma ldquoAnincremental method to detect communities in dynamicevolving social networksrdquo Knowledge-Based Systems vol 163pp 404ndash415 2019

[41] S Vosoughi D Roy and S Aral ldquo+e spread of true and falsenews onlinerdquo Science vol 359 no 6380 pp 1146ndash1151 2018

[42] Y Fang J Gao Z Liu and C Huang ldquoDetecting cyber threatevent from twitter using IDCNN and BILSTMrdquo AppliedSciences vol 10 no 17 p 5922 2020

[43] P Lu L Zhang X Liu J Yao and Z Zhu ldquoHighly efficientdata migration and backup for big data applications in elasticoptical inter-data-center networksrdquo IEEE Network vol 29no 5 pp 36ndash42 2015

[44] O Sharif M Hoque A Kayes R Nowrozy and I SarkerldquoDetecting suspicious texts using machine learning tech-niquesrdquo Applied Sciences vol 10 no 18 p 6527 2020

[45] K C Roy M Cebrian and S Hasan ldquoQuantifying humanmobility resilience to extreme events using geo-located socialmedia datardquo EPJ Data Science vol 8 no 1 p 18 2019

[46] A Squicciarini S Karumanchi D Lin and N DesistoldquoIdentifying hidden social circles for advanced privacy con-figurationrdquo Computers amp Security vol 41 pp 40ndash51 2014

[47] L Chen F W Crawford and A Karbasi ldquoSeeing the unseennetwork Inferring hidden social ties from respondent-drivensamplingrdquo in Proceedings of the irtieth AAAI Conference onArtificial Intelligence pp 1174ndash1180 Phoenix AZ USAFebruary 2016

Scientific Programming 15

[48] J Leskovec and A Krevl ldquoSNAP datasets stanford largenetwork dataset collectionrdquo 2014 httpsnapstanfordedudata

[49] X Zheng Y Wang and M A Orgun ldquoContextual sub-network extraction in contextual social networksrdquo in Pro-ceedings of the 2015 IEEE TrustcomBigDataSEISPApp 119ndash126 Helsinki Finland August 2015

[50] A Madani and M Marjan ldquoMining social networks to dis-cover ego sub-networksrdquo in Proceedings of the 2016 3rd MECInternational Conference on Big Data and Smart City(ICBDSC) pp 1ndash5 Muscat Oman March 2016

16 Scientific Programming

Page 13: InferringTiesinSocialIoTUsingLocation-BasedNetworksand ...3 JP]V3 ]4JJV3 Brightkite and Gowalla are openly available location-basedsocialnetworkdatasets[13,48].Bothdatasetsare gathered

what gap value is suitable to get optimal performanceLikewise Figures 10(a) to 10(d) represent the Precision andRecall curves for the simulation data set Results are gatheredby setting the same values for the threshold parameters CVGV and M Figures 10(a) and 10(d) give the least values forPrecision and Recall which tells that if threshold values aretightened so does the count relevant retrieve results get lowOut of all the results Precision and Recall values for bothsimulation and activity data set showmaximum if thresholdvalues GVge 30 CVge 5 and Mge 2 are set We found someexciting dissimilarities in the results between simulation andactivity data set results during the complete evaluationprocess +e simulation results do not give higher Recall andPrecision value compared to activity data results We con-cluded that the data set of simulation is generated usingrandom function while the activity had the human hidingpatterns +ese results also help to infer human psychologyin building hidden ties +e simulation data set generatorworks based on the same constraints mentioned as rules ofthe activity However the key difference between simulationand activity is the selection of friends hidden friend and the

pattern of calling While conducting the simulation trivialvariation in results was observed as the random selection hasrandom patterns According to our findings simulation andactivity both exhibit patterns of hidden ties However ac-tivity results are more pronounced and significantly iden-tifying the hidden relationships

It is essential to highlight some critical questions anddependent variables that help to find hidden social tiesbetween two people for example why two people hide theirsocial ties Is this deliberate or unintentional action What ifthey are deliberately hiding their social tie for a purpose Insuch a scenario extracting a social tie for two people is a kindof intense problem It is a kind of investigation processwhich explores a clue to draw some relationship betweentwo people Investigation designates if two people areposting a picture on social media to be more cautious aboutnot identifying their social ties While if they are doing someprivate activity they will be less careful for example postinga picture on Facebook or Flickr compared to calling to aperson from a specific location Although extracting hiddensocial ties include various privacy issues We designed

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Prec

ision

M ge 2

CV ge 5CV ge 10CV ge 15

(a)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 100

010203040506070809

1

Reca

ll

M ge 2

CV ge 5CV ge 10CV ge 15

(b)

Prec

ision

0010203040506070809

1M ge 5

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

(c)

Gap time lt 30 Gap time ge 30Gap time lt 20Gap time lt 10

CV ge 5CV ge 10CV ge 15

0010203040506070809

1

Reca

ll

M ge 5

(d)

Figure 10 Evaluation of simulation using Precision and Recall

Scientific Programming 13

activity and simulation that generated the data set by thecaller data record (CDR) data set We thoroughly investi-gated the patterns of connectivity and established aframework to infer social ties +is research has opened up anew direction further to explore connectivity in the SocialInternet of +ings (SIoT) specifically machine-to-machinedirect communication and machine-to-human or human-to-machine hidden relationships In many cases severalmachines work together but they rarely have a directconnection

7 Conclusion and Future Work

In this research we have examined the developments ofsocial connection patterns based on physical gathering Inthe first part of our research we explored the correlationbetween direct communication and two individualsrsquo phys-ical presence To check the systemrsquos performance andevaluation we examined all results by utilizing Precision andRecall evaluation measures We also present a periodicnormalization equation for the cooccurrence count In thesecond phase we propose the suspect tie inference frame-work False-positive results of the first part of the researchare the ground to the second part of the study +e proposedframework adopts pipe and filter architecture where thethreshold values control each filter +e frameworkrsquos fun-damental objective is to take the data set such as CDR (callerdata record) and infer suspect social ties depending uponthe specified threshold values Analyzing the results criti-cally we propose a theory that identifies suspect social tiesBesides this for comparison and evaluation we conductedreal-time human-based activity and simulation Keeping inmind the structure of the actual CDR data set the wholeactivity was designed and evaluated In contrast to existingwork our research focus is on hidden ties instead of thehidden actors In the future we are aiming to explore thehomophilic nature of suspect ties within the Social Internetof +ings (SIoT)

Data Availability

+e data used can be found at httpsnapstanfordedudataindexhtmllocnet

Conflicts of Interest

+e authors declare that they have no conflicts of interestregarding the publication of this work

Acknowledgments

+is work was supported by King Saud University SaudiArabia through research supporting project number RSP-2020184 Nauman Ali Khan acknowledges the support ofthe Chinese Government and Chinese Scholarship Council(CSC) for his PhD studies at the University of Science andTechnology China +is research work was partially sup-ported by Key Program of the National Natural ScienceFoundation of China (grant number 61631018)

References

[1] L Weng M Karsai N Perra F Menczer and A FlamminildquoAttention on weak ties in social and communication net-worksrdquo in Complex Spreading Phenomena in Social Systemspp 213ndash228 Springer Berlin Germany 2018

[2] A M Ortiz D Hussein S Park S N Han and N Crespildquo+e cluster between internet of things and social networksreview and research challengesrdquo IEEE Internet of ingsJournal vol 1 no 3 pp 206ndash215 2014

[3] P Luarn and Y-P Chiu ldquoKey variables to predict tie strengthon social network sitesrdquo Internet Research vol 25 no 2pp 218ndash238 2015

[4] D Goel and K Lang ldquoSocial ties and the job search of recentimmigrantsrdquo ILR Review vol 72 no 2 pp 355ndash381 2019

[5] H Kizgin A Jamal N Rana Y Dwivedi and VWeerakkodyldquo+e impact of social networking sites on socialization andpolitical engagement role of acculturationrdquo TechnologicalForecasting and Social Change vol 145 pp 503ndash512 2019

[6] P Suebvises ldquoSocial capital citizen participation in publicadministration and public sector performance in +ailandrdquoWorld Development vol 109 pp 236ndash248 2018

[7] Z Liu Y Qiao S TaoW Lin and J Yang ldquoAnalyzing humanmobility and social relationships from cellular network datardquoin Proceedings of the 2017 13th International Conference onNetwork and Service Management (CNSM) pp 1ndash6 TokyoJapan November 2017

[8] L Matosas-Lopez and A Romero-Ania ldquo+e efficiency ofsocial network services management in organizations an in-depth analysis applying machine learning algorithms andmultiple linear regressionsrdquo Applied Sciences vol 10 no 15p 5167 2020

[9] M S Granovetter ldquo+e strength of weak tiesrdquo in SocialNetworks pp 347ndash367 Elsevier Amsterdam Netherlands1977

[10] T Nguyen L Zhang and A Culotta ldquoEstimating tie strengthin follower networks to measure brand perceptionsrdquo inProceedings of the 2019 IEEEACM International Conferenceon Advances in Social Networks Analysis and Mining (ASO-NAM) pp 779ndash786 Vancouver Canada August 2019

[11] H Mattie K Engoslash-Monsen R Ling and J-P OnnelaldquoUnderstanding tie strength in social networks using a localbow tie frameworkrdquo Scientific Reports vol 8 no 1 p 93492018

[12] D J Crandall L Backstrom D Cosley S SuriD Huttenlocher and J Kleinberg ldquoInferring social ties fromgeographic coincidencesrdquo Proceedings of the NationalAcademy of Sciences vol 107 no 52 pp 22436ndash24441 2010

[13] E Cho S A Myers and J Leskovec ldquoFriendship and mo-bility user movement in location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Con-ference on Knowledge Discovery and Data Mining pp 1082ndash1090 San Diego CA USA August 2011

[14] A Znidarsic P Doreian and A Ferligoj ldquoAbsent ties in socialnetworks their treatments and blockmodeling outcomesrdquoAdvances in Methodology amp StatisticsMetodoloski Zvezkivol 9 no 2 2012

[15] S A Myers A Sharma P Gupta and J Lin ldquoInformationnetwork or social network the structure of the twitter followgraphrdquo in Proceedings of the 23rd International Conference onWorld Wide Web pp 493ndash498 Seoul Republic of KoreaApril 2014

14 Scientific Programming

[16] X Zuo J Blackburn N Kourtellis J Skvoretz andA Iamnitchi ldquo+e power of indirect social tiesrdquo 2014 httparxivorgabs14014234

[17] M A Brandatildeo and M M Moro ldquoTie strength in co-au-thorship social networks analyses metrics and a new com-putational modelrdquo in Proceedings of the Anais Estendidos doXXV Simposio Brasileiro de Sistemas Multimıdia e Webpp 17ndash20 Rio de Janeiro Brazil 2019

[18] H A Khanday A H Ganai and R Hashmy ldquoA comparativeanalysis of identifying influential users in online social net-worksrdquo in Proceedings of the 2018 International Conference onSoft-Computing and Network Security (ICSNS) pp 1ndash6Coimbatore India February 2018

[19] R Garcia-Guzman Y A Andrade-Ambriz M-A Ibarra-Manzano S Ledesma J C Gomez and D-L Almanza-Ojeda ldquoTrend-based categories recommendations and age-gender prediction for pinterest and twitter usersrdquo AppliedSciences vol 10 no 17 p 5957 2020

[20] A Talukder M G R Alam N H Tran D Niyato andC S Hong ldquoKnapsack-based reverse influence maximizationfor target marketing in social networksrdquo IEEE Access vol 7pp 44182ndash44198 2019

[21] N Ampazis T Emmanouilidis and F Sakketou ldquoA matrixfactorization algorithm for efficient recommendations insocial rating networks using constrained optimizationrdquoMachine Learning and Knowledge Extraction vol 1 no 3p 928 2019

[22] N Zhou X Zhang and S Wang ldquo+eme-aware socialstrength inference from spatiotemporal datardquo in Proceedingsof the International Conference on Web-Age InformationManagement pp 498ndash509 Macau China June 2014

[23] P A Grabowicz J J Ramasco B Gonccedilalves andV M Eguıluz ldquoEntangling mobility and interactions in socialmediardquo PLoS One vol 9 no 3 Article ID e92196 2014

[24] D Yang B Qu J Yang and P Cudre-Mauroux ldquoRevisitinguser mobility and social relationships in LBSNs a hypergraphembedding approachrdquo in Proceedings of the World Wide WebConference pp 2147ndash2157 San Francisco NY USA May2019

[25] R-H Li J Liu J X Yu H Chen and H Kitagawa ldquoCo-occurrence prediction in a large location-based social net-workrdquo Frontiers of Computer Science vol 7 no 2pp 185ndash194 2013

[26] V Jayadevan K Bharadwaj A Kumar and P KhandelwalldquoDiscovering local social groups using mobility datardquo Inter-national Journal of Computer Applications vol 120 no 212015

[27] A J Jara Y Bocchi and D Genoud ldquoSocial internet of thingsthe potential of the internet of things for defining humanbehavioursrdquo in Proceedings of the 2014 International Con-ference on Intelligent Networking and Collaborative Systemspp 581ndash585 Salerno Italy September 2014

[28] W A D M Jayathilaka K Qi Y Qin et al ldquoSignificance ofnanomaterials in wearables a review on wearable actuatorsand sensorsrdquo Advanced Materials vol 31 no 7 Article ID1805921 2019

[29] J Ren H Guo C Xu and Y Zhang ldquoServing at the edge ascalable iot architecture based on transparent computingrdquoIEEE Network vol 31 no 5 pp 96ndash105 2017

[30] I Seeber E Bittner R O Briggs et al ldquoMachines as team-mates a research agenda on ai in team collaborationrdquo In-formation amp Management vol 57 no 2 Article ID 1031742020

[31] T Pi L Cao P Lv Z Ye and H Wang ldquoInferring implicitsocial ties in mobile social networksrdquo in Proceedings of the2018 IEEE Wireless Communications and Networking Con-ference (WCNC) pp 1ndash6 Barcelona Spain April 2018

[32] N Tahir A Hassan M Asif and S Ahmad ldquoMCD mutuallyconnected community detection using clustering coefficientapproach in social networksrdquo in Proceedings of the 2019 2ndInternational Conference on Communication Computing andDigital Systems (C-CODE) pp 160ndash165 Islamabad PakistanMarch 2019

[33] V A Lewis C A MacGregor and R D Putnam ldquoReligionnetworks and neighborliness the impact of religious socialnetworks on civic engagementrdquo Social Science Researchvol 42 no 2 pp 331ndash346 2013

[34] M S Granovetter ldquo+e strength of weak ties a networktheory revisitedrdquo Sociologicaleory vol 1 pp 201ndash233 1983

[35] J Gui Z Zheng X Zhao and Z Qin ldquoStatistical propertiesand temporal properties of calling behaviorrdquo in Proceedings ofthe 2019 IEEE 5th International Conference on Computer andCommunications (ICCC) pp 1940ndash1944 Chengdu ChinaDecember 2019

[36] I H Sarker ldquoA machine learning based robust predictionmodel for real-life mobile phone datardquo Internet of ingsvol 5 pp 180ndash193 2019

[37] M S Bernstein E Bakshy M Burke and B KarrerldquoQuantifying the invisible audience in social networksrdquo inProceedings of the SIGCHI Conference on Human Factors inComputing Systems pp 21ndash30 Paris France April 2013

[38] R Montasari R Hill V Carpenter and F Montaseri ldquoDigitalforensic investigation of social media acquisition and analysisof digital evidencerdquo International Journal of Strategic Engi-neering vol 2 no 1 pp 52ndash60 2019

[39] M Yip N Shadbolt and C Webber ldquoStructural analysis ofonline criminal social networksrdquo in Proceedings of the 2012IEEE International Conference on Intelligence and SecurityInformatics pp 60ndash65 Arlington VA USA June 2012

[40] Z Zhao C Li X Zhang F Chiclana and E H Viedma ldquoAnincremental method to detect communities in dynamicevolving social networksrdquo Knowledge-Based Systems vol 163pp 404ndash415 2019

[41] S Vosoughi D Roy and S Aral ldquo+e spread of true and falsenews onlinerdquo Science vol 359 no 6380 pp 1146ndash1151 2018

[42] Y Fang J Gao Z Liu and C Huang ldquoDetecting cyber threatevent from twitter using IDCNN and BILSTMrdquo AppliedSciences vol 10 no 17 p 5922 2020

[43] P Lu L Zhang X Liu J Yao and Z Zhu ldquoHighly efficientdata migration and backup for big data applications in elasticoptical inter-data-center networksrdquo IEEE Network vol 29no 5 pp 36ndash42 2015

[44] O Sharif M Hoque A Kayes R Nowrozy and I SarkerldquoDetecting suspicious texts using machine learning tech-niquesrdquo Applied Sciences vol 10 no 18 p 6527 2020

[45] K C Roy M Cebrian and S Hasan ldquoQuantifying humanmobility resilience to extreme events using geo-located socialmedia datardquo EPJ Data Science vol 8 no 1 p 18 2019

[46] A Squicciarini S Karumanchi D Lin and N DesistoldquoIdentifying hidden social circles for advanced privacy con-figurationrdquo Computers amp Security vol 41 pp 40ndash51 2014

[47] L Chen F W Crawford and A Karbasi ldquoSeeing the unseennetwork Inferring hidden social ties from respondent-drivensamplingrdquo in Proceedings of the irtieth AAAI Conference onArtificial Intelligence pp 1174ndash1180 Phoenix AZ USAFebruary 2016

Scientific Programming 15

[48] J Leskovec and A Krevl ldquoSNAP datasets stanford largenetwork dataset collectionrdquo 2014 httpsnapstanfordedudata

[49] X Zheng Y Wang and M A Orgun ldquoContextual sub-network extraction in contextual social networksrdquo in Pro-ceedings of the 2015 IEEE TrustcomBigDataSEISPApp 119ndash126 Helsinki Finland August 2015

[50] A Madani and M Marjan ldquoMining social networks to dis-cover ego sub-networksrdquo in Proceedings of the 2016 3rd MECInternational Conference on Big Data and Smart City(ICBDSC) pp 1ndash5 Muscat Oman March 2016

16 Scientific Programming

Page 14: InferringTiesinSocialIoTUsingLocation-BasedNetworksand ...3 JP]V3 ]4JJV3 Brightkite and Gowalla are openly available location-basedsocialnetworkdatasets[13,48].Bothdatasetsare gathered

activity and simulation that generated the data set by thecaller data record (CDR) data set We thoroughly investi-gated the patterns of connectivity and established aframework to infer social ties +is research has opened up anew direction further to explore connectivity in the SocialInternet of +ings (SIoT) specifically machine-to-machinedirect communication and machine-to-human or human-to-machine hidden relationships In many cases severalmachines work together but they rarely have a directconnection

7 Conclusion and Future Work

In this research we have examined the developments ofsocial connection patterns based on physical gathering Inthe first part of our research we explored the correlationbetween direct communication and two individualsrsquo phys-ical presence To check the systemrsquos performance andevaluation we examined all results by utilizing Precision andRecall evaluation measures We also present a periodicnormalization equation for the cooccurrence count In thesecond phase we propose the suspect tie inference frame-work False-positive results of the first part of the researchare the ground to the second part of the study +e proposedframework adopts pipe and filter architecture where thethreshold values control each filter +e frameworkrsquos fun-damental objective is to take the data set such as CDR (callerdata record) and infer suspect social ties depending uponthe specified threshold values Analyzing the results criti-cally we propose a theory that identifies suspect social tiesBesides this for comparison and evaluation we conductedreal-time human-based activity and simulation Keeping inmind the structure of the actual CDR data set the wholeactivity was designed and evaluated In contrast to existingwork our research focus is on hidden ties instead of thehidden actors In the future we are aiming to explore thehomophilic nature of suspect ties within the Social Internetof +ings (SIoT)

Data Availability

+e data used can be found at httpsnapstanfordedudataindexhtmllocnet

Conflicts of Interest

+e authors declare that they have no conflicts of interestregarding the publication of this work

Acknowledgments

+is work was supported by King Saud University SaudiArabia through research supporting project number RSP-2020184 Nauman Ali Khan acknowledges the support ofthe Chinese Government and Chinese Scholarship Council(CSC) for his PhD studies at the University of Science andTechnology China +is research work was partially sup-ported by Key Program of the National Natural ScienceFoundation of China (grant number 61631018)

References

[1] L Weng M Karsai N Perra F Menczer and A FlamminildquoAttention on weak ties in social and communication net-worksrdquo in Complex Spreading Phenomena in Social Systemspp 213ndash228 Springer Berlin Germany 2018

[2] A M Ortiz D Hussein S Park S N Han and N Crespildquo+e cluster between internet of things and social networksreview and research challengesrdquo IEEE Internet of ingsJournal vol 1 no 3 pp 206ndash215 2014

[3] P Luarn and Y-P Chiu ldquoKey variables to predict tie strengthon social network sitesrdquo Internet Research vol 25 no 2pp 218ndash238 2015

[4] D Goel and K Lang ldquoSocial ties and the job search of recentimmigrantsrdquo ILR Review vol 72 no 2 pp 355ndash381 2019

[5] H Kizgin A Jamal N Rana Y Dwivedi and VWeerakkodyldquo+e impact of social networking sites on socialization andpolitical engagement role of acculturationrdquo TechnologicalForecasting and Social Change vol 145 pp 503ndash512 2019

[6] P Suebvises ldquoSocial capital citizen participation in publicadministration and public sector performance in +ailandrdquoWorld Development vol 109 pp 236ndash248 2018

[7] Z Liu Y Qiao S TaoW Lin and J Yang ldquoAnalyzing humanmobility and social relationships from cellular network datardquoin Proceedings of the 2017 13th International Conference onNetwork and Service Management (CNSM) pp 1ndash6 TokyoJapan November 2017

[8] L Matosas-Lopez and A Romero-Ania ldquo+e efficiency ofsocial network services management in organizations an in-depth analysis applying machine learning algorithms andmultiple linear regressionsrdquo Applied Sciences vol 10 no 15p 5167 2020

[9] M S Granovetter ldquo+e strength of weak tiesrdquo in SocialNetworks pp 347ndash367 Elsevier Amsterdam Netherlands1977

[10] T Nguyen L Zhang and A Culotta ldquoEstimating tie strengthin follower networks to measure brand perceptionsrdquo inProceedings of the 2019 IEEEACM International Conferenceon Advances in Social Networks Analysis and Mining (ASO-NAM) pp 779ndash786 Vancouver Canada August 2019

[11] H Mattie K Engoslash-Monsen R Ling and J-P OnnelaldquoUnderstanding tie strength in social networks using a localbow tie frameworkrdquo Scientific Reports vol 8 no 1 p 93492018

[12] D J Crandall L Backstrom D Cosley S SuriD Huttenlocher and J Kleinberg ldquoInferring social ties fromgeographic coincidencesrdquo Proceedings of the NationalAcademy of Sciences vol 107 no 52 pp 22436ndash24441 2010

[13] E Cho S A Myers and J Leskovec ldquoFriendship and mo-bility user movement in location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Con-ference on Knowledge Discovery and Data Mining pp 1082ndash1090 San Diego CA USA August 2011

[14] A Znidarsic P Doreian and A Ferligoj ldquoAbsent ties in socialnetworks their treatments and blockmodeling outcomesrdquoAdvances in Methodology amp StatisticsMetodoloski Zvezkivol 9 no 2 2012

[15] S A Myers A Sharma P Gupta and J Lin ldquoInformationnetwork or social network the structure of the twitter followgraphrdquo in Proceedings of the 23rd International Conference onWorld Wide Web pp 493ndash498 Seoul Republic of KoreaApril 2014

14 Scientific Programming

[16] X Zuo J Blackburn N Kourtellis J Skvoretz andA Iamnitchi ldquo+e power of indirect social tiesrdquo 2014 httparxivorgabs14014234

[17] M A Brandatildeo and M M Moro ldquoTie strength in co-au-thorship social networks analyses metrics and a new com-putational modelrdquo in Proceedings of the Anais Estendidos doXXV Simposio Brasileiro de Sistemas Multimıdia e Webpp 17ndash20 Rio de Janeiro Brazil 2019

[18] H A Khanday A H Ganai and R Hashmy ldquoA comparativeanalysis of identifying influential users in online social net-worksrdquo in Proceedings of the 2018 International Conference onSoft-Computing and Network Security (ICSNS) pp 1ndash6Coimbatore India February 2018

[19] R Garcia-Guzman Y A Andrade-Ambriz M-A Ibarra-Manzano S Ledesma J C Gomez and D-L Almanza-Ojeda ldquoTrend-based categories recommendations and age-gender prediction for pinterest and twitter usersrdquo AppliedSciences vol 10 no 17 p 5957 2020

[20] A Talukder M G R Alam N H Tran D Niyato andC S Hong ldquoKnapsack-based reverse influence maximizationfor target marketing in social networksrdquo IEEE Access vol 7pp 44182ndash44198 2019

[21] N Ampazis T Emmanouilidis and F Sakketou ldquoA matrixfactorization algorithm for efficient recommendations insocial rating networks using constrained optimizationrdquoMachine Learning and Knowledge Extraction vol 1 no 3p 928 2019

[22] N Zhou X Zhang and S Wang ldquo+eme-aware socialstrength inference from spatiotemporal datardquo in Proceedingsof the International Conference on Web-Age InformationManagement pp 498ndash509 Macau China June 2014

[23] P A Grabowicz J J Ramasco B Gonccedilalves andV M Eguıluz ldquoEntangling mobility and interactions in socialmediardquo PLoS One vol 9 no 3 Article ID e92196 2014

[24] D Yang B Qu J Yang and P Cudre-Mauroux ldquoRevisitinguser mobility and social relationships in LBSNs a hypergraphembedding approachrdquo in Proceedings of the World Wide WebConference pp 2147ndash2157 San Francisco NY USA May2019

[25] R-H Li J Liu J X Yu H Chen and H Kitagawa ldquoCo-occurrence prediction in a large location-based social net-workrdquo Frontiers of Computer Science vol 7 no 2pp 185ndash194 2013

[26] V Jayadevan K Bharadwaj A Kumar and P KhandelwalldquoDiscovering local social groups using mobility datardquo Inter-national Journal of Computer Applications vol 120 no 212015

[27] A J Jara Y Bocchi and D Genoud ldquoSocial internet of thingsthe potential of the internet of things for defining humanbehavioursrdquo in Proceedings of the 2014 International Con-ference on Intelligent Networking and Collaborative Systemspp 581ndash585 Salerno Italy September 2014

[28] W A D M Jayathilaka K Qi Y Qin et al ldquoSignificance ofnanomaterials in wearables a review on wearable actuatorsand sensorsrdquo Advanced Materials vol 31 no 7 Article ID1805921 2019

[29] J Ren H Guo C Xu and Y Zhang ldquoServing at the edge ascalable iot architecture based on transparent computingrdquoIEEE Network vol 31 no 5 pp 96ndash105 2017

[30] I Seeber E Bittner R O Briggs et al ldquoMachines as team-mates a research agenda on ai in team collaborationrdquo In-formation amp Management vol 57 no 2 Article ID 1031742020

[31] T Pi L Cao P Lv Z Ye and H Wang ldquoInferring implicitsocial ties in mobile social networksrdquo in Proceedings of the2018 IEEE Wireless Communications and Networking Con-ference (WCNC) pp 1ndash6 Barcelona Spain April 2018

[32] N Tahir A Hassan M Asif and S Ahmad ldquoMCD mutuallyconnected community detection using clustering coefficientapproach in social networksrdquo in Proceedings of the 2019 2ndInternational Conference on Communication Computing andDigital Systems (C-CODE) pp 160ndash165 Islamabad PakistanMarch 2019

[33] V A Lewis C A MacGregor and R D Putnam ldquoReligionnetworks and neighborliness the impact of religious socialnetworks on civic engagementrdquo Social Science Researchvol 42 no 2 pp 331ndash346 2013

[34] M S Granovetter ldquo+e strength of weak ties a networktheory revisitedrdquo Sociologicaleory vol 1 pp 201ndash233 1983

[35] J Gui Z Zheng X Zhao and Z Qin ldquoStatistical propertiesand temporal properties of calling behaviorrdquo in Proceedings ofthe 2019 IEEE 5th International Conference on Computer andCommunications (ICCC) pp 1940ndash1944 Chengdu ChinaDecember 2019

[36] I H Sarker ldquoA machine learning based robust predictionmodel for real-life mobile phone datardquo Internet of ingsvol 5 pp 180ndash193 2019

[37] M S Bernstein E Bakshy M Burke and B KarrerldquoQuantifying the invisible audience in social networksrdquo inProceedings of the SIGCHI Conference on Human Factors inComputing Systems pp 21ndash30 Paris France April 2013

[38] R Montasari R Hill V Carpenter and F Montaseri ldquoDigitalforensic investigation of social media acquisition and analysisof digital evidencerdquo International Journal of Strategic Engi-neering vol 2 no 1 pp 52ndash60 2019

[39] M Yip N Shadbolt and C Webber ldquoStructural analysis ofonline criminal social networksrdquo in Proceedings of the 2012IEEE International Conference on Intelligence and SecurityInformatics pp 60ndash65 Arlington VA USA June 2012

[40] Z Zhao C Li X Zhang F Chiclana and E H Viedma ldquoAnincremental method to detect communities in dynamicevolving social networksrdquo Knowledge-Based Systems vol 163pp 404ndash415 2019

[41] S Vosoughi D Roy and S Aral ldquo+e spread of true and falsenews onlinerdquo Science vol 359 no 6380 pp 1146ndash1151 2018

[42] Y Fang J Gao Z Liu and C Huang ldquoDetecting cyber threatevent from twitter using IDCNN and BILSTMrdquo AppliedSciences vol 10 no 17 p 5922 2020

[43] P Lu L Zhang X Liu J Yao and Z Zhu ldquoHighly efficientdata migration and backup for big data applications in elasticoptical inter-data-center networksrdquo IEEE Network vol 29no 5 pp 36ndash42 2015

[44] O Sharif M Hoque A Kayes R Nowrozy and I SarkerldquoDetecting suspicious texts using machine learning tech-niquesrdquo Applied Sciences vol 10 no 18 p 6527 2020

[45] K C Roy M Cebrian and S Hasan ldquoQuantifying humanmobility resilience to extreme events using geo-located socialmedia datardquo EPJ Data Science vol 8 no 1 p 18 2019

[46] A Squicciarini S Karumanchi D Lin and N DesistoldquoIdentifying hidden social circles for advanced privacy con-figurationrdquo Computers amp Security vol 41 pp 40ndash51 2014

[47] L Chen F W Crawford and A Karbasi ldquoSeeing the unseennetwork Inferring hidden social ties from respondent-drivensamplingrdquo in Proceedings of the irtieth AAAI Conference onArtificial Intelligence pp 1174ndash1180 Phoenix AZ USAFebruary 2016

Scientific Programming 15

[48] J Leskovec and A Krevl ldquoSNAP datasets stanford largenetwork dataset collectionrdquo 2014 httpsnapstanfordedudata

[49] X Zheng Y Wang and M A Orgun ldquoContextual sub-network extraction in contextual social networksrdquo in Pro-ceedings of the 2015 IEEE TrustcomBigDataSEISPApp 119ndash126 Helsinki Finland August 2015

[50] A Madani and M Marjan ldquoMining social networks to dis-cover ego sub-networksrdquo in Proceedings of the 2016 3rd MECInternational Conference on Big Data and Smart City(ICBDSC) pp 1ndash5 Muscat Oman March 2016

16 Scientific Programming

Page 15: InferringTiesinSocialIoTUsingLocation-BasedNetworksand ...3 JP]V3 ]4JJV3 Brightkite and Gowalla are openly available location-basedsocialnetworkdatasets[13,48].Bothdatasetsare gathered

[16] X Zuo J Blackburn N Kourtellis J Skvoretz andA Iamnitchi ldquo+e power of indirect social tiesrdquo 2014 httparxivorgabs14014234

[17] M A Brandatildeo and M M Moro ldquoTie strength in co-au-thorship social networks analyses metrics and a new com-putational modelrdquo in Proceedings of the Anais Estendidos doXXV Simposio Brasileiro de Sistemas Multimıdia e Webpp 17ndash20 Rio de Janeiro Brazil 2019

[18] H A Khanday A H Ganai and R Hashmy ldquoA comparativeanalysis of identifying influential users in online social net-worksrdquo in Proceedings of the 2018 International Conference onSoft-Computing and Network Security (ICSNS) pp 1ndash6Coimbatore India February 2018

[19] R Garcia-Guzman Y A Andrade-Ambriz M-A Ibarra-Manzano S Ledesma J C Gomez and D-L Almanza-Ojeda ldquoTrend-based categories recommendations and age-gender prediction for pinterest and twitter usersrdquo AppliedSciences vol 10 no 17 p 5957 2020

[20] A Talukder M G R Alam N H Tran D Niyato andC S Hong ldquoKnapsack-based reverse influence maximizationfor target marketing in social networksrdquo IEEE Access vol 7pp 44182ndash44198 2019

[21] N Ampazis T Emmanouilidis and F Sakketou ldquoA matrixfactorization algorithm for efficient recommendations insocial rating networks using constrained optimizationrdquoMachine Learning and Knowledge Extraction vol 1 no 3p 928 2019

[22] N Zhou X Zhang and S Wang ldquo+eme-aware socialstrength inference from spatiotemporal datardquo in Proceedingsof the International Conference on Web-Age InformationManagement pp 498ndash509 Macau China June 2014

[23] P A Grabowicz J J Ramasco B Gonccedilalves andV M Eguıluz ldquoEntangling mobility and interactions in socialmediardquo PLoS One vol 9 no 3 Article ID e92196 2014

[24] D Yang B Qu J Yang and P Cudre-Mauroux ldquoRevisitinguser mobility and social relationships in LBSNs a hypergraphembedding approachrdquo in Proceedings of the World Wide WebConference pp 2147ndash2157 San Francisco NY USA May2019

[25] R-H Li J Liu J X Yu H Chen and H Kitagawa ldquoCo-occurrence prediction in a large location-based social net-workrdquo Frontiers of Computer Science vol 7 no 2pp 185ndash194 2013

[26] V Jayadevan K Bharadwaj A Kumar and P KhandelwalldquoDiscovering local social groups using mobility datardquo Inter-national Journal of Computer Applications vol 120 no 212015

[27] A J Jara Y Bocchi and D Genoud ldquoSocial internet of thingsthe potential of the internet of things for defining humanbehavioursrdquo in Proceedings of the 2014 International Con-ference on Intelligent Networking and Collaborative Systemspp 581ndash585 Salerno Italy September 2014

[28] W A D M Jayathilaka K Qi Y Qin et al ldquoSignificance ofnanomaterials in wearables a review on wearable actuatorsand sensorsrdquo Advanced Materials vol 31 no 7 Article ID1805921 2019

[29] J Ren H Guo C Xu and Y Zhang ldquoServing at the edge ascalable iot architecture based on transparent computingrdquoIEEE Network vol 31 no 5 pp 96ndash105 2017

[30] I Seeber E Bittner R O Briggs et al ldquoMachines as team-mates a research agenda on ai in team collaborationrdquo In-formation amp Management vol 57 no 2 Article ID 1031742020

[31] T Pi L Cao P Lv Z Ye and H Wang ldquoInferring implicitsocial ties in mobile social networksrdquo in Proceedings of the2018 IEEE Wireless Communications and Networking Con-ference (WCNC) pp 1ndash6 Barcelona Spain April 2018

[32] N Tahir A Hassan M Asif and S Ahmad ldquoMCD mutuallyconnected community detection using clustering coefficientapproach in social networksrdquo in Proceedings of the 2019 2ndInternational Conference on Communication Computing andDigital Systems (C-CODE) pp 160ndash165 Islamabad PakistanMarch 2019

[33] V A Lewis C A MacGregor and R D Putnam ldquoReligionnetworks and neighborliness the impact of religious socialnetworks on civic engagementrdquo Social Science Researchvol 42 no 2 pp 331ndash346 2013

[34] M S Granovetter ldquo+e strength of weak ties a networktheory revisitedrdquo Sociologicaleory vol 1 pp 201ndash233 1983

[35] J Gui Z Zheng X Zhao and Z Qin ldquoStatistical propertiesand temporal properties of calling behaviorrdquo in Proceedings ofthe 2019 IEEE 5th International Conference on Computer andCommunications (ICCC) pp 1940ndash1944 Chengdu ChinaDecember 2019

[36] I H Sarker ldquoA machine learning based robust predictionmodel for real-life mobile phone datardquo Internet of ingsvol 5 pp 180ndash193 2019

[37] M S Bernstein E Bakshy M Burke and B KarrerldquoQuantifying the invisible audience in social networksrdquo inProceedings of the SIGCHI Conference on Human Factors inComputing Systems pp 21ndash30 Paris France April 2013

[38] R Montasari R Hill V Carpenter and F Montaseri ldquoDigitalforensic investigation of social media acquisition and analysisof digital evidencerdquo International Journal of Strategic Engi-neering vol 2 no 1 pp 52ndash60 2019

[39] M Yip N Shadbolt and C Webber ldquoStructural analysis ofonline criminal social networksrdquo in Proceedings of the 2012IEEE International Conference on Intelligence and SecurityInformatics pp 60ndash65 Arlington VA USA June 2012

[40] Z Zhao C Li X Zhang F Chiclana and E H Viedma ldquoAnincremental method to detect communities in dynamicevolving social networksrdquo Knowledge-Based Systems vol 163pp 404ndash415 2019

[41] S Vosoughi D Roy and S Aral ldquo+e spread of true and falsenews onlinerdquo Science vol 359 no 6380 pp 1146ndash1151 2018

[42] Y Fang J Gao Z Liu and C Huang ldquoDetecting cyber threatevent from twitter using IDCNN and BILSTMrdquo AppliedSciences vol 10 no 17 p 5922 2020

[43] P Lu L Zhang X Liu J Yao and Z Zhu ldquoHighly efficientdata migration and backup for big data applications in elasticoptical inter-data-center networksrdquo IEEE Network vol 29no 5 pp 36ndash42 2015

[44] O Sharif M Hoque A Kayes R Nowrozy and I SarkerldquoDetecting suspicious texts using machine learning tech-niquesrdquo Applied Sciences vol 10 no 18 p 6527 2020

[45] K C Roy M Cebrian and S Hasan ldquoQuantifying humanmobility resilience to extreme events using geo-located socialmedia datardquo EPJ Data Science vol 8 no 1 p 18 2019

[46] A Squicciarini S Karumanchi D Lin and N DesistoldquoIdentifying hidden social circles for advanced privacy con-figurationrdquo Computers amp Security vol 41 pp 40ndash51 2014

[47] L Chen F W Crawford and A Karbasi ldquoSeeing the unseennetwork Inferring hidden social ties from respondent-drivensamplingrdquo in Proceedings of the irtieth AAAI Conference onArtificial Intelligence pp 1174ndash1180 Phoenix AZ USAFebruary 2016

Scientific Programming 15

[48] J Leskovec and A Krevl ldquoSNAP datasets stanford largenetwork dataset collectionrdquo 2014 httpsnapstanfordedudata

[49] X Zheng Y Wang and M A Orgun ldquoContextual sub-network extraction in contextual social networksrdquo in Pro-ceedings of the 2015 IEEE TrustcomBigDataSEISPApp 119ndash126 Helsinki Finland August 2015

[50] A Madani and M Marjan ldquoMining social networks to dis-cover ego sub-networksrdquo in Proceedings of the 2016 3rd MECInternational Conference on Big Data and Smart City(ICBDSC) pp 1ndash5 Muscat Oman March 2016

16 Scientific Programming

Page 16: InferringTiesinSocialIoTUsingLocation-BasedNetworksand ...3 JP]V3 ]4JJV3 Brightkite and Gowalla are openly available location-basedsocialnetworkdatasets[13,48].Bothdatasetsare gathered

[48] J Leskovec and A Krevl ldquoSNAP datasets stanford largenetwork dataset collectionrdquo 2014 httpsnapstanfordedudata

[49] X Zheng Y Wang and M A Orgun ldquoContextual sub-network extraction in contextual social networksrdquo in Pro-ceedings of the 2015 IEEE TrustcomBigDataSEISPApp 119ndash126 Helsinki Finland August 2015

[50] A Madani and M Marjan ldquoMining social networks to dis-cover ego sub-networksrdquo in Proceedings of the 2016 3rd MECInternational Conference on Big Data and Smart City(ICBDSC) pp 1ndash5 Muscat Oman March 2016

16 Scientific Programming