mobile video chat: issues and challenges

IEEE Communications Magazine • June 2013144 0163-6804/13/$25.00 © 2013 IEEE

INTRODUCTION

Video chat, also referred to as video telephonyor video calls, is the new data hog in current cel-lular networks. A simple phone call pales incomparison to a face-to-face video chat. Recent-ly, group interactive video chat applications havealso emerged. In group video chat, multipleusers can see and chat with each other on a sin-gle screen at the same time. GigaOM Pro [1]projects video chat consumers via mobile phoneswill make noticeable gains by 2015 (Fig. 1).Undoubtedly, video chat is now playing a biggerpresence in our daily lives and changing theparadigm of interpersonal communications [2].

Many video chat applications for mobilephones have been rolled out in the market. Dif-ferent applications have different features anduser experience. Fring, Tango, Skype, Vtok,FaceTime, ooVoo, and Google Talk are just fewexamples of those applications.

Video chat requires high quality of service(QoS) to meet user needs and expectations interms of packet loss rate, end-to-end delay, jit-ter, and so on. Many standards bodies and indus-try organizations have come up with various

classifications of services and associated QoSparameters; for example, video chat over thirdgeneration (3G), Universal Mobile Telecommu-nications System (UMTS) falls under interactiveclass with best effort service. Due to its interac-tive nature, video chat has very strict timing con-straints and is intolerant of packet loss.International Telecommunication Union (ITU)Recommendation G.114 recommends a maxi-mum delay of 150 ms for video chat in order toensure lip synchronization. For “good” to “excel-lent” performance, current uplink and downlinkdata rate for video chat is in the range ª 130–330kb/s, which is subject to change with more pow-erful devices in the hands of end users.

The growth of video chat becomes a sourceof extra revenue for mobile operators. But as thetrend will not stop after two or three years, withapproximately 140 million video chat consumers(Fig. 1), the future growth of video chat traffic isa concern for network service providers. Theoperators are concerned about how video chattraffic may affect their networks. In addition,facilitating such growth requires advancement inapplication-level coding techniques, network-layer video delivery, quality assessment strate-gies, security, and so on, and the interplay ofthese issues.

In this article, we present issues and chal-lenges in design of successful mobile video chatapplications. Figure 2 gives a system overview ofthe main issues in mobile video chat. Subjectiveand objective user experience measurements,privacy and secrecy concerns, current network(communication channel) characteristics, end-device capabilities, and interoperability issues allplay a crucial role in providing the best userexperience to the end user. These issues will beregistered by a controller, which can decide theright parameter choice for all components, whileresolving any conflicts.

Our study is spread out in the next three sec-tions. We give a summary of overall mobilevideo chat and relevant network architecture.We discuss the limitations of end-to-end cellularnetworks and possible solutions that can improvevideo chat performance. We discuss the chal-lenges in providing additional features to makevideo chat easy to use. We then conclude ourarticle.

ABSTRACT

The next generation of cellular communica-tion needs to endure a higher demand forenhanced real-time multimedia support to endusers. Among interactive video applications,use of mobile video chat is on the rise in boththe enterprise and consumer worlds. Mobilevideo chats are resource-constrained and het-erogenous with varying display sizes, processingpowers, network conditions, and battery levels,and pose real-time delivery constraints unliketraditional video streaming applications. Withthe demand for anywhere any time connectivityfor video chat, it becomes quintessential todevelop techniques that adapt to the underlyingnetwork, optimize the device architectures, andprovide the best possible video chat experienceto end users. In this article, we identify mainlimitations and challenges in mobile video chat.We further discuss possible solutions and pro-pose research directions to make video chat agood experience.

TOPICS IN CONSUMER COMMUNICATIONSAND NETWORKING

Shraboni Jana, Amit Pande, An (Jack) Chan, and Prasant Mohapatra, University of California, Davis

Mobile Video Chat:Issues and Challenges

JANA LAYOUT_Layout 1 5/30/13 4:37 PM 44

IEEE Communications Magazine • June 2013 145

MOBILE VIDEO CHAT

In this section we discuss various components —mobile phone transmitter and receiver, videochat software, access segment, and core network— involved in mobile video chat over cellularnetworks and affecting the performance of videochat. The access uplink-downlink segment is awireless radio link. The core network is wired.

We broadly divide the network into threeparts: mobile devices, application provider, andmobile carriers. We discuss them in detail in thenext few subsections.

MOBILE DEVICESThe video content generators and receivers aremobile phone devices (end devices, Fig. 2),which are gradually being replaced by smartphones. We use the term smart phone(s) ormobile phone(s) to address both the transmitterand receiver devices in this article. The improve-ment in processing power, speed, screen resolu-tion, RAM, and storage space boosts richfunctionalities on mobile phones, making high-quality video chat possible in these devices.

Processing power and processing speed play akey role in the performance of any application ina device. Processors like NVIDIA TEGRA 4have already reached speeds ª 1.9 GHz with 72GPU cores supporting ultra-HD video process-ing at low power consumption. Current smartphones are equipped with high-resolution low-latency front-facing cameras to capture end-uservideo while the end user is viewing the smart-phone screen with pixel densities of 1080p. Dis-play resolution has increased to 1080 ¥ 1920pixels, 5.0 in (~ 441 ppi pixel density in theSamsung GS4) in mobile phone devices, reduc-ing the impact of resolution loss. There are nowmegabytes of RAM (ª 512 Mbytes) availableand also gigabytes (ª 32 Gbytes) of space avail-able in smart phones.

With the current advancements in mobilephones, high-quality video chat applications canbe developed and deployed on mobile devices,which was not possible before.

APPLICATION PROVIDERWith the advancement in mobile phone hard-ware, obviously the onus is on video chat appli-cation providers to provide good quality ofexperience to end users. Fring, Tango, Qik, andGoogle Talk are some of the video chat applica-tions contending to be the frontrunner in mobilephones.

A video chat application needs to coordinatewith the mobile phone hardware and networkfor efficient content creation satisfying all partiesincluding end users and network operators. Anyvideo chat application needs to provide a rangeof functionalities — capturing, processing, andtransmission of video data. In Fig. 2, althoughwe show interaction of coding, decoding, andcamera for capturing with the video chat con-troller, we elaborate each of these functionalitiesbelow.

Video Capture — The operating systems ofmobile phones provide application programminginterfaces (APIs) for video chat software to

coordinate camera functionality to capture thecontent and transmit the content data through awireless channel (Fig. 3). The software decidesthe frame rate at which the camera will captureunique consecutive images and then save themto memory. An appropriate frame rate is veryimportant to maintain a balance between band-width, delay, computational power, and userexpectations. When users are communicating viasign language, for extreme cases, at least 21frames/s is recommended in order to supportfinger spelling. In contrast, various studies sug-gest that audio and video are perceived as syn-chronized at minimum 5 frames/s [3]; therefore,that is the frame rate requirement for videochat. But with increasing user expectations, the 5frames/s frame rate will not suffice.

With high-end smart phones making their

Figure 1. Exponential increase in mobile video chat customers (source :GigaOM Pro [1]).

Mobile video call/chat consumers20100

20

0

Con

sum

ers

in m

illio

ns

40

60

80

100

120

140

160

2011 2012 013 2014 2015

Figure 2. Block diagram of a mobile video chat engine. The dotted lines indi-cate participation of other user(s).

Network

End deviceCapabilities

Camera

Coding

DisplayEnd-userinput

Objectivequalitymetrics

Subjectiveassessment

Privacysettings

Decoding

Video chat controller

Security and authentication

JANA LAYOUT_Layout 1 5/30/13 4:37 PM Page 145

IEEE Communications Magazine • June 2013146

way into the hands of end users and increasednetwork bandwidth due to 4G, future video chatis expected to provide higher bit rates to endusers.

Video Compression — To reduce the band-width requirements, video applications transmitcompressed video. This encoding is commonlyreferred as source coding or simply video com-pression, and is performed by the codec in anapplication. The higher the compression ratio,the more computational power required but thelower the bandwidth requirement. Current videocoding standards, H.264 and VP8 both claim toprovide efficient compression and low bit rates.In the study conducted by the authors in [4],H.264 shows lower bandwidth usage and bettervideo quality than VP8. High-efficiency videocoding (sometimes referred to as H.265) is on itsway to standardization and gives up to 40–50percent improvement in bit rates.

Although the current state-of-the-art H.264codec has become very mature with efficientcompression and low latency, the codec needs tobe further customized for video chat. The param-eters of a codec — frame rate, group of picture,size, quantization parameter, and resolution —need to be suited to the end system capabilities.For example, two mobile devices with resolu-tions x ¥ y (camera) and x1 ¥ y1 (display) in agroup video chat may decide on a commonshared resolution x2 ¥ y2 based on camera reso-lution of mobile devices rather than using theH.264 spatial scalability feature, which mayinvolve high latency.

Layered video coding approaches such asH.264 scalable video coding (SVC) lead toincreased computational cost and increasedoverall bit rate, which is detrimental to the per-formance of one-to-one video communications.However, it may play a significant role in group

video chat, where multiple users have varyingnetwork conditions.

Video Transmission — The application useswireless driver APIs to transmit the data.

MOBILE CARRIERSIn today’s highly competitive environment, usershave the option of choosing from a plethora ofcarriers. Therefore, it is not enough to simplymake services available to users. Mobile carriersmust deliver those services in such a way thatusers fully enjoy a rich experience at a reason-able price. In this section we discuss the keytechnologies used by mobile carriers for videochat. Mobile carriers are responsible for end-to-end seamless connectivity of video chat applica-tion. Video chat applications demand strict QoS.As mentioned before, it requires packet delay £150 ms and bandwidth requirement > 100 kb/s.The frame error rate needs to be £ 1 percent [5].Therefore, continuous feedback is requiredbetween the network and the video chat con-troller (Fig. 2) for balancing between networkconditions and end users’ video chat expecta-tions. We now focus our discussion on networkarchitecture relevant to video chat.

3G uses both a circuit-switched mobile corenetwork and a packet network, while 4G is apacket network. The 3G-324M protocol is usedto support real-time multimedia services overwireless circuit-switched 3G networks. The cir-cuit-switched network provides a 64 kb/s circuit-switched path, but published measurements [6]with live 3G networks gives uplink speeds ofonly 50 kb/s. As the current minimum upstream-downstream bandwidth required by a video chatapplication is 128 kb/s, a circuit-switched mobilecore network is unsuitable for video chat.

Figure 4 gives high-level network architectureof an end-to-end video chat. The dominant stan-dard for transmitting video telephony in packet-switched networks is RTP/UDP/IP. IP is aconnectionless network communications proto-col. It may provide greater bandwidth, but thebandwidth is not guaranteed. Currently two ver-sions of IP, version 4 (IPv4) and version 6 (IPv6),are in use. IPv4 is best effort service, and IPv6 isdesigned to ensure QoS. To enhance real-timemultimedia mobile services for IP, agnostic of theaccess network, IP multimedia subsystem (IMS)provides an architectural framework for flexiblemultimedia management. IMS ensures QoS,negotiating mobile device requirements usingSession Initiation Protocol (SIP) during sessionsetup or modification. Once end-to-end QoS isestablished, the end-user terminals use Real-Time Transport Protocol (RTP) to packetizevideo chat data and send it using UDP over IP.

To transport 3G/4G services through IP net-works and provide end-to-end QoS provisioningfor an IP packet, the Internet Engineering TaskForce (IETF) defines two models — integratedservices (IntServ) and differentiated services(DiffServ). The IntServ model uses ResourceReservation Protocol (RSVP) to signal andreserve the desired QoS for each flow in the IPnetwork. Under IntServ, video chat has a verystrict guaranteed service providing firm boundson end-to-end delay and ensuring bandwidth for

Figure 3. A simplified view of smart phone hardware-software architecture andapplication provider functions.

API

Operatingsystem

Hardware

Cameradriver

Wirelessdriver

Videocapturemodule

Transmit/receive data

moduleVideo chatapplication



the traffic. Although it is theoretically possible toprovide such QoS for each flow in the network,practically it is very hard as every device alongthe path of a packet needs to be fully aware ofRSVP and capable of delivering the requiredQoS. The DiffServ model is relatively simple andcoarse as it groups the network flows based ondifferent classes, also called classes of service(CoSs), and applies distinct QoS parameters foreach class. The type of service octet in IPv4stores the 6-bit DiffServ code point in the IPheader to identify the CoS, whereas in IPv6 thetraffic class octet is used. Under the DiffServmodel, delay-sensitive video chat traffic fallsunder the conversational CoS.

The upcoming 4G communication network,Long Term Evolution-Advanced (LTE-A),promises a data rate of 500 Mb/s uplink and 1Gbps downlink at peak. With such increased bitrates in networks, the quality of delivered videois also expected to increase. Full high-definitionvideos require a bandwidth of nearly 2 Mb/s,which may become a requirement for video chatin coming years.

LIMITATIONSWe evaluate the network QoS (Fig. 5) in anexperiment carried out with video chat applica-tion Vtok (based on Google Talk APIs) betweentwo end users, one connected to 3G and anotherconnected to a WiFi network. The end devicesused are Samsung Galaxy II smart phones. Weuse Tcpdump to capture packets sent andreceived at two stations performing video chat.In a random packet trace, we observe that whilethe video chat is within the limits (average send-ing rate, Fig. 5), the packet loss is sometimesaround 50 percent, which is way above therequired network QoS, thus reaffirming ourbelief that a robust video chat is far from real-ization.

Current wireless mobile video applications’performance is limited due to the constrainednetwork resources, and fading and interferencecaused by the wireless medium. In this section,we discuss limitations of any real-time videoapplication on mobile phones. We identify tech-niques that might aid in addressing these limita-tions for mobile video chat.

BANDWIDTHVideo chat requires real-time communication. Ifthe application overutilizes the link, it causesunfairness to other traffic, and if it underutilizesthe link, it may cause low quality of video chat.In addition to this, the uplink-downlink connec-tion in 3G is asymmetric. When congestion hap-pens, the video chat on the uplink will sufferfirst, thus determining the quality for the wholevideo chat. Hence, lower uplink bandwidth needsto be widely available for live video chat.

Multicasting can be one option to reduce thebandwidth requirement when more than oneparticipating users are present in the video chat.Multicasting allows a single stream to be served,which is replicated throughout the network,thereby reducing the bandwidth required forboth uplink and downlink. Multimedia broad-cast/multicast service (MBMS) was introduced

by UMTS to provide high-speed multimediamulticasting and broadcasting services. e-MBMS,the evolved version of the legacy MBMS, is con-sidered an important architecture of LTE-A inthis regard. Although multicasting has not beena reality in 3G so far, it has the potential tominimize the bandwidth requirement for groupvideo chat.

The bandwidth consumed by video chat canbe further reduced with a mechanism to notifyend users about incoming video chat calls. Videochat applications keep sending packets even dur-ing idle periods. A voice call, in such a case, hasan edge over video chat in not requiring endusers to always be logged into the data networkof the application. With automatic login during

Figure 5. Evaluation of video chat in 3G/WiFi network.

Observation time (s)548

20

0

Pack

ets/

s

40

60

80

562 564 566 568

Observation time (s)0

50

0Ave

rage

sen

ding

rat

e (k

b/s)

100

150

200

500 600400300200100

Packets sentPackets lost

550 552 554 556 558 560

Figure 4. High-level network architecture in an end-to-end video chat.

IP network

UTRAN

IPaccess

network

Wifi, Bluetoothetc.

IPv6IPv4

RTP/UDP/IP

RTP/UD

P/IP

RTP/UDP/IP

Corenetwork

PS

IP multimedia subsystem

Video chat application server

SIP

SIP

SIP



an incoming call, the end user can receive thevideo call without being logged into the videochat application. As per Skype users and tech-nology news, when idle, Skype consumes (1.3Gbytes/month1) considerable bandwidth com-pared to the current capped data plans availablefor cellular end users.

In other approaches, to address asymmetricuplink-downlink connection, video chat applica-tions can use available WiFi hotspots. AlthoughWiFi brings a realistic and comfortable solutionfor bandwidth-constrained 3G to provide suchhigh-end applications, maintaining QoS duringsuch vertical handover is an issue.

FACE TO EYE DELAYMobile video chat poses unique challenges dueto its requirement for small face (transmitter) toeye (receiver) delay. ITU G.114 recommends amaximum delay of 150 ms for video chat in orderto ensure lip synchronization. Due to video chathaving strict delay constraints, the retransmissionmechanism of TCP may not be used. TCPretransmission increases delay in the system,making it difficult to adhere to the delay con-straints of video chat QoS. With UDP, on theother hand, the network is unable to recoverfrom packet losses. Delay and packet loss in thenetwork cause jitters, frame freeze, and videostalls. We discuss the impact of packet loss inlater subsections.

Other causes of delays are horizontal andvertical handovers. Video chat can freeze or stallthe video during handover when the delay is notwithin the limits. As the mobile moves, its asso-ciated access point changes as per the currentlocation. The delay in such horizontal handoveris inevitable for 3G. As mentioned before, cur-rently, to solve the bandwidth constraints, mobilecarriers prefer video chat applications to useavailable WiFi hotspots. Maintaining delay con-straints during such vertical handover is also akey limitation [7].

IEEE 802.21 aims for seamless handoveracross heterogenous networks, reducing discon-nection times and packet loss during handoverof real-time communications. LTE-coordinatedmultipoint (CoMP) has been proposed by theThird Generation Partnership Project (3GPP)standards community [8] to take care of seam-less connectivity for edge users and handovers in4G networks. In LTE-CoMP the edge users arealso supported by multiple neighboring base sta-tions, therefore giving more diversity to the datareceived. When edge users cross a cell duringchat, with LTE-CoMP they will still be support-ed by adjacent base stations and hence not expe-rience video freeze due to handover delay.

INTERFERENCE, FADING, AND CONGESTIONInterference and fading in the wireless mediumand congestion in the network cause packet loss-es. Packet loss may cause visible distortions invideo chat. The two main visible distortions areblocking and blurring. We find with a randomtrace (Fig. 5) that packet loss is a major issue invideo chat. Although bandwidth and delay arewithin the required QoS limits, limitationsimposed by them may cause packet loss.

Packet loss is inevitable in the wireless medi-

um due to its very nature. Simultaneous trans-missions from or to different mobile phones inthe same cell may result in interference. As themobile moves to the edge of the cell, its sustain-able data rate may change due to interferencefrom adjacent cells. Moreover, the impact ofpacket loss is further increased as there is errorpropagation among successive frames of thecodec. For real-time applications, it would bemore appropriate for the application to dynami-cally change the parameters of the codec to min-imize the impact of packet loss.

For delay-tolerant applications, the networkoperators can employ better loss-resistant net-work protocols. But as the packets are streamedover UDP for video chat, the onus mainly fallsover video chat application providers to imple-ment packet loss recovery techniques to preservenetwork stability.

To combat packet loss, it is very importantfor the application to identify the reason fordata loss. Data losses can be due to video codinglosses, random losses, or congestion losses in thenetwork. Random losses are caused by the wire-less access link: signal fading, interference, andchannel quality losses. End-to-end packet loss ina cellular network and the Internet can becaused by congestion loss due to buffer overflowin the network. Decreasing transmission rate willnot help in the case of non-congestion-relatedlosses. To identify the root cause of the dataloss, Spike [9], an end-to-end loss differentiationalgorithm, can be used.

The data recovery techniques (Fig. 6a) in theliterature can be divided into feedback-basedand non-feedback-based. Non-feedback-basedtechniques are implemented at the encoder ordecoder. The decoder-based approaches, such asinterpolation, filtering, and spatial and temporalsmoothing, are not very efficient. An encoder-based technique, while efficient, adds redundantbits to the video chat data and therefore increas-es bandwidth requirement. Forward error cor-rection (FEC) and joint source and channelcoding are some of the encoder-based error con-trol approaches. Such techniques require signifi-cant changes to a video codec and have highcomputational complexity. Of all the encoder-based approaches, though, FEC is the most pop-ular solution; however, studies using FEC in avideo chat application (Skype) show increase inbandwidth overhead by 25 to 50 percent [10].Hence, in a congested environment, FEC willlikely increase the loss rate.

In feedback-based error control techniques,information sent by the decoder is used by theencoder to adjust the coding parameters orretransmit lost packets to achieve better datarecovery. In full retransmission, all of the data issent again, unlike partial retransmission. Bothapproaches increase delay due to increasedround-trip time. However, partial retransmissiontakes up less bandwidth than full retransmission.Some of the examples of partial retransmissionare Reference Picture Selection, Intra Update,and media rate control protocols such as TFRCand DCCP.

Limitations imposed by packet loss for videochat are a challenge that needs to be addressedto give end users better quality for bandwidth-

1 http://tech.blorge.com/Structure:%20/2009/02/24/skype-stealsband-width-even-when-you-are-not-using-it/, last accessedMarch 27, 2013.

Video chat applica-

tions can use avail-

able WiFi hotspots.

Although WiFi brings

a realistic and com-

fortable solution for

bandwidth-con-

strained 3G to pro-

vide such high-end

applications, main-

taining QoS during

such vertical hand -

over is an issue.



constrained networks and delay-intolerant videochat applications. Generally, a good video chatapplication, when incurring heavy data-loss,should drop the video stream in order to pre-serve the audio stream as much as possible. InFig. 6b we propose a three-stage packet lossrecovery technique for mobile video chat adher-ing to its delay constraints. The plus sign (+)indicates an increase in the network parameter.With bandwidth availability in the network, par-tial retransmission gives the best result. Depend-ing on the loss type, the system can adoptencoder- or decoder-based approaches in thepresence of losses. Encoder-based approachesare more appropriate for random losses, buthave negative impact in the presence of networkcongestion. The stages are numbered in order ofpreference. Finding the thresholds for switchingbetween these three stages and satisfying end-user video chat experience is an importantresearch area for video chat.

CHALLENGESA video chat controller needs to address addi-tional challenges apart from network and appli-cation limitations, discussed in the previoussections. It should measure up to end-user expec-tation in terms of quality, security, and enddevice capabilities (Fig. 2). In this section, wediscuss these challenges in detail.

MEASURE AND MAINTAIN USER QOETraditionally, quality of service (QoS) metricssuch as packet loss, delay, and jitter have beenused to measure the application performance onan end device. However, with increased diversityin application requirements of multimedia ser-vices and different content types, users’ qualityof experience (QoE) instead of networks’ QoS isbecoming an increasingly popular term in themobile video sector. QoE metrics tend to mea-sure user perception of delivered multimediaservices instead of counting on network serviceparameters. Many tools have evolved for evalu-ating video quality delivered to the end userusing subjective or objective methods.

Subjective quality assessment refers to algo-rithms that measure the “user-perceived” qualityof received video. Typically, users give a numeri-cal score (1–5) on the perceived video quality.Methodologies such as oneclick or crowdsourc-ing can be employed to carry out video chat sub-jective assessment in a more economical way.Oneclick [11] uses a single dedicated key click toconvey dissatisfaction of end users with the videoapplication in an online manner. Crowdsourcingapplications like Amazon Turk2 take feedbackfrom the mobile end users in exchange for mon-etary benefit. These end users are crowdsourcedinstead of selecting a pool of observers in a con-trolled environment.

On the other hand, objective methods referto mathematical models that approximate resultsof subjective quality assessment (Fig. 2), but arebased on criteria and metrics that can be mea-sured objectively and automatically using com-putational techniques. Full-reference objectivemetrics such as peak signal-to-noise ratio(PSNR) and SSIM are not suitable for real-time

interactive content because they require a copyof the original video. However, metrics such asblocking and blurring, which are no-referencemetrics [12], can be used to quantify the networkimpairment in the video for video chat.

Ideally, a mobile video chat must have robustmapping of subjective scores to objective mea-surements. While the chat engine can automati-cally assess the video quality using objectivemetrics, subjective framework (such as one-clickor its variant) can be used to fine-tune its perfor-mance. This feedback can be used by serviceproviders to manage network and other serviceparameters of an end-to-end system.

BATTERY CONSUMPTIONBattery consumption has become a major issuefor running high-end applications in smartphones. Mobile phones, being small and support-ing such a wide range of applications, increasethe pressure on battery life for such devices. Fur-thermore, battery technology is not improving atthe same pace as the processing power and speedof mobile devices. With nearly 1500 mAh batterypower in current smart phones, if a typical videoapplication consumes roughly 300 mW of powerper hour when enabled, neglecting all otherpower consumption on the phone, the battery isexpected to last 4–5 h. Effort has been made byTango in this regard. Tango uses Google pushnotifications to wake up the mobile phone onreceiving a call. Although this technique avoidsdraining the mobile phone battery, users do notget the provision to sign out if they do not intendto receive calls.

The best way to solve this issue is to designbattery-status-aware and end-user-expectation-aware video chat applications. One such effortfor image compression is presented in Poly-DWT architecture [13], which can morph its

Figure 6. a) Summary of packet-loss recovery techniques; b) three-stage packetloss recovery for video chat (+ indicates increase).

1. Partialretransmission

2. Encoder-based

3. Decoder-based

(a)

(b)

Data loss recovery techniques

Feedback-based

Full retransmission

Non-feedback-based

Partial retransmission Encoder-based Decoder-based

+Random losses

+Random lo

sses

+Congestion losses

+BW availability+BW av

ailab

ility

+Congestion losses

2 https://www.mturk.com/mturk/welcome, lastaccessed March 27, 2013.



hardware requirements and image reconstruc-tion quality at runtime, leading to considerablesavings in power.

INTEROPERABILITYThe future video chat needs to be as ubiquitousas a voice call. The content receiver can be a dif-ferent mobile phone device with differentrequirements under different operators and run-ning different video chat applications.

Two end users can strike a voice call irrespec-tive of the system and network differences. Thepacket switched telephone network (PSTN) hasbeen the enabling technology in this regard.Headway has been achieved in standardizingmultivendor interoperability and operator inter-connect using IMS-based services. SIP, an openstandard, is used for control plane signaling inIMS. Devices from any vendor supporting SIPvideo chat will be able to interoperate with SIP-based devices from any other vendor. As men-tioned earlier, user plane traffic in IMS isRTP/UDP/IP-based.

Currently, end users using different videochat applications cannot communicate with eachother. This is because video chat applicationsuse different video codecs and technologies forsystem negotiation, which have not been stan-dardized yet. To give similar flexibility to endusers as for voice call, video chat applicationproviders need to work toward standardization.

SECURITY AND PRIVACYSecurity and privacy concerns are inherent inany real-time interactive communication.

Encryption techniques are well studied andcan be employed to provide the required level ofsecurity for video chat users. Joint compressionand encryption schemes have been recentlydeveloped, which lead to significant savings incomputational power while achieving good levelsof unintelligibility of the bitstream [14]. Howev-er, privacy concerns still need to be addressedfor mobile video chat. Using diary and interviewtechniques, [15] points out that privacy duringvideo chat is a key concern of end users. Theconcern stems from the fact that the user at theother end may make obscene gestures, or theend user, when using video chat in a highlycrowded place, wants to keep it private.

Blurring techniques have been used to makea given scene appropriate for users at the otherend to view. Although video blurring techniquescan work appropriately as well as protect usersprivacy in both contexts, they are not appropri-ate for the new generation of video chat services[16]. This is because the main function of thenew generation of video chat services is to bringa user face to face, via smart phones, with anoth-er person from another corner of the world inhis/her background. If the services provide userswith all blurring faces or user background, theusers who use these services will gradually losetheir interest in talking to others.

Other attack models and countermeasures forvideo chat over smart phones need in-depthanalysis. A basic level of privacy can be providedby the application provider by maintaining pub-lic and private profiles of end users [16]. Withregard to end-user preference, the application

can display a corresponding profile at the otherend. Also, it is the application provider’s respon-sibility to secure end-user video chat accounts.

CONCLUSIONWith the increased proliferation of smartphonesand the advancement in communication tech-nologies, the demand for video chat hasincreased tremendously. To catch up with thistrend and exploit the economic incentive associ-ated with it, both application providers andmobile carriers have significant roles to play interms of improving the quality of video chat.

Although there have been some approachessuggested in the literature, applicability of thesesolutions over cellular networks for high-qualityend-to-end video chat has not yet been studied indetail so far. In this article, we lay down the limi-tations and summarize the challenges faced byvideo chat. We also discuss the possible solutionsand their incompleteness. We believe that afteraddressing these challenges, the mobile videochat can be developed into its full momentum.

REFERENCES[1] S. Higginbotham, “Can You See Me Now? The Future

of Video Chat,” http://gigaom.com/2010/06/07/can-you-see-me-now-the-future-of-video-chat/

[2] M. G. Ames et al., “Making Love in the Network Closet:the Benefits and Work of Family Videochat,” Proc. 2010ACM Conf. Computer Supported Cooperative Work,2010, pp. 145–54.

[3] J. Scholl et al., “Designing a Large-Scale Video ChatApplication,” Proc. 13th Annual ACM Int’l. Conf. Multi-media, 2005, pp. 71–80.

[4] P. Seeling et al., “Video Network Traffic and QualityComparison of VP8 and H.264 SVC,” Proc. 3rd Wksp.Mobile Video Delivery, 2010.

[5] 3GPP, “ETSI TS 122 105 V10.0.0; Digital Cellular Telecom-munications System (Phase 2+); LTE; Services and ServiceCapabilities (Release 10),” Tech. Rep., May 2011.

[6] J. Nurminen et al., “Sharing the Experience With MobileVideo: A Student Community Trial,” 6th IEEE ConsumerCommun. and Networking Conf., 2009, pp. 1–5.

[7] S. Sharma, N. Zhu, and T. cker Chiueh, “Low-LatencyMobile IP Handoff for Infrastructure-Mode WirelessLANs,” IEEE JSAC, 2004, pp. 643–52.

[8] 3GPP TR 36.819 v11.0.0, “Coordinated Multi-Point Opera-tion for LTE,” 3GPP TSG RAN WG1,” tech. rep., 2011.

[9] O. Boyaci, A. Forte, and H. Schulzrinne, “Performanceof Video-Chat Applications Under Congestion,” IEEEInt’l. Symp. Multimedia, 2009, pp. 213–18.

[10] J. Wang and D. Katabi, “Chitchat: Making Video ChatRobust to Packet Loss,” tech. rep., July 2010.

[11] K.-T. Chen, C.-C. Tu, and W.-C. Xiao, “Oneclick: AFramework for Measuring Network Quality of Experi-ence,” IEEE INFOCOM ’09, 2009, pp. 702–10.

[12] Z. Wang and A. C. Bovik, Modern Image Quality Assess-ment, Synthesis Lectures on Image, Video, and MultimediaProcessing Series, Morgan & Claypool, 2006.

[13] A. Pande and J. Zambreno, “Poly-DWT: PolymorphicWavelet Hardware Support for Dynamic Image Com-pression,” ACM Trans. Embed. Comput. Sys., vol. 11,no. 1, Apr. 2012, pp. 6:1–6:26.

[14] A. Pande, P. Mohapatra, and J. Zambreno, “Using ChaoticMaps for Encrypting Image and Video Content,” IEEE Int’l.Symp. Multimedia, Dec. 2011, pp. 171–78.

[15] K. O’Hara, A. Black, and M. Lipson, “Everyday Practiceswith Mobile Video Telephony,” Proc. SIGCHI Conf. HumanFactors in Computing Systems, 2006, pp. 871–80.

[16] X. Xing et al., “Safevchat: Detecting Obscene Content andMisbehaving Users in Online Video Chat Services,” Proc.20th Int’l. Conf. World Wide Web, ACM, 2011, pp. 685–94.

BIOGRAPHIESSHRABONI JANA ([email protected]) is currently a Ph.D. stu-dent in the Department of Electrical and Computer Engi-neering at the University of California, Davis. She received

With the increased

proliferation of

smartphones and the

advancement in

communication tech-

nologies, the

demand for video

chat has increased

tremendously. To

catch up with this

trend and exploit the

economical incentive

associated with it,

both application pro-

viders and mobile

carriers have signifi-

cant roles to play.



her M.S. from the University of California, Irvine, and herB.E. from the National Institute of Technology, Durgapur,India. Her research interests are in wireless networks, real-time multimedia and mobile systems.

AMIT PANDE ([email protected]) completed his Ph.D. fromIowa State University in 2010 where he was awardedResearch Excellence Award for his dissertation, Algorithmsand Architectures for Secure Embedded Multimedia Sys-tems. Prior to this he completed his Bachelor’s in electron-ics and communications engineering from the IndianInstitute of Technology Roorkee in 2007 with the InstituteSilver Medal. His research interests are in multimedia sys-tems, wireless networks, embedded systems, security, pri-vacy, forensics and trust, wireless health, and watersustainability.

AN (JACK) CHAN ([email protected]) received B.Eng andM.Phil degrees in information engineering from the Chi-nese University of Hong Kong in 2005 and 2007, respec-tively. He got his Ph.D. in computer science at theUniversity of California, Davis, in 2012. He is now working

as a scientist at Broadcom Corporation. His research inter-ests include video and voice transmission and quality ofexperience (QoE) over wireless networks, advanced IEEE802.11-like multi-access protocols, and data path optimiza-tion in next generation cellular networks.

PRASANT MOHAPATRA [F] ([email protected]) is cur-rently the Tim Bucher Family Endowed Chair Professor andthe Chairman of the Department of Computer Science atthe University of California, Davis. He was/is on the Editori-al Boards of IEEE Transactions on Computers, IEEE Transac-tions on Mobile Computing, IEEE Transaction on Paralleland Distributed Systems, ACM WINET, and Ad Hoc Net-works. He has been on the program/organizational com-mittees of several international conferences. He receivedhis doctoral degree from Penn State University in 1993,and received an Outstanding Engineering Alumni Award in2008. He also received an Outstanding Research FacultyAward from the College of Engineering at the University ofCalifornia, Davis. He is a Fellow of the AAAS. His researchinterests are in the areas of wireless networks, mobile com-munications, sensor networks, Internet protocols, and QoS.


mobile video chat: issues and challenges

Documents