ORIGINAL ARTICLE
Mobile video literacy: negotiating the use of a new visual technology
Alexandra Weilenmann • Roger Saljo • Arvid Engstrom
Received: 23 November 2011 / Accepted: 28 September 2012 / Published online: 26 September 2013
© Springer-Verlag London 2013
Abstract In this article, we examine the practice of
learning to produce video using a new visual technology.
Drawing upon a design intervention at a science centre,
where a group of teenagers tried a new prototype tech-
nology for live mobile video editing, we show how the
participants struggle with both the content and the form of
producing videos, i.e., what to display and how to do it in a
comprehensible manner. We investigate the ways in which
video literacy practices are negotiated as ongoing accom-
plishments and explore the communicative and material
resources relied upon by participants as they create videos.
Our results show that the technology is instrumental in this
achievement and that as participants begin to master the
prototype, they start to focus more on the narrative aspects
of communicating the storyline of a science centre exhibit.
The participants are explicitly concerned with such issues
as how to create a comprehensible storyline for an assumed
audience, what camera angles to use, how to cut and other
aspects of the production of a video. We consider these
observed activities to be candidate steps in an emerging
mobile video literacy trajectory that involves developing a
capacity to document and argue by means of this specific
medium.
Keywords Mobile video · Camera phones · Video literacy · Media literacy · Video analysis · Science centres
1 Introduction
Media play a vital role in the development, storage and
communication of human experiences and knowledge.
Through history, media have taken different forms—all the
way from Upper Paleolithic cave paintings via clay tablets,
parchment rolls, printed books, to present-day smartphones
and digital tablets, to mention but a few examples of
powerful media with important social impact, each in their
own time. The process of documentation through such
inventions implies externalizing experiences and present-
ing them in visuographic form for others to share [8]. This
creates what Donald refers to as an “external memory
field”, where information can be preserved over time and
be accessible for later use. Through the capacity to develop
technologies for preserving information and human expe-
riences, the social memory of a society may expand
without limits. Externalization of human experiences
through documentary practices, in turn, relies on the use of
inscriptions of various kinds: images, scripts, number
systems, graphs and so on. Such inscriptions are often
complex and have to be taught through explicit instruction.
Human development, thus, is co-determined by such
intellectual technologies, which, when appropriated, pro-
vide us with the means of representing and analyzing the
world, and, of course, communicating what we have found
to others.
A. Weilenmann (✉) · R. Saljo
University of Gothenburg, Gothenburg, Sweden
e-mail: [email protected]
R. Saljo
e-mail: [email protected]
A. Engstrom
Mobile Life Centre, Stockholm University,
Stockholm, Sweden
e-mail: [email protected]
Pers Ubiquit Comput (2014) 18:737–752
DOI 10.1007/s00779-013-0703-x
Traditionally, the term literacy has been used to refer to
the skills that have to do with reading, writing and the use
of texts. However, recently, many scholars have argued
that the developments taking place through digital media
make it necessary to reconsider the nature of literacy and
literacy practices in contemporary society, where texts no
longer are as dominant as they have been in the past. Kress
[24] argues that the screen has become an important arena
for representing the world and that the screen affords dif-
ferent kinds of interpretive practices than texts. For
instance, texts are read in a linear fashion, while screens are
not read in this manner. Jenkins [19] uses the term trans-
media literacy to refer to the interactions between media
that characterize many activities, where narratives and
stories emerge “across multiple media platforms with each
new text making a distinctive and valuable contribution to
the whole”. Jenkins [20] further discusses how to foster
new media literacies and highlights participatory skills
such as appropriation and performance, besides being
competent in searching, reading and judging media con-
tent. Lemke [27] suggests that transmedia literacy practices
are now essential elements of how people engage in
meaning making in many situations.
The concept of media literacy, whatever specification
one prefers, is an interesting point of departure for ana-
lyzing how people appropriate technologies for purposes of
meaning making and participation in media practices.
Media technologies—paper and pencil, typewriters, cam-
eras, video, computer software of various kinds—make it
possible to represent and communicate about the world in
increasingly complex ways. At the same time, mastery
of such tools presupposes that one gains experience of
how they may be used to design messages. In the context of
classical literacy, for instance, one has to learn the alphabet
and grammatical and other principles of how to construe
intelligible messages. As one becomes more advanced, the
learning trajectory has to include insights into genres and
audience expectations; a scholarly piece of writing has to
be designed on the basis of principles that are different
from those that apply to a fantasy story or a piece of news.
To be put in an authoring position—i.e., to act as a designer
of media messages—generally requires more knowledge
than to be a consumer. It is easier to read a novel than to
write one, and it is easier to watch a cartoon than to make
one. As Gilmor argues, “being literate in today’s world
means more than just smarter consumption, however
actively you do that. Being literate is also about creating,
contributing and collaborating” [15, p. 60].
The aim of this article is to explore such learning tra-
jectories that concern what we refer to as mobile video
literacy. Mobile phones are ubiquitous, and they are
becoming increasingly powerful when it comes to docu-
menting and even editing videos. Given the availability of
such resources, it may be expected that they will be used
quite widely in many settings, for instance, in the context
of education and during leisure activities. Recording and
editing, in our opinion, may be seen as examples of media
literacy practices that have to do with how to design
messages suitable for different purposes. The skills that are
relevant for such activities have to be learned, and such
learning implies both mastering the technologies and
increasing one’s familiarity with the principles of how to
represent events in a manner congenial to the relevant genres of
communication.
In the present article, we focus on one such new media
practice. Our primary interests concern how new users
develop skills when engaging with a new video technology.
Based on a design intervention at a science centre, where a
group of teenagers worked with a new prototype for live
video editing on mobile phones, we discuss the challenges
involved when learning to use this new technology. Being
novices to both the technology and the setting they are in,
the participants struggle with designing a comprehensible
documentation and rendition of the activities in the science
centre.
1.1 Camera phones and video work
Following the proliferation of camera phones and digital
cameras, video and photography are now readily available
and easy to share in many situations. This has led to new
practices of documentation and sharing of experiences as
they happen. ‘Camera phones enable an expanded field for
chronicling and displaying self and viewpoint to others in a
new kind of everyday visual storytelling’ [33, p. 17]. While
the body of work on the use of digital cameras and
smartphones is expanding in the research literature, we still
know relatively little about the actual work that goes into
creating and sharing visual experiences with these devices,
and about how such skills develop. It has been noted that
camera phones ‘have yet to be given the attention they
deserve by researchers (…). For example, if one compares
research done on, or involving, camera phone videos with
research done with/on Flickr, YouTube and/or Facebook, it
is striking how little has been done on the former, which is
an enormously important global phenomenon’ [7, p. 96].
Similarly, it has been argued that there is a lack of
empirical studies of the editing of visual images in specific
settings [14].
The present study is concerned with amateurs being
introduced to a new technology. While video editing skills
amongst professionals have been investigated in detail (e.g.
[6, 12, 25, 31, 32]), there has been less attention paid to the
appropriation of these skills amongst novice users [11].
With the development of new video recording and editing
technologies, the possibilities for amateurs to do editing
work are increasing. Basic video recording and editing
software is now included in most mobile phones, making
video tools accessible for non-professional users. These
phones also support online sharing, which makes videos
easy to disseminate.
Kirk et al. [23] show how video production is an
explicitly social process in all aspects of its use, and
how its end use is a key driver in video production.
For ‘lightweight’ devices, such as camera phones (as
opposed to ‘heavyweight’ video recorders), video is
typically captured spontaneously, shared in the
moment and primarily meaningful in the context of
that shared experience. Editing after the event is
regarded as cumbersome and happens very rarely,
which opens up a design space for more live media
editing and sharing tools. Lehmuskallio et al. [26]
confirm the differences between the ‘lightweight’,
unplanned activity of shooting with a camera phone,
on the one hand, and, on the other hand, the use of
more ‘heavyweight’ camcorder and editing practices.
They further suggest that mobile video practice is
more closely related to snapshot photography than to
traditional videography and filmmaking. David [7]
reports similar observations of everyday video prac-
tices and their consequences, noticing the spontane-
ous character of video production:
Repositories of digital pocket videos often tell stories
that feel like old spaghetti western films. Most of the
home camera phone videos lack dramatic action.
While the spectator just wants to see what will hap-
pen next, the takes are long and the desert dry.
Camera phones are an appurtenance of everyday life,
which we rarely storyboard. The images so produced,
therefore, tend to be spontaneous, at least in their
content. (ibid., 95–96)
The properties and affordances of live streaming video
have been explored in diverse contexts such as groups of
friends [34], visual production in nightclubs [10] and
emergency response work [4]. Following the emergence of
early online services for live broadcasting from mobile
phones, the content on these services has been studied [9,
12]. These studies point to some early trends and user
practices, but at the same time, they reveal that we are
dealing with an immature medium, as it were, where the
relevant literacy practices are not yet stable. All but the
most advanced users struggle with capturing and presenting
engaging content in a timely manner while filming live.
Editing and collaborative production have been sug-
gested as means to overcome this and to communicate the
skills relevant for producing more engaging visual stories
suited to the needs of various practices and settings. Iacucci
et al. [18] provide an early investigation of how camera
phones are used to enhance a shared spectator experience.
They emphasize how spectators co-experience events in
groups and how mobile imaging can be a participatory
practice enhancing the spectator experience on-site, rather
than merely documenting it. Other work with similar aims
has drawn on socially produced video to produce an
enhanced post hoc record of a live music event [22, 38].
Jokela [21] presents an early prototype and design principles
for editing on a mobile device. Collaboration
around the mixing of mobile video feeds has been explored in
prototype designs and live productions, revealing new affordances
as well as design challenges when transferring
professional production methods and tools to non-professional
users [3, 10, 11].
As pointed out above, live video and editing facilities
add to the possibilities of collaboration around a mutually
shared final product [11]. The live feature allows sharing in
the moment, but at the same time, the access to a live video
feed leads to an added possibility of sharing events outside
of the local context. Designing experiences for a distant
audience, as we will show, involves considering the
awareness of the gaze of others:
Camera phones makes (sic) ubiquitous visual access
to others possible. In other words, the gaze of others
is always present as a potentiality, leading to a
heightened sense of visual awareness and a growing
centrality of images in the ongoing social exchanges
of everyday life. [33]
This awareness of a presumed audience has a number of
consequences for the camerawork. Engstrom et al. [11] argue
that ‘Video technologies support a type of collaborative gaze
in which camera users act as proxy viewers on behalf of
someone else: the eventual viewer of broadcast content’.
They call this ‘mediated looking’ and define three different
forms or positions from which to look: (a) looking editorially
(selecting what shot to choose), (b) looking together as a
team of camerapersons and (c) looking on behalf of others.
The latter category involves not just looking as would a
presumed audience but includes activities when the cam-
eraperson is looking on behalf of the editor. Licoppe and
Morel [29] specifically bring out the affordances of the
mobility of the device; how everything within the video
stream is available for scrutiny and therefore can be poten-
tially relevant to the ongoing conversation.
In many of the previous studies of the work that goes into
editing video or selecting shots to display for an audience,
live or otherwise, the focus is on professional settings, with
a professional camera crew. In this study, as in that of [11]
above, we focus on novice users of this technology in an
informal learning environment. As argued in the introduc-
tion, new forms of literacy will involve more advanced
forms of participation and production of media; thus, we
need to move on to study contexts beyond everyday usages
of these media technologies. One such context, and the one with which this
paper is concerned, is informal learning environments,
more specifically science centres. As we will show, such a
setting allows for an engagement with the content of the
exhibits using the technology, in order to create a compre-
hensible narrative from the visit. The participants are not
only new to the technology; they are also new to the setting
they are recording. This makes their task quite different
from that of a professional team, and it also makes it dif-
ferent from everyday, amateur usage of this medium.
Our point of departure is to analyse ‘video as a practical
accomplishment’ [32]. Following this, we take ‘an ana-
lytical interest in the way in which coherent images are
assembled, as well as the way in which interactional
order—as it is witnessable, accountable, and intelligible for
members (not only researchers)—is a social accomplish-
ment made possible through technological resources within
the social practices of video recording and video editing’
(ibid., 69–70). In doing so, we focus not so much on the
finished product, the film that the participants produce, but
rather on the work behind the scenes, the video production
practices. It has been noted that the common approach to
studying media productions is to base the analysis on ‘the
finished products, rather than the processes through which
these products were assembled. Consequently, the design
rationales of the original composers of the objects are not
directly available to researchers, but have to be inferred’
(Greiffenhagen in press). In our analysis, we intend to
make some of these practices explicit. In addition, we
argue that this set-up is particularly useful in rendering
available for analysis the development of skills needed to
use this new video technology in productive manners.
2 Design intervention: the use of mobile video
at a science centre
As part of a larger study on the use of mobile technologies
in science centres and museums, we experimented with
mobile video as a means of giving visitors new ways of
documenting, experiencing and sharing their visits. Below,
we briefly explain the study set-up and the technology.
2.1 The study
The Mobile Vision Mixer, developed by Mobile Life Centre
as a research prototype, provides a real-time multimedia
environment through which mobile users can collaborate to
make and broadcast live video using only their mobile
phones. In this multimedia environment, several mobile
cameras record, while a director monitors the different
video feeds on a mobile phone screen and selects which one
to show ‘on air’. The final, edited output video can then be
viewed directly online or accessed on a later occasion.
A group of seven teenagers, aged 13–17, were invited to
Universeum, a large science centre in the city of Gothen-
burg, Sweden, to participate in an evaluation of the Mobile
Vision Mixer. In the data used here, participants were
asked to focus on one specific exhibit in an exhibition
called CrimeLab. This exhibit focuses on the mechanisms
behind face recognition technology. Inside the exhibit, the
person stands in front of a face scanner, which scans the
face and then informs him/her when the scan is complete.
The person can then move to another room, where a reader
is located and the system will recognize the face. Upon
recognition, the participant gets access to the room—the
face is thus used as a key. In this way, the activity called
for a certain type of sequentiality, involving several steps at
two different locations within the exhibition area.
Participants were asked to make a short video, using the
Mobile Vision Mixer, to explain the exhibit to their
classmates who had not visited the exhibition. No other
instructions on how to go about making the video were
given, except briefly explaining how to manage the pro-
totype. Participants’ output videos could be viewed online
live.
This study has dual aims: one is to evaluate this proto-
type and to receive feedback about its design as part of a
larger study on mobile live media [10–12] and the other is
to place this particular technology in an informal educa-
tional setting to allow for a discussion of its potential
benefits as a documentary practice to be used for instruc-
tional purposes. The latter aim is the focus of this paper. It
should be noted that the prototype during the evaluation
presented users with challenges that are common in field
testing of prototypes; there were several issues with the reliability
and stability of the system. For instance, it turned
out that the system did not work entirely satisfactorily
indoors because of a slow connection to the servers, which were
placed elsewhere. This meant that during the evaluation,
the system had to be restarted several times, and the par-
ticipants had to wait. The participants used this time to
discuss strategies for making the videos; these discussions
were recorded and are part of the analysis. The most relevant
technological problem was a delay between the moment
a cameraperson recorded content and the moment it
became visible on the editor’s screen. This
proved to be a challenge for the editor when making the
selection between screens.
Also, in field evaluations, it is common that the partic-
ipants are positive towards the technology being evaluated,
simply to please the researchers [5]. However, for our
current purposes, it is not necessary to consider the extent
to which the participants ‘liked or disliked’ the Mobile
Vision Mixer as such. We are simply focusing on the
practical accomplishments of the participants as they were
struggling with using this imperfect and sometimes
cumbersome new video technology.
The evaluation at the science centre was carried out during
one day. However, a large body of material forms the background for
the particular study presented here. First, an extensive
period of fieldwork, including observations at this and
other exhibits in the science centre, carried out by the first
author and two students, preceded this data collection.
Second, regarding the technology, complementary evalua-
tions of the same prototype were made during one full day
at a large skate park. Skating, being more of a leisure
activity, allowed for an interesting comparative case.
Third, the prototype has a number of predecessors, which
have been evaluated and reported on elsewhere [10, 11]. In
all, this background material allows for a rich under-
standing of the different ways in which novice users dealt
with the technology in this particular setting (Figs. 1, 2, 3).
The group members were allowed to negotiate what
roles to take. They worked in pairs—two persons for each
camera phone and two persons as editor, handling one
camera. They agreed that the boy (the younger brother of
one of the girls) should be the assistant, a term they coined
for the person interacting with the exhibits while the others
documented it. The reason for the set-up of having two
people collaborating around each phone was to get access
to their negotiations.
2.2 Methodological approach
In order to get a closer look at what constitutes ‘mobile
video literacy’ and its trajectory, we chose a micro-oriented
approach to see how video literacy practices emerge and
are negotiated as ongoing accomplishments. For these
purposes, we rely upon the form of video analysis which
has its roots in ethnomethodology [13] and Conversation
Analysis [35]. Being well-known perspectives in much
previous work on video in professional practices, these
approaches allow us to focus on video as a situated practice
and accomplishment [32]. Rather than focusing on video
literacy as a larger concept and a competence which
develops in a population over time when getting more used
to video technologies, we focus on video literacy as a
‘members’ concern’, as an interactional achievement in the
activity they are currently engaged in. The participants are
explicitly concerned with such issues as how to create a
comprehensible storyline for an assumed audience, what
camera angles to use, how to cut and other aspects of the
production of a video. What this means methodologically is
that these activities are observable in the ongoing practical
achievement of making these videos, and it is also possible
to scrutinise how these skills develop on a micro-level. By
focusing on one instance of a group exploring a new
device, we can follow in detail how they develop, in a
sequence of repeated shootings, competence in terms of
using the technology to present content.
In some sense, then, we are doing video analysis of
video analysis, in our focus on members’ work with this
visual technology:
This fuels an interest in a praxeological analysis of
ordinary and professional video practices, and of
videos as locally organized accomplishments. More-
over, looking at video as practice reveals the skilled
glance on social interaction which is embodied in
looking through the camera: video-makers’ local
orientation to the organizational features of interac-
tion is exhibited in the very way in which they shoot,
arrange and edit the video [32, p. 68].
Fig. 1 The editor (to the right) is looking at the display of her phone
while making the selection between the four different input videos. In
the background, we see one of the camerapersons (left) and the person
doing the test, the assistant
Fig. 2 The mixer version with the four different live video streams
that the editor can choose from
Our video analysis renders visible the participants’
analysis, their displayed understanding, of the exhibit.
In the process of organizing the video work, and
negotiating it continuously as shootings, retakes and so
on, the participants display their understanding of the
content of the exhibit and what is relevant to present to
others. In this article, we are not focusing on how the
participants discover features of exhibits (cf. [39]),
since they have already explored the basic functionality
of the face recognition exhibit before making the
shootings. Rather, we focus on how they display their
understanding of the exhibit in the videos they make of
the exhibit.
We rely on two sets of videos: those recorded by us
as researchers, and those recorded by the participants in
the activity. As pointed out by Mondada [31], this is an
important distinction since ‘video is not a transparent
document but an embodied accomplishment, integrating
the recording and the analysis of the recorded event’
(ibid., 60). In the videos recorded by us as researchers,
we made a selection of what was relevant to focus on in
each particular situation, and in this selection, our
understanding and analysis of the event were displayed.
It is, however, the participants’ understanding, displayed
in their work of making selections of what to shoot, and
the assessments of previous shootings as well as nego-
tiations about upcoming shootings, that is in focus in
this article.
3 Unpacking mobile video literacy
Drawing upon previous work, we have argued that
new forms of literacy skills will emerge as new media
technologies are introduced and taken up by users. In the
following, we will examine in more detail the ways in
which such video literacy practices are negotiated as
ongoing accomplishments. By focusing on one particular
example from a group of visitors to a science centre
exploring a new system, we can follow in detail how the
participants deal with the challenges they encounter and
become more competent in terms of using the technology
to present the content of the exhibits in the science centre.
The idea behind this design intervention is that participants
must, in order to make a comprehensible video, to some
extent, understand both the technology and the content they
are presenting, and this is visible in how they negotiate the
use of this new mobile technology (cf. [40]). These nego-
tiations were verbal and gestural, and relied upon their
placements within the exhibit as well as material resources.
In a camera team, the task of the camerapersons is
to provide useable and complementary footage to the edi-
tor, whose role is to assemble the individual shots into a
compelling sequence. Broth [6] describes the reflexive
work of the camerapersons and editor as proposal-accep-
tance, where the default is for each cameraperson to pro-
pose quality shots, communicated non-verbally through
their visual output.
Fig. 3 The facial recognition
exhibit at Universeum
Through a number of excerpts from the data, we will
show how the participants are (a) negotiating the storyline,
(b) negotiating the performance of the actions to be
recorded and (c) negotiating the camerawork so as to
capture the actions in a relevant and comprehensible way.
We present three excerpts from a longer piece of interaction
in which participants (a) first plan how to work to
produce the recording, (b) then capture a video sequence,
(c) evaluate it and, finally, (d) repeat steps (a) and (b) a
second time.
3.1 Negotiating the storyline
In this example from the design intervention, the person
performing a test of the face recognition system, here
called the assistant (whose face is visible in Fig. 1), is
talking to the cameraperson responsible for documenting
his activity. The following conversation occurs before the
test, when they are discussing how to coordinate the work
between the camerapersons. Here, we can see how the
participants struggle with the issue of how the story of the
exhibit should be told (the form) at the same time as they
are deciding what to show (the content).
Previously, they have agreed that the assistant¹ should
do what they call ‘the experiment’; i.e., use the face rec-
ognition system, first to scan his face and then move on to
use it to open the door to the ‘hotel room’. However, the
assistant is not clear about what to do once he has entered
the room—‘What the hell should I do when I get in’, he
asks (lines 100–101). The cameraperson’s answer ‘then
there will be someone filming you’ displays a focus on the
form, the presentation, but not an awareness of the fact that
there has to be ‘something’ to film, an element of a story.
The assistant therefore reinforces his point, asking what he
should do ‘in there’ for something more to happen in the
storyline. The cameraperson, however, does not provide
such a next event. Instead, she says that he should ‘chill’ because
‘then it’s done then you are in’ (lines 108–109). When the
assistant is ‘in’ the room, they have reached the end of the
sequence, the storyline that they are working with and
nothing more needs to be recorded.
The cameraperson’s final remark that ‘it is the experi-
ment that we are going to show, not that you are walking
into a room’, displays an orientation to the story they are
creating, and to the final product of their work. The result
should focus on the experiment, as she calls it, which is the
procedure of testing the face recognition exhibit. However,
in displaying the exhibit, the act of ‘walking into the room’
has to be made part of the storyline. In fact, it is crucial, as it
is the final part of the procedure and thus the end, or
‘punchline’, of the storyline.
However, as it turns out later, after they have per-
formed and recorded a first sequence, the assistant
actually asks a reasonable question when wondering
what to do upon entering the room. They encounter a
problem in the creation of narrative around this exhibit
because the actual entering of the room goes too quickly
and is perceived as uneventful. Performing ‘getting into
the room’ has to be done more slowly so as to render it
a filmable activity. This will be discussed in the fol-
lowing section.
Excerpt 1 The assistant (to the left) outside the hotel room, being filmed by Cameraperson 4. Notice Cameraperson 3 behind the glass wall,
glancing towards the others, ready to film inside the room as the assistant enters. A Assistant, C4 Cameraperson 4
1 This is a term the participants themselves used, to describe the
person who was interacting with the exhibits while being
documented.
3.2 Negotiating how to perform the actions to be
recorded
The following conversation takes place after the recording
of the first trial at the face recognition exhibit. The par-
ticipants briefly assess the work that was just done and
agree that the actions were performed too quickly to be
acceptable. They decide to have another go and discuss
strategies for improving their work the second time.
Capturing the full sequence of the facial recognition
involves moving over the exhibition space, from the small
room where the first system is placed, to the entrance of the
‘hotel room’, where the face is used as a key to open the
door. The participants are consequently encountering the
problem of capturing the mobility of the assistant, as he
moves between these two locations. Capturing his move-
ments provides some challenges. Right after the first
sequence has been shot, one of the editors says to one of
the camerapersons that ‘it almost feels like you didn’t
really follow’ (lines 201–203). She thereby shares verbally
an evaluation of what was made available on the mixer
screen. The assistant comes out of the room, and the
cameraperson is told to come out as well, and they all
gather around to discuss how the shooting went.
Excerpt 2 Cameraperson 4 makes a quick vertical movement with her phone, mimicking the short period of time she had her phone up to record the
assistant’s actions. E1 Editor 1, E2 Editor 2, C2 Cameraperson 2, C4 Cameraperson 4, CX unidentifiable cameraperson, X unknown speaker
Cameraperson 4 says that ‘it went very fast’ and does a
quick vertical movement with her phone, as if again per-
forming the video recording. In the quickness of this
exaggerated gesture, she displays the short period of time
that she needed to do her part of recording the assistant’s
movements. Cameraperson 2 agrees and also uses her body
as a resource to visualize what she had to do to capture
the video. She does a restrained form of running-on-the-spot,
holding the phone up as if recording. She suggests
that ‘he should walk a bit slower next time’ (lines
215–216).
The discussion then moves on to negotiate how the
performance should be made when doing another
sequence. All participants go back to their positions again.
The editor discusses with one of the camerapersons how to
make the recordings. This involves showing ‘all the steps
that happen’ as Editor 1 explains to the camerapersons.
Before counting down to start again, it is emphasized again
that the event should unfold slowly.
To sum this example up, we have seen how the partic-
ipants, after having filmed a first sequence, are negotiating
the pacing of the performance, i.e., how quickly the actions
should be carried out in order to be comprehensible. As a
result of the assistant performing the walking too quickly,
the cameraperson’s work has to be done quickly as well,
something that they discuss as a problem. So the negotia-
tions of the actions to be recorded are tightly interwoven
with the bodily performance of the camerapersons, who have
to keep up with the actions and capture them, in what could
analogously be called the pacing of camera movements. A
trained camera operator would typically rehearse the
camera movements at the same time as the performer
rehearses their movements. But to the novices seen here,
this is an emergent feature that is not evident before
engaging in the camerawork. Although arguably not a part
of their everyday understanding of video recording, these
skills are attainable and begin to develop in just a few
repeated attempts, as the participants engage with the
technology in this setting. The awareness of one’s own
movements as a cameraperson, and the ability to plan and
negotiate them in parallel with the covered action, is part of
becoming video literate in the broader and more involved
sense we argue for here.
3.3 Negotiating the camerawork: taking the audience
perspective
As was discussed in the background section, ‘camera
phones make ubiquitous visual access to others possible’
[33, p. 17], and this implies that ‘the gaze of others is
always present as a potentiality’ (ibid.). When capturing a
video, there is an awareness of the potential of sharing it. In
the scenario at the science centre, the participants were told
to imagine an audience of their classmates, who were not
present but who would watch this video remotely and
be able to make sense of the exhibitions from the video
presentations. The participants’ orientation to such an
imagined audience is visible in their negotiations of the
video work. In this section, we focus on this negotiation
prior to broadcast, and how it is sequentially organized to
provide the editor with a useable multicamera set-up. The
detailed work of the ongoing camerawork and live editing
is not covered here (cf. [6, 11]).
How, then, is this orientation to an audience visible in
the material? In order to understand the following exam-
ple, there is a need to briefly explain some details about
Swedish pronouns. Here, the participants are changing
between using du (singular second-person you—tu in
French), ni (plural second-person you—vous) and man
(generic you—on). This means that when the camera-
person says what is translated into English as ‘You have
to make sure that you see the screen’ it is not ambiguous
to the participants, as it is in English, what these two
instances of you are referring to. This is of analytical
relevance, since the selection of pronoun displays an
orientation on behalf of the cameraperson to a presumed
audience: her taking the perspective of an outside ‘you’
watching the final product.
The sequence begins with general instructions on how
the recordings should be made, e.g., ‘you (singular)
should film so that one sees clearly what he is doing
then’. Later on in the sequence, we see how she moves
on to giving more direct feedback on the actual cam-
erawork, telling the cameraperson to adjust the view and
asking her to film from another angle. The pictures in
the excerpt are taken from the camera video made by
Cameraperson 1.
Excerpt 3 E Editor 1, C1 Cameraperson 1, X unknown speaker, FRS voice from face recognition system
In this conversation, their orientation to a presumed
audience is visible in the repeated use of formulations like
third person ‘man’ (‘one’) in Swedish, e.g., ‘otherwise one
doesn’t get the whole idea’. A salient example of how the
main editor displays an orientation to the audience, and her
looking on behalf of the audience, is found on lines
209–213. Here, the editor goes from saying ‘you have to
s-’ (singular you), then interrupts herself and reformulates
it as ‘one has to see’. In fact, they are both relevant for-
mulations, as the cameraperson in order to make it avail-
able for an audience also has to see it herself. Also, this
‘one’ is not just the audience but includes the editor doing
the looking on behalf of the remote audience (cf. [11]).
In this excerpt, in contrast to the previous ones, we focus
on the video streams provided as responses to the editor’s
verbal instructions about the recordings when preparing to
begin the shooting. In a series of photos from the video, we
see how the cameraperson presents a set of interpretations
of the editor’s instructions. In each shot here, she tries out a
new angle or zoom level that the editor can then reject or
accept. As Ivarsson [17, p. 178] points out, ‘Similar to how a
filmmaker can guide the visual attention of a viewer, by
zooming in on an object, the zoom can become a conver-
sational move in a situation like this’. By simply looking at
the pictures in this excerpt, we see that the focus moves
from capturing the person interacting with the exhibit (the
assistant) to a close-up of the screen next to the face rec-
ognition scanner.
So there are clearly some tensions in the views on how
to best capture the face recognition system to make it
comprehensible to an audience. The cameraperson begins
by filming the person from the front; visible behind him is a
sign with information about the exhibit. This is responded
to by the editor as ‘one has to see what’s going on’. This is
the same, relatively vague, formulation that she used previously.
The cameraperson now pans over the other exhibit
sign behind the assistant (not visible in the pictures above),
suggesting that this sign should be in the shot as well. The
editor disagrees and now expresses herself more precisely:
‘you (singular) have to film what they do on the screen’.
Emphasizing ‘screen’, she displays that this is the inter-
pretation of how to capture ‘what’s going on’ as part of a
storyline rather than merely showing the sign, as the
cameraperson suggested.
The next candidate for framing is then proposed [6] by
the cameraperson (Picture 2), recording the assistant’s back
and parts of the screen. The editor is not happy with this
solution either, emphasizing ‘no’, and continues to say that
‘you (singular) have to see what’s on the screen and that’
(lines 235–236). Picture 3 is then provided by the camera-
person as a candidate stream, with the screen being visible
alongside the face recognition scanner. The editor continues
her phrase, explaining why the screen needs to be shown:
‘or else one won’t get the whole idea’. Here, she clearly
displays an orientation to a storyline and a presumed
audience, who needs to have certain visual material pre-
sented to them in order to ‘get the idea’ of this particular
exhibition. There is then a short sequence when the editor
seems to lose her patience with the giggling cameraperson,
and tells her to focus. The final candidate shot (Picture 4)
displays a close-up of the screen, and the editor seems to
approve as they agree to finally begin the shooting.
To sum up, in this example, we have seen that there is a
specific sequencing of the turn-taking involved in arrang-
ing the video streams. The cameraperson proposes different
candidates for shots (Pictures 1–4 in the excerpt), receives
feedback from the editor and then rearranges the camera
until they agree on a framing. Then the recording can
begin. In these stepwise instructions, the cameraperson is
verbally formulating how to present visual material, and in
doing so is orienting to the storyline and the audience.
In this example, the negotiation concerns the planning of
one cameraperson’s camerawork. When this is in
place, the editor assesses that the camera team is set up to
perform the broadcast. The final set-up, displaying four
distinct, complementary shots covering the action being
proposed to the editor, is shown in Fig. 4 (right). By con-
trast, the left image (Fig. 4) shows the set-up before the
negotiation, giving the editor less optimal footage to use in
editing the story. This is one example where we can see
their skills as a collaborative team evolving during the
course of the trial.
4 Discussion
We have explored issues of how content and form are
interrelated when doing video work, illustrating some
features of the process of learning to engage in this practice
of producing a video. We consider these observations as
candidate steps for an emerging ‘mobile video literacy’
trajectory, i.e., an emerging capacity to document and
argue by means of a specific medium. The notion of mobile
video literacy is developed further below. We then discuss
the communicative and material resources relied upon by
the participants in the video work. Finally, we
discuss how the visual elements are instrumental in the
creation of the storyline, and how parallel temporalities and
the mobility of people and the technology provide chal-
lenges but also resources for the participants when pro-
ducing a documentation of the exhibit.
4.1 Mobile video literacy
In this paper, we take the concept of media literacy as a
point of departure for analysing how people appropriate
technologies for purposes of meaning making and partici-
pation in media practices. As media technologies, like the
Mobile Vision Mixer studied here, are becoming more
accessible, it becomes apparent that something more is
needed than mere access to the tools; they also have to be
skilfully put to use for certain purposes in certain contexts.
Also, these technologies are often reaching out to audi-
ences beyond the current situation, allowing for online
participation with people in other contexts. In this way, a
mobile video literacy involves an attention to creating a
narrative and a storyline that is comprehensible beyond the
current context. This was evident in our study when the
participants oriented to their imagined audience, producing
videos that displayed an understanding of what was a
‘filmable’ event and what was not. Producing something is
one part of the participatory process involving media
technologies like this one. Gillmor [15] argues that the term
media literacy should move beyond the connotation of smart
consumption, and incorporate participation and collaboration
practices. In this study, we have seen how the producers
are continually displaying awareness of the consumers, for
instance, when discussing angles that will be understand-
able to an audience. In doing so, it is reasonable to assume
that they draw on their everyday experiences of viewing
visual media (what would be considered media literacy in
the traditional sense), but combine these experiences with
working out how to practically produce such viewable
content, in the situation at hand. The combined activity, in
the live production situation, is a process of developing
media literacy in Gillmor’s more participatory sense.
The element of liveness as a part of the media format
enables engagement and communication around the
content, between local and remote participants. This media
format allows for a process that is less linear than in more
traditional uses of video, where a video is shot in one
context, edited and shown to an audience at a later point in
time. Here, because of the live editing and broadcasting,
the production phase differs radically. This allows for
two forms of collaboration around the context being doc-
umented: the local collaboration around the production,
which we have focused on in this paper, and collaboration
with an audience watching the product as it is being
broadcast. Thus, the product (i.e., the live broadcast) is a
starting point, not an end result. In a related study from
another museum, we saw how live uploading of images
from the museum was responded to with comments through
social media channels, which then effectively changed the
material the participants uploaded [41]. Although in this
study we did not include the consumers of the material
being produced, the scenario would easily lend itself to this
type of interaction around the produced content.
The example shows how a cameraperson interprets the
editor’s broad direction to let their viewers ‘see what is
going on’, and tries out camera movements that could
guide the viewer, camera moves of a more conversational
[17] nature. Developing such skills through practice is
arguably another part of a new video literacy, as they
acknowledge the product—the live video broadcast—as a
starting point and a topic of conversation rather than a
finished product to be consumed passively.
Fig. 4 a In this picture, we see the four streams provided by the
camerapersons to the editor before negotiating the set-up. b The four
streams provided by the camerapersons after the discussion in Excerpt
3, right before beginning to shoot the sequence. The two top pictures
have now been readjusted
In the final product, the production work is (or should
be) invisible. Today, a lot of people are used to consuming
the products of video editing, but they are not necessarily
familiar with the process it takes to get there. Novices
encounter challenges involving how to coordinate the
recordings without being heard or seen in the final product,
how to choose between different angles, when to zoom and
how to design a presentation for a specific purpose. They
encounter situations that are crucial for producing live
video, e.g., mobility and temporality issues (see below).
Skills of these kinds have obvious similarities with those
that have to be developed in the context of literacy; one has
to learn skills of analysis and composition but also a range
of skills that relate to specific technologies. We saw how
our participants realized certain problems only when they
encountered them as part of the practical accomplishments
of doing the video work. Thus, mobile video literacy
emerges and develops when engaging with these media
tools.
Although the Mobile Vision Mixer is a prototype in
development, the types of systems that we are discussing
here could be one way to support an integration of media
literacy practices in educational settings. As argued by
Lewis et al. [28], ‘[w]hile educators have harnessed the
web to develop formal e-learning platforms, many are
struggling to unleash the power of social media to support
learning. In part, this is due to perceived difficulties in
integrating its emergent fluid forms and meanings into
highly structured learning environments’ [28, p. 4]. In this
way, the media literacy we are describing here poses
challenges to the educational system.
We have contrasted broader approaches to media liter-
acy with an update of the term accounting for this current
shift in digital media towards involvement, sharing and
production practices [15]. We argue that as the notion of
literacy shifts towards participation and the ability to pro-
duce media content, rather than just consuming it, and as
the tools for production become more powerful and
diverse, the skills needed to participate will be increasingly
medium specific. We aim to contribute to the understand-
ing of one such medium-specific literacy and how it may
evolve. In the following sections, we outline some of the
elements of what constitutes a mobile video literacy, as one
part of the broader media literacy described in the litera-
ture. We draw on this new approach to media literacy and
our present material on use of tools for live video pro-
duction, and describe three dimensions—visual storytell-
ing, parallel temporalities and mobility—that are key in
attaining a mobile video literacy.
4.2 Creating a storyline using visual resources
We have seen how issues arise around what to document
and how to do it, when trying to produce a relevant sto-
ryline of an exhibit. The participants worked to place
themselves in the storyline of the exhibit, both in terms of
performance of actions in a timely and relevant manner,
and in terms of camerawork to display the content in a
coherent way. The Mobile Vision Mixer is a visual tech-
nology, so clearly the participants mainly rely upon visual
resources when creating their storyline. There is a chal-
lenge involved here in the sense that the visual production
has to be formulated verbally when negotiating how and
what to film. We have seen how such verbal formulations
as capturing ‘what’s going on’ have led to some confusion,
which is solved by monitoring and revising the visual
output stream from the camera. In other words, the par-
ticipants resort to the verbal level in order to negotiate and
master obstacles.
Interestingly enough, the participants do not introduce
any verbal commentaries to explain what is shown in the
video. Rather, there is discussion and frequent
reminders about not talking during the recording since this
will be heard on the video. So the end result is a silent
movie, where all that is heard is the face recognition sys-
tem’s output (and some noise from nearby exhibits and
other visitors).
Thus, the storyline is produced using visual resources
only. The storyline could have been created with a voice-
over, a verbal narrative. However, the participants did not
choose to add that in the material we present here. Neither
before nor after any of the retakes of the face recognition
sequence is there any discussion about whether to include
commentaries or not. However, in another example from
the study, not presented in this article, the participants
agree that they need to verbalize parts of the exhibit during
the recording. This happens when filming an exhibition
which is in itself more text-based, a lie-detector. Before the
initiation of the recording, the cameraperson tells the
assistant that they need to read out the questions the lie-
detector system provides, and that the answers have to be
spoken out loud as well. During the actual filming, she then
reads the questions and has to remind the assistant
answering the questions to speak out loud, not just respond
in silence by pressing the button. The button pressing will
not be visible on the screen, so here she shows an orien-
tation to an audience and a need to verbalize the otherwise
invisible, textual elements of the exhibit. However, in none
of the shootings with this group did the participants create a
separate narrative voice-over to explain the content of the
film.
There could be a number of reasons why this does not
happen. First, the Mobile Vision Mixer is a visual tech-
nology. The most salient characteristic of a mobile video
tool is clearly the visual elements, the moving camera
footage. Since this technology was new to the participants,
it called for complex visual work in order to handle the four
live video streams and make them into one. Second, creating a
storyline involving a voice-over would create yet another
work task, thus adding complexity. It would involve the
negotiation of the content of the verbal narrative, some sort
of script, which would have to fit into the other actions
taking place. Also, it would call for an extra role to be
introduced, a narrator. Third, it would introduce another
temporality to the video work: to time the talk with the
actions taking place. The camera movements and the
actor’s (the assistant’s) movements would have to be
matched with the delivery of the commentaries and the
content of the commentaries, thus increasing the com-
plexity. Presumably, this is a set-up that would need more
practice and experience. It would also involve timing the
verbal commentary with other actions.
These are a few aspects of the creation of a visual
narrative, using a visual technology. With this particular
technology, visual elements are particularly important but,
as noted, are negotiated and handled verbally in the
production process. Unpacking the storyline and making it
appear clear in the edited video is one important element
for the participants to learn.
4.3 Parallel temporalities
In the particular video technology used in this study, live
transmission, as opposed to post-editing, is the main fea-
ture. The whole storyline has to be completed before
assessment and, if considered necessary, before any retake.
Two parallel temporalities that have to be coordinated are
at work: that of the performed actions, constituting the
storyline, and that of the camerawork, the recording of the
actions. The participants have to consider the temporality
of the performance of the actions at the same time as they
consider their own movements to capture the events. Also,
the recording is a collaborative achievement, which adds to
the complexity, and calls for coordination between the
different camerapersons. This is the role of directing.
The directing cannot be done during the actual filming; it
has to be done before the shooting. So before the recording
begins, the director gives instructions on how the
camera team should handle the timing and sequentiality of
the events and the camerawork.
When they start to unpack the situation, after having
done a couple of rounds of filming, the participants dis-
cover features of the complexity of these parallel tempo-
ralities. In this way, dealing with parallel temporalities is a
developing skill that we have seen emerge throughout this
data. It is not until they have tried it that they get a sense of
the potential problems in timing the performance with the
camerawork. For instance, in the first shooting, the cam-
eraperson did not manage to keep up with the performance
of the assistant walking across the room. This led to a
discussion of making a retake, where the actor was
instructed to walk slower in order to be captured in a
suitable manner. These challenges become apparent only
when using the technology, after the embodied experience
of having moved across the room in order to follow the
assistant’s movements. This links to another point we wish
to make in the discussion, around how the participants rely
upon different forms of mobility, and how mobility con-
stitutes a part of the visual literacy skills that they need to
grasp.
4.4 Mobility
Because of the mobility of all parts of this system, i.e., the
capturing as well as the mixing camera phones, the par-
ticipants can easily move around to document different
parts of the exhibit. The placement and mobility of the
camerapersons are crucial aspects of the documentation
process. As the participants become more familiar with the
technology, they realize that they need to distribute them-
selves across the exhibition area in order to cover different
aspects of the event. The storyline of the face recognition
documentation builds on the local mobility [2] of the
assistant, moving from the initial face scanning to the
unlocking of the ‘hotel room’. The assistant, who is doing
the procedures of the exhibit, has to move in order to
perform the different steps of the exhibition activity they
are filming, and consequently, the camerapersons have to
be in different locations to document these steps. When
placing themselves within the storyline of the exhibit, they
are also placing themselves within the space of the exhibition
area where relevant parts of the storyline take place.
In this way, when recording the entry into the ‘hotel
room’, one cameraperson is placed inside the room and one
outside in order to capture the assistant’s movements
through the space of the storyline. This is something that
evolved after a couple of trials. In the beginning, every-
body stood next to each other, recording the same object,
but after some time they started to discuss how to distribute
themselves to capture the activity from different locations.
This illustrates how the participants learn about some of the
considerations that have to be taken into account when
making a video.
Related to mobility is the timing of movements.
Movements throughout the storyline have to be done at a
certain pace. The participants progressively fine-tune the
movements of their bodies through space, as well as of the
camera. This is an embodied experience, where awareness
of their bodily mobilities is raised when the participants
have tried a first round of filming. They then realize that the
pacing of the performance has to be changed, and the
camerapersons need to keep up with that pacing in order to
be at the right location to document it, at the right time. In
relating their experiences of the first round, one of the
camerapersons is also using her body as a resource to show
the challenge in this documentation process.
Another related aspect of the mobility of the device is
the micro-mobility [30] of the camera phones, where we
saw how fine adjustments of the camera angles were
negotiated amongst the participants. The micro-mobility of
the video recording devices allows a moment-by-moment
assessment of what is visible on the screen, meaning that
certain aspects of the local environment are made available
for scrutiny [29]. In this case, the participants do not get
feedback from the viewers, who are not part of this study
scenario, but they do receive feedback from each other.
Primarily, it is the camerapersons who receive feedback on
various aspects of the camerawork from the editors. This
feedback can be given as the editor is monitoring the
changes the cameraperson is making in real time, looking at,
e.g., the new angle as it is being produced. The editor can
then ask the cameraperson to adjust this angle, much as
the participants in a Skype conversation were
found to ask the other to adjust the camera angle to get a
better view (ibid.).
5 Conclusion
Our observations illustrate some steps in the development
of literacy skills that relate to a new technology for docu-
menting, organizing and communicating information and
accounts of events. What we have shown is how students as
a group encounter problems that have to do with accom-
modating to the technology (recording and editing) and
designing a storyline (i.e., providing an interesting rendi-
tion of events documented), and with how to coordinate
these dimensions. In this learning trajectory, we see how
they struggle with critical elements and develop insights
and criteria for how the product of their work should
appear. The resistance offered served as obstacles that
generated negotiations, which, in turn, topicalized issues
that had to be dealt with including, for instance, how to
make a segment of video intelligible and interesting to an
audience. The discussions about difficulties make the
problem definitions and solutions learnable for the partic-
ipants. Participating in such practices, and in the co-
occurring analyses, increases the likelihood that skills will
develop and that learners will be able to make use of more
complex affordances of the technology in question. We
draw on this new approach to media literacy and our
present material on use of tools for live video production,
and describe three dimensions—visual storytelling, parallel
temporalities and mobility—that are key in attaining a
mobile video literacy.
As a learning task, the appropriation of these tool-
mediated literacy practices relies on collective work and
sharing of experiences in situ. Most likely, it is also
important to play different roles in such work in order to
experience the production process from different perspec-
tives and responsibilities. In comparison with more tradi-
tional learning tasks, an interesting feature of this task is
that it has a distinctive performative quality [36]. Students
are held accountable for producing a relevant rendition of
an event rather than for giving back something that is
already known, and they have to struggle with both form
and content. This is a transformation of learning tasks in
many settings that follows the increasing uses of digital
technologies where information is documented and readily
available. The ability to produce informative renditions and
productive accounts, of which mobile video literacy is one
case in point, is the expected outcome of such
learning. As Kress [24] points out, learning becomes
‘design’ rather than reproduction.
Appendix: Transcription notations
Based on Jefferson’s transcript notation, as related in [1].
Well      Emphasis is indicated by underlining
e:hhh:    Colon indicates prolonged segment
(0.3)     A pause, timed in tenths of a second
(.)       Pause shorter than one-tenth of a second
[ ]       Simultaneous (overlapping) speech
-         Interrupted speech
hhh       Outbreath
.hh       Inbreath
>what<    Spoken faster
°yes°     ‘Degree’ signs enclose quieter speech
YES       Capitals are spoken louder than surrounding talk
wha-      Interrupted, cut-off speech
References
1. Atkinson JM, Heritage J (1985) Structures of social action:
studies in conversation analysis. Cambridge University Press,
Cambridge
2. Bellotti V, Bly S (1996) Walking away from the desktop com-
puter: distributed collaboration and mobility in a product design
team. In: Ackerman MS (ed) Proceedings of CSCW ‘96. ACM,
New York, pp 209–218
3. Bentley FR, Groble M (2009) TuVista: meeting the multimedia
needs of mobile sports fans. In: Proceedings of MM’09, Beijing,
China, 19–24 Oct, pp 471–480
4. Bergstrand F, Landgren J (2011) Visual reporting in time-critical
work: exploring video use in emergency response. In: Proceed-
ings of MobileHCI, pp 415–424
5. Brown B, Reeves S, Sherwood S (2011) Into the wild: challenges
and opportunities for field trial methods. In: Proceedings of CHI.
ACM Press
6. Broth M (2009) Seeing through screens, hearing through speak-
ers: managing distant studio space in television control room
interaction. J Pragmat 41(10):1998–2016
7. David G (2010) Camera phone images, videos and live stream-
ing: a contemporary visual trend. Vis Stud 25(1):89–98
8. Donald M (1991) Origins of the modern mind: three stages in the
evolution of culture and cognition. Harvard University Press,
Cambridge
9. Dougherty A (2011) Live-streaming mobile video: production as
civic engagement. In: Proceedings of MobileHCI, pp 425–434
10. Engstrom A, Esbjornsson M, Juhlin O (2008) Mobile collabora-
tive live video mixing. In: Proceedings of MobileHCI. ACM
Press, pp 157–166
11. Engstrom A, Perry M, Juhlin O (2012) Amateur vision and rec-
reational orientation: creating live video together. In: Proceedings
of CSCW
12. Engstrom A, Juhlin O, Perry M, Broth M (2010) Temporal
hybridity: mixing live video footage with instant replay in real
time. In: Proceedings of CHI’10. ACM Press, pp 1495–1504
13. Garfinkel H (1967) Studies in ethnomethodology. Englewood
Cliffs, NJ
14. Gilje Ø (2011) Working in tandem with editing tools: iterative
meaning-making in filmmaking practices. Vis Commun 10(1):
45–62
15. Gillmor D (2010) Mediactive. Dan Gillmor (Creative Commons)
16. Greiffenhagen C (forthc.) Visual grammar in practice: negotiat-
ing the arrangement of speech bubbles in storyboards. Semiotica
17. Ivarsson J (2010) Developing the construction sight: architectural
education and technological change. Vis Commun 9:171
18. Iacucci G, Oulasvirta A, Salovaara A, Sarvas R (2005) Sup-
porting the shared experience of spectators through mobile group
media. In: Proceedings of GROUP. ACM Press, pp 207–216
19. Jenkins H (2006) Convergence culture: where old and new media
collide. New York University Press, New York
20. Jenkins H (2009) Confronting the challenges of participatory
culture: media education for the 21st century. MIT Press,
Cambridge
21. Jokela T, Lehikoinen JT, Korhonen H (2008) Mobile multimedia
presentation editor: enabling creation of audio-visual stories on
mobile devices. In: Proceedings of the SIGCHI conference on
human factors in computing systems (CHI’08). ACM, New York,
NY, pp 63–72
22. Kennedy L, Naaman M (2009) Less talk more rock: automated
organisation of community-contributed collections of concert
videos. In: Proceedings of WWW 2009, pp 311–320
23. Kirk D, Sellen A, Harper R, Wood K (2007) Understanding
videowork. In: Proceedings of ACM CHI, pp 61–70
24. Kress G (2003) Literacy in the New Media Age. Routledge, New
York
25. Laurier E, Brown B (2011) The reservations of the editor: the
routine work of showing and knowing the film in the edit suite.
J Soc Semiot 21(2):239–257
26. Lehmuskallio A, Sarvas R (2008) Snapshot video: everyday
photographers taking short video-clips. In: Proceedings of
NordiCHI ’08, pp 257–265
27. Lemke JL (1998) Metamedia literacy: transforming meanings and
media. In: Reinking D, McKenna MC, Labbo LD, Kieffer RD
(eds) Handbook of literacy and technology: transformations in a
post-typographic world. Erlbaum, Mahwah, pp 283–301
28. Lewis S, Pea R, Rosen J (2010) Collaboration with mobile media:
shifting from ‘participation’ to ‘co-creation’. In: Proceedings of
WMUTE, IEEE, pp 112–116
29. Licoppe C, Morel J (2009) The collaborative work of producing
meaningful shots in mobile video telephony. In: Proceedings of
MobileHCI’09, ACM Press, pp 254–263
30. Luff P, Heath C (1998) Mobility in collaboration. In: Proceedings
of CSCW ’98. ACM Press, pp 305–314
31. Mondada L (2003) Working with video: how surgeons produce
video records of their actions. Vis Stud 18(1):58–73
32. Mondada L (2009) Video recording practices and the reflexive
constitution of the interactional order: some systematic uses of
the split-screen technique. Hum Stud 32(1):67–99
33. Okabe D (2004) Emergent social practices, situations and rela-
tions through everyday camera phone use. In: Paper presented at
mobile communication and social change, international confer-
ence on mobile communication in Seoul, Korea, 18–19 Oct 2004
34. Reponen E (2008) Live @ Dublin: mobile phone live video group
communication experiment. In: Proceedings of EUROITV ’08
35. Sacks H, Schegloff EA (1992) Lectures on conversation. In:
Jefferson G (ed), vol 1. Blackwell, Oxford, p 2
36. Saljo R (2010) Digital tools and challenges to institutional tra-
ditions of learning: technologies, social memory and the perfor-
mative nature of learning. J Comput Assist Learn 26(2):43–64
37. Toussi R, Zoric G, Engstrom A, Juhlin O (in submission) Mobile
vision mixer: a system for collaborative live mobile video pro-
duction, submitted manuscript, Mobile Life Centre
38. Vihavainen S, Mate S et al (2011) We want more: human–
computer collaboration in mobile social video remixing of music
concerts. In: Proceedings of CHI 2011, ACM Press, pp 287–294
39. vom Lehn D, Heath C (2006) Discovering exhibits: video-based
studies of interaction in museums and science centres. In:
Knoblauch H, Schnettler B, Raab J, Soeffner H (eds) Video
analysis: methodology and methods: qualitative audiovisual data
analysis in sociology. Peter Lang Pub Inc., New York, NY,
pp 101–113
40. Weilenmann A (2001) Negotiating use: making sense of mobile
technology. Pers Ubiquit Comput 5:137–145
41. Weilenmann A, Hillman T, Jungselius B (2013) Instagram at the
museum: communicating the museum experience through social
photo sharing. In: Proceedings of the conference on human fac-
tors in computing systems, ACM Press
752 Pers Ubiquit Comput (2014) 18:737–752