Spatial Cinematography: How Spatial Camera Systems, Non-Interactive Immersion and Production Workflows Impact VR Filmmaking

By Asad Aftab
Bachelor of Industrial Design, School of Art, Design and Architecture, 2023

Supervisor: Dr. Garnet Hertz

A CRITICAL AND PROCESS DOCUMENTATION THESIS PAPER SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF DESIGN

EMILY CARR UNIVERSITY OF ART + DESIGN
2025
© Asad Aftab, 2025

Table of Contents

Acknowledgments
Abstract
Keywords
Glossary
Introduction
  Document Overview
  Scope and Limitations
Chapter 1: Spatial Cinematography
Chapter 2: The Evolution of VR Cinema
  2.1 360-Degree Video and its Evolution
  2.2 The Transition to 180-Degree Video
  2.3 Stereoscopic 3D and Cine-VR
  2.4 The "Spatial Video" Revolution
Chapter 3: Pre-Production
  3.1 Immersive vs. Interactive Content
  3.2 Structuring Non-Linear Narratives
  3.3 Pre-Visualization
  3.4 Guiding Audience Attention
Chapter 4: Production
  4.1 Field of View and Perspective
  4.2 Choosing a Camera System
Chapter 5: Post-Production
  5.1 Image Stabilization and Upscaling
  5.2 Editing and Compositing
Research Output: The Spatial Cinematography Framework Poster
Conclusion and Future Directions
Bibliography

Acknowledgments

This thesis is dedicated to people who have been instrumental in my life for the past two years. To my friends JoJo, Jo, Qianxuan, and Rebecca, whose constant support allowed me to go beyond my own perceived capabilities and made me realize there is a life beyond work. To Alan, Garnet, Peter and Sean, who provided me with an environment that allowed me to grow without fear of failure. To Aisha, Hamza, Khansa, Maira, and Rija for always believing in me. And lastly, while my family does not necessarily understand what I do, they have never opposed it.

This research was primarily conducted on the unceded xʷməθkʷəy̓əm (Musqueam), Sḵwx̱wú7mesh Úxwumixw (Squamish), and səl̓ilw̓ətaʔɬ (Tsleil-Waututh) territories, more commonly known as so-called "Metro Vancouver."

Abstract

Immersive filmmaking in virtual reality is a rapidly evolving method of storytelling that combines traditional cinematic approaches with immersive and interactive technologies. This thesis investigates the creative and technological challenges of producing VR films, particularly spatial camera systems, non-interactive immersion, and production workflows. By dissecting these features through a role-specific lens, this thesis guides VR filmmakers, writers, designers, and technical teams, inspiring them to explore VR's innovative potential. All of this forms the basis for the term "Spatial Cinematography."

This study emphasizes VR's creative capabilities, such as creating multi-branching storylines, assessing the level of interactivity, comparing 3DOF and 6DOF immersive cinematic experiences, and using new media techniques, such as spatial sound, to invoke emotional responses from viewers. It also discusses the significance of directing audience attention in an immersive space while maintaining viewer autonomy and agency, a fundamental difficulty specific to immersive media. On the technical side, the study examines the selection of VR-specific camera systems, along with image stabilization and upscaling techniques that ensure smooth performance across various HMD platforms. It also offers best practices for production planning and efficient production workflows, drawing on practical insights from research and case studies. Evaluating current limits – such as hardware constraints and accessibility issues – and current trends, like AI-driven workflows for immersive experiences, yields actionable solutions for expanding the field of VR filmmaking. This thesis aims to inspire filmmakers to embrace VR's potential and to foster innovation and storytelling techniques that profoundly redefine audience engagement in this new, immersive narrative form.

Keywords

Spatial cinematography, Virtual reality, Filmmaking, Immersive cinema, AI, Immersive Video, Spatial Video, Cinematic Virtual Reality.

Glossary

Head-mounted display (HMD): A head-mounted display (HMD) is a device worn on the head, featuring a small optical display in front of one eye (monocular HMDs) or both eyes (binocular HMDs). Virtual reality (VR) headsets are a specific kind of HMD that tracks the user's 3D position and orientation, creating an immersive virtual environment.

Stereoscopic 3D (S3D): This method crafts a depth illusion by showing a pair of images to the viewer's eyes, leading the brain to interpret them as one cohesive 3D image. The images are composed of a left image and a right image to produce the two "views".
Monoscopic: A single image or video that is viewed with one eye. In this specific context, it refers to a type of virtual reality (VR) content that uses a single image for both eyes.

Immersive Video: Video content designed to make viewers feel like they are inside the video. The idea is to give viewers a lifelike perspective, usually in a 180-degree field of view.

Spatial Video: An S3D video format that allows users to move around in the video and interact with objects in it.

Pre-Visualization: Commonly referred to as pre-vis, this term denotes the visualization of scenes or sequences from a film before shooting begins. Pre-visualization typically includes techniques like storyboarding, which use either hand-drawn or digitally created sketches / 3D scenes to outline or conceptualize film scenes.

Pixel Depth: Also referred to as color depth or bit depth, it indicates the number of bits allocated for a pixel's color representation. This measurement dictates the variety of colors that can be shown on a screen. The higher the pixel depth, the higher the color quality.

3DOF: Three Degrees of Freedom (3DoF) is a virtual reality concept that defines user interaction within a virtual environment. In 3DoF, users remain stationary: they can look left, right, up, and down and pivot left and right, but they are unable to navigate through the virtual space.

6DOF: In Six Degrees of Freedom (6DoF), users maintain the three types of movement enabled by 3DoF while gaining additional movement options. They can move forward and backward, up and down, and left and right. Users can also observe and interact with objects in the environment as if they were real.

NeRF: A neural radiance field (NeRF) utilizes deep learning to generate a three-dimensional representation of a scene from two-dimensional images. Put simply, it is a machine learning technique that employs AI to produce 3D models and meshes. It is a more recent advancement in 3D scanning and photogrammetry.

Lens masks: A lens mask is a video mask created around VR footage to smooth out the edges and make it blend better into the virtual environment.

Horizon correction: The process of 'straightening' footage to be level with the horizon line, correcting discrepancies introduced while recording.

Stereoscopic stitching: This technique combines multiple images/videos to create a stereoscopic panorama. It is utilized in virtual reality (VR) and 3D video.

Stereo depth budget: A stereoscopic depth budget is the range of depth that a viewer can comfortably perceive in a 3D image – the amount of space on the z-axis that a viewer can comfortably see. The geometry of the stereoscopic pipeline, from the scene to the viewer's eyes, determines the depth budget.

Image stabilization: This technique introduces a compensating movement to keep the image static on the camera's sensor. This movement may be physical, i.e., in-body image stabilization (IBIS), or calculated via sensors, i.e., electronic image stabilization (EIS). Image stabilization crops the footage to compensate for the movement it is trying to remove.

Gyroscopic stabilization: A type of EIS that uses gyroscopic data captured either by the camera or lens to stabilize the footage at hand.

Upscaling: Enhancing the resolution of an image or video, which can be accomplished through an algorithm or may be powered by AI.

NLE: Non-linear editing (NLE) is a process that allows editors to modify a video or audio project without following a linear timeline.
This means you can work on any clip in any sequence, regardless of whether it belongs at the project's beginning, middle, or end.

Full SBS: Full Side-by-Side (SBS) means the left and right views of a 3D video are transmitted at full resolution. This results in better quality but bigger file sizes. A 1080p full-SBS 3D file has a resolution of 3840 x 1080 (1920 + 1920).

Half SBS: Half Side-by-Side (SBS) means the left and right views of a 3D video are subsampled at half resolution, and you get a backward-compatible full frame. A 1080p half-SBS 3D file has a resolution of 1920 x 1080 (960 + 960).

Screen door effect: The screen door effect, or pixelation, is a visual artifact in which the spaces between pixels on a display become apparent. This effect is reminiscent of a mesh or flyscreen and is particularly evident in low-resolution displays and VR headsets.

SOOC: Straight out of Camera – the image/video file at hand has had no post-processing or corrections applied to it and is being viewed exactly as the camera shot it.

Introduction

Virtual Reality filmmaking, also referred to as Cinematic VR (Mateer, 2017) or Cine-VR for short, has evolved significantly, transitioning from early 360-degree video experiments to today's new "spatial video" revolution in digital content and hardware. This thesis examines how spatial camera systems, non-interactive immersion, and production workflows shape Cinematic VR content. By understanding these three base variables, we can develop a framework for telling a compelling story in an immersive setting.

Document Overview

This document is split into five main chapters, starting with Spatial Cinematography in Chapter 1, which provides an overview and definition of Spatial Cinematography, explains what Cine-VR is as a medium, and describes the role of a Spatial Cinematographer. Next, Chapter 2 explores the evolution of VR cinema, presenting a historical market overview of the medium. Chapter 3 continues with pre-production, highlighting tools like pre-visualization methods, spatial audio cues, and visual design techniques to strategically guide viewer attention. Production is the theme of Chapter 4, which covers field of view, guiding viewer attention, choosing a spatial camera system, and balancing the technical and creative aspects of a Cine-VR project. Chapter 5 finishes with an overview of post-production techniques. After this, the document shows how these principles play out in practice through an easy-to-consume and easy-to-distribute poster. Moreover, it looks at the prospects of the field.

Scope and Limitations

This project intentionally limits its scope to focus on the technical aspects of VR production for spatial cinematography. It does not address the types of content that can be created within the VR format; rather, it offers practical production workflows to assist creators with the medium's formal challenges. It strives to be a practical technical manual. Broader ethical, social, and environmental concerns surrounding VR technologies are also outside the scope of this research. While important issues exist – including the biases embedded in media, as McLuhan described in his assertion that "the medium is the message" (McLuhan, 1964), as well as corporate critiques of companies such as Meta (with Mark Zuckerberg) – they are not explored here. Environmental impacts, including resource use, carbon emissions, and e-waste associated with VR systems, are similarly acknowledged but not analyzed in detail.
These exclusions are not meant to diminish the significance of these concerns, but to maintain a clear focus on addressing the lack of accessible technical knowledge in VR production. This document is therefore structured intentionally as a practical technical manual.

Chapter 1: Spatial Cinematography

I refer to this approach towards VR filmmaking as "Spatial Cinematography," and it extends beyond technological considerations, influencing the very nature of storytelling and audience engagement. Traditional cinematography techniques must be adapted and modified for immersive experiences (Zhang & Weber, 2023), requiring us to rethink narrative construction, cinematographic framing, and viewer interaction. This research builds on existing ideas in VR film while proposing new methods for maximizing engagement through spatial cinematography techniques.

An important distinction is that Cine-VR is a medium like film and digital cinema. It can be defined as being similar to 360-degree video, but its key difference from 360° is its use of cinematic production techniques such as lighting design, sound design, scenic design, and blocking techniques (the latter two in the case of dramatic work) (Williams et al., 2021). Within this new medium of Cine-VR sits the newly proposed field of Spatial Cinematography, which in turn gives rise to a new production role: the Spatial Cinematographer, a combination of a traditional cinematographer/Director of Photography, a stereographer, and an XR designer/artist.

Figure 1: Multi-sided role of the Spatial Cinematographer.

Creating this new multi-faceted role is important as VR cinema is cementing itself as a rapidly expanding medium for immersive storytelling. This requires streamlined roles, responsibilities, and workflows to reduce the barrier of entry for new creatives who wish to use this medium to tell their stories more effectively.

Chapter 2: The Evolution of VR Cinema

2.1 360-Degree Video and its Evolution

The birth of 360-degree video introduced audiences to an entirely new way of viewing content. However, the format presented challenges, such as limited storytelling control and technical difficulties in camera placement and stitching. In the early 2000s, the novelty of 360-degree filmmaking drew early adopters, but its constraints soon became evident. Initially, multiple cameras had to be used in custom-made enclosures to film a 360-degree video. The most common example of this was the use of GoPro action cameras in an enclosure like the one in Figure 2 below.

Figure 2: Custom GoPro enclosure made to hold six GoPro Hero 3+ cameras for filming 360° footage, from 2012.

The most significant disadvantages of this GoPro array approach were the lack of image stabilization and the manual stitching required in post. The results ranged from decent to downright abysmal, but it was a start. As the years went by and 360 cameras (and VR headsets) got better, we started to see more polished solutions from companies like Insta360 and Ricoh in the form of dual-fisheye lens cameras in 2013, which made filming in 360° much simpler. The introduction of auto-stitching software helped to speed up editing and make 360° videos more accessible to the average consumer.

Figure 3: Insta360 ONE (2017).[1]

At the high end, we also saw companies like Kandao and Insta360 introduce "pro-level" cameras in 2017 and 2021. These cameras can shoot at high resolutions of 8K and beyond and have many features to streamline 360-degree filming and live streaming.
[1] Read more about the Insta360 ONE: https://www.cnet.com/reviews/insta360-one-preview/

Figure 4: Left: Insta360 Pro[2] from 2017; right: Kandao Obsidian Pro[3] from 2021.

However, 360° videos since the early 2000s have had some major flaws that keep them from reaching their true potential, which primarily boil down to the fact that the camera captures everything around it. Lighting scenes using grip and lighting gear becomes extremely difficult and, in many cases, impossible (Chulhyun, 2016). Coupled with the difficulties of framing and the increasingly difficult challenge of keeping the audience engaged in whatever narrative was being told, this made life difficult in the early days of VR cinema. The format produced visually stunning pieces that fell flat in other areas. Eventually, 360° videos were relegated to a small niche as they began to be viewed as something that required extreme effort to produce and often resulted in very limited audience engagement. This prompted more interactive experiences, realized using tools like Unity and Unreal Engine, to take centre stage in VR content. Although these developments reduced passive viewing and enhanced immersion, they failed to fulfill the promise of 'Cinematic VR.' Instead, they provided gamified experiences that required constant user engagement and physical interaction, complicated by the discomfort of bulky headsets.

[2] Read more about the Insta360 Pro: https://www.insta360.com/product/insta360-pro
[3] Read more about the Kandao Obsidian Pro: https://www.kandaovr.com/Obsidian-Pro

These challenges with the 360°-video format between 2012 and 2020 prompted VR filmmakers to seek more engaging formats. As a result, several content creators began experimenting with 180-degree videos around 2019.

2.2 The Transition to 180-Degree Video

As interest in 360° started to wane, the industry started to look for other formats for creating Cinematic VR content. Around 2018, Google released the VR 180° format (Ackermann, 2018) as a more usable alternative to 360°. The Lenovo Mirage was the first consumer-facing VR 180° camera to hit the market in 2018 (MIXED, 2023), soon to be followed by offerings such as the Insta360 EVO and Vuze XR, which remain popular among enthusiasts.

Figure 5: Left: Vuze XR from 2019[4]; right: Insta360 EVO from 2019[5].

After around 2019, 180° cameras quickly emerged as a more practical alternative to 360° for Cinematic VR, despite the format's initial struggles to gain traction due to various technological constraints. Improvements in hardware and software, largely thanks to the introduction of Canon's EOS VR system, have made 180-degree video a viable option for immersive storytelling, balancing a controlled perspective with a heightened sense of depth and immersion.

[4] Read more about the Vuze XR: https://www.videomaker.com/reviews/cameras/humaneyes-vuze-xr-hands-on-review-a-flexible-camera-for-the-cutting-edge/
[5] Read more about the Insta360 EVO: https://www.insta360.com/product/insta360-evo

Figure 6: Canon R5 paired with the RF 5.2mm F2.8 Dual Fisheye Lens, released in 2021.[6]

[6] Read more about the EOS VR system at https://canon.ca/en/product?name=RF5.2mm_F2.8_L_Dual_Fisheye&category=/en/products/Lenses/VR-Lenses

2.3 Stereoscopic 3D and Cine-VR

Stereoscopic 3D (S3D) adds a layer of depth perception, making environments feel more tangible and "real." S3D and Cinematic VR go almost hand in hand, to the point that monoscopic Cine-VR tends to look 'flat.'
Figure 7: GoPro Dual Hero System[7] – a popular consumer-level S3D platform released in 2015 (Asch, 2022).

Stereoscopic 3D (S3D) became a popular format in the early 2010s, championed by filmmakers like James Cameron with his groundbreaking 'Avatar.' However, S3D is historically a very old medium, having been especially popular in the cinema space since the 1950s (Arden, 2012). Unfortunately, the difficulties of filming in 3D and the high cost of entry for consumers (Peddie, 2024) eventually prevented the format from achieving widespread adoption. As VR headsets improved in screen resolution and pixel depth (Lynn et al., 2020), they all but resurrected 3D as a format outside of theatre screens. You no longer need a 3D TV or an expensive projector-and-3D-glasses setup; you just need a semi-decent VR headset for a more than acceptable 3D viewing experience. Filming for VR in S3D also presents its own unique challenges, such as maintaining interocular distance (Knorr et al., 2024), especially when using DIY filming solutions, and avoiding eye strain due to improper stereo calibration (Terzić and Hansard, 2016). The effort is worth the hassle, though, as it helps increase the viewer's immersion, and the layer of depth perception brings it closer to real life (Morana, 2024).

[7] Read more about the Dual Hero system here: https://stereoscopy.blog/2022/01/16/gopro-dual-hero-system-housing-an-extraordinary-3d-camera-rig/

2.4 The "Spatial Video" Revolution

In December 2023, Apple introduced 'spatial video' on the iPhone 15 Pro, which it termed "a groundbreaking new capability that helps users capture life's precious moments" (Apple Newsroom, 2023). This new format of immersive video capture was brought forward on the heels of the launch of the Apple Vision Pro. Marketing jargon aside, "spatial video" is just stereo 3D video, and "Apple Immersive Video" is simply a 3D VR 180° video (Swanson, 2024).

With the release of the Apple Vision Pro, we are seeing a big shift in the Cine-VR industry. One of the most significant issues plaguing standalone VR headsets is the lack of powerful hardware to run high-resolution screen panels and play back high-resolution footage. The AVP solves this by effectively being a MacBook Pro strapped to your face. With the creation of the new MV-HEVC codec for 'spatial video' (Blackman and Harley, 2024), we are finally seeing a clear push towards high-level Cine-VR content. With Meta also supporting this new video container on their headsets, it would not be remiss to say that the stage is set for the next generation of VR filmmaking.

The push for high-fidelity experiences, driven by the 'screen door effect' (Cho et al., 2017), has spurred technological innovation. We finally have both the hardware and software to make such experiences, but compared to traditional filmmaking, Cine-VR still lacks established frameworks and 'best practices' owing to its relatively nascent stage. This necessitates the formalization of 'spatial cinematography' into its own creative field within CVR.

Chapter 3: Pre-Production

This section looks at some of the more theoretical and creative decisions you will have to make when taking the deep dive into the world of Cine-VR. It unpacks key strategies for building compelling storytelling within virtual reality – an area often caught between traditional cinematic conventions and interactive experiences.
This chapter highlights concrete tools like effective pre-visualization methods, spatial audio cues, and visual design techniques to strategically guide viewer attention. Ultimately, it outlines how filmmakers can leverage these principles practically, pushing Cinematic VR beyond gimmicks into meaningful, impactful narratives.

3.1 Immersive vs. Interactive Content

VR content/experiences can primarily be grouped into two categories based on the amount of movement freedom the user has:

• 3DOF: Head tracking around 3 rotational axes.
• 6DOF: Head tracking + translational motion tracking.

Figure 8: An illustrative example of 3DOF vs 6DOF (Barnard, 2023)

In simpler terms, in a 3DOF experience the user is an observer, while in a 6DOF experience the user is a participant (IEEE, 2022).

Many VR experiences we see now are primarily 6DOF, and a perception has been built up around the medium that the user must be able to touch and interact with objects in the scene and weave their own interactive storyline. However, 6DOF experiences are closer to games than to cinema, and that is a barrier to entry for both creation and consumption: people may not possess the necessary technical skills to make a whole 6DOF experience in Unity or Unreal Engine, and similarly, if not executed correctly, 6DOF experiences can lead to more immersion-breaking scenarios than 3DOF ones (Chatterjee et al., 2024). It is almost universally agreed upon in the VR community that 3DOF is a "lower level" of immersion than 6DOF (IEEE, 2022).

However, in the case of Cine-VR, creating a live-action 6DOF experience is incredibly difficult – near impossible – due to the sheer amount of resources required to film all the possible paths the user may take (which are practically infinite). So, for Cine-VR specifically, 3DOF is the sub-medium of choice, and 'non-interactive immersion' is required within that container. Non-interactive immersion allows for greater directorial control while maintaining audience engagement. Because we have a 3DOF-style experience, we eliminate the possibility of the user actively interacting with the environment. Previous studies have found that, generally, viewer attention is focused on the centre area of the frame (L. Tong et al., 2021). This makes the shift to VR 180° make even more sense, as it allows the content to be framed in a more user-consumption-friendly way while maintaining a high level of immersion. If screenwriters and directors from traditional film have experience with formats such as VistaVision or IMAX, they will find writing a story for immersive video similar to writing for those two formats, as both are primarily "centre-punched" (IMAX Corporation, 1999).

Therefore, by carefully designing non-interactive experiences, filmmakers can maintain a higher level of agency over the narrative than usual while still offering audiences the sensation of presence. In other words, this research suggests that 3DOF is a considerably superior choice over 6DOF as an immersive video narrative format. This goes against the assumption that "more is better" for degrees of freedom: for Cine-VR, three is better than six.

3.2 Structuring Non-Linear Narratives

Non-linear storytelling in VR challenges traditional narrative structures by introducing audience interactivity and spatial exploration. Unlike conventional filmmaking, where scenes unfold in a predetermined sequence, VR narratives must account for user agency while maintaining coherence.
This balance requires innovative storytelling techniques such as environmental cues, guided focus, and adaptive pacing. As VR blurs the lines between linear and non-linear storytelling, immersive experiences like CARNE y ARENA (2017) demonstrate how spatial narratives can engage viewers while preserving artistic intent. To navigate this complexity, VR filmmakers rely on advanced pre-visualization tools and spatial storyboarding to direct attention and enhance narrative immersion.

Due to its inherently non-linear nature, virtual reality (VR) presents unique challenges and opportunities for storytelling. Unlike traditional filmmaking, where the narrative unfolds in a fixed sequence, VR and immersive media allow for greater audience interactivity, creating a complex storytelling environment where guiding viewer attention becomes critical (Schleser et al., 2024). While VR is often categorized as a non-linear medium due to its interactive capabilities, immersive and spatial videos occupy a middle ground between linear and non-linear storytelling. In these formats, directors and screenwriters still control the overarching narrative. However, due to the expansive field of view in VR – particularly in 360-degree video – the audience has the freedom to explore the environment, which can lead to divided attention (Choi and Nam, 2022). In 360-degree experiences, the viewer's attention is split between environmental exploration and the primary narrative elements. This creates a significant challenge in ensuring that key narrative moments are not overlooked.

A stellar example of non-linear narratives in VR is the VR installation "CARNE y ARENA (Virtually present, Physically invisible)" by director Alejandro G. Iñárritu, which received an Oscar special award (AMPAS, 2017).

Figure 9: Title poster for CARNE y ARENA (PHI, 2017)

CARNE y ARENA explores the human condition of immigrants and refugees. It is a twenty-minute individual experience centred on a virtual reality segment shared by three guests in distinct rooms. Each guest experiences the story at the same time from their own perspective. To effectively structure such an experience, the spatial cinematographer must employ strategic pre-production planning and visualization techniques to mitigate the risk of audience distraction and ensure that critical narrative elements are perceived.

In traditional filmmaking, storyboarding is used to plan shots and sequences. In VR storytelling, storyboarding through pre-vis techniques – such as building scenes in Unreal Engine 5 – becomes even more essential, as it must account for the audience's ability to look in multiple directions (Gipson et al., 2018). Storyboarding in VR therefore aids production logistics and serves as a tool for guiding user focus.

VR storytelling demands a meticulous approach to narrative structuring, balancing the freedom of audience exploration with the need to convey a cohesive story. Through careful pre-production planning, effective use of audiovisual cues (Vosmeer and Schouten, 2017), and strategic set design, creators can successfully navigate the challenges of non-linear storytelling in immersive media. By leveraging these techniques, VR filmmakers can create engaging and meaningful narratives that resonate with their audiences while maintaining coherence within an expansive, interactive space.

3.3 Pre-Visualization

Previsualization, or pre-vis for short, plays an increasingly important role in modern filmmaking (Ardal et al., 2019).
However, it remains less prevalent in smaller productions than traditional storyboarding techniques. In CVR, pre-vis is particularly critical because it enables filmmakers to simulate the virtual environment and test audience reactions before full-scale production begins (Park, 2018). By utilizing game engines such as Unreal Engine, filmmakers can create interactive pre-vis models that allow them to assess viewer focus (Muender et al., 2018). For instance, user testing can reveal whether audiences are paying attention to key narrative elements or becoming distracted by peripheral details in the scene. These insights inform staging, lighting, and production design adjustments to enhance storytelling clarity.

Figure 10: A pre-vis example in UE5.

Given the high production costs associated with CVR, allocating sufficient time and resources to both previsualization and user testing within the pre-production pipeline is essential. Unlike traditional films, CVR projects demand a more extensive development process due to the medium's complexity, the integration of interactive elements, and the unique distribution challenges.

Newer 3D-scanning techniques like NeRFs, combined with VR (Xu et al., 2023), have opened up many possibilities for recreating real-life locations for the purposes of pre-vis. Thanks to apps like Luma AI, NeRFs have become much more accessible. They allow one to simply scan a room with a phone and either cloud-process the capture or process it locally on a PC, with very convincing and usable results.

Figure 11: An example of a still screenshot from a NeRF scan; view the scan at this link.

3.4 Guiding Audience Attention

In VR cinema, audience focus is more difficult to direct than in traditional cinema. Techniques such as spatial sound cues, lighting, and environmental design are crucial in maintaining immersion without explicit direction. Ensuring that viewers naturally follow the intended flow of the story is essential to sustaining engagement and coherence (Knorr et al., 2018).

In the production of Cinematic Virtual Reality (CVR), several key elements contribute to the overall immersive experience, including sound design, lighting, environmental design (i.e., set design), and production design (Choi and Nam, 2022). Unlike traditional filmmaking, CVR does not provide explicit directional cues such as arrows, subtitles, or guided prompts, which are often present in interactive experiences. One of the most significant challenges in non-interactive immersive experiences, as opposed to interactive ones, is the absence of non-playable characters (NPCs) to direct the user's attention. This lack of guidance means we do not know exactly where the viewer will look (He and Liu, 2024), and it necessitates a different approach to storytelling, requiring the director, screenwriter, and production team to anticipate user behavior and design the experience accordingly.

Several techniques can be used to direct viewer attention effectively in VR (Carpio et al., 2023):

• Audio Cues: Sound design plays a crucial role in VR storytelling (Truong, 2015). Directional audio can draw attention to specific scene areas, subtly guiding the viewer's gaze. As part of the PXR conference XRtist linkup, I developed a VRChat world titled "Cabin Fever" alongside a team of students from various universities. I was the main Unity world builder on this project, and as a mechanism to guide the user, various audio cues were used to help them find the required keys to progress through the experience.
Figure 12: Screengrab from "Cabin Fever" – The experience can be viewed in VRChat here.

Figure 13: Unity project screen grab from "Cabin Fever".

• Gestural Cues: Character movements and gestures can serve as natural focal points, encouraging the audience to follow the narrative action. For example, in this untitled VR 180° piece, the audience is encouraged to look around the 180° environment by following the actions of various people in the scene as well as the camera movements.

Figure 14: Screengrab from "Untitled" VR 180° piece. View it at this link here.

• Environmental Design: Set design and production elements can be structured to subtly lead the viewer's eye toward key storytelling elements. For example, objects of interest can be highlighted using lighting contrasts or motion. In my 360° VR piece titled "Dreamstate – Ascension," the movement of the various spacecraft across the screen is used to direct the user's eye across most of the space and to encourage them to look around the 360° environment.

Figure 15: Screengrab from "Dreamstate - Ascension". View it at this link here.

• VFX Enhancements: Visual effects can strategically emphasize important elements within the scene, ensuring that narrative significance is reinforced even in a highly interactive space (Park, 2018). Again referencing "Cabin Fever": in this experience, the keys had a subtle glow texture, and the path to them was marked with small glowing rocks as wayfinding points. The main guiding "fairy" was modelled as a striking orb of light to instantly pull the user's attention to it as soon as they stepped into the scene.

Figure 16: Guiding fairy from "Cabin Fever".

Figure 17: One of 3 keys from "Cabin Fever".

Figure 18: Guiding Rocks Unity Project View from "Cabin Fever".

Figure 19: Guiding Rocks from "Cabin Fever" – The experience can be viewed in VRChat here.

Beyond directing attention, set and production design play a significant role in enhancing the immersive narrative experience. Given the wide frame of VR environments, every element within the scene must contribute meaningfully to the story. Whether filming in an outdoor location, like a forest, or an elaborate interior space, like a mansion, careful set decoration can incorporate visual cues that reinforce the storyline. Additionally, post-production techniques can integrate subtle narrative elements into the environment, enriching the immersive experience and making the world feel more alive.

A crucial aspect of this whole process is user testing. In interactive experiences, interactivity is tested to ensure functionality and engagement. Similarly, in CVR, testing is essential to evaluate how viewers engage with the visual and auditory elements. Traditionally, screen tests in filmmaking occur during the later stages of production, often after a rough or final cut is available. However, in CVR, the testing process must begin much earlier, particularly during pre-vis.

Another key distinction between CVR and interactive experiences, such as video games, is the level of viewer agency (L. Tong et al., 2021). In interactive media, designers can restrict user movement or progression until specific actions are performed. In contrast, Cinematic VR often lacks such constraints, especially in exhibition settings like film festivals. While users may have limited control, such as being unable to pause the experience if controllers are removed, their gaze and attention remain unrestricted.
Consequently, if crucial narrative elements are overlooked, they may be permanently missed, posing a challenge for filmmakers who seek to craft narrative-driven VR experiences (Szita and Gander, 2021). As narrative-focused CVR projects become more common, filmmakers must develop strategies to direct audience attention within the immersive environment effectively.

This challenge also underscores the advantage of VR180° over full 360-degree VR. In a VR180° experience, the viewer's field of vision is limited to 180°, reducing the likelihood of distractions and enabling the production team to focus on directing attention within a defined frame (L. Tong et al., 2021). By contrast, 360-degree VR allows viewers to look in any direction, increasing the risk of missing key story elements. As a result, VR180° provides a more controlled environment that aligns more closely with traditional cinematic storytelling techniques.

In conclusion, Cinematic VR development requires a strategic approach to production, incorporating pre-vis, user testing, and careful attention to audience engagement. As the field continues to evolve, these considerations will be instrumental in shaping the future of immersive storytelling.

Chapter 4: Production

Creating content for spatial cinema demands a thorough grasp of both the technical and creative aspects that are distinct to VR filmmaking. In contrast to traditional cinema, where framing and perspective are closely managed, spatial cinematography must navigate the delicate interplay between immersion and narrative clarity. One of the crucial choices a spatial cinematographer faces is the selection of the field of view (FOV), which significantly impacts how viewers interpret and interact with the narrative.

This chapter delves into the essential elements of spatial cinema production, starting with how the field of view (FOV) and perspective influence storytelling. It continues by discussing the significance of selecting the appropriate camera system, spanning from budget-friendly options to advanced professional setups, each providing unique benefits depending on resolution, workflow, and production requirements. By grasping these key components, VR filmmakers can make informed choices that elevate both technical quality and immersive storytelling.

4.1 Field of View and Perspective

Selecting between a 180-degree and a 360-degree FOV is perhaps a spatial cinematographer's most crucial decision in conveying the story the director and screenwriters have created. As previously discussed, the FOV influences storytelling and the audience experience, so the spatial cinematographer must work with the pre-production team during the pre-vis process to select the FOV that best conveys the director's vision and is practical in production. While a wider field of view can enhance presence, environment building, and detail, it may complicate directing viewer focus. A unique quality of Cinematic VR is that the viewer is generally placed into a first-person "POV"-style perspective, as opposed to seeing things through the lens of a camera, because that tends to be more effective (Cannavò et al., 2024).

Due to the complexity and the number of factors that can influence audience attention in VR, the spatial cinematographer must reduce those factors and plan for principal photography accordingly.
Choosing the FOV is an important part of that, and while we have touched on 360° and 180° previously as popular formats, there are other possible formats that the spatial cinematographer can potentially use, such as:

• 200° (closest to 180° but allows for more head movement)
• 220° (allows for slightly more head movement)
• 240° (a FOV closer to 360° but still not as wide)
• 63° (slightly larger than the FOV of our eyes)

All these formats deviate from the intended/designed use cases of the available VR camera systems. But it is always important to remember that viewer attention is often focused on the centre of the frame (He and Liu, 2024). As we have previously seen, narrower FOVs make it far easier to direct audience attention, and this is why CVR has transitioned to being focused primarily on VR180°. A FOV narrower than 360° allows production quality to increase: shots can be lit properly using more traditional cinema techniques, and better production design is possible because a certain area is simply not being filmed.

Figure 20: Top-down illustration of FOVs for VR. The viewer is in the centre.

Now, let us look at some illustrative examples from my practice to understand further how much of the frame is visible to the viewer in the HMD. The first example below is from the opening sequence of the first piece of my "Dreamstate" trilogy, the second part of which has already been mentioned in previous sections. Figure 21 is an equirectangular screenshot from "Dreamstate" that represents the whole 360° space available for the viewer to look around in.

Figure 21: Opening sequence of "Dreamstate".

Figure 22 shows the various FOVs we discussed previously in this section, illustrated in a 16:9 aspect ratio, to give the reader an idea of how much of the frame will be available for the viewer to look around in.

Figure 22: Opening sequence of "Dreamstate" showing FOV cropping.

Let us look at another example. Below, in Figure 23, is a still photograph shot with Canon's 5.2mm Dual Fisheye lens, which captures an approximately 190-degree FOV. The previous example was a 2D CG VR 360° experience; the one below is a 3D VR 180° still image.

Figure 23: ~190-degree FOV 3D still image without lens mask (SOOC).

Now, we will look at the image with a lens mask applied and see how much of the frame gets cropped as we move to a narrower FOV. For the sake of simplicity, only the left eye is illustrated.

Figure 24: ~190-degree FOV 3D still image with lens mask via EOS VR Utility, with the FOV crop illustrated.

In Figure 24 above, we see an approximation of how much space the viewer might have if we crop to a narrower FOV or shoot at that FOV in the first place. We also see that the frame is automatically cropped to 180 degrees when the lens mask is applied. By shooting at a slightly larger FOV of 190 degrees and applying a lens mask later, we ensure we always have a full 180-degree frame. By increasing the size of the mask, we can crop into the footage to create narrower FOVs; the same applies when working with 360°.

In conclusion, FOV is a powerful tool for the spatial cinematographer. The ability to manipulate the frame can direct the viewers' eyes to a point of interest, or it may create emphasis, a sense of constriction, or a sense of expansion based on how the lens mask is manipulated.
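The proportional arithmetic behind these crops is simple, and it can be worth running before committing to a format. The short sketch below shows it; the pixel widths and helper names are hypothetical illustrations of my own, not values from any camera vendor's documentation.

```python
# Quick arithmetic for how much of an equirectangular frame a FOV crop keeps.
# A sketch with hypothetical numbers; helper names are mine, not a vendor API.

def pixels_per_degree(frame_width_px: int, frame_fov_deg: float) -> float:
    """Horizontal pixel density of an equirectangular frame."""
    return frame_width_px / frame_fov_deg

def crop_width(frame_width_px: int, frame_fov_deg: float, target_fov_deg: float) -> int:
    """Pixel width kept when masking the frame down to target_fov_deg."""
    return round(pixels_per_degree(frame_width_px, frame_fov_deg) * target_fov_deg)

# Example: a 190-degree per-eye capture, assumed 4096 px wide, masked to 180 degrees.
print(crop_width(4096, 190, 180))   # 3880 px survive; roughly 5% is masked away

# Example: cropping an assumed 7680 px wide 360-degree frame to the FOVs listed above.
for fov in (240, 220, 200, 180, 63):
    print(fov, crop_width(7680, 360, fov))
```

The same proportional reasoning underlies the lens-mask adjustments described above: shooting a little wider than the delivery FOV always leaves masking headroom.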
It is a crucial tool that any spatial cinematographer must be aware of while planning during pre-production, and all these ideas can be prototyped using the aforementioned pre-vis techniques.

4.2 Choosing a Camera System

As CVR and spatial cinematography continue to evolve, a range of camera systems has become available, each catering to different levels of production quality, budget constraints, and usability requirements. This section categorizes VR camera systems into three primary tiers: consumer-level, prosumer-level, and professional-level setups. Each category offers distinct advantages and limitations, which the spatial cinematographer must carefully consider when selecting a system.

Consumer-Level VR Cameras

Consumer-level VR cameras are designed for entry-level users and casual content creators. These cameras typically offer ease of use, compact form factors, and affordability, making them suitable for hobbyists or those experimenting with VR content creation. Notable consumer-grade VR cameras include:

• Insta360 EVO: Previously a popular choice, this camera has now been surpassed by newer models due to its outdated resolution and firmware limitations.
• Kandao Qoocam 8K, EGO and Ultra 3: The Qoocam 8K is a lightweight, portable 8K 360-degree camera. The EGO is a 3D camera with a FOV of around 66 degrees, and the Qoocam Ultra 3 is like the 8K but with more features. While Kandao's offerings provide impressive image quality for their price point, they suffer from software stability issues.[8]
• Insta360 X-Series (X4, X3, etc.): Positioned as 360-degree "action" cameras, these models offer solid performance compared to Kandao's offerings at a similar price point.[9]
• CALF x VISINSE: This is positioned as a more premium consumer camera. While promising, its firmware has presented operational challenges.[10]

[8] See the Kandao online product page at https://www.kandaovr.com/consumer
[9] See the Insta360 product page here: https://store.insta360.com/consumer?i_source=website&i_medium=menu_button&i_campaign=consumer
[10] See the CALF product page here: https://calfglobal.com/products/calf-visinse-3d-vr180-camera

While these cameras provide accessibility, many lack professional-grade features such as LOG or RAW recording, advanced media storage options, robust power solutions, and reliable overheating management. These shortcomings make them less suitable for professional productions requiring extensive post-production workflows and higher image fidelity.

Prosumer-Level VR Cameras

The prosumer category bridges the gap between consumer-grade and high-end professional setups, offering improved image quality, better control over recording settings, and compatibility with professional editing software.
A significant advancement in this category is the Canon EOS VR System, which provides a range of VR lens options:

• Canon RF 5.2mm f/2.8 Dual Fisheye Lens: Designed for full-frame sensors, capturing a 190-degree field of view.[11]
• Canon RF-S 3.9mm F3.5 STM Dual Fisheye Lens: Optimized for APS-C sensors, offering a 144-degree field of view.[12]
• Canon RF-S 7.8mm F4 STM DUAL Lens: A more recent addition, covering approximately 66 degrees, making it suitable for close-up and macro applications.[13]

[11] https://www.canon.ca/en/product?name=RF5.2mm_F2.8_L_Dual_Fisheye&category=/en/products/Lenses/VR-Lenses
[12] https://www.canon.ca/en/product?name=RF-S3.9mm_F3.5_STM_DUAL_FISHEYE&category=/en/products/Lenses/VR-Lenses
[13] https://www.canon.ca/en/product?name=RF-S7.8mm_F4_STM_DUAL&category=/en/products/Lenses/VR-Lenses

Figure 25: Image via Canon Canada.[14]

The full-frame lens is primarily used with Canon's R5C, R5 Mark II, C80, and C400 cameras, while the APS-C lenses are currently only compatible with the Canon R7 (Fensome and Kendrick, 2024). The Canon R5C and R5 Mark II record up to DCI 8K at 60fps, the C80 up to DCI 6K at 30fps, and the C400 up to DCI 6K at 60fps, all in CRAW format. Meanwhile, the R7 records up to 4K Fine at up to 30fps. For CVR to look "good" inside a headset, a resolution of 8K – which translates to 4K per eye – is generally considered the baseline. Lower resolutions lack detail and do not look as good inside a headset: more artifacts and a lower overall visual fidelity make it that much harder for the viewer to get immersed.

When paired with the Canon EOS VR Utility[15] and its Premiere Pro plugin, filmmakers benefit from a streamlined post-production workflow. The EOS VR Utility handles all the stitching and allows for one-click stabilization, lens mask creation, and horizon correction, reducing the need for third-party software such as Mistika VR.

[14] See Canon's EOS VR lens lineup here: https://www.canon.ca/en/products/Lenses/VR-Lenses
[15] Read more about the EOS VR Utility here: https://app.ssw.imagingsaas.canon/app/en/vru.html

Figure 26: EOS VR Utility featuring some footage I shot in Venice during January 2025.

Professional-Level VR Cameras

Professional VR camera systems offer the highest quality, flexibility, and post-production control for high-budget productions and advanced immersive filmmaking. However, they also require post-processing software such as Mistika VR or DaVinci Resolve with the KartaVR plugin pack to perform tasks such as stitching, stereo correction, etc.

The most notable recent entry in this category is the Blackmagic URSA Cine Immersive, which features:

  o ~17K overall resolution (8K per eye), the highest available in commercial VR filmmaking.[16]
  o Superior dynamic range and BRAW recording capabilities.
  o Robust sensor performance, making it ideal for large-scale productions.
  o Native integrated workflow with DaVinci Resolve Studio.

[16] For more info, see https://www.blackmagicdesign.com/ca/media/release/20240611-02.

Other notable high-end immersive cameras include:

• RED V-Raptor X paired with Canon's dual fisheye lenses – the go-to for Cinematic VR production.[17]
  o 4K per eye, 8K 120fps recording capabilities, and the REDCODE RAW codec.
• Apple's in-development in-house VR camera system, details of which remain largely undisclosed due to confidentiality.[18]
  o We know it is in a similar spec range to the URSA Cine Immersive, at 8K per eye.

[17] See https://www.youtube.com/watch?v=L7iP4zRD3cY for more information.
[18] Read more here: https://appleinsider.com/articles/24/10/11/apples-secretive-3d-cinema-camera-resurfaces-for-submerged
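To put these per-eye resolutions in perspective, a rough, hypothetical calculation of uncompressed data rates is instructive; real acquisition codecs such as BRAW or REDCODE RAW compress far below these figures, so this is an upper bound, not a camera specification.

```python
# Rough, uncompressed data-rate arithmetic for stereoscopic VR capture.
# Hypothetical figures for illustration only; real cameras record compressed
# RAW (e.g., BRAW, REDCODE RAW), which lands far below these numbers.

def uncompressed_gb_per_min(width: int, height: int, fps: float,
                            bits_per_pixel: int = 30) -> float:
    """Data rate of an uncompressed stream in GB per minute.

    bits_per_pixel assumes 10-bit RGB/4:4:4 sampling (~30 bpp); adjust as needed.
    """
    bits_per_second = width * height * bits_per_pixel * fps
    return bits_per_second * 60 / 8 / 1e9

# A full side-by-side 8K-per-eye frame (2 x 7680 wide, 4320 tall) at 60 fps:
print(round(uncompressed_gb_per_min(2 * 7680, 4320, 60), 1), "GB/min")
# ~896 GB per minute uncompressed -- a sense of why native 8K-per-eye capture
# strains storage and transfer, even before compression brings it down.
```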
Selecting the appropriate VR camera system depends on the specific needs of a project, including budget, resolution requirements, workflow compatibility, and production scale. Consumer cameras provide accessible entry points but lack professional-grade features. Prosumer systems, particularly Canon's EOS VR lineup, offer a compelling balance between quality and affordability. Meanwhile, professional VR cameras like the Blackmagic URSA Cine Immersive and RED V-Raptor X push the boundaries of immersive filmmaking, albeit at significantly higher costs. As VR technology advances, the landscape of available camera options will evolve, offering new opportunities for filmmakers to innovate within immersive space.

Chapter 5: Post-Production

5.1 Image Stabilization and Upscaling

Image stabilization plays a critical role in CVR due to the unique challenges associated with camera movement. Unlike in traditional filmmaking, where the camera's movement does not directly impact the viewer's sense of presence, in CVR the spatial cinematographer must carefully synchronize visual motion with the viewer's physical orientation. Any discrepancy between the two can lead to discomfort, including headaches and motion sickness (Li et al., 2021). Even factors such as the camera's position relative to the viewer's height can affect how the user experiences the scene (Rothe et al., 2019).

The spatial cinematographer is heavily responsible for achieving the smooth, stable imagery that is paramount in VR filmmaking. Spatial cinematographers rely on meticulous shot planning and stabilization techniques to mitigate motion-related issues. The most common approach is to use locked-off shots, positioning the camera on a tripod or monopod to minimize unintended motion. Additionally, gimbals and Steadicam systems are widely employed to achieve fluid movement. Handheld shooting is strongly discouraged, as it introduces micro-jitters that can be challenging to remove in post-production, often exacerbating motion sickness in VR headsets.

There are three primary techniques for image stabilization in VR filmmaking:

1. Physical Camera Stabilization: This involves using hardware-based stabilizers, such as gimbals from manufacturers like DJI and Zhiyun. Professional systems like Steadicam, Flycam, or ARRI Trinity offer advanced stabilization capabilities for higher-budget productions.

Figure 27: Myself with my Canon R7 w/ RF-S 3.9mm F3.5 Dual Fisheye Lens on my Zhiyun Crane 3 Lab gimbal.

2. Post-Production AI Stabilization: AI-driven software solutions provide an alternative for filmmakers who may not have access to high-end stabilization hardware. Topaz Video AI is widely regarded as the leading image stabilization and upscaling software.

Figure 28: Topaz Video AI 6 image stabilization – self-shot footage from Venice in January 2025.

3. Gyroscopic Stabilization: Some VR camera systems, such as Canon's EOS VR system, embed gyroscopic data in the footage. When processed through the EOS VR Utility, this data enables precise image stabilization, which is considered an effective low-cost method for VR applications.

Figure 29: EOS VR Utility with image stabilization analysis in progress – stability is prioritized for this specific clip – self-shot footage from Venice in January 2025.
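The core idea behind gyro-based stabilization can be sketched in a few lines: low-pass filter the recorded orientation track to recover the intended camera path, then counter-rotate each frame by the difference between the raw and smoothed orientations. The sketch below is a conceptual illustration of that principle only – it is not Canon's proprietary implementation, and the simulated gyro track and window size are arbitrary assumptions.

```python
# Conceptual sketch of gyro-based electronic stabilization (EIS).
# Not any vendor's actual algorithm; a minimal illustration of the principle:
# smooth the recorded orientation track, then counter-rotate each frame by
# the difference between the raw and smoothed orientations.
import numpy as np

def smooth_orientation(angles_deg: np.ndarray, window: int = 15) -> np.ndarray:
    """Low-pass filter a per-frame (N, 3) yaw/pitch/roll track with a moving average."""
    kernel = np.ones(window) / window
    # Pad at the edges so the output stays N frames long.
    pad = (window // 2, window - 1 - window // 2)
    padded = np.pad(angles_deg, (pad, (0, 0)), mode="edge")
    return np.stack(
        [np.convolve(padded[:, i], kernel, mode="valid") for i in range(3)], axis=1
    )

def stabilization_corrections(raw_deg: np.ndarray, window: int = 15) -> np.ndarray:
    """Per-frame counter-rotations to apply: smoothed path minus raw path."""
    return smooth_orientation(raw_deg, window) - raw_deg

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = 300
    # Simulated gyro track: a slow 30-degree pan (intended motion) plus jitter.
    pan = np.linspace(0, 30, frames)
    jitter = rng.normal(0, 0.4, (frames, 3))  # handheld micro-shake, all axes
    raw = np.column_stack([pan, np.zeros(frames), np.zeros(frames)]) + jitter
    corr = stabilization_corrections(raw)
    # The largest correction also hints at how much crop margin the reframe needs.
    print("max correction (deg):", np.abs(corr).max())
```

This also makes the crop trade-off discussed below concrete: the larger the corrections, the more peripheral image area must be sacrificed to hide the counter-rotation.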
Despite its advantages, AI-based stabilization presents certain challenges, particularly in stereoscopic VR. AI algorithms may introduce artifacts that result in inconsistencies between the left- and right-eye views, leading to an uncomfortable viewing experience in the headset.

The Role of Upscaling in VR Filmmaking

Upscaling has become a common practice in VR filmmaking due to the demanding technical requirements of high-resolution content. Shooting in native 8K 60fps RAW is often impractical due to the need for high-performance camera systems, extensive storage capacity, and rapid data transfer rates. As a result, many filmmakers opt to shoot at lower resolutions, such as 4K, 5.7K, or 6K, and then upscale the footage using AI-enhanced software like Topaz Video AI to achieve 8K (4K per eye) or even 16K (8K per eye) for premium headsets such as the Apple Vision Pro. Generally, 8K (4K per eye) is considered the accepted minimum industry standard for VR.

Figure 30: Topaz Video AI 6 – source footage is 3840x2160 (4K) at 30fps, i.e., 2K per eye, and is being upscaled to 15360x8640 (~16K) at 60fps, i.e., 8K per eye – self-shot footage from Venice in January 2025.

While upscaling can yield impressive results, its effectiveness depends heavily on the quality of the original footage. Poorly shot footage cannot be "saved" in post-production through upscaling, reinforcing the need for skilled spatial cinematography. Optimal lighting conditions are essential to ensure that stabilization or upscaling can be applied effectively.

Additionally, spatial cinematographers must account for the cropping effect inherent in image stabilization. Since stabilization algorithms compensate for motion by adjusting the frame, some peripheral detail may be lost. (AI-based image stabilization avoids this crop, but it tends to introduce artifacts.) This is particularly significant in VR, where cropping can disrupt the viewer's sense of scale. Unlike in traditional filmmaking, where a zoomed-in shot may be perceived as an intentional framing choice, excessive cropping in VR can distort the viewer's perspective. For instance, a minor miscalculation in stabilization may cause human subjects to appear disproportionately large, creating an unsettling experience in which the viewer perceives themselves as significantly smaller than the on-screen characters. This can break immersion and negatively impact audience engagement.

Both image stabilization and upscaling are integral to achieving high-quality Cinematic VR experiences. While hardware stabilizers, gyroscopic correction, and AI-driven post-production techniques offer viable solutions, each approach presents its own limitations. Likewise, upscaling can circumvent the challenges of high-resolution native capture, but it is not a substitute for well-executed cinematography. Ultimately, careful planning around stabilization and image resolution is necessary to maintain immersion and ensure an optimal viewing experience in VR.

5.2 Editing and Compositing

CVR presents unique challenges distinct from traditional filmmaking, particularly in editing. Unlike conventional cinema, spatial cinematography frequently involves multiple camera lenses or video sources that must be seamlessly stitched together into a unified stream. Once a labor-intensive manual task, this process has been significantly streamlined by advancements in camera systems and software.
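Before turning to the specific tools, a rough illustration of the kind of reprojection this automation performs: FFmpeg's v360 filter can convert side-by-side fisheye footage into a side-by-side half-equirectangular (VR180-style) frame. This is a simplified sketch under stated assumptions – it presumes an FFmpeg build that includes the v360 filter, the file names and FOV values are illustrative, and it omits the per-lens calibration and left/right-eye handling that dedicated tools like the EOS VR Utility apply.

```python
# Sketch: reprojecting side-by-side fisheye footage to side-by-side
# half-equirectangular with FFmpeg's v360 filter, driven from Python.
# Assumes FFmpeg is installed with the v360 filter; file names and the
# 190-degree FOV are illustrative, and this skips the per-lens calibration
# that dedicated tools such as the EOS VR Utility perform.
import subprocess

v360 = (
    "v360="
    "input=fisheye:in_stereo=sbs:ih_fov=190:iv_fov=190:"  # two 190-degree fisheyes, SBS
    "output=hequirect:out_stereo=sbs"                     # 180-degree equirect per eye
)

subprocess.run(
    ["ffmpeg", "-i", "dual_fisheye_sbs.mp4",
     "-vf", v360,
     "-c:v", "libx265", "-crf", "18",
     "stitched_vr180_sbs.mp4"],
    check=True,
)
```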
5.2 Editing and Compositing

CVR presents unique challenges distinct from traditional filmmaking, particularly in editing. Unlike conventional cinema, Spatial Cinematography frequently involves multiple camera lenses or video sources that must be seamlessly stitched together into a unified stream. Once a labor-intensive manual task, this process has been significantly streamlined by advancements in camera systems and software.

Stitching and Initial Processing

Modern VR cameras, such as those from Insta360 and Kandao, now offer in-camera stitching or proprietary software that simplifies the process. For instance, footage captured with these systems can be processed in a few steps by dragging the files into the respective software and executing an automated stitching command (Busch, 2024). Similarly, Canon's EOS VR system utilizes a unique lens configuration that produces a side-by-side spherical image. This footage can then be converted into an equirectangular projection using the EOS VR Utility (Fensome and Kendrick, 2024), ensuring compatibility with non-linear editing (NLE) systems like Adobe Premiere Pro and DaVinci Resolve.

Stereo Correction and Compositing

Stereo correction is a critical step in VR post-production (J. Tong et al., 2022). While built-in correction tools, such as Canon's parallax and horizon correction features, can manage most adjustments, specialized software like Mistika VR provides more granular control over stereo compositing. Mistika VR allows editors to fine-tune depth alignment, ensuring a comfortable viewing experience (Terzić and Hansard, 2016). Once these adjustments are made, the footage is imported back into the NLE for further editing and sequencing.

Figure 31: Example stitch of Canon EOS VR dual fisheye footage in Mistika Boutique – The spherical SBS footage can be seen in the timeline, and the stereo-composed equirectangular conversion in the bottom right.

VR Editing in NLEs

Editing VR footage requires specialized tools for accurate previewing. Premiere Pro, for example, offers an in-headset live preview feature that enables editors to review footage in real time, reducing the need for repeated exports and revisions. While editing on a flat screen remains an option, real-time VR previewing is recommended as it saves time and ensures precise spatial alignment. In DaVinci Resolve, a similar workflow can be achieved using the open-source KartaVR plugin (Hazelden, 2024), which utilizes Fusion to unwrap footage for headset viewing. Additionally, with the upcoming URSA Cine Immersive, Blackmagic will enhance DaVinci Resolve Studio, transforming it into a fully native immersive video editing platform. However, the extent of its compatibility with third-party camera systems remains uncertain.

Metadata Injection and Export

Once the editing process is complete, the final step involves exporting and formatting the VR video with appropriate metadata. FFmpeg, a long-standing and widely used software tool, facilitates metadata injection to define critical attributes such as resolution, screen size, and stereo format (full SBS or half SBS). Specific metadata configurations are required for different platforms. For instance, Apple Spatial Video (Swanson, 2024) necessitates specific metadata settings, while YouTube requires tags that identify the video format (e.g., VR180 or VR360) to ensure proper playback.
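To make the export step concrete: FFmpeg can tag a Matroska container with its stereo layout, while Google's open-source Spatial Media Metadata Injector is the usual route for YouTube's spherical (360°) metadata; VR180 uploads historically went through Google's separate VR180 Creator tool instead (Ackermann, 2018). A minimal sketch, assuming ffmpeg on PATH and a checkout of https://github.com/google/spatial-media; the file names are illustrative, and Apple Spatial Video requires its own MV-HEVC workflow (Swanson, 2024) not shown here.

```python
# Export/metadata sketch: (1) tag the stereo layout in a Matroska container
# via FFmpeg, (2) inject YouTube-compatible spherical + stereo metadata with
# Google's Spatial Media Metadata Injector (github.com/google/spatial-media).
# File names are illustrative; Apple Spatial Video needs a separate MV-HEVC
# pipeline and is not covered by these tags.
import subprocess

SBS = "film_fullsbs_8k.mp4"  # full side-by-side master

# (1) Remux to MKV with a stereo_mode tag so players know it is full SBS.
subprocess.run([
    "ffmpeg", "-i", SBS, "-c", "copy",
    "-metadata:s:v:0", "stereo_mode=left_right",
    "film_fullsbs_8k.mkv",
], check=True)

# (2) Inject spherical V2 metadata for a 360° stereo YouTube upload.
# Run from the root of a spatial-media checkout; depending on the checkout,
# the injector may require an older Python interpreter.
subprocess.run([
    "python", "spatialmedia", "-i", "--stereo=left-right",
    SBS, "film_youtube_ready.mp4",
], check=True)
```

Note the division of labor: the container-level `stereo_mode` tag only describes the frame layout, whereas the injected spherical metadata tells the platform to project the footage onto a sphere at playback.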
Considerations for VR Editing

One of the fundamental differences between traditional and VR editing is the necessity for the editor to understand stereo depth budget and alignment. While spatial cinematographers aim for optimal in-camera alignment, subtle stereo discrepancies can still arise. Even minor misalignments can cause viewer discomfort (J. Tong et al., 2022), making it essential for editors to make precise stereo corrections during post-production.

Conclusion

Editing and compositing in Cinematic VR require a combination of advanced software tools and a deep understanding of spatial cinematography principles. While technological advancements have simplified many aspects of the workflow, meticulous stereo correction, real-time VR previewing, and proper metadata handling remain critical for producing high-quality immersive experiences. As VR filmmaking evolves, integrating specialized editing tools within NLEs will further streamline the post-production process, enhancing the overall efficiency and quality of Cinematic VR content.

Research Output: The Spatial Cinematography Framework Poster

To make these research findings more legible to a broader audience, I summarized the key recommendations of my research in a single graphical image. The advantage of this format is that it is easier to absorb and more straightforward to disseminate than a long-form thesis document. The intention is to spread my research insights online to improve the field of spatial filmmaking today. The poster has three sections: 1. Pre-production, 2. Production, and 3. Post-production.

Pre-Production:

• Clarify Narrative Approach (3DOF vs. 6DOF): Clearly define whether the VR experience is immersive (non-interactive, 3DOF) or interactive (6DOF) to effectively guide production and storytelling decisions.
• Conduct Thorough Pre-Visualization: Use advanced pre-visualization methods (e.g., Unreal Engine, NeRF scanning) to accurately simulate scenes, identify potential viewer attention issues, and streamline the shooting process.
• Select Optimal Field of View (FOV): Determine the most appropriate FOV (180°, 360°, or custom) based on narrative requirements, directing attention, practicality in lighting, and intended viewer immersion.
• Choose the Right Camera System: Carefully evaluate available camera systems (consumer, prosumer, professional) considering production scale, budget, resolution, and post-production workflow compatibility.

Production:

• Apply Cinematic VR Framing Techniques: Strategically frame shots to guide viewer attention using spatial design, subtle gestural cues, and careful camera placement to maintain narrative clarity and immersion.
• Ensure Stereo Comfort and Depth Budget: Meticulously calibrate stereoscopic 3D setups, maintain proper interocular distances, and consistently verify comfortable viewing to prevent viewer fatigue or discomfort.
• Prioritize Stable Camera Movements: During shooting, use physical stabilization methods (tripods, monopods, gimbals) to minimize unwanted camera movement and viewer discomfort.

Post-Production:

• Perform Precise Image Stabilization: Apply effective stabilization techniques (gyroscopic data, AI-driven software like Topaz Video AI) to ensure smooth and comfortable viewing experiences.
• Optimize Resolution through Upscaling: When shooting at lower resolutions, utilize AI-enhanced upscaling techniques to achieve target VR resolutions (8K minimum) for maximum visual fidelity and immersion.
• Implement Rigorous VR-Specific Editing Practices: Utilize specialized editing workflows in software like Premiere Pro or DaVinci Resolve, ensuring careful stereo correction, compositing accuracy, and thorough real-time VR headset reviews to finalize the immersive experience.

The Spatial Cinematography Production Checklist:

Figure 32: Spatial Cinematography Production Checklist.
The goals of this project reflect what experts suggest about making research findings legible to a broader audience. As Tufte notes, the page layout of this information matters: “Graphical excellence consists of complex ideas communicated with clarity, precision, and efficiency” (Tufte, 2001). The goal here is to condense my findings into an easy-to-transport image that can be shared on Instagram, the web, and other digital platforms.

Conclusion and Future Directions

Spatial Cinematography represents more than a technical shift in filmmaking — it signals a deeper transformation in how stories are told, experienced, and remembered. This thesis has worked to map this new field by examining how spatial camera systems, non-interactive immersion, and production workflows reshape the language of Cinematic VR. By developing a practical framework, this research offers filmmakers a way to move beyond experimentation and begin crafting immersive works with greater precision, clarity, and emotional resonance.

At the heart of this project is a simple but profound idea: VR storytelling is not just about new tools — it demands new ways of thinking. As Marshall McLuhan famously argued, "the medium is the message" (McLuhan, 1964); the form of a medium shapes not just the content it carries but how it is understood. Spatial Cinematography embraces this reality, recognizing that VR is not simply another platform for cinema but a fundamentally different way of structuring narrative, space, and presence.

As hardware advances and technologies like AI-driven production and real-time rendering evolve, the challenge will not be merely to adapt, but to rethink the nature of narrative space itself. Spatial Cinematography provides a foundation for this shift, helping to bridge the technical and creative divide that has held back the medium’s full potential. The accompanying framework and poster aim to make this knowledge accessible to a wider community of creators, opening doors for more diverse, sophisticated, and emotionally powerful VR films.

As my upcoming Cine-VR work, The Sound of One Eye Closing19, moves toward its premiere at the Venice Biennale20, it stands as proof that immersive cinema is not a passing trend — it is a medium still inventing its own grammar. The future of VR storytelling will belong to those who see immersion not as a gimmick, but as a canvas. Spatial Cinematography invites filmmakers to claim that canvas — to experiment boldly, to question assumptions, and to create worlds that audiences can truly inhabit.

19 BCC press release: https://www.labiennale.org/it/news/biennale-college-cinema-aperto-il-bando-italiano-novit%C3%A0-e-aggiornamenti
20 Read more about this year’s BCC-I projects here: https://collegecinema.labiennale.org/en/prog_immersive_24/

Bibliography:

Ackermann, E. (2018, June 15). Introducing VR180 Creator, simplifying the video editing process. Google. https://blog.google/products/google-ar-vr/introducing-vr180-creator-simple-video-editing/

AMPAS. (2017, October 27). The Academy’s Board of Governors awards an Oscar® to Alejandro G. Iñárritu’s “Carne y Arena” virtual reality installation. Oscars.org | Academy of Motion Picture Arts and Sciences. https://www.oscars.org/news/academys-board-governors-awards-oscarralejandro-g-inarritus-carne-y-arena-virtual-reality

Apple Newsroom. (2023). Apple introduces spatial video capture on iPhone 15 Pro. Apple Newsroom (Canada).
https://www.apple.com/ca/newsroom/2023/12/apple-introduces-spatial-video-capture-on-iphone-15-pro/

Ardal, D., Alexandersson, S., Lempert, M., & Abelho Pereira, A. T. (2019). A Collaborative Previsualization Tool for Filmmaking in Virtual Reality. Proceedings of the 16th ACM SIGGRAPH European Conference on Visual Media Production, 1–10. https://doi.org/10.1145/3359998.3369404

Arden, S. (2012). Adventures in Stereo: Stereoscopic Cinema in the Age of a Digital Observer. https://doi.org/10.35010/ecuad:2719

Asch, T. (2022, January 16). GoPro Dual HERO System Housing – An Extraordinary 3D Camera Rig. The Stereoscopy Blog. https://stereoscopy.blog/2022/01/16/gopro-dual-hero-system-housing-an-extraordinary-3d-camera-rig/

Baoill, A. Ó. (2007). Review of Jenkins, H. (2006), Convergence Culture: Where Old and New Media Collide. Social Science Computer Review. https://doi.org/10.1177/0894439307306088

Barnard, D. (2023, June 27). Degrees of Freedom (DoF): 3-DoF vs 6-DoF for VR Headset Selection. VirtualSpeech. https://virtualspeech.com/blog/degrees-of-freedom-vr

Blackman, T., & Harley, D. (2024). Interpreting Apple’s visions: Examining the spatiality of the Apple Vision Pro. Platforms & Society, 1, 29768624241283913. https://doi.org/10.1177/29768624241283913

Busch, A. (2024, March 20). How to quickly stitch a 360 video with Insta360 Studio. Mantis Sub Underwater Housings for Insta360 Pro / RS. https://www.mantissub.com/academy/how-to-quickly-stitch-360-video-with-studio

Cannavò, A., Castiello, A., Pratticò, F. G., Mazali, T., & Lamberti, F. (2024). Immersive movies: The effect of point of view on narrative engagement. AI & SOCIETY, 39(4), 1811–1825. https://doi.org/10.1007/s00146-022-01622-9

Carpio, R., Birt, J., & Baumann, O. (2023). Using case study analysis to develop heuristics to guide new filmmaking techniques in embodied virtual reality films. Creative Industries Journal, 1–22. https://doi.org/10.1080/17510694.2023.2171336

Chatterjee, J., Spruyt, L., Pirson, N., & Vega, M. T. (2024). Effects of 6DoF Motion on Cybersickness in Interactive Virtual Reality. In L. T. De Paolis, P. Arpaia, & M. Sacco (Eds.), Extended Reality (pp. 21–37). Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-71713-0_2

Cho, J., Kim, Y., Jung, S. H., Shin, H., & Kim, T. (2017). 78-4: Screen Door Effect Mitigation and Its Quantitative Evaluation in VR Display. SID Symposium Digest of Technical Papers, 48(1), 1154–1156. https://doi.org/10.1002/sdtp.11847

Choi, H., & Nam, S. (2022). A Study on Attention Attracting Elements of 360-Degree Videos Based on VR Eye-Tracking System. Multimodal Technologies and Interaction, 6(7), Article 7. https://doi.org/10.3390/mti6070054

Chulhyun, K. (2016, September 30). A Comparative Study for Virtual Reality 360° Contents Shooting Equipments Based on Real World. Journal of Broadcast Engineering. Korea Science. https://koreascience.kr/article/JAKO201631267707579.page

De La Peña, N. (Director). (2012). Hunger in Los Angeles.

Dixon, S. (2007). Virtual Reality: The Search for Immersion. https://www.immersence.com/publications/2007/2007-SDixon.html

Fensome, M., & Kendrick, M. (2024, November). EOS Virtual Reality System White Paper. Canon Europe. https://www.canon-europe.com/virtual-reality/

Gipson, J., Brown, L., Robbins, E., Gomez, J., Anderson, M., Velasquez, J., Ruiz, J., & Cooper, D. (2018). VR story production on Disney animation’s “cycles.” ACM SIGGRAPH 2018 Talks, 1–2.
https://doi.org/10.1145/3214745.3214818

Godard, J.-L. (Director). (2014). Goodbye to Language (Adieu au langage).

Hazelden, A. (2024, November 14). Kartaverse | KartaVR. Kartaverse. https://github.com/kartaverse/Kartaverse-Docs/

He, X., & Liu, Z. (2024). A Novel Way of Estimating a User’s Focus of Attention in a Virtual Environment. ResearchGate. https://doi.org/10.1007/978-3-319-91581-4_6

IEEE. (2022). The Differences between 3DoF and 6DoF, and Why. IEEE Digital Reality. https://digitalreality.ieee.org/publications/degrees-of-freedom

IMAX Corporation. (1999). The 15/70 Filmmaker’s Manual. Scribd. https://www.scribd.com/document/635399609/Untitled

Iñárritu, A. G. (Director). (2017). Carne y Arena.

Knorr, S., Kunter, M., Sikora, T., & Ide, K. (2024). The Avoidance of Visual Discomfort and Basic Rules for Producing “Good 3D” Pictures. ResearchGate. https://doi.org/10.5594/j18236

Knorr, S., Ozcinar, C., Fearghail, C. O., & Smolic, A. (2018). Director’s cut: A combined dataset for visual attention analysis in cinematic VR content. Proceedings of the 15th ACM SIGGRAPH European Conference on Visual Media Production, 1–10. https://doi.org/10.1145/3278471.3278472

Li, J., Reda, A., & Butz, A. (2021). Queasy Rider: How Head Movements Influence Motion Sickness in Passenger Use of Head-Mounted Displays. 13th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, 28–38. https://doi.org/10.1145/3409118.3475137

Lynn, M. H., Luo, G., Tomasi, M., Pundlik, S., & E. Houston, K. (2020). Measuring Virtual Reality Headset Resolution and Field of View: Implications for Vision Care Applications. Optometry and Vision Science, 97(8), 573. https://doi.org/10.1097/OPX.0000000000001541

Manovich, L. (2001). The Language of New Media. MIT Press. https://mitpress.mit.edu/9780262632553/the-language-of-new-media/

Mateer, J. (2017). Directing for Cinematic Virtual Reality: How the traditional film director’s craft applies to immersive environments and notions of presence. Journal of Media Practice, 18(1), 14–25. https://doi.org/10.1080/14682753.2017.1305838

McLuhan, M. (1964). Understanding Media: The Extensions of Man. McGraw-Hill.

Milk, C. (Director). (2015). Clouds over Sidra. United Nations.

MIXED. (2023, November 18). The immersive turnaround: How VR180 3D cameras are conquering the (smartphone) market. MIXED Reality News. https://mixed-news.com/en/immersive-turnaround-vr180-3d-cameras-smartphone-market/

Morana, G. (2024). Impact of Imaging and Distance Perception in VR Immersive Visual Experience. https://doi.org/10.18745/th.27468

Muender, T., Fröhlich, T., & Malaka, R. (2018). Empowering Creative People: Virtual Reality for Previsualization. Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems, 1–6. https://doi.org/10.1145/3170427.3188612

Murray, J. H. (1997). Hamlet on the Holodeck: The Future of Narrative in Cyberspace. MIT Press. https://mitpress.mit.edu/9780262631877/hamlet-on-the-holodeck/

Nielsen, F. (2005). Surround video: A multihead camera approach. The Visual Computer, 21(1), 92–103. https://doi.org/10.1007/s00371-004-0273-z

Park, S.-H. (2018). The Study on the Role of 3D Animated Pre-visualization in VFX Film Production. Cartoon and Animation Studies, 293–319. https://doi.org/10.7230/KOSCAS.2018.51.293

Peddie, J. (2024, October 10). Remember Stereo 3D on the PC? Have You Ever Wondered What Happened to It? ACM SIGGRAPH Blog. https://blog.siggraph.org/2024/10/stereo-3d-pc-history-decline.html/

PHI. (2017).
CARNE y ARENA (Virtually present, Physically invisible) | On Tour. PHI. https://phi.ca/en/carne-y-arena/

Rahman, A. R. (Director). (2022). Le Musk.

Riggs, S. (2019). The End of Storytelling: The Future of Narrative in the Storyplex. https://www.endofstorytelling.com

Rothe, S., Kegeles, B., & Hussmann, H. (2019). Camera Heights in Cinematic Virtual Reality: How Viewers Perceive Mismatches Between Camera and Eye Height. Proceedings of the 2019 ACM International Conference on Interactive Experiences for TV and Online Video, 25–34. https://doi.org/10.1145/3317697.3323362

Schleser, M., Remedios, D. J., Berrett, J., & Mathew, D. J. (2024). Imaginative storytelling – novel immersive production practices and processes for mobile cinematic, interactive 360-degree and real-time VR. Media Practice & Education, 1–19. https://doi.org/10.1080/25741136.2024.2426076

Swanson, M. (2024, March 7). Spatial Video. Mike Swanson’s Blog. https://blog.mikeswanson.com/spatial-video/

Szita, K., & Gander, P. (2021). The Effects of Cinematic Virtual Reality on Viewing Experience and the Recollection of Narrative Elements. PRESENCE: Virtual and Augmented Reality. https://www.academia.edu/107267704/The_Effects_of_Cinematic_Virtual_Reality_on_Viewing_Experience_and_the_Recollection_of_Narrative_Elements

Terzić, K., & Hansard, M. (2016). Methods for reducing visual discomfort in stereoscopic 3D: A review. Signal Processing: Image Communication, 47, 402–416. https://doi.org/10.1016/j.image.2016.08.002

Tong, J., Wilcox, L. M., & Allison, R. S. (2022). The impacts of lens and stereo camera separation on perceived slant in Virtual Reality head-mounted displays. IEEE Transactions on Visualization and Computer Graphics, 28(11), 3759–3766. https://doi.org/10.1109/TVCG.2022.3203098

Tong, L., Lindeman, R. W., & Regenbrecht, H. (2021). Viewer’s Role and Viewer Interaction in Cinematic Virtual Reality. Computers, 10(5), Article 5. https://doi.org/10.3390/computers10050066

Truong, A. (2015, December 28). Virtual reality filmmakers are discovering new things about the power of audio. Quartz. https://qz.com/579185/virtual-reality-filmmakers-are-discovering-new-things-about-the-power-of-audio

Tufte, E. (2001). The Visual Display of Quantitative Information. Edward Tufte. https://www.edwardtufte.com/book/the-visual-display-of-quantitative-information/

Vosmeer, M., & Schouten, B. (2017). Project Orpheus: A Research Study into 360° Cinematic VR. Proceedings of the 2017 ACM International Conference on Interactive Experiences for TV and Online Video, 85–90. https://doi.org/10.1145/3077548.3077559

Wallworth, L. (Director). (2016). Collisions.

Williams, E. R., Love, C., Love, M., & Durado, A. (2021). Cine-VR: A new medium. In Virtual Reality Cinema. Routledge.

Xu, L., Agrawal, V., Laney, W., Garcia, T., Bansal, A., Kim, C., Rota Bulò, S., Porzi, L., Kontschieder, P., Božič, A., Lin, D., Zollhöfer, M., & Richardt, C. (2023). VR-NeRF: High-Fidelity Virtualized Walkable Spaces. SIGGRAPH Asia 2023 Conference Papers, 1–12. https://doi.org/10.1145/3610548.3618139

Zhang, Y., & Weber, I. (2023). Adapting, modifying and applying cinematography and editing concepts and techniques to cinematic virtual reality film production. Media International Australia, 186(1), 115–135. https://doi.org/10.1177/1329878X211018476