Creating a Replicable Methodology for Categorizing Spatial Editing Techniques

ABSTRACT

Studies have shown that cinematic filming techniques affect a viewer’s spatial understanding [1]. Off screen, misaligned spatial reference frames can increase processing demands [2]. The goal of this study was to create a replicable method of categorizing spatial cues in movies. The method can later be used to investigate whether misaligned spatial cues impact viewer processing demands. The concept of misaligned spatial cues is referred to as “conflict.” Movie-specific cues were established to further an investigation of conflict: screen-side cues, gaze-direction cues, and cues arising from an intrinsic understanding of characters’ locations in a scene. Conflict was compared across two movies, The Social Network (TSN) and The Truman Show (TTS). It was predicted that a replicable methodology could be made to categorize spatial cues and the resulting conflict, and that conflict values would be higher in TTS due to its notably confusing filming. After a standard operating procedure (SOP) was drafted, two separate raters used the protocol to analyze a section of shots from TTS. The rater agreements were close to 1, demonstrating that the coding method was replicable [3]. After an equation for conflict was created and applied to both movies, TTS was found to have significantly more high-conflict shots (p < 0.01) than TSN. This research contributes to an area of study that is currently lacking: how the brain processes space in movies. Future work can investigate which parts of the brain respond to different levels of conflict, and how they respond, which would chart a course toward understanding more about humans overall.

INTRODUCTION.

The overarching goal of this research is to understand how humans perceive space in the two-dimensional world of movies. This research establishes the concept of “conflict” to measure the degree to which contextual space in a scene makes sense to a viewer. Measures of conflict require scenes with human interaction, so only shot-reverse-shots (SRSs) were used to calculate conflict. SRSs are primarily conversational scenes wherein the camera cuts between two or more target characters. This is a preliminary study setting up the idea of conflict. Future studies may use this method to investigate how varying levels of conflict affect a viewer’s perception of space and brain activity.

Prior research has established that watching a movie excites brain activity. Previous research discovered that different filming sets can aid, or be a hindrance to, spatial understanding [1]. A scene shot with a single camera that follows characters around (constrained-view set) presents space with more clarity than does a multi-view set shot with multiple cameras. Research has also found that violations of the 180-degree rule cause viewers to have a subconscious distaste for the movie [4]. The 180-degree rule asks that once a plane of filming has been established, it does not change within a scene [5] (Figure 1). Research conducted by a thesis student in the Levin lab showed that left-to-right (L-R) SRSs cause lateralization in viewers’ brains, whereas right-to-left (R-L) SRSs do not [6]. He categorized pairs of shots as R-L or L-R, labeling the first frame after the cut in The Truman Show.

**Figure 1.** This image represents the 180-degree rule, which states that once a plane of filming is established, it should not change within a scene. The dotted line demarcates the plane that the filming angle should not pass.

In the real world, people perceive space egocentrically. From a young age, people are aware of space and their place in it [7]. Humans understand their physical surroundings and their position in relation to them by building a spatial relational system between and from reference objects [2]. Reference objects are things one knows the location of, such as walls, ceiling, and floor. This understanding gives rise to spatial reference frames, which can center around different planes. When thinking in an egocentric frame of reference, one describes an object’s location in relation to oneself [4]. For example, the girl in Figure 2 is standing on the floor, next to a couch, in front of a TV. Her understanding of space is based on the things around her. This is an egocentric understanding of space: understanding space in relation to oneself.

**Figure 2.** This girl stands on the floor, a sofa is next to her, a ceiling is above her. She can use an egocentric, or self-oriented, frame of reference to understand where she and other things are in space. Photo courtesy of Canva, 2013.

Movies require an intrinsic frame of reference, which comes from relating objects to objects to make sense of space. For example, to understand where a city is on a map, one must relate state positions and borders to each other.

Research has found that misaligned spatial reference frames in the real world make it harder for people to understand space [2]. In that study, participants viewed a room with target objects placed around it. After exiting the room, they were asked to visualize the location of one object from the perspective of another object in the room. The research found that when the perspective from which they initially viewed the room was parallel to their new imagined view, participants were much better at locating the other object than when they were asked to imagine the room from a non-parallel view. In this case, when their views were parallel, their egocentric and intrinsic understandings of the space were in alignment, and they understood the space better. This finding can also be applied to movies, because humans understand space on the screen intrinsically and space in the world around them egocentrically.

The concept of conflict stems from a misalignment between egocentric and intrinsic understanding. A person watching TV is aware of her off-screen location and space egocentrically. But to make sense of the on-screen space, she must intrinsically relate the positions of people and objects to each other. These egocentric and intrinsic spatial reference frames exist and are in use at the same time. When watching a movie, for example, a person may know that character 1 stands on the left of character 2, information gathered intrinsically. What if the location of character 1 on the TV screen in the next sequence of shots does not match the location of character 1 in relation to character 2? Character 1 might then appear on the right of the TV screen, which clashes with the understanding that they are to the left of character 2, thus creating conflict (Figure 3). Likewise, character 1, standing on the left of character 2, should be looking to their right (Figure 3). This research builds upon this concept, creating three spatial cues: screen side, gaze direction, and location in the scene. Screen side denotes where a character is on the TV screen and comes from an egocentric understanding. Gaze direction denotes the direction in which a character’s gaze points when looking at the other target, and was also measured egocentrically. Location in the scene is contextual and intrinsic. One knows from cues that two characters stand directly across from each other, even if one stands screen side left and the other screen side right. These three cues, along with the presence of a non-target character in a shot, work together to build the idea of “conflict” as presented in this study.

**Figure 3.** This represents a TV screen, where the character in the purple shirt is in shot one, and the woman in the pink shirt is in shot two. It is clear to viewers that character 1 (in other words, purple shirt) is on the left of character 2 (pink shirt). However, character 1 stands on the right of the screen, creating conflict. A further measure of conflict is eye-gaze. Character 1 is looking right at character 2, who is looking left. This aligns with what viewers see on screen, meaning there is no conflict.

Conflict is a relatively new concept and has never been applied to movies before. In this research, the main objective was to see if conflict was a concept that could be analyzed categorically. The goal was to create a method to calculate a “conflict value.” We then compared values in two movies: The Truman Show and The Social Network. These two movies are human-centric, meaning that the information believed helpful for calculating conflict, such as gaze direction and screen side, is present. The Social Network has also received numerous awards for its editing clarity, such as the Academy Award for Best Film Editing and the British Academy of Film and Television Arts (BAFTA) award for Best Editing. The Truman Show, on the other hand, was intentionally confusing [8] and did not receive any awards for its editing. Therefore, comparing conflict across the two movies could prove insightful. Higher conflict in The Truman Show would support the idea that the conflict measure captures something applicable to the real world.

MATERIALS AND METHODS.

Stimuli: The Truman Show and The Social Network in DaVinci Resolve.

Past research investigated cinematic techniques through two types of SRSs: left to right (L-R) and right to left (R-L). This research used more categories for SRSs: left to middle (L-M), middle to left (M-L), right to middle (R-M), middle to right (M-R), right to right (R-R), left to left (L-L), and middle to middle (M-M). Conflict is an idea that has, in past research, been applied to real-world spatial processing. Gaze direction and screen side were imperative, but other information was also coded, such as whether the scene was indoor or outdoor, the extent to which the other target’s body was present in a scene (other-visible; Figure 4), and whether non-target characters were visible in the background. In the end, only gaze direction, screen side, and other-visible were used to calculate conflict.

**Figure 4.** This image, a shot from The Truman Show, represents the category other-visible. In the corner there is a hand, which acts as a spatial cue to viewers.

Coding Screen Side.

Targets were coded separately by shot for the side of the screen they were on. Using DaVinci Resolve, the screen was split by 5 lines. Each line represented a category for screen side (Figure 5): (1) far left, (2) center left, (3) center, (4) center right, and (5) far right. If the body was not centered on any line, the space between lines was used to categorize screen side: the space from the left of the screen to line 2 was far left, the space from line 1 to line 3 was center left, and so on.

**Figure 5.** Depicts the methodology used to code screen side. Truman was centered on line 3, so this shot was categorized as screen side middle.

Coding Gaze Direction.

Gaze direction was categorized as (1) clearly left, (2) subtly left, (3) center, (4) subtly right, or (5) clearly right. The direction a character’s eyes were pointed on the screen, even if the person they were looking at was on the right, was used to find gaze direction (Figure 6).

**Figure 6.** In this scene from The Truman Show, Truman’s eyes (on the left) are directed to the right of the screen, and the nurses’ eyes (on the right) are directed to the left.

Coding Other-Visible.

It was noted whether the back of the head, shoulders, etc. of a target were visible in a shot where they were not the main target (Figure 4). Other-visible was coded on a 0 to 2 scale: 0 means the other target is not visible at all, 1 means slightly visible, and 2 means mostly or fully visible.

Calculating Conflict.

A perfect scene, with zero conflict, would have the two characters’ gaze directions as opposite as possible and their screen sides as opposite as possible. A character standing on the left looking right at a character standing on the right, who is looking left back at them, would create the least conflict possible (Figure 7). The greater the value of other-visible, the less conflict, as other-visible is a very clear cue as to where the targets are in relation to each other.

**Figure 7.** This depicts a scene with zero conflict, as screen side and gaze direction of the two characters are as opposite as possible. Credits: Canva

Therefore, the equation for conflict is: Conflict = |(Gaze direction B – Gaze direction A) + (Screen side B – Screen side A)| – (Other visible B)
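
The equation above can be sketched in Python as follows. This is a minimal illustration rather than the code used in the study: the function name `conflict_value`, its parameter names, and the example codes are hypothetical, and the sketch assumes that gaze direction and screen side are entered as the 1 to 5 codes and other-visible as the 0 to 2 code described in the coding sections.

```python
def conflict_value(gaze_a, gaze_b, side_a, side_b, other_visible_b):
    """Conflict = |(gaze B - gaze A) + (side B - side A)| - other-visible B.

    Assumes gaze and screen-side codes run from 1 to 5 and
    other-visible runs from 0 to 2, as in the coding sections.
    """
    for name, code in (("gaze_a", gaze_a), ("gaze_b", gaze_b),
                       ("side_a", side_a), ("side_b", side_b)):
        if code not in range(1, 6):
            raise ValueError(f"{name} must be a code from 1 to 5, got {code}")
    if other_visible_b not in range(0, 3):
        raise ValueError("other_visible_b must be 0, 1, or 2")
    return abs((gaze_b - gaze_a) + (side_b - side_a)) - other_visible_b


# Character A: far left (1), looking clearly right (5).
# Character B: far right (5), looking clearly left (1), other target not visible.
print(conflict_value(gaze_a=5, gaze_b=1, side_a=1, side_b=5, other_visible_b=0))  # 0

# Same positions, but B looks subtly right (4), away from A.
print(conflict_value(gaze_a=5, gaze_b=4, side_a=1, side_b=5, other_visible_b=0))  # 3
```

In the first call the gaze difference (-4) and the screen-side difference (+4) cancel, which matches the zero-conflict scene described above. In the second call the gaze cue no longer offsets the screen-side cue, so the value rises.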

When applied, the conflict equation resulted in a scale of -1 to 5, where -1 is the lowest conflict and 5 is the highest conflict. To find the differences in conflict across the two movies, a two-tailed t-test was conducted in JASP.
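
For readers without JASP, the comparison of two sets of conflict values can be sketched in Python. The text does not state which variant of the t-test was run, so this sketch assumes an independent-samples Student's t-test with pooled variance. The function name `student_t` and the sample values are hypothetical and are not data from the study. Only the t statistic and degrees of freedom are computed; a two-tailed p-value would additionally require the t distribution.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two values")
    df = n_a + n_b - 2
    # Pooled estimate of the common variance of the two groups.
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    if pooled == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical conflict values for a handful of shots from two movies.
movie_1 = [1, 2, 3]
movie_2 = [2, 3, 4]
t, df = student_t(movie_1, movie_2)
print(round(t, 3), df)
```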

Inter-rater Reliability.

To check that the protocol was reliable, two raters used the protocol for 50 random frames. After the first of these raters (rater 2) used the protocol, Cohen’s kappa was computed in JASP against the original coding (rater 1), and changes were made to the protocol to reduce disagreements for the next rater. The difficulty arose with coding screen side when a character’s face or body was squarely in between two lines. The protocol was changed so that, in such cases, the five lines are replaced with a single line down the middle and the rater determines whether the character is on the line, to the left of the line, or to the right of the line. The protocol was also edited and polished to be more direct. The second of these raters (rater 3) then used the revised protocol, and Cohen’s kappa was rerun, comparing rater 3’s scores to both the original coding (rater 1) and rater 2.
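
The agreement statistic can be sketched in Python as follows. This is a minimal illustration of unweighted Cohen's kappa for two raters, not a reproduction of the JASP output; whether the study used an unweighted or weighted kappa is not stated, so the unweighted form is an assumption. The function name `cohens_kappa` and the example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_1, ratings_2):
    """Unweighted Cohen's kappa for two raters coding the same items."""
    if len(ratings_1) != len(ratings_2) or not ratings_1:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(ratings_1)
    # Observed agreement: share of items given the same code by both raters.
    observed = sum(a == b for a, b in zip(ratings_1, ratings_2)) / n
    # Expected chance agreement from each rater's category frequencies.
    counts_1, counts_2 = Counter(ratings_1), Counter(ratings_2)
    expected = sum(counts_1[c] * counts_2[c] for c in counts_1) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical screen-side codes (1 to 5) from two raters for eight frames.
rater_a = [1, 1, 2, 2, 3, 3, 4, 5]
rater_b = [1, 1, 2, 3, 3, 3, 4, 5]
print(round(cohens_kappa(rater_a, rater_b), 2))  # 0.84
```

Here the raters agree on 7 of 8 frames (0.875 observed agreement) and chance agreement is 14/64, giving a kappa of 0.84.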

RESULTS.

Inter-rater Reliability.

The lowest agreement score was between rater 1 (the original coder) and rater 2. When comparing rater 2 to rater 3 and rater 3 to rater 1, the agreement scores were above 0.8 (Figure 8). Past research has shown that agreement scores above 0.8 indicate near perfect agreement [3].

Shot-reverse-shot frequencies.

L-R and R-L shots were the most common types in both movies. Other, more peculiar shots, like M-R (middle to right, coded by screen side), were rare (Figure 9).

**Figure 9.** Comparing Shot-Reverse-Shots Across Two Movies. The frequencies of each type of SRS were compared between both movies. The y-axis is the percent of each type out of the total SRSs.
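
The percentages on the y-axis of Figure 9 can be computed with a short tally. The sketch below is illustrative only: the function name `srs_percentages` and the list of labels are hypothetical, and the sketch assumes each SRS has already been given one label such as L-R or M-R.

```python
from collections import Counter


def srs_percentages(srs_labels):
    """Return each SRS type's share of all SRSs, as a percentage."""
    counts = Counter(srs_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100 * count / total for label, count in counts.items()}


# Hypothetical labels for four shot-reverse-shots from one movie.
labels = ["L-R", "R-L", "L-R", "M-R"]
print(srs_percentages(labels))  # {'L-R': 50.0, 'R-L': 25.0, 'M-R': 25.0}
```

Running the tally separately for each movie gives values that can be compared side by side, since both are expressed as a share of that movie's own SRS total.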

Relating Shot-type to Conflict value.

For this assessment, low conflict was defined as values less than 3, and high conflict as greater than 3. This gives us a baseline assessment of which levels of conflict these shot types are associated with. L-R and R-L are the most common SRSs in both movies and are also related to lower amounts of conflict (Figure 10).

**Figure 10.** Comparing Frequencies of High and Low Conflict with L-R and R-L Shots. This graph shows the relationship between L-R/R-L SRSs and high/low conflict. The y-axis represents the total number of shot-types with a certain conflict value out of all L-R or R-L shots. There are substantially more R-L and L-R low conflict shots.

Comparing Conflict Across The Truman Show and The Social Network.

The mean conflict value was 2.199 for The Social Network and 2.536 for The Truman Show. The difference between these values was statistically significant (t(df) = 0.23, p<0.001) (Figure 11).

**Figure 11.** Comparing the Distributions of Conflict Across Two Movies. This graph compares the levels of conflict across the two movies. In both movies, conflict values center around the 1, 2, 3 range. There are some discrepancies though, as The Truman Show centers more around 2, 3, and 4, whereas The Social Network centers more around 1, 2 and 3.

CONCLUSION.

This research developed a coding scheme that can be employed to identify and categorize various features in movies, applied it to two different movies, and compared the results. The two movies, different in both purpose and filming technique, have vastly different shot-reverse-shot categorizations. When the definition of conflict is applied to both movies, The Truman Show has a significantly higher average conflict value (and thus a greater potential to increase processing load).

As the coding scheme was modified to be clearer, inter-rater reliability got closer to 1, indicating near perfect agreement. The coding scheme, therefore, can be applied to other areas and studies.

Across both The Social Network and The Truman Show, shot-reverse-shots were primarily L-R and R-L. This makes sense, as L-R and R-L are the most conducive to a low-conflict scene and are more visually appealing. For example, an M-L shot, where one character stands in the middle and another stands to the left, is susceptible to breaking more rules and is less appealing. TTS has fewer L-R and R-L shots and more conflict, suggesting that L-R and R-L shots correspond to lower conflict levels. Although both movies have received numerous awards and plaudits, The Social Network has received more recognition for its editing and film technique, whereas The Truman Show has received more for its theme and purpose. These reviews and critiques make the work of this research more tangible, suggesting that the data respond to the same things movie watchers do.

In the future, this research can be employed in many different areas, such as psychology, cognitive science or movie production. It can serve as a foundation for research that will delve deeper into why certain parts of the brain respond to stimuli, such as high or low conflict, and why very specific filming techniques “work” and others do not. There are many different styles and forms of cinematic filming and directing. This research brings to light these differences through substantiated data. Future research could explore more ways movies can be analyzed to further understand cinema concretely.

In conclusion, our study showed that movies can be analyzed through cinematic techniques and other cues to understand the portrayal of space. Although work needs to be done to further advance this research, it has built a strong foundation for future research.

ACKNOWLEDGMENTS

Thank you to the Levin Lab, for supplying me the computers and desk space I needed to complete this research. Especially to Dr. Levin and Kathryn Sam, who guided me through this project and truly looked out for me. Finally, to the SSMV, for making this research possible, and a special thanks to Dr. Deweese, my advisor, for helping me through all the ups and downs.

REFERENCES

[1] D. T. Levin, Spatial Representations of the Sets of Familiar and Unfamiliar Television Programs. Media Psychology, 13, 54-76 (2010).

[2] A. L. Shelton, T. P. McNamara, Systems of Spatial Reference in Human Memory. Cognitive Psychology, 43, 274-310 (2001).

[3] M. L. McHugh, Interrater reliability: the kappa statistic. Biochemia Medica, 3, 276-282 (2012).

[4] A. C. Şimşek, T. Aydin, E. A. Demirgüneş, P. S. Şafak, Space in Movies: Continuity and Perceptual Load Guide Spatial Judgements. Art & Perception, 13, 1-36 (2025).

[5] G. V. Kachkovski, D. Vasilyev, M. Kuk, A. Kingstone, C. N. H. Street, Exploring the Effects of Violating the 180-Degree Rule on Film Viewing Preferences. Communication Research, 46, 948-964 (2019).

[6] Z. Xing, Exploring the Effects of Editing Techniques through Event-related Potential, Ph.D. Dissertation thesis, Graduate School of Vanderbilt University, Nashville, TN (2025).

[7] G. Butterworth, N. Jarrett, What minds have in common is space: Spatial mechanisms serving joint visual attention in infancy. British Journal of Developmental Psychology, 9, 55-72 (1991).

[8] R. Eric, Inside the Cinematography of The Truman Show. The American Society of Cinematographers, AC June, (1998).

Posted by buchanle on Friday, May 15, 2026 in May 2026.

Tags: Conflict, Lateralization, Shot-Reverse-Shots, Spatial Cues, Spatial Reference Frames