Understanding Stereoscopy and the Cardboard Effect

Examination of influence quantities on the pictorial design of stereoscopic productions – an analysis of the impact of various parameters on the depth effect, based on stereophotography.

In collaboration with my project partner Stephan Sobiesinsky.
The complete, very detailed Bachelor thesis (in German) can be purchased via Stephan or me. Please contact either of us for further information.

Photography is not limited to reproducing lines and surfaces. It has found its complement in the stereoscope, which gives to a design the most irresistible appearance of relief and roundness, insomuch, that nature is no longer content to reproduce her superficially, she gives, in addition, the complete idea of projections and contours; she is not merely a painter but a sculptor.

(M. A. Belloc, 1858)

Stereo productions aim to make photographed (or recorded) objects appear three-dimensional to the viewer and create a sense of depth. Impressive effects can be accomplished nowadays, and it seems that modern stereoscopy has reached a level which imitates the human vision almost perfectly.

The first effect of looking at a good photograph through the stereoscope is a surprise such as no painting ever produced. The mind feels its way into the very depths of the picture.

(Oliver Wendell Holmes, 1864)

Human Perception | Depth Cues and 3D Vision

Criteria which provide information about the depth separation of objects are split into monocular and binocular depth cues – that is, cues perceived with one eye alone and cues requiring both eyes.

Monocular Depth Cues

Monocular Depth Cues – examples

Various depth cues giving evidence about the position of objects in 3D space work monocularly. Information about the layout of a room can be extracted even though the image appears to be ‘flat’. Those cues can be divided into different categories:

  • Masking of Objects
  • Size of Objects
  • Perspective Distortion
  • Relative Height in Visual Field
  • Atmospheric Perspective
  • Color Perspective
  • Depth of Field
  • Light and Shade

Binocular Depth Cues

Monocular depth cues enable us to place objects in space. Nevertheless we only gain and perceive the full spatial impression via binocular vision, which is the basis of stereoscopy.

Parallax and Retinal Disparity

The mind perceives an object of three dimensions by means of the two dissimilar pictures projected by it on the two retinae.

(Charles Wheatstone, 1838)

Because the human eyes are horizontally offset, two slightly different images reach the brain via the optic nerves, where they are fused into a single three-dimensional image.

A person fixates point A. Due to the convergence of the eyes this point is projected onto the area of highest visual acuity of the retina. Point B lies at a greater distance and, for the right eye's view, directly behind point A. It is therefore projected onto the same area as point A; in the left eye, point B's projection onto the retina falls to the right of point A's. The left eye thus perceives point B in a different, non-corresponding area. The distance point B appears to travel between B-left and B-right is defined as 'parallax', the corresponding angle as the 'parallactic angle'. 'Retinal disparity' is the difference in the distance between two corresponding images on the retinas.

The retinal disparity depends on the distance to the viewer's focus area. When a person fixates an object, it is projected onto corresponding areas of the retina. All points that are projected onto corresponding areas at the same time lie on a theoretical 'horopter' (a circle lying in a horizontal plane; the difference in the distance between two points on the retina = 0).
The horopter is defined by the points which are projected onto the retina under the same angle. Fixating a different point under a different angle creates a new circle. The empirically measured horopter, however, is much less curved.


The area around the horopter is called "Panum's fusional area". Within this area the brain can fuse disparate retinal information into a single image. Outside Panum's fusional area the image breaks apart into two separate single images. Nevertheless, because the viewer focuses on one point and the fragmented image of the recessive eye is suppressed, those double images are perceived as very blurred or go entirely unnoticed.

Vergences and Accommodation (eye)

Accommodation (adjustment of lens curvature) is effective in a range up to roughly seven metres. When a person focuses on an object several metres away, light rays hit the flattened lens almost in parallel. If the object moves closer to the viewer, the lens becomes more convex and refracts the light more strongly in order to produce a sharp image.
Accommodation is accompanied by convergence of the eyes. Convergence depends on the distance between viewer and object and is most effective at short distances, as the eyes have to converge strongly for close objects while they are almost parallel for objects at greater distances.
A different vergence movement is 'divergence', which is only possible to a slight extent: instead of moving inwards (convergence), the eyes move outwards, beyond the parallel position.

Binocular masking

Monocular masking is one of the most important depth cues. In binocular masking, two images containing monocular masking cues are compared: due to the parallactic shift (each eye perceiving a slightly different image), one eye (e.g. the right) registers overlapping objects while the other eye may receive more information about the rear object. This helps the brain calculate the position of the objects relative to one another, especially at short distances.

Effectiveness of Depth Cues

The differences between the images our two eyes perceive decrease rapidly with distance:

Stereoscopic accuracy decreases with distance, to a point where it’s useless, somewhere around the 150-yard mark

(Mendiburu, 2009)

This is equivalent to approximately 140 metres. Monocular depth cues are therefore – and because they are so multifarious – generally more important and effective in daily life.
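The falloff can be illustrated with simple geometry – a sketch, assuming the average interocular distance of 63 mm and treating the parallactic angle as the angle the eye base subtends at the fixated point:

```python
import math

def parallactic_angle_deg(distance_m, interocular_m=0.063):
    """Angle under which the two eyes 'see' a point at the given distance."""
    return math.degrees(2 * math.atan(interocular_m / (2 * distance_m)))

# The angle shrinks rapidly with distance; around 140 m it is only
# a few hundredths of a degree.
for d in (1, 10, 140):
    print(f"{d:>4} m: {parallactic_angle_deg(d):.4f} deg")
```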

Recording Procedures

There are various methods to capture and playback stereo productions.
Recording procedures are usually divided into 'monocular' and 'binocular' recording; playback options into 'unaided stereo vision', 'passive' and 'active' systems, and 'auto-stereoscopy'.
The following paragraphs address all of them briefly.

Binocular Recording

The binocular method uses two lenses (either in one or two bodies) and captures the images simultaneously.

Two monocular cameras

Two identical cameras are rigged next to each other, e.g. using a stereo slide (the distance between the two camera positions is defined as the 'stereo basis'). The cameras can either be triggered one after another (for static scenes) or synchronised using a controller or special cables.
A big disadvantage of this method is that the stereo basis cannot be chosen freely (it cannot be reduced below a certain point).

Stereo cameras

A stereo camera has two lenses in one body. Exposure and focus settings are automatically linked and both images are taken simultaneously.
Disadvantage: the stereo basis is usually not alterable.

Monocular Recording

Only one camera is used by the monocular recording technique.

Successive recording

When using successive recording, a single camera is moved after taking the first picture in order to record the second image. The big advantage is that any camera can be used and any stereo basis can be chosen. However, only static scenes can be captured. The use of a stereo slide is recommended.

Stereo lens

The stereo lens (a prime lens) contains multiple mirrors or two lenses and captures both images at the same time. The advantage of this technique is that moving objects can be captured and the lens is relatively affordable. However, the stereo basis cannot be changed, the images are restricted to landscape format, and the aperture settings are very limited.

Playback Methods

Unaided Stereo Vision

No tools are needed to view stereo productions with this method. In order to perceive a three-dimensional image, however, you will need to practise separating convergence and accommodation of the eyes (for parallel and cross-eyed viewing).

Parallel viewing

Two stereo images are placed next to each other. For this version of unaided stereo vision the viewer fixates a point at great distance in order to reach a convergence angle of 0 (parallel view) – the left eye fixates the left image, the right eye the right image. The viewer then tries to overlap both blurry images by focusing onto the image plane without changing the convergence angle (separation of convergence and accommodation). This kind of stereo vision requires some practice, and not everyone will succeed.

Cross eyed viewing

Cross-eyed viewing is very similar to the parallel viewing technique. However, the images for the right and left eye are interchanged (left image: right eye; right image: left eye) – the convergence point lies in front of the images. In order to see a three-dimensional picture, the eyes have to focus onto the image plane without changing the convergence angle.

Animated flip images

Another technique for viewing stereo images is animated flip images. Both images are displayed in quick succession (usually as an animated GIF), which creates an impression of depth – a few examples can be viewed on Jim Gasperini’s webpage.

Passive Systems


Anaglyph

In the anaglyph viewing method the images for the right and the left eye are coloured using two complementary colours (most common combination: red-cyan) and overlapped. In order to receive a three-dimensional impression, a pair of glasses with the same colour combination is used to filter and separate the images for the left and the right eye. The red image information passes through the red part of the glasses, while the cyan elements are blocked and appear black through the red glass.
We differentiate between real anaglyphs, grey anaglyphs and coloured anaglyphs.

Polarisation techniques

The polarisation technique uses different polarisation properties of light to separate the left from the right image. The image is projected using two projectors, each polarising its image differently. The left part of the corresponding glasses lets only the left image pass through, the right part only the right image.
We differentiate between linear and circular polarisation.

KMQ procedure

Similar to the unaided viewing procedures (parallel/cross-eyed viewing), the images are displayed next to each other or one above the other. Special glasses are used to view the 3D image: prisms help to visually overlap the two images and direct the left image to the left and the right image to the right eye.

Interference filter technology

The interference filter technology uses slightly shifted wavelengths of red, green and blue for the right and left eye. Specific glasses (which are rather expensive to produce) filter the corresponding wavelengths for each eye and thereby produce a 3D image.

Active Systems

Shutter systems

This technique uses 'active' glasses which present the image meant for the left eye by blackening the right eye's view, and the right-eye image by blackening the left part of the glasses. This alternation is repeated extremely rapidly (usually >100 Hz) and goes unperceived by the viewer's eyes. The result is the illusion of a single 3D image.


Autostereoscopy

Also known as “glassesless 3D” or “glasses-free 3D”, autostereoscopy aims to create a convincing 3D image without the use of glasses. The two main approaches to realising this vision are the parallax barrier (image left) and lenticular arrays.

Basics and parameters of stereophotography

Stereoscopy aims for a representation of the real world that is as realistic and three-dimensional as possible. It is, however, much more than capturing two slightly different images and overlapping them in post-production. There are a few characteristics and constraints to bear in mind in order to create a convincing image: while shallow depth of field seems to create more depth in 2D photography, it can be perceived as intrusive in stereo productions, and close-ups can be troublesome because objects are cut off at the edge of the screen (cf. stereo window).

Convergence and Accommodation (eye) in Stereo Productions


As mentioned before, convergence and accommodation of the eyes need to be separated in order to perceive two overlapping images as a three-dimensional impression. This segregation of ocular movement and lens curvature is unnatural for humans and usually does not happen in the real world. It is, however, crucial for viewing stereoscopic images – in contrast to the real world, the actual distance to an object (the distance to the screen/image) does not (always) match the distance the viewer perceives. Objects can be perceived as 'coming out' of the image or 'lying behind' it, while the image information for all objects lies on the same image plane.
When stereo images are projected, they overlap – the bigger the discrepancy between corresponding points of the left and right image (called 'screen parallax' or 'pixel parallax'), the further the 3D impression appears from the actual image plane. If there is no pixel parallax between two points, the object lies, and is perceived, on the image/screen plane.
In summary: the viewer's eyes accommodate ('focus') on the screen plane while they converge onto a point either in front of, on, or behind the image plane.

Based on Lipton, there are four different states of convergence:
1. Crossed screen parallax: objects in front of the screen; points for the right eye are displayed to the left of the corresponding points for the left eye
2. Zero/positive screen parallax: objects on the screen plane (or behind it); points for left and right eye have no screen parallax, or points for the right eye are displayed to the right of the corresponding points for the left eye
3. Infinity parallax: objects far behind the screen; if the eyes are fixated on a point at infinity behind the screen, the lines of sight are parallel
4. Divergent parallax: objects far behind the screen causing divergence; the discrepancy between left and right image is too big to be fused into a 3D image, so the eyes diverge outwards, causing exhaustion


Fusion range and image decay

As mentioned before, within Panum's fusional area humans are able to fuse disparate retinal information into a single image. Objects located too far from the fixation point, outside Panum's area, would therefore create the impression of two completely separate images (image decay). In real life those double images are suppressed – the information perceived by the recessive eye is restrained and the objects appear blurry.

This suppression mechanism does not work well for stereo productions: accommodation and convergence are decoupled while viewing stereo images, so even huge offsets appear crisp and clear. This causes fusion problems for the viewer – the stereo image fragments and disorientation occurs.
In order to prevent image decay it is important to limit the offset to a maximal suitable value.
The so-called '(screen) deviation' is defined as the difference between the biggest and smallest parallax offset on screen. It depends on the maximum parallax angle α (the difference between α-near and α-distance), the screen width, and the ratio of screen width to the viewer's distance from the screen (cf. fig. right).
Rearranging these dependencies gives: the ratio of acceptable screen deviation to screen width = 1/30.
The 1/30 rule is valid until the screen parallax exceeds a value of 63 mm – the average distance between human eyes. Any larger value would cause painful divergence of the eyes and is inadvisable.
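The 1/30 rule together with the 63 mm cap can be expressed as a one-liner – a sketch of the rule as stated above, with widths in millimetres:

```python
def max_screen_deviation_mm(screen_width_mm):
    """Acceptable screen deviation: 1/30 of the screen width,
    capped at the average interocular distance of 63 mm."""
    return min(screen_width_mm / 30, 63.0)

print(max_screen_deviation_mm(1500))   # 50.0 – the 1/30 rule applies
print(max_screen_deviation_mm(10000))  # 63.0 – capped at the eye distance
```

The cap takes over at exactly 1.89 m of screen width (1890 / 30 = 63), which matches the screen-width limit discussed in the playback chapter.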

Stereo window

In stereo productions the screen is not only the projection plane but also limits the virtual 3D world. For objects seen on or behind the image plane, the borders of the screen restrict the view like a window, which appears very natural to the viewer. The area of negative parallax, however, is different: objects appearing in front of the screen can be cut off by the stereo window, which lies at a greater distance. This creates a conflict for the viewer's brain: the masking of the object indicates a position behind the stereo window, but the negative parallax and the three-dimensional illusion imply that the object is positioned in front of it.

It is advisable to properly frame objects which should appear in front of the stereo window while shooting, i.e. objects in front of the stereo window should not touch the picture margin. Nevertheless it is possible to fix objects violating the stereo window in post-production. By masking out the redundant image information (cf. fig. left: masking of the sphere in the left image in order to create the same clipping as in the right image), a 'floating stereo window' is created; the object is then coplanar with this virtual screen.
Other problems can occur, e.g. when an object in front of the screen is cut off at the top and/or bottom – while it generally seems to float in front of the screen, the top and/or bottom part of the object is forced onto the projection plane at greater distance (since it seems to lie behind the window). The image will appear warped to the viewer's eye.

Cardboard effect and its influencing factors

Stereoscopy is the attempt to reproduce the impression of the 'real world' – yet it follows its own rules, has its limits, and has certain characteristics which do not exist in natural vision. By their nature, stereo productions need two offset images, which can contain brightness, contrast, colour or size mismatches and lead to disturbing effects while viewing. These differences can be adjusted in post-production, whereas the so-called 'cardboard effect' can be found in many productions and is hard to fix.
The cardboard effect makes objects in a scene look like cardboard stand-ups in a three-dimensional space – they lack plasticity.

The cardboard effect makes 3-D images look layered, i.e., consisting of flat objects and flat background though an observer can grasp the situation before and behind the target.

Okano, Okui and Yamanoue (2006)

Examples for the Cardboard Effect

Examples of films where the cardboard effect occurs (occasionally) include several famous productions, such as:

  • Avatar (2009)
  • StreetDance 3D (2010)
  • Pirates of the Caribbean: On Stranger Tides (2011)
  • Priest (2011)

The examples above include films shot entirely in stereo, films combining CG with live-action footage, and films converted after being shot (traditionally, in 2D). It can therefore be assumed that various parameters can generate a cardboard-like effect.

Even if the described effect is probably less disturbing than ghosting or image decay, it does diminish the spatial impression. The stereo image thereby forfeits, to a certain extent, its fascination and its very purpose: creating the illusion of a realistic impression.

Stereoscopic plasticity

The so-called 'stereoscopic plasticity' (SP) is a value describing the ratio of virtually perceived to actual depth of an object. If SP = 1, the object is presented in its proper proportions. SP > 1 makes objects appear stretched, and the cardboard effect occurs when the stereoscopic plasticity approaches zero (SP → 0).
The SP depends on different factors, which can be grouped into 'recording', 'post-production' and 'playback' and will be discussed later on.
The factors include the stereo basis (recording) and the interocular distance, the distance between camera and subject during recording and between viewer and screen during playback, and the focal length during recording.

Note: I will not go into greater detail or the mathematical principles in this article; I refer to our detailed Bachelor thesis (German), which can be purchased through Stephan or me.

Influencing factors during recording

Stereo basis

Stereo photography tries to echo natural vision, the recording cameras being the equivalent of human eyes. The distance between the lenses of those cameras is called the 'stereo basis', and it suggests itself that this distance should equal the average interocular distance (63 mm). In some settings, however, bigger or smaller values are used, which in turn influences the stereoscopic plasticity (cf. fig. right) – the bigger the stereo basis, the bigger the visual angle under which the depth of an object is perceived. Furthermore, the visual angle gets smaller the farther away an object is – objects at a greater distance will appear less three-dimensional (cf. chapter ❯ Human Perception – Effectiveness of Depth Cues).
An extreme increase of the stereo basis can be useful to capture plasticity, e.g. for faraway scenery (to prevent image decay, no objects should be in the foreground). If the ratio of stereo basis to interocular distance exceeds 1 by far (hyper-stereoscopy), the so-called miniaturisation effect occurs. Objects appear like miniatures because the human interocular distance stays fixed at 63 mm: their plasticity gives the viewer the impression that the objects must have been close to the camera (and therefore quite small) during recording.

  • The bigger the stereo basis, the more three-dimensional an object appears.

Camera-to-subject distance

The most important quantities for understanding how this parameter affects the stereoscopic plasticity (SP) are the 'near-point distance' (NP; distance between camera and the nearest point), the 'camera-to-subject distance' (CS; recording distance to the objects) and the 'stereo window distance' (SW; distance between camera and the plane which serves as projection plane during playback).
In the general case SP = 1 if NP = CS = SW, while the SW depends on stereo basis, focal length and deviation.
The important statements are:

  • The further an object is positioned behind the NP, and therefore the SW, the bigger the CS: the object appears less three-dimensional (SP < 1).
  • The smaller the stereo basis at an unaltered distance, the smaller the SW and therefore the SP (SP < 1). In order to prevent a reduction of plasticity, the NP would have to be adjusted to the SW.
  • The bigger the NP while stereo basis and SW stay the same, the smaller the SP (SP < 1). In order to prevent the flattening of objects, the SW would need to be adjusted by increasing the stereo basis.

Focal length

In traditional (2D) photography different focal lengths are used to capture different image sections. Wide-angle lenses make it possible to capture a vast angle of view (ideal for landscape photography), while telephoto lenses narrow the view but magnify the object of interest (often used for capturing animals). Lenses with a focal length of 50 mm are defined as normal lenses (this only applies to cameras with full-frame sensors), as their angle of view roughly corresponds to human perception. A focal length above 50 mm defines a telephoto lens, below 50 mm a wide-angle lens.
With regard to the figure above (top row) it can be said that an object at great distance captured with a telephoto lens appears bigger but not more three-dimensional (wide-angle to telephoto from left to right).
The second set (bottom row) shows the stone ball at a comparable size, captured with different focal lengths. In order to achieve this, the distance to the object had to be altered according to the focal length.

  • The bigger the focal length, and therefore the distance to the object, the flatter it appears (flat/telephoto to three-dimensional/wide-angle from left to right).

Long lenses make poor 3D; short lenses make great 3D.

Mendiburu (2009)

Telephoto lenses compress space (little plasticity) while wide-angle lenses seem to stretch it (high plasticity). In order to bring the stereoscopic plasticity (SP) closer to SP = 1, which corresponds to natural vision, various parameters can be altered. It is, however, quite difficult to find a satisfactory way to counteract the cardboard effect while using a telephoto lens.
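The difference between the lens types can be quantified with the standard rectilinear angle-of-view formula – a sketch, assuming a full-frame sensor (36 mm wide); the values are illustrative, not taken from the thesis:

```python
import math

def angle_of_view_deg(focal_length_mm, sensor_width_mm=36.0):
    """Horizontal angle of view of a rectilinear lens."""
    return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_length_mm)))

for f in (24, 50, 300):  # wide-angle, normal, telephoto
    print(f"{f:>3} mm: {angle_of_view_deg(f):.1f} deg")
```

A 300 mm telephoto lens covers only about 7° horizontally, compared to roughly 74° for a 24 mm wide-angle lens – which is why the telephoto shot must be taken from much further away for a comparable framing, flattening the object.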


Depth of field

Shallow depth of field (only a small area is displayed fully in focus) is a stylistic element in film and photography. The photographer can direct the audience's view to certain objects of interest and separate objects from the background. While those rules also apply to stereoscopy, a shallow depth of field defeats the purpose of three-dimensionality: it would be much more authentic if the viewer had the chance to properly 'look around' in a completely crisp stereo image. Furthermore, with shallow depth of field the 3D scene appears layered and elements look like cut-outs.

  • The blurrier an image, the more extreme the cardboard effect.



Lighting

Lighting plays an important role in both traditional and stereo photography. It should be used in a way which promotes the impression of depth. The image above shows different lighting examples, ranging from a CG character casting virtually no shadow (appears very flat) to lateral light casting a soft shadow (appears most three-dimensional).

Image structure

Similar to traditional photography, a general depth impression is already ensured by positioning objects at various depths (monocular depth cues) and dividing a scene into fore-, mid- and background. Distinct lines of sight (objects positioned accordingly) make the scene appear much more spacious than objects arranged in a parallel manner (cf. fig. right: missing monocular depth cues and only parallel lines → the image appears flat, even in '2D').

Influencing factors during post production

Horizontal image translation

Even after recording, the stereoscopic plasticity of stereo images can be changed by applying the so-called 'horizontal image translation' (HIT) method in post-production:


The probably most common camera arrangement for HIT is the parallel arrangement (the cameras are not swivelled in onto a common point). Recording two images with this technique means they can be translated relative to each other without causing major problems. Images displayed without applying HIT would create big parallax offsets for foreground elements, while objects in the far background would overlap almost completely, as the stereoscopic accuracy decreases with distance to virtually zero. This means the whole scene would be displayed in the area of negative parallax: in front of (and, for distant objects, on) the screen, which would automatically create a violation of the stereo window (cf. fig. below: a) – image displayed without HIT). In figure b) the foremost area of the object has been aligned – the scene appears behind the stereo window but loses image information at the sides.


If the previously computed distance between camera and stereo window is altered in post-production (using HIT), the result is a changed size perception: objects pushed into greater distance will appear bigger to the audience (the size of the displayed object on screen is unchanged while it seems further away), while objects dragged towards the front will appear smaller.

The 3D sizing effect can be used with a storytelling purpose. Say the heroes embark on a boat trip and get caught in a hurricane. You will start the sequence with a massive boat, far behind the screen. When the weather turns bad, bring the boat forward, it will shrink to the size of a train wagon. At the screen plane, it is the size of a van, much weaker in the moving waters. By the end of the sequence, bring it further in front of the screen, where it looks like a small toy, and all its might has vanished.

Tim Sassoon in Mendiburu (2009)

As mentioned before, in the general case the rule NP = SW applies (near-point distance = stereo window distance). This means that the nearest point of a stereo image is positioned in one plane with the stereo window and the most remote point virtually at infinity. If the whole scene is left in front of the window (cf. text above) or purposely moved there, the spatial extent is compressed into a small range (corresponding to half the distance between viewer and screen). Objects will appear flat – the cardboard effect occurs.
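The 'half the distance' limit follows from the usual perceived-distance relation – an assumed textbook simplification, not a formula from the thesis: a point with crossed screen parallax p is perceived at z = a·e / (e + p), where a is the viewing distance and e the interocular distance.

```python
def perceived_distance_m(a, p, e=0.063):
    """Perceived distance of a point with crossed screen parallax p (m),
    for a viewer at distance a (m) from the screen."""
    return a * e / (e + p)

a = 2.0  # viewing distance in metres
print(perceived_distance_m(a, 0.0))    # zero parallax: on the screen plane (2.0 m)
print(perceived_distance_m(a, 0.063))  # maximal crossed parallax: a/2 (1.0 m)
```

Since the crossed parallax should not exceed the eye distance of 63 mm, everything placed in front of the window is squeezed into the range between a/2 and a.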

Apart from the parallel camera arrangement it is worth mentioning the converging camera arrangement (the cameras are swivelled in onto a common point) and sensor shifting (combines the advantages of the parallel and converging arrangements; mostly used in CG productions, as it is virtually impossible to achieve good results with real cameras).


2D-3D conversion

The basic method of a 2D-3D conversion consists of creating two horizontally translated perspective views from a single two-dimensional image.
There are attempts to create automated conversion tools (e.g. based on depth from motion and colour segmentation).
Traditional conversion techniques include the creation of depth maps (grey-scale images which provide information about the position of an object – the darker a pixel, the further away from the camera it was located). This information is used to displace the objects horizontally for the left and right image (but: displaced objects leave gaps in the images). Other techniques are the replication of the scene in 3D (very complex) and the creation of cut-outs which are horizontally translated (these leave gaps, too).
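The depth-map displacement (and the gaps it leaves) can be sketched for a single image row – a toy example with a hypothetical linear mapping from depth value to pixel shift; real converters work on full disparity maps and inpaint the gaps:

```python
def shift_row(row, depth, max_shift=2):
    """Displace the pixels of one image row horizontally according to a
    depth map (0 = far, 255 = near). Vacated positions stay None –
    the 'gaps' a converter has to fill."""
    out = [None] * len(row)
    # paint far pixels first so nearer pixels overwrite them
    for x in sorted(range(len(row)), key=lambda i: depth[i]):
        nx = x + round(depth[x] / 255 * max_shift)  # near pixels shift furthest
        if 0 <= nx < len(row):
            out[nx] = row[x]
    return out

row   = ["a", "b", "c", "d", "e", "f"]
depth = [  0,   0, 255, 255,   0,   0]  # 'c' and 'd' form a near object
print(shift_row(row, depth))  # ['a', 'b', None, None, 'c', 'd']
```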

Influencing factors during playback

Playback size

Altering the playback size itself does not change the three-dimensional impression. As mentioned earlier, the maximal deviation on screen is calculated following the 1/30 rule (ratio of deviation to screen width = 1/30). This is valid up to a screen width of 1.9 m – above that the deviation would exceed 63 mm and lead to divergence of the eyes. For bigger productions the deviation is calculated via the ratio of interocular distance to projection magnification factor (with increasing magnification factor, the maximal screen deviation decreases).
The stereo plasticity remains (more or less) the same with increasing screen size, as the viewing distance will usually increase accordingly.

  • If the viewing distance remains unchanged with increasing screen size, the stereo plasticity will decrease (cf. following subchapter).

Viewing distance


The viewing distance matters significantly with regard to the three-dimensional impact of a stereo production.
In order to understand why a short distance to the screen produces a flatter image than a large distance, we need to look at how natural vision works in comparison to stereo vision.
1) The width G of a box is seen under the angle γ, the section A’ (equivalent to the depth T) is seen by the right eye under the angle α (g is the eyes-to-subject distance, A the interocular distance). With increasing distance g, the angle γ decreases proportionally (linear reduction), while A’ and the angle α decrease with the square of the distance. The perceived depth-to-width ratio (T/G resp. A’/G) therefore decreases linearly with increasing distance.
2) In stereo productions the depth impression depends on the recorded parallaxes. The offset (A’, or in stereo productions V’) does not change when the viewer decreases or increases the viewing distance. Situation 2a) shows a viewer positioned close to the screen (viewing distance a): the box seems wider than it is deep, while the viewer in position 2b) will see a box which is deeper than it is wide.
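The two falloff regimes can be checked numerically – a sketch using the small-angle approximations implied above (γ ≈ G/g for the width, α ≈ A·T/g² for the depth section A’):

```python
# Assumed small-angle approximations: width angle gamma ≈ G/g,
# disparity angle alpha ≈ A*T/g².
A, G, T = 0.063, 1.0, 1.0     # interocular distance, box width, box depth (m)

for g in (2.0, 4.0, 8.0):     # doubling the distance at each step
    gamma = G / g             # halves each time (linear falloff)
    alpha = A * T / g ** 2    # quarters each time (squared falloff)
    print(f"g={g:.0f} m  gamma={gamma:.4f} rad  alpha={alpha:.5f} rad  ratio={alpha / gamma:.4f}")
```

The ratio α/γ – the perceived depth relative to the width – drops linearly with g, which is exactly the effect described in 1).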

  • Even though plasticity decreases for shorter distances to the screen, people might prefer sitting in the front rows, as a stereoscopic production can feel more tangible, immersive and intense than from afar.
Practical analysis

In order to verify the constructed theories, part of the Bachelor thesis was a practical analysis. We presented 19 different stereo images to 12 test persons. The images included examples with different stereo bases, camera-to-subject distances, focal lengths, focus, lighting, image structures, viewing distances and horizontal image translations (including attempts to fix image decay etc.).

Ideal image, SP = 1, focal length: 35mm, stereo basis: 63mm, distance to subject: 3.85m

Small stereo basis, SP = 0.48, focal length: 35mm, stereo basis: 30mm, distance to subject: 3.85m

Long focal length, SP = 0.12, focal length: 300mm, stereo basis: 63mm, distance to subject: 33m, with and without shallow depth of field

Stephan and I (shooting at the beach for images with only minor monocular depth cues)

The in-depth experimental setup and execution as well as the stereo images can be found in our Bachelor thesis.


Over the course of researching stereoscopy and the cardboard effect, we worked out that factors like a small stereo basis, a large distance to the subject and the use of tele lenses play a part in restricting the three-dimensionality of objects in stereo productions. The test persons agreed that artistic elements like shallow depth of field, missing lighting and the lack of monocular depth cues also contribute to a cardboard-like impression – nevertheless, they only seem to intensify the effect generated by factors like tele lenses and appear to be virtually ineffective on their own.
It was observed that most plasticity-reducing factors can be compensated by adjusting other factors – e.g. a small stereo basis can be compensated by a shorter focal length. It is, however, hard or sometimes impossible to fully correct (to a stereoscopic plasticity of SP = 1) extremely long focal lengths without causing painful disparities on screen, which can lead to image decay.
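The SP values quoted with the example images above are consistent with the common "roundness" approximation SP ≈ (b/A) · (a/g), with stereo basis b, interocular distance A, viewing distance a and subject distance g. This is a reconstruction on my part (assuming a viewing distance of 3.85 m), not necessarily the thesis's exact definition, but it reproduces the quoted values and shows the compensation at work: a halved basis can be brought back towards SP = 1 by halving the subject distance, i.e. moving closer with a shorter lens.

```python
def stereo_plasticity(basis_m, subject_distance_m,
                      viewing_distance_m=3.85, interocular_m=0.063):
    """Roundness approximation SP ~ (b/A) * (a/g): how deep an object
    appears relative to its width, compared with natural vision."""
    return (basis_m / interocular_m) * (viewing_distance_m / subject_distance_m)

# values matching the example captions above (b and g from the captions,
# viewing distance of 3.85 m assumed):
sp_ideal = stereo_plasticity(0.063, 3.85)  # ideal image, ~1.0
sp_small = stereo_plasticity(0.030, 3.85)  # small stereo basis, ~0.48
sp_tele = stereo_plasticity(0.063, 33.0)   # long focal length, ~0.12
```

Note how the focal length does not appear directly: the tele lens flattens the image because it forces a large subject distance g for the same framing.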

You cannot cheat the space when you shoot 3D.

Rob Engle in Mendiburu (2009)

Moreover, one should be mindful of how comfortable a stereo production is to look at. Even if the stereoscopic plasticity is mathematically correct, this does not guarantee that an image is perceived as perfectly three-dimensional, and in some circumstances it can even be unpleasant to view.
Ultimately, stereoscopy should not be based on mathematics alone – aesthetics and pictorial design should be considered as well in order to create a convincing stereo image.

Bibliography and additional literature

This is the full list of literature we used for our 150-page Bachelor thesis.


3D-Foto-Shop. (2011). 3D-Objektiv LA9005 für digitale Spiegelreflexkameras. [online]. Abrufbar unter: http://www.3d-foto-shop.de/pi12/pd59.htm. [Stand: 11.05.2011]
3D-Foto-Shop. (2011b). Professioneller 2D zu 3D Konverter für Fotos. [online]. Abrufbar unter: http://www.3d-foto-shop.de/pi13/pd119.htm. [Stand: 02.06.2011]
Autodesk Stereoscopic Filmmaking Whitepaper. (2008). The Business and Technology of Stereoscopic Filmmaking. [online]. Abrufbar unter: http://images.autodesk.com/adsk/files/stereoscopic_whitepaper_final08.pdf. [Stand: 15.05.2011]
Belloc, M.A. (1858). The Future of Photography. The Photographic News. 2 (Jg.1). S.13-14.
Blenn, N. (2007). Entwicklung eines portablen Stereo-Videoaufnahmesystems für die Präsentation auf einer Stereoprojektionswand. Diplomarbeit zur Erlangung des akademischen Grades Diplom-Medieninformatiker. Dresden: Technische Universität Dresden.
Bluray 3D. (2009). Die neue Faszination 3D-TV. [online]. Dezember 2009. Abrufbar unter: http://www.bluray-3d.de/die-neue-faszination-3d-tv.html. [Stand: 04.06.2011]
Boev, A., Hollosi, D. & Gotchev, A. (2008). Classification of stereoscopic artefacts. [online]. Juni 2008. Abrufbar unter: http://sp.cs.tut.fi/mobile3dtv/results/tech/D5.1_Mobile3DTV_v1.0.pdf. [Stand: 26.04.2011]
Bortz, J. & Döring, N. (2006). Forschungsmethoden und Evaluation. 4. Auflage. Heidelberg: Springer Medizin Verlag.
Bourke, K. (2010). In-Three Latest 2D-3D Conversion Masterpiece: Alice in Wonderland. [online]. April 2010. Abrufbar unter: http://bourkepr.typepad.com/my_weblog/2010/04/in-three-latest-2d3d-conversion-masterpiece-alice-in-wonderland.html. [Stand: 07.06.2011]
CEA Consumer Electronics Association. (2011). 3D Digital Cameras Sparking Enthusiast Interest. [online]. März 2011. Abrufbar unter: http://www.ce.org/Press/CurrentNews/press_release_detail.asp?id=12069. [Stand: 04.05.2011]
Clasen, M. (2010). Vereinfachung und Automatisierung der Erzeugung von virtuellen, stereoskopischen Inhalten. Bachelorarbeit. Friedberg: Fachhochschule Gießen-Friedberg.
Crary, J. (1996). Techniken des Betrachters: Sehen und Moderne im 19. Jahrhundert. Dresden: Verlag der Kunst. [engl. Original: Crary, J. (1990). Techniques of the Observer: On Vision and Modernity in the Nineteenth Century. Cambridge/Mass: MIT Press.]
Cutting, J.E. & Vishton, P.M. (1995). Perceiving layout and knowing distances: The integration, relative potency, and contextual use of different information about depth. In: Epstein, W., Rogers, S. (Hrsg.). Handbook of perception and cognition. Perception of space and motion, Nr. 5. San Diego: Academic Press.
Dahm, M. (2005). Grundlagen der Mensch-Computer Interaktion. München: Pearson Studium.
DeJohn, M., Nelson K. & Seigle D. (2007). Styles and degrees of 3D Dimensionalization. In-Three White Paper. Westlake Village: In-Three.
Enders, R. (2003). Die Optik des Auges und der Sehhilfen. Heidelberg: DOZ.
Fitzner, S. (2008). Raumrausch und Raumsehnsucht. Zur Inszenierung der Stereofotografie im Dritten Reich. Fotogeschichte. Beiträge zur Geschichte und Ästhetik der Fotografie. 109 (Jg.28). S.25-38.
Förster, U. (n.d.). Die Geschichte der Stereoskopie im Überblick. [online]. Abrufbar unter: http://www.uf-3d-foto.de. [Stand: 30.04.2011]
Goersch, H. (1980). Die Grundlagen der Stereopsis. Neues Optikerjournal. 11 (1980). S.17-23.
Guski, R. (1996). Wahrnehmen – ein Lehrbuch. Stuttgart: Kohlhammer.
Heine, K.-C. (2011). Fotografie für Journalisten. Köln: O’Reilly Verlag.
Herbig, G. (2005). Die 70 Minuten-Bedingung und ihre Folgen. [online]. Juni 2005. Abrufbar unter: http://www.herbig-3d.de/german/minuten.htm. [Stand: 13.05.2011]
Herbig, G. (2005b). Tiefenwahrnehmung bei der Stereoproduktion. [online]. Juni 2005. Abrufbar unter: http://www.herbig-3d.de/german/tiefenwahrnehmung.htm. [Stand: 27.05.2011]
Herbig, G. (2006). Anaglyphentechnik. [online]. März 2006. Abrufbar unter: http://www.herbig-3d.de/german/anaglyphen.htm. [Stand: 26.05.2011]
Hesse, C. (2010). Das kleine Einmaleins des klaren Denkens. 22 Denkwerkzeuge für ein besseres Leben. 2. Auflage. München: Beck.
Holmes, O. W. (1864). Soundings From The Atlantic. Boston: Ticknor and Fields.
Holzer, A. (2008). Editorial. Fotogeschichte. Beiträge zur Geschichte und Ästhetik der Fotografie. 109 (Jg.28). S.3.
Hottong, N. & Lesik, D. (Hrsg.) (2009). Stereoskope HD-Produktion. Schriftenreihe Fakultät Digitale Medien. Arbeitspapier Nr. 5. Furtwangen: Hochschule Furtwangen.
IMDb The Internet Movie Database. (2009). Oben. [online]. Abrufbar unter: http://www.imdb.com/title/tt1049413/. [Stand: 04.06.2011]
IMDb The Internet Movie Database. (2011). Tim Sassoon. [online]. Abrufbar unter: http://www.imdb.com/name/nm1150839/. [Stand: 31.05.2011]
Kebeck, G. (1997). Wahrnehmung. Theorien, Methoden und Forschungsergebnisse der Wahrnehmungspsychologie. 2. Auflage. Weinheim: Juventa.
Kebeck, G. (2006). Bild und Betrachter. Auf der Suche nach Eindeutigkeit. Regensburg: Schnell & Steiner.
Kesselring, K. (2010). Die animierte Realität kann beginnen: „Avatar – Aufbruch nach Pandora“. [online]. Januar 2010. Abrufbar unter: http://www.blauenarzisse.de/index.php/rezension/1255-die-animierte-realitaet-kann-beginnen-avatar-aufbruch-nach-pandora. [Stand: 27.05.2011]
Köhl, H., Roth, G. & Baust, D. (1995). Augenoptik. Ein Schulbuch und Leitfaden. Heidelberg: DOZ.
Kohler, M. (2004). Geschichte der Stereoskopie. [online]. Abrufbar unter: http://www.3d-historisch.de/Geschichte_Stereoskopie/Geschichte_Stereoskopie.htm. [Stand: 30.04.2011]
Kröger, M. (1983). Begrenzter Raum – erfahrene Zeit. Der stereofotografische Blick im 19. Jahrhundert. Fotogeschichte. Beiträge zur Geschichte und Ästhetik der Fotografie. 7 (Jg.3). S.19-24.
Kuhn, G. (1999). Stereo-Fotografie und Raumbild-Projektion. Gilching: vfv Verlag für Foto, Film und Video.
Lipton, L. (1982). Foundation of Stereoscopic Cinema. A Study in Depth. New York: Van Nostrand Reinhold Company.
Ludwig, M. (2010). Ausprobiert: Der 3D-Druckservice von Fujifilm. [online]. April 2010. Abrufbar unter: http://www.chip.de/news/Ausprobiert-Der-3D-Druckservice-von-Fujifilm_42255607.html. [Stand: 01.05.2011]
Lüscher, H. (1931). Stereophotographie. Berlin: Union Deutsche Verlagsgesellschaft.
Maier, F. (2008). Teil 1: 3D-Grundlagen. Professional Production. 46 (07/08). S.14-18.
Maier, F. (2008b). Technologien für das 3D-Kino. Professional Production. 48 (09). S.26-29.
Mallot, H. (1998). Sehen und die Verarbeitung visueller Information. Eine Einführung. Braunschweig: Vieweg Verlagsgesellschaft.
Marks, G. (2011). FFA-Kinobesucherstudie 2010 veröffentlicht: 3D-Filme mit Rekordergebnis!. [online]. Mai 2011. Abrufbar unter: http://www.digitaleleinwand.de/2011/05/11/ffa-kinobesucherstudie-2010-veroeffentlicht-3d-filme-mit-rekordergebnis. [Stand: 27.05.2011]
McCann, J. J. (1973). Human Color Perception. In: Eynard, R. A. (Hrsg.). Color Theory and Imaging Systems. Washington: Society of Photographic Scientists and Engineers.
Mendiburu, B. (2009). 3D Movie Making. Stereoscopic Digital Cinema from Script to Screen. Oxford: Focal Press.
Okano, F., Okui, M. & Yamanoue, H. (2006). Geometrical Analysis of Puppet-Theater and Cardboard Effects in Stereoscopic HDTV Images. IEEE Transactions on Circuits and Systems for Video Technology. 16 (6). S.744-752.
Okui, M., Yamanoue, H. & Yuyama, I. (2000). A Study on the Relationship Between Shooting Conditions and Cardboard Effect of Stereoscopic Images. IEEE Transactions on Circuits and Systems for Video Technology. 10 (3). S.411-416.
Piringer, B. (2010). Street Dance 3D Review. [online]. Abrufbar unter: http://www.cinefacts.de/kino/2033/streetdance_3d/filmreview.html. [Stand: 30.05.2011]
Po, L.-M. (2010). Automatic 2D-to-3D Video Conversion Techniques for 3DTV. [online]. April 2010. Abrufbar unter: http://www.ee.cityu.edu.hk/~lmpo/publications/2010_3DV_Conversion_Seminar.pdf. [Stand: 03.06.2011]
Roth, G. (2003). Aus Sicht des Gehirns. Frankfurt am Main: Suhrkamp Verlag.
Sánchez Ruiz, M. (2010). Möglichkeiten und Grenzen der Filmstereoskopie. Technik und Gestaltung des S3D-Films als neue Herausforderung im digitalen Zeitalter. Masterarbeit zur Erlangung des akademischen Grades ‹Master of Arts› der Philosophischen Fakultät der Universität Zürich. Zürich: Universität Zürich.
Schmidt, C. (1999). Entwicklung und Erprobung eines stereoskopischen Videoaufnahmesystems unter Verwendung von Mini-DV Camcordern. Diplomarbeit. Köln: Fachhochschule Köln.
Schönfeld, J. (2001). Die Stereoskopie. Zu ihrer Geschichte und ihrem medialen Kontext. Magisterarbeit am Kunsthistorischen Institut der Fakultät für Kulturwissenschaften der Universität Tübingen. Tübingen: Universität Tübingen.
Schröter, J. (2009). Zur Geschichte, Theorie und Medienästhetik des technisch-transplanen Bildes. München: Wilhelm Fink Verlag.
Sczepek, J. (2011). Visuelle Wahrnehmung. Eine Einführung in die Konzepte Bildentstehung, Helligkeit und Farbe, Raumtiefe, Größe, Kontrast und Schärfe. Norderstedt: Books on Demand.
Settele, C. (2011). Das Ende der 3-D-Brille vor Augen. [online]. Mai 2011. Abrufbar unter: http://www.nzz.ch/nachrichten/kultur/medien/das_ende_der_3-d-brille_vor_augen_1.10434797.html. [Stand: 19.05.2011]
Seymour, M. (2011). Art of Stereo conversion: 2D to 3D. [online]. Januar 2011. Abrufbar unter: http://www.fxguide.com/featured/art-of-stereo-conversion-2d-to-3d/. [Stand: 01.06.2011]
Shimono, K. et al. (2009). Removing the cardboard effect in stereoscopic images using smoothed depth maps. [online]. Dezember 2009. Abrufbar unter: ftp://ftp.crc.ca/crc/JTam/Papers/PerceptualEffects/PerceivedCardboardEffectJT-SDA-2010-01.pdf. [Stand: 27.05.2011]
Stanke, K. (1938). Betrachtungen über das Raumbild. Das Raumbild. Stereoskopisches Magazin für Zeit und Raum. 1. S.20-22.
Striewisch, T. (2005). Der große Humboldt Fotolehrgang. 4. Auflage. Baden-Baden: Humboldt.
Tauer, H. (2010). Stereo-3D. Grundlagen, Technik und Bildgestaltung. Berlin: Schiele & Schön.
Tillmanns, U. (1981). Geschichte der Photographie. Ein Jahrhundert prägt ein Medium. Stuttgart: Huber.
Tillmanns, U. (2011). Warum Jürg Kaufmann in der dritten Dimension fotografiert. [online]. Januar 2011. Abrufbar unter: http://www.fotointern.ch/archiv/2011/01/04/warum-jurg- kaufmann-in-der-dritten-dimension-fotografiert/. [Stand: 01.05.2011]
Voigt, S. (2011). Anforderungskatalog für das Storytelling und die Bildgestaltung im stereoskopischen Film. Schriftliche Arbeit zur Erlangung des akademischen Grades ‹Bachelor of Arts in Mediendesign› im Studiengang Medien-Design/Zeitbasierte Medien an der Fachhochschule Mainz. Mainz: Fachhochschule Mainz.
Waack, F. (1985). Stereo photography. [online]. Berlin: Selbstverlag. Verfügbar unter: http://www.stereoscopy.com/library/waack-contents.html. [Stand: 26.05.2011]
Wheatstone, C. (1838). Contributions to the Physiology of Vision, Part the First: On Some Remarkable, and Hitherto Unobserved, Phenomena of Binocular Vision. Philosophical Transactions of the Royal Society of London. 128. S.371-394.
Wimmer, P. (2004). Aufnahme und Wiedergabe stereoskopischer Videos im Anwendungsbereich der Telekooperation. Diplomarbeit zur Erlangung des akademischen Grades Diplom-Ingenieur in der Studienrichtung Mechatronik. Linz: Johannes Kepler Universität Linz.
Zone, R. (2007). Stereoscopic Cinema & the Origins of 3-D Film, 1838-1952. Kentucky: The University Press of Kentucky.


Hi, I'm Anni. I'm a VFX Artist. 3D lover. Multimedia Producer. Travel enthusiast. Nature lover. DIY fan. Music devotee. And much more.
