Hi! I’m a researcher from the SLRLabs team, and we want to improve the VR experience for you. When you watch VR videos, you may notice artifacts that make the experience uncomfortable. Most of these artifacts are caused by the mismatch between the positions of the stereoscopic camera objectives and your eyes. Here I’ll explain the types of artifacts, where they come from, and how we are working on fixing them.
- “I have to close one eye in close-up scenes!” In close-ups, faces come really close to the camera. Our eyes naturally track objects: when something is close to your nose, the eyes cross rather than look straight ahead, so the object of interest stays at the center of each eye’s field of view. Stereoscopic VR cameras have a large field of view but no moving parts, so the recorded image stays static.
As a result, a person instinctively crosses their eyes when something comes close to the nose, but the image from the fixed cameras does not change accordingly. The brain cannot fuse such images, so people feel eye strain and tend to close one eye. Panoramic stereo cameras usually have a larger field of view than the eye, so this issue can be fixed by rotating the projected image at runtime. The rotation rate should correspond to the eye position and can be taken either from an eye-tracking device (most headsets do not have one yet) or calculated from the depth of the point at the center of sight and the interpupillary distance (IPD) using simple geometry.
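To make that geometry concrete, here is a minimal sketch. The function name and the default IPD value are my own illustrative choices, not part of the SLR player: the per-eye inward rotation follows directly from the depth at the gaze point and the IPD.

```python
import math

def vergence_angle_deg(depth_m: float, ipd_m: float = 0.063) -> float:
    """Angle in degrees by which each eye rotates inward to converge on a
    point `depth_m` metres straight ahead, for a given interpupillary
    distance.  Symmetric vergence: tan(theta) = (IPD / 2) / depth."""
    return math.degrees(math.atan2(ipd_m / 2.0, depth_m))

# At 0.3 m (a close-up) each eye converges by about 6 degrees, so each
# eye's panorama could be yawed inward by roughly that angle.
for depth in (0.3, 0.5, 1.0, 3.0):
    print(f"depth {depth:3.1f} m -> per-eye rotation {vergence_angle_deg(depth):.2f} deg")
```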
- “When I shake my head, the image goes wild!” In reality, when you tilt your head, one eye ends up lower than the other, so visible objects shift up or down depending on their distance. The video from the camera objectives, however, stays static.
The fix for head tilt is straightforward: move the left-eye and right-eye videos up and down at a rate that depends on depth and the head tilt angle.
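A minimal sketch of that rate, under the simplifying assumption that the head rolls around its centre (again, the names and default IPD are illustrative, not the player’s actual code):

```python
import math

def head_tilt_shift_deg(depth_m: float, roll_deg: float, ipd_m: float = 0.063) -> float:
    """Vertical angular shift in degrees for each eye's image (applied with
    opposite signs per eye) when the head is rolled by `roll_deg`.  With the
    head rolled, each eye sits (IPD / 2) * sin(roll) above or below the head
    centre, and a point at `depth_m` appears vertically displaced by
    atan(offset / depth) from that eye."""
    vertical_offset = (ipd_m / 2.0) * math.sin(math.radians(roll_deg))
    return math.degrees(math.atan2(vertical_offset, depth_m))

# 20 degrees of head roll while looking at something 0.4 m away:
shift = head_tilt_shift_deg(depth_m=0.4, roll_deg=20.0)
print(f"shift one eye by {-shift:.2f} deg and the other by {+shift:.2f} deg")
```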
- “I cannot look to the sides!” The sides problem is the biggest of the three. When you rotate your head, the positions of your eyes in the virtual world differ significantly from the positions of the camera objectives. For distant objects the distortion is negligible, but not for close ones.
The angular size of a close object can differ a lot between the left and right images, a difference the eyes never see in the real world. So a scaling (in general, a more complex deformation) should be applied to the image depending on its distance from the eyes.
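Here is a rough sketch of a first-order version of that scaling, assuming simplified top-down geometry with a fixed camera baseline and an eye baseline that rotates with the head; the function and its parameters are my own illustration, and the real deformation is more complex than a uniform scale:

```python
import math

def per_eye_scale(depth_m: float, yaw_deg: float, ipd_m: float = 0.063):
    """Rough per-eye image scale factors when the head is yawed by `yaw_deg`
    towards an object `depth_m` metres away.  Angular size falls off roughly
    as 1 / distance, so each eye's image is scaled by the ratio of the
    recording objective's distance to the rotated eye's distance."""
    yaw = math.radians(yaw_deg)
    # Object position in head-centred coordinates, along the rotated gaze.
    obj = (depth_m * math.sin(yaw), depth_m * math.cos(yaw))
    # Camera objectives: the fixed horizontal baseline the video was shot with.
    cam_l, cam_r = (-ipd_m / 2.0, 0.0), (ipd_m / 2.0, 0.0)
    # Eyes: the same baseline rotated together with the head.
    eye_l = (-(ipd_m / 2.0) * math.cos(yaw), (ipd_m / 2.0) * math.sin(yaw))
    eye_r = ((ipd_m / 2.0) * math.cos(yaw), -(ipd_m / 2.0) * math.sin(yaw))

    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    scale_l = dist(cam_l, obj) / dist(eye_l, obj)
    scale_r = dist(cam_r, obj) / dist(eye_r, obj)
    return scale_l, scale_r

# Far objects need almost no correction; close ones noticeably do.
for depth in (0.3, 1.0, 5.0):
    sl, sr = per_eye_scale(depth, yaw_deg=45.0)
    print(f"depth {depth:3.1f} m, yaw 45 deg -> left x{sl:.3f}, right x{sr:.3f}")
```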
Each of these corrections requires information about the distance to the objects, so depth maps can either be precomputed and streamed along with the video or calculated on the fly during playback. Then each frame can be corrected using the depth information and the head position. In short, streaming with depth information helps remove these annoying artifacts. Check it out with a test SLR app build for the Quest and the following videos:
https://insights.sexlikereal.com/videos/SLR_ASM2_vgg19_MIDAS_decoded_MKX200.mp4
https://insights.sexlikereal.com/videos/SLR_LAUNDRY_vgg19_MIDAS_decoded_MKX200.mp4
https://insights.sexlikereal.com/videos/SLR_HAUNTED2_vgg19_MIDAS_decoded_MKX200.mp4
Here you can download the test app with these corrections applied and play with the correction rate for each of them. The videos above have depth maps streamed in the test-build-compatible SLR format. The correction rate is in the Options tab; a combined sketch of what such a per-frame pass might look like is below.
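Putting the pieces together, here is a hedged sketch of a combined per-frame pass once a depth sample at the gaze point is available. The structure, names, and the `rate` parameter are illustrative assumptions rather than the actual player code, and it reuses the helper functions from the sketches above.

```python
from dataclasses import dataclass

@dataclass
class EyeCorrection:
    yaw_deg: float    # inward rotation toward the nose (close-up fix)
    pitch_deg: float  # vertical image shift (head-tilt fix)
    scale: float      # uniform image scale (side-view fix)

def per_frame_corrections(gaze_depth_m: float, head_roll_deg: float,
                          head_yaw_deg: float, rate: float = 1.0,
                          ipd_m: float = 0.063):
    """Combine the three corrections for one frame, given the depth sampled
    at the centre of gaze from the streamed depth map.  `rate` stands in for
    the user-adjustable correction rate from the Options tab (0..1).
    Reuses the vergence_angle_deg, head_tilt_shift_deg and per_eye_scale
    helpers sketched above."""
    verg = vergence_angle_deg(gaze_depth_m, ipd_m) * rate
    tilt = head_tilt_shift_deg(gaze_depth_m, head_roll_deg, ipd_m) * rate
    scale_l, scale_r = per_eye_scale(gaze_depth_m, head_yaw_deg, ipd_m)
    # Blend the scale factors toward 1.0 according to the correction rate.
    scale_l = 1.0 + (scale_l - 1.0) * rate
    scale_r = 1.0 + (scale_r - 1.0) * rate
    left = EyeCorrection(yaw_deg=+verg, pitch_deg=-tilt, scale=scale_l)
    right = EyeCorrection(yaw_deg=-verg, pitch_deg=+tilt, scale=scale_r)
    return left, right
```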
What else can we do with depth maps?
Despite a lot of research dedicated to implicit encodings of 3D scenes (mainly from NVIDIA), most of those methods are not yet suitable for streaming dynamic scenes. I believe explicit encodings, including depth maps, are a vital step toward 3D video scene reconstruction and streaming the scene in a (to some degree) walkable format. We are also working in that direction, with some progress. I’m looking for feedback about what else could be fixed for a better VR experience.