In order to submit to the benchmark, i.e. to evaluate results on the Spring test split, you need to register. Please note that no registration is needed to download the dataset. To register, first create an account and confirm your e-mail address by clicking on the link we send you. Afterwards, your account request will be verified by our team. Finally, after successful verification, you can submit results for evaluation.
Save your results as .flo5 and .dsp5 files using the following directory structure:
<rootdir>/####/disp1_{left|right}/disp1_{left|right}_####.dsp5
<rootdir>/####/flow_{FW|BW}_{left|right}/flow_{FW|BW}_{left|right}_####.flo5
<rootdir>/####/disp2_{FW|BW}_{left|right}/disp2_{FW|BW}_{left|right}_####.dsp5
Then run ./{flow|disp1|disp2}_subsampling <rootdir>. An HDF5 submission file is generated in the current directory.
Images and maps are given in png format. Disparity and optical flow files are given in HDF5 file format and named .dsp5 for disparity and .flo5 for optical flow.
You can find reference code to read/write .flo5 and .dsp5 files here.
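For illustration, a minimal read/write sketch in Python using h5py. The HDF5 dataset keys "flow" and "disparity" are assumed to match the linked reference code; please verify against that code before building a submission:

```python
import h5py
import numpy as np

def read_flo5(path):
    """Read an optical flow map (H x W x 2, float32) from a .flo5 HDF5 file."""
    with h5py.File(path, "r") as f:
        return f["flow"][()]  # assumed dataset key, as in the reference code

def write_flo5(path, flow):
    """Write an optical flow map to a .flo5 HDF5 file."""
    with h5py.File(path, "w") as f:
        f.create_dataset("flow", data=flow.astype(np.float32), compression="gzip")

def read_dsp5(path):
    """Read a disparity map (H x W, float32) from a .dsp5 HDF5 file."""
    with h5py.File(path, "r") as f:
        return f["disparity"][()]  # assumed dataset key, as in the reference code

def write_dsp5(path, disparity):
    """Write a disparity map to a .dsp5 HDF5 file."""
    with h5py.File(path, "w") as f:
        f.create_dataset("disparity", data=disparity.astype(np.float32), compression="gzip")
```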
You can find code to transform Spring data to 3D point clouds here.
In general, depth Z is computed from disparity d through Z = fx * B / d, where fx is the focal length in pixels (given in intrinsics.txt) and B is the stereo camera baseline distance; for Spring, this is always 0.065m. Please note that the Spring dataset encodes infinitely distant sky pixels as zero disparity, leading to infinite values when using the above formula.
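A minimal NumPy sketch of this computation, together with standard pinhole back-projection to 3D points. This is not the official point-cloud code linked above; the parameters fx, fy, cx, cy are assumed to come from intrinsics.txt:

```python
import numpy as np

BASELINE = 0.065  # Spring stereo baseline in meters (fixed for all scenes)

def depth_from_disparity(disparity, fx):
    """Compute metric depth Z = fx * B / d from a disparity map in pixels.

    Sky pixels are encoded as zero disparity, so they map to inf here.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    with np.errstate(divide="ignore"):
        return fx * BASELINE / disparity

def backproject(disparity, fx, fy, cx, cy):
    """Lift each pixel to a 3D point via the standard pinhole camera model."""
    h, w = disparity.shape
    z = depth_from_disparity(disparity, fx)
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel column/row grids
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)  # H x W x 3 point map
```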
While our ground truth files are given in 4K (double the spatial resolution per dimension), the ground truth vectors (disparities, optical flow) relate to the images in full HD resolution. So when using them, there is no need to divide their values by 2. See also the example data loader for Spring here.
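If you want ground truth at the full-HD image resolution, one simple approach (a sketch, not necessarily what the linked data loader does) is to keep every second pixel and leave the vector values unscaled:

```python
def subsample_gt_to_full_hd(gt_4k):
    """Reduce a 4K ground-truth map (2160 x 3840[, C]) to full HD (1080 x 1920[, C]).

    Only the sampling grid changes; the disparity/flow values already
    relate to full-HD pixel coordinates, so no division by 2 is needed.
    """
    return gt_4k[::2, ::2]
```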
Spring uses an orthoparallel stereo camera setup, i.e. two parallel cameras pointing in the same direction. The baseline distance between the cameras is 0.065m. Intrinsic camera parameters (available for train and test) are given per sequence in intrinsics.txt; extrinsic camera data / camera poses (available for train) are given in extrinsics.txt. Please note that in some scenes a camera zoom / change of the focal length is used, leading to different intrinsics per frame. We additionally provide metric camera focal distances in focaldistance.txt.
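A hypothetical loader sketch; it assumes (unverified) that each row of intrinsics.txt holds the four pinhole parameters fx, fy, cx, cy for one frame. Check the dataset documentation for the actual column layout:

```python
import numpy as np

def load_intrinsics(path):
    """Load per-frame pinhole intrinsics from intrinsics.txt.

    Assumption: one row per frame with columns fx, fy, cx, cy
    (intrinsics may differ per frame because some scenes zoom).
    """
    rows = np.loadtxt(path, dtype=np.float64).reshape(-1, 4)
    return [dict(zip(("fx", "fy", "cx", "cy"), row)) for row in rows]
```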
Yes, very occasionally there are NaN values in the ground truth files, which arise from a bug in the Blender/Cycles shading system. During evaluation, these pixels are ignored.
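When computing your own error metrics, you may want to mirror this behavior; a minimal sketch:

```python
import numpy as np

def valid_mask(gt):
    """Boolean mask of pixels whose ground truth contains no NaN values."""
    if gt.ndim == 3:  # e.g. optical flow maps with a trailing channel axis
        return ~np.isnan(gt).any(axis=-1)
    return ~np.isnan(gt)
```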
Please use the contact link at the bottom of the page.