This page contains information and downloads for the fluid motion stock-footage dataset used in the paper.
Dataset contents
The dataset consists of short video clips extracted from longer stock-footage videos. All file names are prefixed with two numbers that identify the video clip (#####_#####): the first number is the source video ID, and the second is the frame offset within that video. Clips with the same source video ID are likely to have very similar contents, but note that they are not identical: the video clip, input image, and motion fields are taken from different sections of the source video.
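Because clips that share a source video ID tend to look alike, it can be useful to group files by that ID, for example to keep related clips on the same side of a train/test split. The following is a minimal sketch of parsing the prefix. The helper names (parse_clip_prefix, group_by_source) are hypothetical, and the pattern assumes each ID is a plain run of digits, since the page shows only the #####_##### placeholder.

```python
import re
from collections import defaultdict

# Assumption: each ID is a run of digits. The page shows "#####_#####" but
# does not state the exact digit count, so any length is accepted here.
CLIP_PREFIX = re.compile(r"^(\d+)_(\d+)_")


def parse_clip_prefix(filename):
    """Return (source_video_id, frame_offset) as strings, or None if no match."""
    match = CLIP_PREFIX.match(filename)
    if match is None:
        return None
    return match.group(1), match.group(2)


def group_by_source(filenames):
    """Map each source video ID to the sorted list of its file names."""
    groups = defaultdict(list)
    for name in filenames:
        parsed = parse_clip_prefix(name)
        if parsed is not None:
            groups[parsed[0]].append(name)
    return {source: sorted(names) for source, names in groups.items()}
```

The IDs are kept as strings so that any zero padding in the file names survives a round trip.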
Files in the dataset include:
#####_#####_gt.mp4: The ground-truth video clip. All videos are 60 frames long and 1280 px wide, with variable height. They are encoded as H.264 MP4s at very high bitrates to avoid compression artifacts. Code to load videos as PyTorch tensors can be found here.
#####_#####_input.jpg: The first frame of each video, provided for convenience and used for training the motion estimation network. Each image is the same size as the corresponding video.
#####_#####_motion.pth: The "ground-truth" motion fields, which were estimated from the video using off-the-shelf optical flow networks. They are stored as compressed binaries; code for reading these files can be found here.
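The three file types above share a clip prefix, so a download can be indexed by prefix and checked for missing files before training. This is a sketch under assumptions, not the dataset's official loader: the function names (index_clips, incomplete_clips) are hypothetical, the digit pattern is assumed as before, and the code only matches file names without decoding the videos or motion fields.

```python
import re
from pathlib import Path

# The three suffixes are the ones listed on this page; the digit pattern
# for the clip prefix is an assumption.
FILE_PATTERN = re.compile(r"^(\d+_\d+)_(gt\.mp4|input\.jpg|motion\.pth)$")
EXPECTED_SUFFIXES = {"gt.mp4", "input.jpg", "motion.pth"}


def index_clips(paths):
    """Map each clip prefix to a dict of {suffix: original path string}."""
    clips = {}
    for path in paths:
        match = FILE_PATTERN.match(Path(path).name)
        if match is None:
            continue  # skip files that are not part of the dataset
        prefix, suffix = match.groups()
        clips.setdefault(prefix, {})[suffix] = str(path)
    return clips


def incomplete_clips(clips):
    """Return the sorted prefixes of clips missing any of the three files."""
    return sorted(
        prefix for prefix, files in clips.items()
        if set(files) != EXPECTED_SUFFIXES
    )
```

For a local copy, the input could be something like `index_clips(Path("dataset").iterdir())`, where the directory name is a placeholder.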