Unsupervised Learning of Video Representations using LSTMs

Long-term Future Prediction

A test set for evaluating sequence prediction/reconstruction

Moving MNIST [782 MB] contains 10,000 sequences, each 20 frames long, showing 2 digits moving in a 64 x 64 frame.

The numbers reported in the updated arXiv paper are computed on this test set. For future prediction, the metric is the cross-entropy loss for predicting the last 10 frames of each sequence, conditioned on the first 10 frames.
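As a rough illustration of the evaluation protocol described above, the sketch below splits each 20-frame sequence into 10 conditioning frames and 10 target frames and computes a binary cross-entropy loss on the targets. Several details are assumptions made for illustration and are not taken from this page: the array layout (sequences, frames, height, width), pixel values scaled to [0, 1], summing the loss over pixels and frames before averaging over sequences, and the function names themselves. Random data stands in for the real test set, and a trivial "repeat the last seen frame" predictor stands in for a model.

```python
import numpy as np


def split_sequences(seqs, n_input=10):
    """Split (N, T, H, W) sequences along the time axis.

    Returns the first n_input frames (conditioning) and the remaining
    frames (prediction targets).
    """
    return seqs[:, :n_input], seqs[:, n_input:]


def future_prediction_cross_entropy(pred, target, eps=1e-7):
    """Binary cross entropy between predicted and true future frames.

    pred, target: float arrays of shape (N, T, H, W) with values in [0, 1].
    Assumption: the loss is summed over frames and pixels within a
    sequence, then averaged over the N sequences.
    """
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    ce = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    return ce.reshape(ce.shape[0], -1).sum(axis=1).mean()


# Synthetic stand-in for a few test sequences (not the real data).
rng = np.random.default_rng(0)
seqs = rng.random((4, 20, 64, 64))

inputs, targets = split_sequences(seqs)

# Hypothetical baseline: predict every future frame as a copy of the
# last conditioning frame.
baseline_pred = np.repeat(inputs[:, -1:], targets.shape[1], axis=1)

loss = future_prediction_cross_entropy(baseline_pred, targets)
print("baseline cross entropy per sequence:", loss)
```

To score a real model, replace `baseline_pred` with the model's predicted frames for the same sequences. Whether the result is comparable to the published numbers depends on matching the paper's exact normalization, which this sketch does not claim to reproduce.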

Code

unsup_video_lstm.tar.gz [119 KB]

Papers

Unsupervised Learning of Video Representations using LSTMs [pdf]
Nitish Srivastava, Elman Mansimov, Ruslan Salakhutdinov
ICML 2015.

Updated arXiv version with more details:
Unsupervised Learning of Video Representations using LSTMs [arxiv]
Nitish Srivastava, Elman Mansimov, Ruslan Salakhutdinov