A Survey on Video Prediction: From Deterministic to Generative Approaches
Comprehensive overview of video prediction algorithms
The paper analyzes video prediction algorithms in depth and sorts them into two main groups by how they construct future frames: one group warps pixels from reference frames, while the other generates new frames from scratch. It introduces a novel taxonomy organized around stochasticity, separating deterministic methods that aim for pixel-level accuracy from methods that make stochastic predictions of motion. It also introduces the generative video prediction task, which prioritizes coherent video sequences over pixel-level precision, and proposes a unified approach that combines contextual constraints with generation methods to address these challenges.
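To make the first family concrete, here is a minimal sketch of warping pixels from a reference frame to form a predicted frame. It is an illustration only, not the survey's method: the function name `warp_frame`, the list-of-rows grayscale frame layout, the per-pixel `(dx, dy)` flow convention, and the border clamping are all assumptions chosen for brevity. In a real model the flow field would be predicted by a network rather than supplied by hand.

```python
import math


def warp_frame(frame, flow):
    """Backward-warp a grayscale frame with a per-pixel flow field.

    frame: list of rows of floats (hypothetical layout for this sketch).
    flow[y][x] = (dx, dy): the output pixel at (x, y) is sampled from the
    reference frame at (x + dx, y + dy) using bilinear interpolation.
    Sample positions outside the frame are clamped to the border.
    """
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Source location in the reference frame, clamped to valid range.
            sx = min(max(x + dx, 0.0), w - 1.0)
            sy = min(max(y + dy, 0.0), h - 1.0)
            # Integer corners surrounding the source location.
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            ax, ay = sx - x0, sy - y0
            # Bilinear blend of the four neighbouring reference pixels.
            top = (1 - ax) * frame[y0][x0] + ax * frame[y0][x1]
            bottom = (1 - ax) * frame[y1][x0] + ax * frame[y1][x1]
            out[y][x] = (1 - ay) * top + ay * bottom
    return out


# Usage: a single bright pixel moved one pixel to the right.
reference = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]
shift_right = [[(-1.0, 0.0)] * 3 for _ in range(3)]
predicted = warp_frame(reference, shift_right)
```

Because every output pixel is a blend of existing reference pixels, a warp of this kind can only rearrange content that is already visible. That limitation is one reason the second family generates frames from scratch instead.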
The paper discusses the role of datasets in training and testing video prediction models, emphasizing how their diversity, quality, and characteristics influence model progress, and highlights the importance of higher-dimensional datasets for stronger generalization.
The paper also traces the evolution of video prediction methods, including the integration of dense voxel flow and differentiable routing modules to capture motion at varying scales. It further considers future semantic segmentation, which predicts semantic maps to enrich scene comprehension.

