Lester: rotoscope animation through video object segmentation and tracking
Method for automating the creation of 2D retro-style animations from input videos by segmenting objects, tracking masks, simplifying geometry, and adding optional finishing details.
The paper presents Lester, a method that automatically generates 2D retro-style animations from input videos. It treats the task as an object segmentation problem: the Segment Anything Model (SAM) produces masks for visual traits such as hair, skin, and clothes, and these masks are tracked across frames to keep the result temporally consistent. The geometry of the masks is then simplified to achieve the desired visual style. Optional finishing details, including color palette-based coloring, facial features, shadow effects, and pixelation, can be added to customize the look of the outcome.
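To make the geometry simplification step concrete, here is a minimal sketch that reduces the vertex count of a mask outline with the Ramer-Douglas-Peucker algorithm. This summary does not say which simplification algorithm Lester uses, so the choice of algorithm, the function names, and the `tolerance` parameter are all assumptions made for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all are (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_polyline(points, tolerance):
    """Ramer-Douglas-Peucker (illustrative stand-in, not the paper's code).

    Vertices that lie within `tolerance` of the chord joining the
    endpoints of their sub-polyline are dropped.
    """
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior vertex farthest from the chord first-last.
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [first, last]
    # Keep the farthest vertex and simplify both halves recursively.
    left = simplify_polyline(points[: index + 1], tolerance)
    right = simplify_polyline(points[index:], tolerance)
    return left[:-1] + right


# A slightly noisy outline fragment with one real corner at (4, 3).
outline = [(0, 0), (1, 0.1), (2, -0.1), (3, 0), (4, 3), (5, 0)]
simplified = simplify_polyline(outline, 0.5)
```

A larger `tolerance` gives a coarser, more angular outline, which is one plausible way to push a mask toward a retro look. A small value keeps the contour close to the original segmentation.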
The paper discusses two main strategies for automating retro-style animation generation. The first relies on conditional generative models, while the second applies 3D human pose estimation to track 3D poses across frames. Lester is positioned as a third approach, based mainly on segmentation and tracking, that offers predictability, better temporal consistency, and practicality, since it does not require a custom 3D model for each character.
The main input to Lester is a target video sequence of a human performance. The method imposes no general constraints on the depicted person's traits, pose, or orientation, nor on camera movements, background, lighting, occlusions, truncations, or resolution. The paper also presents graphical overviews of the contour simplification stage and the coloring process, as well as details about the finishing touches such as color palette-based coloring, facial features, shadow effects, and pixelation.
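Two of the finishing touches named above, palette-based coloring and pixelation, can be sketched in a few lines. This is a rough illustration under stated assumptions, not the paper's procedure: images are plain nested lists of RGB tuples, each pixel is mapped to its nearest palette entry by squared RGB distance (the paper may instead assign colors per segment), and pixelation copies the top-left pixel of each tile. All function names and the `block` parameter are hypothetical.

```python
def nearest_palette_color(color, palette):
    """Return the palette entry with the smallest squared RGB distance to `color`."""
    return min(palette, key=lambda p: sum((a - b) ** 2 for a, b in zip(color, p)))


def pixelate(image, block):
    """Replace every block x block tile by the value of its top-left pixel."""
    height, width = len(image), len(image[0])
    return [
        [image[(y // block) * block][(x // block) * block] for x in range(width)]
        for y in range(height)
    ]


def stylize(image, palette, block):
    """Quantize colors to the palette, then pixelate (hypothetical helper)."""
    quantized = [[nearest_palette_color(px, palette) for px in row] for row in image]
    return pixelate(quantized, block)


# A tiny 2 x 4 "frame" and a three-color palette, both made up for the example.
palette = [(0, 0, 0), (255, 255, 255), (200, 50, 50)]
frame = [
    [(10, 10, 10), (250, 250, 250), (210, 60, 40), (0, 0, 0)],
    [(255, 255, 255), (5, 5, 5), (20, 20, 20), (190, 40, 60)],
]
result = stylize(frame, palette, 2)
```

With `block = 2`, each 2 x 2 tile of `result` takes the quantized color of its top-left pixel, so the left tile becomes black and the right tile becomes the red palette entry.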

