Distribution-Aware Data Expansion with Diffusion Models
A data expansion framework that creates diverse, distribution-consistent samples, yielding significant accuracy improvements across image datasets.
DistDiff is a framework designed to address the persistent challenge of data scarcity. It combines hierarchical clustering with multi-step energy guidance to expand training data accurately and at scale. Unlike prior methods, which augment datasets with predefined perturbations, DistDiff refines the denoising process within the diffusion sampling mechanism itself, which leads to notably better optimization outcomes.
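To make the idea of guiding the denoising process concrete, here is a minimal toy sketch in numpy. The source does not specify the exact update rule, so the function names (`energy_grad`, `guided_step`), the squared-distance energy, the DDIM-style update, and the guidance scale are all illustrative assumptions, not the paper's actual method:

```python
import numpy as np

def energy_grad(x0_pred, prototypes):
    """Gradient of a toy energy E(x) = min_k ||x - p_k||^2 (assumed form):
    the gradient pulls x toward its nearest prototype."""
    dists = np.linalg.norm(prototypes - x0_pred, axis=1)
    nearest = prototypes[np.argmin(dists)]
    return 2.0 * (x0_pred - nearest)

def guided_step(x_t, x0_pred, alpha_bar_t, alpha_bar_prev,
                prototypes, scale=0.1):
    """One deterministic DDIM-style step where energy guidance is applied
    to the predicted clean sample x0_pred (a sketch, not the real method)."""
    # Nudge the predicted clean point toward the reference distribution.
    x0_guided = x0_pred - scale * energy_grad(x0_pred, prototypes)
    # Recover the implied noise from the guided clean estimate,
    # then take the standard DDIM update to the previous timestep.
    eps = (x_t - np.sqrt(alpha_bar_t) * x0_guided) / np.sqrt(1.0 - alpha_bar_t)
    return np.sqrt(alpha_bar_prev) * x0_guided + np.sqrt(1.0 - alpha_bar_prev) * eps
```

The key design point mirrored here is that the correction acts on the *predicted clean sample* inside the sampling loop, rather than perturbing the finished output.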
Central to DistDiff is a novel energy function that approximates the real data distribution using predicted clean data points. Hierarchical prototypes serve as anchor points for this energy guidance, and predictions are optimized across multiple sampling stages for finer control. A key step is determining the number of group-level prototypes required to faithfully reflect the real data distribution.
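One plausible way to build such hierarchical prototypes is a two-level scheme: a class-level prototype (the class mean) plus several group-level prototypes obtained by clustering within each class. The sketch below assumes k-means for the group level and a `groups_per_class` knob for the prototype count; both choices are illustrative, since the source does not pin down the clustering details:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means on rows of X; returns k cluster centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center, then recompute means.
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def hierarchical_prototypes(features, labels, groups_per_class=3):
    """Per class: one class-level prototype (mean) and up to
    groups_per_class group-level prototypes (within-class k-means)."""
    protos = {}
    for c in np.unique(labels):
        Xc = features[labels == c]
        k = min(groups_per_class, len(Xc))
        protos[c] = {"class": Xc.mean(axis=0), "group": kmeans(Xc, k)}
    return protos
```

In this toy setup, `groups_per_class` plays the role of the group-level prototype count the text highlights: too few prototypes smooth over intra-class structure, while too many overfit individual samples.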