🧠
Breaking the Memory Barrier: Near-Infinite Batch Size Scaling for Contrastive Loss

This research paper introduces Inf-CL, a novel approach for contrastive learning that dramatically reduces GPU memory usage during training, allowing for near-infinite batch sizes. The authors address the quadratic memory growth of traditional methods by implementing a tile-based computation strategy that partitions the contrastive loss calculation into smaller, sequentially computed blocks. To further enhance efficiency, they propose a multi-level tiling strategy that leverages ring-based communication at the GPU level and fused kernels at the CUDA core level, minimizing I/O overhead. The experiments demonstrate that Inf-CL significantly outperforms previous methods, achieving unprecedented batch sizes while maintaining accuracy and comparable training speed. This breakthrough opens new possibilities for large-scale contrastive learning, paving the way for advancements in areas such as self-supervised learning and dense text retrieval.
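To make the tile-based idea concrete, here is a minimal PyTorch sketch of an image-to-text InfoNCE loss computed block by block with an online log-sum-exp, so the full B×B similarity matrix is never materialized at once. The function name `tiled_infonce`, the tile size, and the temperature default are illustrative assumptions, not the paper's API; Inf-CL additionally fuses these steps into CUDA kernels and distributes row blocks across GPUs with ring-based communication, which this sketch does not attempt.

```python
# A minimal sketch of tile-wise contrastive loss, assuming `img` and `txt`
# are L2-normalized embeddings of shape (B, D). Names and defaults here are
# illustrative, not taken from the Inf-CL implementation.
import torch

def tiled_infonce(img, txt, tau=0.07, tile=1024):
    B = img.shape[0]
    loss = img.new_zeros(())
    for r0 in range(0, B, tile):
        r1 = min(r0 + tile, B)
        rows = img[r0:r1]                            # (t, D) query tile
        # Running log-sum-exp state for each row in this tile.
        m = img.new_full((r1 - r0,), float("-inf"))  # running max of logits
        s = img.new_zeros(r1 - r0)                   # running sum of exp(logit - max)
        pos = (rows * txt[r0:r1]).sum(-1) / tau      # matching-pair (diagonal) logits
        for c0 in range(0, B, tile):
            c1 = min(c0 + tile, B)
            logits = rows @ txt[c0:c1].T / tau       # (t, tile) block of similarities
            blk_max = logits.max(dim=-1).values
            new_m = torch.maximum(m, blk_max)
            s = s * torch.exp(m - new_m) + torch.exp(logits - new_m[:, None]).sum(-1)
            m = new_m
        lse = m + torch.log(s)                       # log-sum-exp over the full row
        loss = loss + (lse - pos).sum()              # -log softmax at the positive pair
    return loss / B
```

Note that with plain autograd each tile's intermediates are still saved for the backward pass, so this sketch only illustrates the partitioning; the memory savings reported in the paper come from its fused forward-backward kernels and multi-level tiling.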
📎
Link to paper