It has been demonstrated that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch.
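To make the caption-matching task concrete, here is a minimal sketch in Python with NumPy. It assumes the task is set up contrastively: each image in a batch is scored against every caption in the same batch, and the correct caption is the one at the same index. The text above does not specify the loss, so the symmetric cross-entropy form, the function names (`caption_matching_loss`, `predict_captions`), and the `temperature` default are illustrative assumptions, and the embeddings stand in for the outputs of whatever image and text encoders are being trained.

```python
import numpy as np


def _normalize(x):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def _log_softmax(x, axis):
    # Numerically stable log-softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))


def caption_matching_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric cross-entropy over an image-by-caption similarity matrix.

    image_emb, text_emb: arrays of shape (batch, dim); row i of each is
    assumed to come from the same (image, caption) pair.
    temperature: hypothetical default chosen for illustration only.
    """
    logits = _normalize(image_emb) @ _normalize(text_emb).T / temperature
    idx = np.arange(logits.shape[0])
    # For each image, which caption is correct? (softmax over captions)
    loss_image_to_text = -_log_softmax(logits, axis=1)[idx, idx].mean()
    # For each caption, which image is correct? (softmax over images)
    loss_text_to_image = -_log_softmax(logits, axis=0)[idx, idx].mean()
    return (loss_image_to_text + loss_text_to_image) / 2


def predict_captions(image_emb, text_emb):
    # Index of the most similar caption for each image.
    return (_normalize(image_emb) @ _normalize(text_emb).T).argmax(axis=1)
```

In this sketch the loss is low when each image embedding is closest to its own caption embedding and high when the pairing is scrambled, which is the training signal the sentence above refers to.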