PyTorch Lightning code for the paper "RMCL: Robust Multimodal Contrastive Learning". Slides of our talk are available here.
pip install -r requirements.txt
pip install -e .
We provide five sets of pretrained weights:
- ViLT-B/32 pretrained with MLM+ITM for 200k steps on GCC+SBU+COCO+VG (ViLT-B/32 200k) link
- ViLT-B/32 200k finetuned on VQAv2 link
- ViLT-B/32 200k finetuned on NLVR2 link
- ViLT-B/32 200k finetuned on COCO IR/TR link
- ViLT-B/32 200k finetuned on F30K IR/TR link
Synonym selection for the Geometric-based attack uses cosine similarity scores between word pairs, computed from the counter-fitting word embeddings link
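To illustrate the idea, here is a minimal sketch of ranking synonym candidates by cosine similarity. It is not the code used in this repository: the function names `cosine_similarity` and `top_synonyms`, the `k` and `threshold` parameters, and the three-dimensional toy vectors are all assumptions made for the example. Real counter-fitting embeddings are much higher-dimensional and would be loaded from the linked file.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_u * norm_v)


def top_synonyms(word, embeddings, k=5, threshold=0.5):
    """Return up to k (candidate, score) pairs, highest similarity first.

    `embeddings` maps each word to its vector. `k` and `threshold` are
    hypothetical knobs for this sketch, not values taken from the paper.
    """
    if word not in embeddings:
        return []
    query = embeddings[word]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    # Keep only candidates that are similar enough, then rank them.
    candidates = [(w, s) for w, s in scored if s >= threshold]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:k]


# Toy vectors invented for illustration only.
toy_embeddings = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "bad": [-0.9, 0.1, 0.0],
    "car": [0.0, 0.1, 0.9],
}

for candidate, score in top_synonyms("good", toy_embeddings, k=2):
    print(f"{candidate}: {score:.3f}")
```

With these toy vectors, only "great" clears the threshold for "good"; "bad" points in nearly the opposite direction and "car" is close to orthogonal, so both are filtered out.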
See DATA.md
See TRAIN.md
See EVAL.md
If you use any part of this code or the pretrained weights in your own work, please cite our [paper].