Deep Vision Transformer and T5-Based for Image Captioning | IEEE Conference Publication | IEEE Xplore