MV-GPT: A New Generative Pre-Training Framework for Multimodal Video Captioning (ai.googleblog.com)
1 point by tech-sucker, 4 years ago