Multi-modal generative models combine:
Answer options
A. Only text and rules
B. Text, images, audio and other modalities
C. Hardware sensors only
D. Only GANs
Correct answer: Text, images, audio and other modalities
Explanation
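Quick Answer: The correct answer is "Text, images, audio and other modalities" because "multi-modal" means working across more than one type of data, which is exactly what this option describes. Each of the other options names only a single data type, input source, or model family.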
Multi-modal generative models process and generate content across multiple data types, such as text, images, and audio, which enables richer cross-modal understanding and generation.
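To make the idea of combining modalities more concrete, here is a minimal toy sketch in Python. It assumes one common pattern: each modality gets its own encoder that maps raw input into a shared vector space, and the resulting vectors are averaged into one joint representation. The encoders, the `fuse` function, and the dimension `DIM` are all hypothetical names invented for illustration; real systems use learned neural encoders, not these hand-written ones.

```python
from typing import Callable, Dict, List

Vector = List[float]
DIM = 4  # size of the shared embedding space (arbitrary, for illustration only)

def encode_text(text: str) -> Vector:
    # Toy text encoder (hypothetical): bucket character codes into DIM slots.
    vec = [0.0] * DIM
    for i, ch in enumerate(text):
        vec[i % DIM] += ord(ch) / 1000.0
    return vec

def encode_image(pixels: List[List[int]]) -> Vector:
    # Toy image encoder (hypothetical): bucket grayscale values (0-255) into DIM slots.
    flat = [p for row in pixels for p in row]
    vec = [0.0] * DIM
    for i, p in enumerate(flat):
        vec[i % DIM] += p / 255.0
    return vec

def encode_audio(samples: List[float]) -> Vector:
    # Toy audio encoder (hypothetical): bucket absolute sample amplitudes into DIM slots.
    vec = [0.0] * DIM
    for i, s in enumerate(samples):
        vec[i % DIM] += abs(s)
    return vec

# One encoder per modality; every encoder returns a vector of length DIM.
ENCODERS: Dict[str, Callable[..., Vector]] = {
    "text": encode_text,
    "image": encode_image,
    "audio": encode_audio,
}

def fuse(inputs: Dict[str, object]) -> Vector:
    # Encode each provided modality, then average the vectors element-wise
    # to obtain a single joint representation.
    embeddings = [ENCODERS[name](value) for name, value in inputs.items()]
    return [sum(column) / len(embeddings) for column in zip(*embeddings)]

joint = fuse({
    "text": "a cat on a mat",
    "image": [[0, 128], [255, 64]],
    "audio": [0.1, -0.2, 0.3, -0.4],
})
print(len(joint))
```

Because every modality lands in the same vector space, a downstream generator could in principle condition on the joint vector regardless of which inputs were supplied. Averaging is only one simple fusion choice; concatenation or attention-based fusion are alternatives.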