Multimodal AI

Templates for video, vision, and speech workflows

These templates show patterns for processing text together with images, video, and audio.