November 26, 2025

Images To Video – Nano Banana 3, Veo, Flow, Gemini Word Salad

By: Stephen Toback

What started as a test prompt from Nick Janes to help with some Nano Banana Pro 3 testing ended up as an interesting video.

You can read more information about how I used reference videos to create a series of images with the same main character, setting and style here: https://sites.duke.edu/ddmc/2025/11/25/realism-consistency-and-responsibility-in-ai-image-creation-comparing-gemini-nano-banana-pro-3-and-chatgpt/

I took it a step further and used Google Flow to turn the still images into videos. It took several renders, plus prompts written with Gemini and ultimately ChatGPT, but I got some good results that look like stylized animation or rotoscoping.

The voice over was Eleven Labs. I tried using v3 with tags to change the speed, but it didn’t work (I’ll follow up on that next week). I ended up using v2 and sped it up, but I still needed speed changes in Final Cut: I set the audio to 110% to make it faster and the video to 90% to make it slower. That lined up nicely without too many video or audio artifacts.
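For anyone trying the same trick, here is a minimal sketch of the arithmetic behind those speed changes. It assumes a constant speed change across the whole clip, and the 60-second clip lengths are made-up values for illustration, not the actual timings from this project. The function name is hypothetical too.

```python
def retimed_duration(original_seconds: float, speed_percent: float) -> float:
    """Return a clip's duration after a constant speed change.

    speed_percent follows the editor convention: 100 is unchanged,
    110 plays faster (shorter clip), 90 plays slower (longer clip).
    """
    if speed_percent <= 0:
        raise ValueError("speed_percent must be positive")
    return original_seconds * 100.0 / speed_percent


# Hypothetical clip lengths, chosen only to show the effect.
audio_seconds = 60.0
video_seconds = 60.0

# Audio at 110% gets shorter; video at 90% gets longer.
print(f"{retimed_duration(audio_seconds, 110):.2f}")  # 54.55
print(f"{retimed_duration(video_seconds, 90):.2f}")   # 66.67
```

Because the two adjustments move in opposite directions, small percentages on each side can close a fairly large gap between narration and footage without pushing either one far enough to sound or look wrong.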

I used Eleven Labs music for the first time. It worked well but wasn’t quite as precise as I had anticipated. I hope to do more work on using scene adjustments to match the music to the video.

Here’s the final video. I used ChatGPT to create the fake hospital bug.

Since I used Veo 3.1, it automatically generated dialog. I had originally hoped to generate actual lip-synced dialog, but I still can’t figure out how to create specific dialog consistently. 3 out of 4 ain’t bad, matey (LOL).

The combination of learning and technological enhancement makes this better and definitely more interesting.

Categories: Artificial Intelligence (AI), Audio
