The world of filmmaking technology is advancing at an incredible pace thanks to artificial intelligence. This past week saw several groundbreaking announcements that will change visual effects and film editing forever. In this post, we’ll break down the newest AI tools and how they could be used by filmmakers and storytellers.
Pika 1.0 – Video Generation Leaps Forward
A new AI model called Pika 1.0 is set to revolutionize video editing and effects.
Developed by Pika Labs, this system allows users to upload video clips and use text prompts to modify the footage. The results are shockingly good for an initial release.
Some examples of what’s possible with Pika 1.0:
- Change the environment and backgrounds behind footage to customize scenes
- Expand and extrapolate environments outwards from existing video
- Add new elements like smoke, fire, animals, objects etc. into shots
- Simulate physics and effects like sparks and debris
Check out these Pika 1.0 Examples to see it in action.
It´s cold in Berlin.
But @pika_labs and @MatanCohenGrumi got me covered! Literally.
This is so much fun, thank you for giving us a whole new way to play and to explore with Pika 1.0!
The future for storytellig is bright. pic.twitter.com/4zPPol76Le
— Martin Haerlin (@Martin_Haerlin) December 6, 2023
The ease of use combined with stellar results means Pika has enormous potential. Custom VFX shots that once took hours of manual work could become drastically faster to produce.
Stable Diffusion Turbo – 150 FPS Image Generation
If you thought AI image generation couldn’t get any faster, think again. Stable Diffusion Turbo is a new optimized model that can create realistic images at up to 150 frames per second!
For the first time, Stable Diffusion runs smoothly in real time on consumer GPUs. This opens possibilities like:
- Instantly generating tons of variations to pick the best image
- Using images as custom textures in 3D applications
- Live visualization of image generation during creative sessions
The quality may not beat tools like Midjourney yet, but the pure speed is groundbreaking.
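To put the 150 FPS figure in perspective, here is a quick back-of-the-envelope sketch in Python. The 24 FPS playback rate is our own assumption (the standard cinema frame rate), and the function names are purely illustrative.

```python
def frame_budget_ms(fps: float) -> float:
    """Return the time available per frame, in milliseconds, at a given rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def images_per_playback_frame(gen_fps: float, playback_fps: float = 24.0) -> float:
    """Return how many images are generated during one frame of playback."""
    if playback_fps <= 0:
        raise ValueError("playback_fps must be positive")
    return gen_fps / playback_fps


if __name__ == "__main__":
    # 150 FPS is the peak rate quoted above; 24 FPS is an assumed film rate.
    print(f"{frame_budget_ms(150):.2f} ms per generated image")    # 6.67
    print(f"{frame_budget_ms(24):.2f} ms per film frame")          # 41.67
    print(f"{images_per_playback_frame(150):.2f} images per film frame")  # 6.25
```

In other words, at the quoted peak rate the model would finish roughly six images in the time a single film frame stays on screen, which is what makes live previews plausible.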
Meta AI Images – A Serious Contender
Yet another big tech company has jumped into AI image generation. Meta AI Images looks like a very solid offering for free image generation without any sign up.
In our testing, Meta’s results appear polished and photorealistic out of the box. However, the tool lacks deeper customization, such as control over aspect ratio. But as a starting point, Meta could give Midjourney a run for its money in terms of image quality.
We expect Meta will refine the tool over time by adding user accounts, libraries, etc. For now it’s still very new but shows immense promise already.
New AI Speech Generator – Cloning Voices From Seconds of Audio
Think deepfakes were scary? Well, researchers just unveiled a new AI that can clone anyone’s voice from only a few seconds of audio.
Some examples and demos can be found in this Twitter thread. The results are nearly indistinguishable from the real person’s voice.
StyleTTS 2: Towards Human-Level Text-to-Speech Voice-Clone 🎙 Colab 🥳
Thanks to Yinghao Aaron Li ❤ Cong Han ❤ @VinaySRaghavan ❤ Gavin Mischler ❤ @NimaMesgarani ❤
🌐page: https://t.co/UB4SjJbnwq
📄paper: https://t.co/cS5IyhpH3m
🧬code: https://t.co/0WExrNyo0q
🦒colab by… pic.twitter.com/nAuFuSk87W
— camenduru (@camenduru) December 4, 2023
Having just a short voice clip is enough for the AI to build a profile and generate entirely new speech in that same voice. The applications for filmmaking include:
- Vocal performances for voiceovers and ADR
- Dialog replacement while retaining the original actor’s voice
- Adding speech to existing footage
As with all AI, there are certainly ethical concerns around misuse. But used properly, this tech could save massive costs for animations and other productions needing speech elements.
Google’s Gemini vs ChatGPT – The Race For Smartest AI
Google has unveiled Gemini, its own family of AI models meant to compete with the technology behind ChatGPT. Early demos position Gemini as extremely capable, and Google reports that it outperforms GPT-4 in many areas.
- Gemini comes in three tiers: Nano for mobile devices, Pro (which now powers Bard), and the upcoming Ultra.
- Google's published results show Gemini Ultra beating the previous state of the art on 30 of 32 benchmarks.
- Demos show versatile behavior across many domains.
However, we found ChatGPT still edges out Gemini slightly when it comes to open-ended creativity and ideation. The next year will be a battle for supremacy in this space. For now, Gemini is incredibly impressive but still catching up to ChatGPT’s creative abilities.
ZeroTen AI – Virtual Fashion & Cinematography
Here’s an eye-opening demo: AI that lets you virtually try on clothes and automatically conforms them to the wearer in a video. Developed by ZeroTen, this tech has major implications for costume design in films.
Experiments with generative #AI try-on 👀 pic.twitter.com/PzMnVZYa3k
— Denis Rossiev (AR/AI) (@Enuriru) December 7, 2023
Directors could scout clothing styles and test options before production begins. Costume changes can happen in seconds without reshoots. Even background actors could have wardrobe adjustments to fit scenes.
Take this concept further and actors may not need to wear physical costumes at all. Instead, CG clothing could be mapped onto them in real time during filming or editing. The possibilities are endless!
Magic Animate & Animate Anyone – AI Assisted Animation
Animators rejoice – AI will soon become your best friend! Two new tools called Magic Animate and Animate Anyone can animate a still image using only that image and a short video clip of the desired motion.
The workflow looks like this:
- Upload any artwork, drawing, or photo
- Provide a short video clip showing motion
- AI will animate the upload to match the video
Results are surprisingly fluid and well integrated. This opens up concepts like:
- Quickly animating character concept art
- Using AI mocap for faster animation
- Iterating through animation styles rapidly
As these tools improve, expect massive time savings for 2D and 3D animators.
Amazon Makes A Splash – Enterprise AI Images
Not one to be left out, Amazon leaped into the fray by announcing an AI image generator for businesses offered through AWS. Teams can customize models around specific products, styles, and content types.
Amazon is taking an intelligent approach here, letting users directly shape results instead of relying on a one-size-fits-all model. Teaming up with Getty Images for content, they could produce very tailored outputs.
For companies regularly needing custom product renders, mockups, and visualization, this looks like a fantastic offering, especially because Amazon assumes all legal liability related to generated images. That means peace of mind for business users without reining in creativity.
Edit Video By Typing? New Multimodal Editor
In the future, editing videos may not require touching a mouse or clicking any buttons. Developer Morten Just demoed an editor that takes typed commands to apply effects.
We made an app that lets you drag in a video and make changes by typing pic.twitter.com/uMoMB2Wt20
— Morten Just (@mortenjust) December 5, 2023
Users can type instructions like:
- Fade to black
- Increase contrast
- Make duration 10 seconds
- Add earthquake shake
And the software intuitively performs the requested edits!
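To make the idea concrete, here is a minimal Python sketch of how typed instructions like the ones above could be turned into structured edit operations. This is not how the demoed app works (we haven't seen its code, and a real tool may well lean on a language model instead of fixed patterns). The `EditOp` type, the operation names, and the `parse_command` function are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Any


@dataclass
class EditOp:
    """A hypothetical structured edit that a video engine could apply."""
    name: str
    params: dict[str, Any] = field(default_factory=dict)


def parse_command(text: str) -> EditOp:
    """Map one typed instruction to an EditOp using simple pattern rules."""
    command = text.strip().lower()

    if command == "fade to black":
        return EditOp("fade", {"target": "black"})

    # e.g. "increase contrast", "decrease brightness"
    match = re.fullmatch(
        r"(increase|decrease) (contrast|brightness|saturation)", command
    )
    if match:
        direction, prop = match.groups()
        return EditOp("adjust", {"property": prop, "direction": direction})

    # e.g. "make duration 10 seconds"
    match = re.fullmatch(r"make duration (\d+(?:\.\d+)?) seconds?", command)
    if match:
        return EditOp("set_duration", {"seconds": float(match.group(1))})

    # e.g. "add earthquake shake": anything after "add" is the effect name
    match = re.fullmatch(r"add (.+)", command)
    if match:
        return EditOp("add_effect", {"effect": match.group(1)})

    raise ValueError(f"Unrecognized command: {text!r}")


if __name__ == "__main__":
    for line in ["Fade to black", "Increase contrast",
                 "Make duration 10 seconds", "Add earthquake shake"]:
        print(parse_command(line))
```

Fixed rules like these break down quickly with free-form phrasing, which is exactly where a language model earns its keep. The structured output is the useful part either way: once a sentence becomes an operation with parameters, the editor can apply, undo, or tweak it like any other edit.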
This multimodal approach combining text, speech, and visualization points toward a major shift in editing. Soon, manually clicking around timelines could be optional for rough cuts, with editors refining by hand from there as needed.
Runway + Getty Images Join Forces
Two more giants have teamed up to offer business-oriented image generation. Runway provides the backend AI technology while Getty Images supplies high-quality content.
Companies can train customized models using their own products and assets mixed with Getty’s media vaults. Outputs adhere to strict legal standards.
For advertising and ecommerce, this looks like an unbeatable combination. Brands retain full rights and ownership too.
Expect vivid product mockups, powerful concept imagery, and video showcasing offerings, all with custom AI fine-tuned for a specific business.
Magnific AI Upscales Classic Video Games
As we’ve covered before, AI upscaling can radically improve old images and footage. But one person took things further by feeding classic video games into tools like Magnific.
Powered by Stable Diffusion variants, these tools transform rasterized game visuals into nearly photorealistic scenes. Characters, textures, materials – everything gets enhanced.
Some striking examples include:
- Grand Theft Auto: Vice City Stories looks like live action
- Minecraft characters become remarkably lifelike
Saw a bunch of people using Magnific to upscale old PS1 games, so I wanted to see what could be done on the PS3 front– and, well– why not use one of the most remastered games ever?
More Details Below: pic.twitter.com/fx2yALjABX
— Theoretically Media (@TheoMediaAI) December 4, 2023
This technique shows promise for resurrecting retro game worlds or adding realism to new games built using pixel art styles.
Dead Actor Revived To Voice New Stories
In an impressive new demonstration, meditation app Calm managed to reproduce the iconic voice of Jimmy Stewart to narrate new stories. Stewart passed away over 25 years ago, making this a significant achievement.
Calm utilized just a few samples of Stewart’s speech to train an AI voice model. From there it generated completely new readings in the same familiar tone.
This provides just a glimpse of the potential to revive actors or use AI voice doubles in films going forward. As the quality improves, it may become difficult to discern what’s real and what’s AI-created.
Photorealistic Painting Restorations
AI can not only create art from scratch but also enhance existing works. Twitter user @PurzBeats showcased a tool that massively increases the quality of classic paintings.
Fed low-res images of iconic art pieces, the AI delivers shockingly good upscaling and colorization. Brush strokes become pronounced, details emerge, and the whole painting appears virtually real.
For preserving historical art this holds tremendous potential. Of course it also enables reimagining paintings or even full films using upgraded versions.
Leonardo Live Canvas – AI Assisted Concept Art
Leonardo Live Canvas provides a brilliant bridge between digital painting and AI generation. Artists can sketch scenes or drawings, then use sliders to dial creativity or realism up and down.
- Lower creativity adheres more strictly to hand-drawn elements
- Higher creativity adds unexpected surprises based on prompts
- Photorealism renders drawings as polished 3D scenes
This makes iterating visual ideas incredibly fast. Storyboarders, concept designers, and other previs roles stand to benefit greatly from tools like Live Canvas. Initial ideas guide the AI while leaving room for magical outputs impossible to imagine otherwise.
The Future Is Here
This roundup gives just a taste of the lightning-fast progress in AI applications for every creative field. What did we miss? Share your thoughts in the comments about these groundbreaking new tools!