Wan 2.2 AI Video Generator

Wan 2.2 is an open-source generative AI video model from Alibaba's DAMO Academy, publicly released on July 28, 2025. It introduces a Mixture-of-Experts (MoE) architecture into the video diffusion model, which significantly enhances model capacity and performance without increasing inference costs. The model is notable for its cinematic-level aesthetics, high-definition 1080p output, and its ability to generate complex, fluid motion with greater control than previous models.

Example videos

Generated by Wan 2.2

Prompt
Sidelit, soft light, high contrast, medium shot, centered composition, clean single subject frame, warm tones. A young man stands in a forest, his head gently lifted, with clear eyes. Sunlight filters through leaves, creating a golden halo around his hair. Dressed in a light-colored shirt, a breeze plays with his hair and collar as the light dances across his face with each movement. Background blurred, featuring distant dappled light and soft tree silhouettes.
Prompt
A purely visual and atmospheric video piece focusing on the interplay of light and shadow, with a corn train as the central motif. Imagine a stage bathed in dramatic, warm spotlights, where a corn train, rendered as a stark silhouette, moves slowly across the space. The video explores the dynamic interplay of light and shadow cast by the train, creating abstract patterns, shapes, and illusions that dance across the stage. The soundtrack should be ambient and minimalist, enhancing the atmospheric and abstract nature of the piece.
Prompt
Wide shot. The video shows a person in a red outfit standing on an escalator, facing away from the camera. The escalator is moving upwards, and the person appears to be stationary. The surroundings are dimly lit with reflective surfaces that create a mirrored effect, giving the impression of multiple identical figures ascending simultaneously.
Prompt
A man on the run, darting through the rain-soaked back alleys of a neon-lit city night, steam rising from the wet pavement. He's clad in a drenched trench coat, his face etched with panic as he sprints down the alley, constantly looking over his shoulder. A chase sequence shot from behind, immersing the viewer deeply, as if the pursuers are right behind the camera lens.
Prompt
A vintage filter with dusk tones captures a calm, thirty-something Black woman seated in a moving subway car. The people around her move back and forth, creating a distinct blur effect, while she remains clearly visible. Soft light and cinematic quality create an enigmatic atmosphere in this moody setting.
Prompt
Aerial acrobatics on a flying airplane wing: a gymnast clad in a red and white gym suit leans forward as strong winds whip her hair and clothes. Suddenly, she leaps into a mid-air cartwheel, landing gracefully on the metal wingtip. Following up, she executes a side flip amidst the roaring air currents. Concluding her routine, she stabilizes herself with both feet firmly planted, fingertips lightly grazing the wing's edge.
Prompt
Under a vast azure sky, illuminated by gentle, warm sunlight from the side, a red-haired woman smiles and laughs joyfully. Her long, curly tresses dance in the breeze. Dressed in a green suit adorned with floral patterns and fitted trousers, she pairs her outfit with striking neon green ankle boots. A large-brimmed straw hat, slightly drooping at the edges, crowns her head. Standing on a rural path blanketed in golden hay, expansive fields and a pristine blue horizon form the backdrop. With hands aloft, she wields a blue garden hose from which a cascade of multicolored flowers erupts instead of water, scattering like fireworks in the air. The blossoms, diverse in hue and shape, gleam with a gentle luster under the sun's rays.

Key features of Wan 2.2


Advanced Motion Generation

Creates complex, fluid, and natural movements in videos, improving realism and coherence.

Cinematic Aesthetics

Trained on meticulously curated data to produce videos with precise control over lighting, color, and composition.

High-Definition Output

Generates videos with native 1080p resolution at 24fps, suitable for professional use.

Mixture-of-Experts (MoE)

Splits the denoising process across specialized expert models, expanding model capacity without raising inference cost.

Cinematic Camera Control

Supports directed camera movement, letting prompts specify framing such as pans, zooms, and tracking shots.

First-Last Frame to Video (FLF2V)

Creates seamless video transitions by interpolating between a specified start and end frame.
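The interpolation idea behind FLF2V can be illustrated with a deliberately naive sketch: a linear crossfade between a given first and last frame. This is only a toy baseline for what "interpolating between frames" means; the actual model synthesizes coherent motion with a diffusion process rather than blending pixels.

```python
# Toy illustration of the first/last-frame idea (an assumption-laden sketch:
# Wan 2.2's FLF2V generates real motion, not the naive crossfade shown here).

def crossfade(first, last, num_frames):
    """Linearly blend two frames (flat pixel lists) into num_frames frames."""
    frames = []
    for i in range(num_frames):
        a = i / (num_frames - 1)  # 0.0 at the first frame, 1.0 at the last
        frames.append([(1 - a) * p + a * q for p, q in zip(first, last)])
    return frames
```

The endpoints of the result reproduce the supplied frames exactly, which is the contract FLF2V-style generation also honors.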

Consumer-Grade GPU Compatibility

A highly compressed 5B model is available that runs on consumer GPUs such as the RTX 4090.

Open-Source and Customizable

The model is publicly available, allowing for fine-tuning with LoRA and other community-developed tools.

Wan 2.2 Capabilities and Use Cases

Cinematic shot of a skateboarder performing a complex trick outdoors.
Complex Motion Generation
Simulates realistic physics and natural motion dynamics for characters and objects.
Cinematic day-to-night landscape time-lapse
First-Last Frame to Video
Creates seamless transitions by interpolating between a specified first and last frame.
Film noir scene of two characters in a shadowed, rainy room.
Cinematic Aesthetic Control
Allows for precise control over the visual style, lighting, and mood of the output.
Yellow helicopter lowers giant banana chips over Bangalore as crowd watches.
High-Definition Output
Renders videos with a native resolution of 1080p, eliminating the need for upscaling.
Sketch transforms into a 3D bluebird under a gentle touch.
LoRA Fine-Tuning
Supports the integration of LoRA models to fine-tune the video's style.
Coca-Cola ad transforms into a realistic 3D bottle with fizz.
Efficient Hybrid TI2V
Uses a single model to support both text-to-video and image-to-video generation.
Cinematic montage of surreal stairs, industrial workshop, golden dance, and glowing digital veil.
Open-Source
The model's architecture and weights are publicly available for download.

How to use Wan 2.2 on Vizard

Here are three simple steps to help you explore Wan 2.2 on Vizard:

Choose the Wan 2.2 model

Go to Vizard’s text-to-video generator and select the Wan 2.2 model.

Enter your prompt

Type your prompt or upload an image to get started.

Download or share your video

Once the video is ready, you can download it or share it on your social media accounts directly through Vizard.

YouTube videos about Wan 2.2

FAQ

What is Wan 2.2?

Wan 2.2 is a state-of-the-art, open-source generative AI video model developed by Alibaba's DAMO Academy. It is a major upgrade to the foundational Wan video model series, designed to create high-quality, cinematic videos from text and image prompts. The model is known for its advanced motion generation and aesthetic controls.

What version(s) are available?

Wan 2.2 is available in several versions with different capabilities. The core open-source models include the efficient TI2V-5B model, which supports both text-to-video (T2V) and image-to-video (I2V) at 720p resolution and can run on consumer-grade GPUs. There are also more powerful 14B models, such as the T2V-A14B and I2V-A14B, which use a Mixture-of-Experts (MoE) architecture for superior quality and performance, suitable for more robust hardware.
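The variant line-up above can be summarized as a small lookup table. This is a convenience sketch using only the names and figures quoted in this answer, not an authoritative spec; check the official model cards for exact details.

```python
# Open-source Wan 2.2 variants as described in this FAQ (sizes/tasks are the
# quoted figures; "moe" marks the Mixture-of-Experts 14B models).

WAN22_VARIANTS = {
    "TI2V-5B":  {"params": "5B",  "tasks": ("t2v", "i2v"), "moe": False},
    "T2V-A14B": {"params": "14B", "tasks": ("t2v",),       "moe": True},
    "I2V-A14B": {"params": "14B", "tasks": ("i2v",),       "moe": True},
}

def variants_supporting(task: str):
    """Return the variant names that support a given task, sorted by name."""
    return sorted(name for name, v in WAN22_VARIANTS.items() if task in v["tasks"])
```

For example, `variants_supporting("i2v")` picks out the hybrid 5B model alongside the dedicated image-to-video 14B model.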

What makes it unique?

Wan 2.2 stands out due to its innovative Mixture-of-Experts (MoE) architecture, which separates the denoising process into specialized stages for better performance without a significant increase in computational cost. It also features cinematic-level aesthetic controls, an ability to generate complex and fluid motion, and a First-Last Frame to Video (FLF2V) function that creates smooth transitions between two images. Its open-source nature allows for community-driven fine-tuning and integration.
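The staged-denoising idea can be sketched as simple timestep routing: early, high-noise steps go to one expert that shapes overall layout, while later, low-noise steps go to another that refines detail. The boundary value and names below are illustrative assumptions, not the released configuration; the point is that only one expert runs per step, so per-step compute stays flat.

```python
# Toy sketch of the two-expert MoE denoising idea (illustrative only: the
# real model uses two full diffusion transformers, and the actual switching
# boundary is an assumption here, not the published value).

def pick_expert(t: float, boundary: float = 0.875) -> str:
    """Route a denoising timestep t in [0, 1] (1 = pure noise) to an expert."""
    return "high_noise_expert" if t >= boundary else "low_noise_expert"

def denoising_schedule(num_steps: int = 8):
    """List which expert handles each step of a simple linear schedule."""
    ts = [1.0 - i / (num_steps - 1) for i in range(num_steps)]
    return [(round(t, 3), pick_expert(t)) for t in ts]
```

Because each step invokes exactly one expert, total parameter count roughly doubles while per-step inference cost matches a single model of the per-expert size, which is the trade-off the answer above describes.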

Is it safe to use?

As an open-source model, the safety of Wan 2.2 largely depends on how it is implemented and used. The developers have established a usage policy that prohibits the generation of illegal, harmful, or misleading content. While the model itself does not have a built-in content moderation system, developers and platforms using Wan 2.2 are expected to implement their own safeguards to ensure responsible use and compliance with legal and ethical standards.

How fast is it?

Wan 2.2 is highly optimized for speed, particularly its TI2V-5B model, which is one of the fastest available at 720p resolution and 24fps. A 5-second video can be generated in just a few minutes on a consumer GPU like an RTX 4090, with more powerful hardware offering even faster results. The speed is further enhanced by its efficient Mixture-of-Experts (MoE) architecture.
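The quoted figures make for easy back-of-the-envelope arithmetic, e.g. how many frames a 5-second clip at 24fps contains and how many pixels each 720p frame holds:

```python
# Back-of-the-envelope sizing for the TI2V-5B example above (720p at 24fps,
# as quoted in this FAQ).

def total_frames(seconds: float, fps: int = 24) -> int:
    """Number of frames in a clip of the given duration."""
    return int(seconds * fps)

def raw_frame_pixels(width: int = 1280, height: int = 720) -> int:
    """Pixel count of a single 720p frame."""
    return width * height
```

A 5-second clip is 120 frames of roughly 0.9 megapixels each, which is why generation still takes minutes even on an RTX 4090.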

Is it accessible via mobile?

Wan 2.2 is primarily a developer-focused, open-source model. It does not have an official, dedicated mobile app from its producer. However, because it is open-source, developers can integrate it into mobile-friendly web applications or create their own mobile apps. Its consumer-grade GPU compatibility also makes it more accessible to users with high-end mobile workstations.

What can it generate or create?

Wan 2.2 is capable of generating a wide variety of video content, from short-form ads and social media clips to cinematic scenes and animations. Its capabilities include text-to-video, image-to-video, and image-based in-painting. Users can generate videos with specific camera movements, precise aesthetic styles, and realistic motion for characters and objects, making it a versatile tool for both technical and creative projects.

How can it be used?

The most common way to use Wan 2.2 is by downloading the model files and running them locally on a compatible machine, often with integration through platforms like ComfyUI or Diffusers. For a more accessible experience, the model is available via cloud API providers. There is also an opportunity to try Wan 2.2 for free through the Vizard platform, which provides an online interface for experimenting with the model's capabilities.

Get started with Wan 2.2 on Vizard now!
