TLDR
- Bilibili has introduced “Codename H,” an AI tool that generates videos from text or audio in minutes.
- The platform aims to capture rising demand for video podcasts, whose watch time on Bilibili grew more than 270% year-on-year.
- Codename H drastically cuts production time and costs, opening doors for more creators.
- With China’s podcast audience projected to hit 150 million, Bilibili is positioning itself for long-term growth.
Bilibili has launched an ambitious new AI tool designed to transform the way content creators produce podcasts, turning written or spoken words into fully rendered videos in just a few minutes.
The tool, internally referred to as “Codename H,” aims to simplify video production by automating visuals using AI trained on text and audio inputs. For creators, especially those focused on podcasts and educational content, this could drastically cut production time while unlocking new audience segments drawn to visual formats.
With “Codename H,” a 1,000-word script can be transformed into a complete video in six minutes, and Bilibili says that processing time will soon be cut in half. The tool also provides a set of templates that make it easier for users to convert traditional podcasts or articles into engaging, animated video episodes.
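For a rough sense of what those figures mean for longer scripts, the arithmetic can be sketched as below. This is only an illustration: it assumes generation time scales linearly with word count, which Bilibili has not stated, and the function name and parameters are hypothetical rather than part of any Bilibili API.

```python
def estimated_render_minutes(word_count, speedup=1.0):
    """Estimate generation time from the quoted rate of 1,000 words in 6 minutes.

    Assumption (not confirmed by Bilibili): time grows linearly with word count.
    `speedup=2.0` models the promised halving of processing time.
    """
    words_per_batch = 1000      # script length quoted by Bilibili
    minutes_per_batch = 6.0     # generation time quoted for that length
    return word_count / words_per_batch * minutes_per_batch / speedup


if __name__ == "__main__":
    print(estimated_render_minutes(1000))               # quoted case: 6.0
    print(estimated_render_minutes(2500))               # longer script, current rate
    print(estimated_render_minutes(2500, speedup=2.0))  # same script after halving
```

Under that linear assumption, a 2,500-word script would take about 15 minutes today and about 7.5 minutes once the promised halving arrives.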
Podcasting Enters the Visual Era
This rollout comes as Bilibili positions itself at the forefront of a shifting media landscape where podcasts are increasingly visual. The platform is actively courting podcast creators this summer, pointing to strong internal numbers that reveal a surge in interest.
According to Bilibili, video podcast watch time on the platform hit 25.9 billion minutes in the first quarter of 2025 alone, a jump of over 270% compared to the same period last year. Over 40 million users are now regularly tuning into this format. The growth reflects a broader trend seen across the industry, where platforms like YouTube have normalized the fusion of audio and video in podcasting.
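The growth figure also implies a prior-year baseline that Bilibili did not state directly. A minimal sketch of that arithmetic follows; the function name is illustrative, and because the growth was reported as "over 270%," the result is an upper bound on the earlier figure rather than an exact number.

```python
def implied_baseline(current, growth_pct):
    """Back out the earlier value from a current value and a percentage increase.

    A 270% increase means current = baseline * (1 + 2.70).
    """
    return current / (1 + growth_pct / 100)


if __name__ == "__main__":
    q1_2025_minutes_bn = 25.9   # billions of minutes, as reported by Bilibili
    growth_pct = 270            # reported as "over 270%", so treat as a floor
    baseline = implied_baseline(q1_2025_minutes_bn, growth_pct)
    print(f"Implied Q1 2024 watch time: at most about {baseline:.1f} billion minutes")
```

In other words, video podcast watch time in the first quarter of 2024 would have been no more than roughly 7 billion minutes.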
Lowering the Barriers to Creation
What makes “Codename H” stand out isn’t just its technical speed but how it alters the economics of creation. Video work that previously required days of editing and design can now be finished in minutes, opening the door for creators who lack the time, budget, or expertise to handle complex video workflows.
Bilibili’s move mirrors a larger industry shift in which AI is not just enhancing creativity but automating large portions of the creative process. The result is a democratization of production power that levels the playing field for emerging voices.
The company is backing the tool with a support strategy aimed at attracting new talent. This includes free recording venues and traffic incentives, ensuring that creators are not only equipped to make content but also to distribute it effectively.
A Strategic Play in a Growing Market
The timing of the release aligns with forecasts from iResearch Consulting that predict China’s podcast user base will reach 150 million by 2025. With traditional entertainment formats maturing and competition stiffening, video podcasting offers Bilibili a fresh growth avenue.
While audio-only shows continue to decline in relative influence, visually enhanced podcasts are gaining favor, particularly among younger audiences seeking a more immersive experience. For Bilibili, “Codename H” isn’t just a tool. It’s a bet on what comes next in media consumption.
As AI continues to evolve, platforms are not just embracing it for backend optimization but placing it at the heart of their content ecosystems. Bilibili’s latest play suggests that the future of podcasting may be more visual, more automated, and more accessible than ever before.