
What We Gain by Automating Video Transcription
Published: 5/23/2025
Managing video at scale is more than just uploading and organizing files. If we want our video libraries to remain searchable, accessible, and actionable over time, we need structured metadata. And one of the most powerful forms of metadata? A full transcript.
So, I started building a workflow that automatically transcribes video assets inside a Sanity.io Media Asset Management (MAM) system! This wasn't about automating for the sake of novelty; it was about solving a common problem: helping video content work harder and smarter for teams.
The Challenge: Making Video Content Searchable
When we upload videos into a CMS or MAM, the default metadata is often limited: file names, upload dates, maybe a few tags if someone had time to add them. But if we want to search video content by topic, quote, speaker, or moment of insight, we need more than that. We need the text behind the video.
Without transcripts, videos risk becoming invisible to the systems and people that rely on them. For editors, marketers, researchers, or executives looking to repurpose or discover assets quickly, that creates friction.
Manually transcribing videos? No thanks. So I built a workflow to help us:
- Detect when a new video is added to the MAM
- Automatically generate a transcription
- Import the transcription back into Sanity as structured metadata
The Workflow We Can Use
Here’s how the flow works, and how it could work for you:
- Video Upload: A new video is added into Sanity’s Media Asset Manager (sanity-plugin-media).
- Detection Trigger: A small backend script polls the MAM or listens for webhook events.
- Python Transcription: The video is temporarily downloaded and processed through a Python-based transcription app. If the video lacks audio, the script skips it gracefully and logs a message.
- Transcription File Creation: The transcript (plain text) is generated and saved with formatting.
- Metadata Update: The transcript is uploaded into Sanity and linked to the original video asset, making it searchable, editable, and usable across systems.
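The five steps above can be sketched as a single pass over newly detected assets. This is a minimal sketch under stated assumptions, not the actual script: `fetch_new_videos`, `download`, `transcribe`, and `save_transcript` are hypothetical callables standing in for the Sanity query, the temporary download, the Python transcription app, and the metadata update.

```python
import logging
from typing import Callable, Iterable, Optional

logger = logging.getLogger("transcribe_workflow")


def process_new_videos(
    fetch_new_videos: Callable[[], Iterable[dict]],
    download: Callable[[dict], str],
    transcribe: Callable[[str], Optional[str]],
    save_transcript: Callable[[dict, str], None],
) -> dict:
    """Run one pass of the workflow and return a count of outcomes.

    All four callables are hypothetical placeholders:
      fetch_new_videos -> assets that do not have a transcript yet
      download         -> local path of the temporarily downloaded file
      transcribe       -> transcript text, or None when there is no audio
      save_transcript  -> writes the transcript back to the MAM
    """
    counts = {"transcribed": 0, "skipped": 0, "failed": 0}
    for asset in fetch_new_videos():
        try:
            path = download(asset)
            text = transcribe(path)
            if text is None:
                logger.warning("No audio in %s, skipping", asset.get("_id"))
                counts["skipped"] += 1
                continue
            save_transcript(asset, text)
            counts["transcribed"] += 1
        except Exception:
            # One bad file should not stop the rest of the batch.
            logger.exception("Failed to process %s", asset.get("_id"))
            counts["failed"] += 1
    return counts
```

Passing the steps in as functions keeps the loop the same whether detection comes from polling or from a webhook handler; only `fetch_new_videos` would change.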
Lessons That Help Teams Work Smarter
Along the way, I ran into some issues. The kind that teams often don’t discover until they’re neck-deep in asset management. Here’s what I learned (so you don’t have to learn it the hard way):
Not all videos have audio. Animations, intro clips, or b-roll often come without sound. The script originally failed on these. Solution: Add a check. If video.audio is None, log a warning and move on.
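Here is roughly what that check could look like. The `video.audio is None` test reads like MoviePy’s `VideoFileClip`, which leaves `audio` as None for silent clips, but treat the library as an assumption. The helper below accepts any object with an `audio` attribute, and its name and logger are illustrative.

```python
import logging

logger = logging.getLogger("transcribe_workflow")


def has_audio(video, name: str = "video") -> bool:
    """Return True if the clip has an audio track, otherwise log and return False.

    `video` is assumed to be any object exposing an `audio` attribute that is
    None when there is no sound (MoviePy's VideoFileClip behaves this way).
    """
    if getattr(video, "audio", None) is None:
        logger.warning("%s has no audio track, skipping transcription", name)
        return False
    return True
```

With MoviePy, the caller would open the clip, call `has_audio(clip, path)`, and only hand the file to the transcriber when it returns True.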
We have to support multiple formats. .mp4 isn’t the only format we’ll see. Some teams use .mov, .mkv, and more. Solution: Expand accepted file types and add format validation.
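A minimal sketch of that validation, assuming an extension allow-list is enough for a first pass. The set below only contains the formats mentioned above; the names `ACCEPTED_EXTENSIONS` and `is_supported_video` are illustrative.

```python
from pathlib import Path

# Assumed allow-list; extend it to match the formats your team actually uses.
ACCEPTED_EXTENSIONS = {".mp4", ".mov", ".mkv"}


def is_supported_video(filename: str) -> bool:
    """Check the file extension against the allow-list, ignoring case."""
    return Path(filename).suffix.lower() in ACCEPTED_EXTENSIONS
```

An extension check only looks at the file name, so a mislabeled file can still fail later in the transcription step. It is a cheap first filter, not a guarantee.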
Automation still needs human review. Auto-generated transcripts can miss context or nuance. Solution: Flag transcripts for optional human review before they’re finalized.
Metadata structure matters. We don’t want to clutter our video entries with unstructured blobs of text. Solution: Link transcripts as related documents, keeping metadata clean and reusable.
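The last two lessons can be combined in one document shape: a separate transcript document that references the video asset and carries a review flag. The sketch below builds a payload in the format Sanity’s HTTP mutation API accepts (`mutations` with a `createOrReplace` entry). The `transcript` type and the `video`, `text`, and `needsReview` field names are hypothetical and would need to match a schema defined in your own studio.

```python
import json


def build_transcript_mutation(asset_id: str, text: str) -> dict:
    """Build a Sanity mutation payload for a transcript linked to a video asset.

    The "transcript" document type and its field names are hypothetical;
    they need to match a schema you define in your own studio.
    """
    document = {
        "_id": f"transcript-{asset_id}",  # deterministic ID, so re-runs overwrite
        "_type": "transcript",
        "video": {"_type": "reference", "_ref": asset_id},
        "text": text,
        "needsReview": True,  # cleared by a human once the text is checked
    }
    return {"mutations": [{"createOrReplace": document}]}


payload = build_transcript_mutation("file-abc123-mp4", "Welcome to the demo.")
print(json.dumps(payload, indent=2))
```

Sending the payload is left out here. It would be posted to the project’s mutate endpoint with a write token. The asset ID in the example is made up.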
Why This Matters for All of Us
If we’re managing hundreds or thousands of video assets, automation is a necessity.
This kind of workflow can:
- Make video content searchable by keyword or concept
- Improve accessibility for all users
- Support SEO for video-based pages
- Feed downstream systems (like recommendation engines or internal search)
- Free up teams from repetitive tasks so they can focus on higher-impact work
Structured, accessible video content is where the future of digital media is headed. And small workflows like this are definitely a start.
Where We Could Go Next
There’s still so much opportunity to evolve this system:
- Auto-summarization: Generate short blurbs or key takeaways from long-form video
- Speaker detection: Tag and search by person for interviews or panels
- Topic tagging: Use AI to classify themes automatically
When we combine transcription with smart metadata, we unlock time, insight, and clarity.
Let’s Learn Together
If you’re building smarter video workflows, supporting creative teams, or managing a growing media library, I’d love to learn what’s working for you — and what challenges you’re tackling.
Let’s compare notes.


If you or your company is struggling with any of these challenges, feel free to contact me to learn how I might help you.