April 15, 2026

The "Push-Button AI" Myth: How I Actually Produce the Depositum Podcast

Author's Note: Whether you're a technically minded builder curious about AI audio pipelines, a critical Christian wondering about the theological rigor of automated media, or a fellow content creator evaluating this workflow, this breakdown is for you.

Let’s address the elephant in the room: people often assume AI-generated podcasts are cheap, easy, lazy, or lacking in rigor.

My experience producing the Depositum podcast has been the exact opposite.

As a Data Analytics Engineer, I wanted to build a daily audio deep-dive that synthesizes dense historical texts using cutting-edge AI. To achieve this at scale as a solo creator, I had to architect a highly automated, hallucination-resistant production workflow.

Building the initial foundation took the equivalent of two to three full-time work weeks, squeezed in around my 9-to-5. Even now, with the pipeline fully operational, a single 5-minute episode takes me 1 to 2 hours to produce.

I am very much a "human in the loop." The AI doesn't run the show—I do. I am constantly engaged, pulling levers, surfacing issues, and shaping the final form of every episode.

Here is the exact, step-by-step technical breakdown of how I built the pipeline—and why the "push-button AI" myth is not a fair characterization for this project.

The Problem & The Tech Stack

Every ambitious project starts by defining a clear problem. Personally, I wanted to know more about my Catholic Faith, but foundational texts are incredibly dense. I wondered if AI could help me (and others) learn efficiently.

From an engineering standpoint, the challenge was specific: How do I generate daily, historically accurate, and engaging audio content at scale without writing every script manually—and without the AI hallucinating or watering down the theology?

Here is the open-source stack and public domain library I used to solve it:

  • 📚 The Source Data: The Douay-Rheims Bible, The Catechism of the Council of Trent, and Haydock's Catholic Bible Commentary.

  • 🧠 The Audio Engine: Google NotebookLM.

  • 🤖 The AI Virtual Team: Custom Gemini Gems.

  • 🎛️ The Post-Processing: Python (pydub & FFmpeg) for programmatic audio mastering.

Part 1: Building the Foundation (The Prep Work)

You can't just dump raw data into an AI and expect magic. The system requires absolute boundaries.

1. Data Control & Precomputation

I chose older, public-domain texts to avoid modern copyright risks. Instead of relying on AI to parse raw PDFs on the fly, I treated this as a data engineering problem. I wrote a Python script to clean and format these massive texts into structured Markdown (.md) files. Markdown preserves crucial metadata and headers, optimizing the data for the AI's semantic retrieval.
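The author's actual preprocessing script isn't shown, so here is a minimal sketch of the kind of cleanup pass described, using only the standard library. The chapter-marker regex and the `book_title` parameter are illustrative assumptions; real public-domain dumps need per-source rules.

```python
import re

def text_to_markdown(raw: str, book_title: str) -> str:
    """Convert a raw public-domain text dump into structured Markdown.

    Assumes chapter headings appear on their own line as e.g. 'CHAPTER 3'
    (an illustrative convention, not a universal one).
    """
    lines = [f"# {book_title}", ""]
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        # Promote chapter markers to H2 headers so the AI can retrieve
        # by section instead of scanning the whole file.
        m = re.match(r"^CHAPTER\s+(\d+)\b", line, re.IGNORECASE)
        if m:
            lines.append(f"\n## Chapter {m.group(1)}\n")
        else:
            lines.append(line)
    return "\n".join(lines) + "\n"
```

Running one function like this per book naturally produces the modular, one-file-per-book layout that keeps each NotebookLM query small and targeted.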

2. The "Brain" of the Operation

I upgraded to Google Pro for higher compute limits, but the core engine is Google NotebookLM. I chose NotebookLM because it restricts its answers only to the source documents provided, walling it off from the open internet. To stay within limits and improve accuracy, I modularized the data (e.g., each book of the Bible is its own file). Feeding an AI smaller, highly targeted data yields far better results. I also chose NotebookLM for its AI-generated voices, which are, all things considered, incredible. There are some glitches, but the inflection, human likeness, and drama in the voices are very impressive.

3. Establishing Hard-Coded Guardrails

I spent weeks iteratively testing system instructions to force the AI hosts to read Scripture verbatim and maintain a reverent tone. To enforce this, I engineered a master 00_00_READ_ME "constitution" file. This document acts as my hard-coded theological guardrails, included in every single podcast generation.
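The actual constitution file isn't published in this post. A hypothetical excerpt, purely to illustrate the kind of hard rules such a file encodes, might look like:

```markdown
# 00_00_READ_ME — Podcast Constitution (illustrative excerpt)

## Non-negotiable rules
1. Quote Scripture VERBATIM from the Douay-Rheims source file; never paraphrase a verse.
2. Maintain a reverent tone throughout; no flippancy toward doctrine.
3. If the provided sources do not address a question, say so plainly; never speculate.
4. Attribute every doctrinal claim to a specific source document.
```

Because the file is attached to every generation, these rules don't depend on the day's prompt being written carefully.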

4. Humanizing the Audio

To add a human element, I recorded my own intros and hired an audio engineer on Upwork to mix them into broadcast-quality assets. I then wrote a custom Python script that automatically adjusts the AI audio's loudness to podcast industry standards and stitches my intros/outros onto the main track.
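The author's script is built on pydub and FFmpeg; as a dependency-free illustration of the two operations it performs (flat-gain loudness adjustment and stitching), here is a sketch over 16-bit mono WAV data using only the standard library. The -16 dBFS RMS target is a common podcast ballpark, not necessarily the author's exact setting, and real pipelines measure LUFS rather than RMS dBFS.

```python
import math
import wave
from array import array

TARGET_DBFS = -16.0  # common podcast loudness ballpark (assumption)

def read_wav(path: str):
    """Load a 16-bit mono WAV file into a sample array."""
    with wave.open(path, "rb") as w:
        assert w.getsampwidth() == 2 and w.getnchannels() == 1
        samples = array("h")
        samples.frombytes(w.readframes(w.getnframes()))
    return w.getparams(), samples

def rms_dbfs(samples) -> float:
    """RMS level relative to full scale for 16-bit audio."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms / 32768) if rms else float("-inf")

def normalize(samples, target_dbfs: float = TARGET_DBFS):
    """Apply a flat gain so the RMS level hits the target."""
    level = rms_dbfs(samples)
    if level == float("-inf"):
        return samples  # silence: nothing to normalize
    factor = 10 ** ((target_dbfs - level) / 20)
    # Clamp to the 16-bit range to avoid wraparound distortion.
    return array("h", (max(-32768, min(32767, round(s * factor))) for s in samples))

def stitch(intro, main, outro):
    """Concatenate intro + episode + outro sample streams."""
    return intro + main + outro
```

The pydub version of the same idea is shorter (segments support `apply_gain` and `+` concatenation), but the math above is what any flat normalization step is doing under the hood.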

Part 2: The Daily Production Workflow

As a solo creator, I couldn't do all the prompting, fact-checking, and marketing manually. I built a "virtual production team" using Gemini Gems (custom AI agents with specific roles and knowledge files).

Here is my true daily workflow:

Step 1: Ideation & Research

I select a topic from my content calendar, open NotebookLM, select the relevant source files, and ask for the narrative, Biblical insights, and dogmatic context related to the topic for that day. This closed-loop research helps me develop a specific angle for the episode and ensures that the angle is supported by the sources.

Step 2: The Producer Collab

I feed the research, along with notes on what the sources support, to my Producer Gem. We collaborate to assemble a master podcast prompt guided by my chosen angle.

Step 3: Audio Generation

Back in NotebookLM, I ensure only the relevant sources are selected. I click "Audio Overview," paste my prompt, and wait about five minutes.

Step 4: The Iteration Loop (The Grind)

I listen to the raw draft. If the pacing or content is off, I download the audio, run it through my Python transcription script, and feed the text back to the Producer Gem with critiques. We tweak the prompt and regenerate. This usually takes 2 to 4 iterations (and up to 17 for highly complex topics!).
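The transcription script itself isn't shown. One plausible shape for this step, assuming the open-source `openai-whisper` package for local speech-to-text (an assumption, not a confirmed detail of the author's setup), plus a small helper that packages the transcript and critiques for the Producer Gem:

```python
def transcribe(audio_path: str) -> str:
    """Transcribe a draft episode locally.

    Assumes the openai-whisper package (`pip install openai-whisper`);
    the model size is illustrative.
    """
    import whisper  # imported lazily: heavyweight, optional dependency
    model = whisper.load_model("base")
    return model.transcribe(audio_path)["text"].strip()

def build_feedback(transcript: str, critiques: list[str]) -> str:
    """Package the transcript plus critiques into the next prompt for
    the Producer Gem (the wording here is illustrative)."""
    notes = "\n".join(f"- {c}" for c in critiques)
    return (
        "Here is the transcript of the latest draft:\n\n"
        f"{transcript}\n\n"
        "Issues to fix in the next prompt revision:\n"
        f"{notes}"
    )
```

Keeping the critique format consistent from iteration to iteration is what makes a 2-to-4-round (or 17-round) loop tractable rather than chaotic.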

Step 5: The Audit & Transparency

Once I love a draft, it goes to my Auditor Gem—a strict QA tester. Did it read Scripture verbatim? Are Catechism claims perfectly accurate? Once it scores 100%, I link the generated QA report in a public Google Doc for full listener transparency.

Step 6: Audio Post-Processing

I run the final audio through my Python script to level the volume and stitch on my intros/outros.

Step 7: SEO & Archiving

My SEO Gem generates psycholinguistically optimized titles and descriptions. I log everything (prompts, sources, metadata) in a master spreadsheet so the episode is perfectly reproducible.
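The master spreadsheet's exact schema isn't specified; as a minimal sketch of this kind of reproducibility log, here is an append-only CSV writer with illustrative column names.

```python
import csv
from pathlib import Path

# Column names are illustrative, not the author's actual schema.
FIELDS = ["date", "episode_title", "sources", "master_prompt",
          "iterations", "qa_score"]

def log_episode(log_path: str, **meta) -> None:
    """Append one episode's metadata so it can be reproduced later."""
    path = Path(log_path)
    new_file = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if new_file:
            writer.writeheader()
        writer.writerow({k: meta.get(k, "") for k in FIELDS})
```

Logging the exact prompt and source list per episode is what makes "perfectly reproducible" a literal claim: rerunning the same prompt against the same files regenerates a comparable episode.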

Step 8: Publishing

I upload the MP3 to my host (Acast), paste the transcript onto my website (Podpage) for SEO, and within 30 minutes, it’s live everywhere.

The Future: The Era of "Sanctioned AI"

The payoff for this rigorous process is undeniable. AI gives a solo creator the immense leverage, output, and polish of a large production team. And this architecture isn't just for theology—it applies to newsletters, documentaries, and corporate training.

Looking ahead, I predict a massive paradigm shift. The winners of the next decade won't be the individuals who work harder and manually grind out content; they will be those who architect the best data pipelines and AI workflows.

I also believe it won't stop at pre-recorded podcasts. In the near future, every major organization will want to develop its own "sanctioned AI voice." Consumers won't just listen; they will interact. Imagine having a real-time, personalized conversation with the Catholic Church about a theological nuance, chatting with your favorite sports franchise about its history while buying tickets, or hearing an art museum's AI tour guide in your ear. A closed-loop, officially sanctioned AI voice, consistent with the organization behind it, will drive unprecedented engagement tailored to the interests of each person interacting with it.

We are just scratching the surface. The tech is here. The question is, who is going to build it?