
Ingest and search Cloudflare R2 media with Gemini, Groq Whisper, and Supabase


1. Workflow Overview

Quick overview: This workflow ingests images, PDFs, and videos from a Cloudflare R2 folder, uses Google Gemini to analyze the PDFs, images, and videos, and uses Groq STT (Whisper) for video transcripts to generate ...

Best for

  • Document Extraction automation workflows
  • AI RAG automation workflows
  • advanced n8n builders looking for reusable templates

Tools used

n8n-nodes-base.stickyNote, n8n-nodes-base.set, @n8n/n8n-nodes-langchain.embeddingsGoogleGemini, @n8n/n8n-nodes-langchain.documentDefaultDataLoader, @n8n/n8n-nodes-langchain.textSplitterCharacterTextSplitter, @n8n/n8n-nodes-langchain.vectorStoreSupabase, n8n-nodes-base.httpRequest, n8n-nodes-base.webhook

Source and attribution

This workflow is cataloged by N8N Workflows and links back to its original n8n.io source page by Dave Sartori.

Original n8n.io source

1.1 Workflow description

Title
Ingest and search Cloudflare R2 media with Gemini, Groq Whisper, and Supabase
Workflow name
Ingest and search Cloudflare R2 media with Gemini, Groq Whisper, and Supabase

Quick overview

This workflow ingests images, PDFs, and videos from a Cloudflare R2 folder. It uses Google Gemini to analyze the PDFs, images, and videos and Groq STT (Whisper) to transcribe video audio, generates searchable descriptions and tags from the results, and stores the embeddings in a Supabase pgvector table.

How it works

  1. Receives a webhook request containing a Cloudflare R2 bucket and folder URL, then lists the objects in that folder.
  2. Filters to supported file types, builds public CDN URLs and timestamps, and routes each item as an image, PDF, or video.
  3. For images, calls Google Gemini with the image URL to generate structured metadata (summary, detailed description, tags, and scores).
  4. For PDFs, calls Google Gemini to analyze the document URL and return the same structured metadata.
  5. For videos, downloads each file locally, extracts representative frames with FFmpeg for Google Gemini visual analysis, extracts audio, transcribes it with Groq Whisper, and tags transcript chunks with Groq Llama.
  6. Normalizes results into a single text “content” field plus JSON metadata, generates Google Gemini embeddings, and inserts the vectors into Supabase (pgvector).
  7. Receives a separate webhook query, retrieves the most similar items from Supabase using embeddings, and returns ranked matches in the webhook response.
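Step 2 above (filter, build URLs, route by type) can be sketched outside n8n as follows. This is an illustration only, not the template's actual node code: the extension map, the `MediaItem` fields, and the `cdn_base` parameter are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import PurePosixPath
from typing import List, Optional
from urllib.parse import quote

# Hypothetical extension map; the template's real list of supported types may differ.
MEDIA_TYPES = {
    ".jpg": "image", ".jpeg": "image", ".png": "image", ".webp": "image",
    ".pdf": "pdf",
    ".mp4": "video", ".mov": "video", ".webm": "video",
}


@dataclass
class MediaItem:
    key: str          # object key inside the bucket
    kind: str         # "image", "pdf", or "video"
    url: str          # public CDN URL for the object
    ingested_at: str  # ISO 8601 timestamp


def classify(key: str) -> Optional[str]:
    """Return the route for a supported object key, or None to skip it."""
    return MEDIA_TYPES.get(PurePosixPath(key).suffix.lower())


def build_items(keys: List[str], cdn_base: str,
                now: Optional[datetime] = None) -> List[MediaItem]:
    """Drop unsupported keys, then attach a public URL and a timestamp."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    items = []
    for key in keys:
        kind = classify(key)
        if kind is None:
            continue  # unsupported file type
        # quote() keeps "/" but escapes spaces and other unsafe characters.
        url = f"{cdn_base.rstrip('/')}/{quote(key)}"
        items.append(MediaItem(key, kind, url, stamp))
    return items
```

Each returned item carries a `kind` value, which plays the role of the image, PDF, or video branch in the workflow.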

Setup

  1. Create a Cloudflare R2 bucket with publicly accessible object URLs, and add Cloudflare R2 credentials in n8n.
  2. Set up a Supabase project with pgvector enabled and a table named vec10, then add Supabase credentials in n8n.
  3. Add Google Gemini credentials (Google PaLM/Gemini API) for embeddings and provide an HTTP Header Auth credential for the Gemini HTTP requests.
  4. Set the GROQ_API_KEY environment variable for the Groq Whisper transcription and Llama tag extraction calls.
  5. If you enable video processing, install curl, ffmpeg, and ffprobe on the n8n host and update the local directory paths (temp root, frames directory, and video directory) in the workflow inputs.
  6. Copy the ingest webhook (/vector-ingest) and query webhook (/vector-query) URLs and configure your upstream app to send the expected JSON payloads.
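Before enabling video processing, it may help to confirm the host prerequisites from steps 4 and 5. The following is a minimal sketch of such a check, assuming it runs on the same machine and under the same user as n8n; the `preflight` function name and its parameters are made up for this example.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional

REQUIRED_ENV = ("GROQ_API_KEY",)               # from setup step 4
REQUIRED_TOOLS = ("curl", "ffmpeg", "ffprobe")  # from setup step 5


def preflight(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    for name in REQUIRED_ENV:
        if not env.get(name):
            problems.append(f"environment variable {name} is not set")
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems
```

Calling `preflight()` with no arguments checks the current process environment and PATH. The `env` and `which` parameters exist only so the check can be exercised without touching the real host.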

Additional info

Video: the FFmpeg Code nodes split each video into "video_frames" items and "video_transcripts" items, which keeps handling and pgvector storage simple. The exposed vector-query webhook lets a voice agent find and display the full video, pulled from the Cloudflare bucket, based on the matching video_frames or video_transcripts items returned by the vector query.
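One way a client could map frame and transcript hits back to their parent videos is sketched below. The match shape is an assumption: the `score`, `metadata`, `type`, and `video_url` field names are hypothetical, and the sketch assumes a higher score means a closer match. Check the actual query webhook response before relying on any of them.

```python
from typing import Dict, List

VIDEO_ITEM_TYPES = ("video_frames", "video_transcripts")


def videos_from_matches(matches: List[dict]) -> List[dict]:
    """Collapse frame/transcript hits into one entry per video, best score first."""
    best: Dict[str, dict] = {}
    for match in matches:
        meta = match.get("metadata", {})
        if meta.get("type") not in VIDEO_ITEM_TYPES:
            continue  # images and PDFs are not part of a video
        url = meta.get("video_url")  # hypothetical field name
        if not url:
            continue
        score = float(match.get("score", 0.0))
        entry = best.setdefault(url, {"video_url": url, "score": score, "hits": 0})
        entry["score"] = max(entry["score"], score)
        entry["hits"] += 1
    return sorted(best.values(), key=lambda e: e["score"], reverse=True)
```

Grouping by video URL means several matching frames from the same clip yield a single result, with `hits` indicating how many items pointed at it.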

1.2 Logical Blocks

This catalog entry is organized from the workflow JSON. The node-level section below shows the executable blocks available for review before importing the template.

2. Block-by-Block Analysis

Block 1 - Sticky Note3

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 2 - Sticky Note5

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 3 - Sticky Note6

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 4 - Sticky Note8

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 5 - Sticky Note9

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 6 - Sticky Note10

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 7 - Sticky Note11

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 8 - Sticky Note12

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 9 - Sticky Note13

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 10 - Sticky Note14

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 11 - Sticky Note15

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 12 - Sticky Note16

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 13 - Sticky Note17

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 14 - Sticky Note18

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 15 - Set Content and Metadata

Type / Role
n8n-nodes-base.set - set
Config choices
Version 3.4

Block 16 - Embedding with Gemini Model

Type / Role
@n8n/n8n-nodes-langchain.embeddingsGoogleGemini - embeddingsGoogleGemini
Config choices
Version 1

Block 17 - Load Default Data

Type / Role
@n8n/n8n-nodes-langchain.documentDefaultDataLoader - documentDefaultDataLoader
Config choices
Version 1.1

Block 18 - Split Text by Character

Type / Role
@n8n/n8n-nodes-langchain.textSplitterCharacterTextSplitter - textSplitterCharacterTextSplitter
Config choices
Version 1

Block 19 - Ingest to Vector Store

Type / Role
@n8n/n8n-nodes-langchain.vectorStoreSupabase - vectorStoreSupabase
Config choices
Version 1

Block 20 - Post PDF to API

Type / Role
n8n-nodes-base.httpRequest - httpRequest
Config choices
Version 4.4

Block 21 - Post Image to API

Type / Role
n8n-nodes-base.httpRequest - httpRequest
Config choices
Version 4.4

Block 22 - Image Webhook Trigger

Type / Role
n8n-nodes-base.webhook - webhook
Config choices
Version 2.1

Block 23 - PDF Webhook Trigger

Type / Role
n8n-nodes-base.webhook - webhook
Config choices
Version 2.1

Block 24 - Post New Image to API

Type / Role
n8n-nodes-base.httpRequest - httpRequest
Config choices
Version 4.4

Showing the first 24 of 68 workflow blocks. Download the JSON for the full node graph.
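To review the blocks not listed here, the downloaded JSON can be summarized with a short script. This sketch assumes the export follows the usual n8n layout, with a top-level `nodes` array whose entries carry `name` and `type` fields; the function names are illustrative.

```python
import json
from collections import Counter
from typing import List, Tuple

STICKY_NOTE_TYPE = "n8n-nodes-base.stickyNote"


def load_workflow(path: str) -> dict:
    """Read an exported workflow JSON file from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def summarize_nodes(workflow: dict) -> Tuple[int, Counter, List[str]]:
    """Return the node count, a count per node type, and non-sticky node names."""
    nodes = workflow.get("nodes", [])
    by_type = Counter(node.get("type", "unknown") for node in nodes)
    executable = [
        node.get("name", "")
        for node in nodes
        if node.get("type") != STICKY_NOTE_TYPE
    ]
    return len(nodes), by_type, executable
```

For example, `summarize_nodes(load_workflow("16528.json"))` would report how many of the nodes are sticky notes and which nodes actually execute, assuming the file was saved under that name.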

3. Summary Table

Workflow Ingest and search Cloudflare R2 media with Gemini, Groq Whisper, and Supabase
Complexity advanced
Nodes 68
Categories Document Extraction, AI RAG
Author Dave Sartori
Published 20 Jun 2026

4. Reproducing the Workflow from Scratch

  1. Download the workflow JSON

    Use the JSON export at /data/workflows/16528/16528.json as the source template for this automation.

  2. Import the template into n8n

    Open n8n, import the downloaded JSON, and review each node before activating the workflow.

  3. Configure credentials and variables

    Replace placeholder credentials, API keys, webhook URLs, account IDs, and environment-specific values with your own settings.

  4. Test with sample data

    Run the workflow manually or in a staging workspace, inspect node output, and confirm downstream systems receive the expected data.

  5. Activate and monitor

    Enable the workflow only after testing, then monitor executions, errors, and rate limits during the first production runs.
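For the testing step, a sample request to the webhooks can be prepared as sketched below. It assumes n8n's default URL prefixes (`/webhook-test/` while listening in the editor, `/webhook/` once the workflow is active), which an instance can override. The payload field names used in the usage note are hypothetical, since this page does not document the expected JSON.

```python
import json
from urllib.request import Request


def webhook_url(base_url: str, path: str, test: bool = True) -> str:
    """Build a webhook URL under n8n's default test or production prefix."""
    prefix = "webhook-test" if test else "webhook"
    return f"{base_url.rstrip('/')}/{prefix}/{path.strip('/')}"


def build_request(url: str, payload: dict) -> Request:
    """Prepare (but do not send) a JSON POST request."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

A request built with `build_request(webhook_url(base, "vector-query"), {"query": "red bicycle"})` can then be sent with `urllib.request.urlopen`. Replace `query` with whatever field the Webhook node in the imported workflow actually reads.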

5. General Notes & Resources

Review imported nodes carefully before activation. This catalog entry is intended to help you inspect the workflow structure, understand required services, and find related templates faster.

Node names, credentials, schedules, webhook paths, and external service limits may need adjustment for your workspace.

Frequently asked questions

What does Ingest and search Cloudflare R2 media with Gemini, Groq Whisper, and Supabase do?

This workflow ingests images, PDFs, and videos from a Cloudflare R2 folder, uses Google Gemini to analyze the PDFs, images, and videos, and uses Groq STT (Whisper) for video transcripts to generate ...

What do I need before importing this workflow?

Review the workflow JSON, configure any required credentials in n8n, and test the automation in a safe workspace before using it in production.

Can I customize this workflow?

Yes. Use the block-by-block analysis and the downloadable JSON to inspect each node, then adjust credentials, prompts, schedules, filters, or destinations for your Document Extraction or AI RAG use case.