Build a RAG system for PDF documents with Google Drive, Unstructured, and OpenAI
This template monitors a Google Drive folder, converts PDF documents into clean text chunks with Unstructured, generates OpenAI embeddings, and upserts vectors into Pinecone. It’s a practical, production-ready starting point for Retrieval-Augmented Generation (RAG) that you can plug into a chatbot, semantic search, or internal knowledge tools.
How it works
- Google Drive Trigger detects new files in a selected folder and downloads them.
- Each file is sent to Unstructured, which splits it into smaller pieces (chunks).
- The chunks are sent to OpenAI, which converts each one into a vector (embedding).
- Each embedding is recombined with its original chunk data, and the payload is prepared for upsert into the Pinecone index.
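The last two steps can be sketched outside n8n as plain data transformations. The following is a minimal illustration, not the template's actual node code. It assumes each Unstructured chunk is a dict with `element_id`, `text`, and an optional `metadata.filename`, and that the OpenAI embeddings response carries a `data` list whose items have `index` and `embedding` fields. The function names and the choice of metadata fields are hypothetical.

```python
from typing import Any

EMBEDDING_MODEL = "text-embedding-3-small"  # model named in the setup steps


def build_embedding_request(chunks: list[dict[str, Any]]) -> dict[str, Any]:
    """Build the JSON body for an OpenAI embeddings call, one input per chunk."""
    return {"model": EMBEDDING_MODEL, "input": [c["text"] for c in chunks]}


def build_upsert_payload(
    chunks: list[dict[str, Any]],
    embedding_response: dict[str, Any],
    namespace: str = "",
) -> dict[str, Any]:
    """Recombine embeddings with their chunks into a Pinecone upsert body."""
    # Order the returned embeddings by their "index" so they line up with the inputs.
    items = sorted(embedding_response["data"], key=lambda d: d["index"])
    if len(items) != len(chunks):
        raise ValueError("number of embeddings does not match number of chunks")

    vectors = []
    for chunk, item in zip(chunks, items):
        vectors.append(
            {
                "id": chunk["element_id"],  # assumed unique per chunk
                "values": item["embedding"],
                # Keeping the text in metadata lets a chatbot quote the chunk later.
                "metadata": {
                    "text": chunk["text"],
                    "filename": chunk.get("metadata", {}).get("filename", ""),
                },
            }
        )

    payload: dict[str, Any] = {"vectors": vectors}
    if namespace:
        payload["namespace"] = namespace
    return payload
```

Storing the chunk text alongside the vector is a design choice made here for illustration: it means a retrieval step can return readable passages without a second lookup.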
Set up steps
- In Pinecone, create an index with 1536 dimensions and configure it for text-embedding-3-small.
- Copy the host URL and paste it into the 'Pinecone Upsert' node. It should look something like this: https://{your-index-name}.pinecone.io/vectors/upsert.
- Add Google Drive, OpenAI and Pinecone credentials in n8n.
- Point the trigger to your ingest folder (you can use this article for a demo).
- Click the 'Open chat' button and enter the following: Which Git provider do the authors use?
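Two of the setup steps are easy to get wrong: the upsert URL and the index dimension. The sketch below shows one way to sanity-check both before sending data. It is an illustration under stated assumptions rather than part of the template: the helper names `upsert_url` and `check_dimensions` are hypothetical, and the vectors are assumed to be dicts with `id` and `values` keys, as in a Pinecone upsert body.

```python
from urllib.parse import urlparse

EXPECTED_DIMENSION = 1536  # index dimension from the setup steps


def upsert_url(index_host: str) -> str:
    """Normalize a copied index host into the full upsert endpoint URL."""
    host = index_host.strip()
    if "://" not in host:
        host = "https://" + host
    parsed = urlparse(host)
    if not parsed.netloc:
        raise ValueError(f"not a valid host: {index_host!r}")
    # Drop any path that was copied along and append the upsert route.
    return f"https://{parsed.netloc}/vectors/upsert"


def check_dimensions(vectors: list[dict], expected: int = EXPECTED_DIMENSION) -> None:
    """Raise if any vector's length differs from the index dimension."""
    bad = [v["id"] for v in vectors if len(v["values"]) != expected]
    if bad:
        raise ValueError(f"vectors with wrong dimension (expected {expected}): {bad}")
```

For example, `upsert_url("my-index.pinecone.io")` yields `https://my-index.pinecone.io/vectors/upsert`, where `my-index` is a placeholder for your own index host.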