Build a document QA system with Google Drive, Pinecone, and OpenAI RAG
Title
RAG AI Agent for Documents in Google Drive → Pinecone → OpenAI Chat (n8n workflow)
Short Description
This n8n workflow implements a Retrieval-Augmented Generation (RAG) pipeline + AI agent, allowing users to drop documents into a Google Drive folder and then ask questions about them via a chatbot. New files are indexed automatically to a Pinecone vector store using OpenAI embeddings; the AI agent loads relevant chunks at query time and answers using context plus memory.
Why this workflow matters / what problem it solves
- Large language models (LLMs) are powerful, but they lack up-to-date, domain-specific knowledge.
- RAG augments the LLM with relevant external documents, reducing hallucination and enabling precise answers. (Pinecone)
- This workflow automates the ingestion, embedding, storage, retrieval, and chat logic — with minimal manual work.
- It’s modular: you can swap data sources, vector DBs, or LLMs (with some adjustments).
- It leverages the built-in AI Agent node in n8n to tie all the parts together. (n8n)
How to get the required credentials
| Service | Purpose in Workflow | Setup Link | What you need / steps |
|---|---|---|---|
| Google Drive (OAuth2) | Trigger on new file events & download the file | https://docs.n8n.io/integrations/builtin/credentials/google/oauth-generic/ | Create a Google Cloud OAuth app, grant it Drive scopes, get the client ID & secret, configure the redirect URI, and paste them into n8n credentials. |
| Pinecone | Vector database for embeddings | https://docs.n8n.io/integrations/builtin/credentials/pinecone/ | Sign up for Pinecone, create an index in the dashboard, copy your API key and environment, and paste them into an n8n credential. |
| OpenAI | Embeddings + chat model | https://docs.n8n.io/integrations/builtin/credentials/openai/ | Log in to OpenAI, generate a secret API key, and paste it into n8n credentials. |
You’ll configure these under n8n → Credentials → New Credential, matching credential names referenced in your workflow nodes.
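For reference, an exported n8n node stores only a pointer to the credential, never the secret itself. Below is a minimal sketch of how a node references a credential by name; the ID is a placeholder, and the exact credential type key (assumed here to be `pineconeApi`) can vary between n8n versions.

```json
{
  "name": "Pinecone Vector Store",
  "type": "@n8n/n8n-nodes-langchain.vectorStorePinecone",
  "credentials": {
    "pineconeApi": {
      "id": "<credential-id>",
      "name": "Pinecone account"
    }
  }
}
```

The `name` here must match the credential you created under n8n → Credentials, which is why consistent credential naming matters when importing or sharing the workflow.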
Detailed Walkthrough: How the Workflow Works
Here’s a step-by-step walkthrough of what happens inside the workflow (matching the exported JSON):
1. Google Drive Trigger
- Watches a specified folder in Google Drive. Whenever a new file appears (fileCreated event), the workflow is triggered (polling every minute).
- You must set the folder ID (in “folderToWatch”) to the Drive folder you want to monitor.
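As a rough illustration, the trigger's configuration in the exported workflow JSON looks something like the sketch below. The folder ID is a placeholder, and parameter names such as `triggerOn` are assumptions that may differ slightly between n8n versions.

```json
{
  "name": "Google Drive Trigger",
  "type": "n8n-nodes-base.googleDriveTrigger",
  "parameters": {
    "event": "fileCreated",
    "triggerOn": "specificFolder",
    "folderToWatch": {
      "__rl": true,
      "mode": "id",
      "value": "<your-drive-folder-id>"
    },
    "pollTimes": {
      "item": [{ "mode": "everyMinute" }]
    }
  }
}
```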
2. Download File
- Takes the file ID from the trigger and downloads the file content (binary).
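A minimal sketch of the download step, assuming the trigger exposes the new file's ID as `$json.id` (the exact output field depends on the trigger version you use):

```json
{
  "name": "Download File",
  "type": "n8n-nodes-base.googleDrive",
  "parameters": {
    "operation": "download",
    "fileId": {
      "__rl": true,
      "mode": "id",
      "value": "={{ $json.id }}"
    }
  }
}
```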
3. Indexing Path: Embeddings + Storage
(This path only runs when new files arrive)
- The file is sent to the Default Data Loader node, which uses the Recursive Character Text Splitter to break it into overlapping chunks (so context is preserved across chunk boundaries).
- Each chunk is fed into Embeddings OpenAI to convert the text into embedding vectors.
- Pinecone Vector Store (insert mode) then ingests each vector, along with its text and metadata, into your Pinecone index.
- This ensures your vector store stays up-to-date with files you drop into Drive.
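The chunking knobs live on the splitter node. A sketch with illustrative values; 1,000 characters per chunk with 200 characters of overlap is a common starting point, not a setting taken from this workflow:

```json
{
  "name": "Recursive Character Text Splitter",
  "type": "@n8n/n8n-nodes-langchain.textSplitterRecursiveCharacterTextSplitter",
  "parameters": {
    "chunkSize": 1000,
    "chunkOverlap": 200
  }
}
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which keeps retrieval from returning amputated context.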
4. Chat / Query Path
(Triggered by user chat via webhook)
- When a chat message arrives via When Chat Message Received, it gets passed into the AI Agent node.
- Before generation, the AI Agent calls Pinecone Vector Store1, which is set to “retrieve-as-tool” mode; this runs a vector search using an embedding of the user query and pulls the most relevant text chunks into the agent’s context as tool output.
- The OpenAI Chat Model node is linked as the language model for the agent.
- Simple Memory node provides conversational memory (keeping history across messages).
- The agent combines retrieved context + memory + user input and instructs the model to produce a response.
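On the retrieval side, the second Pinecone node runs in tool mode so the agent can invoke it on demand. A hedged sketch of its parameters; the tool name, description, and `topK` value are illustrative, and parameter names may vary by node version:

```json
{
  "name": "Pinecone Vector Store1",
  "type": "@n8n/n8n-nodes-langchain.vectorStorePinecone",
  "parameters": {
    "mode": "retrieve-as-tool",
    "toolName": "drive_documents",
    "toolDescription": "Search the documents indexed from Google Drive",
    "topK": 4,
    "pineconeIndex": {
      "__rl": true,
      "mode": "list",
      "value": "<your-index>"
    }
  }
}
```

A clear `toolDescription` matters here: it is what the agent reads when deciding whether to call the retriever for a given question.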
5. Connections / Flow Logic
- The Embeddings OpenAI node’s output is wired into both Pinecone Vector Store (insert) and Pinecone Vector Store1 (retrieval), so the same embedding model is used for indexing and querying, which is required for the similarity search to be meaningful.
- The AI Agent has tool access to Pinecone retrieval and memory.
- The Download File node triggers the insert path.
- The When Chat Message Received trigger starts the agent path.
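In the exported JSON, these AI wirings live in the `connections` object under typed ports (`ai_embedding`, `ai_tool`, `ai_languageModel`, `ai_memory`) rather than the usual `main` connection. A simplified excerpt consistent with the flow described above (node names match this walkthrough; treat the exact shape as indicative):

```json
{
  "connections": {
    "Embeddings OpenAI": {
      "ai_embedding": [[
        { "node": "Pinecone Vector Store", "type": "ai_embedding", "index": 0 },
        { "node": "Pinecone Vector Store1", "type": "ai_embedding", "index": 0 }
      ]]
    },
    "Pinecone Vector Store1": {
      "ai_tool": [[
        { "node": "AI Agent", "type": "ai_tool", "index": 0 }
      ]]
    },
    "Simple Memory": {
      "ai_memory": [[
        { "node": "AI Agent", "type": "ai_memory", "index": 0 }
      ]]
    }
  }
}
```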
Similar Workflows / Inspirations & Comparisons
To show how this workflow fits into what’s already out there, here are a few analogues:
- n8n Blog: “Build a custom knowledge RAG chatbot” — they show a workflow that ingests documents from external sources, indexes them in Pinecone, and responds to queries via n8n + LLM. (n8n Blog)
- Index Documents from Google Drive to Pinecone — this is nearly identical for the ingestion part: trigger on Drive, split, embed, upload. (n8n)
- Build & Query RAG System with Google Drive, OpenAI, Pinecone — shows the full RAG + chat logic, same pattern. (n8n)
- Chat with GitHub API Documentation (RAG) — demonstrates converting API spec into chunks, embedding, retrieving, and chatting. (n8n)
- Community tutorials & forums talk about using the AI Agent node with tools like Pinecone, and how the RAG part is often built as a sub-workflow feeding an agent. (n8n Community)
What sets this workflow apart is the explicit combination: Google Drive → automatic ingestion → chat agent with tool integration + memory. Many templates show either ingestion or chat, but fewer combine both cleanly with n8n’s AI Agent.
Suggested Published Description (you can paste/adjust)
> RAG AI Agent for Google Drive Documents (n8n workflow)
>
> This workflow turns a Google Drive folder into a live, queryable knowledge base. Drop PDF, DOCX, or text files into the folder → new documents are automatically indexed into a Pinecone vector store using OpenAI embeddings → you can ask questions via a webhook chat interface and the AI agent will retrieve relevant text, combine it with memory, and answer in context.
>
> Credentials needed
>
> * Google Drive OAuth2 (see: https://docs.n8n.io/integrations/builtin/credentials/google/oauth-generic/)
> * Pinecone (see: https://docs.n8n.io/integrations/builtin/credentials/pinecone/)
> * OpenAI (see: https://docs.n8n.io/integrations/builtin/credentials/openai/)
>
> How it works
>
> 1. Drive trigger picks up new files
> 2. Download, split, embed, insert into Pinecone
> 3. Chat webhook triggers AI Agent
> 4. Agent retrieves relevant chunks + memory
> 5. Agent uses OpenAI model to craft answer
>
> This is built on the core RAG pattern (ingest → retrieve → generate) and enhanced by n8n’s AI Agent node for clean tool integration.
>
> Inspiration & context
> This approach follows best practices from existing n8n RAG tutorials and templates, such as the “Index Documents from Google Drive to Pinecone” ingestion workflow and “Build & Query RAG System” templates. (n8n)
>
> You’re free to swap out the data source (e.g. Dropbox, S3) or vector DB (e.g. Qdrant) as long as you adjust the relevant nodes.