Auto-update knowledge base with Drive, LlamaIndex & Azure OpenAI embeddings

Rating: 4.5 (26 reviews)
Author: Khairul Muhtadin

This workflow auto-ingests Google Drive documents, parses them with LlamaIndex, and stores Azure OpenAI embeddings in an in-memory vector store, cutting manual update time from ~30 minutes to under 2 minutes per doc.

Why Use This Workflow?

Cost Reduction: Avoids paying a monthly cloud fee just to store knowledge, because embeddings are held in an in-memory vector store.

Ideal For

Knowledge Managers / Documentation Teams: Automatically keep product docs and SOPs in sync when source files change on Google Drive.
Support Teams: Ensure the searchable KB is always up-to-date after doc edits, speeding agent onboarding and resolution time.
Developer / AI Teams: Populate an in-memory vector store for experiments, rapid prototyping, or local RAG demos.

How It Works

Trigger: Google Drive Trigger watches a specific document or folder for updates.
Data Collection: The updated file is downloaded from Google Drive.
Processing: The file is uploaded to LlamaIndex cloud via an HTTP Request to create a parsing job.
Intelligence Layer: Workflow polls LlamaIndex job status (Wait + Monitor loop). If parsing status equals SUCCESS, the result is retrieved as markdown.
Output & Delivery: Parsed markdown is loaded into LangChain's Default Data Loader, passed to Azure OpenAI embeddings (deployment "3small"), then inserted into an in-memory vector store.
Storage & Logging: Vector store holds embeddings in memory (good for prototyping). Optionally persist to an external vector DB for production.
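
To make the storage step concrete, here is a minimal sketch of what "holding embeddings in memory" means. It is not the n8n vector store node. The class name, method names, and toy two-dimensional vectors are assumptions made for illustration, and real embeddings would come from the Azure OpenAI deployment.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class InMemoryVectorStore:
    """Toy stand-in for an in-memory vector store (hypothetical, for illustration only)."""
    rows: list = field(default_factory=list)  # list of (text, vector) pairs

    def add(self, text, vector):
        # Everything lives in a Python list, so it is lost when the process exits.
        self.rows.append((text, list(vector)))

    def search(self, query, k=1):
        # Rank stored texts by cosine similarity to the query vector.
        scored = [(cosine(query, vec), text) for text, vec in self.rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]
```

Because the data lives only in process memory, a restart empties the store, which is why the listing suggests an external vector DB for production.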

Setup Guide

Prerequisites

Requirement	Type	Purpose
n8n instance	Essential	Import and execute the workflow
Google Drive OAuth2	Essential	Watch and download documents from Google Drive
LlamaIndex Cloud API	Essential	Parse and convert documents to structured markdown
Azure OpenAI Account	Essential	Generate embeddings (deployment configured to model name "3small")
Persistent Vector DB (e.g., Pinecone)	Optional	Persist embeddings for production-scale search

Installation Steps

Import the workflow JSON: open your n8n instance and import the file.
Configure credentials:
- Azure OpenAI: Provide Endpoint, API Key and set deployment name.
- LlamaIndex API: Create an HTTP Header Auth credential in n8n. Header Name: Authorization. Header Value: Bearer YOUR_API_KEY.
- Google Drive OAuth2: Create OAuth 2.0 credentials in Google Cloud Console, enable Drive API, and configure the Google Drive OAuth2 credential in n8n.
Update environment-specific values:
- Replace the workflow's Google Drive fileId with the file ID or folder ID you want to watch (do not publish real IDs in shared or public copies of the workflow).
Customize settings:
- Polling interval (Wait node): adjust for faster or slower job status checks.
- Target file or folder: set on the Google Drive Trigger node.
- Embedding model: change Azure OpenAI deployment if needed.
Test execution:
- Save changes and trigger a sample file update on Drive. Verify each node runs and the vector store receives embeddings.
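
For reference, the LlamaIndex credential above amounts to a single HTTP header. The sketch below shows the same header built outside n8n, for example when testing the API key from a script. The function names and the `LLAMAINDEX_API_KEY` environment variable are assumptions made for this example, not part of the workflow.

```python
import os


def llamaindex_auth_header(api_key):
    """Build the header described above: Authorization: Bearer YOUR_API_KEY."""
    key = (api_key or "").strip()
    if not key:
        raise ValueError("LlamaIndex API key is missing")
    return {"Authorization": f"Bearer {key}"}


def header_from_env(var_name="LLAMAINDEX_API_KEY", environ=os.environ):
    """Read the key from an environment variable (the name is hypothetical)."""
    return llamaindex_auth_header(environ.get(var_name))
```

Keeping the key in a credential or environment variable, rather than pasting it into a node, matches the advice in the Troubleshooting table.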

Technical Details

Core Nodes

Node	Purpose	Key Configuration
Knowledge Base Updated Trigger (Google Drive Trigger)	Triggers on file/folder changes	Set trigger type to specific file or folder; configure OAuth2 credential
Download Knowledge Document (Google Drive)	Downloads file binary	Operation: download; ensure OAuth2 credential is selected
Parse Document via LlamaIndex (HTTP Request)	Uploads file to LlamaIndex parsing endpoint	POST multipart/form-data to /parsing/upload; use HTTP Header Auth credential
Monitor Document Processing (HTTP Request)	Polls parsing job status	GET /parsing/job/{{jobId}}; check status field
Check Parsing Completion (If)	Branches on job status	Condition: {{$json.status}} equals SUCCESS
Retrieve Parsed Content (HTTP Request)	Fetches parsed markdown result	GET /parsing/job/{{jobId}}/result/markdown
Default Data Loader (LangChain)	Loads parsed markdown into document format	Use as document source for embeddings
Embeddings Azure OpenAI	Generates embeddings for documents	Credentials: Azure OpenAI; Model/Deployment: 3small
Insert Data to Store (vectorStoreInMemory)	Stores documents + embeddings	Use memory store for prototyping; switch to DB for persistence
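
The three HTTP Request nodes in the table share one base URL and differ only in path. A small sketch of how those paths fit together follows. The base URL is a placeholder assumed for this example and should be replaced with the one configured in your workflow; the function names are also hypothetical.

```python
from urllib.parse import quote

# Placeholder base URL: an assumption for this sketch, not taken from the workflow.
BASE_URL = "https://llamaindex.example/api"


def upload_url(base=BASE_URL):
    """POST target for the multipart upload that creates a parsing job."""
    return f"{base.rstrip('/')}/parsing/upload"


def status_url(job_id, base=BASE_URL):
    """GET target for polling a job's status field."""
    return f"{base.rstrip('/')}/parsing/job/{quote(job_id, safe='')}"


def result_url(job_id, base=BASE_URL):
    """GET target for fetching the parsed markdown once the job succeeds."""
    return f"{status_url(job_id, base)}/result/markdown"
```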

Workflow Logic

On Drive change, the file binary is downloaded and sent to LlamaIndex.
The workflow then enters a monitor loop: Monitor Document Processing fetches the job status and the If node checks it. If the status is not SUCCESS, the Wait node delays before the next check.
When parsing completes, the workflow retrieves markdown, loads documents, creates embeddings via Azure OpenAI, and inserts data into an in-memory vector store.
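
The monitor loop can be sketched in a few lines. This is an illustration under assumptions, not the workflow's implementation: `fetch_status` is a caller-supplied function (hypothetical) that returns the job's status string, and `max_polls` is an added safeguard that the listing does not describe.

```python
import time


def wait_for_parse(fetch_status, job_id, delay_s=60, max_polls=30, sleep=time.sleep):
    """Poll until the job reports SUCCESS; return the number of polls used.

    fetch_status(job_id) is assumed to return the status string, e.g. by
    calling GET /parsing/job/<jobId> and reading the status field.
    """
    for attempt in range(1, max_polls + 1):
        status = fetch_status(job_id)
        if status == "SUCCESS":      # mirrors the If node's condition
            return attempt
        if attempt < max_polls:
            sleep(delay_s)           # mirrors the Wait node
    raise TimeoutError(f"job {job_id} not finished after {max_polls} polls")
```

Passing `sleep` as a parameter lets the loop be exercised without real delays, for example `wait_for_parse(lambda _id: "SUCCESS", "job-1", sleep=lambda s: None)`.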

Customization Options

Basic Adjustments:

Poll Delay: Set Wait node (default: every minute) to balance speed vs. API quota.
Target Scope: Switch the trigger from a single file to a folder to auto-handle many docs.
Embedding Model: Swap Azure deployment for a different model name as needed.

Advanced Enhancements:

Persistent Vector DB Integration: Replace vectorStoreInMemory with Pinecone or Milvus for production search.
Notification: Add Slack or email nodes to notify when parsing completes or fails.
Summarization: Add an LLM summarization step to generate chunk-level summaries.

Scaling option:

Batch uploads and chunk documents to reduce embedding calls; use a queue (Redis or n8n queue patterns) and horizontal workers for high throughput.
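
As a rough sketch of the chunking and batching idea, the helpers below split parsed markdown into overlapping chunks and group them so that several chunks can go out in one embedding request. The chunk size, overlap, and batch size are illustrative assumptions, not values from the workflow.

```python
def chunk_text(text, size=1000, overlap=100):
    """Split text into chunks of at most `size` characters that share `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    if not text:
        return []
    step = size - overlap
    # Stop early enough that the final chunk is never fully contained in the previous one.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + size] for i in range(0, last_start, step)]


def batched(items, batch_size):
    """Group items into lists of at most batch_size, preserving order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
```

With these helpers, a document that produces 40 chunks and a batch size of 16 would need 3 embedding requests instead of 40.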

Performance & Optimization

Metric	Expected Performance	Optimization Tips
Execution time (per doc)	~10s–2min (depends on file size & LlamaIndex processing)	Chunk large docs; run embeddings in batches
API calls (per doc)	3–8 (upload, poll(s), retrieve, embedding calls)	Increase poll interval; consolidate requests
Error handling	Retries via Wait loop and If checks	Add exponential backoff, failure notifications, and retry limits
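
The exponential backoff tip can be sketched as a delay schedule that a retry loop consumes. The base delay, growth factor, cap, and retry limit below are assumptions chosen for illustration.

```python
def backoff_delays(base_s=5.0, factor=2.0, cap_s=120.0, max_retries=6):
    """Yield one wait time per retry, growing geometrically and capped at cap_s."""
    delay = base_s
    for _ in range(max_retries):
        yield min(delay, cap_s)
        delay *= factor
```

For example, `list(backoff_delays(5, 2, 60, 6))` gives `[5, 10, 20, 40, 60, 60]`. Once the generator is exhausted, the retry limit has been reached and a failure notification can be sent.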

Troubleshooting

Problem	Cause	Solution
Authentication errors	Invalid/missing credentials	Reconfigure n8n Credentials; do not paste API keys directly into nodes
File not found	Incorrect fileId or permissions	Verify the Drive fileId and OAuth scopes; if you authenticate with a service account instead, share the file with it
Parsing stuck in PENDING	LlamaIndex processing delay or rate limit	Increase Wait node interval, monitor LlamaIndex dashboard, add retry limits
Embedding failures	Model/deployment mismatch or quota limits	Confirm Azure deployment name (3small) and subscription quotas

Created by: khmuhtadin
Category: Knowledge Management Tags: google-drive, llamaindex, azure-openai, embeddings, knowledge-base, vector-store

Nodes

set gmail telegram agent google-gemini

Complexity

intermediate

Published 02 Oct 2025
