Multimodal telegram bot with voice, image & video analysis using Claude & Gemini

Name: Multimodal telegram bot with voice, image & video analysis using Claude & Gemini
Availability: InStock
Author: Keith Uy

Multimodal telegram bot with voice, image & video analysis using Claude & Gemini preview

Open on n8n.io

$20/month : Unlimited workflows

2500 executions/month

Try free

THE #1 IN WEB SCRAPING

Scrape any website without limits

Try free

HOSTINGER

Early Deal
DISCOUNT 20%

Self-hosted n8n

Unlimited workflows - from $4.99/mo

Try free

#1 hub for scraping, AI & automation

6000+ actors - $5 credits/mo

Try free

Important notice

This workflow is provided as-is. Please review and test before using in production.

Overview

What it's for:

This is a base template for anyone trying to develop a telegram AI Agent. This base allows for multiple inputs (Voice, Picture, Video, and Text inputs) to be processed by an AI model of their choosing to a get a User started. From here, the User may connect any tools that they see fit to the AI Agent for their n8n workflows.

How it works:

Input: Telegram message to a bot chat

n8n Processing: Switch node determines the type:

Voice Message
Picture Message
Video Message
Text Message

(Currently uses OpenAI and Gemini to analyze Voice/Photo/Video content but feel free to change these nodes with other models)

AI Agent Proccessing: LLM of your choosing examines message and based on system prompt, generates an output

Output: AI Output is sent back in telegram Message

How to use:

Create your chat bot and generate access token -> Search Bot father in telegram -> Type "/newbot" -> follow instructions and create access token -> Copy access token
Create Credentials in n8n -> Open telegram trigger node -> Click create credential -> Paste access token -> Save
Create LLM access token (Different per LLM but search your LLM + API in google) -> (will have to create an account with the LLM platform) -> buy credits to use LLM API -> Generate Access token -> Paste token in LLM node

Requirements:

Telegram Bot Access Token
Google Gemini Access Token (For Picture and Video messages)
OpenAI Access Token (For Voice messages)
LLM Access Token (Your preference for the AI Agent)

Customizing this workflow:

To personalize the AI Output, adjust the system prompt (give context or directions on the AI's role)
Add tools to the AI agent to give it more utility besides a personalied LLM (Example: Calendars, Databases, etc).

Keith Uy

2 workflows

Nodes

n8n-nodes-base.telegramtrigger n8n-nodes-base.switch n8n-nodes-base.noop @n8n/n8n-nodes-langchain.agent @n8n/n8n-nodes-langchain.openai @n8n/n8n-nodes-langchain.memorybufferwindow n8n-nodes-base.datetimetool @n8n/n8n-nodes-langchain.lmchatanthropic

Complexity

advanced

Published 27 Sept 2025

Likes 0

View on n8n.io Download Workflow

Install path: /data/workflows/9008/9008.json

Share Your Workflow

Have a useful automation to share? Publish it and help the community.

Submit Your Template How to Submit

Related Workflows

Automate WhatsApp lead capture and replies with Whapi, Ollama and Sheets

Receive WhatsApp messages via Whapi, generate AI replies with a local Ollama model, log conversations in Google Sheets, and auto-capture leads — all without touching a cloud LLM. This n8n template builds a fully automated WhatsApp AI CRM using Whapi.cloud for messaging and Ollama for 100% local AI inference — no OpenAI costs, no data leaving your server. # How it works A Webhook node receives inbound WhatsApp messages from Whapi.cloud. A Code node extracts the sender's phone, name, message text, and filters out outbound/non-text messages. An IF node ensures only real inbound text messages from customers are processed. Google Sheets is used to fetch that customer's full conversation history, enabling memory across sessions. A Code node builds a full prompt — system instructions + conversation history + new message — passed to the AI model. Ollama (via LangChain LLM Chain node) generates a contextual reply using a local model (default: gemma3:1b). The user message and AI reply are each appended to Google Sheets as conversation history logs. A separate Google Sheets upsert captures or updates the lead record with phone and name. The AI reply is sent back to the customer via Whapi's HTTP API. # How to use Set up a Whapi.cloud account and connect a WhatsApp number. Point the webhook to your n8n webhook URL. Create a Google Sheet with a History tab (columns: Phone, Name, Role, Message, Timestamp) and a Leads tab (columns: Phone, Name, CreatedAt). Add your Google Sheets credentials and replace YOUR_GOOGLE_SHEET_ID in the relevant nodes. Run Ollama locally or on your server. Pull the model: ollama pull gemma3:1b. Update the model name in the Ollama node if using a different model. Customise the system prompt inside the Build AI Prompt node to match your business (real estate, support, bookings, etc.). Activate the workflow and send a WhatsApp message to test. # Requirements Whapi.cloud account (WhatsApp Business API) Ollama running locally or on a self-hosted server Google Sheets (with OAuth2 credentials connected in n8n) # Customising this workflow Switch AI models: Swap gemma3:1b for any Ollama-supported model like llama3, mistral, or phi3 depending on your hardware. Change the industry: Edit the system prompt in Build AI Prompt to serve any business — bookings, customer support, sales qualification, etc. Upgrade the CRM: Replace Google Sheets with Airtable, Notion, or a real CRM (HubSpot, Pipedrive) by swapping out the Sheets nodes. Add handoff logic: Insert a condition to escalate to a human agent if the message contains keywords like "speak to someone" or "human". Multi-language: The system prompt already instructs the AI to reply in the customer's language — no extra setup needed. # Who is this for It's designed for service businesses (real estate, consultants, agencies) that want to respond to inbound WhatsApp leads instantly, log conversations, and build a simple CRM — all from a single workflow.

View

Track multi-chain crypto portfolios and analyze risk with Gemini and QuickNode

**This workflow provides a fully automated multi-chain crypto portfolio tracking system powered by AI.** It fetches wallet balances and gas prices across multiple blockchain networks (e.g., Ethereum, Polygon, and more via QuickNode), retrieves real-time token prices, and calculates total portfolio value in USD. Using an AI agent, it generates: - Portfolio insights - Risk analysis - Investment suggestions - Gas fee insights across chains - Portfolio health score The final report is formatted and delivered directly to Slack. ⚙️ **Key Features** 🌐 Multi-chain support (Ethereum, Polygon, extendable to any EVM chain via QuickNode) 💰 Real-time USD portfolio valuation ⚖️ Accurate asset allocation (%) 🧠 AI-powered insights, risk & suggestions ⛽ Cross-chain gas fee analysis 📩 Automated Slack alerts ⏰ Daily scheduling support (Cron) 🔌 Powered by QuickNode for reliable blockchain data 🧠 **How It Works** - Fetch wallet balances across chains (via QuickNode RPC) - Fetch gas prices for each network - Retrieve live token prices (ETH, MATIC, etc.) - Calculate total portfolio value & allocation - Generate AI-driven insights and recommendations - Format a clean Slack-ready report - Send automated alert ⏰ **Scheduling (CORE VALUE)** Turn this into a Daily AI Portfolio Assistant: - Add a Cron node - Run every morning (e.g., 9 AM) - Get daily portfolio intelligence in Slack 🎯 **Use Cases** - Multi-chain portfolio tracking - Daily crypto risk monitoring - Automated investment insights - Web3 traders & investors - DAO treasury monitoring - Crypto founders & analysts 🔧 **Requirements** - Wallet address (EVM-compatible chains) - QuickNode RPC endpoints (Ethereum, Polygon, etc.) - Slack account (for alerts) - Price API (CoinGecko / CryptoCompare) **Sample Output** ![Sample Output.png](fileId:5480)

View

Orchestrate multi-agent compliance monitoring and audit logging with GPT-4o and Slack

## How It Works This workflow automates enterprise compliance governance using a multi-agent AI architecture. It targets compliance officers, legal teams, and risk managers who need continuous, jurisdiction-aware monitoring without manual intervention.The workflow runs on two paths: a scheduled governance check and an event-driven compliance signal receiver. Both converge on a central Governance Agent that orchestrates four specialised sub-agents, Compliance Signal, Jurisdiction Analysis, Remediation Planning, and Audit Documentation, each backed by dedicated AI models. External tools handle API lookups, penalty calculations, escalation notifications, and structured output parsing. Results feed into immutable audit logs, merged with scheduled reports, and dispatched as a daily email summary. Escalation logic ensures critical violations trigger immediate alerts. This eliminates fragmented compliance tracking, reduces human error, and provides auditable, timestamped records across multiple regulatory jurisdictions. ## Setup Steps 1. Add Claude/OpenAI credentials to all Chat Model nodes. 2. Configure the External Compliance API Tool with your regulatory API key and endpoint. 3. Set the Compliance Data Calculation Tool parameters (jurisdiction codes, penalty thresholds). 4. Connect the Escalation Notification Tool to your email or Slack credentials. 5. Set the Periodic Compliance Check schedule (cron) to your audit cadence. ## Prerequisites - External compliance/regulatory API access - Email or Slack account for notifications - n8n instance (v1.40+ recommended) ## Use Cases - Automated GDPR/MAS/SOX compliance monitoring across jurisdictions ## Customization Swap AI models per agent for cost optimisation. ## Benefits Eliminates manual compliance checks with autonomous multi-agent orchestration.

View

Need Custom Automation?

Get help designing a custom n8n workflow that connects your stack and fits your process.

Multimodal telegram bot with voice, image & video analysis using Claude & Gemini

Workflow preview

Important notice

Overview

What it's for:

How it works:

How to use:

Requirements:

Customizing this workflow: