Extract ecommerce product data with Google Sheets, ScrapingBee and Gemini

1. Workflow Overview

Quick Overview: This manually triggered workflow reads a list of webpage URLs from Google Sheets, scrapes each page with ScrapingBee, and uses Google Gemini to extract structured product data.

Best for

  • Market Research automation workflows
  • AI RAG automation workflows
  • advanced n8n builders looking for reusable templates

Tools used

n8n-nodes-base.manualTrigger, n8n-nodes-base.httpRequest, @n8n/n8n-nodes-langchain.outputParserStructured, @n8n/n8n-nodes-langchain.lmChatGoogleGemini, n8n-nodes-base.splitOut, n8n-nodes-base.googleSheets, n8n-nodes-base.stickyNote, n8n-nodes-base.set

Source and attribution

This workflow is cataloged by N8N Workflows and links back to its original n8n.io source page by Ravi Patel.

Original n8n.io source

1.1 Workflow description

Title
Extract ecommerce product data with Google Sheets, ScrapingBee and Gemini
Workflow name
Extract ecommerce product data with Google Sheets, ScrapingBee and Gemini

Quick Overview

This workflow runs on a manual trigger. It reads a list of webpage URLs from Google Sheets, scrapes each page with ScrapingBee, and uses Google Gemini to extract structured product data from screenshots, falling back to the page HTML when needed. It then appends the results to a Google Sheets sheet.

How it works

  1. Runs when you manually trigger the workflow.
  2. Reads the list of URLs to scrape from a Google Sheets spreadsheet.
  3. Fetches a full-page screenshot for each URL using the ScrapingBee API.
  4. Sends the screenshot (and the URL for context) to a Google Gemini model to extract product details into a structured JSON format, calling a ScrapingBee HTML fetch tool when screenshot extraction is incomplete.
  5. Converts any fetched HTML to Markdown and returns it to the Gemini agent to complete the extraction.
  6. Splits the extracted product array into individual items and appends them as new rows in the Google Sheets “Results” sheet.
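
The steps above can be sketched as a small control-flow example. This is a simplified illustration, not the workflow's actual logic: in the workflow the Gemini agent decides when to call the HTML tool, whereas this sketch stands in a fixed completeness check. The names `REQUIRED_FIELDS`, `is_complete`, `extract_rows`, `from_screenshot`, and `from_html` are all hypothetical.

```python
from typing import Callable, Iterable

Product = dict[str, object]

# Assumption: these two fields decide whether an extraction counts as complete.
REQUIRED_FIELDS = ("product_title", "product_price")


def is_complete(products: list[Product]) -> bool:
    """True when at least one product was found and none lacks a required field."""
    return bool(products) and all(
        product.get(field) not in (None, "")
        for product in products
        for field in REQUIRED_FIELDS
    )


def extract_rows(
    urls: Iterable[str],
    from_screenshot: Callable[[str], list[Product]],
    from_html: Callable[[str], list[Product]],
) -> list[Product]:
    """Try the screenshot extractor first, then fall back to the HTML extractor.

    Each URL may yield several products; they are flattened into one list,
    one item per future spreadsheet row (the job of the Split Out step).
    """
    rows: list[Product] = []
    for url in urls:
        products = from_screenshot(url)
        if not is_complete(products):
            products = from_html(url)
        rows.extend(products)
    return rows
```

Passing the two extractors in as callables keeps the sketch runnable without network access or API keys; stubs can stand in for the ScrapingBee and Gemini calls.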

Setup

  1. Create a Google Sheets service account connection in n8n, then select the target spreadsheet and its “List of URLs” and “Results” sheets.
  2. Add a Google Gemini (PaLM) API credential in n8n and ensure the selected model (gemini-1.5-pro-latest) is available for your account.
  3. Add your ScrapingBee API key in both ScrapingBee HTTP requests (screenshot and HTML) and confirm the target URLs are reachable from your n8n environment.
  4. Ensure the “Results” sheet columns match the structured output fields (for example: product_title, product_price, product_brand, promo, promo_percentage/promo_percent) or update the output schema and column mappings accordingly.
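
A mismatch between the output schema and the sheet columns (step 4) is easy to miss. The following sketch shows one way to check it before appending rows. The column list is taken from the example fields above and assumes the sheet uses `promo_percentage`; `EXPECTED_COLUMNS` and `to_row` are hypothetical names, not part of the workflow.

```python
# Assumption: the "Results" sheet has exactly these columns, in this order.
EXPECTED_COLUMNS = [
    "product_title",
    "product_price",
    "product_brand",
    "promo",
    "promo_percentage",
]


def to_row(product: dict, columns: list[str] = EXPECTED_COLUMNS) -> list:
    """Order one extracted product as a spreadsheet row.

    Raises ValueError when the product has a field with no matching column,
    for example promo_percent when the sheet expects promo_percentage.
    Missing fields become empty cells.
    """
    unknown = sorted(set(product) - set(columns))
    if unknown:
        raise ValueError(f"fields without a matching column: {unknown}")
    return [product.get(column, "") for column in columns]
```

Failing loudly on an unknown field is a design choice for this sketch; silently dropping the field would hide exactly the naming drift that step 4 warns about.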

1.2 Logical Blocks

This catalog entry is organized from the workflow JSON. The node-level section below shows the executable blocks available for review before importing the template.

2. Block-by-Block Analysis

Block 1 - When clicking ‘Test workflow’

Type / Role
n8n-nodes-base.manualTrigger - manualTrigger
Config choices
Version 1

Block 2 - ScrapingBee- Get page HTML

Type / Role
n8n-nodes-base.httpRequest - httpRequest
Config choices
Version 4.2
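
According to the "How it works" section, the HTML fetched here is converted to Markdown before it goes back to the Gemini agent. The sketch below illustrates what such a conversion does using only the Python standard library. It is not the converter n8n uses, it handles only headings, list items, and paragraphs, and `HtmlToMarkdown` and `html_to_markdown` are hypothetical names.

```python
import re
from html.parser import HTMLParser


class HtmlToMarkdown(HTMLParser):
    """Tiny HTML-to-Markdown-style converter: headings, list items, plain text."""

    SKIPPED = {"script", "style"}
    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
    LINE_BREAKERS = {"p", "div", "br", "tr", "ul", "ol"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.HEADINGS:
            self._parts.append("\n" + self.HEADINGS[tag])
        elif tag == "li":
            self._parts.append("\n- ")
        elif tag in self.LINE_BREAKERS:
            self._parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in self.HEADINGS or tag in self.LINE_BREAKERS or tag == "li":
            self._parts.append("\n")

    def handle_data(self, data):
        # Drop script/style bodies; collapse whitespace runs elsewhere.
        if self._skip_depth == 0:
            self._parts.append(re.sub(r"\s+", " ", data))

    def markdown(self) -> str:
        lines = (line.strip() for line in "".join(self._parts).split("\n"))
        return "\n".join(line for line in lines if line)


def html_to_markdown(html: str) -> str:
    parser = HtmlToMarkdown()
    parser.feed(html)
    parser.close()
    return parser.markdown()
```

Dropping markup, scripts, and styles leaves the model with far less text to read than the raw page, which is the usual reason for a conversion step like this.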

Block 3 - Structured Output Parser

Type / Role
@n8n/n8n-nodes-langchain.outputParserStructured - outputParserStructured
Config choices
Version 1.2

Block 4 - Google Gemini Chat Model

Type / Role
@n8n/n8n-nodes-langchain.lmChatGoogleGemini - lmChatGoogleGemini
Config choices
Version 1

Block 5 - Split Out

Type / Role
n8n-nodes-base.splitOut - splitOut
Config choices
Version 1

Block 6 - Google Sheets - Get list of URLs

Type / Role
n8n-nodes-base.googleSheets - googleSheets
Config choices
Version 4.5

Block 7 - Sticky Note

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 8 - Sticky Note1

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 9 - Sticky Note2

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 10 - Set fields

Type / Role
n8n-nodes-base.set - set
Config choices
Version 3.4

Block 11 - Sticky Note3

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 12 - ScrapingBee - Get page screenshot

Type / Role
n8n-nodes-base.httpRequest - httpRequest
Config choices
Version 4.2
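
As a rough sketch of the request this HTTP Request node might send, the snippet below builds a ScrapingBee URL for a full-page screenshot. The node's real parameters are not shown in this catalog entry, so the endpoint and the `screenshot_full_page` parameter are assumptions based on ScrapingBee's public HTML API and should be checked against its documentation. `build_screenshot_request` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumption: ScrapingBee's v1 HTML API endpoint.
SCRAPINGBEE_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def build_screenshot_request(api_key: str, target_url: str) -> str:
    """Return a GET URL asking ScrapingBee for a full-page screenshot."""
    params = {
        "api_key": api_key,
        "url": target_url,  # urlencode percent-encodes the target URL
        "screenshot_full_page": "true",  # assumed parameter name
    }
    return f"{SCRAPINGBEE_ENDPOINT}?{urlencode(params)}"
```

For example, `build_screenshot_request("YOUR_API_KEY", "https://example.com/p?id=1")` yields a URL whose `url` parameter is percent-encoded, so the target's own query string cannot be confused with ScrapingBee's parameters.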

Block 13 - Sticky Note4

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 14 - HTML-based Scraping Tool

Type / Role
@n8n/n8n-nodes-langchain.toolWorkflow - toolWorkflow
Config choices
Version 1.2

Block 15 - Sticky Note5

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 16 - Sticky Note6

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 17 - Sticky Note7

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 18 - Sticky Note8

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 19 - Sticky Note9

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 20 - Google Sheets - Create Rows

Type / Role
n8n-nodes-base.googleSheets - googleSheets
Config choices
Version 4.5

Block 21 - Vision-based Scraping Agent

Type / Role
@n8n/n8n-nodes-langchain.agent - agent
Config choices
Version 1.7

Block 22 - Sticky Note10

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Block 23 - HTML-Scraping Tool

Type / Role
n8n-nodes-base.executeWorkflowTrigger - executeWorkflowTrigger
Config choices
Version 1

Block 24 - Sticky Note11

Type / Role
n8n-nodes-base.stickyNote - stickyNote
Config choices
Version 1

Showing the first 24 of 29 workflow blocks. Download the JSON for the full node graph.
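
To see the blocks that are not listed here, the downloaded JSON can be inspected directly. The sketch below assumes the usual n8n export layout: a top-level `nodes` array whose entries carry `name`, `type`, and an optional `credentials` object. `count_node_types` and `nodes_needing_credentials` are hypothetical helper names.

```python
import json
from collections import Counter


def count_node_types(workflow_json: str) -> Counter:
    """Count how many nodes of each type the exported workflow contains."""
    workflow = json.loads(workflow_json)
    return Counter(node["type"] for node in workflow.get("nodes", []))


def nodes_needing_credentials(workflow_json: str) -> list[str]:
    """List the names of nodes that reference a credential."""
    workflow = json.loads(workflow_json)
    return [
        node["name"]
        for node in workflow.get("nodes", [])
        if node.get("credentials")
    ]
```

Running both helpers on the export before importing gives a quick list of the node types to review and the credentials to replace.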

3. Summary Table

Workflow: Extract ecommerce product data with Google Sheets, ScrapingBee and Gemini
Complexity: advanced
Nodes: 29
Categories: Market Research, AI RAG
Author: Ravi Patel
Published: 15 Jun 2026

4. Reproducing the Workflow from Scratch

  1. Download the workflow JSON

    Use the JSON export at /data/workflows/16354/16354.json as the source template for this automation.

  2. Import the template into n8n

    Open n8n, import the downloaded JSON, and review each node before activating the workflow.

  3. Configure credentials and variables

    Replace placeholder credentials, API keys, webhook URLs, account IDs, and environment-specific values with your own settings.

  4. Test with sample data

    Run the workflow manually or in a staging workspace, inspect node output, and confirm downstream systems receive the expected data.

  5. Activate and monitor

    Enable the workflow only after testing, then monitor executions, errors, and rate limits during the first production runs.

5. General Notes & Resources

Review imported nodes carefully before activation. This catalog entry is intended to help you inspect the workflow structure, understand required services, and find related templates faster.

Node names, credentials, schedules, webhook paths, and external service limits may need adjustment for your workspace.

Frequently asked questions

What does Extract ecommerce product data with Google Sheets, ScrapingBee and Gemini do?

It runs on a manual trigger, reads a list of webpage URLs from Google Sheets, scrapes each page with ScrapingBee, and uses Google Gemini to extract structured product data from screenshots, with an HTML fallback. The results are appended to a Google Sheets sheet.

What do I need before importing this workflow?

Review the workflow JSON, configure any required credentials in n8n, and test the automation in a safe workspace before using it in production.

Can I customize this workflow?

Yes. Use the block-by-block analysis and the downloadable JSON to inspect each node, then adjust credentials, prompts, schedules, filters, or destinations for your Market Research or AI RAG use case.