The project involves reading PDF packing lists stored in Google Drive, extracting structured data (order number, product names, and quantities), detecting pen color in checkboxes to assign packer identity (with a predefined mapping), and updating a Google Sheet with the results. The workflow should be automated, low-maintenance once set up, and ideally cost-efficient.
The fulfillment team hand-checks every PDF packing list, then re-keys order numbers, SKUs, and quantities into a Google Sheet. With nearly 7,000 orders a month, this manual transcription is slow, error-prone, and offers no clear audit trail of which packer prepared each order.
Design and implement a hands-off, low-maintenance workflow that:
To automate the extraction of data from scanned packing lists, each tool plays a clear role from handling incoming files to pulling out key details and saving them in a tracking system.
We’ve also added simple analogies to help explain what each tool does and how they work together as one smooth process.
The PDF is converted into a readable image using PDF.co, then analyzed step-by-step by GPT to extract order names, order numbers, quantities, dates, and even pen color used and it is used to identify the packer. Finally, the parsed and enriched data is appended into a Google Sheet with no manual intervention required, ensuring fast, accurate, and consistent data logging.
Before running the automated workflow in n8n, make sure the Google Sheet is structured to capture the extracted data accurately.
The current sheet is organized into the following columns:
A unique identifier assigned to each processed packing list entry. This ensures no duplicate records are stored and helps track each row distinctly
Extracted from the packing list, this identifies the unique order code
Lists the product names as detected by GPT-4o from the scanned PDF.
Displays the corresponding quantity of each product.
Indicates the processing or packing date as recognized from the document.
Stores the detected pen color used to mark or check items, which is used to determine the packer’s identity.
Based on pen color analysis, this column logs the associated packer’s name.
A checkbox column used for manual review or marking completion after validation.
You are a document parser.
Analyze the attached scanned order form image and extract only the Orders Name.
Here is the scanned order form image: {{ $('PDFco Api').item.json.body[0] }}
Return only the Orders Name, with no additional words, explanations, or punctuation.
You are a document parser.
Analyze the attached scanned order form image and extract only the Order Number.
Here is the scanned order form image: {{ $('PDFco Api').item.json.body[0] }}
Return only the Order Number, with no additional words, explanations, or punctuation.
You are a document parser.
Analyze the scanned order form image at: {{ $('PDFco Api').item.json.body[0] }}
Extract only the Quantity values (not the "Ship Quantity"). If multiple orders are present, extract all corresponding quantities.
Return only the Quantity values—no extra words, explanations, or punctuation.
You are a document parser.
Analyze the scanned order form image at: {{ $('PDFco Api').item.json.body[0] }}.
Locate the scribbled or check mark pen color beside the "Packer" section and extract it.
Only output is just the pen color (e.g. Red, Blue, Purple)
You are a document parser.
Extract and identify the associated name based on the pen color used.
Return only the name—no additional words, formatting, or punctuation.
Color-to-Name Mapping:
Red → Olivia Thompson
Blue → Jack Williams
Purple → Sophie Patel
Here: {{ $json.content }}
Fill out the form and let’s get started.