Automate Data Extraction with Low-Code Platforms | Latenode Tutorial
Automate Data Extraction with Latenode: A Step-by-Step Guide
If you regularly handle large numbers of files, such as CSVs or reports, automation can be a game-changer. In this tutorial, we'll show you how to build an automated workflow that extracts the data you need from files and stores it, without you having to read them. We'll use three main tools: a ChatGPT Assistant, an online file converter API, and Latenode, a low-code automation platform.
For those interested in replicating this workflow, there's a link in the description to a ready-to-go template. Now, let’s dive into the details of creating this workflow.
A Walkthrough of the Workflow
First, we need a Google Drive account. Imagine you are hiring, and candidates send their CVs to your Google Drive. This workflow will find files with 'CV' in their names, convert these PDF files to text, extract data such as name, email, and experience, and finally store this information in a Google Spreadsheet.
Initial Setup: Triggers and Google Drive
Start by logging into your Latenode account and creating a new scenario. Add two triggers: one for a schedule and another for manual activation, useful for development and testing. After setting up the triggers, switch to the Google Drive section to find and download files with 'CV' in their names.
To set up the Google Drive node, you need an authorization token. If you don't have one, create a new authorization. Specify the drive, set the search to match files with 'CV' in their names, and run the node. The file details will appear in the console.
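If you ever need to express the same search outside the node's form fields, the sketch below builds a query string in the Google Drive API v3 `q` syntax. It assumes the files are PDFs and that the Latenode node accepts or produces an equivalent query; `buildDriveQuery` is a hypothetical helper, not part of Latenode.

```javascript
// Build a Google Drive API v3 search query (the `q` parameter of files.list).
// Assumption: the Latenode Google Drive node accepts or generates an equivalent query.
function buildDriveQuery(nameFragment, mimeType) {
  // Drive query strings require backslashes and single quotes to be escaped.
  const escape = (s) => s.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
  const clauses = [
    `name contains '${escape(nameFragment)}'`,
    "trashed = false", // skip files that are in the bin
  ];
  if (mimeType) {
    clauses.push(`mimeType = '${escape(mimeType)}'`);
  }
  return clauses.join(" and ");
}

const query = buildDriveQuery("CV", "application/pdf");
console.log(query);
```

Note that in the Drive API, `contains` on `name` matches word prefixes, so a file called `Jane CV.pdf` should match, while `JaneCV.pdf` may not. It is worth checking this against a few real file names.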
Converting and Parsing Files
Next, we'll convert the downloaded PDF files to text using the Converter API. Add an HTTP request node, connect it to the previous nodes, and fill in the necessary details from the API documentation. The API returns the converted text as a base64-encoded string.
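To make the shape of that HTTP request concrete, here is a rough sketch of the settings you would enter in the node. Everything specific in it is a placeholder: the URL, the header scheme, and the body field names (`source`, `from`, `to`) are hypothetical and must be replaced with the values from your converter's API documentation.

```javascript
// Hypothetical endpoint: replace with the URL from your converter's API docs.
const CONVERTER_URL = "https://converter.example.com/v1/convert";

// Describe the HTTP request for one file. Nothing is sent here; the returned
// object only mirrors the fields you would fill in on the HTTP request node.
function buildConvertRequest(fileUrl, apiKey) {
  return {
    url: CONVERTER_URL,
    method: "POST",
    headers: {
      // Assumed auth scheme; many converter APIs use a bearer token or API key.
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    // Assumed body shape: the source file plus input and output formats.
    body: JSON.stringify({ source: fileUrl, from: "pdf", to: "txt" }),
  };
}

const request = buildConvertRequest("https://example.com/files/cv.pdf", "YOUR_API_KEY");
console.log(request.method, request.url);
```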
To decode the base64 content, use a JavaScript node. Code samples are provided in the description to make this easier. Copy and paste the code, making sure the field names match the output of your previous node. Run the node to decode the content into a plain text string.
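If you don't have the samples to hand, the decoding step itself is small. The following is a minimal sketch using Node.js's built-in `Buffer`; it assumes the converter returned UTF-8 text, and `decodeBase64Text` is a hypothetical helper name. How you read the input from the previous node depends on Latenode's JavaScript node format, which is not shown here.

```javascript
// Decode a base64 string into UTF-8 text using Node.js's built-in Buffer.
function decodeBase64Text(base64Content) {
  if (typeof base64Content !== "string" || base64Content.length === 0) {
    throw new TypeError("Expected a non-empty base64 string");
  }
  return Buffer.from(base64Content, "base64").toString("utf8");
}

// Round-trip check with a made-up sample line.
const sample = Buffer.from("Jane Doe, jane@example.com", "utf8").toString("base64");
console.log(decodeBase64Text(sample));
```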
Extracting Information Using ChatGPT
Next, use the ChatGPT Assistant for data extraction. Create three ChatGPT nodes: Create Thread, Create Message, and Get Reply. In your OpenAI account, set up the assistant to extract name, email, and experience. You'll need the assistant ID for this.
First, create a thread to start a conversation with ChatGPT. Then, create a message detailing what you want to extract. Use the decoded file content as input. Finally, get the assistant's reply, which will contain the extracted data in a structured format.
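As a rough illustration of the message and reply steps, the sketch below builds the text for the Create Message node and parses what comes back from Get Reply. It assumes you instruct the assistant to answer in JSON with the keys `name`, `email`, and `experience`; the function names and the prompt wording are hypothetical, and your assistant's actual reply format may differ.

```javascript
// Three backticks, built at runtime so this sample contains no literal fence.
const CODE_FENCE = "`".repeat(3);

// Build the user message sent to the assistant (the "Create Message" step).
function buildExtractionMessage(cvText) {
  return [
    "Extract the candidate's name, email, and experience from the CV below.",
    'Reply with JSON only, using the keys "name", "email", and "experience".',
    "",
    cvText.trim(),
  ].join("\n");
}

// Parse the assistant's reply (the "Get Reply" step) into a plain object.
// Assumption: the reply is JSON, optionally wrapped in a Markdown code fence.
function parseAssistantReply(replyText) {
  let text = replyText.trim();
  if (text.startsWith(CODE_FENCE)) {
    // Drop the opening fence line (with any language tag) and the closing fence.
    text = text.slice(text.indexOf("\n") + 1);
    if (text.endsWith(CODE_FENCE)) text = text.slice(0, -CODE_FENCE.length);
  }
  const parsed = JSON.parse(text);
  // Missing fields become empty strings so later steps always see every key.
  return {
    name: parsed.name ?? "",
    email: parsed.email ?? "",
    experience: parsed.experience ?? "",
  };
}

const reply = '{"name": "Jane Doe", "email": "jane@example.com", "experience": "5 years"}';
console.log(parseAssistantReply(reply));
```

Asking for JSON keeps the later mapping step simple, but `JSON.parse` will throw if the assistant adds extra commentary, so it is worth handling that error in a real scenario.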
Final Step: Filling the Google Spreadsheet
In the last step, we will insert the extracted data into the Google Spreadsheet. Use the Google Sheets node to add a single row. Authorize Google Sheets and specify the sheet details and column names. Map the data extracted by ChatGPT to the respective columns.
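The mapping itself amounts to putting each extracted field under the matching column. Here is a small sketch of that idea, assuming the sheet's header row is `Name`, `Email`, `Experience` and the extracted record uses lowercase keys; both the column names and the `toSheetRow` helper are hypothetical.

```javascript
// Column order as it appears in the sheet's header row (assumed names).
const COLUMNS = ["Name", "Email", "Experience"];

// Turn one extracted record into an array of cell values in column order.
function toSheetRow(record, columns = COLUMNS) {
  return columns.map((column) => {
    const value = record[column.toLowerCase()];
    // Cells hold single values, so lists are joined and missing values left blank.
    if (value === undefined || value === null) return "";
    return Array.isArray(value) ? value.join("; ") : String(value);
  });
}

const row = toSheetRow({
  name: "Jane Doe",
  email: "jane@example.com",
  experience: ["Acme, 3 years", "Globex, 2 years"],
});
console.log(row);
```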
Run the node to verify that the data is populated correctly in your spreadsheet. If the extracted values are off, you can improve accuracy by adjusting the assistant's prompt.
Saving Time with Automation
This automated workflow can save significant time, especially for tasks such as hiring and processing large volumes of reports. Once deployed, the workflow runs on schedule and processes new files automatically, reducing manual work.
If you found this guide helpful, consider subscribing to our channel. For any questions, feel free to join our Discord community, where we discuss automation and more. Happy automating!