By utilizing our product, users can streamline their workflow and efficiently extract text from PDFs in a matter of minutes. Our automated system eliminates the need for manual data entry, reducing the risk of errors and increasing overall productivity. With just a few simple clicks, users can extract text from multiple PDF files simultaneously, making it a valuable tool for businesses and individuals alike.
Build your automated PDF-to-text extractor in a minute! It is a perfect solution for those who deal with large amounts of incoming CVs or other documents. Extract data from documents using a remote workflow on Latenode.com!
In this article you'll see how to create an automated no-code workflow that extracts data from the PDF files on your Google Drive using an API and ChatGPT, and fills a Google Sheet with the data.
You can also get the template of that scenario, which you can copy and use for yourself for free!
You can upgrade this template or use it for more than PDF text extraction.
Let's take a look at the whole PDF scraper scenario first, and then break each step down.
How does this PDF extraction scenario work? Once an hour, it takes all the CVs in PDF format from Google Drive and converts them to TXT files using an HTTP request. Then a ChatGPT assistant extracts the needed data and fills out the Google Spreadsheet.
And here's a step-by-step instruction for those who want to learn the process. Remember the free, ready-to-go templates at the end of the article.
For this scenario, you'll need access to OpenAI Assistants and any API converter.
Creating the assistant is simple. Log into your OpenAI account, go to Assistants, and click the "Create" button in the upper right corner. You'll see the assistant's settings panel.
Fill in the settings there, then copy the assistant's ID (you'll see it under the Name column). After that, go to API keys, create a key, and save it.
For the API converter, use any service you like. I took ConvertAPI because it has a free trial and provides a lot of information.
Now we switch to Latenode.com. Here, we have to create the PDF-to-text conversion scenario. (You don't need to create it from scratch; just copy the template at the end of the article.)
Click "Add node" in the scenario tab and choose "Schedule" from the list. Click on the node to set it up. Specify the interval and the timezone, then save the changes. I also added a "Run once" trigger, just for convenience.
Click "Add node", search for the Google Drive folder in the actions tab, and choose the "Find file" node.
To make it work, you have to log into your Gmail account to get an access token, then choose the drive and fill in "Search name". In this case, I want to extract data from files that have CV in their name.
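If you ever want to reproduce this search outside the no-code node, here is a minimal sketch of the equivalent Google Drive API search string. The helper name `buildDriveQuery` is hypothetical, and the PDF and trash filters are my assumptions rather than settings taken from the node.

```javascript
// Hypothetical helper: builds a Drive API v3 `q` search string
// for PDF files whose name contains the given fragment.
function buildDriveQuery(nameFragment) {
  // Escape backslashes first, then single quotes, so the fragment
  // cannot break out of the quoted string in the query.
  const escaped = nameFragment.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
  // Assumption: only non-trashed PDF files are of interest.
  return `name contains '${escaped}' and mimeType = 'application/pdf' and trashed = false`;
}

console.log(buildDriveQuery("CV"));
```

The resulting string would go into the `q` parameter of the Drive `files.list` call.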
Next, add "Download file" from the same Google Drive action folder.
Use the id from the results of the previous node, and then click "Run node once" to save the changes and make the data flow through the scenario. You'll get the file in the output.
Next comes a JavaScript node. This is the code the AI gave me; you can take it here (#1). Replace const fileContentPath with your object from the previous node.
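The linked code is not reproduced in this article, so here is only a rough sketch of what such a node might do, assuming its job is to read the downloaded file and return its content as base64 for the conversion request. The function name `fileToBase64` is hypothetical, and the wrapper that Latenode expects around the code is left out.

```javascript
const fs = require("fs");

// Reads the file at `fileContentPath` and returns its content
// encoded as a base64 string.
function fileToBase64(fileContentPath) {
  return fs.readFileSync(fileContentPath).toString("base64");
}
```

In the scenario, `fileContentPath` would be the path of the file returned by the "Download file" node.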
Find "HTTP request" in the list of actions. To understand how to set it up, visit the ConvertAPI documentation, which explains how the request should look.
Here's what my PDF-to-TXT conversion request looks like.
I use an object from the "Find file" Google Drive node to specify the name of the downloaded file, and the file content in base64 from the JavaScript node. I also add a Content-Type=application/json pair in Headers.
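For reference, the request described above could be sketched in code roughly like this. The endpoint, the `Secret` query parameter, and the `Parameters` / `FileValue` field names are my understanding of ConvertAPI's JSON request format and should be checked against their documentation; `buildConvertRequest` is a hypothetical helper.

```javascript
// Builds the URL and options for a PDF-to-TXT conversion request.
// Assumption: endpoint and body field names follow ConvertAPI's
// JSON request format; verify them in the ConvertAPI docs.
function buildConvertRequest(fileName, base64Content, secret) {
  return {
    url: `https://v2.convertapi.com/convert/pdf/to/txt?Secret=${encodeURIComponent(secret)}`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        Parameters: [
          // File name from the "Find file" node, content from the JavaScript node.
          { Name: "File", FileValue: { Name: fileName, Data: base64Content } },
        ],
      }),
    },
  };
}
```

The returned `url` and `options` can be passed straight to `fetch(url, options)` if you want to try the call outside Latenode.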
Run node once to get the file from the HTTP request.
Another code node, #2 here.
This time, I asked AI to extract text from the txt file.
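Code #2 is not shown here either, so the following is just a sketch of the idea. It assumes the converter returns a JSON object with a `Files` array whose items carry the converted file as base64 in `FileData`; that response shape is my assumption, and `extractText` is a hypothetical name.

```javascript
// Decodes the converted TXT file from the converter's response.
// Assumed response shape: { Files: [{ FileName, FileData }] },
// where FileData is the file content in base64.
function extractText(convertResponse) {
  const files = (convertResponse && convertResponse.Files) || [];
  if (files.length === 0) {
    throw new Error("No converted files in response");
  }
  return Buffer.from(files[0].FileData, "base64").toString("utf8");
}
```

The returned string is the plain text that gets passed to the assistant in the next steps.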
We'll face 3 GPT nodes here.
Each node performs an action with OpenAI.
First we create a thread, or conversation, with the GPT Assistant.
Insert your OpenAI API key, and that's it! Run the node once and get the id of the created thread in the output.
The second node adds a message to the thread. Here you need the API key again. In the thread ID field, put the result of the previous node. You'll see it in the helper window after you click on the input field.
In "Message content", give some additional instructions if you want, and put in the file content from the last JavaScript node. The automated PDF extractor is one step closer!
The third node gets the PDF scraper's reply.
Set up the node to match your OpenAI assistant and use the GPT assistant ID.
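If you are curious what these three nodes roughly correspond to, here is a sketch of the matching OpenAI Assistants API requests: create a thread, add a message, and start a run with the assistant ID. The helper names are hypothetical, and this is my reading of the public API, not necessarily what Latenode does internally.

```javascript
const OPENAI_BASE = "https://api.openai.com/v1";

// Common headers; the Assistants API expects the OpenAI-Beta header.
function openAiHeaders(apiKey) {
  return {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
    "OpenAI-Beta": "assistants=v2",
  };
}

// Small helper that packages a POST request description.
function postRequest(apiKey, path, body) {
  return {
    url: `${OPENAI_BASE}${path}`,
    options: { method: "POST", headers: openAiHeaders(apiKey), body: JSON.stringify(body) },
  };
}

// Node 1: create an empty thread (conversation).
function createThreadRequest(apiKey) {
  return postRequest(apiKey, "/threads", {});
}

// Node 2: add the extracted file content as a user message.
function addMessageRequest(apiKey, threadId, content) {
  return postRequest(apiKey, `/threads/${threadId}/messages`, { role: "user", content });
}

// Node 3: run the assistant on the thread.
function createRunRequest(apiKey, threadId, assistantId) {
  return postRequest(apiKey, `/threads/${threadId}/runs`, { assistant_id: assistantId });
}
```

Each helper returns a `url` and `options` pair that can be passed to `fetch`. A run is asynchronous, so in raw API usage the reply is read from the thread's messages only after the run has completed.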
Here we use a JS node for the last time, to make 3 separate JSON objects out of the Assistant's reply.
Here's the example; just put your data in the content const.
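Since the example itself is not included in the text, here is a minimal sketch of the idea. It assumes the assistant was instructed to answer with a single JSON object; the three field names are purely hypothetical placeholders, so swap in whatever fields your assistant returns.

```javascript
// Hypothetical field names; replace them with the fields
// your assistant is instructed to return.
const FIELDS = ["name", "email", "phone"];

// Splits the assistant's reply into one small JSON object per field.
function splitReply(content) {
  // Keep only the text between the first "{" and the last "}",
  // in case the assistant wrapped the JSON in extra prose.
  const start = content.indexOf("{");
  const end = content.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in reply");
  }
  const parsed = JSON.parse(content.slice(start, end + 1));
  // Missing fields become empty strings so the sheet row stays aligned.
  return FIELDS.map((field) => ({ [field]: parsed[field] ?? "" }));
}
```

Each of the three resulting objects can then be mapped to its own column in the spreadsheet step.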
Let's put this data somewhere. Google Sheets is a good option for this PDF data extraction scenario on Latenode.com.
Log into your Gmail account once more to get an access token, choose the drive and the sheet, and put the JSON objects into the fields. Save the scenario, then click "Run once" to run it, or deploy the scenario to activate the schedule trigger.
After a successful execution, this workflow extracts the text from the PDF files on your Google Drive and puts it into your Google Spreadsheet.
That is how to create a PDF extractor with no code on Latenode.
As I promised, here's the template of this workflow. Just copy it and follow this guide article to set it up.
There's a video about it, don't bother reading!
If you want somebody to help you, check out our Discord channel, we have some devs in it ready to assist!
Yes, Latenode is designed for users of all skill levels. It offers advanced features for those proficient in JavaScript and intuitive visual tools and AI assistance for beginners. Whether you're an experienced developer or a novice, Latenode provides a user-friendly experience tailored to your skill level.
Yes, Latenode supports integration with a wide range of third-party services and APIs. You can connect Latenode to various online platforms, databases, and software systems to automate data transfers, trigger actions, and streamline workflows. Latenode also provides tools and resources to facilitate the integration process.
Yes, Latenode offers a free version that lets you explore its capabilities. This version includes a subset of Latenode's features, enabling you to start with automation and experience its benefits. You can then decide whether to upgrade to a paid plan for additional features and resources.
Latenode is a visual and intuitive automation tool designed to help users streamline their workflows. It lets users build automated processes by connecting various web services and devices, so they can automate tasks and improve productivity.
By integrating all your marketing tools in one place through data integration, Latenode helps you gain a comprehensive view of your operations. This enables you to identify potential opportunities more easily and make informed decisions based on accurate data.