importing pdf-parse and fs modules

Hello

First, thank you for providing this very useful tool.

However, I have bank statement files whose names do not indicate the period covered. The only way to get this information is to read the PDF itself.

I archive my bank statements with names in this format: ISO period start date + bank and account information + ISO period end date. For instance: 20251101_BNK_ME_EUR_20251231

Would it be possible, when using a JavaScript method, to import and use modules like fs and pdf-parse?

Thank you for looking at this question.
Laurent

Reply to #1:

Hi Laurent,

I was waiting to see if Kim, the developer, would jump in. At any rate, he has stated many times that he won't allow loading external modules because of security, complexity, and support problems, and I believe he is right. He has, however, added a new JavaScript command, readFileText(), that (if you can OCR your PDFs) can read the text versions and extract any information available there. You can then add the date to the names of the text files (they should have the same names as the PDFs, but with the .txt extension of course), make sure Pair renaming is on, then add your PDFs; through the magic of pair renaming, your PDFs will take on the new name, including whatever you've added to the .txt file name.
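To give an idea of the extraction step, here is a minimal sketch in plain JavaScript. It assumes the OCR text contains dates written as DD/MM/YYYY and that the earliest and latest dates found mark the statement period; both are guesses about your bank's layout. The function names and the sample text are made up for illustration, and in a real script method the text would come from readFileText(), whose exact call is not shown in this thread.

```javascript
// Hypothetical helper: collects every DD/MM/YYYY date in the text
// and returns them as sortable YYYYMMDD strings.
function findIsoDates(text) {
  const pattern = /\b(\d{2})\/(\d{2})\/(\d{4})\b/g; // assumed date layout
  const dates = [];
  let match;
  while ((match = pattern.exec(text)) !== null) {
    dates.push(match[3] + match[2] + match[1]); // year + month + day
  }
  return dates.sort(); // YYYYMMDD strings sort chronologically
}

// Hypothetical helper: builds "start_ACCOUNT_end" from the earliest
// and latest dates found, or returns null if the text has no dates.
function buildStatementName(text, accountTag) {
  const dates = findIsoDates(text);
  if (dates.length === 0) {
    return null;
  }
  return dates[0] + "_" + accountTag + "_" + dates[dates.length - 1];
}

// Made-up sample standing in for the OCR text of one statement.
const sampleText =
  "Statement period 01/11/2025 to 31/12/2025\n" +
  "15/11/2025 Card payment 42.00";
const newName = buildStatementName(sampleText, "BNK_ME_EUR");
```

Taking the earliest and latest dates is the simplest rule, but it will pick up a print date or a carried-over transaction if the statement has one, so matching the specific "period" line of your bank's layout would be more reliable.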

There are a couple of other ways to do this, short of running a JavaScript interpreter on your computer outside of any renaming program:
1. Use an online AI renamer (something I wouldn't do with any files, and certainly not with bank info).
2. I read about a renaming program for Macs that can delve into PDF files, but it does the same thing I just suggested: it OCRs the PDF and extracts the information you set up in field descriptors. If you have a Mac it might be worth looking at; I don't know.

I do know that I wrote a script to extract date info from my bank PDFs. After the half hour spent writing it, it took less time to batch OCR the PDFs and run them through the script than it does to describe the process. My script would probably need a lot of changes to work for you, but if you want to see it, let me know.

Best,
DF

Reply to #2:
For added security the standard library is not available. This means that you cannot access the file system or the internet from the script methods. The same goes for importing external libraries.

You can access the content of a file with the readFileText() method, but since PDFs are binary, this might not work for you. readFileText() is meant for text files only. The next version, 4.20, will have a modified version of readFileText() that can read text parts from a binary file. Version 4.20 should have been released in December, but because of the holidays I have chosen to postpone the release until January.
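As a rough illustration of what "text parts from a binary file" can mean, the sketch below collects runs of printable ASCII characters from a byte array. This is only an assumed approach for explanation, not a description of how version 4.20 does it; the function names, the minimum run length, and the sample bytes are all made up.

```javascript
// Hypothetical helper: returns every run of printable ASCII bytes
// (0x20 to 0x7e) that is at least minLength characters long.
function extractTextRuns(bytes, minLength) {
  const runs = [];
  let current = "";
  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];
    if (b >= 0x20 && b <= 0x7e) {
      current += String.fromCharCode(b);
    } else {
      // A non-printable byte ends the current run.
      if (current.length >= minLength) {
        runs.push(current);
      }
      current = "";
    }
  }
  if (current.length >= minLength) {
    runs.push(current);
  }
  return runs;
}

// Converts an ASCII string to an array of byte values (sample data only).
function toBytes(str) {
  return Array.from(str, (ch) => ch.charCodeAt(0));
}

// Made-up bytes: two readable runs separated by binary noise,
// plus a run ("ab") that is too short to keep.
const sample = Uint8Array.from([
  ...toBytes("%PDF-1.7"), 0x00, 0xff,
  ...toBytes("ab"), 0x0a,
  ...toBytes("Period 01/11/2025"), 0x00,
]);
const runs = extractTextRuns(sample, 4);
```

Keep in mind that the page content inside a PDF is often stored compressed, so a scan like this may surface only headers and metadata rather than the statement dates. That is one reason the OCR-to-text route can still be the more dependable one.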

Reply to #3:

Thank you for all the input on my request. I will wait for version 4.20 and its readFileText() method. Have a great 2026.