PDFpen OCR Folder Action Script

As discussed on Mac Power Users episode 3, “Going Paperless,” the nice people at Smile On My Mac put together an AppleScript that, when combined with a folder action, automatically OCRs documents using PDFpen or PDFpenPro. Here is the promised walkthrough:

What you’ll need:

1. Some scanned PDF images;
2. PDFpen or PDFpenPro (See my review here);
3. A bit of patience.

Step 1 – Load up the Script Editor

Script Editor.png

This little application allows you to create and save AppleScripts.

Step 2 – Copy in the script below

on adding folder items to this_folder after receiving added_items
	try
		-- Process each file dropped into the folder, one at a time
		repeat with i from 1 to number of items in added_items
			set this_item to item i of added_items
			tell application "PDFpenPro"
				open this_item
				set theDoc to document 1
				-- OCR every page of the document that was just opened
				repeat with aPage in pages of theDoc
					ocr aPage
					-- Looks like we need to modify PDFpen so that we can detect when OCR is done; for now use 15 seconds
					delay 15
				end repeat
				save theDoc
				close theDoc
			end tell
		end repeat
	on error errText
		-- Any failure stops the run and reports the error text
		display dialog "Error: " & errText
	end try
end adding folder items to

————-

Note – the script as posted targets PDFpenPro. If you use PDFpen instead, open the script and change the line that reads tell application "PDFpenPro" to read tell application "PDFpen".

Note 2 – WordPress tends to convert the double dash before the comment into an em-dash and the straight quotes into smart quotes. Although I fixed it in the WordPress code, it can revert to “fixing” things when I publish, so if the script you copied contains either, correct them in your editor before compiling. Sorry. If anyone knows a better way to post AppleScript via WordPress, please drop me a note.
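If you would rather not hunt for those characters by hand, something like the following might help. It is only a rough sketch: copy the script text from this page, run the snippet in Script Editor, then paste the cleaned text into a new script window. The handler name swapText and the variable names are my own inventions for illustration, and the snippet assumes the only damage is curly quotes and an em-dash standing in for the double dash.

```applescript
-- Replace every occurrence of findStr in sourceText with newStr
on swapText(findStr, newStr, sourceText)
	set savedDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to findStr
	set parts to text items of sourceText
	set AppleScript's text item delimiters to newStr
	set cleanedText to parts as text
	set AppleScript's text item delimiters to savedDelims
	return cleanedText
end swapText

-- Take whatever text is on the clipboard (the script copied from the web page)
set scriptText to the clipboard as text

-- Curly double and single quotes become straight quotes
set scriptText to swapText("“", "\"", scriptText)
set scriptText to swapText("”", "\"", scriptText)
set scriptText to swapText("’", "'", scriptText)

-- The em-dash that replaced the comment marker becomes a double dash again
set scriptText to swapText("—", "--", scriptText)

-- Put the cleaned text back on the clipboard, ready to paste
set the clipboard to scriptText
```

Note that this replaces every em-dash, so it is only appropriate for text where the em-dash can only be a mangled comment marker, as in the script above.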

Step 3 – Save the script

You need to save it to a specific directory:

HD/Library/Scripts/Folder Action Scripts/

I named mine “PDFpen Scriptacular.”

Step 4 – Create a folder

Create the folder wherever is convenient. Perhaps in your Documents folder or (for you anarchists) on the desktop. By the way, did you know that Command-Shift-N gets you a new folder? I named mine “OCR Drop.”

Step 5 – Enable folder actions

Secondary-click (right-click) the folder and choose Enable Folder Actions under the “More” item.

Enable Folder Actions.jpg

Step 6 – Configure Folder Actions

Right-clicking the folder a second time gives you a new option, Configure Folder Actions. Click it.

Configure Folder Actions-1.jpg

Step 7 – Pick Your Folder

In the window that appears, click the plus (+) button under the “Folders with Actions” box.

FA pick folder.jpg

Select your folder, wherever you put it. You will then be asked to pick a script. Pick PDFpen Scriptacular.scpt.

pick script.jpg

It should now look like this.

Script menu.jpg

Close the window and you are done.
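If you want to double-check the setup without reopening that window, a short script along these lines may do it. Treat it as a sketch rather than a tested recipe: it relies on the Folder Actions suite in System Events as I understand it (the folder actions enabled property, folder action objects, and their attached scripts), and the variable names fa, scriptNames, and report are made up for the example.

```applescript
tell application "System Events"
	-- Same switch as Enable Folder Actions in the contextual menu
	set folder actions enabled to true
	
	-- Build one line per folder that has actions attached
	set report to ""
	repeat with fa in folder actions
		set scriptNames to name of scripts of fa
		set AppleScript's text item delimiters to ", "
		set report to report & (name of fa) & ": " & (scriptNames as text) & return
		set AppleScript's text item delimiters to ""
	end repeat
end tell

if report is "" then set report to "No folder actions found."
display dialog report
```

If everything is wired up, the dialog should list your drop folder (mine is “OCR Drop”) next to the script you attached to it.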

Now just drag a few PDFs in and let the script go to work. Once it finishes, copy the OCR’d PDFs to where they belong. There are a few additional points:

1. There is no AppleScript command in PDFpen that reports when an OCR pass is finished, so the script instead waits 15 seconds after starting each page. The PDFpen wizards report they are going to try to fix this in a future release.

2. While this script generally works, it sometimes gave me an error when I overloaded it. Be patient.
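Both points suggest a small variation that some readers may want to try. This is my own untested sketch, not something Smile supplied: it uses only the PDFpen commands already present in the script above (open, ocr, save, close), moves the wait time into a property so it is easy to lengthen for slow pages, and catches errors per file so that one bad document does not abort the rest of the batch. The names secondsPerPage and problems are hypothetical.

```applescript
-- Seconds to wait after starting OCR on each page; 15 matches the original script
property secondsPerPage : 15

on adding folder items to this_folder after receiving added_items
	set problems to {}
	repeat with i from 1 to number of items in added_items
		set this_item to item i of added_items
		try
			tell application "PDFpenPro"
				open this_item
				set theDoc to document 1
				repeat with aPage in pages of theDoc
					ocr aPage
					-- Still a fixed wait, since OCR completion cannot be detected
					delay secondsPerPage
				end repeat
				save theDoc
				close theDoc
			end tell
		on error errText
			-- Remember the failure and carry on with the next file
			set end of problems to (this_item as text) & ": " & errText
		end try
	end repeat
	
	-- Report all failures once, after the whole batch has been attempted
	if problems is not {} then
		set AppleScript's text item delimiters to return
		set problemText to problems as text
		set AppleScript's text item delimiters to ""
		display dialog "Some files could not be processed:" & return & problemText
	end if
end adding folder items to
```

As with the original, change "PDFpenPro" to "PDFpen" if that is the version you own. Raising secondsPerPage is the simplest thing to try if pages come out without OCR text.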

I want to give my personal thanks to the gang at Smile On My Mac, particularly Greg, who put this script together for Mac Power Users just because we asked.

80 Comments on PDFpen OCR Folder Action Script

  1. lamike@mac.com

    Very helpful. There was an article sometime last year in, I think, Macworld entitled something like “Going Paperless” that used script to control scan, OCR the product, and file the OCR on your hard drive. Acrobat was used to provide the OCR.

  2. me@gregmote.com

    WordPress seems to have converted the double dash before the comment in to an em-dash and the quotes to smart quotes. You might want to edit the text to correct this so that it functions properly for those who do not know enough to fix that sort of little bug.

  3. david@macsparky.com

    @Greg –

    Thanks for catching that. I tried to fix it but WordPress keeps reverting it so I placed a note. If anyone knows a better way to post Applescript to WordPress, please drop me a note.

  4. markus.pletscher@aebivoelkerund.com

    Hey Mr. Sparky

    Nice one, now could you do the same with Adobe Acrobat 🙂

  5. markus.pletscher@aebivoelkerund.com

    Me again

    Why I ask for Acrobat is because Acrobat reduces the size after the OCR process which PDFpen doesnt do in the same process.

  6. duncjunk@snowcapped.net

    This script is really buggy. I keep getting these errors:

    1st error: Error: PDFpen got an error: Connection is invalid.

    2nd error: Error: PDFPen got and error: Can’t get document 1. Invalid index.

    (#2 pops up once I hit OK on error #1)

    I thought the Scripts published for Acrobat by Macworld were buggy but this one is just as bad. Ideas? I’ll also post this over at MPUs. Thanks.

  7. john@johnchandler.org

    David,
    Did you ever update this. I just attempted it and it crashed PDFPen Pro. And I assume it just saves the file back as an OCR’d PDF, correct? (I’m attempting with a JPG, so that might be the issue. PDFPen Pro can open the jpg, but might not know what to do when it’s time to save it.)

    What I really want to do is have a script like this open a jpg screencap save the OCR results as an .rtf

    Here’s the ideal situation: I capture a screen on the ipad, save it in a dropbox folder. My Mac at home processes that .jpg via OCR and then saves it as an .rtf, which Hazel then moves into Notational Velocity. In short, I can capture a screen from Kindle or my Bible app (neither of which allow cut and paste) and minutes later have it available in SimpleNote!

  8. mspeyer22@gmail.com

    David,
    I'm getting the same "Connection is Invalid" error as Rob (above). I wonder if you could take a look at the script again. I inquired with Smile about the script but got no traction…

    On May-27-2010, at 6:50 AM, PDFpen Support wrote:

    Hi

    As its not our script, we don't maintain or update it. Your best bet would be to contact the MacSparky folks if there's a specific update to the script you'd like to see.

    Thanks for using PDFPen from SmileOnMyMac!

    Regards,

    Justin
    PDFPen Support
    support@pdfpen.com
    http://www.smileonmymac.com/pdfpen

    Thanks, Marc

  9. davidwsparks@mac.com

    Gang,

    I'm underwater on a big project for a bit longer but once I can breathe again, I'm going to be rewriting and posting this script. Stay tuned.
