As discussed on Mac Power Users episode 3, “Going Paperless,” the nice people at Smile On My Mac put together an AppleScript that, when combined with a folder action, automatically OCRs documents using PDFpen or PDFpenPro. So here is the promised walkthrough:
What you’ll need:
1. Some scanned PDF images;
2. PDFpen or PDFpenPro (See my review here);
3. A bit of patience.
Step 1 – Load up the Script Editor
This little application allows you to create and save AppleScripts.
Step 2 – Copy in the script below
on adding folder items to this_folder after receiving added_items
	try
		repeat with i from 1 to number of items in added_items
			set this_item to item i of added_items
			tell application "PDFpenPro"
				open this_item
				set theDoc to document 1
				repeat with aPage in pages of theDoc
					ocr aPage
					-- Looks like we need to modify PDFpen so that we can detect when OCR is done; for now use 15 seconds
					delay 15
				end repeat
				close theDoc saving yes
			end tell
		end repeat
	on error errText
		display dialog "Error: " & errText
	end try
end adding folder items to
Note – the "tell application" line has to name the version you own. The script above says "PDFpenPro"; if you use PDFpen instead, open the script and edit that line to read tell application "PDFpen".
Note 2 – WordPress tends to convert the double dash before the comment into an em-dash and the straight quotes into smart quotes. Although I fixed it in the WordPress code, it may still revert to “fixing” things when I publish. If the script you copy shows curly quotes or a long dash, it won't compile, so replace them with straight quotes and a double hyphen in your editor first. Sorry. If anyone knows a better way to post AppleScript via WordPress, please drop me a note.
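The cleanup described in Note 2 can also be done with a few lines of AppleScript instead of by hand. What follows is only a sketch: the handler names replaceText and straightenScript are made up for this example, and it assumes AppleScript 2.0 or later (Leopard and up) so that "character id" is available. It looks up the curly quotes and the long dash by character code, so the helper itself survives being pasted through WordPress.

```applescript
-- Hypothetical helper: replace every occurrence of findText in sourceText.
on replaceText(findText, replacementText, sourceText)
	set savedDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to findText
	set textPieces to text items of sourceText
	set AppleScript's text item delimiters to replacementText
	set resultText to textPieces as text
	-- Put the delimiters back so the rest of the script is unaffected.
	set AppleScript's text item delimiters to savedDelimiters
	return resultText
end replaceText

-- Hypothetical helper: undo the typographic substitutions from Note 2.
on straightenScript(scriptText)
	-- 8220 and 8221 are the left and right curly double quotes.
	set scriptText to replaceText(character id 8220, quote, scriptText)
	set scriptText to replaceText(character id 8221, quote, scriptText)
	-- 8212 is the em-dash that replaces the comment marker.
	set scriptText to replaceText(character id 8212, "--", scriptText)
	return scriptText
end straightenScript

-- Clean whatever text is currently on the clipboard.
set the clipboard to straightenScript(the clipboard as text)
```

To use it, copy the script from this page, run the helper in Script Editor, then paste the cleaned text into a new script window and compile.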
Step 3 – Save the script
You need to save it to a specific directory:
HD/Library/Scripts/Folder Action Scripts/
I named mine “PDFpen Scriptacular”
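If you are not sure where that directory lives on your machine, you can ask the system. This is a small sketch, assuming the Standard Additions "path to" command on your version of Mac OS X accepts the Folder Action scripts constant; the variable name scriptsFolder is just for illustration.

```applescript
-- Ask for the machine-wide Folder Action Scripts folder (the one under /Library).
set scriptsFolder to path to Folder Action scripts from local domain

-- Show the location as a slash-delimited path.
display dialog POSIX path of scriptsFolder
```

On a default install the dialog should point at the same Library/Scripts/Folder Action Scripts/ location given above. Leaving off "from local domain" would be expected to return the per-user equivalent inside your home folder instead.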
Step 4 – Create a folder
Save the folder wherever is convenient. Perhaps in your Documents folder or (for you anarchists) on the desktop. By the way, did you know that Command-Shift-N gets you a new folder? I named mine “OCR Drop.”
Step 5 – Enable folder actions
Secondary-click (right-click) the folder and enable folder actions under the “More” item.
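The same switch can be flipped from a script, which is handy if the menu item is hard to find. A minimal sketch, assuming System Events on your system exposes the "folder actions enabled" property:

```applescript
tell application "System Events"
	-- Turn folder actions on for the whole system.
	set folder actions enabled to true
	-- Read the setting back to confirm it took.
	get folder actions enabled
end tell
```

This only turns the feature on. You still need to attach the script to your folder as described in the next steps.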
Step 6 – Configure Folder Action
Right-clicking the folder a second time gives you a new option, Configure Folder Action. Click it.
Step 7 – Pick Your Folder
On the menu that appears, hit the plus (+) sign under the “Folders with Actions” box.
Select your folder, wherever you put it. It will then ask you to pick a script. Pick “PDFpen Scriptacular.scpt”.
It should now look like this.
Close the window and you are done.
Now just drag a few PDFs in and let the script go to work. Copy the OCR’d PDFs where they belong and you are done. There are a few additional points:
1. There is no AppleScript command in PDFpen that reports when an OCR pass has finished, so the script simply waits 15 seconds per page instead. The PDFpen wizards report they are going to try to fix this in a future release.
2. While this script generally works, it sometimes gave me an error when I overloaded it with too many PDFs at once. Be patient and feed it a few files at a time.
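A related guard, which is not part of the original script, is to skip anything dropped into the folder that is not a PDF, so a stray file never reaches PDFpen. This is only a sketch under that assumption: the handler name isPDF is hypothetical, and the test is a simple check of the file name's ending.

```applescript
-- Hypothetical helper: true when the item's path ends in ".pdf".
-- AppleScript text comparisons ignore case by default, so ".PDF" also matches.
on isPDF(this_item)
	return (POSIX path of this_item) ends with ".pdf"
end isPDF

on adding folder items to this_folder after receiving added_items
	repeat with i from 1 to number of items in added_items
		set this_item to item i of added_items
		if isPDF(this_item) then
			-- Hand this_item to PDFpen here, exactly as in the script from Step 2.
		end if
	end repeat
end adding folder items to
```

The check is deliberately kept outside the "tell application" block. If you move the call inside that block, write "my isPDF(this_item)" so AppleScript looks for the handler in the script rather than in PDFpen.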
I want to give my personal thanks to the gang at Smile On My Mac, particularly Greg, who put this script together for Mac Power Users just because we asked.