My paperless office

The 43Folders piece "Workflow for the Fujitsu ScanSnap" reminds me that I haven't written about how I've been using my ScanSnap S500M. I got it about 5 months ago, and it's easily the most useful electronic gizmo purchase of the year. (Since I bought it, the newer S510M has been released.)

I was hesitant to get a ScanSnap because of its price, but when I discovered that it was available through my credit card company, via Amazon, in exchange for "reward points," I dove right in. I already had a flatbed scanner, but the ScanSnap's document feeder and single-pass duplex scanning make it a lot faster and less of a hassle to use than a flatbed. (It's not, however, a high-resolution photo scanner, so if you're into that, you'll need to keep your flatbed.)

My second concern was the size of the scanner. I really don't have room in my office for another piece of equipment, so I was pleasantly surprised to find that the ScanSnap is a lot more compact than it appears in photographs. Its footprint is smaller than a standard US letter-size sheet of paper, and its height, when closed, isn't much more than that of a CD jewel case. You need a bit more room when it is open, but it is very portable, and I keep mine tucked away behind my Cinema Display when I'm not using it.

I've been using DevonThink Pro Office to catalog and manage the PDFs that the ScanSnap creates. So far, I've got about 1,000 documents spread between three DevonThink databases. (I have no idea how many pages that is in total, maybe around 5,000; see below for details.) Here are some notes about my workflow:

I wait until I have 10 or more things to scan, instead of scanning documents "on demand." This is because I don't keep the scanner hooked up all the time, due to space limitations and a paucity of unused USB ports on my iMac.

I've eliminated about 3 boxes of stored paper so far, and thrown away countless magazines I was saving for just a few articles. Now, when I see something I want to save, I tear the pages out and put them in my "to be scanned" pile. Some might consider this a "marriage saver."

The three DevonThink databases I currently use are: one for conjuring literature, one for household/legal items (bank statements, credit card bills, pay stubs), and a general "this is cool" catch-all. My intention is to eventually use DevonThink's AI to categorize documents and my understanding is that specialized domains are better kept separate for this purpose. I haven't yet tried to train the AI, though.

I normally use Skim to view PDFs, but while scanning I prefer PDFPen. It's the perfect tool for this task because it lets me rearrange and delete pages within the finished PDF. ScanSnap does a good job of automatically removing blank pages, but when scanning magazine articles I sometimes need to eliminate the back side of the last scanned page. I only wish that DevonThink Pro allowed you to specify a preferred PDF application instead of using the system's setting.

Speaking of DevonThink Pro wishes, here are some additional items that would improve my satisfaction:

DevonThink shows the file size of a PDF, but you have to open it to see the number of pages it contains. To me, pages are the most important count, not bytes.

The integration between ScanSnap and DevonThink basically boils down to ScanSnap sending an open-event to DT after the PDF is initially created. It would be much better if the two could actually "talk" to each other. For example, in order to make a single-sided scan you have to use ScanSnap's contextual Dock menu. A set of controls for this within DevonThink would eliminate this awkwardness.

While the OCR process that DevonThink uses is essential for finding things later, it's unfortunate that you can't easily postpone it until after you've completed several scans. Your choice is either to wait while each document is recognized immediately after scanning, or to turn off the OCR and then tediously process each document later. If you take the second route, you end up with two copies of the document in your database, one that has been OCR'd and one that has not. This is probably my biggest "gripe" with DevonThink so far.

A similar, but more minor, nit: when DevonThink opens the scanned file it can prompt you for meta info to add to the PDF (title, author, and so on). Unfortunately, the document info dialog is modal and the author's name defaults to your login name, which of course is rarely the right answer.

The folks at Devon Technologies are too generous with their trial period for DevonThink Pro Office. You can use it a very long time before it starts urging you to buy it. In fact, when it goes into "sales mode" it simply stops running the OCR process. Which, as I've discussed above, could be viewed as a timesaver.
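On the page-count wish above: until a pages column exists, a rough total can be estimated outside DevonThink. The following is only a minimal sketch, assuming the PDFs have been exported to an ordinary folder (the folder path in the usage comment is a placeholder) and that their page objects are stored uncompressed. It counts `/Type /Page` markers in the raw file, so PDFs that use compressed object streams will be undercounted; a real PDF library would be more reliable. The function names are made up for this example.

```python
import re
from pathlib import Path

# Matches a page object ("/Type /Page") but not the page-tree
# node ("/Type /Pages"), thanks to the negative lookahead.
PAGE_OBJECT = re.compile(rb"/Type\s*/Page(?![A-Za-z])")


def count_pages(pdf_bytes: bytes) -> int:
    """Rough page count: page objects visible in the raw PDF bytes."""
    return len(PAGE_OBJECT.findall(pdf_bytes))


def total_pages(folder: str) -> int:
    """Sum the rough page counts of every .pdf file under a folder."""
    return sum(count_pages(p.read_bytes()) for p in Path(folder).rglob("*.pdf"))


# Hypothetical usage; replace the path with wherever the PDFs live:
# print(total_pages("/path/to/exported/pdfs"))
```

Treat the result as a ballpark figure, good enough to tell 1,000 pages from 5,000, rather than an exact count.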

So, all in all, I'm satisfied, but there are plenty of opportunities for improving the workflow. I think the DevonThink Pro Office, PDFPen, and ScanSnap combination is a real winner. There's no doubt that this is the first time that I've felt good about converting to a strictly digital storage method for paper files.

If I discover any more tips, I'll add them later. For now, just a quick note that ScanSnap and DevonThink Pro Office are working just fine for me under Mac OS X 10.5 Leopard. If going paperless appeals to you, now might be the time to dive in.

(This piece is cross-posted from my MacDevCenter blog. If you'd like to add or read comments, please go to the version there.)