Xerox Scanning/OCR Package Rocks
By Jim Bray
Scanning software
suites have reached new heights, as evidenced by the latest entry from
"the Document Company."
ScanSoft, a Xerox
offshoot, has upgraded its Pagis Pro suite to Version 3, an all-in-one
scanning solution if ever there was one.
Pagis Pro 3 gives
you a special "desktop" on which to organize and store your
scanned files - and redirect them to other applications - but that's only
the beginning. The suite also comes with Adobe PhotoDeluxe Business Edition,
which lets you mess around with images after you've scanned them, and
TextBridge Pro 9, a wonderfully powerful OCR (optical character recognition)
program that turns scanned pages of print back into real text you can
edit in your favorite word processor and/or save in a variety of file
formats.
Once you install the
software you're left with a "Pagis Inbox" icon on your desktop,
from which you can access all the features of Pagis Pro. You can use it
as your "scanning and filing headquarters," where your scanned
files will reside (you can also use it as a Windows Explorer environment)
and from where you can drag and drop them into whatever applications you
choose. You can also scan, file, search, and organize stuff by category,
which gives you a nice way to keep track of receipts, family pics, bills,
and the like, at least until your next hard drive crash.
One thing I liked
about using the Pagis scanning facility (and there's a long list of scanners
it supports) is the way its interface blows the generic Windows scanner
drivers out of the water.
You see, my scanner
is a couple of years old now and, though it works fine, when you install
Windows 98 it senses the scanner and installs drivers and software for
it. Unfortunately, this software only allows you very basic choices (b&w
photo, color photo, etc.) and doesn't let you do other things like change
resolutions. Or if it does, it's so user-unfriendly that I haven't yet
found out how to do it.
Pagis, however, is
easy, and you can use it as your scanner interface - so it's a terrific way
to get more functionality out of a scanner (well, mine at least!) without
going to the original manufacturer and begging for new drivers - some
of which are only available if you ante up some cash.
Pagis Pro 3 includes
new search features to help you find stuff on your hard drive, and it
lets you capture (and keep a thumbnail view of) your favorite web pages.
It stores the thumbnail in the Pagis Inbox and from there you can just
click on it and surf to it, or open it to edit or OCR it. It's also a
nifty way of giving yourself "visual bookmarks" so that, instead
of remembering web pages by their titles (as stored in your browser's
"bookmarks" or "favorites") you can actually see them
- if you squint.
This is neat, though
I found upon my initial attempt at surfing by TechnoFILE's home page that
it had trouble in the OCR process because it couldn't differentiate between
the navigation and the main windows - and on the whole it did a pretty
lousy job. I don't find this a big deal, however, because it's easier
to simply copy and paste text (or just save the file onto your hard drive)
if you're going to steal from a Web page anyway!
Pagis also lets you
open files from a variety of formats, without loading the original application
- and you can use Pagis' e-mail viewer to send pictures through cyberspace.
The Pagis applications
work well, and the organization and search functions are powerful and
fairly straightforward. You can scan to over 40 image formats - though
Pagis' XIFF is the default (and that's okay) - and Xerox says Pagis Pro
automatically converts your "dragged and dropped" file into
the format of the application into which you're dropping it.
Besides
pictures, however, the other real joy of having a scanner is to get printed
pages back into your computer so you can turn them into electronic versions
of themselves. This is where TextBridge Pro 9 comes in, and it does
a very good job of taking scanned pages and converting them back into
digital pages.
I used the product
to scan a bunch of brochures that included pictures, captions, tables,
and various types of columns. It took the stuff and dumped it into Microsoft
Word for me without even so much as a "by your leave." The formatting
is sloppy, but it works - and saves a lot of retyping.
TextBridge Pro will
also output into HTML format for use on a web site, and when I tried this
I found that the finished document was, for the most part, a pretty good
representation of the original - until I looked at the HTML code. It was
a real dog's breakfast, so I went in by hand and cleaned it up.
Still, it was a lot
less onerous a task than typing the whole shebang in and then laying it
out as a Web page. Besides, most surfers won't be passing judgment on
your HTML source code, so you can probably safely ignore the "dog's
breakfast" and just publish the thing.
On the whole, TextBridge
Pro does an excellent job. It isn't perfect by any means, but it's the
closest I've seen to perfection so far. It takes the mind-numbing task
of retyping a page and turns it into a relatively minor edit and spell-check.
And as with spelling checkers, you can train TextBridge Pro so it recognizes
specialized words before it sends the page to your word processor.
Pagis
itself will also straighten and otherwise crop or orient your page, and
once you've pre-scanned it you can manually set the page area to be scanned
in the final scan. This can be handy if, for instance, you're scanning
images for a Web site; you can send them directly to the Pagis editor,
mess around with them there, and then save them to their destination folder
without having to load your normal image editing software. This, by the
way, is how I got the ScanSoft logo onto this page.
Or, if you want to
do some heavy duty manipulation, you can send it to your favored image
editor right from the scanning dialog.
And if your hand-cropped
page turns out not to have scanned well, or you did something wrong in
the setup, you can delete the page from the scan tool without having to
exit and then go back in again. It's a very flexible and efficient operation,
and it works well.
This latter feature
came in handy when I tried scanning in the above ScanSoft logo from a
press release. I found that when I cropped too closely around it the image
wouldn't scan at all - leaving me with a blank page - so I was grateful
to be able to delete the attempt and start over again right away.
I've never really
used a scanning suite a lot, though with the terrific manner in which
Pagis Pro runs my scanner (as mentioned, it now works better than with
the Windows-installed drivers),
I'll be using it more often. There's a lot of flexibility here, and the
more I use it the more things I find to do with it.
Xerox says you can
even scan in forms and then just "tab and type" through them, which,
though I didn't try it, sounds pretty cool.
In all, Pagis Pro
3 is a powerful and flexible tool for the person who does a variety of
tasks with a scanner. But it's more than that: it not only helps you keep
track of your files, it allows you to easily interact with your other
applications, and in some cases bypass them completely.
In short, a nice labor-saving
device. And that's what computers are supposed to be all about,
isn't it?