Departmental Document Imaging: Issues and Concerns
by Daniel V. Arrington
Copyright 1992 CAUSE. From _CAUSE/EFFECT_, Volume 15, Number 1, Spring
1992. Permission to copy or disseminate all or part of this material is
granted provided that the copies are not made or distributed for
commercial advantage, the CAUSE copyright and its date appear, and notice
is given that copying is by permission of CAUSE, the association for
managing and using information resources in higher education. To
disseminate otherwise, or to republish, requires written permission. For
further information, contact CAUSE, 4840 Pearl East Circle, Suite 302E,
Boulder, CO 80301, 303-449-4430, e-mail info@CAUSE.colorado.edu
************************************************************************
Daniel V. Arrington, Management Analysis Coordinator with the Division
of Operations Analysis at the University of Florida, has been a
Certified Office Automation Professional since 1986. He has worked with
computers for more than twenty years and has authored a number of papers
concerning microcomputer-based automation. In addition to internal
management consulting duties, he shares responsibility for providing
personal computer support to all divisions of Administrative Affairs at
the University of Florida.
************************************************************************
ABSTRACT: Document imaging is a process used to transform printed text,
pictures, and figures into computer-accessible forms. Imaging technology
clearly offers dramatic opportunities for enhancing office automation,
but vendors may be too quick to promote imaging as the ultimate solution
for document management problems involving both space and personnel.
This article presents relevant issues, observations, and a few
suggestions that may be useful for anyone thinking about establishing
departmental document imaging and management systems, based on
investigations undertaken at the University of Florida.
Colleges and universities throughout the country are struggling to find
some way to deal with paper documents that must be maintained to ensure
institutional accountability. The cost of storing, filing, and finding
documents continues to escalate even as familiar but time-worn paper
handling methods fail to take advantage of modern technologies. In fact,
estimates suggest that less than 1 percent of the 1.3 trillion documents
stored in U.S. offices today are available in any kind of computerized
format.[1] Although document management has been complicated by bleak
fiscal conditions affecting hiring and spending patterns at many
institutions, issues of accountability and efficiency must be resolved
before the situation gets completely out of hand.
Many vendors are now promoting document imaging as the preferred
solution for paper management problems.[2] Certainly, converging
advances in a number of otherwise diverse automation technologies have
heightened interest in using computers to address issues associated with
processing, storing, and using paper documents. Even though product
demonstrations and reports of successful new system implementations
imply that imaging is exceptionally advantageous, five- to six-figure
price tags often mean commercial systems cannot be purchased regardless
of their potential value--discretionary funds are simply not available.
Two years ago, several University of Florida offices began
assessing the potential benefit of imaging applications. An
investigation conducted by the University's Division of Operations
Analysis included hands-on evaluations of selected imaging technology
products. Observations of vendor presentations prompted an attempt to
duplicate system capabilities offered by proprietary imaging products.
The goal of our investigation was simply to find out if it was possible
to match the functionality of expensive commercial systems by using
readily available and relatively inexpensive off-the-shelf personal
computer (PC) hardware and software. As we reported in a CAUSE90
presentation, lack of an adequate database management system and various
component limitations precluded successful completion of the project.[3]
Nevertheless, observations derived from the experiment have proven
useful in ongoing evaluations of commercial offerings and will
undoubtedly lead to further investigatory work at the University of
Florida.
On the basis of our experiences in that investigation and since,
this article presents relevant issues, concepts, observations, and a few
suggestions that may be useful for anyone thinking about establishing
document imaging and management systems in their organizations. Results
of preliminary investigations into this technology by two University of
Florida offices are shared.
Imaging issues
Document imaging describes a process whereby sheets of paper are
passed through a page scanner to produce graphic images or pictures.
Imaged document files (images) can be managed as regular computer files
and, with the aid of appropriate software, can be retrieved, printed,
and to a limited extent, modified.
Traditional document storage methods are resource intensive and
expensive. Discussions involving storage problems commonly cite
accessibility, cost, space, security, and system integrity as some of
the issues to be resolved by document imaging.[4] Since computer files
occupy far less physical space than paper records, substantial cost-
avoidance savings can be gained by using former storage areas for more
critical purposes such as laboratories, classrooms, or offices. Another
expected benefit is improved efficiency as people locate and retrieve
documents faster and more easily. This issue is especially significant
because personnel costs are usually the most expensive component of any
institution's operation.
Even a cursory examination of these points can lead to favorable
cost/benefit projections, but the most valuable advantages of document
imaging will only be obtained through shared processing techniques made
possible by local area networking. Simultaneous access to the same
document by different workers may literally revolutionize document
processing methodologies. Imaging appears to offer a continuum of
"potential value" benefits with simple document archival at one end
(easy to do, now) and parallel processing of normal business functions
at the other (difficult to do, sometime in the future).
As awareness of document management technology increases, perceived
advantages of paperwork automation become more compelling. Although
benefits such as these are extraordinarily desirable and have been
promised many times in the history of computer automation, efforts to
actually attain them have proven challenging and the benefits elusive.
Experience gained with the introduction of other innovative
technologies suggests implementation issues, both apparent and subtle,
must be anticipated before an imaging application can be seriously
considered. Cost justification and funding are the most obvious problems
of turnkey systems, which sell for hundreds of thousands of dollars.[5]
Other, far more dangerous pitfalls include: resistance to change,
problems typically associated with automation of manual practices and
procedures, dangers inherent in over-dependence on a single vendor or on
a vendor's proprietary system, and problems caused by unrealistic
expectations--such as the idea that document imaging will finally lead
to the paperless office.
Imaging concepts
Two very distinct ideas are involved in document imaging. The most
common one is to make a digital representation of a document. Under this
graphic imagery approach, a scanner is used like a camera to take a
"picture" of the original document, saving text, line-art drawings, and
photographic figures in a single graphic file. Text imagery, on the
other hand, depends on optical character recognition (OCR) to convert
scanned text into standard word processing documents, while
intentionally disregarding drawings and figures.
Graphic imagery
With graphic imagery, everything on the original page--including
handwritten notes, date and time stamps, alterations, figures, drawings,
and typed or printed text--is saved exactly as it exists when the
document is scanned. A popular format for saving bit-mapped graphic
files of scanner images on PCs is called TIFF (Tag Image File Format).
Like a photograph, once it has been "taken," few actions beyond
displaying, rotating, scaling, or printing a TIFF image are possible.
Modifications can be made with certain kinds of graphics programs, but
these are unlikely to be used in document imaging applications.
The most tenacious problem of graphic imagery is related to file
size: graphic images can be very large. Since every part of the page is
saved regardless of the presence or absence of ink, file sizes vary
directly with scanning resolution, the size of the area being digitized,
and the style of graphic file format used to save the image. As an
example, a full-page TIFF image scanned at 300 dpi (dots per inch) can
reach a size of 1 to 4 megabytes (Mb) or larger. The scope of this
observation may become clearer by envisioning a 115Mb hard disk holding
no more than 100 graphic page images--obviously far too expensive for
serious consideration. One of the main reasons for growing interest in
optical mass storage devices is the realization that using a hard disk
for storing images is totally impractical.
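The file-size figures above can be checked with simple arithmetic. The sketch below computes the uncompressed bitmap size for a standard letter-size page at 300 dpi; the 8.5 x 11 inch page dimensions and bit depths are illustrative assumptions, and actual TIFF files will vary with the compression scheme and page content.

```python
# Rough storage arithmetic for scanned page images (illustrative only;
# real TIFF sizes depend on compression and page content).

def raw_image_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed bitmap size, in bytes, for a scanned page."""
    pixels = int(width_in * dpi) * int(height_in * dpi)
    return pixels * bits_per_pixel // 8

# An 8.5 x 11 inch page at 300 dpi:
bw = raw_image_bytes(8.5, 11, 300, 1)    # 1 bit/pixel (black and white)
gray = raw_image_bytes(8.5, 11, 300, 8)  # 8-bit grayscale

print(bw, "bytes black-and-white")    # about 1 megabyte
print(gray, "bytes grayscale")        # about 8 megabytes
```

Even the black-and-white case lands at roughly a megabyte before compression, which is consistent with the 1 to 4Mb range cited above once grayscale scans and uncompressed formats enter the picture.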
Large file sizes also have adverse effects on the amount of time
needed for saving, retrieving, displaying, and printing images. All of
these issues lead to system requirements for powerful microcomputers and
peripherals which, in turn, increase imaging costs. Other problems are
associated with the fact that the physical appearance and composition of
a document have a direct effect on scanning. Because not every image
enhancement process offered by scanning programs is acceptable
for a particular document, relying on operator experimentation and
experience to acquire an acceptable image can increase the time needed
for successfully completing the capture process.
Graphic imagery advantages are compelling. Relatively inexpensive
hardware and software can be used for total preservation of original
document appearance. Although the real value of digitized and
electronically stored images can only be inferred at this time, they may
actually be worth more than the original documents simply because they
can be copied and restored without suffering any degradation in
appearance over time. Graphic files also offer the possibility of post-
capture processing by OCR software, which further heightens the
potential value of scanned images.
Text imagery
Choosing text imagery over graphic imagery is often based on the
premise that printed characters and words are the most important aspect
of any document and, further, that text files are much smaller than
graphic files. The same 115Mb drive (remember, around 100 graphic page
images) could hold more than 23,500 single-page (5Kb) documents. Text
imagery makes it possible to use data captured from printed resources
without time-consuming and error-prone retyping, and common word
processing is the only technical skill needed for editing and using
these documents.
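The capacity comparison in the text works out as follows. The per-image average of 1.15Mb is an assumption chosen to match the "around 100 graphic page images" figure above; the 5Kb text document size comes directly from the text.

```python
# Capacity comparison for the 115Mb drive example in the text.
DRIVE_MB = 115
PAGE_IMAGE_MB = 1.15   # assumed average size of one graphic page image
TEXT_DOC_KB = 5        # single-page OCR text document, per the text

graphic_pages = round(DRIVE_MB / PAGE_IMAGE_MB)   # about 100 images
text_pages = DRIVE_MB * 1024 // TEXT_DOC_KB       # 23,552 documents

print(graphic_pages, "graphic pages vs.", text_pages, "text pages")
```

The same drive holds more than two hundred times as many text documents as graphic images, which is the core argument for text imagery.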
Optical character recognition programs are software tools used to
transform pages of printed text into word processing documents. Growing
numbers of OCR programs are capable of accurately interpreting a printed
page of text with the aid of omnifont technology. The term omnifont
describes a series of techniques enhancing a program's capability to
recognize a wide variety of fonts, type styles, and text sizes. Page
recognition programs isolate particular areas of the page to be
interpreted and text recognition processes convert the scanned image
into ASCII (American Standard Code for Information Interchange)
characters.
Performing accurate OCR takes substantial amounts of time. Beyond
the kinds of problems already described, the chief difficulties of
character recognition have to do with locating and correcting errors. Be
assured there will be errors. The quality of a document's appearance is
critical. An original marred by smudges, fingerprints, dot-matrix print,
or fuzziness can be nearly as disastrous as skewed placement or a dirty
scanner glass. Other errors are caused by colored inks or papers,
outsized or otherwise unrecognized fonts, and by underlined descenders
such as the letters q, y, and p in the words quality and mapped.
Our work with software-based OCR products suggests nearly half of
all conversion errors are unrecognized by the OCR program and therefore
are not indicated by special characters. Such errors require painstaking
examination and editing which must often be accomplished with the
original page immediately at hand. This function can take far longer to
complete than the scanning and recognition processing steps combined.[6]
Products sporting a blend of artificial intelligence and OCR are
beginning to emerge in the form of Intelligent Character Recognition
(ICR) systems.[7] An intriguing concept, ICR analyzes OCR results to
resolve translation errors without operator intervention. For
example, the text string "5elling" might be corrected to become
"selling" by automatically examining an integrated database containing
likely spelling alternatives. More advanced programs may also assess
text phraseology as an additional editing technique. As might be
expected, most current examples of ICR products seem to be directed
towards specific vertical market applications (e.g., resume processing
functions for personnel departments) in which the potential number of
pertinent terms or phrases has been determined to be manageably small.
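The "5elling" correction described above can be sketched as a simple substitution-plus-dictionary check. The confusion pairs and word list below are illustrative assumptions, not any vendor's actual algorithm, but they show the basic ICR idea: try known digit-for-letter misreads and accept a candidate only if it yields a plausible word.

```python
# Toy sketch of the ICR concept: correct likely OCR misreads by
# substituting commonly confused characters and checking the result
# against a dictionary of plausible words. All data here is hypothetical.

CONFUSIONS = {"5": "s", "0": "o", "1": "l", "8": "b"}
DICTIONARY = {"selling", "spelling", "cool", "label"}

def icr_correct(word):
    """Return a dictionary word reachable via confusion substitutions,
    or the original word unchanged if no match is found."""
    if word in DICTIONARY:
        return word
    candidate = "".join(CONFUSIONS.get(ch, ch) for ch in word)
    return candidate if candidate in DICTIONARY else word

print(icr_correct("5elling"))   # selling
print(icr_correct("c00l"))      # cool
```

A production ICR system would of course weigh multiple candidates, use context, and defer to an operator when unsure; the narrow vertical-market focus mentioned above is what keeps the candidate dictionaries "manageably small."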
Since OCR text files are no more than word processing documents,
any PC capable of satisfying the organization's word processing needs
will work for text imagery users. Users of graphic files on the other
hand, like the OCR processing workstation operator(s), must have
relatively powerful microcomputers to handle the system workload imposed
by large files and graphics processing requirements.
Optical disk archiving
Emerging technologies involving PC optical storage devices may
provide a reasonable solution to problems involving storage capacity and
document archiving or retention. Unlike traditional disks that rely on
magnetic components, optical drives record information by writing data
onto the disk with a laser beam. Optical drives offer enormous storage
capability. Commonly, 5.25" format drives can store 500 to 600Mb per
disk although only one side is accessible at a time.[8] Optical disk
technology is relatively immature, with arguments still raging about
purely technical issues, and suffers from a general absence of
acceptable device drivers.[9] Furthermore, although improvements are
forthcoming, optical drives are slow devices with access times
comparable to those of floppy disk drives.
Despite the absence of industry-wide technical standards, Write-
Once, Read-Many (WORM) optical disks are attractive archival devices
because once information has been written to this kind of optical disk
it cannot be easily removed or altered. Although the potential
importance of archiving images cannot be overstated, a brief warning is
in order. While it is true that a WORM-saved file cannot be easily
modified, it is quite simple to alter a graphic file using any number of
paint programs before the file is copied to a WORM disk. Administrative
procedures with traditional checks and balances should be sufficient to
deal with this possibility, but managers need to be aware of prospects
for unauthorized image manipulation.
Because storing document images on optical media can preserve the
unaltered appearance of original documents for years (claims of data
life expectancy on optical disks range from thirty to a hundred
years[10]) and because the cartridges themselves are impervious to many
conditions capable of easily destroying magnetic tapes or disks, optical
drives are considered extremely attractive mass storage devices. Recent
legislative changes (e.g., Section 119.011(1), Florida Statutes)
addressing the acceptability of optical devices for storing public
records provide affirmation that the write-once characteristic of WORM
drives is conducive to archiving records and to creating relatively
fool-proof audit trails. Nevertheless, questions about the legal
validity of WORM-archived originals in all situations remain
unanswered.[11]
Document management systems
If the only purpose of the imaging process were to preserve
documents, then this discussion of capturing graphic and text images
would be complete. However, the unequivocal value of imaging will be
realized only when resources are diverted from filing, finding, and
moving paperwork to activities designed to enhance data extraction and
use of the information contained in stored documents.
Once graphic images and OCR-processed text documents have been
saved as files, a database is needed for selective retrieval of indexed
data and images. Database management systems allow rapid retrieval of
data contained in one or more fields within each record in the database.
Similarly, document management software (DMS) enables users to rapidly
and accurately locate documents (or images) for subsequent retrieval and
use. Instead of merely finding out that a document is being stored in a
particular cabinet or folder, document management software makes it
possible to perform ad hoc searches for files containing specific data
and allows interactive retrieval and subsequent manipulation of relevant
documents.
A text-only DMS product's ease of use is determined in part by the
degree to which it supports native document formats. That is, some
applications require text to be stored in a generalized form such as
ASCII. Unfortunately, although leading word processors can usually
import ASCII text, most rely on proprietary encoding techniques to
support unique document formats. If a text management system does not
work directly with the document format used by the organization's word
processor, users must perform manual conversions which will greatly
diminish the system's effectiveness.
If converting a document into an image or text file is counted as
the first step in the imaging process, indexing those files comprises
the second step. With some applications it is possible to minimize the
impact of this process by means of creative programming (i.e., pointing
and clicking to select a specific image file or automatically updating a
field with the system date), but most information used for record
indexing must be entered manually. If the number of hand-keyed data
fields is limited, database search options are limited proportionately.
Thus, compromise between system flexibility (many record fields) and
entry speed (fewer fields) will be ruled by user-specified system
constraints, and ultimately will be the most important factor in
determining a system's value.
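The indexing-and-retrieval flow described in this section can be sketched with a handful of index records. The field names and file names below are hypothetical; the point is that each hand-keyed field becomes an axis for ad hoc searches, which is why the number of fields governs system flexibility.

```python
# Minimal sketch of document-management indexing: one index record per
# scanned image, queried by field values. All records are hypothetical.

records = [
    {"file": "img0001.tif", "doc_type": "invoice",
     "dept": "Purchasing", "date": "1991-10-02"},
    {"file": "img0002.tif", "doc_type": "contract",
     "dept": "Personnel", "date": "1991-11-15"},
    {"file": "img0003.tif", "doc_type": "invoice",
     "dept": "Personnel", "date": "1992-01-07"},
]

def search(**criteria):
    """Return index records matching every supplied field value."""
    return [r for r in records
            if all(r.get(field) == value
                   for field, value in criteria.items())]

hits = search(doc_type="invoice", dept="Personnel")
print([r["file"] for r in hits])   # ['img0003.tif']
```

Note that a query can only ever use fields someone keyed in at capture time: dropping the "dept" field from the records above would make the departmental search impossible, which is the flexibility-versus-entry-speed trade-off just described.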
Strategic alternatives
The issues described thus far are quite real and document imaging
sounds promising, but it is expensive and will introduce new kinds of
problems to be solved. So what is the most practical response for your
organization?
There are only two all-purpose choices: either store and archive
tangible paper documents and continue to suffer the inefficiency and
expense of existing methodologies, or implement something new.
Advantages and disadvantages of alternative archival media like
microfiche are well known, and while some developments in microfiche may
still be forthcoming, this is a mature technology that has done little
to reduce dependence on paper documents.[12] Microfiche and automated
filing equipment will certainly continue to play a role in most offices,
but the question to be asked is: "Are these enough to cope with
increasing demands of paper processing requirements in the face of
stable or, worse, declining numbers of support personnel?"
Before imaging can be used to solve specific document processing
problems, choices as to extent and approach must be made. Personal
computers, peripherals, and a LAN (local area network) may provide a
reasonable alternative to commercial imaging systems, but imaging and
document management complexities preclude purchasing system components
as though they were delicacies on an "imaging buffet." The full
ramifications of each decision must be completely understood.
System requirements for microcomputers used to capture or access
graphic images are quite similar. A hardware configuration for an image
capture system might consist of a powerful microcomputer, a scanner,
laser printer, some form of high-capacity data storage device such as
WORM or erasable optical drive, and (optionally) a relatively large
display. Additional system specifications might include a LAN for
distributed access to the image database as well as software designed
for image capture, indexing, database maintenance, and ad hoc image
selection. Commercial systems can often be distinguished by the use of
UNIX-based workstations, minicomputers, or mainframes; online access to
devices with massive amounts of storage capacity; and very high-speed
scanners, large-screen high-resolution displays, or, in some cases,
hardware-based optical character recognition systems.
Approach issues
Although some aspects of imaging and document management can be
accomplished with familiar automation components, the potential
contribution of this concept is so significant that implementing an
imaging application will conceivably involve fundamental changes in the
organization itself, as well as in the way new functions are achieved.
Some required decisions may be considered unusual simply because they
constitute a rare opportunity to design a completely new computing
environment.
Software source alternatives
If your institution has enough programming resources to design
original computer applications whenever needed, you may enjoy wonderful
opportunities to solve problems in innovative ways. On the other hand,
if needed development tasks are always consigned to the end of a multi-
year programming to-do list, you may need to find alternative sources
for unique software requirements. While consulting with two University
of Florida offices considering imaging applications, we discussed two
such software alternatives--employing an off-campus software developer
to produce customized applications, or buying off-the-shelf commercial
software.
Regardless of the computer or combination of computers involved,
custom programs constitute the most expensive kind of software. Quality
contract programmers are difficult to find; they command high salaries
and cannot always guarantee timely application development or long-term
program reliability. Successfully exercising this option requires
someone in the office (who may not be otherwise qualified) to assume
responsibility for clearly and accurately defining specific departmental
automation requirements. The chosen individual must possess pertinent
management and program design skills and enough desire and time to see
the project through to completion.
Extended support for system flexibility, interface design, program
documentation, and training may also be expensive. Such support must
enable office personnel to deal with changing needs and technologies on
an ongoing basis after a project's completion. Sponsoring in-house
departmental application development also involves more risks than
office personnel have generally experienced. There have been occasions
when, after buying customized applications from outside experts,
University offices have found themselves paying the developer to create
something as simple as a new report format. If the original author goes
out of business, responsibility for finding and certifying replacement
programming expertise falls completely on departmental personnel.
Though more economical, choosing off-the-shelf applications as an
alternative strategy may force compromises in an office's automation
objectives. Beyond selecting one product over another because of a
particular feature, there is little opportunity to exert any control
over system operations, and administrators may have to adjust internal
policies or procedures accordingly. If a department's requirements are
specific enough, programs written for a general audience may not be
appropriate for use in that organization. Although vertical market
applications developed for a specific operation in a single industry may
reduce or eliminate customized programming needs, such programs are more
expensive than software normally associated with microcomputers.
Computer environment alternatives
The three alternative computer environments considered in our
investigations included a minicomputer-based departmental system,
powerful personal computers in a local area network, or stand-alone
microcomputers.
Despite persistent vendor claims to the contrary, a minicomputer requires
on-site expertise and is expensive to purchase and maintain. Customized
software applications may be far more important for minicomputer users
simply because PC users are able to choose among a far greater number of
commercial microcomputer-based programs.
A LAN can be less expensive than a departmental system, but just as
with a minicomputer, buying, installing, configuring, and maintaining a
local area network requires on-site expertise. If a department cannot
develop or hire an in-house staff expert, a LAN may not be an
appropriate option. If a network of micros is planned, the most
expedient implementation method may involve hiring an experienced LAN
consultant to provide a turnkey configuration.
Though the thousands of PC-based applications being developed every
year constitute a strong reason for seriously considering a LAN, care
must be taken to avoid buying programs that will not work properly in a
LAN environment. LAN-incompatible applications will not work on a
network at all. LAN-tolerant programs are not designed to work on a LAN
but can generally be used as long as informed users avoid compromising
shared data and licensing restrictions are not violated. LAN-specific
programs are expressly designed and licensed for simultaneous access by
multiple users. LAN versions of imaging and document management programs
are readily available.
Suggestions to buy powerful personal computers reflect evolving
automation options and the hardware demands of imaging systems. Modern
applications, such as those involving document imaging and management,
either cannot operate on low-end PCs at all, or if they do work, are too
slow or too limited in function. While graphical programs can work on
some older PCs, many important functions simply cannot be used to their
greatest effect without having an adequate amount of installed memory in
a sufficiently capable computer. The point is this--microcomputers
limited by aging processors, constrained memory, and slow hard drives
can preclude a number of very desirable automation options.
Finally, as an intermediate solution, the stand-alone micro
approach is probably the least expensive way to initially address
document archival imaging. Administrators need to be aware, however, that
because system configurations and procedural operations frequently
change, everything becomes much more complicated as users try to work
with programs which at best are unaware of one another or, at worst, are
incompatible with one another.
Obstacles to successful document imaging
Actions addressing image capture, storage, and indexing, though
crucial, are really no more than a good beginning for the document
imaging process. While many technical aspects of image capture appear to
be straightforward, human factors, quality control, and document
management are major issues which invite further comment.
Human factors
Document imaging can obviously save time and money by reducing
filing errors and making record retrieval faster, but the impact of
shifting human resources is seldom mentioned in vendor presentations.
Beginning with the statement, "This is a PC ...," many workers will have
to be trained from the ground up. Even after trained personnel are
available, employees formerly responsible for finding filed documents
will have to be assigned new tasks as efforts are shifted to capturing
and indexing images. These observations are especially noteworthy
because they provide a hint about the type of managerial skills needed
to guide an organization through the substantial changes that will be
required for the most effective implementation of imaging
technology.[13]
Operators of image capture workstations will need judgmental skills
to ensure image validity. They will have to make decisions about what
portion of a scanned image to save and will have to make sure variables
employed during the scan result in an accurate copy. Naturally, anything
that requires this sort of attention will take more time per execution
than would a mass production approach in which a scanned image is simply
ignored until image documents are edited sometime later in the process.
Quality assurance
Before any automated system can be relied upon, users must
have full confidence in the accuracy of all data contained in the
database. Regardless of the document imaging approach (graphic or text),
inaccuracies caused by scanning problems or document variations make it
seem reasonable to suggest that operators must proof each image and edit
OCR documents before files are finally committed to the document imaging
system.
Data accuracy concerns may be partially alleviated by using
ICR applications to resolve recognition errors while simultaneously
displaying a copy of the scanned image. Being able to see text and the
original image on a screen at the same time should make it possible to
separate physical acquisition (scanning) and data verification
operations. In any event, although procedural steps designed to ensure
database quality could be time-consuming, the alternative--inaccurate
data--is far more expensive and must be avoided at any cost.
Utilization
Differences between microcomputer LAN-based and enterprise-wide
mainframe imaging systems are generally related to scale. The best
example may be telecommunications bandwidth. That is, because image
files contain far more data than traditional text applications, as the
physical distance between system components increases, so does the
importance of telecommunications capabilities (cabling, controllers,
modems, etc.). To suggest a need for fiber optic cabling and channel-
attached, LAN-compatible controllers to support mainframe imaging is no
more remarkable than suggesting that a LAN operating at twice the speed
of another LAN will be more satisfactory. If an existing computing
environment is barely able to satisfy text-only transaction processing
needs, the anticipated expense of an imaging system will also have to
include the costs of upgrading relevant telecommunications resources.
Management
Modifying archiving practices should not pose any significant
managerial problems, but issues such as packet management and parallel
processing cannot be so easily dismissed. Using a LAN to provide full-
time interactive access to database records may make it possible to
support the creation of virtual files or packets whose contents will be
determined by ad hoc or structured database queries. Depending on the
intent of a particular database user, an electronic folder for a student
might consist of a transcript, financial aid forms, and employment
records, while a different system user's view of the same student's
record would be limited to historical tuition and housing payment
documents.
As exciting as archiving and virtual files may be, transformation
of existing (sequential) processing procedures into parallel
methodologies could be the single most important aspect of imaging.
Consider the commonplace processes involved in administering campus
construction, conducting purchasing transactions, initiating personnel
actions, or monitoring campus safety issues. Such functions might be
dramatically improved by allowing simultaneous action by many
individuals instead of the traditional sequential processing scenario of
"you do your thing, then I do my thing, and then we are done." This is
not to suggest that imaging will somehow magically fix a malfunctioning
process. The interrelationships between technology and process
engineering and management are extremely complex. It may very well take
years to integrate parallel processing ideas with everyday work
practices, but just imagining such dramatic possibilities provides an
impetus to plan for process reengineering.
By comparison with these possibilities, devising a strategy for
dealing with existing paper records may be mundane, but is nevertheless
an important issue. There are just as many valid arguments for trying to
scan every existing document as there are for only scanning new
documents. The most practical compromise may be to scan everything new
and catch up with archived documents as circumstances and resources
allow. Care will obviously have to be taken to maintain full compliance
with mandated archiving guidelines, and legal concerns will probably
mean verifying the validity of optical archival on a case-by-case basis.
Although it might be reasonable to start using a stand-alone imaging
system for document archival, knowing that an imaging system's major
advantages will not be attained without widely distributed access means
that all software and hardware purchases should be made with the
intention of eventually incorporating each product into a networked
environment.
Preliminary University of Florida experiences
While the University of Florida relies on a centralized data center
(IBM 3090-600J) for the vast majority of administrative and academic
computing, professional programming staffs are distributed among self-
reliant development operations in different vice presidential areas
(e.g., Administrative Affairs, Academic Affairs, the University
Registrar, Student Financial Affairs, and so forth) which serve related
offices and departments on campus. Academic departments and
administrative offices have access to a variety of consulting and
support services, including these area development operations, but any
office implementing a local automation project may decide what software
is to be used, and, within certain financial and administrative
guidelines, may choose an appropriate mini- or microcomputer platform.
Several University of Florida offices have begun actively exploring
document imaging systems, among them the General Counsel's office and
University Personnel Services. Although neither organization has yet
purchased a system, the rationale behind their decisions may be
informative. The following discussions were derived from
internal consulting reports prepared for each office.
General Counsel's office
This department employs seven attorneys and a three-person support
staff reporting to the University Attorney. Case documentation commonly
involves many pages of text acquired over long periods of time, and
attorneys often need to gather specific documents on short notice. Any
system capable of reducing the effort needed to access enormous volumes
of textual resources offers an opportunity to extend the cumulative
effectiveness of an otherwise limited number of staff professionals.
Cost savings and efficiency gains would be clearly defined objectives
for any document management application in this office.
Although a final decision has not yet been made, an exploratory
investigation addressing general document management and imaging issues
resulted in a recommendation to delay any major action, for at least
three reasons.
* Computing platform obsolescence. This office operates a
minicomputer installed only a few years ago. Unfortunately,
the introduction of a new series of minicomputers which cannot use
software designed for older machines (without significant modification)
has discouraged developers from writing new commercial software for the
department's computer. Certainly nothing as advanced as imaging is being
developed, and yet the outlook for the University's funding over the
next few years is so poor that no one can afford to arbitrarily (or
justifiably, for that matter) replace a working office automation
system.
* Customized software. Research into vertical market applications
specifically designed for law offices revealed that most of these
products place heavy emphasis on time accounting and billing. These
factors are nearly irrelevant in a university setting, and because the
primary automation objective in this office focuses on enhancing the way
work is accomplished, customized or adaptable commercial software would
very likely be required.
* Limited automation experience. Like most University departments,
the General Counsel's office is not populated with computer experts.
Adding a computer support person for this small office is a virtual
impossibility, and time constraints associated with just completing
assigned tasks reduce any possibility of developing an in-house expert.
Automation support is provided on an as-time-is-available basis by
another campus office which was involved in setting up the system. The
absence of a clearly identified source of system and user support
compelled a recommendation for the General Counsel staff to begin
acquiring in-house expertise by gradually exploring stand-alone PC
applications. As an example, an inexpensive, flat-file database
management system could be used to replace a manual Rolodex file
containing location references for documents stored in traditional file
cabinets.
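As a minimal sketch of such an index (the field names and sample
entries are hypothetical, and a commercial flat-file database product
would replace the in-memory CSV text used here):

```python
# Minimal flat-file index replacing a manual Rolodex of document
# locations. Fields and sample entries are hypothetical illustrations.
import csv
import io

# In practice this table would live in a small flat-file database.
raw = """document,cabinet,drawer,folder
Smith v. University 1989 deposition,3,B,12
Stadium lease agreement,1,A,4
Smith v. University 1990 motion,3,B,13
"""

index = list(csv.DictReader(io.StringIO(raw)))

def locate(keyword):
    """Return the physical location of every document matching keyword."""
    return [(r["document"],
             f'cabinet {r["cabinet"]}, drawer {r["drawer"]}, '
             f'folder {r["folder"]}')
            for r in index if keyword.lower() in r["document"].lower()]

for title, where in locate("Smith"):
    print(f"{title}: {where}")
```

Even this modest step moves the office from memory and paper toward the
searchable indexes that full document management would require.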
These factors virtually guarantee that an imaging project in the General
Counsel's office would be far more expensive than more generalized
installations.
University Personnel Services
The University of Florida has nearly 11,500 employees in four
categories: Faculty, Administrative & Professional (A&P), University
Support Personnel System (career service), and Other Personnel Support
(temporary). Along with conventional human resource duties, the
University's Personnel Services division recently assumed responsibility
for initiating and managing a vitae bank of non-specific minority
resumes. No new staff will be available to administer the new
function, even though interviewers must begin examining all filed
resumes for potential applicability to every job opening.
To be successful, any human resource organization must constantly
watch for opportunities to manage daunting amounts of paperwork. The
division's interest in document management and imaging is founded on a
desire to improve overall managerial efficiency despite increasing
demands on limited resources. An on-site demonstration of a proprietary
document imaging system tailored for resume processing provided an
opportunity to make the following observations about a specific product
and application with regard to Personnel Services' working environment.
* Advantages: An automated database would be invaluable in reducing
requirements for physical storage space by avoiding duplicate resumes,
and through use of file compression techniques to reduce the size of
images, would conserve computer disk space as well. Use of an imaging
and document management system would clearly achieve time savings by
reducing data entry efforts and minimizing editing errors. Besides
expected gains in response efficiencies, delegating specific parts of
the acquisition process to trained support personnel could relieve
interviewers of routine administrative duties without threatening the
security or sensitivity of existing hiring processes.
* Disadvantages: Trying to boost system performance by restricting
image and text storage to a hard disk, as this vendor does, may
actually create more problems than it solves. The absence of support for
optical storage technology is also a matter of concern because it may
indicate creeping system obsolescence. Without question, this system's
potential usefulness is compromised by online disk storage limits that
will force the department to continue archiving physical records.
Although the implementation of intelligent character recognition (ICR)
in the reviewed product is an exciting development, its functionality
depends on an unchangeable
database definition. Because these values are embedded in the program,
even if the need were to arise, local programmers could not alter the
database design in response to changes in the organization's needs.
Gaining and maintaining on-site expertise would be complicated by
employee turn-over and by the fact that this system is based on a
derivative of the otherwise unfamiliar UNIX operating system.
Finally, with base prices beginning at $50,000 and a software
licensing cost of $2,000 to $4,000 per PC, this system is probably too
expensive even without considering the additional costs of eventual
integration with other University automated systems (requiring
programming services), a LAN, additional imaging hardware, and user
support expenditures.
* Questionable aspects: Required maintenance fees, batch processing
techniques, ASCII text, the possible need for interpretive analysis and
data entry duties, new system administration and user training
requirements, and a decidedly unusual licensing stratagem based on the
maximum number of online resumes, are all issues involving advantages as
well as disadvantages.
Though the product clearly holds enormous potential to address some of
Personnel Services' goals, at this time the division simply cannot
afford an investment of this magnitude.
Final thoughts
Strategies for home-grown microcomputer imaging suffer from at
least three critical flaws: (1) because low-end scanning programs offer
little support for batch processing, little or no capability exists for
volume processing of graphic images; (2) inexpensive optical disk
drives are too slow and too small to be of any significant value in a
high-volume setting; and (3) because generalized databases do not offer
any particular ability to manage the massive numbers of scanned images
and OCR documents expected in an imaging environment, multi-user
document management software is an absolute requirement.
Document and image management software and LAN integration are
clearly the key concepts for successful PC-based imaging applications.
Certain deficiencies of non-specific, off-the-shelf imaging components
can be offset by using better (read: more expensive) products and still
others may soon be resolved by technological innovations. Anyone
considering imaging as a strategic application would be well advised to
watch for improvements in mass storage technology and document
management programs.
In the meantime, low-end products are perfectly adequate for
project-level applications. Users can continue to rely on functions
provided by available hardware and commercial software to develop
innovative solutions as needs arise. Document management is
unquestionably important enough to warrant a substantial investment as
long as proprietary or single-source systems can be avoided.
It is also important to recognize that using automation technology
to improve organizational performance involves a lot more than simply
buying a particular computer system. With fiscal constraints, change
management is most effectively promoted by restricting attention to
those innovations most likely to fulfill clearly defined expectations.
Ultimately, however, as is true in any such endeavor, lasting success
can only be achieved by making an appropriate investment in the people
who will be working with, and dependent upon, the system changes.
So what is all of this supposed to mean? Just this--automation plans
must accurately reflect an organization's willingness and ability to
develop in-house expertise through training or hiring. The level and
quality of funded training and support must be structured in such a way
as to allow employees to concentrate on their jobs while gradually
becoming more proficient system users.
Irrespective of anyone's claims to the contrary, computerized
systems are complex tools. Competent users and a supportive
administration working in concert with automated processes can achieve
extraordinary results. Imaging will eventually be adopted as an everyday
office automation tool simply because the potential benefits are
attractive to a number of incredibly diverse operational areas. However,
it is important to realize that the actual costs of imaging will be
determined more by support expenses than by a system's purchase price.
As always, the ultimate secret for unqualified success will be to
balance organizational expectations and needs against eternally
overburdened resources.