Why Only a Human Mind Will Do
AI has come a long way, but the human touch is irreplaceable in digitizing old texts.
Even with trained proofreaders, errors sneak through. When I was working on The Vision of Hamilton, a book that includes the texts of four of Alexander Hamilton's famous reports, I found that the Lillian Goldman Law Library of Yale Law School had posted the following sentence in its online flowing-text edition of Hamilton's 1791 "Opinion as to the Constitutionality of the Bank of the United States":
"It is that which declares that the Constitution, and the laws of the United States made in pursuance of it, and all treaties made, or which shall be made, under their authority, shall be the serene law of the land." Hamilton, of course, did not write "serene" but "supreme." Because "serene" is a real word, a spellchecker would not flag it as a misspelling, yet it is not the correct word in this sentence.
Artificial Intelligence (AI) has come a long way. In some situations it is now better than humans at pattern recognition. However, pattern recognition is not the same as cognition. Ink smudges and, in some cases, nearly blank spots force proofreaders to solve puzzles: what must the missing letters or words be? Some old lead type printed "f" and "s" impressions so similar that every appearance of an "f" or an "s" becomes a puzzle.
Page layout causes its own trouble. If a page is somewhat skewed, the Optical Character Recognition (OCR) process will sometimes skip across lines, and OCR software sometimes fails to detect magazine or newspaper columns properly. Either way, the result is a useless letter soup.
Some errors are predictable. On every OCR output I search for "modem" and replace it with "modern." No old books were written about modems. Nonetheless, the word "modem" appears very frequently in OCR output, because the letter pair "rn" is easily misread as "m."
All of that concerns only the recognition of words. To create a flowing text file from OCR output, you must also recreate paragraphing, footnotes, bold and italic type, headers, and so on. Digitizing paper is a complex human process driven by the love of important written insight. See this post.
Perhaps there will come a time when AI takes over more and more of this process. However, it is extremely doubtful that AI could ever match the loving attention of a good proofreader. That is why you must always turn off any "autocorrect" features of your word processing system. Let automatic spelling and grammar review processes passively point to potential problems, but remember that only the writer or the proofreader can actually understand the situation and respond appropriately, or decide not to respond at all. And remember that just because a word is not misspelled does not mean that it is the correct word in the correct spot!
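For readers who handle OCR output as plain text, here is a minimal Python sketch of the "modem" search mentioned earlier, done in the passive spirit recommended above: it only reports where a suspect word appears and changes nothing. The function name flag_suspects and the SUSPECTS watch list are my own illustrative assumptions, not part of any OCR tool, and the list would need to be extended with whatever confusions turn up in your own material.

```python
import re

# Hypothetical watch list: correctly spelled words that OCR may produce
# in place of the word actually printed. Extend it from your own experience.
SUSPECTS = {
    "modem": "modern",  # the letter pair "rn" misread as "m"
}


def flag_suspects(text, suspects=SUSPECTS):
    """Return (line_number, found_word, suggestion, line) for each hit.

    The text itself is never modified; a human reviews every hit.
    """
    # Match whole words only, in any capitalization.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in suspects) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            found = match.group(0)
            hits.append((number, found, suspects[found.lower()], line.strip()))
    return hits


if __name__ == "__main__":
    sample = (
        "The modem reader may smile at this.\n"
        "Modern commerce depends on credit.\n"
    )
    for number, found, suggestion, line in flag_suspects(sample):
        print(f"line {number}: '{found}' (perhaps '{suggestion}'?) -> {line}")
```

A watch list like this cannot catch a one-off error such as "serene" for "supreme" unless someone has already noticed it and added it, which is exactly why the final reading still belongs to a person.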