How to Scan Book Pages into Word Text

Post by WeR1Family » Fri May 08, 2009 2:28 am

OVERVIEW
Still typing out paragraphs from book pages word by word?
There is no need to anymore. Scanning book pages into Word text is possible with the new Officirio scanner that APU provides in the Media Center. In short, if you choose a PDF with searchable text as the output format, the PDF file can easily be turned into editable text. This technology is called Optical Character Recognition (OCR), and many modern scanners offer it.

The accuracy of the text recognition varies with the size of the text on the scanned pages. The larger the font on the page, the more accurate the recognition. Even so, it is still better than typing every word one by one whenever you want a digital copy.
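If you want a rough number for how accurate a scan came out, one option is to hand-type a short passage and compare it with what the scanner recognized. The sketch below is not part of the scanner software; it uses Python's standard difflib module, and the function name ocr_accuracy and the sample strings are made up for illustration.

```python
from difflib import SequenceMatcher


def ocr_accuracy(reference: str, ocr_text: str) -> float:
    """Return a rough 0..1 similarity between hand-typed text and OCR output."""
    # Collapse whitespace so line-wrap differences are not counted as errors.
    ref = " ".join(reference.split())
    ocr = " ".join(ocr_text.split())
    return SequenceMatcher(None, ref, ocr).ratio()


# Hypothetical sample: the OCR misread "w" as "vv" and "l" as "1".
reference = "The quick brown fox jumps over the lazy dog."
ocr_output = "The quick brovvn fox jumps over the 1azy dog."
print(f"similarity: {ocr_accuracy(reference, ocr_output):.2%}")
```

A score close to 1.0 means the recognized text needs little fixing. The ratio is a character-level similarity, not a formal OCR error rate, so treat it only as a quick comparison between scans.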

This is especially useful if you want to copy a book, magazine, newspaper, or document into digital text instead of keeping it as a raw, uneditable PDF, which is sometimes hard to read.

HOW TO
For anyone who did not follow the explanation above, here is a step-by-step guide:

1. Click the big 'Scan' button at the bottom right of the dialog box, then in the 'File Save Settings' dialog box, set:
Image

2.1 A PDF file will be produced from the scanned pages. Open it in the Adobe Reader program provided, and from there save the PDF file as a .txt file:
Image

2.2 Alternatively, select only the sentences and lines you want as text: highlight them (drag your mouse over the text/paragraphs) and copy:
Image
Then paste into Word:
Image
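Text copied out of a PDF often keeps the line breaks of the printed page, so it may need tidying before it reads well in Word. The following is a minimal sketch of one way to do that, assuming blank lines mark paragraph breaks; the function name tidy_pasted_text is hypothetical and this step is not part of the scanner or Adobe Reader.

```python
import re


def tidy_pasted_text(raw: str) -> str:
    """Rejoin hard-wrapped lines copied from a PDF into flowing paragraphs."""
    text = raw.replace("\r\n", "\n")
    # Join words hyphenated across a line break: "recog-\nnition" -> "recognition".
    # Caveat: a genuine compound split at the hyphen would lose its hyphen too.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Blank lines separate paragraphs; single newlines inside one become spaces.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = [" ".join(p.split()) for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)


sample = "Optical character recog-\nnition turns scanned\npages into text.\n\nSecond para-\ngraph here."
print(tidy_pasted_text(sample))
```

It is still worth proofreading the result afterwards, since this only repairs line wrapping and does not correct misrecognized characters.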

OPTIMIZATION AND TROUBLESHOOTING
Accuracy Enhancement
To make the text recognition more accurate, you can improve the contrast of the scanned image. Just tick the 'Text Enhancement' checkbox at the bottom of the main dialog. It helps by making the black of the text thicker than in the original output.
Image
Enabling the OCR feature
To enable the OCR function on the Media Center's scanner, the 'Create Searchable PDF' option must be checked. Without it, the steps above will just create an ordinary PDF whose text cannot be selected or copied.
Image
Less than 300dpi No Good
Make sure the resolution is set to 300 dpi. That is the default, so if you have not changed it, it should already be 300 dpi. A lower dpi will reduce the accuracy of the OCR; 300 dpi is the commonly recommended minimum for OCR.
Image
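To see why both font size and dpi matter, it helps to work out how many pixels tall the letters end up in the scan. The sketch below assumes only the standard definition of 72 typographic points per inch; the function name glyph_height_px and the sample sizes are illustrative, and the result is the nominal font height rather than the exact height of any single letter.

```python
POINTS_PER_INCH = 72  # typographic points per inch


def glyph_height_px(font_size_pt: float, dpi: int) -> float:
    """Approximate pixel height of a font's nominal size at a scan resolution."""
    return font_size_pt * dpi / POINTS_PER_INCH


for dpi in (150, 300):
    for size in (8, 12):
        print(f"{size} pt at {dpi} dpi: about {glyph_height_px(size, dpi):.0f} px")
```

Halving the dpi halves the number of pixels available for each letter, which is the same effect as scanning text printed at half the size.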
Still not working?
Well, perhaps the scanner has been replaced with a newer model again, in which case this post is outdated. If so, perhaps you can write a new manual for the new scanner model and ask the administrator to remove this outdated post (if it is not outdated, encourage the forum administrator to keep it for the sake of others). By the way, this was posted on 8 May 2009 at 2 AM, and I still have a bunch of homework to get done T_T