library Tesseract - OCR test

C/H 2019. 1. 25. 18:03

Getting OCR results after image preprocessing

from PIL import Image
import subprocess

def cleanFile(filePath, newFilePath):
	image = Image.open(filePath)

	# Apply a gray threshold (values below 143 become 0, the rest 255), then save
	image = image.point(lambda x: 0 if x < 143 else 255)
	image.save(newFilePath)

	# Run Tesseract on the cleaned image; the text is written to output.txt
	subprocess.call(["tesseract", newFilePath, "output"])

	# Print the recognized text
	with open("output.txt", "r") as out:
		print(out.read())

cleanFile("test.tiff", "text_clean.tiff")
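One thing to watch with the threshold step: `Image.point` applies the function to every band, so on an RGB image each channel is cut off separately and the result may not be strictly black and white. Below is a minimal sketch that converts to grayscale first. The names `binarize` and `ocr_file` are hypothetical helpers, not from the book, and passing `stdout` as the output base assumes a Tesseract version that supports it.

```python
import subprocess
from PIL import Image

THRESHOLD = 143  # same cutoff as the listing above


def binarize(image, threshold=THRESHOLD):
    """Return a grayscale copy where pixels below threshold are 0 and the rest 255."""
    gray = image.convert("L")  # collapse to a single 8-bit band first
    return gray.point(lambda x: 0 if x < threshold else 255)


def ocr_file(file_path, clean_path):
    """Hypothetical helper: save a cleaned copy, then return Tesseract's text."""
    with Image.open(file_path) as image:
        binarize(image).save(clean_path)
    # Assumption: "stdout" as the output base makes tesseract print the text
    # instead of writing output.txt.
    result = subprocess.run(
        ["tesseract", clean_path, "stdout"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `print(ocr_file("test.tiff", "text_clean.tiff"))` would then mirror the listing above without the intermediate output.txt file, provided the `tesseract` binary is on the PATH.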

Reference: 파이썬으로 웹 크롤러 만들기 (Web Scraping with Python), Hanbit Media
Section 11.2, Processing well-formatted text, p. 207
