I have a web page with a simple image that contains text.
I would like to extract the text from this image with Tesseract.js. It works fine except on the first load, when the following message is displayed and nothing more happens:
initializing api (100%)
After reloading the page, it works fine. I don't know why it only works after a reload. If I clear the cache, the issue reappears. I am using Firefox.
My HTML/JavaScript file:
<html>
  <head>
    <title>QRScanner Library Test</title>
    <script src="tesseract.js"></script>
  </head>
  <body>
    <input type="button" id="go_button" value="Run" />
    <div id="ocr_results"> </div>
    <div id="ocr_status"> </div>
    <img id="img" src="ocr.gif"/>
    <script>
      document.getElementById("go_button")
        .addEventListener("click", function(e) {
          var url = document.getElementById("img").src;
          runOCR(url);
        });

      function runOCR(url) {
        Tesseract.recognize(url)
          .then(function(result) {
            document.getElementById("ocr_results")
              .innerText = result.text;
          }).progress(function(result) {
            document.getElementById("ocr_status")
              .innerText = result["status"] + " (" +
                (result["progress"] * 100) + "%)";
          });
      }
    </script>
  </body>
</html>
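To check whether the first run fails silently rather than hanging, the status line could also report errors. This is only a minimal sketch: it assumes the Tesseract.js 1.x job object exposes `.catch()` alongside `.then()` and `.progress()`, and the names `formatStatus` and `runOCRWithLogging` are hypothetical helpers, not part of the library.

```javascript
// Hypothetical helper: turn a progress message into a readable status string.
// Assumes messages look like { status: "initializing api", progress: 1 }.
function formatStatus(message) {
  var progress = typeof message.progress === "number" ? message.progress : 0;
  return message.status + " (" + Math.round(progress * 100) + "%)";
}

// Hypothetical variant of runOCR that also reports failures.
// Assumption: the job returned by recognize() supports .progress, .then
// and .catch, as in Tesseract.js 1.x.
function runOCRWithLogging(tesseract, url, onStatus, onText) {
  return tesseract.recognize(url)
    .progress(function (message) { onStatus(formatStatus(message)); })
    .then(function (result) { onText(result.text); })
    .catch(function (err) { onStatus("error: " + err); });
}
```

In the page above it would be called as `runOCRWithLogging(Tesseract, url, setStatus, setText)`, where `setStatus` and `setText` are hypothetical callbacks that write into `ocr_status` and `ocr_results`.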
I have downloaded all the JS files (tesseract.js, worker.js, index.js) and the language package eng.traineddata.gz into the same folder as the page.
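For reference, the library could also be pointed at those local copies explicitly. The sketch below assumes Tesseract.js 1.x, where `Tesseract.create()` accepts `workerPath`, `langPath` and `corePath`. It also assumes that index.js is the tesseract.js-core script and that the language file is requested as `langPath + "eng.traineddata.gz"`. The helper name `localTesseractOptions` is hypothetical, and whether this changes the first-load behaviour is untested.

```javascript
// Hypothetical helper: build explicit paths for a locally hosted copy.
// Assumption: Tesseract.js 1.x, whose Tesseract.create() takes workerPath,
// corePath and langPath; file names are the ones downloaded next to the page.
function localTesseractOptions(baseUrl) {
  // Add a trailing slash if the base URL does not already end with one.
  var base = baseUrl.charAt(baseUrl.length - 1) === "/" ? baseUrl : baseUrl + "/";
  return {
    workerPath: base + "worker.js",
    corePath: base + "index.js", // assumed to be the tesseract.js-core build
    langPath: base               // folder that contains eng.traineddata.gz
  };
}

// In the browser, once tesseract.js has loaded, create an instance that
// uses the folder of the current page as the base URL.
if (typeof Tesseract !== "undefined") {
  var pageFolder = location.href.replace(/[^\/]*$/, "");
  var localTesseract = Tesseract.create(localTesseractOptions(pageFolder));
}
```

`localTesseract.recognize(url)` would then be used in place of `Tesseract.recognize(url)`.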