The word images must be able to be described by parameters extracted from a Con-volutional Neural Network (CNN). This to determine what words look similar. The similarity of word images is what is used to create an alignment. The algorithm nds words that are common in the text and identi es these by the shape of the word image and its position on the page. It uses the features it creates for each image to return an alignment, linking words in the transcript to word images given the similarity of reoc-curring words in the ...
Read More
The word images must be able to be described by parameters extracted from a Con-volutional Neural Network (CNN). This to determine what words look similar. The similarity of word images is what is used to create an alignment. The algorithm nds words that are common in the text and identi es these by the shape of the word image and its position on the page. It uses the features it creates for each image to return an alignment, linking words in the transcript to word images given the similarity of reoc-curring words in the text. These words are added as labelled images to a data set. A labelled data set of word images subsequently grows with each page shown to the algo-rithm and is used to facilitate better alignments on future pages. This yields not only a probable alignment of words to word images on a page but also a data set of labelled words.
Read Less
Add this copy of Learning from Imperfections: Building Datasets with to cart. $26.10, new condition, Sold by Ingram Customer Returns Center rated 5.0 out of 5 stars, ships from NV, USA, published 2024 by Tredition Gmbh.