Bimonthly    Since 1986
ISSN 1004-9037
Publication Details
Edited by: Editorial Board of Journal of Data Acquisition and Processing
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Published by: SCIENCE PRESS, BEIJING, CHINA
Distributed by:
China: All Local Post Offices
 
   
      1 Jan 2023, Volume 38 Issue 1   
    Article

    1. AUTOSEG: AN AUTOMATED SEGMENTATION APPROACH FOR TYPEWRITTEN GURMUKHI CHARACTER RECOGNITION
    Gurvir Kaur, Dr. Ajit Kumar
    Journal of Data Acquisition and Processing, 2023, 38 (1): 1857-1870.

    Abstract

    Background: The growing demand for digital media has influenced many fields, and most content is now available digitally for easy access. India is a multilingual country with a rich culture. Most of its historical Vedas, Granths, literary works, poetry, etc., are written in local languages, so there is a need for systems that preserve this heritage in digital form, allowing people from other cultures to read and understand it through translation into their own languages. Many researchers have worked on recognition systems for different languages, but these systems still need improvement, and Gurmukhi is one such case. Although systems have been developed for Gurmukhi, their samples were either handwritten or printed. Much literary material exists in typewritten form, because typewriters were once very popular, so a system that handles this data automatically is needed.
    Objective: This work proposes an automated segmentation approach (AutoSeg) that takes typewritten Gurmukhi text as an input image and segments it.
    Methods: AutoSeg is divided into two phases: a pre-processing phase that enhances the quality of the sample image, and a segmentation phase that segments all the characters from the sample image. The segmented output is intended to support future systems that recognize the characters accurately. AutoSeg uses different image processing methods to segment the lines, words, and characters from the given image samples. It also deals with the issue of broken and touching characters.
    Results: The proposed approach is tested on our collected database of books, theses, poetry documents, etc., and segmentation accuracy is measured at each level. The approach achieved an accuracy of more than 90% at each of the line, word, and character segmentation levels, demonstrating its effectiveness.
    Conclusion: A novel automated segmentation approach is proposed that effectively segments typewritten Gurmukhi characters and deals with problems such as broken or touching characters. The results show the effectiveness of this segmentation approach. Hence, it can be used in future recognition systems for typewritten Gurmukhi characters.
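
    The abstract does not name the image processing methods AutoSeg uses. As a minimal sketch only, the following assumes a projection-profile technique, which is a common choice for line and word segmentation, applied to an already binarized image (1 = ink, 0 = background). The names runs, segment_lines, segment_columns, and min_gap are hypothetical and are not taken from the paper; the sketch does not handle broken or touching characters.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # binarized image: 1 = ink, 0 = background


def runs(profile: List[int], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) spans, end exclusive, where the profile is non-zero.

    Spans separated by fewer than min_gap empty positions are merged.
    """
    spans: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i                      # a non-empty stretch begins
        elif count == 0 and start is not None:
            spans.append((start, i))       # the stretch ended at i
            start = None
    if start is not None:
        spans.append((start, len(profile)))

    merged: List[Tuple[int, int]] = []
    for s, e in spans:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)  # gap too small: join spans
        else:
            merged.append((s, e))
    return merged


def segment_lines(img: Image) -> List[Tuple[int, int]]:
    """Split the page into text lines using the row-wise ink count."""
    return runs([sum(row) for row in img])


def segment_columns(img: Image, top: int, bottom: int,
                    min_gap: int = 1) -> List[Tuple[int, int]]:
    """Split one line (rows top..bottom-1) using the column-wise ink count.

    A larger min_gap (assumed here to separate words) merges narrow gaps.
    """
    width = len(img[0])
    profile = [sum(img[y][x] for y in range(top, bottom)) for x in range(width)]
    return runs(profile, min_gap)


# Tiny hand-made example: two text lines separated by one blank row.
page = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
]
lines = segment_lines(page)
top, bottom = lines[0]
pieces = segment_columns(page, top, bottom)             # every gap splits
words = segment_columns(page, top, bottom, min_gap=2)   # 1-pixel gaps merged
```

    In this sketch the same run-finding step serves all three levels: rows give lines, and columns within a line give words or narrower pieces depending on the gap threshold. Segmentation accuracy at a given level could then be measured by comparing such spans against manually marked ones.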

    Keywords

    Automatic Segmentation, Line, Word, Character, Gurmukhi Text, Typewritten.


