Bimonthly    Since 1986
ISSN 1004-9037
Publication Details
Edited by: Editorial Board of Journal of Data Acquisition and Processing
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Published by: SCIENCE PRESS, BEIJING, CHINA
Distributed by:
China: All Local Post Offices
 
   
      30 Dec 2022, Volume 37 Issue 5   
    Article

    AN ANALYSIS OF WORD EMBEDDING MODELS WIDE-RANGING OF SOTA TRANSFORMERS
    T. Priyanka, A. Mary Sowjanya
    Journal of Data Acquisition and Processing, 2022, 37 (5): 1763-1780.

    Abstract

    Research on word representation has long been an important area of interest in the history of Natural Language Processing (NLP). Interpreting such intricate linguistic data is essential, since it carries a wealth of information useful to many applications. In NLP, deep learning manifests as word embeddings, which represent the words of a document as multi-dimensional numeric vectors in place of traditional word representations. In deep learning models, word embeddings are a crucial source of input features for downstream tasks such as sequence labeling and text classification. Using these approaches, large amounts of text can be converted into effective vector representations that capture the underlying semantic information, and a range of learning algorithms can then use such representations for NLP-related tasks. The effectiveness, or accuracy, of an embedding can be established if, when transferred to a downstream NLP task, it surpasses the performance levels reached by traditional machine learning algorithms. Over the past decade a number of word embedding methods, falling mainly into the traditional and context-based categories, have been proposed. In this study, we examine different word representation models in terms of their expressive power, from historical models to today's state-of-the-art word representation language models.
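    The core idea above can be sketched in a few lines: each word maps to a numeric vector, and semantic relatedness is measured by vector closeness. This is a minimal toy sketch with hand-picked 3-dimensional vectors (not a trained model; real embeddings such as word2vec or GloVe typically have 50-300 dimensions, and the words and values here are illustrative assumptions):

    ```python
    import math

    # Hypothetical toy embeddings: each word is a point in a 3-d space.
    embeddings = {
        "king":  [0.80, 0.65, 0.10],
        "queen": [0.75, 0.70, 0.15],
        "apple": [0.10, 0.20, 0.90],
    }

    def cosine(u, v):
        """Cosine similarity: a standard closeness measure for word vectors."""
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm

    # Semantically related words receive more similar vectors.
    print(cosine(embeddings["king"], embeddings["queen"]))  # high (~0.997)
    print(cosine(embeddings["king"], embeddings["apple"]))  # low  (~0.31)
    ```

    A downstream model (e.g. a text classifier) would consume such vectors as input features instead of one-hot or count-based word representations.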

    Keywords

    NLP, machine learning, word embedding, deep learning, language model.





         

E-mail: info@sjcjycl.cn
 