Bimonthly    Since 1986
ISSN 1004-9037
Publication Details
Edited by: Editorial Board of Journal of Data Acquisition and Processing
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Published by: SCIENCE PRESS, BEIJING, CHINA
Distributed by:
China: All Local Post Offices
 
   
      09 May 2023, Volume 38 Issue 3
    Article

    FEATURE EXTRACTION FOR TEXT MINING WEIGHT BASED CORE CORPUS TF-IDF (W2CTF-IDF)
    1 Dr. P. Logeswari, 2 S. Sudha, 3 G. Banupriya, 4 J. Gokulapriya
    Journal of Data Acquisition and Processing, 2023, 38 (3): 1356-1374.

    Abstract

    Text mining, also known as intelligent text analysis, is an important research area. Because of the high dimensionality of the data, it is very difficult to focus on the most relevant information. Feature extraction is one of the main data-reduction techniques for finding the most important features. Processing a large amount of data stored in unstructured form is a challenging task. Feature extraction is also an important pre-processing technique in data mining that computes feature values in documents. For this reason, efficient feature extraction techniques such as term frequency-inverse document frequency (TF-IDF) are commonly used for term weighting. However, TF-IDF cannot reflect the usefulness or importance of certain features, which reduces classification efficiency. In a pre-processing step, the document server performs stop-word removal, tagging, and the analysis of polysemous words to build a candidate corpus. Weight-based Core Corpus TF-IDF (W2CTF-IDF) is proposed and applied to the candidate corpus to assess the importance of words in a set of documents. Words that W2CTF-IDF rates as highly important are included in a keyword set, and a transaction is created for each document. W2CTF-IDF is evaluated by weighting terms in financial data. The experiments show that W2CTF-IDF improves feature extraction performance, as indicated by the maximum value of the F1 measure.
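
For readers unfamiliar with the baseline, the following is a minimal Python sketch of the pipeline the abstract outlines: stop-word removal, term weighting, keyword selection, and F1 scoring. It implements plain TF-IDF only. The abstract does not specify the W2CTF-IDF weighting or the core-corpus step, so neither is reproduced here. The stop-word list, the function names, and the tf × ln(N/df) weighting form are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "of", "and", "in", "to", "is"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [t for t in tokens if t and t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per document (plain TF-IDF)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_keywords(doc_weights, k=3):
    """Pick the k highest-weighted terms; ties are broken alphabetically."""
    ranked = sorted(doc_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def f1_score(predicted, relevant):
    """F1 of a predicted keyword set against a reference keyword set."""
    predicted, relevant = set(predicted), set(relevant)
    true_pos = len(predicted & relevant)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(relevant)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    docs = [
        "The bank raised interest rates.",
        "The bank reported profit in the quarter.",
        "Interest in the stock market is rising.",
    ]
    for doc, w in zip(docs, tf_idf(docs)):
        print(doc, "->", top_keywords(w, k=2))
```

A term that occurs in every document receives weight zero under this scheme, and the weighting treats all terms of equal frequency alike regardless of their role in the domain. This is one commonly cited limitation of plain TF-IDF that weighted variants try to address.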

    Keywords

    Feature extraction, Text Mining, TF-IDF, Weight-based Core Corpus TF-IDF, Weighting


Journal of Data Acquisition and Processing
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
E-mail: info@sjcjycl.cn
 