Bimonthly Since 1986
ISSN 1004-9037
Publication Details
Edited by: Editorial Board of Journal of Data Acquisition and Processing
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Published by: SCIENCE PRESS, BEIJING, CHINA
Distributed by:
China: All Local Post Offices
Abstract
The vector space model is a mathematical model that evaluates the similarity between each document in a large data set and a query, and orders the documents by that similarity so that a user can find the best match. It computes the similarity value with the cosine function, which in turn relies on a weighting scheme. The available weighting factors are term frequency (TF) and inverse document frequency (IDF). A query as written usually contains various stop words, but only the main query terms matter for finding the best match. We find that the results of the vector space model sometimes differ slightly from other results because stop words are separated out during similarity analysis. We therefore assign a value to stop words so that they, too, can improve the rank of a document. We also work with an entropy-based link optimization algorithm for ranking documents, so that the improved vector space model can be compared with it.
Keywords
Optimization, Entropy, Data Mining
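As a rough illustration of the ranking idea described in the abstract, the following minimal sketch scores documents against a query with TF-IDF weights and cosine similarity, and scales stop-word weights by a factor instead of discarding them. The stop-word list, the `stop_weight` parameter, the smoothed IDF formula, and all function names are assumptions made for illustration; the abstract does not specify the paper's actual weighting values or implementation.

```python
import math
from collections import Counter

# Hypothetical stop-word list; the abstract does not give one.
STOP_WORDS = {"the", "a", "of", "in", "is", "for"}


def tfidf_vector(tokens, idf, stop_weight):
    """Build a sparse TF-IDF vector; stop-word weights are scaled by stop_weight."""
    vec = {}
    for term, count in Counter(tokens).items():
        weight = count * idf.get(term, 0.0)  # terms unseen in the corpus get 0
        if term in STOP_WORDS:
            weight *= stop_weight  # assumed down-weighting, not the paper's value
        vec[term] = weight
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(docs, query, stop_weight=0.0):
    """Return (score, doc_index) pairs sorted from best to worst match."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed IDF (an assumption) so that very common terms keep a nonzero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    q_vec = tfidf_vector(query.lower().split(), idf, stop_weight)
    scores = [
        (cosine(q_vec, tfidf_vector(toks, idf, stop_weight)), i)
        for i, toks in enumerate(tokenized)
    ]
    return sorted(scores, reverse=True)


docs = [
    "the cat sat in the hat",
    "entropy of the link graph",
    "ranking of documents",
]
without_stops = rank(docs, "the entropy of a link", stop_weight=0.0)
with_stops = rank(docs, "the entropy of a link", stop_weight=0.5)
```

With `stop_weight=0.0` the sketch behaves like a conventional vector space model that ignores stop words; a positive value lets shared stop words contribute a small amount to each document's score, which is the general effect the abstract describes.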