Loading...
Bimonthly    Since 1986
ISSN 1004-9037
/
Indexed in:
SCIE, Ei, INSPEC, JST, AJ, MR, CA, DBLP, etc.
Publication Details
Edited by: Editorial Board of Journal of Data Acquisition and Processing
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Published by: SCIENCE PRESS, BEIJING, CHINA
Distributed by:
China: All Local Post Offices
 
  • Table of Content
      05 September 2011, Volume 26 Issue 5   
    For Selected: View Abstracts Toggle Thumbnails
    Special Section on Community Analysis and Information Recommendation
    Preface
    Xue-Qi Cheng, Wei-Song Shi, Tao Zhou
    Journal of Data Acquisition and Processing, 2011, 26 (5): 751-753. 
    Abstract   PDF(199KB) ( 1671 )  
    In the past few years, we have witnessed rapid proliferation of online social network services, such as Facebook, Twitter, and Del.icio.us, which greatly facilitate the collaboration, sharing, and other kinds of interactions among individuals. The term 'social media' has thus been coined to embrace all those new collaborative services or applications and also to indicate a new "social" approach to generating and distributing Web contents. The everincreasing social network services produce large-scale social data for analyzing user behaviors, and then improving existing services. On the other hand, they also bring forth more severe information overload and require more flexible and robust computing platforms or service paradigms.
    Personalized News Recommendation: A Review and an Experimental Investigation
    LI Lei, WANG Ding-Ding, SHU Shun-Zhi, LI Chao
    Journal of Data Acquisition and Processing, 2011, 26 (5): 754-766. 
    Abstract   PDF(1153KB) ( 3066 )  
    Online news articles, as a new format of press releases, have sprung up on the Internet. With its convenience and recency, more and more people prefer to read news online instead of reading the paper-format press releases. However, a gigantic amount of news events might be released at a rate of hundreds, even thousands per hour. A challenging problem is how to efficiently select specific news articles from a large corpus of newly-published press releases to recommend to individual readers, where the selected news items should match the reader's reading preference as much as possible. This issue refers to personalized news recommendation. Recently, personalized news recommendation has become a promising research direction as the Internet provides fast access to real-time information from multiple sources around the world. Existing personalized news recommendation systems strive to adapt their services to individual users by virtue of both user and news content information. A variety of techniques have been proposed to tackle personalized news recommendation, including content-based, collaborative filtering systems and hybrid versions of these two. In this paper, we provide a comprehensive investigation of existing personalized news recommenders. We discuss several essential issues underlying the problem of personalized news recommendation, and explore possible solutions for performance improvement. Further, we provide an empirical study on a collection of news articles obtained from various news websites, and evaluate the effect of different factors for personalized news recommendation. We hope our discussion and exploration would provide insights for researchers who are interested in personalized news recommendation.
    Tag-Aware Recommender Systems: A State-of-the-Art Survey
    Zi-Ke Zhang (张子柯), Tao Zhou (周涛), and Yi-Cheng Zhang (张翼成)
    Journal of Data Acquisition and Processing, 2011, 26 (5): 767-777. 
    Abstract   PDF(811KB) ( 3481 )  
    In the past decade, Social Tagging Systems have attracted increasing attention from both physical and computer science communities. Besides the underlying structure and dynamics of tagging systems, many efforts have been addressed to unify tagging information to reveal user behaviors and preferences, extract the latent semantic relations among items, make recommendations, and so on. Specifically, this article summarizes recent progress about tag-aware recommender systems, emphasizing on the contributions from three mainstream perspectives and approaches: network-based methods, tensor-based methods, and the topic-based methods. Finally, we outline some other tag-related studies and future challenges of tag-aware recommendation algorithms.
    Detecting Communities in K-Partite K-Uniform (Hyper)Networks
    Xin Liu (刘欣) and Tsuyoshi Murata, Member, ACM, IEEE
    Journal of Data Acquisition and Processing, 2011, 26 (5): 778-791. 
    Abstract   PDF(1897KB) ( 1841 )  
    In social tagging systems such as Delicious and Flickr, users collaboratively manage tags to annotate resources. Naturally, a social tagging system can be modeled as a (user, tag, resource) hypernetwork, where there are three different types of nodes, namely users, resources and tags, and each hyperedge has three end nodes, connecting a user, a resource and a tag that the user employs to annotate the resource. Then how can we automatically cluster related users, resources and tags, respectively? This is a problem of community detection in a 3-partite, 3-uniform hypernetwork. More generally, given a K-partite K-uniform (hyper)network, where each (hyper)edge is a K-tuple composed of nodes of K different types, how can we automatically detect communities for nodes of different types? In this paper, by turning this problem into a problem of finding an efficient compression of the (hyper)network's structure, we propose a quality function for measuring the goodness of partitions of a K-partite K-uniform (hyper)network into communities, and develop a fast community detection method based on optimization. Our method overcomes the limitations of state of the art techniques and has several desired properties such as comprehensive, parameter-free, and scalable. We compare our method with existing methods in both synthetic and real-world datasets.
    A Posteriori Approach for Community Detection
    Chuan Shi (石川), Member, CCF, IEEE, Zhen-Yu Yan (闫震宇), Member, IEEE, Xin Pan (潘欣), Ya-Nan Cai (蔡亚男), and Bin Wu (吴斌), Member, CCF
    Journal of Data Acquisition and Processing, 2011, 26 (5): 792-805. 
    Abstract   PDF(465KB) ( 1752 )  
    Conventional community detection approaches in complex network are based on the optimization of a priori decision, i.e., a single quality function designed beforehand. This paper proposes a posteriori decision approach for community detection. The approach includes two phases: in the search phase, a special multi-objective evolutionary algorithm is designed to search for a set of tradeoff partitions that reveal the community structure at different scales in one run; in the decision phase, three model selection criteria and the Possibility Matrix method are proposed to aid decision makers to select the preferable solutions through differentiating the set of optimal solutions according to their qualities. The experiments in five synthetic and real social networks illustrate that, in one run, our method is able to obtain many candidate solutions, which effectively avoids the resolution limit existing in priori decision approaches. In addition, our method can discover more authentic and comprehensive community structures than those priori decision approaches.
    Modeling Consensus Semantics in Social Tagging Systems
    Bin Zhang (张斌), Senior Member, CCF, Member, ACM, Yin Zhang (张引), and Ke-Ning Gao (高克宁)
    Journal of Data Acquisition and Processing, 2011, 26 (5): 806-815. 
    Abstract   PDF(472KB) ( 1901 )  
    In social tagging systems, people can annotate arbitrary tags to online data to categorize and index them. However, the lack of the "a priori" set of words makes it difficult for people to reach consensus about the semantics of tags and how to categorize data. Ontologies based approaches can help reaching such consensus, but they are still facing problems such as inability of model ambiguous and new concepts properly. For tags that are used very few times, since they can only be used in very specific contexts, their semantics are very clear and detailed. Although people have no consensus on these tags, it is still possible to leverage these detailed semantics to model the other tags. In this paper we introduce a random walk and spreading activation like model to represent the semantics of tags using semantics of unpopular tags. By comparing the proposed model to the classic Latent Semantic Analysis approach in a concept clustering task, we show that the proposed model can properly capture the semantics of tags.
    From Popularity to Personality —— A Heuristic Music Recommendation Method for Niche Market
    Jun-Lin Zhou (周俊临), Yan Fu (傅彦), Hua Lu (路华), and Chong-Jing Sun (孙崇敬)
    Journal of Data Acquisition and Processing, 2011, 26 (5): 816-822. 
    Abstract   PDF(574KB) ( 1705 )  
    In most current recommender systems, the goal to accurately predict what people want leads to the tendency to recommend popular items, which is less helpful in revealing user's personality, especially to new users. In this paper, we propose a heuristic music recommendation method for niche market by focusing on how to identify user's personality as soon as possible. Instead of trying to improve algorithm's performance on new users by recommending the most popular items, we work on how to make them "familiar" with the system earlier. The method is more suitable for brand-new users, and gives a hint to solve the cold start problem. In real applications it is better to combine it with a traditional approach.
    Preventing Recommendation Attack in Trust-Based Recommender Systems
    Fu-Guo Zhang (张富国)
    Journal of Data Acquisition and Processing, 2011, 26 (5): 823-828. 
    Abstract   PDF(485KB) ( 1916 )  
    Despite its success, similarity-based collaborative filtering suffers from some limitations, such as scalability, sparsity and recommendation attack. Prior work has shown incorporating trust mechanism into traditional collaborative filtering recommender systems can improve these limitations. We argue that trust-based recommender systems are facing novel recommendation attack which is different from the profile injection attacks in traditional recommender system. To the best of our knowledge, there has not any prior study on recommendation attack in a trust-based recommender system. We analyze the attack problem, and find that "victim" nodes play a significant role in the attack. Furthermore, we propose a data provenance method to trace malicious users and identify the "victim" nodes as distrust users of recommender system. Feasibility study of the defend method is done with the dataset crawled from Epinions website.
    Understanding and Exploiting Information Spreading and Integrating Technologies
    Petter Holme and Mikael Huss
    Journal of Data Acquisition and Processing, 2011, 26 (5): 829-836. 
    Abstract   PDF(943KB) ( 1815 )  
    Our daily life leaves an increasing amount of digital traces, footprints that are improving our lives. Data-mining tools, like recommender systems, convert these traces to information for aiding decisions in an ever-increasing number of areas in our lives. The feedback loop from what we do, to the information this produces, to decisions what to do next, will likely be an increasingly important factor in human behavior on all levels from individuals to societies. In this essay, we review some effects of this feedback and discuss how to understand and exploit them beyond mapping them on more well-understood phenomena. We take examples from models of spreading phenomena in social media to argue that analogies can be deceptive, instead we need to fresh approaches to the new types of data, something we exemplify with promising applications in medicine.
    Architecture and High Performance Computer Systems
    QoS-Aware Automatic Service Composition: A Graph View
    Wei Jiang (姜伟), Tian Wu (吴甜), Song-Lin Hu (虎嵩林), Senior Member, CCF, and Zhi-Yong Liu (刘志勇), Senior Member, CCF
    Journal of Data Acquisition and Processing, 2011, 26 (5): 837-853. 
    Abstract   PDF(738KB) ( 1815 )  
    In the research of service composition, it demands efficient algorithms that not only retrieve correct service compositions automatically from thousands of services but also satisfy the quality requirements of different service users. However, most approaches treat these two aspects as two separate problems, automatic service composition and service selection. Although the latest researches realize the restriction of this separate view and some specific methods are proposed, they still suffer from serious limitations in scalability and accuracy when addressing both requirements simultaneously. In order to cope with these limitations and efficiently solve the combined problem which is known as QoS-aware or QoS-driven automatic service composition problem, we propose a new graph search problem, single-source optimal directed acyclic graphs (DAGs), for the first time. This novel single-source optimal DAGs (SSOD) problem is similar to, but more general than the classical single-source shortest paths (SSSP) problem. In this paper, a new graph model of SSOD problem is proposed and a Sim-Dijkstra algorithm is presented to address the SSOD problem with the time complexity of O(n log n + m) (n and m are the number of nodes and edges in the graph respectively), and the proofs of its soundness. It is also directly applied to solve the QoS-aware automatic service composition problem, and a service composition tool named QSynth is implemented. Evaluations show that Sim-Dijkstra algorithm achieves superior scalability and efficiency with respect to a large variety of composition scenarios, even more efficient than our worklist algorithm that won the performance championship of Web Services Challenge 2009.
    Optimizing Linpack Benchmark on GPU-Accelerated Petascale Supercomputer
    Feng Wang (王锋) Member, CCF, ACM, Can-Qun Yang (杨灿群), Yun-Fei Du (杜云飞), Juan Chen (陈娟), Hui-Zhan Yi (易会战), and Wei-Xia Xu (徐炜遐)
    Journal of Data Acquisition and Processing, 2011, 26 (5): 854-865. 
    Abstract   PDF(381KB) ( 2395 )  
    In this paper we present the programming of the Linpack benchmark on TianHe-1 system, the first petascale supercomputer system of China, and the largest GPU-accelerated heterogeneous system ever attempted before. A hybrid programming model consisting of MPI, OpenMP and streaming computing is described to explore the task parallel, thread parallel and data parallel of the Linpack. We explain how we optimized the load distribution across the CPUs and GPUs using the two-level adaptive method and describe the implementation in details. To overcome the low-bandwidth between the CPU and GPU communication, we present a software pipelining technique to hide the communication overhead. Combined with other traditional optimizations, the Linpack we developed achieved 196:7 GFLOPS on a single compute element of TianHe-1. This result is 70:1% of the peak compute capability, 3:3 times faster than the result by using the vendor's library. On the full configuration of TianHe-1 our optimizations resulted in a Linpack performance of 0:563 PFLOPS, which made TianHe-1 the 5th fastest supercomputer on the Top500 list in November, 2009.
    Revisiting Multiple Pattern Matching Algorithms for Multi-Core Architecture
    Guang-Ming Tan (谭光明), Member, CCF, ACM, Ping Liu (刘萍), Member, CCF, ACM, Dong-Bo Bu (卜东波), Member, CCF, ACM, and Yan-Bing Liu (刘燕兵), Member, CCF, ACM
    Journal of Data Acquisition and Processing, 2011, 26 (5): 866-874. 
    Abstract   PDF(527KB) ( 2114 )  
    Due to the huge size of patterns to be searched, multiple pattern searching remains a challenge to several newly-arising applications like network intrusion detection. In this paper, we present an attempt to design efficient multiple pattern searching algorithms on multi-core architectures. We observe an important feature which indicates that the multiple pattern matching time mainly depends on the number and minimal length of patterns. The multi-core algorithm proposed in this paper leverages this feature to decompose pattern set so that the parallel execution time is minimized. We formulate the problem as an optimal decomposition and scheduling of a pattern set, then propose a heuristic algorithm, which takes advantage of dynamic programming and greedy algorithmic techniques, to solve the optimization problem. Experimental results suggest that our decomposition approach can increase the searching speed by more than 200% on a 4-core AMD Barcelona system.
    An FFT Performance Model for Optimizing General-Purpose Processor Architecture
    Ling Li (李玲), Member, CCF, ACM, IEEE, Yun-Ji Chen (陈云霁), Member, CCF, ACM, Dao-Fu Liu (刘道福), Cheng Qian (钱诚), and Wei-Wu Hu (胡伟武), Member, IEEE
    Journal of Data Acquisition and Processing, 2011, 26 (5): 875-889. 
    Abstract   PDF(868KB) ( 1948 )  
    General-purpose processor (GPP) is an important platform for fast Fourier transform (FFT), due to its flexibility, reliability and practicality. FFT is a representative application intensive in both computation and memory access, optimizing the FFT performance of a GPP also benefits the performances of many other applications. To facilitate the analysis of FFT, this paper proposes a theoretical model of the FFT processing. The model gives out a tight lower bound of the runtime of FFT on a GPP, and guides the architecture optimization for GPP as well. Based on the model, two theorems on optimization of architecture parameters are deduced, which refer to the lower bounds of register number and memory bandwidth. Experimental results on different processor architectures (including Intel Core i7 and Godson-3B) validate the performance model.
    The above investigations were adopted in the development of Godson-3B, which is an industrial GPP. The optimization techniques deduced from our performance model improve the FFT performance by about 40%, while incurring only 0:8% additional area cost. Consequently, Godson-3B solves the 1024-point single-precision complex FFT in 0:368 μs with about 40Watt power consumption, and has the highest performance-per-watt in complex FFT among processors as far as we know. This work could benefit optimization of other GPPs as well.
    Artificial Intelligence
    Managing Software Requirements Changes Based on Negotiation-Style Revision
    Ke-Dian Mu (牟克典), Member, CCF, Weiru Liu, Zhi Jin (金芝), Senior Member, CCF, IEEE, Jun Hong, and David Bell
    Journal of Data Acquisition and Processing, 2011, 26 (5): 890-907. 
    Abstract   PDF(597KB) ( 1835 )  
    For any proposed software project, when the software requirements specification has been established, requirements changes may result in not only a modification of the requirements specification but also a series of modifications of all existing artifacts during the development. Then it is necessary to provide effective and flexible requirements changes management. In this paper, we present an approach to managing requirements changes based on Booth's negotiation-style framework for belief revision. Informally, we consider the current requirements specification as a belief set about the systemto-be. The request of requirements change is viewed as new information about the same system-to-be. Then the process of executing the requirements change is a process of revising beliefs about the system-to-be. We design a family of belief negotiation models appropriate for different processes of requirements revision, including the setting of the request of requirements change being fully accepted, the setting of the current requirements specification being fully preserved, and that of the current specification and the request of requirements change reaching a compromise. In particular, the prioritization of requirements plays an important role in reaching an agreement in each belief negotiation model designed in this paper.
    Linearly and Quadratically Separable Classifiers Using Adaptive Approach
    Mohamed Abdel-Kawy Mohamed Ali Soliman and Rasha M. Abo-Bakr
    Journal of Data Acquisition and Processing, 2011, 26 (5): 908-918. 
    Abstract   PDF(822KB) ( 2031 )  
    This paper presents a fast adaptive iterative algorithm to solve linearly separable classification problems in Rn. In each iteration, a subset of the sampling data (n-points, where n is the number of features) is adaptively chosen and a hyperplane is constructed such that it separates the chosen n-points at a margin ε and best classifies the remaining points. The classification problem is formulated and the details of the algorithm are presented. Further, the algorithm is extended to solving quadratically separable classification problems. The basic idea is based on mapping the physical space to another larger one where the problem becomes linearly separable. Numerical illustrations show that few iteration steps are sufficient for convergence when classes are linearly separable. For nonlinearly separable data, given a specified maximum number of iteration steps, the algorithm returns the best hyperplane that minimizes the number of misclassified points occurring through these steps. Comparisons with other machine learning algorithms on practical and benchmark datasets are also presented, showing the performance of the proposed algorithm.
SCImago Journal & Country Rank
 

ISSN 1004-9037

         

Home
Editorial Board
Author Guidelines
Subscription
Journal of Data Acquisition and Processing
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China

E-mail: info@sjcjycl.cn
 
  Copyright ©2015 JCST, All Rights Reserved