Bimonthly    Since 1986
ISSN 1004-9037
Indexed in:
Publication Details
Edited by: Editorial Board of Journal of Data Acquisition and Processing
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Distributed by:
China: All Local Post Offices
  • Table of Content
      05 July 2013, Volume 28 Issue 4   
    For Selected: View Abstracts Toggle Thumbnails
    Special Section of EDB2012
    Jinho Kim, Sang-Wook Kim, Sanghyun Park, Haixun Wang
    Journal of Data Acquisition and Processing, 2013, 28 (4): 583-584. 
    Abstract   PDF(312KB) ( 1459 )  
    It is our great pleasure to edit this special section of the Journal of Data Acquisition and Processing (JCST). The database field has experienced a rapid growth with increasing of data. Therefore, novel technology for covering emerging databases such as network or graph analysis, spatial or temporal data analysis, search, recommendation, and data mining is required. The goal of the section is to provide state-of-the-art research issues, challenges, new technologies, and solutions of emerging databases. This section publishes seven interesting articles related to query processing, trajectory data reduction, botnet evolution, recommendation system, biclustering, and protein structure alignment. The articles are summarized as follows.
    k-Nearest Neighbor Query Processing Algorithms for a Query Region in Road Networks
    Hyeong-Il Kim and Jae-Woo Chang
    Journal of Data Acquisition and Processing, 2013, 28 (4): 585-596. 
    Abstract   PDF(843KB) ( 1929 )  
    Recent development of wireless communication technologies and the popularity of smart phones are making location-based services (LBS) popular. However, requesting queries to LBS servers with users' exact locations may threat the privacy of users. Therefore, there have been many researches on generating a cloaked query region for user privacy protection. Consequently, an efficient query processing algorithm for a query region is required. So, in this paper, we propose k-nearest neighbor query (k-NN) processing algorithms for a query region in road networks. To efficiently retrieve k-NN points of interest (POIs), we make use of the Island index. We also propose a method that generates an adaptive Island index to improve the query processing performance and storage usage. Finally, we show by our performance analysis that our k-NN query processing algorithms outperform the existing k-Range Nearest Neighbor (kRNN) algorithm in terms of network expansion cost and query processing time.
    Online Approach for Spatio-Temporal Trajectory Data Reduction for Portable Devices
    Heemin Park, Young-Jun Lee, Jinseok Chae, and Wonik Choi
    Journal of Data Acquisition and Processing, 2013, 28 (4): 597-604. 
    Abstract   PDF(3062KB) ( 1999 )  
    As location data are widely available to portable devices, trajectory tracking of moving objects has become an essential technology for most location-based services. To maintain such streaming data of location updates from mobile clients, conventional approaches such as time-based regular location updating and distance-based location updating have been used. However, these methods suffer from the large amount of data, redundant location updates, and large trajectory estimation errors due to the varying speed of moving objects. In this paper, we propose a simple but efficient online trajectory data reduction method for portable devices. To solve the problems of redundancy and large estimation errors, the proposed algorithm computes trajectory errors and finds a recent location update that should be sent to the server to satisfy the user requirements. We evaluate the proposed algorithm with real GPS trajectory data consisting of 17 201 trajectories. The intensive simulation results prove that the proposed algorithm always meets the given user requirements and exhibits a data reduction ratio of greater than 87% when the acceptable trajectory error is greater than or equal to 10 meters.
    Mining Botnets and Their Evolution Patterns
    Jaehoon Choi, Jaewoo Kang, Jinseung Lee, Chihwan Song, Qingsong Jin, Sunwon Lee, and Jinsun Uh
    Journal of Data Acquisition and Processing, 2013, 28 (4): 605-615. 
    Abstract   PDF(6597KB) ( 2003 )  
    The botnet is the network of compromised computers that have fallen under the control of hackers after being infected by malicious programs such as trojan viruses. The compromised machines are mobilized to perform various attacks including mass spamming, distributed denial of service (DDoS) and additional trojans. This is becoming one of the most serious threats to the Internet infrastructure at present. We introduce a method to uncover compromised machines and characterize their behaviors using large email logs. We report various spam campaign variants with different characteristics and introduce a statistical method to combine them. We also report the long-term evolution patterns of the spam campaigns.
    Securing Recommender Systems Against Shilling Attacks Using Social-Based Clustering
    Xiang-Liang Zhang, Tak Man Desmond Lee, and Georgios Pitsilis
    Journal of Data Acquisition and Processing, 2013, 28 (4): 616-624. 
    Abstract   PDF(896KB) ( 2387 )  
    Recommender systems (RS) have been found supportive and practical in e-commerce and been established as useful aiding services. Despite their great adoption in the user communities, RS are still vulnerable to unscrupulous producers who try to promote their products by shilling the systems. With the advent of social networks new sources of information have been made available which can potentially render RS more resistant to attacks. In this paper we explore the information provided in the form of social links with clustering for diminishing the impact of attacks. We propose two algorithms, CluTr and WCluTr, to combine clustering with "trust" among users. We demonstrate that CluTr and WCluTr enhance the robustness of RS by experimentally evaluating them on data from a public consumer recommender system Epinions.com.
    Dominant Skyline Query Processing over Multiple Time Series
    Hao Wang, Chao-Kun Wang, Ya-Jun Xu and Yuan-Chi Ning
    Journal of Data Acquisition and Processing, 2013, 28 (4): 625-635. 
    Abstract   PDF(1116KB) ( 1586 )  
    Multiple time series (MTS), which describes an object in multi-dimensions, is based on single time series and has been proved to be useful. In this paper, a new analytical method called α/β-Dominant-Skyline on MTS and a formal definition of the α/β-dominant skyline MTS are given. Also, three algorithms, called NL, BC and MFB, are proposed to address the α/β-dominant skyline queries over MTS. Finally experimental results on both synthetic and real data verify the correctness and effectiveness of the proposed method and algorithms.
    SUBic: A Scalable Unsupervised Framework for Discovering High Quality Biclusters
    Jooil Lee, Yanhua Jin, and Won Suk Lee
    Journal of Data Acquisition and Processing, 2013, 28 (4): 636-646. 
    Abstract   PDF(881KB) ( 1695 )  
    A biclustering algorithm extends conventional clustering techniques to extract all of the meaningful subgroups of genes and conditions in the expression matrix of a microarray dataset. However, such algorithms are very sensitive to input parameters and show poor scalability. This paper proposes a scalable unsupervised biclustering framework, SUBic, to find high quality constant-row biclusters in an expression matrix effectively. A one-dimensional clustering algorithm is proposed to partition the attributes, that is, columns of an expression matrix into disjoint groups based on the similarity of expression values. These groups form a set of short transactions and are used to discover a set of frequent itemsets each of which corresponds to a bicluster. However, a bicluster may include any attribute whose expression value is not similar enough to others, so a bicluster refinement is used to enhance the quality of a bicluster by removing those attributes based on its distribution of expression values. The performance of the proposed method is comparatively analyzed through a series of experiments on synthetic and real datasets.
    CORE: Common Region Extension Based Multiple Protein Structure Alignment for Producing Multiple Solution
    Woo-Cheol Kim, Sanghyun Park, and Jung-Im Won
    Journal of Data Acquisition and Processing, 2013, 28 (4): 647-656. 
    Abstract   PDF(823KB) ( 1480 )  
    Over the past several decades, biologists have conducted numerous studies examining both general and specific functions of proteins. Generally, if similarities in either the structure or sequence of amino acids exist for two proteins, then a common biological function is expected. Protein function is determined primarily based on the structure rather than the sequence of amino acids. The algorithm for protein structure alignment is an essential tool for the research. The quality of the algorithm depends on the quality of the similarity measure that is used, and the similarity measure is an objective function used to determine the best alignment. However, none of existing similarity measures became golden standard because of their individual strength and weakness. They require excessive filtering to find a single alignment. In this paper, we introduce a new strategy that finds not a single alignment, but multiple alignments with different lengths. This method has obvious benefits of high quality alignment. However, this novel method leads to a new problem that the running time for this method is considerably longer than that for methods that find only a single alignment. To address this problem, we propose algorithms that can locate a common region (CORE) of multiple alignment candidates, and can then extend the CORE into multiple alignments. Because the CORE can be defined from a final alignment, we introduce CORE* that is similar to CORE and propose an algorithm to identify the CORE*. By adopting CORE* and dynamic programming, our proposed method produces multiple alignments of various lengths with higher accuracy than previous methods. In the experiments, the alignments identified by our algorithm are longer than those obtained by TM-align by 17% and 15.48%, on average, when the comparison is conducted at the level of super-family and fold, respectively.
    Architecture and VLSI Design
    Optimizing Parallel Sn Sweeps on Unstructured Grids for Multi-Core Clusters
    Jie Yan, Guang-Ming Tan, and Ning-Hui Sun
    Journal of Data Acquisition and Processing, 2013, 28 (4): 657-670. 
    Abstract   PDF(2623KB) ( 2008 )  
    In particle transport simulations, radiation effects are often described by the discrete ordinates (Sn) form of Boltzmann equation. In each ordinate direction, the solution is computed by sweeping the radiation flux across the grid. Parallel Sn sweep on an unstructured grid can be explicitly modeled as topological traversal through an equivalent directed acyclic graph (DAG), which is a data-driven algorithm. Its traditional design using MPI model results in irregular communication of massive short messages which cannot be efficiently handled by MPI runtime. Meanwhile, in high-end HPC cluster systems, multicore has become the standard processor configuration of a single node. The traditional data-driven algorithm of Sn sweeps has not exploited potential advantages of multi-threading of multicore on shared memory. These advantages, however, as we shall demonstrate, could provide an elegant solution resolving problems in the previous MPI-only design. In this paper, we give a new design of data-driven parallel Sn sweeps using hybrid MPI and Pthread programming, namely Sweep-H, to exploit hierarchical parallelism of processes and threads. With special multi-threading techniques and vertex schedule policy, Sweep-H gets more efficient communication and better load balance. We further present an analytical performance model for Sweep-H to reveal why and when it is advantageous over former MPI counterpart. On a 64-node multicore cluster system with 12 cores per node, 768 cores in total, Sweep-H achieves nearly linear scalability for moderate problem sizes, and better absolute performance than the previous MPI algorithm on more than 16 nodes (by up to two times speedup on 64 nodes).
    Thermal-Aware Post Layout Voltage-Island Generation for 3D ICs
    Ning Xu, Yu-Chun Ma, Jia Liu and Shou-Chun Tao
    Journal of Data Acquisition and Processing, 2013, 28 (4): 671-681. 
    Abstract   PDF(1933KB) ( 2066 )  
    To reduce the interconnect delay and improve the chip performance, three-dimensional (3D) chip emerged with the rapid increasing of chip integration and chip power density. Therefore, thermal issue is one of the critical challenges in 3D IC design due to the high power density. Multiple Supply Voltages (MSV) technique provides an efficient way to optimize power consumption which in turn may alleviate the hotspots. But the voltage assignment is limited not only by the performance constraints of the design, but also by the physical layout of circuit modules since the modules with the same voltage should be gathered to reduce the power-network routing resource. Especially in 3D designs, the optimization using MSV technique becomes even more complicated since the high temperature also influences the power consumption and delay on paths. In this paper, we address the voltage-island generation problem for MSV designs in 3D ICs based on a mixed integer linear programming (MILP) model. First, we propose a general MILP formulation for voltage-island generation to optimize thermal distribution as well as power-network routing resources while maintaining the whole chip performance. With the thermal-power interdependency, an iterative optimization approach is proposed to obtain the convergence. Experimental results show that our thermal-aware voltage-island generation approach can reduce the maximal on-chip temperature by 23.64% with a reasonable runtime and save the power-network routing resources by 16.71%.
    A Robust and Power-Efficient SoC Implementation in 65nm
    Bin Xiao, Yi-Fu Zhang, Yan-Ping Gao, Liang Yang, Dong-Mei Wu, and Bao-Xia Fan
    Journal of Data Acquisition and Processing, 2013, 28 (4): 682-688. 
    Abstract   PDF(2602KB) ( 1200 )  
    Godson2H is a complex SoC (System-on-Chip) of Godson series, which is a 117mm2, 152 million transistors chip fabricated in 65nm CMOS LP/GP process technology. It integrates a 1 GHz processor core and abundant high or low speed peripheral IO interfaces. To overcome on-chip-variation problems in deep submicron designs, many methods are adopted in clock tree, and PVT detectors are integrated for debug. To meet the low power constraints in different applications, most of state-of-the-art low power methods are used carefully, such as dynamic voltage and frequency scaling, power gating and aggressive multi-voltage design.
    Artificial Intelligence and Pattern Recognition
    A Survey of Commonsense Knowledge Acquisition
    Liang-Jun Zang, Cong Cao, Ya-Nan Cao, Yu-Ming Wu, and Cun-Gen Cao
    Journal of Data Acquisition and Processing, 2013, 28 (4): 689-719. 
    Abstract   PDF(1185KB) ( 2634 )  
    Collecting massive commonsense knowledge (CSK) for commonsense reasoning has been a long time standing challenge within artificial intelligence research. Numerous methods and systems for acquiring CSK have been developed to overcome the knowledge acquisition bottleneck. Although some specific commonsense reasoning tasks have been presented to allow researchers to measure and compare the performance of their CSK systems, we compare them at a higher level from the following aspects: CSK acquisition task (what CSK is acquired from where), technique used (how can CSK be acquired), and CSK evaluation methods (how to evaluate the acquired CSK). In this survey, we first present a categorization of CSK acquisition systems and the great challenges in the field. Then, we review and compare the CSK acquisition systems in detail. Finally, we conclude the current progress in this field and explore some promising future research issues.
    Parameter Control of Genetic Algorithms by Learning and Simulation of Bayesian Networks —— A Case Study for the Optimal Ordering of Tables
    Concha Bielza, Juan A. Fernández del Pozo, and Pedro Larrañaga
    Journal of Data Acquisition and Processing, 2013, 28 (4): 720-731. 
    Abstract   PDF(1773KB) ( 1322 )  
    Parameter setting for evolutionary algorithms is still an important issue in evolutionary computation. There are two main approaches to parameter setting: parameter tuning and parameter control. In this paper, we introduce self-adaptive parameter control of a genetic algorithm based on Bayesian network learning and simulation. The nodes of this Bayesian network are genetic algorithm parameters to be controlled. Its structure captures probabilistic conditional (in)dependence relationships between the parameters. They are learned from the best individuals, i.e., the best configurations of the genetic algorithm. Individuals are evaluated by running the genetic algorithm for the respective parameter configuration. Since all these runs are time-consuming tasks, each genetic algorithm uses a small-sized population and is stopped before convergence. In this way promising individuals should not be lost. Experiments with an optimal search problem for simultaneous row and column orderings yield the same optima as state-of-the-art methods but with a sharp reduction in computational time. Moreover, our approach can cope with as yet unsolved high-dimensional problems.
    Information Security
    A Generic Framework for Anonymous Authentication in Mobile Networks
    Jing Xu and Wen-Tao Zhu
    Journal of Data Acquisition and Processing, 2013, 28 (4): 732-742. 
    Abstract   PDF(376KB) ( 1734 )  
    Designing an anonymous user authentication scheme in global mobility networks is a non-trivial task because wireless networks are susceptible to attacks and mobile devices powered by batteries have limited communication, processing and storage capabilities. In this paper, we present a generic construction that converts any existing secure password authen-tication scheme based on a smart card into an anonymous authentication scheme for roaming services. The security proof of our construction can be derived from the underlying password authentication scheme employing the same assumptions. Compared with the original password authentication scheme, the transformed scheme does not sacrifice the authentication efficiency, and additionally, an agreed session key can be securely established between an anonymous mobile user and the foreign agent in charge of the network being visited. Furthermore, we present an instantiation of the proposed generic construction. The performance analysis shows that compared with other related anonymous authentication schemes, our instantiation is more efficient.
    Forgeability of Wang-Zhu-Feng-Yau's Attribute-Based Signature with Policy-and-Endorsement Mechanism
    Ai-Jun Ge, Xin-Yi Huang, Cheng Chen, Chuan-Gui Ma, and Rui Zhang
    Journal of Data Acquisition and Processing, 2013, 28 (4): 743-748. 
    Abstract   PDF(479KB) ( 1687 )  
    Recently, Wang et al. presented a new construction of attribute-based signature with policy-and-endorsement mechanism. The existential unforgeability of their scheme was claimed to be based on the strong Diffie-Hellman assumption in the random oracle model. Unfortunately, by carefully revisiting the design and security proof of Wang et al.'s scheme, we show that their scheme cannot provide unforgeability, namely, a forger, whose attributes do not satisfy a given signing predicate, can also generate valid signatures. We also point out the flaws in Wang et al.'s proof.
SCImago Journal & Country Rank

ISSN 1004-9037


Editorial Board
Author Guidelines
Journal of Data Acquisition and Processing
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China

E-mail: info@sjcjycl.cn
  Copyright ©2015 JCST, All Rights Reserved