Bimonthly    Since 1986
ISSN 1004-9037
Indexed in:
SCIE, Ei, INSPEC, JST, AJ, MR, CA, DBLP, etc.
Publication Details
Edited by: Editorial Board of Journal of Data Acquisition and Processing
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Published by: SCIENCE PRESS, BEIJING, CHINA
Distributed by:
China: All Local Post Offices
 
  • Table of Contents
      05 March 2019, Volume 34 Issue 2
    Special Section of Advances in Computer Science and Technology—Current Advances in the NSFC Joint Research Fund for Overseas Chinese Scholars and Scholars in Hong Kong and Macao 2014-2017 (Part 2)
    Preface
    Su Song, Ke Liu, Zhi-Yong Liu
    Journal of Data Acquisition and Processing, 2019, 34 (2): 253-255. 
    Real-Time Avatar Pose Transfer and Motion Generation Using Locally Encoded Laplacian Offsets
    Masoud Zadghorban Lifkooee, Celong Liu, Yongqing Liang, Yimin Zhu, Xin Li
    Journal of Data Acquisition and Processing, 2019, 34 (2): 256-271. 
    We propose a human avatar representation scheme based on intrinsic coordinates, which are invariant to isometry and insensitive to human pose changes, together with an efficient pose transfer algorithm that uses this representation to reconstruct a human body geometry following a given pose. Such a pose transfer algorithm can be used to control the movement of an avatar model in a virtual reality environment, following a user's motion in real time. Our algorithm consists of three main steps. First, we recognize the user's pose and select a template model with a similar pose from the database; then, the intrinsic Laplacian offsets encoded in local coordinates are used to reconstruct the human body geometry following the template pose; finally, the morphing between the two poses is generated using linear interpolation. We perform experiments to evaluate the accuracy and efficiency of our algorithm, and we believe the proposed system is a promising human modeling tool for general virtual reality applications.
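The final morphing step, linear interpolation between two poses, can be sketched as follows. The flat per-vertex offset lists and all names here are illustrative assumptions, not the paper's actual locally encoded Laplacian representation.

```python
def lerp_offsets(src, dst, t):
    """Blend per-vertex offset vectors: t=0 gives the source pose,
    t=1 the target pose (hypothetical stand-in for the paper's
    locally encoded Laplacian offsets)."""
    return [[(1 - t) * a + t * b for a, b in zip(vs, vd)]
            for vs, vd in zip(src, dst)]

# Two toy poses, each with two vertices in 3-D:
src = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
dst = [[0.0, 2.0, 0.0], [1.0, 2.0, 0.0]]
mid = lerp_offsets(src, dst, 0.5)  # halfway between the two poses
```

Sampling t over [0, 1] yields the in-between frames of the morph.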
    Improving Data Utility Through Game Theory in Personalized Differential Privacy
    Lei Cui, Youyang Qu, Mohammad Reza Nosouhi, Shui Yu, Jian-Wei Niu, Gang Xie
    Journal of Data Acquisition and Processing, 2019, 34 (2): 272-286. 
    Due to the dramatic increase in information published on social networks, privacy issues have given rise to public concern. Although differential privacy provides privacy protection with theoretical foundations, the trade-off between privacy and data utility still demands further improvement, and most existing studies do not consider the quantitative impact of the adversary when measuring data utility. In this paper, we first propose a personalized differential privacy method based on social distance. Then, we analyze the maximum data utility when users and adversaries are blind to each other's strategy sets. We formalize all the payoff functions in the differential privacy sense, and then establish a static Bayesian game. The trade-off is calculated by deriving the Bayesian Nash equilibrium with a modified reinforcement learning algorithm. The proposed method achieves fast convergence by reducing the cardinality from n to 2. In addition, the in-place trade-off can maximize the user's data utility if the action sets of the user and the adversary are public while their strategy sets remain unrevealed. Extensive experiments on a real-world dataset show that the proposed model is effective and feasible.
    CATIRI: An Efficient Method for Content-and-Text Based Image Retrieval
    Mengqi Zeng, Bin Yao, Zhi-Jie Wang, Yanyan Shen, Feifei Li, Jianfeng Zhang, Hao Lin, Minyi Guo
    Journal of Data Acquisition and Processing, 2019, 34 (2): 287-304. 
    Combining visual and textual information in image retrieval remarkably alleviates the semantic gap of traditional image retrieval methods, and has thus attracted much attention recently. Image retrieval based on such a combination is usually called content-and-text based image retrieval (CTBIR). Nevertheless, existing studies in CTBIR mainly focus on improving retrieval quality; to the best of our knowledge, little attention has been paid to retrieval efficiency. Since image data is widespread and expanding rapidly in our daily life, retrieval efficiency is clearly important and interesting to investigate. To this end, this paper presents an efficient image retrieval method named CATIRI (content-and-text based image retrieval using indexing). CATIRI follows a three-phase solution framework built around a new indexing structure called the MHIM-tree, which seamlessly integrates Manhattan hashing, an inverted index, and the M-tree. To use the MHIM-tree effectively at query time, we present a set of important metrics and reveal their inherent properties, based on which we develop a top-k query algorithm for CTBIR. Experimental results on benchmark image datasets demonstrate that CATIRI outperforms its competitors by an order of magnitude.
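Manhattan hashing, one ingredient of the MHIM-tree, can be sketched roughly as follows: each dimension is quantized to a small integer code, and codes are compared by Manhattan (L1) distance rather than Hamming distance. The bit width, value range, and names are illustrative assumptions.

```python
def manhattan_hash(vec, bits=2, lo=0.0, hi=1.0):
    """Quantize each dimension of `vec` into a `bits`-bit integer code."""
    levels = (1 << bits) - 1
    return [min(levels, max(0, round((x - lo) / (hi - lo) * levels)))
            for x in vec]

def manhattan_dist(code1, code2):
    """Compare hash codes by Manhattan (L1) distance over the per-dimension
    integer codes, which preserves more neighborhood structure than
    Hamming distance on the raw bits."""
    return sum(abs(a - b) for a, b in zip(code1, code2))
```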
    Space Efficient Quantization for Deep Convolutional Neural Networks
    Dong-Di Zhao, Fan Li, Kashif Sharif, Guang-Min Xia, Yu Wang
    Journal of Data Acquisition and Processing, 2019, 34 (2): 305-317. 
    Deep convolutional neural networks (DCNNs) have shown outstanding performance in computer vision, natural language processing, and complex system analysis. As performance improves with deeper layers, DCNNs incur higher computational complexity and larger storage requirements, making it extremely difficult to deploy them on resource-limited embedded systems (such as mobile devices or Internet of Things devices). Network quantization efficiently reduces the storage space required by DCNNs, but their performance often drops rapidly as the quantization bit width decreases. In this article, we propose a space-efficient quantization scheme which uses eight or fewer bits to represent the original 32-bit weights. We adopt the singular value decomposition (SVD) method to decrease the parameter size of fully-connected layers for further compression, and we propose a weight clipping method based on a dynamic boundary to improve performance at lower precision. Experimental results demonstrate that our approach can achieve up to approximately 14x compression while preserving almost the same accuracy as the full-precision models. The proposed weight clipping method can also significantly improve the performance of DCNNs when lower precision is required.
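A minimal sketch of uniform symmetric quantization with a clipping boundary is shown below. The paper's dynamic boundary is replaced here by a fixed `clip` value, and all names are illustrative, not the authors' implementation.

```python
def quantize_weights(weights, bits=8, clip=None):
    """Map 32-bit float weights to `bits`-bit signed integer codes,
    clipping values outside [-bound, bound], and return
    (codes, dequantized weights)."""
    bound = clip if clip is not None else max(abs(w) for w in weights)
    levels = (1 << (bits - 1)) - 1        # e.g. 127 for 8 bits
    step = bound / levels                  # size of one quantization step
    codes = [max(-levels, min(levels, round(w / step))) for w in weights]
    dequant = [c * step for c in codes]
    return codes, dequant

codes, dequant = quantize_weights([0.25, -1.0, 2.0], bits=8, clip=1.0)
```

Note how the out-of-range weight 2.0 saturates at the clipping boundary; choosing that boundary well is exactly what the paper's dynamic method addresses.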
    A 2-Stage Strategy for Non-Stationary Signal Prediction and Recovery Using Iterative Filtering and Neural Network
    Feng Zhou, Hao-Min Zhou, Zhi-Hua Yang, Li-Hua Yang
    Journal of Data Acquisition and Processing, 2019, 34 (2): 318-338. 
    Predicting future values and recovering missing data are two vital tasks for time series in various application fields. Both are challenging, especially when the signal is nonlinear and non-stationary, which is common in practice. In this paper, we propose a hybrid 2-stage approach, named IF2FNN, to predict (both short-term and long-term) and recover general types of time series. In the first stage, we decompose the original non-stationary series into several “quasi-stationary” intrinsic mode functions (IMFs) by the iterative filtering (IF) method. In the second stage, all of the IMFs are fed as inputs to a factorization-machine-based neural network model to perform the prediction and recovery. We test the strategy on five datasets: an artificially constructed signal (ACS) and four real-world signals, namely the length of day (LOD), the northern hemisphere land-ocean temperature index (NHLTI), the troposphere monthly mean temperature (TMMT), and the National Association of Securities Dealers Automated Quotations (NASDAQ) index. The results are compared with those of other prevailing methods. Our experiments indicate that, under the same conditions, the proposed method outperforms the others for prediction and recovery according to metrics such as mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE).
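The three error metrics named above are standard; minimal reference implementations:

```python
import math

def mae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)

def rmse(y, y_hat):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))

def mape(y, y_hat):
    """Mean absolute percentage error (undefined when a true value is 0)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y, y_hat)) / len(y)
```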
    A Survey on Graph Processing Accelerators: Challenges and Opportunities
    Chuang-Yi Gui, Long Zheng, Bingsheng He, Cheng Liu, Xin-Yu Chen, Xiao-Fei Liao, Hai Jin
    Journal of Data Acquisition and Processing, 2019, 34 (2): 339-371. 
    The graph is a well-known data structure for representing associated relationships in a variety of applications, e.g., data science and machine learning. Despite a wealth of existing efforts on developing graph processing systems that improve performance and/or energy efficiency on traditional architectures, dedicated hardware solutions, also referred to as graph processing accelerators, are emerging as essential, providing benefits significantly beyond what pure software solutions can offer. In this paper, we conduct a systematic survey of the design and implementation of graph processing accelerators. Specifically, we review the relevant techniques in three core components of a graph processing accelerator: preprocessing, parallel graph computation, and runtime scheduling. We also examine the benchmarks and results used in existing studies to evaluate graph processing accelerators. Interestingly, we find that there is no absolute winner in any of these three aspects, owing to the diverse characteristics of graph processing and the complexity of hardware configurations. We finally present and discuss several challenges in detail, and explore opportunities for future research.
    Computer Networks and Distributed Computing
    Optimally Embedding 3-Ary n-Cubes into Grids
    Wei-Bei Fan, Jian-Xi Fan, Cheng-Kuan Lin, Yan Wang, Yue-Juan Han, Ru-Chuan Wang
    Journal of Data Acquisition and Processing, 2019, 34 (2): 372-387. 
    The 3-ary n-cube, denoted Qn3, is an important interconnection network topology proposed for parallel computers, owing to its many desirable properties such as a regular and symmetrical structure and strong scalability, among others. In this paper, we first obtain an exact formula for the minimum wirelength of embedding Qn3 into grids. We then propose a load-balancing algorithm for embedding Qn3 into a square grid with minimum dilation and congestion. Finally, we derive an O(N2) algorithm for embedding Qn3 into a grid with balanced communication, where N is the number of nodes in Qn3. Simulation experiments are performed to verify the total wirelength and evaluate the network cost of the proposed embedding algorithm.
    A New Approach to Multivariate Network Traffic Analysis
    Jinoh Kim, Alex Sim
    Journal of Data Acquisition and Processing, 2019, 34 (2): 388-402. 
    Network traffic analysis is one of the core functions in network monitoring for effective network operations and management. Although online traffic analysis has been widely studied, it remains highly challenging for several reasons. One primary challenge is the heavy volume of traffic to analyze within a finite amount of time, due to increasing network bandwidth. Another is supporting multivariate functions of traffic variables that help administrators identify unexpected network events intuitively. To this end, we propose a new multivariate-analysis approach that offers a high-level summary of online network traffic. With this approach, the current state of the network is displayed as patterns compiled from a set of traffic variables, and the detection problems in network monitoring (e.g., change detection and anomaly detection) reduce to a pattern identification and classification problem. In this paper, we introduce our preliminary work on clustered patterns for online, multivariate network traffic analysis, along with the challenges and limitations we observed. We then present a grid-based model designed to overcome the limitations of the clustered-pattern-based technique, and discuss the potential of the new model with respect to technical challenges including streaming-based computation and robustness to outliers.
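A grid-based summary of the kind described can be sketched as follows: each multivariate measurement is mapped to a grid cell, and a monitoring window is summarized by per-cell occupancy counts, which can then be compared across windows for change or anomaly detection. The cell size and names are our assumptions.

```python
from collections import Counter

def grid_cell(point, cell_size=10.0):
    """Map one multivariate traffic measurement to its grid cell id."""
    return tuple(int(x // cell_size) for x in point)

def grid_pattern(window, cell_size=10.0):
    """Summarize a window of measurements as per-cell occupancy counts;
    comparing such patterns over time supports change/anomaly detection."""
    return Counter(grid_cell(p, cell_size) for p in window)

pattern = grid_pattern([(5.0, 15.0), (7.0, 12.0), (25.0, 3.0)])
```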
    MicroBTC: Efficient, Flexible and Fair Micropayment for Bitcoin Using Hash Chains
    Zhi-Guo Wan, Robert H. Deng, David Lee, Ying Li
    Journal of Data Acquisition and Processing, 2019, 34 (2): 403-415. 
    While Bitcoin gains increasing popularity in different payment scenarios, its transaction fees make it difficult to apply to micropayments. Given the wide applicability of micropayments, it is crucial for all cryptocurrencies, including Bitcoin, to support them effectively. In light of this, a number of low-cost micropayment schemes for Bitcoin have recently been proposed. Existing schemes, however, suffer from drawbacks such as high computation cost, inflexible payment values, and possibly unfair exchanges. This paper proposes two new micropayment schemes for Bitcoin, the basic MicroBTC and the advanced MicroBTC, by integrating the hash chain technique into cryptocurrency transactions. The basic MicroBTC realizes micropayment by exposing hash pre-images on the hash chain one by one, and it can also make micropayments of arbitrary value by exposing multiple pre-images at once. We further design the advanced MicroBTC to achieve non-interactive refund and efficient hash chain verification. We analyze the complexity and security of both MicroBTC schemes and implement them on the Bitcoin source code. Extensive experiments validate their performance: a micropayment session is processed within about 18 ms for the basic MicroBTC and 9 ms for the advanced MicroBTC on a laptop. Both schemes achieve efficiency in computation, flexibility in micropayments, and fairness for both the payer and the payee.
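The underlying hash-chain mechanism can be sketched generically as follows (an illustration using SHA-256, not the authors' implementation): the payer commits to the chain tip, then releases one pre-image per micropayment, and the payee verifies a pre-image by re-hashing it back to the tip.

```python
import hashlib

def make_chain(seed: bytes, n: int):
    """Build a hash chain of length n+1; element 0 is the committed tip
    w0 = H^n(seed), and element i is a pre-image worth i micropayments."""
    chain = [seed]
    for _ in range(n):
        chain.append(hashlib.sha256(chain[-1]).digest())
    return chain[::-1]          # reversed so chain[0] is the tip

def verify_payment(tip: bytes, preimage: bytes, i: int) -> bool:
    """Accept `preimage` as payment of i units if hashing it i times
    reproduces the committed tip."""
    h = preimage
    for _ in range(i):
        h = hashlib.sha256(h).digest()
    return h == tip

chain = make_chain(b"payer-secret-seed", 8)
```

Releasing `chain[i]` directly pays i units in one step, which is how a hash-chain scheme can support payments of arbitrary value.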
    Software Systems
    Decomposing Composite Changes for Code Review and Regression Test Selection in Evolving Software
    Bo Guo, Young-Woo Kwon, Myoungkyu Song
    Journal of Data Acquisition and Processing, 2019, 34 (2): 416-436. 
    Inspecting and testing code changes typically require a significant amount of developer effort. As a system evolves, developers often create composite changes that mix multiple development issues, as opposed to addressing one independent issue with an atomic change. Inspecting composite changes is often time-consuming and error-prone, and rerunning all regression tests to cover unrelated edits may require excessive time. To address this problem, we present ChgCutter, an interactive change-decomposition technique that supports code review and regression test selection. When a developer selects a code change within a diff patch, ChgCutter partitions the composite change into a set of related atomic changes, each more cohesive and self-contained with respect to the issue it addresses. For composite change inspection, it uses program dependence relationships to generate an intermediate program version that includes only a related subset of the changes. To reduce regression testing cost, it safely selects only the tests affected by the changes in an intermediate version. In our evaluation, we apply ChgCutter to 28 composite changes in four open source projects. ChgCutter partitions these changes with 95.7% accuracy and selects affected tests with 89.0% accuracy. A user study with professional software engineers at PayPal finds that ChgCutter helps in understanding and validating composite changes, and scales to industry projects.
    On Identifying and Explaining Similarities in Android Apps
    Li Li, Tegawendé F. Bissyandé, Hao-Yu Wang, Jacques Klein
    Journal of Data Acquisition and Processing, 2019, 34 (2): 437-455. 
    App updates and repackaging are recurrent in the Android ecosystem, filling markets with similar apps that must be identified. Despite several approaches to improving the scalability of detecting repackaged/cloned apps, researchers and practitioners eventually face the need for comprehensive pairwise comparison (or simultaneous comparison of multiple apps) to understand and validate the similarities among apps. In this work, we present the design and implementation of SimiDroid, our research prototype for multi-level similarity comparison of Android apps. SimiDroid is built to support the comprehension of similarities and changes among app versions and among repackaged apps. In particular, we demonstrate the need for and usefulness of such a framework through case studies implementing different dissection scenarios, revealing various insights into how repackaged apps are built. We further show that the similarity comparison plugins implemented in SimiDroid yield more accurate results than the state of the art.
    Regular Paper
    Revisiting the Parallel Strategy for DOACROSS Loops
    Song Liu, Yuan-Zhen Cui, Nian-Jun Zou, Wen-Hao Zhu, Dong Zhang, Wei-Guo Wu
    Journal of Data Acquisition and Processing, 2019, 34 (2): 456-475. 
    DOACROSS loops are significant parts of many important scientific and engineering applications, and pipeline/wave-front parallelism is generally exploited from them via loop transformations. However, previous work mostly performs iterations in parallel threads statically, wasting computing resources on thread synchronization. This paper proposes a new parallel strategy for DOACROSS loops that provides dynamic task assignment with reduced dependences, achieving wave-front parallelism through loop tiling. The proposed strategy uses a master-slave parallel mode and customized structures to realize dynamic and flexible parallelization, which effectively keeps threads from waiting on communication. An efficient tile size selection (TSS) approach is also proposed to preserve data reuse in cache for the tiled code. The experimental results show that the proposed strategy obtains good and stable speedups on six typical benchmarks with different problem sizes and different numbers of threads on a 32-core Intel® Xeon® server. It outperforms two static strategies, a barrier-based strategy and a post/wait-based strategy, by 32% and 20% in average performance, respectively, and also performs better than a mutex-based dynamic strategy. In addition, the proposed TSS approach achieves near-optimal performance, comparable with a state-of-the-art TSS approach.
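Wave-front parallelism over tiles can be illustrated with a sketch that enumerates tile coordinates wave by wave: for a 2-D loop nest whose dependences point right and down, tiles on the same anti-diagonal carry no mutual dependence and may be assigned to threads dynamically. The names are illustrative, not the paper's implementation.

```python
def wavefront_waves(tiles_i, tiles_j):
    """Yield successive waves of tile coordinates for a tiles_i x tiles_j
    tiling; all tiles within one wave can execute in parallel once the
    previous wave has finished."""
    for wave in range(tiles_i + tiles_j - 1):
        yield [(i, wave - i) for i in range(tiles_i) if 0 <= wave - i < tiles_j]
```

A dynamic scheduler hands tiles of the current wave to idle threads, instead of binding rows of tiles to threads statically.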
    Plug-and-Play Based Optimization Algorithm for New Crime Density Estimation
    Xiang-Chu Feng, Chen-Ping Zhao, Si-Long Peng, Xi-Yuan Hu, Zhao-Wei Ouyang
    Journal of Data Acquisition and Processing, 2019, 34 (2): 476-493. 
    Unlike general density estimation, crime density estimation is subject to one important factor: geographical constraints. In this paper, a new crime density estimation model is formulated in which regions where crime cannot happen, such as mountains and lakes, are excluded. To further optimize the estimation, a learning-based algorithm named Plug-and-Play is embedded into the augmented Lagrangian scheme, which involves an off-the-shelf filtering operator. Different choices of the filtering operator make the algorithm correspond to several classical estimation models; the proposed Plug-and-Play optimization based estimation algorithm can therefore be regarded as an extended and general form of several classical methods. In the experiments, synthetic examples with different invalid regions and samples from various distributions are first tested. Then, under complex geographic constraints, we apply the proposed method to a real crime dataset to recover the density estimate. The state-of-the-art results show the feasibility of the proposed model.
    Lossless Compression of Random Forests
    Amichai Painsky, Saharon Rosset
    Journal of Data Acquisition and Processing, 2019, 34 (2): 494-506. 
    Ensemble methods are among the state-of-the-art predictive modeling approaches. Applied to modern big data, these methods often require a large number of sub-learners, where the complexity of each learner typically grows with the size of the dataset. This phenomenon results in an increasing demand for storage space, which may be very costly. This problem mostly manifests in a subscriber-based environment, where a user-specific ensemble needs to be stored on a personal device with strict storage limitations (such as a cellular device). In this work we introduce a novel method for lossless compression of tree-based ensemble methods, focusing on random forests. Our suggested method is based on probabilistic modeling of the ensemble's trees, followed by model clustering via Bregman divergence. This allows us to find a minimal set of models that provides an accurate description of the trees, and at the same time is small enough to store and maintain. Our compression scheme demonstrates high compression rates on a variety of modern datasets. Importantly, our scheme enables predictions from the compressed format and a perfect reconstruction of the original ensemble. In addition, we introduce a theoretically sound lossy compression scheme, which allows us to control the trade-off between the distortion and the coding rate.
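As an illustration of the clustering criterion, the KL divergence (a Bregman divergence) between two discrete distributions can be computed as below; the per-tree distributions being compared are whatever probabilistic summary the modeling step produces, and are not shown here.

```python
import math

def kl_divergence(p, q):
    """KL divergence D(p || q) between two discrete distributions,
    an instance of the Bregman divergences used to cluster the
    probabilistic models of the ensemble's trees."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
```

Clustering then seeks a small set of representative models minimizing the total divergence from each tree's model to its assigned representative.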
E-mail: info@sjcjycl.cn
Copyright ©2015 JCST, All Rights Reserved