Bimonthly    Since 1986
ISSN 1004-9037
Publication Details
Edited by: Editorial Board of Journal of Data Acquisition and Processing
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Distributed by:
China: All Local Post Offices
  • Table of Contents
      05 November 2016, Volume 31 Issue 6   
    Special Section on Data-Driven Design for Edge Network and Edge Cloud
    Wenwu Zhu
    Journal of Data Acquisition and Processing, 2016, 31 (6): 1069-1071. 
    Edge Video CDN: A Wi-Fi Content Hotspot Solution
    Wen Hu, Zhi Wang, Ming Ma, Li-Feng Sun
    Journal of Data Acquisition and Processing, 2016, 31 (6): 1072-1086. 
    The emergence of smart edge-network content hotspots, which are equipped with huge storage space (e.g., several GBs), makes it possible to study video delivery at the edge network. Different from both the conventional content delivery network (CDN) and the peer-to-peer (P2P) scheme, this new delivery paradigm, namely edge video CDN, requires up to millions of edge hotspots located at users' homes/offices to be coordinately managed to serve mobile video content. Specifically, two challenges are involved in building edge video CDN: how edge content hotspots should be organized to serve users, and how content items should be replicated to them at different locations. To address these challenges, we propose our data-driven design as follows. First, we formulate an edge region partition problem to jointly maximize the quality experienced by users and minimize the replication cost, which is NP-hard in nature, and we design a Voronoi-like partition algorithm to generate optimal service cells. Second, to replicate content items to edge-network content hotspots, we propose a replication strategy based on edge request prediction, which carries out the replication in a server peak offloading manner. We implement our design and use trace-driven experiments to verify its effectiveness. Compared with conventional centralized CDN and popularity-based replication, our design can improve users' quality of experience, in terms of users' perceived bandwidth and latency, by up to 40%.
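As a rough illustration of the partition idea in the abstract above, the sketch below assigns each user to the nearest hotspot, which is how a plain Voronoi partition forms its cells. It is not the authors' algorithm, which also accounts for replication cost; the function name `assign_cells` and all coordinates are hypothetical.

```python
import math

def assign_cells(hotspots, users):
    """Map each hotspot index to the list of users closest to it."""
    cells = {i: [] for i in range(len(hotspots))}
    for user in users:
        # Nearest-site rule: the defining property of a Voronoi cell.
        nearest = min(range(len(hotspots)),
                      key=lambda i: math.dist(hotspots[i], user))
        cells[nearest].append(user)
    return cells

# Hypothetical 2-D positions for two hotspots and three users.
hotspots = [(0.0, 0.0), (10.0, 0.0)]
users = [(1.0, 1.0), (9.0, -1.0), (4.0, 0.0)]
cells = assign_cells(hotspots, users)
```

A "Voronoi-like" scheme would presumably replace the pure distance key with a cost that mixes user-perceived quality and replication cost; that weighting is not given in the abstract.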
    CPA-VoD: Cloud and Peer-Assisted Video on Demand System for Mobile Devices
    Lei-Gen Cheng, Laizhong Cui, Yong Jiang
    Journal of Data Acquisition and Processing, 2016, 31 (6): 1087-1095. 
    With the rapid development of WiFi and 3G/4G, people tend to view videos on mobile devices. These devices are ubiquitous but have small memory to cache videos. As a result, in contrast to traditional computers, these devices aggravate the network pressure of content providers. Previous studies use CDN to solve this problem. However, its static leasing mechanism, in which the rental space cannot be dynamically adjusted, makes the operational cost soar and is incompatible with dynamic video delivery. In our study, based on a thorough analysis of user behavior from Tencent Video, a popular Chinese online video sharing platform, we identify two key user behaviors. Firstly, many users in the same region tend to watch the same video. Secondly, the popularity distribution of videos conforms with the Pareto principle, i.e., the top 20% most popular videos account for 80% of all video traffic. To exploit these observations, we propose and implement a novel cloud- and peer-assisted video on demand system (CPA-VoD). In the system, we group users in the same region as a peer swarm, and within a peer swarm, users can provide videos to other users by sharing their cached videos. In addition, we cache the 10% most popular videos in cloud servers to further alleviate the network pressure. We choose cloud servers to cache videos because the rental space can be dynamically adjusted. According to the evaluation on a real dataset from Tencent Video, CPA-VoD alleviates the network pressure and reduces the operational cost, with only 20.9% of traffic served by the content provider.
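The Pareto observation in the abstract above can be checked on any per-video view count list. The sketch below is only an illustration: the helper `top_share` and the sample counts are hypothetical and are not taken from the Tencent Video dataset.

```python
def top_share(views, fraction=0.2):
    """Fraction of total views captured by the top `fraction` of videos."""
    ranked = sorted(views, reverse=True)
    k = max(1, int(len(ranked) * fraction))  # number of "top" videos
    return sum(ranked[:k]) / sum(ranked)

# Hypothetical per-video view counts: one hit and four long-tail videos.
views = [800, 50, 50, 50, 50]
share = top_share(views)  # share of traffic from the top 20% of videos
```

On this toy input the single most popular video (the top 20% of five) carries 800 of 1000 views.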
    Interference-Limited Device-to-Device Multi-User Cooperation Scheme for Optimization of Edge Networking
    Hong-Cheng Huang, Jie Zhang, Zu-Fan Zhang, Zhong-Yang Xiong
    Journal of Data Acquisition and Processing, 2016, 31 (6): 1096-1109. 
    Device-to-device (D2D) communication is an emerging technology for improving cellular networks, which plays an important role in realizing the Internet of Things (IoT). The spectrum efficiency, energy efficiency and throughput of the network can be enhanced by cooperation among multiple D2D users in a self-organized manner. In order to limit the interference of D2D users and offload their energy consumption without decreasing communication quality, this paper proposes an interference-limited multi-user cooperation scheme for multiple D2D users that addresses both the energy problem and the interference problem. Multiple D2D users use non-orthogonal spectrums to form clusters in a self-organized manner. The D2D users are divided into different cooperative units. There is no interference among different cooperative units, which limits the interference experienced by each D2D user within a cooperative unit. When the link capacity cannot meet the required user rate, an outage event occurs. In order to evaluate the communication quality, the outage probability of a D2D link is derived by considering the link delay threshold, data rate and interference. Besides the energy availability and signal-to-noise ratio (SNR) of each D2D user, the distance between D2D users is considered when selecting the relaying D2D users so as to enhance the signal-to-interference-plus-noise ratio (SINR) of D2D receiving users. Using the derived outage probability, the relationships among the average link delay threshold, the energy efficiency and the capacity efficiency are studied. The simulation results show that the interference-limited multi-user D2D cooperation scheme can not only help to offload energy consumption and limit the interference of D2D users, but also enhance the energy efficiency and the capacity efficiency.
    Combined Cloud: A Mixture of Voluntary Cloud and Reserved Instance Marketplace
    Wei Shen, Wan-Chun Dou, Fan Wu, Shaojie Tang, Qiang Ni
    Journal of Data Acquisition and Processing, 2016, 31 (6): 1110-1123. 
    Voluntary cloud is a new paradigm of cloud computing. It provides an alternative to well-provisioned clouds. However, because the time span for which participants share their computing resources in a voluntary cloud is uncertain, there are some challenging issues, i.e., fluctuation, under-capacity and low benefit. In this paper, an architecture based on the BitTorrent protocol is first proposed. In this architecture, resources can be reserved or requested from the Reserved Instance Marketplace and can be accessed at a lower price over a short cycle. These resources can replenish the inadequate resource pool and relieve the fluctuation and under-capacity issues in the voluntary cloud. Then, the fault rate of each node is used to evaluate the uncertainty of its sharing time. By leveraging a linear prediction model, this evaluation is enabled by a distribution function, which is used for evaluating the computing capacity of the system. Moreover, the cost optimization problem is investigated and a computational method is presented to solve the low-benefit issue in the voluntary cloud. Finally, the system performance is validated by two sets of simulations, and the experimental results show the effectiveness of our computational method for resource reservation optimization.
    Semi-Homogenous Generalization: Improving Homogenous Generalization for Privacy Preservation in Cloud Computing
    Xian-Mang He, Xiaoyang Sean Wang, Member, CCF, ACM, IEEE, Dong Li, Yan-Ni Hao
    Journal of Data Acquisition and Processing, 2016, 31 (6): 1124-1135. 
    Data security is one of the leading concerns and primary challenges for cloud computing. This issue is becoming more serious with the development of cloud computing. However, the existing privacy-preserving data sharing techniques either fail to prevent the leakage of privacy or incur huge amounts of information loss. In this paper, we propose a novel technique, termed the linking-based anonymity model, which achieves K-anonymity with quasi-identifier groups (QI-groups) having a size less than K. Meanwhile, a semi-homogenous generalization is introduced to resist the attack incurred by homogenous generalization. To implement the linking-based anonymization model, we propose a simple yet efficient heuristic local recoding method. Extensive experiments on real datasets are also conducted to show that the utility is significantly improved by our approach compared with state-of-the-art methods.
    Regular Paper
    Using Computational Intelligence Algorithms to Solve the Coalition Structure Generation Problem in Coalitional Skill Games
    Yang Liu, Guo-Fu Zhang, Member, IEEE, Zhao-Pin Su, Member, IEEE, Feng Yue, Jian-Guo Jiang, Senior Member, CCF
    Journal of Data Acquisition and Processing, 2016, 31 (6): 1136-1150. 
    Coalitional skill games (CSGs) are a simple model of cooperation in an uncertain environment where each agent has a set of skills that are required to accomplish a variety of tasks and each task requires a set of skills to be completed, but each skill is very hard to quantify and can only be expressed qualitatively. Thus far, many computational questions surrounding CSGs have been studied. However, to the best of our knowledge, the coalition structure generation problem (CSGP), a central issue of CSGs, is extremely challenging and has not been well solved. To this end, two different computational intelligence algorithms are evaluated here: binary particle swarm optimization (BPSO) and binary differential evolution (BDE). In particular, we develop the two stochastic search algorithms with two-dimensional binary encoding and a corresponding heuristic for individual repairs. After that, we discuss some fundamental properties of the proposed heuristic. Finally, we compare the improved BPSO and BDE with state-of-the-art algorithms for solving CSGP in CSGs. The experimental results show that our algorithms can find the same near-optimal solutions as the existing approaches but in far less time, especially for large problem sizes.
    A Tensor Neural Network with Layerwise Pretraining: Towards Effective Answer Retrieval
    Xin-Qi Bao, Yun-Fang Wu
    Journal of Data Acquisition and Processing, 2016, 31 (6): 1151-1160. 
    In this paper, we address the answer retrieval problem in community-based question answering. To fully capture the interactions between question-answer pairs, we propose an original tensor neural network to model the relevance between them. The question and candidate answers are separately embedded into different latent semantic spaces, and a 3-way tensor is then utilized to model the interactions between latent semantics. To initialize the network layers properly, we propose a novel algorithm called denoising tensor autoencoder (DTAE), and then implement a layerwise pretraining strategy using denoising autoencoders (DAE) on word embedding layers and DTAE on the tensor layer. The experimental results show that our tensor neural network outperforms various baselines, including other competitive neural network methods, and that our DTAE pretraining strategy improves the system's performance and robustness.
    Enhanced Userspace and In-Kernel Trace Filtering for Production Systems
    Suchakrapani Datt Sharma, Student Member, IEEE, Michel Dagenais, Senior Member, IEEE
    Journal of Data Acquisition and Processing, 2016, 31 (6): 1161-1178. 
    Trace tools like LTTng have a very low impact on the traced software compared with traditional debuggers. However, for long runs in resource-constrained and high-throughput environments, such as embedded network switching nodes and production servers, the collective tracing impact on the target software adds up considerably. The overhead is not just in terms of execution time but also in terms of the huge amount of data to be stored, processed and analyzed offline. This paper presents a novel way of dealing with such huge trace data generation by introducing a Just-In-Time (JIT) filter based tracing system, which sieves through the flood of high-frequency events and records only those that are relevant, when a specific condition is met. With a tiny filtering cost, the user can filter out most events and focus only on the events of interest. We show that in certain scenarios, the JIT compiled filters prove to be three times more effective than similar interpreted filters. We also show that with an increasing number of filter predicates and context variables, the benefits of JIT compilation increase, with some JIT compiled filters being even three times faster than their interpreted counterparts. We further present a new architecture, using our filtering system, which can enable co-operative tracing between kernel and process tracing VMs (virtual machines) that share data efficiently. We demonstrate its use through a tracing scenario where the user can dynamically specify syscall latency through the userspace tracing VM, whose effect is reflected in tracing decisions made by the kernel tracing VM. We compare the data access performance on our shared memory system and show an almost 100 times improvement over traditional data sharing for co-operative tracing.
    Reducing the Upper Bound Delay by Optimizing Bank-to-Core Mapping
    Ji-Zan Zhang, Zhi-Min Gu, Member, CCF, Ming-Quan Zhang, Member, CCF
    Journal of Data Acquisition and Processing, 2016, 31 (6): 1179-1193. 
    Nowadays, inter-task interferences are the main difficulty in analyzing the timing behavior of multicores. The timing-predictable embedded multicore architecture MERASA, which allows safe worst-case execution time (WCET) estimations, has emerged as an attractive solution. In the architecture, WCET can be estimated by the upper bound delay (UBD), which can be bounded by the interference-aware bus arbiter (IABA) and dynamic cache partitioning such as columnization or bankization. However, this architecture faces a dilemma between decreasing UBD and efficient shared cache utilization. To obtain tighter WCET estimation, we propose a novel approach that reduces UBD by optimizing bank-to-core mapping on the multicore system with IABA and the two-level partitioned cache. For this, we first present a new UBD computation model based on the analysis of inter-task interference delay, and then put forward the core-sequence optimization method of bank-to-core mapping and the optimizing algorithms with the minimum UBD. Experimental results demonstrate that our approach can reduce WCET by 4% to 37%.
    Efficient Metric All-k-Nearest-Neighbor Search on Datasets Without Any Index
    Hai-Da Zhang, Zhi-Hao Xing, Lu Chen, Yun-Jun Gao, Senior Member, CCF, Member, ACM, IEEE
    Journal of Data Acquisition and Processing, 2016, 31 (6): 1194-1211. 
    An all-k-nearest-neighbor (AkNN) query finds k nearest neighbors for each query object. This problem arises naturally in many areas, such as GIS (geographic information system), multimedia retrieval, and recommender systems. To support various data types and flexible distance metrics involved in real applications, we study AkNN retrieval in metric spaces, namely, metric AkNN (MAkNN) search. We consider the case where the underlying indexes on the query set and the object set may not exist, which is natural in many scenarios. For example, the query set and the object set could be the results of other queries, and thus the underlying indexes cannot be built in advance. To support MAkNN search on datasets without any underlying index, we propose an efficient disk-based algorithm, termed the Partition-Based MAkNN Algorithm (PMA), which follows a partition-search framework and employs a series of pruning rules for accelerating the search. In addition, we extend our techniques to tackle an interesting variant of MAkNN queries, i.e., metric self-AkNN (MSAkNN) search, where the query set is identical to the object set. Extensive experiments using both real and synthetic datasets demonstrate the effectiveness of our pruning rules and the efficiency of the proposed algorithms, compared with state-of-the-art MAkNN and MSAkNN algorithms.
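To make the query definition in the abstract above concrete, the sketch below answers an AkNN query by brute force under an arbitrary distance function, with no index. It is only a baseline for illustration, not the PMA algorithm; `all_knn`, `hamming` and the sample strings are hypothetical.

```python
import heapq

def all_knn(queries, objects, k, dist):
    """For each query, return its k nearest objects under the metric `dist`."""
    return {q: heapq.nsmallest(k, objects, key=lambda o: dist(q, o))
            for q in queries}

def hamming(a, b):
    """Hamming distance between two equal-length strings (a valid metric)."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical example: strings rather than vectors, since a metric space
# only needs a distance function, not coordinates.
result = all_knn(["cat"], ["cot", "dog", "cog"], 2, hamming)
```

This baseline evaluates the distance for every query-object pair; the pruning rules mentioned in the abstract aim to avoid most of those evaluations.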
    An Efficient Approach of Processing Multiple Continuous Queries
    Wen Liu, Yan-Ming Shen, Member, CCF, Peng Wang
    Journal of Data Acquisition and Processing, 2016, 31 (6): 1212-1227. 
    As stream data is being more frequently collected and analyzed, stream processing systems are faced with more design challenges. One challenge is to perform continuous window aggregation, which involves intensive computation. When there are a large number of aggregation queries, the system may suffer from scalability problems. The queries are usually similar and only differ in window specifications. In this paper, we propose collaborative aggregation, which promotes aggregate sharing among the windows so that repeated aggregate operations can be avoided. Different from the previous approaches in which the aggregate sharing is restricted by the window pace, we generalize the aggregation over multiple values as a series of reductions. Therefore, the results generated by each reduction step can be shared. The sharing process is formalized in the feed semantics, and we present the compose-and-declare framework to determine the data sharing logic at a very low cost. Experimental results show that our approach offers an order of magnitude performance improvement over the state-of-the-art results and has a small memory footprint.
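The general idea of sharing work among window queries that differ only in window size can be sketched with prefix sums: one pass over the stream serves every SUM query. This is a simple stand-in for illustration and is not the collaborative aggregation or compose-and-declare framework described above; `shared_window_sums` is a hypothetical name.

```python
from itertools import accumulate

def shared_window_sums(stream, window_sizes):
    """Answer several sliding-window SUM queries from one shared prefix-sum pass."""
    prefix = [0] + list(accumulate(stream))  # prefix[i] = sum of first i values
    results = {}
    for w in window_sizes:
        # Each window sum is a difference of two shared prefix values.
        results[w] = [prefix[i] - prefix[i - w]
                      for i in range(w, len(prefix))]
    return results

sums = shared_window_sums([1, 2, 3, 4, 5], [2, 3])
```

Prefix sums only work for invertible aggregates such as SUM or COUNT; aggregates like MAX need a different sharing structure.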
    A Buffer Scheduling Method Based on Message Priority in Delay Tolerant Networks
    En Wang, Yong-Jian Yang, Jie Wu, Fellow, IEEE, Wen-Bin Liu
    Journal of Data Acquisition and Processing, 2016, 31 (6): 1228-1245. 
    Routing protocols in delay tolerant networks usually utilize multiple message copies to guarantee message delivery, in order to overcome unpredictable node mobility and easily-interrupted connections. A store-carry-and-forward paradigm was also proposed to further improve message delivery. However, excessive message copies lead to a shortage of buffer space and bandwidth. The spray and wait routing protocol has been proposed to reduce the network overload caused by the buffering and transmission of unrestricted message copies. However, when a node's buffer is quite constrained, congestion problems still exist. In this paper, we propose a message scheduling and drop strategy on spray and wait routing protocol (SDSRP). To improve the delivery ratio, SDSRP first calculates the priority of each message by evaluating the impact of both replicating and dropping a message copy on the delivery ratio. Subsequently, scheduling and drop decisions are made according to the priority. In order to further increase the delivery ratio, we propose an improved message scheduling and drop strategy on spray and wait routing protocol (ISDSRP) that enhances the accuracy of parameter estimation. Finally, we conduct extensive simulations based on synthetic and real traces in the ONE simulator. The results show that compared with other buffer management strategies, ISDSRP and SDSRP achieve a higher delivery ratio, similar average hop counts, and a lower overhead ratio.
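The priority-driven scheduling and drop decisions described above can be pictured with a bounded buffer that forwards the highest-priority message first and drops the lowest-priority one when full. This is a minimal sketch under assumptions: the priority values here are made-up numbers, whereas SDSRP derives them from the estimated impact on delivery ratio, and `admit` is a hypothetical helper.

```python
def admit(buffer, message, capacity):
    """Insert a (priority, msg_id) pair; if over capacity, drop the lowest priority.

    Returns the dropped message, or None if nothing was dropped.
    """
    buffer.append(message)
    buffer.sort(reverse=True)  # highest priority first = forwarding order
    return buffer.pop() if len(buffer) > capacity else None

# Hypothetical node buffer holding at most two messages.
buffer = [(0.9, "a"), (0.4, "b")]
dropped = admit(buffer, (0.6, "c"), capacity=2)
```

After the call, the buffer keeps messages "a" and "c" in forwarding order, and "b" is the one dropped.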
    A Script-Based Prototyping Framework to Boost Agile-UX Developments
    Pedro Luis Mateo Navarro, Gregorio Martínez Pérez, Member, IEEE, Diego Sevilla Ruiz
    Journal of Data Acquisition and Processing, 2016, 31 (6): 1246-1261. 
    Prototypes are described as a successful mechanism to incorporate user-experience design (UX) into Agile developments, but their integration into such developments is not exempt from difficulties. Prototypes and final applications are often developed using different tools, which hinders the collaboration between designers and developers and also complicates reuse. Moreover, integrating stakeholders such as clients and users into the Agile process of designing, evaluating, and refining a prototype is not straightforward, mainly because of its iterative nature. In an attempt to tackle these problems, this work presents the design and implementation of a new framework in which scripting languages are used to code prototyped behaviors. Prototyping is then treated as a separate aspect that coexists and runs together with final functionality. Using this framework, communication is enhanced because designers and developers work in parallel on the same software artifact. Prototypes are fully reused and iteratively extended with final functionality while prototyped behaviors are removed. They can also be modified on the fly to implement participatory design techniques.
    Highly Optimized Code Generation for Stencil Codes with Computation Reuse for GPUs
    Wen-Jing Ma, Member, CCF, ACM, Kan Gao, Guo-Ping Long
    Journal of Data Acquisition and Processing, 2016, 31 (6): 1262-1274. 
    Computation reuse is known as an effective optimization technique. However, due to the complexity of modern GPU architectures, there is not yet enough understanding of how the interplay of computation reuse and hardware specifics affects application performance. In this paper, we propose an automatic code generator for a class of stencil codes with inherent computation reuse on GPUs. For such applications, the proper reuse of intermediate results, combined with careful register and on-chip local memory usage, has profound implications on performance. The current state of the art does not address this problem in depth, partially due to the lack of a good program representation that can expose all potential computation reuse. We leverage the computation overlap graph (COG), a simple representation of data dependence and data reuse with "element view", to expose potential reuse opportunities. Using COG, we propose a portable code generation and tuning framework for GPUs. Compared with current state-of-the-art code generators, our experimental results show up to 56.7% performance improvement on modern GPUs such as NVIDIA C2050.
Journal of Data Acquisition and Processing
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China

E-mail: info@sjcjycl.cn
  Copyright ©2015 JCST, All Rights Reserved