Bimonthly    Since 1986
ISSN 1004-9037
Indexed in:
Publication Details
Edited by: Editorial Board of Journal of Data Acquisition and Processing
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Distributed by:
China: All Local Post Offices
  • Table of Content
      28 January 2022, Volume 37 Issue 1   
    For Selected: View Abstracts Toggle Thumbnails
    Special Section on Software Systems 2021
    Tao Xie, Shengchao Qin, Wenhui Zhang, Jun Sun, Lei Bu, and Ge Li
    Journal of Data Acquisition and Processing, 2022, 37 (1): 1-3. 
    Abstract   PDF(321KB) ( 155 )  
    Event-Based Semantics of UML 2.X Concurrent Sequence Diagrams for Formal Verification
    Inès Mouakher, Fatma Dhaou, and J. Christian Attiogbé
    Journal of Data Acquisition and Processing, 2022, 37 (1): 4-28. 

    UML 2.X sequence diagrams (SD) are among privileged scenarios-based approaches dealing with the complexity of modeling the behaviors of some current systems. However, there are several issues related to the standard semantics of UML 2.X SD proposed by the Object Management Group (OMG). They mainly concern ambiguities of the interpretation of SDs, and the computation of causal relations between events which is not specifically laid out. Moreover, SD is a semi-formal language, and it does not support the verification of the modeled system. This justifies the considerable number of research studies intending to define formal semantics of UML SDs. We proposed in our previous work semantics covering the most popular combined fragments (CF) of control-flow ALT, OPT, LOOP and SEQ, allowing to model alternative, optional, iterative and sequential behaviors respectively. The proposed semantics is based on partial order theory relations that permit the computation of the precedence relations between the events of an SD with nested CFs. We also addressed the issue of the evaluation of the interaction constraint (guard) for guarded CFs, and the related synchronization issue. In this paper, we first extend our semantics, proposed in our previous work; indeed, we propose new rules for the computation of causal relations for SD with PAR and STRICT CFs (dedicated to modeling concurrent and strict behaviors respectively) as well as their nesting. Then, we propose a transformational semantics in Event-B. Our modeling approach emphasizes computation of causal relations, guard handling and transformational semantics into Event-B. The transformation of UML 2.X SD into the formal method Event-B allows us to perform several kinds of verification including simulation, trace acceptance, verification of properties, and verification of refinement relation between SDs.

    DeltaFuzz: Historical Version Information Guided Fuzz Testing
    Jia-Ming Zhang, Zhan-Qi Cui, Xiang Chen, Huan-Huan Wu, Li-Wei Zheng, and Jian-Bin Liu
    Journal of Data Acquisition and Processing, 2022, 37 (1): 29-49. 
    With the widespread use of agile software development methods, such as agile and scrum, software is iteratively updated more frequently. To ensure the quality of the software, regression testing is conducted before new versions are released. Moreover, to improve the efficiency of regression testing, testing efforts should be concentrated on the modified and impacted parts of a program. However, the costs of manually constructing new test cases for the modified and impacted parts are relatively expensive. Fuzz testing is an effective method for generating test data automatically, but it is usually devoted to achieving higher code coverage, which makes fuzz testing unsuitable for direct regression testing scenarios. For this reason, we propose a fuzz testing method based on the guidance of historical version information. First, the differences between the program being tested and the last version are analyzed, and the results of the analysis are used to locate change points. Second, change impact analysis is performed to find the corresponding impacted basic blocks. Finally, the fitness values of test cases are calculated according to the execution traces, and new test cases are generated iteratively by the genetic algorithm. Based on the proposed method, we implement a prototype tool DeltaFuzz and conduct experiments on six open-source projects. Compared with the fuzzing tool AFLGo, AFLFast and AFL, DeltaFuzz can reach the target faster, and the time taken by DeltaFuzz was reduced by 20.59%, 30.05% and 32.61%, respectively.
    TOAST: Automated Testing of Object Transformers in Dynamic Software Updates
    Ze-Lin Zhao, Di Huang, and Xiao-Xing Ma
    Journal of Data Acquisition and Processing, 2022, 37 (1): 50-66. 
    Dynamic software update (DSU) patches programs on the fly. It often involves the critical task of object transformation that converts live objects of the old-version program to their semantically consistent counterparts under the new-version program. This task is accomplished by invoking an object transformer on each stale object. However, a defective transformer failing to maintain consistency would cause errors or even crash the program. We propose TOAST (Test Object trAnSformaTion), an automated approach to detecting potential inconsistency caused by object transformers. TOAST first analyzes an update to identify multiple target methods and then adopts a fuzzer with specially designed inconsistency guidance to randomly generate object states to drive two versions of a target method. This creates two corresponding execution traces and a pair of old and new objects. TOAST finally performs object transformation to create a transformed object and detects inconsistency between it and the corresponding new object produced from scratch by the new program. Moreover, TOAST checks behavior inconsistency by comparing the return variables and exceptions of the two executions. Experimental evaluation on 130 updates with default transformers shows that TOAST is promising: it got 96.0% precision and 85.7% recall in state inconsistency detection, and 81.4% precision and 94.6% recall in behavior inconsistency detection. The inconsistency guidance improved the fuzzing efficiency by 14.1% for state inconsistency detection and 40.5% for behavior inconsistency detection.
    Characterizing and Detecting Gas-Inefficient Patterns in Smart Contracts
    Que-Ping Kong, Zi-Yan Wang, Yuan Huang, Xiang-Ping Chen, Xiao-Cong Zhou, Zi-Bin Zheng, and Gang Huang
    Journal of Data Acquisition and Processing, 2022, 37 (1): 67-82. 
    Ethereum blockchain is a new internetware with tens of millions of smart contracts running on it. Different from general programs, smart contracts are decentralized, tamper-resistant and permanently running. Moreover, to avoid resource abuse, Ethereum charges users for deploying and invoking smart contracts according to the size of contract and the operations executed by contracts. It is necessary to optimize smart contracts to save money. However, since developers are not familiar with the operating environment of smart contracts (i.e., Ethereum virtual machine) or do not pay attention to resource consumption during development, there are many optimization opportunities for smart contracts. To fill this gap, this paper defines six gas-inefficient patterns from more than 25,000 posts and proposes an optimization approach at the source code level to let users know clearly where the contract is optimized. To evaluate the prevalence and economic benefits of gas-inefficient patterns, this paper conducts an empirical study on more than 160,000 real smart contracts. The promising experimental results demonstrate that 52.75% of contracts contain at least one gas-inefficient pattern proposed in this paper. If these patterns are removed from the contract, at least 0.30 can be saved per contract.
    Simulation Might Change Your Results: A Comparison of Context-Aware System Input Validation in Simulated and Physical Environments
    Jin-Chi Chen, Yi Qin, Hui-Yan Wang, and Chang Xu
    Journal of Data Acquisition and Processing, 2022, 37 (1): 83-105. 
    Context-aware systems (a.k.a. CASs) integrate cyber and physical space to provide adaptive functionalities in response to changes in context. Building context-aware systems is challenging due to the uncertain running environment. Therefore, many input validation approaches have been proposed to protect context-aware systems from uncertainty and keep them executing safely. However, in contrast to context-aware systems' prevailing in physical environments, most of those academic solutions (83%) are purely evaluated in simulated environments. In this article, we study whether this evaluation setting could lead to biased conclusions. We build a testing platform, RM-Testing, based on DJI RoboMaster robot car, to conduct the physical-environment based experiments. We select three up-to-date input validation approaches, and compare their performance in the simulated environment and in the physical environment. The experimental results show that all three approaches' performance in simulated environments (improving task success rate by 82% compared with the system without the support of input validation) does differ from their performance in a physical environment (improving the task success rate by 50%). We also recognize three factors (scenario setting, physical platform and environmental model) that affect the performance of input validation approaches, based on an execution model of the context-aware system.
    Meaningful Update and Repair of Markov Decision Processes for Self-Adaptive Systems
    Wen-Hua Yang, Min-Xue Pan, Yu Zhou, and Zhi-Qiu Huang
    Journal of Data Acquisition and Processing, 2022, 37 (1): 106-127. 
    Self-adaptive systems are able to adjust their behaviour in response to environmental condition changes and are widely deployed as Internetwares. Considered as a promising way to handle the ever-growing complexity of software systems, they have seen an increasing level of interest and are covering a variety of applications, e.g., autonomous car systems and adaptive network systems. Many approaches for the construction of self-adaptive systems have been developed, and probabilistic models, such as Markov decision processes (MDPs), are one of the favoured. However, the majority of them do not deal with the problems of the underlying MDP being obsolete under new environments or unsatisfactory to the given properties. This results in the generated policies from such MDP failing to guide the self-adaptive system to run correctly and meet goals. In this article, we propose a systematic approach to updating an obsolete MDP by exploring new states and transitions and removing obsolete ones, and repairing an unsatisfactory MDP by adjusting its structure in a more meaningful way rather than arbitrarily changing the transition probabilities to values not in line with reality. Experimental results show that the MDPs updated and repaired by our approach are more competent in guiding the self-adaptive systems' correct running compared with the original ones.
    Checking Causal Consistency of MongoDB
    Hong-Rong Ouyang, Heng-Feng Wei, Hai-Xiang Li, An-Qun Pan, and Yu Huang
    Journal of Data Acquisition and Processing, 2022, 37 (1): 128-146. 
    MongoDB is one of the first commercial distributed databases that support causal consistency. Its implementation of causal consistency combines several research ideas for achieving scalability, fault tolerance, and security. Given its inherent complexity, a natural question arises: "Has MongoDB correctly implemented causal consistency as it claimed?'' To address this concern, the Jepsen team has conducted black-box testing of MongoDB. However, this Jepsen testing has several drawbacks in terms of specification, test case generation, implementation of causal consistency checking algorithms, and testing scenarios, which undermine the credibility of its reports. In this work, we propose a more thorough design of Jepsen testing of causal consistency of MongoDB. Specifically, we fully implement the causal consistency checking algorithms proposed by Bouajjani et al. and test MongoDB against three well-known variants of causal consistency, namely CC, CCv, and CM, under various scenarios including node failures, data movement, and network partitions. In addition, we develop formal specifications of causal consistency and their checking algorithms in TLA+ , and verify them using the TLC model checker. We also explain how TLA+ specification can be related to Jepsen testing.
    GridDroid---An Effective and Efficient Approach for Android Repackaging Detection Based on Runtime Graphical User Interface
    Jun Ma, Qing-Wei Sun, Chang Xu, and Xian-Ping Tao
    Journal of Data Acquisition and Processing, 2022, 37 (1): 147-181. 
    Repackaging brings serious threats to Android ecosystem. Software birthmark techniques are typically applied to detect repackaged apps. Birthmarks based on apps' runtime graphical user interfaces (GUI) are effective, especially for obfuscated or encrypted apps. However, existing studies are time-consuming and not suitable for handling apps in large scale. In this paper, we propose an effective yet efficient dynamic GUI birthmark for Android apps. Briefly, we run an app with automatically generated GUI events and dump its layout after each event. We divide each dumped layout into a grid, count in each grid cell the vertices of boundary rectangles corresponding to widgets within the layout, and generate a feature vector to encode the layout. Similar layouts are merged at runtime, and finally we obtain a graph as the birthmark of the app. Given a pair of apps to be compared, we build a weighted bipartite graph from their birthmarks and apply a modified version of the maximum-weight-bipartite-matching algorithm to determine whether they form a repackaging pair (RP) or not. We implement the proposed technique in a prototype, GridDroid, and apply it to detect RPs in three datasets involving 527 apks. GridDroid reports only six false negatives and seven false positives, and it takes GridDroid merely 20 microseconds on average to compare a pair of birthmarks.
    Community Smell Occurrence Prediction on Multi-Granularity by Developer-Oriented Features and Process Metrics
    Zi-Jie Huang, Zhi-Qing Shao, Gui-Sheng Fan, Hui-Qun Yu, Xing-Guang Yang, and Kang Yang
    Journal of Data Acquisition and Processing, 2022, 37 (1): 182-206. 
    Community smells are sub-optimal developer community structures that hinder productivity. Prior studies performed smell prediction and provided refactoring guidelines from a top-down aspect to help community shepherds. Simultaneously, refactoring smells also requires bottom-up effort from every developer. However, supportive measures and guidelines for them are not available at a fine-grained level. Since recent work revealed developers' personalities and working states could influence community smells' emergence and variation, we build prediction models with experience, sentiment, and development process features of developers considering three smells including Organizational Silo, Lone Wolf, and Bottleneck, as well as two related classes including smelly developer and smelly quitter. We predict the five classes in the individual granularity, and we also generate forecasts for the number of smelly developers in the community granularity. The proposed models achieve F-measures ranging from 0.73 to 0.92 in individual-wide within-project, time-wise, and cross-project prediction, and mean R2 performance of 0.68 in community-wide Smelly Developer prediction. We also exploit SHAP (SHapley Additive exPlanations) to assess feature importance to explain our predictors. In conclusion, we suggest developers with heavy workload should foster more frequent communication in a straightforward and polite way to build healthier communities, and we recommend community shepherds to use the forecasting model for refactoring planning.
    Regular Paper
    MacroTrend: A Write-Efficient Cache Algorithm for NVM-Based Read Cache
    Ning Bao, Yun-Peng Chai, Xiao Qin, and Chuan-Wen Wang
    Journal of Data Acquisition and Processing, 2022, 37 (1): 207-230. 
    The future storage systems are expected to contain a wide variety of storage media and layers due to the rapid development of NVM (non-volatile memory) techniques. For NVM-based read caches, many kinds of NVM devices cannot stand frequent data updates due to limited write endurance or high energy consumption of writing. However, traditional cache algorithms have to update cached blocks frequently because it is difficult for them to predict long-term popularity according to such limited information about data blocks, such as only a single value or a queue that reflects frequency or recency. In this paper, we propose a new MacroTrend (macroscopic trend) prediction method to discover long-term hot blocks through blocks' macro trends illustrated by their access count histograms. And then a new cache replacement algorithm is designed based on the MacroTrend prediction to greatly reduce the write amount while improving the hit ratio. We conduct extensive experiments driven by a series of real-world traces and find that compared with LRU, MacroTrend can reduce the write amounts of NVM cache devices significantly with similar hit ratios, leading to longer NVM lifetime or less energy consumption.
    Correlated Differential Privacy of Multiparty Data Release in Machine Learning
    Jian-Zhe Zhao, Xing-Wei Wang, Ke-Ming Mao, Chen-Xi Huang, Yu-Kai Su, and Yu-Chen Li
    Journal of Data Acquisition and Processing, 2022, 37 (1): 231-251. 
    Differential privacy (DP) is widely employed for the private data release in the single-party scenario. Data utility could be degraded with noise generated by ubiquitous data correlation, and it is often addressed by sensitivity reduction with correlation analysis. However, increasing multiparty data release applications present new challenges for existing methods. In this paper, we propose a novel correlated differential privacy of the multiparty data release (MP-CRDP). It effectively reduces the merged dataset's dimensionality and correlated sensitivity in two steps to optimize the utility. We also propose a multiparty correlation analysis technique. Based on the prior knowledge of multiparty data, a more reasonable and rigorous standard is designed to measure the correlated degree, reducing correlated sensitivity, and thus improve the data utility. Moreover, by adding noise to the weights of machine learning algorithms and query noise to the release data, MP-CRDP provides the release technology for both low-noise private data and private machine learning algorithms. Comprehensive experiments demonstrate the effectiveness and practicability of the proposed method on the utilized Adult and Breast Cancer datasets.
    On the Discrete-Time Dynamics of Cross-Coupled Hebbian Algorithm
    Xiao-Wei Feng, Xiang-Yu Kong, Chuan He, and Dong-Hui Xu
    Journal of Data Acquisition and Processing, 2022, 37 (1): 252-265. 
    Principal/minor component analysis (PCA/MCA), generalized principal/minor component analysis (GPCA/GMCA), and singular value decomposition (SVD) algorithms are important techniques for feature extraction. In the convergence analysis of these algorithms, the deterministic discrete-time (DDT) method can reveal the dynamic behavior of PCA/MCA and GPCA/GMCA algorithms effectively. However, the dynamic behavior of SVD algorithms has not been studied quantitatively because of their special structure. In this paper, for the first time, we utilize the advantages of the DDT method in PCA algorithms analysis to study the dynamics of SVD algorithms. First, taking the cross-coupled Hebbian algorithm as an example, by concatenating the two cross-coupled variables into a single vector, we successfully get a PCA-like DDT system. Second, we analyze the discrete-time dynamic behavior and stability of the PCA-like DDT system in detail based on the DDT method, and obtain the boundedness of the weight vectors and learning rate. Moreover, further discussion shows the universality of the proposed method for analyzing other SVD algorithms. As a result, the proposed method provides a new way to study the dynamical convergence properties of SVD algorithms.
    A Blockchain-Based Protocol for Malicious Price Discrimination
    Li-De Xue, Ya-Jun Liu, Wei Yang, Wei-Lin Chen, and Liu-Sheng Huang
    Journal of Data Acquisition and Processing, 2022, 37 (1): 266-276. 
    Serious price discrimination emerges with the development of big data and mobile networks, which harms the interests of consumers. To solve this problem, we propose a blockchain-based price consensus protocol to solve the malicious price discrimination faced by consumers. We give a mathematical definition of price discrimination, which requires the system to satisfy consistency and timeliness. The distributed blockchain can make the different pricing of merchants transparent to consumers, thus satisfying the consistency. The aging window mechanism of our protocol ensures that there is no disagreement between any node on the consensus on price or price discrimination within a fixed period, which meets the timeliness. Moreover, we evaluate its performance through a prototype implementation and experiments with up to 100 user nodes. Experimental results show that our protocol achieves all the expected goals like price transparency, consistency, and timeliness, and it additionally guarantees the consensus of the optimal price with a high probability.

    Chandni Tiwari, Mukesh Kumar Sahu
    Journal of Data Acquisition and Processing, 2022, 37 (1): 277-281 . 

    Gaurav Verma, Sonam Tyagi
    Journal of Data Acquisition and Processing, 2022, 37 (1): 287-291 . 

    Chandni Tiwari, Mayurika Joshi
    Journal of Data Acquisition and Processing, 2022, 37 (1): 292-296 . 

    Devesh Shukla, Mukta Jakerria
    Journal of Data Acquisition and Processing, 2022, 37 (1): 297-302 . 

    Shashikant Radke, Prof. (Dr.) Alok Kumar, Prof. (Dr.) Raj Singh
    Journal of Data Acquisition and Processing, 2022, 37 (1): 303-308 . 

    Dr. Carmel Vip Conejos Derasin
    Journal of Data Acquisition and Processing, 2022, 37 (1): 309-315 . 
SCImago Journal & Country Rank

ISSN 1004-9037


Editorial Board
Author Guidelines
Journal of Data Acquisition and Processing
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China

E-mail: info@sjcjycl.cn
  Copyright ©2015 JCST, All Rights Reserved