Bimonthly    Since 1986
ISSN 1004-9037
Indexed in:
SCIE, Ei, INSPEC, JST, AJ, MR, CA, DBLP, etc.
Publication Details
Edited by: Editorial Board of Journal of Data Acquisition and Processing
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Published by: SCIENCE PRESS, BEIJING, CHINA
Distributed by:
China: All Local Post Offices
 
  • Table of Content
      05 March 2010, Volume 25 Issue 2   
    Special Section on CPU Researches in China
    Preface
    Wei-Wu Hu, Xue-Jun Yang, and Xiao-Wei Li
    Journal of Data Acquisition and Processing, 2010, 25 (2): 179-180. 
    Abstract   PDF(66KB) ( 1919 )  

    The CPU is the "heart" of information systems. As a key technology of the IT industry, it plays an increasingly important role in the national economy and information security. We are pleased to present this selection of nine papers in this Special Section on CPU Researches in China. The authors are from leading universities and research institutions. The achievements presented in these papers have been supported by projects from NSFC, 863, 973, etc. We believe that this collection represents state-of-the-art progress in the field of microprocessor research and development in China.

    The paper "System Architecture of Godson-3 Multi-Core Processors" by Xiang Gao et al. introduces the system architecture of Godson-3 from aspects including system scalability, organization of memory hierarchy, network-on-chip, inter-chip connection and I/O subsystem.

    The paper "Physical Implementation of the 1GHz Godson-3 Quad-Core Microprocessor" by Bao-Xia Fan et al. describes the design methodology of the physical implementation of Godson-3A, with particular emphasis on design flow, design methods for high frequency, clock tree design, power management, and on-chip variation issues.

    The paper "Research Progress of UniCore CPUs and PKUnity SoCs" by Xu Cheng et al. reviews the evolution of the UniCore CPU and the PKUnity SoC family, and introduces the hardware/software co-design platform.

    The paper "YHFT-QDSP: High-Performance Heterogeneous Multi-Core DSP" by Shu-Ming Chen et al. presents a novel heterogeneous multi-core DSP architecture and provides a simple parallel programming environment.

    The paper "Physical Design Methodology for Godson-2G Microprocessor" by Ji-Ye Zhao et al. proposes the design flow of the Godson-2G microprocessor and provides three physical design methodologies: interconnect-centric driven floorplan generation, auto-adapted boundary constraints design optimization, and automatic register group clock tree generation.

    The paper "Managing Data-Objects in Dynamically Reconfigurable Caches" by Xue-Jun Yang et al. proposes a quantitative framework for analyzing the cache requirement of data-objects, which includes cache capacity, block size, associativity and coherence protocol.

    The paper "Hierarchical Cache Directory for CMP" by Song-Liu Guo et al. introduces a hierarchical cache directory into CMP, which divides CMP tiles into multiple regions hierarchically and combines this with data replication; it also proposes a new directory organization to record the share status within a region and assist the regional home to complete operations efficiently.

    The paper "CCNoC: Cache-Coherent Network on Chip for Chip Multiprocessors" by Jing-Lei Wang et al. proposes cache coherent network on chip, a scheme that decouples cache coherency maintenance from processors and shared L2 caches and implements it completely in network on chip to free up processors and shared L2 caches from the chore of maintaining coherency.

    The paper "Design and Application of Instruction Set Simulator on Multi-Core Verification" by Xiang-Dong Hu et al. proposes a general methodology to expand a single-core instruction set simulator (ISS) to a multi-core ISS.

    The guest editors hope that the technological developments and empirical findings presented in this special section will encourage research in the related fields. We would like to thank all the referees who worked hard to review each paper and provided the authors with constructive comments. We sincerely hope that readers will enjoy this special section.

    System Architecture of Godson-3 Multi-Core Processors
    Xiang Gao, Yun-Ji Chen, Huan-Dong Wang, Dan Tang, and Wei-Wu Hu
    Journal of Data Acquisition and Processing, 2010, 25 (2): 181-191. 
    Abstract   PDF(439KB) ( 5222 )  

    Godson-3 is the latest generation of the Godson microprocessor family. It adopts a scalable multi-core architecture with hardware support for accelerating applications including X86 emulation and signal processing. This paper introduces the system architecture of Godson-3 from various aspects including system scalability, organization of memory hierarchy, network-on-chip, inter-chip connection and I/O subsystem.

    Physical Implementation of the 1GHz Godson-3 Quad-Core Microprocessor
    Bao-Xia Fan, Liang Yang, Jiang-Mei Wang, Ru Wang, Bin Xiao, Ying Xu, Dong Liu, and Ji-Ye Zhao
    Journal of Data Acquisition and Processing, 2010, 25 (2): 192-199. 
    Abstract   PDF(1006KB) ( 4035 )  

    The Godson-3A microprocessor is a quad-core version of the scalable Godson-3 multi-core series. It is physically implemented in a 65 nm CMOS process. This 174 mm² chip consists of 425 million transistors. The maximum frequency is 1 GHz with a maximum power consumption of 15 W. The main challenges of Godson-3A physical implementation include very large scale, high frequency requirement, sub-micron technology effects and aggressive time schedule. This paper describes the design methodology of the physical implementation of Godson-3A, with particular emphasis on design methods for high frequency, clock tree design, power management, and on-chip variation (OCV) issues.

    Research Progress of UniCore CPUs and PKUnity SoCs
    Xu Cheng, Senior Member, CCF, Xiao-Yin Wang, Jun-Lin Lu, Jiang-Fang Yi, Dong Tong, Xue-Tao Guan, Feng Liu, Xian-Hua Liu, Member, CCF, Chun Yang, and Yi Feng
    Journal of Data Acquisition and Processing, 2010, 25 (2): 200-213. 
    Abstract   PDF(2319KB) ( 1943 )  

    CPU and System-on-Chip (SoC) are two key technologies of the IT industry. Over ten years of research, we have defined the UniCore instruction set architecture, and designed the UniCore CPU and the PKUnity SoC family. This cross-disciplinary practice has also fostered many innovations in microprocessor architecture, optimizing compilers, low power design, functional verification, physical design, and so on. In the meantime, we have put technology transfer on the list of our top priorities. This effort has led to several marketable products, such as ultra mobile personal computers, secure micro-workstations and 3C-converged consumer electronics. The development of the next generation products, the 64-bit multi-core CPU and SoC, is also underway. They will find applications in secure and adaptable computers for mobile and desktop use, as well as in personal digital multimedia devices. Consistent with our philosophy and long-term plan, and by leveraging cutting-edge process technology, we will continue to innovate in CPUs and SoCs and strengthen our commitment to technology transfer.

    YHFT-QDSP: High-Performance Heterogeneous Multi-Core DSP
    Shu-Ming Chen, Member, CCF, Jiang-Hua Wan, Jian-Zhuang Lu, Zhong Liu, Hai-Yan Sun, Yong-Jie Sun, Member, CCF, Heng-Zhu Liu, Member, CCF, Xiang-Yuan Liu, Zhen-Tao Li, Yi Xu, and Xiao-Wen Chen
    Journal of Data Acquisition and Processing, 2010, 25 (2): 214-224. 
    Abstract   PDF(908KB) ( 2107 )  

    Multi-core architectures are widely used to enhance microprocessor performance with only a limited increase in time-to-market and chip power consumption. Targeting high-density data signal processing applications, this paper presents a novel heterogeneous multi-core digital signal processor (DSP), YHFT-QDSP, with one RISC CPU core and four VLIW DSP cores. Through three kinds of interconnection, YHFT-QDSP provides highly efficient message communication between the on-chip RISC core and DSP cores, among on-chip DSP cores, and among DSP cores on different chips. A parallel programming platform is specifically developed for the heterogeneous multi-core architecture of YHFT-QDSP. This parallel programming environment provides a parallel support library and a friendly interface between high level application software and the multi-core DSP. The custom chip design in 130 nm CMOS achieves high speed at moderate power. The results of typical benchmarks show that the interconnection structure of YHFT-QDSP outperforms other related structures and achieves better speedup when the interconnection facilities are used in combination. YHFT-QDSP has been signed off and manufactured. Future applications of the multi-core chip could be found in 3G wireless base stations, high performance radar, industrial applications, and so on.

    Physical Design Methodology for Godson-2G Microprocessor
    Ji-Ye Zhao, Dong Liu, Dan-Dan Huan, Meng-Hao Su, Bin Xiao, Ying Xu, Feng Shi, Chen Chen, and Song Wang
    Journal of Data Acquisition and Processing, 2010, 25 (2): 225-231. 
    Abstract   PDF(628KB) ( 1879 )  

    The Godson-2G microprocessor is a high performance SoC which integrates a four-issue 64-bit high performance CPU core (called GS464), a DDR2/3 controller, a HyperTransport controller, a PCI/PCI-X controller, etc. It is physically implemented in a 65 nm CMOS process and reaches a frequency of 1 GHz with power consumption of less than 4 W. The main challenges of Godson-2G physical implementation include nanometer process technology effects, high performance design targets, and a tight schedule. This paper describes the key innovative features of the physical design methodology used in the Godson-2G physical implementation, with particular emphasis on interconnect driven floorplan generation (ICD-FP), adapted boundary constraints design optimization (ABC-OPT), and automatic register group clock tree generation methodology (ARG-CTS).

    Managing Data-Objects in Dynamically Reconfigurable Caches
    Xue-Jun Yang, Member, CCF, ACM, IEEE, Jun-Jie Wu, Student Member, CCF, ACM, IEEE, Kun Zeng, and Yu-Hua Tang
    Journal of Data Acquisition and Processing, 2010, 25 (2): 232-245. 
    Abstract   PDF(705KB) ( 1765 )  

    The widening gap between processor and memory speeds makes the cache an important issue in computer system design. Compared with the working set of programs, cache resources are often scarce. Therefore, it is very important for a computer system to use the cache efficiently. For DOOC (Data-Object Oriented Cache), a recently proposed dynamically reconfigurable cache, this paper proposes a quantitative framework for analyzing the cache requirement of data-objects, which includes cache capacity, block size, associativity and coherence protocol. A graph coloring algorithm that deals with the competition between data-objects in the DOOC is proposed as well. Finally, we apply our approaches to the compiler management of DOOC. We test our approaches on both a single-core platform and a four-core platform. Compared with traditional caches, the DOOC achieves average miss rate reductions of 44.98% and 49.69% on the two platforms respectively, and its performance is very close to that of the ideal optimal cache.
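
    The abstract does not say which graph coloring algorithm is used for competing data-objects. As a rough illustration only, the sketch below applies a generic greedy coloring to a hypothetical conflict graph, where nodes stand for data-objects, an edge means two objects compete for the cache at the same time, and a color stands for a separate cache partition. The graph, the names and the highest-degree-first heuristic are assumptions, not the paper's method.

```python
def greedy_coloring(conflicts):
    """Give each node the smallest color not used by its already-colored neighbors.

    `conflicts` maps a node to the set of nodes it competes with (hypothetical input).
    """
    colors = {}
    # Heuristic (an assumption): color the most-constrained nodes first.
    for node in sorted(conflicts, key=lambda n: (-len(conflicts[n]), n)):
        taken = {colors[nb] for nb in conflicts[node] if nb in colors}
        color = 0
        while color in taken:
            color += 1
        colors[node] = color
    return colors


# Made-up example: A, B and C all conflict with each other; D conflicts only with A.
conflicts = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
partitions = greedy_coloring(conflicts)
```

    Objects that never conflict may share a color, so the number of distinct colors gives one upper bound on the partitions needed for this graph.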

    Hierarchical Cache Directory for CMP
    Song-Liu Guo, Hai-Xia Wang, Senior Member, CCF, Yi-Bo Xue, Senior Member, CCF, Chong-Min Li, Student Member, CCF, and Dong-Sheng Wang, Senior Member, CCF
    Journal of Data Acquisition and Processing, 2010, 25 (2): 246-256. 
    Abstract   PDF(830KB) ( 1925 )  

    As more processing cores are integrated into one chip and feature size continues to shrink, the average access latency for remote nodes using a directory-based coherence protocol becomes higher, which greatly impacts system performance. Previous techniques such as data replication and data migration optimize the performance of the requesting core, but offer little improvement for neighbor nodes. Other techniques such as in-transit optimization try to reduce latency at the cost of increased storage. This paper introduces a hierarchical cache directory into CMP (chip multiprocessor), which divides CMP tiles into multiple regions hierarchically, and combines it with data replication. A new directory organization is proposed to record the share status within a region and assist the regional home to complete operations efficiently. Simulation results show that for a 16-core CMP, compared to a traditional directory, the hierarchical cache directory reduces average access latency by 9% and on-chip network traffic by 34% on average with less storage. Theoretical analyses show that for a 2n × 2n tiled CMP, the average access latency in the hierarchical cache directory asymptotically approaches a function that is independent of n, hence the architecture is highly scalable.

    CCNoC: Cache-Coherent Network on Chip for Chip Multiprocessors
    Jing-Lei Wang, Yi-Bo Xue, Member, CCF, IEEE, Hai-Xia Wang, Member, CCF, IEEE, Chong-Min Li, and Dong-Sheng Wang, Senior Member, CCF, Member, IEEE
    Journal of Data Acquisition and Processing, 2010, 25 (2): 257-266. 
    Abstract   PDF(3318KB) ( 2357 )  

    As the number of cores in chip multiprocessors (CMPs) increases, the cache coherence protocol has become a key issue in the integration of chip multiprocessors. Supporting a cache coherence protocol in large chip multiprocessors still faces three hurdles: design complexity, performance and scalability. This paper proposes Cache Coherent Network on Chip (CCNoC), a scheme that decouples cache coherency maintenance from processors and shared L2 caches and implements it completely in the network on chip to free up processors and shared L2 caches from the chore of maintaining coherency, thereby reducing the design complexity of CMPs. In this way, CCNoC also improves the performance of the cache coherence protocol by reducing directory access latency, and enhances scalability by avoiding massive directory overhead in shared L2 caches. In CCNoC, coherence state caches and active directory caches are implemented in the network interface components of the network on chip to maintain cache coherence states for blocks in L1 caches and manage directory information for recently accessed blocks in L2 caches respectively. CCNoC provides a scalable CMP framework to tackle cache coherency, which is the foundation of CMP. This paper evaluates the performance of CCNoC. Experimental results show that for a 16-core system, CCNoC improves performance by 3% on average over the conventional chip multiprocessor and by 10% at best, while reducing storage overhead by 1.8% and saving directory storage by 88%, showing good scalability.

    Design and Application of Instruction Set Simulator on Multi-Core Verification
    Xiang-Dong Hu, Senior Member, CCF, Yong Guo, Ying Zhu, Xin Guo, and Peng Wang
    Journal of Data Acquisition and Processing, 2010, 25 (2): 267-273. 
    Abstract   PDF(370KB) ( 2430 )  

    An Instruction Set Simulator (ISS) is a highly abstracted and executable model of a microarchitecture. It is widely used for verification and debugging during the development of microprocessors. However, with the emergence of Chip Multi-Processors, the single-core ISS cannot meet the needs of microprocessor development. In this paper, we first introduce our multi-core chip architecture and then propose a general methodology to expand a single-core ISS to a multi-core ISS (MCISS). On this basis, a real-time comparison environment is created for multi-core verification, and the problems of multi-core communication and synchronization are addressed. With the "save and restore" mechanism, the verification procedure and debugging are sped up greatly.

    Computer Network and Internet
    Location, Localization, and Localizability
    Yunhao Liu, Member, ACM, Senior Member, IEEE, Zheng Yang, Student Member, ACM, IEEE, Xiaoping Wang, Student Member, IEEE, and Lirong Jian, Student Member, IEEE
    Journal of Data Acquisition and Processing, 2010, 25 (2): 274-297. 
    Abstract   PDF(1033KB) ( 3399 )  

    Location-aware technology spawns numerous unforeseen pervasive applications in a wide range of living, production, commerce, and public services. This article provides an overview of the location, localization, and localizability issues of wireless ad-hoc and sensor networks. Making data geographically meaningful, location information is essential for many applications, and it deeply aids a number of network functions, such as network routing, topology control, coverage, boundary detection, clustering, etc. We investigate a large body of existing localization approaches with a focus on error control and network localizability, two rising aspects that have attracted significant research interest in recent years. Error control aims to alleviate the negative impact of noisy ranging measurements and the error accumulation effect during the cooperative localization process. Network localizability provides theoretical analysis of the performance of localization approaches, providing guidance on network configuration and adjustment. We emphasize the basic principles of localization to understand the state-of-the-art and to address directions of future research in the new and largely open areas of location-aware technologies.

    ROAD+: Route Optimization with Additional Destination-Information and Its Mobility Management in Mobile Networks
    Moonseong Kim, Matt W. Mutka, Member, ACM, Senior Member, IEEE, Jeonghoon Park, and Hyunseung Choo, Member, ACM, IEEE
    Journal of Data Acquisition and Processing, 2010, 25 (2): 298-312. 
    Abstract   PDF(961KB) ( 1797 )  

    In the NEtwork MObility (NEMO) environment, mobile networks can form a nested structure. In nested mobile networks that use the NEMO Basic Support (NBS) protocol, pinball routing problems occur because packets are routed to all the home agents of the mobile routers using nested tunneling. In addition, the nodes in the same mobile networks can communicate with each other regardless of Internet connectivity. However, the nodes in some mobile networks that are based on NBS cannot communicate when the network is disconnected from the Internet. In this paper, we propose a route optimization scheme to solve these problems. We introduce a new IPv6 routing header named "destination-information header" (DH), which is used instead of routing header type 2 to optimize the route in the nested mobile network. The proposed scheme shows at least 30% better performance than ROTIO and a performance improvement similar to that of DBU in inter-route optimization. With respect to intra-route optimization, the proposed scheme always uses the optimal routing path. In addition, the handover mechanism in ROAD+ outperforms existing schemes and is less sensitive to network size than other existing schemes.

    Distributed Computing and Systems
    Tree-Based Index Overlay in Hybrid Peer-to-Peer Systems
    InSung Kang, SungJin Choi, Member, IEEE, SoonYoung Jung, and SangKeun Lee
    Journal of Data Acquisition and Processing, 2010, 25 (2): 313-329. 
    Abstract   PDF(828KB) ( 1899 )  

    Hybrid Peer-to-Peer (P2P) systems that construct structured overlay networks among superpeers have great potential in that they can offer benefits in scalability, search speed and network traffic, taking advantage of both superpeer-based and structured P2P systems. In this article, we enhance keyword search in hybrid P2P systems by constructing a tree-based index overlay among the directory nodes that maintain indices, according to the load and popularity of a keyword. The mathematical analysis shows that keyword search based on the semi-structured P2P overlay can improve search performance while reducing message traffic and maintenance costs.

    Location-Based Data Dissemination for Spatial Queries in Wireless Broadcast Environments
    Kwangjin Park and Hyunseung Choo
    Journal of Data Acquisition and Processing, 2010, 25 (2): 330-346. 
    Abstract   PDF(774KB) ( 1861 )  

    Most current research on Location-Based Services (LBSs, for short) assumes point-to-point wireless communication, where the server processes a query and returns the query result to the user via a point-to-point wireless channel. However, LBSs over point-to-point wireless channels suffer from a tremendous amount of traffic and service requests from users, which results in poor performance. In this paper, we present broadcast-based spatial query processing algorithms designed to support k-NN (k-Nearest Neighbor) and range queries via a wireless network. The task of the query processor is to selectively monitor the wireless broadcast channel, on which the server disseminates the data items according to their locations. Comprehensive experiments illustrate that the presented algorithms are highly scalable and are more efficient than previous techniques in terms of both access time and energy consumption.
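
    As background for the k-NN queries mentioned above, the sketch below shows a client keeping the k nearest items while listening to a stream of broadcast data items. It is a simplification under stated assumptions: it scans every item rather than tuning in selectively, and the stream format, item identifiers and the name knn_from_broadcast are hypothetical and not taken from the paper.

```python
import heapq
import math


def knn_from_broadcast(stream, query, k):
    """Return the ids of the k items nearest to `query`, nearest first.

    `stream` yields (item_id, (x, y)) pairs in broadcast order (hypothetical format).
    """
    heap = []  # max-heap on distance, stored as (-distance, item_id)
    for item_id, (x, y) in stream:
        d = math.hypot(x - query[0], y - query[1])
        if len(heap) < k:
            heapq.heappush(heap, (-d, item_id))
        elif d < -heap[0][0]:
            # Closer than the current k-th nearest: replace the farthest kept item.
            heapq.heapreplace(heap, (-d, item_id))
    return [item_id for _, item_id in sorted(heap, reverse=True)]


# Made-up broadcast cycle with four data items.
broadcast = [("a", (0, 0)), ("b", (5, 5)), ("c", (1, 1)), ("d", (2, 2))]
nearest = knn_from_broadcast(broadcast, query=(0, 0), k=2)
```

    The client needs only O(k) memory in this sketch, which suits a device that sees each broadcast item once per cycle.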

    An Effective Semantic Cache for Exploiting XPath Query/View Answerability
    Guo-Liang Li, Member, CCF, ACM, and Jian-Hua Feng, Senior Member, CCF, Member, ACM, IEEE
    Journal of Data Acquisition and Processing, 2010, 25 (2): 347-361. 
    Abstract   PDF(662KB) ( 2107 )  

    Maintaining a semantic cache of materialized XPath views inside or outside the database is a novel, feasible and efficient approach to facilitating XML query processing. However, most of the existing approaches have the following disadvantages: 1) they cannot discover enough potential cached views to effectively answer subsequent queries; or 2) they are inefficient for view selection due to the complexity of XPath expressions. In this paper, we propose SCEND, an effective Semantic Cache based on dEcompositioN and Divisibility, to exploit the XPath query/view answerability. The contributions of this paper include: 1) a novel technique of decomposing complex XPath queries into much simpler ones, which can facilitate discovering more potential views to answer a new query than the existing methods and thus can adequately exploit the query/view answerability; 2) an efficient view-selection method that checks the divisibility between two positive numbers assigned to queries and views; 3) a cache-replacement approach to further enhance the query/view answerability; 4) an extensive experimental study which demonstrates that our approach achieves higher performance and significantly outperforms the existing state-of-the-art alternative methods.
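
    The abstract does not describe how numbers are assigned to queries and views. The sketch below shows one common way a divisibility test can stand in for a containment test: each distinct component gets its own prime, a set of components is encoded as the product of its primes, and one set is contained in another exactly when its code divides the other's. The encoding, the path strings and all function names are assumptions for illustration, not SCEND's actual scheme.

```python
from math import prod


def first_primes(n):
    """Return the first n primes by trial division (adequate for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def build_codes(components):
    """Map each distinct component to its own prime (hypothetical encoding)."""
    distinct = sorted(set(components))
    return dict(zip(distinct, first_primes(len(distinct))))


def encode(parts, codes):
    """Encode a set of components as the product of their primes."""
    return prod(codes[p] for p in set(parts))


def divides(view_code, query_code):
    """True when every component of the view also occurs in the query."""
    return query_code % view_code == 0


# Made-up simple paths standing in for decomposed XPath expressions.
codes = build_codes(["/a", "/a/b", "/a/c", "/a/d"])
query = encode(["/a/b", "/a/c"], codes)
view_hit = encode(["/a/b"], codes)
view_miss = encode(["/a/d"], codes)
```

    With this encoding a candidate view is filtered by a single modulo operation instead of a structural comparison of expressions.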

    Information Security
    A Multi-Key Pirate Decoder Against Traitor Tracing Schemes
    Yong-Dong Wu, Member, IEEE, and Robert H. Deng, Member, IEEE
    Journal of Data Acquisition and Processing, 2010, 25 (2): 362-374. 
    Abstract   PDF(355KB) ( 2024 )  

    In this paper we introduce an architecture for a multi-key pirate decoder which employs decryption keys from multiple traitors. The decoder has built-in monitoring and self-protection functionalities and is capable of defeating most multiple-round based traitor tracing schemes, such as the schemes based on the black-box confirmation method. In particular, the proposed pirate decoder is customized to defeat the private key and the public key fully collusion resistant traitor tracing (FTT) schemes, respectively. We show how the decoder prolongs a trace process so that the tracer has to give up his effort. FTT schemes are designed to identify all the traitors. We show that the decoder allows the FTT schemes to identify at most one traitor. Finally, assuming the decoder is embedded with several bytes of memory, we demonstrate how the decoder is able to frame innocent users at will.

    Towards Risk Evaluation of Denial-of-Service Vulnerabilities in Security Protocols
    Zhen Cao, Zhi Guan, Zhong Chen, Member, IEEE, Jian-Bin Hu, and Li-Yong Tang
    Journal of Data Acquisition and Processing, 2010, 25 (2): 375-inside back cover. 
    Abstract   PDF(409KB) ( 1944 )  

    Denial-of-Service (DoS) attacks are virulent to both computer and networked systems. Modeling and evaluating DoS attacks are very important issues for networked systems; they provide both mathematical foundations and theoretical guidelines for security system design. As defense against DoS has been built more and more into security protocols, this paper studies how to evaluate the risk of DoS in security protocols. First, we build a formal framework to model protocol operations and attacker capabilities. Then we propose an economic model for the risk evaluation. By characterizing the intruder capability with a probability model, our risk evaluation model specifies the "Value-at-Risk" (VaR) for the security protocols. The "Value-at-Risk" represents how much computing resource is expected to be lost at a given level of confidence. The proposed model can help users better understand the protocols they are using, and in the meantime help designers examine their designs and find clues for improvement. Finally we apply the proposed model to analyze a key agreement protocol used in sensor networks and identify a DoS flaw there, and we also validate the applicability and effectiveness of our risk evaluation model by applying it to analyze and compare two public key authentication protocols.
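
    For readers unfamiliar with the term, the sketch below computes Value-at-Risk using its standard quantile definition: the smallest loss l such that the probability of losing at most l reaches the chosen confidence level. The paper's own economic model is not given in the abstract, so the discrete loss distribution, its numbers and the function name value_at_risk are purely illustrative assumptions.

```python
def value_at_risk(loss_distribution, confidence):
    """Smallest loss l with P(loss <= l) >= confidence.

    `loss_distribution` maps a loss amount to its probability (hypothetical input).
    """
    if not 0 < confidence <= 1:
        raise ValueError("confidence must be in (0, 1]")
    cumulative = 0.0
    for loss, probability in sorted(loss_distribution.items()):
        cumulative += probability
        # Small tolerance guards against floating-point rounding in the running sum.
        if cumulative >= confidence - 1e-12:
            return loss
    raise ValueError("probabilities sum to less than the confidence level")


# Made-up distribution of computing resources lost per protocol run.
losses = {0: 0.7, 10: 0.2, 100: 0.1}
var_90 = value_at_risk(losses, 0.90)
var_95 = value_at_risk(losses, 0.95)
```

    Raising the confidence level from 90% to 95% moves the result into the tail of this made-up distribution, which is why VaR is always quoted together with its confidence level.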

Journal of Data Acquisition and Processing
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China

E-mail: info@sjcjycl.cn
 