List of Papers

  1. J. Handy and T. Coughlin.
    Semiconductor Architectures Enable Compute in Memory
    Computer, vol. 56, no. 5, pp. 126-129, May 2023


  2. Muthukrishnan H, Nellans D, Lustig D, et al.
    Efficient Multi-GPU Shared Memory via Automatic Optimization of Fine-Grained Transfers
    Proc. 48th Annual International Symposium on Computer Architecture, 2021


  3. Jang J W, Lee S, Kim D, et al.
    Sparsity-Aware and Re-configurable NPU Architecture for Samsung Flagship Mobile SoC
    Proc. 48th Annual International Symposium on Computer Architecture, 2021


  4. Wang M, Ta T, Cheng L, et al.
    Efficiently Supporting Dynamic Task Parallelism on Heterogeneous Cache-Coherent Systems
    Proc. 47th Annual International Symposium on Computer Architecture, 2020


  5. K. Wang, et al.
    IntelliNoC: A Holistic Design Framework for Energy-Efficient and Reliable On-Chip Communication for Manycores
    46th International Symposium on Computer Architecture, 2019


  6. A. Pattnaik, et al.
    Opportunistic Computing in GPU Architectures
    46th International Symposium on Computer Architecture, 2019


  7. Jeff Dean, David Patterson, and Cliff Young (Google Brain)
    A New Golden Age in Computer Architecture: Empowering the Machine-Learning Revolution
    Architectural Support for Programming Languages and Operating Systems (ASPLOS'17)


  8. Norman P. Jouppi, et al.
    In-Datacenter Performance Analysis of a Tensor Processing Unit
    Proc. 44th International Symposium on Computer Architecture, 2017.


  9. Martin Abadi, et al.
    TensorFlow: A System for Large-Scale Machine Learning
    Proc. 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2016.


  10. Shawn Hershey, et al.
    CNN Architectures for Large-Scale Audio Classification
    Proc. IEEE ICASSP, 2017.


  11. Gorkem Asilioglu, et al.
    LaZy superscalar
    ISCA, 2015.


  12. Eyerman, Stijn and Eeckhout, Lieven
    The Benefit of SMT in the Multi-core Era: Flexibility Towards Degrees of Thread-level Parallelism
    ASPLOS, 2014.


  13. E. Shiu and J. Ko (Google)
    System design challenges for future consumer devices: From glass to Chromebooks
    ICEP, 2016.


  14. Arkaprava Basu, et al. (AMD Research)
    Software Assisted Hardware Cache Coherence for Heterogeneous Processors
    MEMSYS, 2016.


  15. Whitepaper sponsored by AMD
    HSA: A New Architecture for Heterogeneous Computing
    TIRIAS research, 2013.


  16. Loi, I; Benini, L
    A Multi Banked - Multi Ported - non Blocking Shared L2 Cache for MPSoC Platforms
    Design, Automation and Test in Europe Conference and Exhibition (DATE), 2014.


  17. Pricopi, M; Mitra, T
    Bahurupi: A polymorphic heterogeneous multi-core architecture
    ACM Transactions on Architecture and Code Optimization (TACO), 2012.


  18. Minji Kim ; Jinyong Lee ; Younglok Kim
    Fast and flexible pipelined multi-processor architecture for multimedia device
    7th International Symposium on Communication Systems Networks and Digital Signal Processing (CSNDSP), 2010


  19. Shekofteh, S.K. ; Deldari, H. ; Khalkhali, M.B.
    Reducing cache contention in a multi-core processor via a scheduler
    3rd International Conference on Advanced Computer Theory and Engineering (ICACTE), 2010


  20. Cohen, J. ; Garland, M.
    Novel Architectures: Solving Computational Problems with GPU Computing
    Computing in Science & Engineering, 2009


  21. Chengming Zou; Chunfen Xia; Guanghui Zhao
    Numerical Parallel Processing Based on GPU with CUDA Architecture
    International Conference on Wireless Networks and Information Systems (WNIS), 2009


  22. Zamith, M.P.M. ; Clua, E.W.G. ; Conci, A. ; Montenegro, A.
    Parallel processing between GPU and CPU: Concepts in a game architecture
    Computer Graphics, Imaging and Visualisation (CGIV), 2007


  23. Hankins, R.A.; Chinya, G.N.; Collins, J.D.; Wang, P.H.; Rakvic, R.; Hong Wang; Shen, J.P.
    Multiple Instruction Stream Processor
    33rd Intl. Symp. on Computer Architecture (ISCA), pp. 114-127, 2006.


  24. Jichuan Chang; Sohi, G.S.
    Cooperative Caching for Chip Multiprocessors
    33rd Intl. Symp. on Computer Architecture (ISCA), pp. 264-276, 2006.


  25. Alameldeen, A.R.; Wood, D.A.
    Interactions Between Compression and Prefetching in Chip Multiprocessors
    13th Intl. Symp. on High Performance Computer Architecture (HPCA), pp. 228-239, 2007.


  26. Strauss, K., Shen, X., and Torrellas, J. 2006.
    Flexible Snooping: Adaptive Forwarding and Filtering of Snoops in Embedded-Ring Multiprocessors.
    33rd Ann. Intl. Symp. on Computer Architecture (ISCA), pp. 327-338.


  27. Hoseok Chang, Junho Cho, and Wonyong Sung. 2006.
    Performance Evaluation of an SIMD Architecture with a Multi-bank Vector Memory Unit.
    IEEE Work. on Signal Processing Systems Design and Implementation (SIPS), pp. 71-76.



Page responsible: Zebo Peng
Last updated: 2023-09-04