IDA - Department of Computer and Information Science

List of Papers

Zhang A Q, Goens A, Oswald N, et al.
PipeGen: Automated Transformation of a Single-Core Pipeline into a Multicore Pipeline for a Given Memory Consistency Model
Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques. 2024: 1-13.
Wang Y, Li B, Jaleel A, et al.
GRIT: Enhancing Multi-GPU Performance with Fine-Grained Dynamic Page Placement
2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 2024: 1080-1094.
J. Handy and T. Coughlin.
Semiconductor Architectures Enable Compute in Memory
Computer, vol. 56, no. 5, pp. 126-129, May 2023
Muthukrishnan H, Nellans D, Lustig D, et al.
Efficient Multi-GPU Shared Memory via Automatic Optimization of Fine-Grained Transfers
Proc. 48th Annual International Symposium on Computer Architecture, 2021
Jang J W, Lee S, Kim D, et al.
Sparsity-Aware and Re-configurable NPU Architecture for Samsung Flagship Mobile SoC
Proc. 48th Annual International Symposium on Computer Architecture, 2021
Wang M, Ta T, Cheng L, et al.
Efficiently Supporting Dynamic Task Parallelism on Heterogeneous Cache-Coherent Systems
Proc. 47th Annual International Symposium on Computer Architecture, 2020
K. Wang, et al.
IntelliNoC: A Holistic Design Framework for Energy-Efficient and Reliable On-Chip Communication for Manycores
46th International Symposium on Computer Architecture, 2019
A. Pattnaik, et al.
Opportunistic Computing in GPU Architectures
46th International Symposium on Computer Architecture, 2019
Jeff Dean, David Patterson, and Cliff Young (Google Brain)
A New Golden Age in Computer Architecture: Empowering the Machine-Learning Revolution
Architectural Support for Programming Languages and Operating Systems (ASPLOS'17)
Norman P. Jouppi, et al.
In-Datacenter Performance Analysis of a Tensor Processing Unit
Proc. 44th International Symposium on Computer Architecture, 2017.
Martin Abadi, et al.
TensorFlow: A System for Large-Scale Machine Learning
Proc. 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2016.
Shawn Hershey, et al.
CNN Architectures for Large-Scale Audio Classification
Proc. IEEE ICASSP, 2017.
Gorkem Asilioglu, et al.
LaZy superscalar
ISCA, 2015.
Eyerman, Stijn and Eeckhout, Lieven
The Benefit of SMT in the Multi-core Era: Flexibility Towards Degrees of Thread-level Parallelism
ASPLOS, 2014.
E. Shiu and J. Ko (Google)
System design challenges for future consumer devices: From glass to Chromebooks
ICEP, 2016.
Arkaprava Basu, et al. (AMD Research)
Software Assisted Hardware Cache Coherence for Heterogeneous Processors
MEMSYS, 2016.
Whitepaper sponsored by AMD
HSA: A New Architecture for Heterogeneous Computing
TIRIAS research, 2013.
Loi, I; Benini, L
A Multi Banked - Multi Ported - non Blocking Shared L2 Cache for MPSoC Platforms
Design, Automation and Test in Europe Conference and Exhibition (DATE), 2014.
Pricopi, M; Mitra, T
Bahurupi: A polymorphic heterogeneous multi-core architecture
ACM Transactions on Architecture and Code Optimization (TACO), 2012.
Minji Kim ; Jinyong Lee ; Younglok Kim
Fast and flexible pipelined multi-processor architecture for multimedia device
7th International Symposium on Communication Systems Networks and Digital Signal Processing (CSNDSP), 2010
Shekofteh, S.K. ; Deldari, H. ; Khalkhali, M.B.
Reducing cache contention in a multi-core processor via a scheduler
3rd International Conference on Advanced Computer Theory and Engineering (ICACTE), 2010
Cohen, J. ; Garland, M.
Novel Architectures: Solving Computational Problems with GPU Computing
Computing in Science & Engineering, 2009
Chengming Zou; Chunfen Xia; Guanghui Zhao
Numerical Parallel Processing Based on GPU with CUDA Architecture
International Conference on Wireless Networks and Information Systems (WNIS), 2009
Zamith, M.P.M. ; Clua, E.W.G. ; Conci, A. ; Montenegro, A.
Parallel processing between GPU and CPU: Concepts in a game architecture
Computer Graphics, Imaging and Visualisation (CGIV), 2007
Hankins, R.A.; Chinya, G.N.; Collins, J.D.; Wang, P.H.; Rakvic, R.; Hong Wang; Shen, J.P.
Multiple Instruction Stream Processor
33rd Intl. Symp. on Computer Architecture (ISCA), pp. 114-127, 2006.
Jichuan Chang; Sohi, G.S.
Cooperative Caching for Chip Multiprocessors
33rd Intl. Symp. on Computer Architecture (ISCA), pp. 264-276, 2006.
Alameldeen, A.R.; Wood, D.A.
Interactions Between Compression and Prefetching in Chip Multiprocessors
13th Intl. Symp. on High Performance Computer Architecture (HPCA), pp. 228-239, 2007.
Strauss, K., Shen, X., and Torrellas, J. 2006.
Flexible Snooping: Adaptive Forwarding and Filtering of Snoops in Embedded-Ring Multiprocessors.
33rd Ann. Intl. Symp. on Computer Architecture (ISCA), pp. 327-338.
Hoseok Chang, Junho Cho, and Wonyong Sung. 2006.
Performance Evaluation of an SIMD Architecture with a Multi-bank Vector Memory Unit.
IEEE Work. on Signal Processing Systems Design and Implementation (SIPS), pp. 71-76.

Page responsible: Zebo Peng
Last updated: 2025-09-16

Department of Computer and Information Science
Linköping University
581 83 LINKÖPING
Tel: +46 13 28 10 00
