Andrew Kerr, Gregory Diamos, Sudhakar Yalamanchili. Dynamic Compilation of Data-Parallel Kernels for Vector Processors. To appear in Code Generation and Optimization (CGO 2012). Paper (coming soon) (bibtex)
Gregory Diamos, Benjamin Ashbaugh, Subramaniam Maiyuran, Andrew Kerr, Haicheng Wu, Sudhakar Yalamanchili. SIMD Re-convergence at Thread Frontiers. 44th International Symposium on Microarchitecture (MICRO 44). Paper (bibtex) (presentation)
Andrew Kerr, Gregory Diamos, Sudhakar Yalamanchili. GPU Application Development, Debugging, and Performance Tuning with GPU Ocelot. GPU Computing GEMS, vol. 1, 2011. Paper (bibtex)
Gregory Diamos. Harmony: An Execution Model For Heterogeneous Systems. PhD Thesis. December 2011. Paper (bibtex)
Haicheng Wu, Gregory Diamos, Si Li, and Sudhakar Yalamanchili. Characterization and Transformation of Unstructured Control Flow in GPU Applications. The First International Workshop on Characterizing Applications for Heterogeneous Exascale Systems. June 2011. Paper (bibtex)
Naila Farooqui, Andrew Kerr, Gregory Diamos, Sudhakar Yalamanchili, and Karsten Schwan. A Framework for Dynamically Instrumenting GPU Compute Applications within GPU Ocelot. Fourth Workshop on General-Purpose Computation on Graphics Processing Units. March 2011. Paper (bibtex)
Gregory Diamos. An Execution Model and Runtime for Heterogeneous Many-Core Systems. PhD Dissertation Proposal. January 2010. Proposal
Gregory Diamos, Andrew Kerr, Sudhakar Yalamanchili, and Nathan Clark. Ocelot: A Dynamic Compiler for Bulk-Synchronous Applications in Heterogeneous Systems. The Nineteenth International Conference on Parallel Architectures and Compilation Techniques. September 2010. Paper (bibtex)
Andrew Kerr, Gregory Diamos, and Sudhakar Yalamanchili. Modeling GPU-CPU Workloads and Systems. Third Workshop on General-Purpose Computation on Graphics Processing Units. March 2010. Paper (bibtex)
Sudnya Padalikar and Gregory Diamos. Exploring The Latency and Bandwidth Tolerance of CUDA Applications. NFinTes Tech Report. December 2009. Paper
Gregory Diamos. The Design and Implementation of Ocelot's Dynamic Binary Translator from PTX to Multi-Core x86. CERCS Tech Report. December 2009. Paper
Gregory Diamos and Sudhakar Yalamanchili. Speculative Execution On Multi-GPU Systems. IEEE International Parallel & Distributed Processing Symposium (IPDPS 2010). April 2010. Paper (bibtex)
Gregory Diamos and Sudhakar Yalamanchili. Speculative Execution On Multi-GPU Systems. CERCS Tech Report. September 2009. Paper
Gregory Diamos and Sudhakar Yalamanchili. Harmony: An Execution Model and Runtime for Heterogeneous Many-Core Processors. High Performance Distributed Computing (HPDC08). June 2008. Paper (bibtex)
Andrew Kerr, Gregory Diamos, and Sudhakar Yalamanchili. A Characterization and Analysis of PTX Kernels. IEEE International Symposium on Workload Characterization (IISWC). October 2009. Paper
Gregory Diamos, Andrew Kerr, Mukil Kesavan. Translating GPU Binaries to Tiered Many-Core Architectures with Ocelot. Tech Report. January 2009. Paper
Gregory Diamos and Sudhakar Yalamanchili. Harmony: An Execution Model and Runtime for Heterogeneous Many-Core Processors. Tech Report. December 2007. Paper (bibtex)
Gregory Diamos. State Explosion: An Obvious Limitation to Strong Scaling. Short Paper. September 2009. Paper
Gregory Diamos and Sudhakar Yalamanchili. STARS: A System for Tuning and Automatically Reconfiguring SoC Links. Design Automation and Test in Europe (NoC Workshop) (DATE08). April 2008. Paper