Data structures and algorithms for SpMV that are efficiently implemented on the CUDA platform for the fine-grained parallel architecture of the GPU are developed, along with methods that exploit several common forms of matrix structure while offering alternatives which accommodate greater irregularity.

Proceedings of the Conference on High Performance…

14 November 2009

TLDR

This work explores SpMV methods that are well-suited to throughput-oriented architectures like the GPU and which exploit several common sparsity classes, including structured grid and unstructured mesh matrices.

This chapter demonstrates how to leverage the Thrust parallel template library to implement high-performance applications with minimal programming effort.

Algebraic multigrid methods for large, sparse linear systems are a necessity in many computational simulations, yet parallel algorithms for such solvers are generally decomposed into coarse-grained...

This paper presents a simple and effective method for granular material simulation that generalizes this discrete model to rigid bodies by distributing particles over their surfaces and achieves two-way coupling between granular materials and rigid bodies.

The implementation is fully general and the optimization strategy adaptively processes the SpGEMM workload row-wise to substantially improve performance by decreasing the work complexity and utilizing the memory hierarchy more effectively.

This paper shows that a previous least-squares formulation for distortion minimization reduces to a Laplacian system on a general graph structure, for which it derives an analytic expression, and describes an efficient multigrid algorithm for solving the relevant equations.

The algorithms, features, and implementation of PyDEC, a Python library for computations related to the discretization of exterior calculus, are described; these computations map well to the facilities of numerical libraries such as NumPy and SciPy.

This chapter demonstrates how to leverage the Thrust parallel template library to implement high-performance applications with minimal programming effort. Based on the C++ Standard Template Library…
