Efficient Parallel Implementations of Multipole Based N-Body Algorithms

William T. Rankin, III


Abstract

N-body problems pervade many different branches of numerical simulation. While the exact computation of the pairwise interactions between all N components of such a system is O(N^{2}) in complexity, approximate solutions often may be computed with O(N log N) or O(N) complexity.

This work presents an original design and implementation of a parallel, multipole-based N-body algorithm for particle simulation, with molecular dynamics simulation of biomolecules as the principal target application. The algorithm is designed for distributed memory processors and is portable to a wide variety of platforms. Performance data are presented for networks of workstations as well as for dedicated distributed multiprocessors, demonstrating the scalability of the implementation to large numbers of processors. Several extensions to the basic algorithm are presented that provide for the inclusion of periodic boundary conditions, infinite lattices, arbitrary potential functions and non-cubic simulation spaces.

Computation of an accurate virial tensor is a common step in molecular dynamics simulations involving constant pressure and periodic boundary conditions. The naive approach involves an all-pairs computation and is O(N^{2}) complexity. This dissertation presents a novel O(N) method for the computation of an approximate pressure tensor. The algorithm uses multipole expansions and hierarchical decomposition to produce results with a known error bound, while allowing integration with existing multipole-based N-body solvers. Results are presented that show the calculation of the full electrostatic virial tensor can be performed with a minimal amount of overhead.

In addition to obtaining high performance through efficient algorithms and scalability, implementations of distributed programs must also address issues associated with load imbalances. An analysis is performed to determine the impact of different data mapping schemes for this distributed N-body algorithm. Results are presented that show their impact upon both performance and communication overhead. A generalized load balancing functionality is implemented within the existing N-body application. This code is used to compare several different load balancing strategies that attempt to compensate for the inefficiencies caused by both irregular data distributions as well as external processor loads. Performance results are presented for both data and processor load imbalances.


rankin_disser.ps.gz

Back to the Scientific Computing Group Home Page


This page last modified 2/22/05 by wrankin