## Linear algebra

- Freely Available Software for Linear Algebra
- linear system solvers:
- sparse direct:
- iterative:
- AMG:
- CUSP
- AMGCL (OpenCL, CUDA, OpenMP backends, build-phase only on CPU)
- PyAMG
- AmgX
- ViennaCL
- Hypre (publications, GitHub, documentation)
- DUNE-ISTL
- OpenFOAM (geometric agglomerated algebraic multigrid solver)
- RAPtor
- Paralution
- SParSH-AMG (paper)

- see also Libraries for solving sparse linear systems and GPU-accelerated libraries for solving sparse linear systems on StackExchange

## C++

For a more general list see cppreference.com and Awesome C++.

- metaprogramming
- command-line argument parsing
- clipp (single header, powerful and expressive syntax, usage & doc generation)
- args (header-only, resembles Python's `argparse`)
- boost::program_options
- TCLAP (header-only)
- CLI11 (header-only, single-file, config files in INI format)
- argh (minimalist, only `[]` and `()` operators, no exceptions for failures)
- Clara (header-only, single-file, composable)

- logging
- XML parsing

### Data structures and linear algebra

- FLENS: C++11 header-only library reimplementing BLAS and LAPACK (seems inactive as of 2017)
- Armadillo: C++ linear algebra library
- https://bitbucket.org/blaze-lib/blaze
- tensors, n-dimensional arrays:
- general lists at Wikipedia and StackExchange
- http://cpptruths.blogspot.cz/2011/10/multi-dimensional-arrays-in-c11.html
- http://www.nongnu.org/tensors/ (last commit in 2012)
- https://bitbucket.org/wlandry/ftensor/src
- https://github.com/tensor-compiler/taco
- http://itensor.org/
- Eigen tensors - Many operations, expression templates, either pure-static or pure-dynamic sizes, only column-major format (row-major support is incomplete), little GPU support.
- cudarrays - Only up to 3D arrays, both static and dynamic, compile-time permutations using `std::tuple`.
- RAJA - No memory management, views are initialized with a raw pointer, index permutations are initialized at runtime, only dynamic dimensions.
- Kokkos - Configurable layout and default selection based on the memory/execution space, but only AoS and SoA are considered, even for \(N>2\). For parallel work there is only one leading dimension - it does not map to 2D or 3D CUDA grids.
- Awkward Array - a library for nested, variable-sized data, including arbitrary-length lists, records, mixed types, and missing data, using NumPy-like idioms (Python library with C++ backend).
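Several of the notes above distinguish libraries by their memory layout (for example column-major only, or layouts selected per memory space). As a minimal sketch of what that distinction means, the following C++ functions compute the linear offset of an element in an N-dimensional array for the two most common layouts. The function names `row_major_offset` and `column_major_offset` are hypothetical and are not taken from any of the listed libraries.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper (not part of any listed library).
// Row-major (C-style) layout: the LAST index varies fastest in memory.
template <std::size_t N>
std::size_t row_major_offset(const std::array<std::size_t, N>& sizes,
                             const std::array<std::size_t, N>& idx)
{
    std::size_t offset = 0;
    // Horner-like accumulation from the first dimension to the last.
    for (std::size_t d = 0; d < N; ++d)
        offset = offset * sizes[d] + idx[d];
    return offset;
}

// Hypothetical helper (not part of any listed library).
// Column-major (Fortran-style) layout: the FIRST index varies fastest in memory.
template <std::size_t N>
std::size_t column_major_offset(const std::array<std::size_t, N>& sizes,
                                const std::array<std::size_t, N>& idx)
{
    std::size_t offset = 0;
    // Same accumulation, but from the last dimension to the first.
    for (std::size_t d = N; d-- > 0;)
        offset = offset * sizes[d] + idx[d];
    return offset;
}
```

A general index permutation, as offered by some of the libraries above, would apply the same accumulation over the dimensions in a permuted order; the two functions here are just the two extreme cases.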

- Tpetra - parallel sparse linear algebra (Trilinos package)
- VexCL: vector expression template library for OpenCL/CUDA
- http://libelemental.org/about/
- k-d trees:
- http://kaldi-asr.org/doc/about.html - they have an integrated matrix library (wrapper for BLAS, LAPACK, etc.), including CUDA classes
- LRU cache

## PDE solution

### Finite elements

- List of finite element software packages on Wikipedia
- Deal.II (tutorial)
- DUNE
- FEniCS (components, demo)
- MFEM (has some GPU support)
- FreeFEM
- ElmerFEM
- libMesh – a framework for the numerical simulation of partial differential equations using arbitrary unstructured discretizations on serial and parallel platforms, supports adaptive refinement
- Hermes
- GetFEM++
- NGSolve
- Gridap.jl (user guide)
- Kratos Multi-Physics
- MoFEM (paper)
- Coolfluid 3 (last change in 2015) – mesh data structure, mesh operations
- GPUTUM (FEM solver) – only tetrahedral and triangular meshes, last change in 2016

### Finite volumes

- OpenFOAM – a FVM framework for CFD
- FiPy
- CFL3D
- Clawpack/Pyclaw
- SU2
- Nalu – control volume finite element (CVFEM) and edge-based vertex centered (EBVC)

### LBM

### Immersed boundary

### Particle-in-cell

### Domain decomposition

- hpddm
- LibGeoDecomp – an auto-parallelizing library for computer simulations (stencil codes, short-ranged n-body, meshfree methods, particle-in-cell codes)

## Meshes

Note: many libraries declare "GPU support", but most of the time this means only that the resulting linear algebra, or at most some finite element operations, are *accelerated* on the GPU.
The mesh itself is still allocated on the host, so arbitrary user kernels involving the mesh are not possible.

### Data formats

### General data structures

- OpenMesh – generic data structure for representing and manipulating polygonal meshes (only boundary representation, no volumetric meshes)
- ViennaMesh (mesh generator), ViennaGrid (C++ library, does not support GPUs)
- SCOREC (includes module PUMI: Parallel Unstructured Mesh Infrastructure (pdf))
- DUNE grid interface (very general interface specification and multiple implementations designed for different applications, there is no implementation with GPU support)
- MOAB (Mesh-Oriented datABase) – can store structured and unstructured meshes, implements the ITAPS iMesh interface, supports common parallel mesh operations like parallel import and export (to/from a single HDF5-based file), parallel ghost exchange, communication of field data, and general sending and receiving of mesh and metadata between processors
- PETSc object DMPLEX
- Structured Adaptive Mesh Refinement Application Infrastructure (SAMRAI)
- omega_h (simplex mesh adaptivity using MPI, OpenMP, CUDA) – does not build with CUDA 11.2
- bitpit mesh modules – PArallel Balanced Linear Octree, dynamic load-balancing with MPI, surface and volume meshes, based on C++ STL data structures
- GMlib – GPU computing on unstructured meshes (OpenCL)

### Numerical simulation frameworks

- Deal.II (tutorial)
- OpenFOAM mesh description
- libMesh – a FEM framework for the numerical simulation of partial differential equations using arbitrary unstructured discretizations on serial and parallel platforms, supports adaptive refinement

### Generators

- CGAL: The Computational Geometry Algorithms Library
- CUBIT – geometry and mesh generation toolkit (manual)
- MMG – moving mesh generator

### Visualization / post-processing

- VTK
- VCGlib – C++ templated library for manipulation, processing and displaying with OpenGL of triangle and tetrahedral meshes (only boundary representation, no volumetric meshes)

### Stencils / structured grids

- https://op-dsl.github.io/
- http://www.libgeodecomp.org/
- https://github.com/naoyam/physis/
- https://code.google.com/archive/p/patus/

## CUDA

- https://github.com/harrism/hemi
- https://github.com/eyalroz/cuda-api-wrappers
- https://github.com/NVlabs/cub
- https://github.com/ComputationalRadiationPhysics/cuda_memtest

## Python

- pandas:
- https://github.com/woseseltops/pyscreen
- solvers:
- visualization:

## Data file formats

## MPI

- Open MPI's hwloc
- MVAPICH benchmarks

## Job schedulers / workload managers

