Changeset [cd77853c485eaa41c369e17806924123e4308648] by Jed Brown
November 28th, 2008 @ 01:29 AM
First pass optimizing TensorMult_Hex.
- Speedup from <500 MFLOPS to 1150 MFLOPS. Once the kernel is unrolled over the last dimension, and possibly hand-optimized for SSE, 3 GFLOPS or more should be possible as long as the working set still fits in L1D.
Signed-off-by: Jed Brown <jed@59A2.org>
http://github.com/jedbrown/dohp/...
Committed by Jed Brown
- M src/jacobi/impls/tensor/efstopo.c
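For readers unfamiliar with the kind of kernel this commit message refers to, the following is a minimal sketch of contracting the last index of a tensor with a small dense matrix, first as a plain loop and then with the inner loop unrolled by hand. It is not the TensorMult_Hex code from efstopo.c: the function names, the row-major layout, and the fixed size of 4 in the unrolled variant are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper, NOT the TensorMult_Hex routine from efstopo.c.
 * Contracts the last index of x (rows x P, row-major, assumed layout)
 * with the matrix A (Q x P, row-major):
 *   y[r*Q + j] = sum_k A[j*P + k] * x[r*P + k]
 * This costs 2*rows*P*Q floating-point operations. */
static void tensor_mult_last(size_t rows, size_t P, size_t Q,
                             const double *A, const double *x, double *y)
{
  for (size_t r = 0; r < rows; r++) {
    for (size_t j = 0; j < Q; j++) {
      double sum = 0.0;
      for (size_t k = 0; k < P; k++) sum += A[j*P + k] * x[r*P + k];
      y[r*Q + j] = sum;
    }
  }
}

/* Same contraction with the loop over the last dimension unrolled by
 * hand for the assumed special case P == 4. The four inputs of each row
 * are loaded once and reused for every output entry j. */
static void tensor_mult_last_p4(size_t rows, size_t Q,
                                const double *A, const double *x, double *y)
{
  for (size_t r = 0; r < rows; r++) {
    const double x0 = x[r*4 + 0], x1 = x[r*4 + 1];
    const double x2 = x[r*4 + 2], x3 = x[r*4 + 3];
    for (size_t j = 0; j < Q; j++) {
      const double *a = A + j*4;
      y[r*Q + j] = a[0]*x0 + a[1]*x1 + a[2]*x2 + a[3]*x3;
    }
  }
}
```

Fixing the trip count at compile time removes the loop overhead of the innermost loop and gives the compiler a straight-line expression that is easier to map onto SSE registers. Whether this reaches the throughput estimated in the commit message depends on the compiler, the hardware, and the actual data layout, none of which this sketch claims to reproduce.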
An implementation of the ``dual order hp'' version of the finite element method. This project targets parallel domain-decomposition methods for strongly coupled nonlinear problems with PDE constraints.