Changeset [cd77853c485eaa41c369e17806924123e4308648] by Jed Brown

November 28th, 2008 @ 01:29 AM

First pass optimizing TensorMult_Hex.

  • Speeds up the kernel from under 500 MFLOPS to 1150 MFLOPS. With the loop over the last dimension unrolled, and possibly hand-optimized for SSE, 3 GFLOPS or more should be achievable as long as the working set still fits in L1D.
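
For readers unfamiliar with the kernel, here is a minimal sketch of the kind of tensor-product multiply the message refers to. It is not the code in efstopo.c: the function name `tensor_mult_hex_sketch`, the argument list, and the row-major layout are all assumptions made for illustration. It applies three 1D operators to one hexahedral element in three passes, and the last pass unrolls the contraction over the contiguous last dimension by two, as a simple stand-in for the unrolling mentioned above.

```c
#include <stddef.h>

/* Sketch only: names and data layout are assumptions, not the code in efstopo.c.
 * Computes y = (A (x) B (x) C) x for one hex element, where
 *   x has shape [P][P][P] (row-major, last index contiguous),
 *   A, B, C are row-major Q-by-P matrices,
 *   y has shape [Q][Q][Q].
 * work1 must hold Q*P*P doubles and work2 must hold Q*Q*P doubles. */
void tensor_mult_hex_sketch(size_t P, size_t Q,
                            const double *A, const double *B, const double *C,
                            const double *x, double *y,
                            double *work1, double *work2)
{
  /* Pass 1: work1[i][b][c] = sum_a A[i][a] * x[a][b][c] */
  for (size_t i = 0; i < Q; i++) {
    for (size_t bc = 0; bc < P*P; bc++) {
      double s = 0.0;
      for (size_t a = 0; a < P; a++) s += A[i*P + a] * x[a*P*P + bc];
      work1[i*P*P + bc] = s;
    }
  }
  /* Pass 2: work2[i][j][c] = sum_b B[j][b] * work1[i][b][c] */
  for (size_t i = 0; i < Q; i++) {
    for (size_t j = 0; j < Q; j++) {
      for (size_t c = 0; c < P; c++) {
        double s = 0.0;
        for (size_t b = 0; b < P; b++) s += B[j*P + b] * work1[(i*P + b)*P + c];
        work2[(i*Q + j)*P + c] = s;
      }
    }
  }
  /* Pass 3: y[i][j][k] = sum_c C[k][c] * work2[i][j][c].
   * The contraction runs over the contiguous last dimension and is
   * unrolled by two with independent accumulators. */
  for (size_t ij = 0; ij < Q*Q; ij++) {
    const double *w = work2 + ij*P;
    for (size_t k = 0; k < Q; k++) {
      const double *Ck = C + k*P;
      double s0 = 0.0, s1 = 0.0;
      size_t c = 0;
      for (; c + 1 < P; c += 2) {
        s0 += Ck[c]     * w[c];
        s1 += Ck[c + 1] * w[c + 1];
      }
      if (c < P) s0 += Ck[c] * w[c];  /* remainder when P is odd */
      y[ij*Q + k] = s0 + s1;
    }
  }
}
```

In this sketch the three passes cost 2P(QP² + Q²P + Q³) floating-point operations per element, and the data touched is x, y, the two work buffers, and the three small matrices. That is the working set that would need to stay in L1D for the higher rates to be plausible.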

Signed-off-by: Jed Brown <jed@59A2.org> http://github.com/jedbrown/dohp/...

Committed by Jed Brown

  • M src/jacobi/impls/tensor/efstopo.c

An implementation of the "dual order hp" version of the finite element method. This project targets parallel domain-decomposition methods for strongly coupled nonlinear problems with PDE constraints.