Paul Hinker's Weblog
Thursday Jan 19, 2006
Parallel Dual-Core AMD Performance
Last time I presented some serial performance numbers for a dual-core AMD box (285 CPUs). Those numbers aren't especially interesting, so here are some parallel performance numbers.
| Matrix Size | 1 CPU | 2 CPUs | 4 CPUs | % of Peak (1 CPU) | 2-CPU Scaling | 4-CPU Scaling |
|---|---|---|---|---|---|---|
| 1000 | 4605 | 9076.68 | 17614.13 | 88.56% | 98.55% | 95.63% |
| 1250 | 4661.55 | 9162.21 | 17764.15 | 89.65% | 98.27% | 95.27% |
| 1500 | 4632.85 | 9128.44 | 17840.91 | 89.09% | 98.52% | 96.27% |
| 1750 | 4647.37 | 9132.16 | 17990.68 | 89.37% | 98.25% | 96.78% |
| 2000 | 4624.1 | 9149.64 | 18010.68 | 88.93% | 98.93% | 97.37% |
| 2250 | 4642.03 | 9185.79 | 18015.37 | 89.27% | 98.94% | 97.02% |
| 2500 | 4630.41 | 9148.34 | 18015.57 | 89.05% | 98.79% | 97.27% |
| 2750 | 4641.52 | 9191.52 | 18006.34 | 89.26% | 99.01% | 96.99% |
| 3000 | 4607.91 | 9120.19 | 17946.34 | 88.61% | 98.96% | 97.37% |
| 3250 | 4646.25 | 9203.86 | 18112.48 | 89.35% | 99.05% | 97.46% |
| 3500 | 4621.16 | 9169.04 | 17988.65 | 88.87% | 99.21% | 97.32% |
| 3750 | 4628.75 | 9166.81 | 17995.61 | 89.01% | 99.02% | 97.19% |
| 4000 | 4673.16 | 9277.18 | 18291.52 | 89.87% | 99.26% | 97.85% |
| 4250 | 4628.35 | 9175.68 | 18077.82 | 89.01% | 99.12% | 97.65% |
| 4500 | 4611.75 | 9135.38 | 18009.88 | 88.69% | 99.04% | 97.63% |
| 4750 | 4634.89 | 9195.68 | 18090.37 | 89.13% | 99.20% | 97.58% |
| 5000 | 4600.08 | 9103.6 | 18054.83 | 88.46% | 98.95% | 98.12% |
Performance numbers are expressed in Mflops, and scaling is calculated as (multi-core performance) / (serial performance * number of CPUs used).
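As a quick sanity check, the scaling formula can be sketched in a few lines of Python using the 1000 x 1000 row of the table. The function names are just illustrative, and the 5200 Mflops per-core peak is an assumption: the post doesn't state it, but it is the value the "% of Peak" column implies (4605 / 0.8856).

```python
# Values copied from the 1000 x 1000 row of the table (all in Mflops).
SERIAL_MFLOPS = 4605.0
TWO_CPU_MFLOPS = 9076.68
FOUR_CPU_MFLOPS = 17614.13

# Assumption: per-core peak inferred from the "% of Peak" column
# (4605 / 0.8856 is about 5200); the post does not state it directly.
ASSUMED_PEAK_MFLOPS = 5200.0


def scaling(parallel_mflops, serial_mflops, ncpus):
    """Parallel efficiency: multi-core rate / (serial rate * CPUs used)."""
    return parallel_mflops / (serial_mflops * ncpus)


def percent_of_peak(mflops, peak_mflops):
    """Achieved rate as a percentage of the (assumed) peak rate."""
    return 100.0 * mflops / peak_mflops


if __name__ == "__main__":
    print(f"2-CPU scaling: {scaling(TWO_CPU_MFLOPS, SERIAL_MFLOPS, 2):.2%}")
    print(f"4-CPU scaling: {scaling(FOUR_CPU_MFLOPS, SERIAL_MFLOPS, 4):.2%}")
    print(f"% of peak:     {percent_of_peak(SERIAL_MFLOPS, ASSUMED_PEAK_MFLOPS):.2f}%")
```

With these inputs the three results round to the 98.55%, 95.63% and 88.56% shown in the first row of the table.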
Nice performance numbers: close to 90% of peak for the serial run and as much as 98% scaling on 4 CPUs. The table above covers the double-precision matrix multiply routine, DGEMM. As discussed previously in this blog, DGEMM is probably one of the most heavily used routines in high-performance computing, especially when solving dense systems. The three other 'flavors' of matrix multiply (single, complex, double complex) demonstrate similar performance and scaling.
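For readers wondering how a Mflops figure relates to wall-clock time, here is a minimal sketch. It assumes the conventional operation count of 2*m*n*k floating-point operations for a real matrix multiply; the post doesn't say exactly how its numbers were counted, and the helper names are hypothetical.

```python
def gemm_mflops(m, n, k, seconds):
    """Mflops for C = A * B with A (m x k) and B (k x n).

    Assumes the conventional count of 2*m*n*k floating-point operations
    (one multiply and one add per inner-product term).
    """
    return 2.0 * m * n * k / seconds / 1.0e6


def gemm_seconds(n, mflops):
    """Time for one square n x n multiply at a given Mflops rate."""
    return 2.0 * n ** 3 / (mflops * 1.0e6)


if __name__ == "__main__":
    # Serial rate for the 1000 x 1000 case, taken from the table.
    t = gemm_seconds(1000, 4605.0)
    print(f"Estimated time per multiply: {t:.3f} s")
    # Converting the time back recovers the rate we started from.
    print(f"Rate from that time: {gemm_mflops(1000, 1000, 1000, t):.1f} Mflops")
```

Under that assumption, a single 1000 x 1000 multiply at the serial rate in the table takes a bit under half a second.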
Posted at 11:09AM Jan 19, 2006 by hinkthink in General

