Ultra-FAST Cryptography on the Sun UltraSPARC T2
Tuesday Oct 09, 2007
The UltraSPARC T2 processor performs cryptography with such low overhead that security can be added at essentially 'zero cost'. A single Sun UltraSPARC T2 processor achieves up to 37,000 RSA 1024-bit signs/s and up to 38.9 Gbit/s of AES-128 throughput.
The comparisons below show the performance of a single 1.4 GHz UltraSPARC T2 on RSA1024 (private-key sign) and AES128-CBC operations:
- The UltraSPARC T2 delivers over 4.1 times greater RSA1024 performance and 4.6 times greater AES128 performance than the 2-way quad-core 3 GHz Xeon.
- The UltraSPARC T2 delivers over 9.3 times greater RSA1024 performance and 10 times greater AES128 performance than the 2-way dual-core 2.6 GHz Opteron.
- The UltraSPARC T2 also delivers over 3 times greater RSA1024 performance and 15.6 times greater AES128 performance than a system using the Cavium Nitrox PX crypto accelerator card.
- The UltraSPARC T2 delivers over 30.8 times greater RSA1024 performance than the 2-way IBM p510 1.5 GHz Power5.
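The ratios in this list can be recomputed from the rounded figures published in the results table of this post. The following is a minimal Python sketch of that arithmetic; the dictionary layout and the function name `speedups` are illustrative and not part of the benchmark itself.

```python
# Published figures from this post: (RSA-1024 K signs/s, AES-128 Gb/s).
# None means no AES figure was reported for that platform.
RESULTS = {
    "UltraSPARC T2 1.4 GHz": (37.0, 38.9),
    "2-way quad-core Xeon 3 GHz": (9.0, 8.4),
    "2-way dual-core Opteron 2.6 GHz": (4.0, 3.9),
    "Cavium Nitrox PX (datasheet)": (12.0, 2.5),
    "IBM p510 Power5 1.5 GHz": (1.2, None),
}

BASELINE = "UltraSPARC T2 1.4 GHz"


def speedups(results, baseline):
    """Return {platform: (rsa_ratio, aes_ratio)} relative to the baseline."""
    base_rsa, base_aes = results[baseline]
    ratios = {}
    for name, (rsa, aes) in results.items():
        if name == baseline:
            continue
        rsa_ratio = base_rsa / rsa
        # Platforms without an AES figure get no AES ratio.
        aes_ratio = base_aes / aes if aes is not None else None
        ratios[name] = (rsa_ratio, aes_ratio)
    return ratios


if __name__ == "__main__":
    for name, (rsa_ratio, aes_ratio) in speedups(RESULTS, BASELINE).items():
        aes_text = f"{aes_ratio:.1f}x" if aes_ratio is not None else "n/a"
        print(f"{name}: RSA {rsa_ratio:.1f}x, AES {aes_text}")
```

Because the table values are rounded, a recomputed ratio can differ slightly from the quoted one. For the Opteron, 37.0 / 4.0 gives 9.25, so the quoted 9.3 presumably comes from the unrounded measurements.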
To achieve these results, the UltraSPARC T2 processor has an on-chip cryptographic accelerator (SPU) that consists of a cipher/hash unit and an enhanced modular arithmetic unit (MAU). This is an evolution of the previous-generation UltraSPARC T1, which contained only modular arithmetic units.
Sun's UltraSPARC T2 processor introduces support for common bulk ciphers, secure hash operations, and both prime-field and binary-field Elliptic Curve Cryptography. The UltraSPARC T2 processor supports RC4, DES, 3DES, AES-128, AES-192, AES-256, MD5, SHA-1, and SHA-256.
Competitive Landscape
RSA/AES Cryptography Benchmark Performance as of 8/07/07 as measured by Sun on the following platforms.
| System | Processor GHz | Chips total-cores | Operating System | 1024bit RSA (K signs/s) | AES128 (Gbit/s) | notes |
|---|---|---|---|---|---|---|
| Sun SPARC Enterprise T5220 | UltraSPARC T2 1.4 GHz | 1 chip 8 core | Solaris 10 | 37.0 K | 38.9 Gb/s | actual |
| Accelerator card | Sun SCA6000 | | | 13.0 K | 1.0 Gb/s | actual |
| Sun Fire T2000 | UltraSPARC T1 1.2 GHz | 1 chip 8 core | Solaris 10 | 12.9 K | | actual |
| Accelerator card | Cavium Nitrox PX | | | 12.0 K | 2.5 Gb/s | datasheet |
| Sun Fire T1000 | UltraSPARC T1 1 GHz | 1 chip 8 core | Solaris 10 | 10.8 K | | actual |
| | quad-core Xeon 3 GHz | 2 chip 8 core | | 9.0 K | 8.4 Gb/s | actual |
| Sun Fire V490* | US IV+ 1.5 GHz | 4 chip 8 core | Solaris 10 | 8.0 K | | actual |
| IBM p690 | Power4 1.3 GHz | 16 chip 32 core | AIX 5.1 | 6.1 K | | actual |
| Fujitsu PP850 | SPARC64 V 1.9 GHz | 16 chip 16 core | Solaris 10 | 6.0 K | | actual |
| | Opteron 2.6 GHz | 2 chip 4 core | | 4.0 K | 3.9 Gb/s | actual |
| Sun Fire V40z | Opteron sc 2.6 GHz | 4 chip 4 core | Solaris 10 | 3.3 K | | actual |
| Dell PE 1850 | Xeon 3.6 GHz | 2 chip 2 core | Linux RHEL4 U1 | 1.9 K | | actual |
| Dell PE 2850 | Xeon 3.6 GHz | 2 chip 2 core | Linux SLES 9 | 1.9 K | | actual |
| IBM p510 | Power5 1.5 GHz | 1 chip 2 core | AIX 5.3 | 1.2 K | | actual |
* Used a Sun Crypto Accelerator (SCA) 4000 in the Sun Fire V490 testing.
Benchmark Description
The RSA/AES-128 Cryptography benchmark was developed by Sun to measure the maximum throughput of RSA private-key (sign) operations and AES-128 operations that a system can perform. On multi-chip and/or multi-core systems, multiple processes are used to achieve the maximum throughput. Two sets of microbenchmark programs are used: pk11rsaperf/pk11aesperf on Solaris, and the OpenSSL speed test on non-Solaris systems. Though each microbenchmark uses different crypto APIs, they both measure the raw throughput of the same crypto operations.
pk11rsaperf and pk11aesperf are part of a set of cryptographic microbenchmark programs developed internally by the Crypto Product Group of NSN. pk11aesperf measures the performance of AES-128-CBC processing as performed by the Solaris Cryptographic Framework via the PKCS#11 API. Different key sizes, data sizes, and numbers of concurrent threads can be tested. The metric is aggregate operations per second for pk11rsaperf and Gb/s for pk11aesperf (for large object sizes).
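The idea of an aggregate, multi-threaded throughput figure can be illustrated with a small harness. This is only a sketch under stated assumptions and is not the pk11aesperf source: the Python standard library has no AES or PKCS#11 binding, so SHA-256 from `hashlib` stands in as the workload, and the function names `worker` and `aggregate_gbits_per_s` are hypothetical.

```python
import hashlib
import time
from concurrent.futures import ThreadPoolExecutor


def worker(buffer: bytes, iterations: int) -> int:
    """Process the buffer `iterations` times and return the bytes processed."""
    for _ in range(iterations):
        # Stand-in workload (assumption): SHA-256 instead of AES-128-CBC.
        hashlib.sha256(buffer).digest()
    return len(buffer) * iterations


def aggregate_gbits_per_s(threads: int, buffer_size: int, iterations: int) -> float:
    """Run `threads` workers concurrently and return the aggregate Gbit/s."""
    buffer = bytes(buffer_size)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(worker, buffer, iterations) for _ in range(threads)]
        total_bytes = sum(f.result() for f in futures)
    elapsed = time.perf_counter() - start
    # bytes -> bits, then scale to 10^9 bits per second.
    return total_bytes * 8 / elapsed / 1e9


if __name__ == "__main__":
    # Large objects (64 KiB here) amortize per-call overhead.
    for threads in (1, 2, 4):
        rate = aggregate_gbits_per_s(threads, 64 * 1024, 500)
        print(f"{threads} thread(s): {rate:.2f} Gb/s")
```

Threads are usable here because CPython's `hashlib` releases the global interpreter lock while hashing buffers larger than 2047 bytes. With a different workload, separate processes may be needed to scale across cores.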
The OpenSSL speed test, the standard microbenchmark included in the open-source OpenSSL package, measures raw cryptographic algorithm performance as implemented in the OpenSSL library (libcrypto.so) via OpenSSL's own library-specific crypto APIs. For RSA the metric is operations per second, while for AES-128-CBC the metric is Gb/s.
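For readers who want to try a comparable measurement, the sketch below builds `openssl speed` command lines and converts the tool's AES output to Gb/s. It is not the exact invocation used for the published numbers, which the post does not give. It assumes an OpenSSL build that supports the `-multi` option, and it relies on `openssl speed` reporting cipher throughput in thousands of bytes per second. The helper names are hypothetical.

```python
import shlex


def speed_command(algorithm: str, processes: int) -> list[str]:
    """Build an `openssl speed` command that forks `processes` parallel workers."""
    return ["openssl", "speed", "-multi", str(processes), algorithm]


def kbytes_to_gbits(kbytes_per_s: float) -> float:
    """Convert thousands of bytes per second to Gbit/s."""
    return kbytes_per_s * 1000 * 8 / 1e9


if __name__ == "__main__":
    # The process count of 8 is an assumption; match it to the system's cores.
    for algorithm in ("rsa1024", "aes-128-cbc"):
        print(shlex.join(speed_command(algorithm, processes=8)))
```

The printed commands can be run in a shell or passed to `subprocess.run`. Parsing the report is left out because its layout varies between OpenSSL versions.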
Disclosure Statement:
RSA/AES Cryptography Benchmark Performance as of 08/07/07 as measured by Sun on the following platforms: Sun SPARC Enterprise T5220 37K RSA1024 signs/s, 38.9 AES128 Gb/s; Sun SCA6000 (actual) 13K RSA1024 signs/s, 1 AES128 Gb/s; Cavium Nitrox PX (datasheet) 12K RSA1024 signs/s, 2.5 AES128 Gb/s; 2-chip quad-core Xeon 3GHz 9K RSA1024 signs/s, 8.4 AES128 Gb/s; 2-chip dual-core Opteron 2.6GHz 4K RSA1024 signs/s, 3.9 AES128 Gb/s; Sun Fire T2000 1.2 GHz (8 cores, 1 chip) Solaris 10, 12,850 RSA1024 signs/s; Sun Fire T1000 1GHz (8 cores, 1 chip) Solaris 10, 10,764 RSA1024 signs/s; IBM p690 1.3 GHz (32 cores, 16 chips) AIX 5.1, 6,131 RSA1024 signs/s; Fujitsu PRIMEPOWER850 1.9 GHz (16 cores, 16 chips) Solaris 10, 6,038 RSA1024 signs/s; Dell PowerEdge 1850 3.6 GHz (2 cores, 2 chips) RHEL4 U1, 1,926 RSA1024 signs/s; Dell PowerEdge 2850 3.6 GHz (2 cores, 2 chips) SLES 9, 1,900 RSA1024 signs/s; IBM p5 510 1.5 GHz (2 cores, 1 chip, SMT) AIX 5.3, 1,200 RSA1024 signs/s.
Results Summary
| Item | Value |
|---|---|
| Results | 37.0 K RSA1024 signs/s; 38.9 Gb/s AES128 |
| Reference Date | August 7, 2007 |
| Systems | Sun SPARC Enterprise T5120/T5220 |
| Total Number Processors | 1 chip / 8 cores/chip (8 threads/core) |
| Processor/GHz of Server | Sun UltraSPARC T2 1.4 GHz |
| Operating System | Solaris 10 |
Tags: cmt cryptography networking t2 t5120 t5220










