IBM System z
January 2010
Exploiting IBM System z
Cryptographic Hardware using JSSE
Table of Contents
About this publication
Introduction
Objectives
Summary
Hardware and software configuration
System setup
  Environment
  Network setup
JSSE setup
  Required RPMs to install for JSSE, OpenSSL and Cryptographic functions
  Cryptographic hardware configuration
    IBM Crypto Express2 feature configuration
    OpenCryptoki configuration
    Access to the Java executables and cryptographic libraries
    Set up java.security file for hardware encryption
    PKCS11 configuration file
    Start the daemons
    Setup the openCryptoki password
    Prepare libraries for the iKeyman utility
    Define a keystore for hardware encryption using the iKeyman utility
    Java Version 6.0 migration considerations
    Examine file /proc/driver/z90crypt
    Enabling the polling thread for the z90crypt device driver
  Configuring the environment to use software encryption
    Set up java.security file for software encryption
    Prepare libraries for the iKeyman utility
    Define a keystore for software encryption using the iKeyman utility
Workload description
  Variations
  Workload output and performance data
Results
  Java 1.5 SR9 JSSE software and hardware comparison
  Java 1.5 SR9 JSSE hardware encryption with and without the z90crypt device driver polling thread
  Java 1.5 SR9 JSSE SSL logon method comparison
  Java release to release comparisons
Appendix A. Results tables
Bibliography
Notices
About this publication
Authors
Dr. Juergen Doelle
Paul V. Sutera
Acknowledgements
The benchmarks were performed at the IBM System z World Wide Benchmark Center in
Poughkeepsie, NY.
How to send your comments
Your feedback is important in helping to provide the most accurate and highest quality
information. If you have any comments about this publication, send your comments using
IBM Resource Link at http://www.ibm.com/servers/resourcelink. Click Feedback on the
navigation pane. Be sure to include the name of the publication, and the specific location of
the text you are commenting on (for example, a page number or table number).
Introduction
This study measures performance and throughput for the Java™ Secure Socket Extension
(JSSE) on Linux® for IBM System z® with Java 2 Platform, Enterprise Edition and the IBM
JSSE2 provider.
The name 'JSSE study' will be used throughout this document, in place of the full name:
Exploiting IBM System z cryptographic hardware using Java Secure Socket Extension.
Data encryption is important for ensuring the privacy and integrity of data sent over any
type of network. However, encryption is very CPU intensive and causes additional CPU
load when done in software. The IBM System z architecture provides two hardware
features for offloading this work to specialized hardware: the IBM Crypto Express2 feature
(a PCI card) and the Central Processor Assist for Cryptographic Function (CPACF), which is
part of the IBM System z processor. Using these hardware features frees CPU cycles on the
main processor, speeds up processing, and increases throughput. They are intended to help
create a secure environment with the lowest possible impact on the executed workload.
The IBM JSSE2 is a Java package, enabling secure network communications. The extension
implements the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols,
providing functions for data encryption, message integrity, and server and client
authentication.
For these tests, a workload that heavily exercises SSL socket creation and handshakes was
chosen. The handshakes are the protocol used by the client and server to negotiate the
authentication and manage the cryptographic keys used for a session. For this workload, new
SSL socket connections are created continuously in each thread, with twenty server and client
threads, until a time limit is reached. During each handshake, new encryption keys are
generated, which are used to encrypt the data exchanged between a client and a server
thread.
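The structure of such a workload can be sketched as follows. This is a minimal illustration
and not the test program used in the study: the class and method names are hypothetical,
and the per-connection action is a placeholder where a real client would open a new SSL
socket, complete the handshake, and exchange data.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: N threads repeat one "connection" action until a time limit is reached.
final class HandshakeLoopSketch {

    /** Runs oneConnection repeatedly in each thread and returns the total number of completed runs. */
    static long runUntilDeadline(int threads, long durationMillis, Runnable oneConnection) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(durationMillis);
        AtomicLong completed = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                // Each thread keeps creating "connections" until the time limit is reached.
                while (System.nanoTime() - deadline < 0) {
                    oneConnection.run();
                    completed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            // Wait for the worker threads to notice the deadline and finish.
            pool.awaitTermination(durationMillis + 60_000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        // Placeholder action; a real client would create an SSL socket and perform a handshake here.
        long count = runUntilDeadline(20, 200, () -> { });
        System.out.println("Completed iterations: " + count);
    }
}
```

Passing the connection logic as a Runnable keeps the timing loop separate from the SSL code,
so the same loop could drive either hardware or software encryption.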
Data packets of varying sizes are encrypted and decrypted. Four different cipher suites are
compared. The suites use hashing algorithms to ensure data integrity over the network.
Hardware and software encryption are both tested. The number of packets transmitted for the
test interval is used to measure throughput. CPU utilization is also measured during the test. [1]
Objectives
The objective of these tests is to show the advantage of hardware cryptographic features
provided by IBM System z. The performance of different Java versions was also measured.
The environment consists of:
• One IBM System z LPAR running Linux and the JSSE client test program
• A second IBM System z LPAR running Linux and the JSSE server test program
• Both LPARs are connected with a HiperSockets™ connection
The objectives include:
• Describe how the setup was done for this test, as a guideline for similar scenarios.
• Measure the throughput and CPU load with four different ciphers, when scaling the
packet size using software and hardware encryption. Throughput consists of the number of
packets multiplied by the number of bytes of data in each packet, and divided by the run
time in seconds.
• Compare throughput and CPU utilization for JSSE in hardware encryption mode with and
without a polling thread.
• Compare hardware and software encryption of SSL handshakes when varying the logon
methods (cached, client authentication).
• Study the performance under Java 5.0, comparing the SR9 release to the SR7 release, and
Java 6.0 SR4.
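The throughput definition given in the objectives can be illustrated with a small calculation.
The following is a minimal sketch; the class name, method name, and sample values are
hypothetical and are not measurements from the study.

```java
// Hypothetical sketch of the throughput definition: packets * bytes per packet / run time.
final class ThroughputSketch {

    /** Returns the throughput in bytes per second. */
    static double bytesPerSecond(long packets, long bytesPerPacket, double runTimeSeconds) {
        if (runTimeSeconds <= 0) {
            throw new IllegalArgumentException("run time must be positive");
        }
        return (double) packets * bytesPerPacket / runTimeSeconds;
    }

    public static void main(String[] args) {
        // Illustrative values only: 30000 packets of 20 KB each in a 60 second run.
        double throughput = bytesPerSecond(30000, 20 * 1024, 60.0);
        System.out.println("Throughput in bytes per second: " + throughput);
    }
}
```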
This information will help IT architects understand the performance characteristics of
hardware and software encryption using the IBM Java Secure Sockets Extension (IBM
JSSE2), as provided by the IBM Java 5.0 SDK (Software Development Kit).
[1] This paper is intended to provide information regarding performance of Java classes using the IBMPKCS11Impl security provider to
use the cryptographic hardware on IBM System z. It discusses findings based on configurations that were created and tested under
laboratory conditions. These findings may not be realized in all customer environments, and implementation in such environments may
require additional steps, configurations, and performance analysis. The information herein is provided 'AS IS' with no warranties,
express or implied. This information does not constitute a specification or form part of the warranty for any IBM products.
Summary
This paper describes how to set up the cryptographic environment on IBM System z to
benefit from the additional power of the special-purpose features CPACF and CEX2A. The
workload is a client-server Java application that communicates using SSL with different
cipher suites through the IBMPKCS11Impl provider (openCryptoki). It was implemented with
self-developed Java classes, based on Java 5, that load the security provider directly.
Loading the provider this way is not possible for standard applications.
Note: At the time of publishing this paper, full support for the System z cryptographic
hardware for Java is generally available with:
• SUSE Linux Enterprise Server Version 10 SP 3
• IBM WebSphere® Application Server Version 7.0.0.7
• IBM Software Development Kit (SDK) 1.6 SR 6
For more information and support enhancements from Java and WebSphere, please read
more at http://www-01.ibm.com/support/docview.wss?rs=404&uid=swg27017055.
Java 5 has a number of known issues that are fixed in Java 6. If using PKCS11, we
recommend moving to Java 6.
The impact of the cryptographic hardware depends on the size of the data transferred. In all
cases, a significant reduction of the CPU utilization (up to 50%) was observed. The
throughput increases significantly with packets sized approximately 20 KB, up to a fourfold
throughput increase when compared to software encryption. It is recommended to run
without the polling thread, which is the default setting. The polling thread checks whether
data are available on the CEX2A card. On an IBM System z10™ Enterprise Class (z10 EC™)
system or later, the polling thread mechanism is being replaced by the AP interrupts
mechanism (see
http://download.boulder.ibm.com/ibmdl/pub/software/dw/linux390/docu/lk31dd03.pdf, section
'Using AP adapter interrupts'). This mechanism is already available with RHEL 5.4.
With small packet sizes of 2 KB and 20 KB, cryptographic hardware provides a 10% higher
throughput, but the CPU cost per unit of transferred data also increases. When throughput is
more important than CPU load, the polling thread is helpful.
During the normal logon process, the server authenticates to the client using certificates. To
increase the security level, client authentication can be added. This CPU-intensive process
also has a significant reduction in CPU load when using the cryptographic hardware on IBM
System z.
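In standard JSSE, client authentication is requested on the server side. The following minimal
sketch shows the relevant call on an SSLEngine; the class and method names are hypothetical,
and the context uses the JVM default key and trust material rather than the keystores
from the study.

```java
import java.security.GeneralSecurityException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

// Hypothetical sketch: a server-side SSLEngine that optionally requires a client certificate.
final class ClientAuthSketch {

    static SSLEngine serverEngine(boolean requireClientCertificate) {
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            // null arguments select the default key managers, trust managers, and random source.
            context.init(null, null, null);
            SSLEngine engine = context.createSSLEngine();
            engine.setUseClientMode(false);                      // act as the server side
            engine.setNeedClientAuth(requireClientCertificate);  // handshake fails without a client certificate
            return engine;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not create TLS context", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Client authentication required: "
                + serverEngine(true).getNeedClientAuth());
    }
}
```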
Caching of SSL sessions means that for consecutive requests from one specific client, the SSL
handshake is performed only for the first request. Because the client and server have already
been identified and the keys have already been exchanged, the server does not perform a
handshake for further requests. This is the default behavior of an SSL session: the server
accepts an established session for a specific time interval (often about 10 minutes; the value
can be configured in the server). During that period, additional handshakes are
avoided. This optimization sped up the tests by a factor of two, and saved approximately
30% of the CPU utilization.
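With the standard JSSE API, the time interval for which a server accepts cached sessions can
be set on the server session context. The following is a minimal sketch under that assumption;
the class and method names are hypothetical, and the 600 second value simply mirrors the
"about 10 minutes" mentioned above.

```java
import java.security.GeneralSecurityException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSessionContext;

// Hypothetical sketch: configure how long the server side keeps SSL sessions in its cache.
final class SessionCacheSketch {

    static SSLContext contextWithSessionTimeout(int timeoutSeconds) {
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            // null arguments select the default key managers, trust managers, and random source.
            context.init(null, null, null);
            SSLSessionContext serverSessions = context.getServerSessionContext();
            // Sessions older than this are no longer resumed and need a full handshake.
            serverSessions.setSessionTimeout(timeoutSeconds);
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not create TLS context", e);
        }
    }

    public static void main(String[] args) {
        SSLContext context = contextWithSessionTimeout(600); // 600 s = 10 minutes
        System.out.println("Server session timeout in seconds: "
                + context.getServerSessionContext().getSessionTimeout());
    }
}
```

To force a full handshake for a test without caching, an application can call invalidate()
on the SSLSession of a connection so that it is not reused.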
Finally, results were compared from Java Version 5.0 SR9 with Java Version 5.0 SR7 and
Java Version 6.0 SR4. All results are very similar. The only difference was that Java Version
6.0 SR4 did not support the hardware encryption, so for this test only software encryption was
used.
Hardware and software configuration
To perform the JSSE study, a test environment on an IBM System z system was created.
IBM System z hardware and software - Logical Partition
This is a detailed list of the hardware and software used in the IBM System z side of the IBM
JSSE2 study.
Two LPARs on a 64-way IBM System z10 EC, 4.4 GHz, model 2097-E64 are used. They are
equipped as described in Table 1.
Table 1. JSSE study: Server hardware
LPAR                    Description
LPAR 1 - JSSE Server    Four physical CPUs dedicated
                        4 GB central storage
                        One 1 Gb OSA feature for external connectivity
                        HiperSockets connection to LPAR 2
                        Two processors of an IBM Crypto Express2 card in accelerator mode (CEX2A)
                        CPACF (an unpriced feature of the IBM System z processor)
LPAR 2 - JSSE Client    Four physical CPUs dedicated
                        4 GB central storage
                        One 1 Gb OSA feature for external connectivity
                        HiperSockets connection to LPAR 1
                        Two processors of an IBM Crypto Express2 card in accelerator mode (CEX2A)
                        CPACF (an unpriced feature of the IBM System z processor)
IBM System z software
The IBM System z software is described in Table 2.
Table 2. JSSE study: Server software
Product                                Version and Level
IBM Java SDK 64-bit for IBM Linux      1.5 SR7, SR9, Java 1.6 SR4
Novell SUSE Linux Enterprise Server    SLES 10, SP2 64-bit
Workload driver program                N/A
System setup
To perform the JSSE study, a customer-like system setup was created.
Environment
The test environment for the JSSE study was a Java-based client-server workload that
processes SSL handshakes, data encryption and decryption, and data transfers.
These components are used:
• A JSSE client program that runs on the IBM System z client LPAR
• A JSSE server program that runs on the IBM System z server LPAR
Figure 1 shows the configuration used for the testing.
Figure 1. JSSE study: System configuration for the JSSE workload
Network setup
To perform the JSSE study, a customer-like network was created.
For the workload, the two Logical Partitions (LPARs) were connected with a HiperSockets
switch using a 16 KB frame size for a fast, isolated network connection.
The network setup is shown in Figure 1.
JSSE setup
These steps are used to configure the cryptographic environment for the JSSE study.
• Install the required RPMs.
• Configure the cryptographic hardware.
• Configure the environment to use software encryption.
Required RPMs to install for JSSE, OpenSSL and Cryptographic functions
These RPMs are installed to run JSSE API tests for both hardware and software cryptography.
These RPMs are available from the Novell SUSE distribution CDs.
libica-1.3.8-0.6.s390x.rpm
libica-32bit-1.3.8-0.6.s390x.rpm
openCryptoki-2.2.4-0.7.s390.rpm
openCryptoki-32bit-2.2.4-0.7.s390.rpm
openCryptoki-64bit-2.2.4-0.7.s390x.rpm
Cryptographic hardware configuration
To configure the cryptographic hardware for the JSSE study, you must configure the IBM
Crypto Express2 feature, and then set up various libraries, files, and passwords to ensure that
the IBM Crypto Express2 feature is used.
IBM Crypto Express2 feature configuration
These instructions are used to set up the IBM Crypto Express2 feature on the IBM System
z10 for the JSSE study.
The IBM Crypto Express2 feature provides two PCI-X processors, which are also referred to as
adapters. These two processors can be configured in one of three modes:
• Coprocessors
• Accelerators
• One coprocessor and one accelerator
With Linux, the accelerator mode provides the best performance (see
http://www.ibm.com/developerworks/linux/linux390/perf/tuning_res_security_crypto.html#h1).
Selecting the mode is done from the Hardware Management Console (HMC) using the Single
Object Operations mode, when configuring the cryptographic feature on the CEC level:
1. From the HMC, click System Management -> Servers.
2. Select (highlight) the name of your server.
3. Click Single Object Operations.
4. Enter the Support Element (SE).
5. In the SE, click System Management -> CPC Configuration -> Cryptographic
   Configuration -> Crypto Type Configuration.
Figure 2 shows a sample of the configuration used for the testing.
Figure 2. Cryptographic hardware configuration: Definition of cryptographic configuration mode
The system used for the testing has two cryptographic features. All four processors are set to
accelerator mode, and the two processors of one feature are assigned to one LPAR with
the HMC (select Partition -> CPC Configuration -> Customize/Delete Activation Profiles
-> Crypto) by selecting processors and domains for that LPAR.
Figure 3 shows the IBM Crypto Express2 configuration for an LPAR.
Figure 3. HMC: IBM Crypto Express2 configuration for an LPAR
There are 16 possible Control and Usage domains. The Hardware Management Console
(HMC) requires the selection of these domains to configure the Linux LPAR to use a
cryptographic processor. The processors are then selected from the Cryptographic Candidate
list on the HMC Cryptographic configuration panel, and the same two are selected on the
Cryptographic Online list. The configuration is repeated for the other IBM System z LPAR.
The two processors from each IBM Crypto Express2 feature were accessed using the same
control and usage domains (see Figure 3). The Linux device drivers handle only one active
domain.
These steps reserve an entire IBM Crypto Express2 feature with its two processors for each
LPAR. The two cryptographic processors are managed by the openCryptoki function in one
slot, and the workload is balanced automatically by the driver.
The entire IBM Crypto Express2 feature is reserved exclusively for one LPAR due to the
requirements of the performance test, to ensure that no other workloads influence the results.
This setup does not necessarily represent a best practice for a production environment.
OpenCryptoki configuration
OpenCryptoki is an implementation of the PKCS #11 API that allows interfacing to devices
that hold cryptographic information.
OpenCryptoki is a slot manager and an API for slot token dynamic link libraries (STDLLs).
The slot manager runs as a daemon that controls the number of token slots provided to
applications, and it interacts with applications using a shared memory region. Each device
that has a token associated with it places that token into a slot in the slot manager database.
The shared memory region allows for proper sharing of state information between
applications.
With Linux versions SLES 9, SLES 10, RHEL 4, RHEL 5, OpenCryptoki must first be
configured using the pkcs11_startup script.
When the pkcs11_startup script is run, it performs these tasks:
1. Creates a Linux group named pkcs11.
2. Scans for an installed device (/dev/z90crypt).
3. Creates the slot configuration file (/etc/pkcs11/pk_config_data).
The slot manager daemon can then be started using the pkcsslotd command or the
OpenCryptoki service control script. Any application that accesses the PKCS subsystem must
run as root or under a Linux user that is a member of the pkcs11 group.
Access to the Java executables and cryptographic libraries
It is required that the Java executables and cryptographic libraries are accessible to the
applications used in the JSSE study.
For the system to locate the libopencryptoki library, add the opencryptoki directory to the
library search path. Also the Java executables should be resolvable; a PATH environment
variable is one way to do that.
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib64/opencryptoki
export PATH=$PATH:/opt/ibm/java-s390x-50/jre/bin
Set up java.security file for hardware encryption
The java.security file for Java Version 5.0 must be present and customized on both the JSSE
server and client LPARs for the JSSE study.
The java.security file contains a list of providers that implement a certain set of ciphers. When
searching a provider for a cipher, the first one on this list that supports the cipher is chosen.
The only provider of the full cryptographic hardware support is the IBMPKCS11Impl
provider. Therefore, the default java.security file must be modified so that this provider is the
first provider in that list (see sample below).
The modified java.security file must be temporarily changed when the iKeyman utility is run.
For the iKeyman steps, the IBMPKCS11Impl provider below is commented out, and the
currently commented-out provider IBMPKCS11 is uncommented. Different versions and
levels of Java might provide different java.security files for different security providers. The
java.security example below shows IBMPKCS11Impl in the first position in front of the
IBMJCE provider, so that hardware encryption is used by default instead of software
encryption.
+--------------------------------------------------------------------------------+
|cat /usr/lib64/jvm/java-1_5_0-ibm-1.5.0_sr7/jre/lib/security/java.security
|
|
|security.provider.1=com.ibm.crypto.pkcs11impl.provider.IBMPKCS11Impl <pkcs11 config|
|#security.provider.1=com.ibm.crypto.pkcs11.provider.IBMPKCS11
|security.provider.2=com.ibm.jsse2.IBMJSSEProvider2
|security.provider.3=com.ibm.crypto.provider.IBMJCE
|security.provider.4=com.ibm.security.jgss.IBMJGSSProvider
|security.provider.5=com.ibm.security.cert.IBMCertPath
|security.provider.6=com.ibm.security.sasl.IBMSASL
|
|
|Sample Java Security file for Java 1.5
+--------------------------------------------------------------------------------+
The PKCS configuration file specified here is described in PKCS11 configuration file. The
support for having the IBMPKCS11Impl provider as the first provider in the java.security file
is incomplete at the moment. At the time of the test, the Java application is required to load
the IBMPKCS11Impl provider explicitly.
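Loading a provider explicitly from application code can be sketched as follows. This is a
hedged illustration and not the code used in the study. It assumes that the provider class
offers a public constructor that takes the path of the PKCS11 configuration file as a String.
The configuration path shown is a hypothetical placeholder. Reflection is used so that the
sketch also compiles and runs on a JVM that does not ship the IBM provider, in which case it
reports that the provider is unavailable.

```java
import java.security.Provider;
import java.security.Security;

// Hypothetical sketch: load a security provider explicitly and give it the highest preference.
final class ExplicitProviderLoad {

    /** Returns the installed provider, or null if the class is unavailable or unsuitable. */
    static Provider loadProvider(String providerClassName, String configFilePath) {
        try {
            Class<?> providerClass = Class.forName(providerClassName);
            // Assumption: a public constructor taking the configuration file path exists.
            Object instance = providerClass.getConstructor(String.class).newInstance(configFilePath);
            Provider provider = (Provider) instance;
            Security.insertProviderAt(provider, 1); // position 1 is searched first
            return provider;
        } catch (ReflectiveOperationException | ClassCastException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Provider provider = loadProvider(
                "com.ibm.crypto.pkcs11impl.provider.IBMPKCS11Impl",
                "/path/to/pkcs11.cfg"); // hypothetical location of the PKCS11 configuration file
        System.out.println(provider == null
                ? "IBMPKCS11Impl is not available on this JVM"
                : "Loaded provider " + provider.getName());
        // Print the resulting search order of the installed providers.
        for (Provider installed : Security.getProviders()) {
            System.out.println(installed.getName());
        }
    }
}
```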
PKCS11 configuration file
The PKCS11 configuration file is specified for the IBMPKCS11Impl security provider in the
java.security file used in the JSSE study.
The contents of the PKCS11 configuration for Java Version 5 used for the JSSE study are
shown below.
There is one such file on both the JSSE client and server LPARs. The slotListIndex value is
normally zero when there is one Cryptographic feature per LPAR. Some available encryption
algorithms are not supported on the hardware device, so these are in the disabled list.
Different Java versions might require different PKCS11 configuration files.
+--------------------------------------------------------------------------------+
|name = JSSEProvider
|library=/usr/lib/pkcs11/PKCS11_API.so64
|description=zlinux64bit
|
|slotListIndex = 0
|
|disabledmechanisms = {
|CKM_SHA_1
|CKM_MD5
|}
|
|Sample PKCS11 configuration file
+--------------------------------------------------------------------------------+
Start the daemons
Two daemons are necessary for hardware encryption used in the JSSE study.
The daemon startup can be automated with the Linux chkconfig command. The
cryptographic z90crypt device driver for Linux for IBM System z and IBM System z10 is a
generic character device that routes work to a supported cryptographic coprocessor or
accelerator device installed on the system. If the associated daemon is started, cryptographic
work can be routed to the Cryptographic facility accelerator.
The daemon named pkcsslotd is a slot manager running as a daemon to control token slots
provided to applications. Start the daemons manually as user root by issuing these
commands:
rcz90crypt start
rcpkcsslotd start
Setup the openCryptoki password
Perform these steps to set up the openCryptoki password for the JSSE study.
Issue this command:
pkcsconf
The output lists the command operands and their meaning.
pkcsconf [-itsmMIupP] [-c slotnumber -U userPIN -S SOPin -n newpin]
Note: Use pkcsconf64 for a 64-bit system instead of pkcsconf.
-i display PKCS11 info
-t display token info
-s display slot info
-m display mechanism list
-I initialize token
-u initialize user PIN
-p set the user PIN
-P set the SO PIN
cleanup:
To clean up after unsuccessful pkcsconf command operations, issue these commands:
cd /var/lib/opencryptoki
rm -R *
rcpkcsslotd stop
rcpkcsslotd start
These steps must be repeated for each cryptographic feature installed, matching the
corresponding slot number with the -c n argument (as reported from the output of the
pkcsconf -t command).
1. Initialize the System Operator (SO) PIN and initialize the token:
   pkcsconf64 -I -c 0 (uppercase I)
   Enter the SO PIN: ********  -> 87654321 (default)
   Enter a unique token label: zlinux64bit
   (The command pkcsconf64 -t displays token information.)
2. Set the User PIN:
   pkcsconf64 -u -c 0
   Enter the new user PIN: ********  -> was64usr
   Re-enter the new user PIN: ********
3. Change the password for the System Operator. This is a required step.
   pkcsconf64 -P -c 0
   -> 87654321
   -> 87654321
4. Change the password for the User. This is a required step.
   pkcsconf64 -p -c 0
   -> was64usr
   -> was64usr
5. The User password is used to access the token from the application. Record the unique
   tokens and passwords for later use by the iKeyman utility.
Note: Passwords used here are samples. Choose the real password according to your
password policies.
Prepare libraries for the iKeyman utility
Use these commands to perform the library preparation for the iKeyman utility used in the
JSSE study.
Under Java Version 5, incorrect soft links must be fixed before running the iKeyman utility.
Issue these commands to fix the soft links in the directories /usr/lib/opencryptoki (for 31-bit) and
/usr/lib64/opencryptoki (for 64-bit):
mv libopencryptoki.so.0.0.0 libopencryptoki.so
unlink libopencryptoki.so.0
ln -s libopencryptoki.so libopencryptoki.so.0
Define a keystore for hardware encryption using the iKeyman utility
Perform these steps to define a keystore for hardware encryption using the iKeyman utility.
Before invoking the iKeyman utility, be sure to edit the java.security file: comment out the
IBMPKCS11Impl provider and uncomment the IBMPKCS11 security provider, because iKeyman
requires the IBMPKCS11 provider. This reverses the comments shown in the section Set up
java.security file for hardware encryption.
To initialize the PKCS11 provider:
1. Invoke the iKeyman facility by issuing the following command. The iKeyman facility is a
   graphical user interface, so an X11 or VNC graphical environment should already be active
   before running iKeyman.
   java com.ibm.gsk.ikeyman.Ikeyman
2. Click the yellow folder to Open a Key Database File.
3. Select Key Database Type: Java Cryptographic Token.
4. Select File Name: PKCS11_API.so64 (or PKCS11_API.so for 31-bit JVM).
5. Select Location: /usr/lib/pkcs11.
6. Choose the slot, which will normally be zero (0) for ICA.
7. Enter the password that corresponds to the user PIN entered previously in Setup the
   openCryptoki password.
Figure 4 shows the iKeyman cryptographic configuration with Java Version 5.
Figure 4. iKeyman cryptographic configuration with Java Version 5
After the iKeyman steps are done, the PKCS11 provider should again be commented-out in
the java.security file, and the IBMPKCS11Impl provider should be uncommented and located
in the first provider position. See Setup the openCryptoki password.
Java Version 6.0 migration considerations
Consider these points when migrating to Java Version 6.0.
The Java Version 6.0 64-bit SDK was installed by removing the Java Version 5.0 RPMs and
installing the Java Version 6.0 RPMs. Removing Java Version 5.0 is optional, and it is
possible to have multiple Java versions installed at the same time.
The changes required in the Java Version 6.0 java.security file differ from those
made for Java Version 5.0:
• The PKCS11 provider is no longer used. Instead, use the
com.ibm.crypto.pkcs11impl.provider.IBMPKCS11Impl provider.
• The key database for IBMPKCS11Impl is now named PKCS11Direct or PKCS11Config.
Also, software encryption using the AES-256 cipher requires the file local_policy.jar and, in
the United States, the file US_export_policy.jar. Without the correct versions of these files, a
NoSuchAlgorithmException error message is generated for this cipher.
For more information, consult the guide for iKeyman under Java Version 6:
http://download.boulder.ibm.com/ibmdl/pub/software/dw/jdk/security/60/iKeyman.8.User.Guide.pdf.
It is also important to modify the PKCS11 configuration file after installing Java 6. More
encryption mechanisms were added to the standard that are not yet supported in hardware.
These need to be present in the disabled mechanisms list in the PKCS11 configuration file.
The current list for Java 6 (subject to revision in future releases) is:
disabledmechanisms = {
CKM_SHA_1
CKM_MD5
CKM_MD5_HMAC
CKM_SHA_1_HMAC
CKM_SSL3_MASTER_KEY_DERIVE
CKM_SSL3_KEY_AND_MAC_DERIVE
CKM_SSL3_PRE_MASTER_KEY_GEN
}
Refer to http://www.ibm.com/developerworks/java/jdk/security/ for the latest information
about Java security and Java Secure Socket Extensions.
Examine file /proc/driver/z90crypt
Examine the /proc/driver/z90crypt file to ensure that all options are set correctly.
When the rcz90crypt and rcpkcsslotd daemons are running, issue this command on the SSL
client and server machines to see the contents of file /proc/driver/z90crypt, which include
the statistics of the CEX2A SSL handshake acceleration:
cat /proc/driver/z90crypt
The output is similar to these lines. Bold is used for emphasis and is not part of the command
output.
zcrypt version: 2.1.0
Cryptographic domain: 8
Total device count: 2
PCICA count: 0
PCICC count: 0
PCIXCC MCL2 count: 0
PCIXCC MCL3 count: 0
CEX2C count: 0
CEX2A count: 2
requestq count: 0
pendingq count: 0
Total open handles: 0
Online devices: 1=PCICA 2=PCICC 3=PCIXCC(MCL2) 4=PCIXCC(MCL3) 5=CEX2C 6=CEX2A
0000060600000000 0000000000000000 0000000000000000 0000000000000000
Waiting work element counts
0000000000000000 0000000000000000 0000000000000000 0000000000000000
Per-device successfully completed request counts
00000000 00000000 00000000 00000000 00000000 0001420B 00000000 000150A7
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Explanation of file /proc/driver/z90crypt
This is the meaning of each of the significant fields of file /proc/driver/z90crypt.
Cryptographic domain: 8
The domain where the processors reside is number 8.
Total device count: 2 and CEX2A count: 2
There are two cryptographic processors, and they are both configured as CEX2A (accelerator)
devices.
Online device list
The CEX2A processors are denoted by the number 6 in this list. They are in positions 5 and 7
(counting from 0), reflecting the positions in their cryptographic candidate list.
Per-device successfully completed request counts list
The successfully completed device counts show the number (in hexadecimal) of packet
transmissions over the CEX2A adapter. They are also in positions 5 and 7, reflecting the
positions assigned during cryptographic card configuration.
The packet count shows the utilization of the processors for SSL handshakes, with 8 digits for
each queue.
Waiting work element counts list
When elements are reported here, the corresponding processor is overutilized. Each digit
represents one queue.
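The field layout described above can also be decoded programmatically. The following is a minimal sketch in Java, assuming the sample strings are copied from the command output shown earlier. The class and method names (Z90CryptFields, positionsOfType, counterValue) are hypothetical and are not part of any IBM tool.

```java
import java.util.ArrayList;
import java.util.List;

final class Z90CryptFields {
    // Device type code for CEX2A, taken from the "Online devices" legend (6=CEX2A).
    static final int CEX2A = 6;

    /** Returns the zero-based positions in the online-device mask whose digit equals typeCode. */
    static List<Integer> positionsOfType(String onlineDeviceMask, int typeCode) {
        String digits = onlineDeviceMask.replaceAll("\\s+", "");
        List<Integer> positions = new ArrayList<Integer>();
        for (int i = 0; i < digits.length(); i++) {
            if (Character.digit(digits.charAt(i), 16) == typeCode) {
                positions.add(i);
            }
        }
        return positions;
    }

    /** Converts one 8-digit hexadecimal request counter to a decimal value. */
    static long counterValue(String hexCounter) {
        return Long.parseLong(hexCounter.trim(), 16);
    }

    public static void main(String[] args) {
        System.out.println(positionsOfType("0000060600000000", CEX2A)); // [5, 7]
        System.out.println(counterValue("0001420B"));                   // 82443
        System.out.println(counterValue("000150A7"));                   // 86183
    }
}
```

In the sample output, the two CEX2A devices therefore completed 82443 and 86183 requests.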
Enabling the polling thread for the z90crypt device driver
To turn on the polling thread for the z90crypt device driver, issue this command:
modprobe z90crypt poll_thread=1
Configuring the environment to use software encryption
For the JSSE study, the environment must be set up to make use of software encryption.
Set up java.security file for software encryption
This is a description of the java.security file for Java Version 5.0. It is present and must be
tailored on both the JSSE server and client LPARs.
For software encryption, use the default settings in the java.security file. Different versions
and levels of Java might provide different java.security files for different security providers.
Issue this command:
cat /usr/lib64/jvm/java-1_5_0-ibm-1.5.0_sr7/jre/lib/security/java.security
security.provider.1=com.ibm.jsse2.IBMJSSEProvider2
security.provider.2=com.ibm.crypto.provider.IBMJCE
security.provider.3=com.ibm.crypto.pkcs11.provider.IBMPKCS11
security.provider.4=com.ibm.security.jgss.IBMJGSSProvider
security.provider.5=com.ibm.security.cert.IBMCertPath
security.provider.6=com.ibm.security.sasl.IBMSASL
Sample Java Security file for software encryption
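To confirm which providers a running JVM actually loaded from java.security, and in which order, the standard java.security.Security API can be queried. This is a minimal sketch. The class name ProviderOrder is hypothetical, and the names printed depend on the installed Java version and its java.security file.

```java
import java.security.Provider;
import java.security.Security;
import java.util.ArrayList;
import java.util.List;

final class ProviderOrder {
    /** Returns the names of the installed security providers in preference order. */
    static List<String> installedProviderNames() {
        List<String> names = new ArrayList<String>();
        for (Provider provider : Security.getProviders()) {
            names.add(provider.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        int position = 1;
        // Position 1 corresponds to security.provider.1 in java.security.
        for (String name : installedProviderNames()) {
            System.out.println("security.provider." + position++ + " -> " + name);
        }
    }
}
```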
Prepare libraries for the iKeyman utility
Use these commands to perform the library preparation for the iKeyman utility used in the
JSSE study.
Under Java Version 5, incorrect soft links must be fixed before running the iKeyman utility.
Issue these commands in the directories /usr/lib/opencryptoki (for 31-bit) and
/usr/lib64/opencryptoki (for 64-bit) to fix the soft links:
mv libopencryptoki.so.0.0.0 libopencryptoki.so
unlink libopencryptoki.so.0
ln -s libopencryptoki.so libopencryptoki.so.0
Define a keystore for software encryption using the iKeyman utility
Perform these steps to define and open a keystore for software encryption using the iKeyman
utility.
Use this procedure to open the cryptographic key database.
1. Invoke the iKeyman facility by issuing this command. The iKeyman facility is a graphical
user interface, so an X11 or VNC graphical environment should already be active before
running iKeyman.
java com.ibm.gsk.ikeyman.Ikeyman
2. Create a new Java key database if one does not already exist:
a. Click Key Database File -> New.
b. In the New window, complete these fields:
• Key database type - Accept the default of jks.
• File name - Type the file name. An example is testkeys.jks.
• Location - Enter a directory into which the JKS keystore will be stored.
c. Click OK.
d. At the password prompt:
• Type a password.
• Choose a password expiration date.
• Record the password for future use when opening the keystore.
• Click OK.
3. If a Java key database already exists:
a. Open the testkeys.jks keystore database with Key Database Type: JKS.
b. Enter a password for the software keystore database that will later be used for
software encryption testing.
For JSSE testing, keystore file testkeys.jks was in directory /home/jsse.
4. Create a self-signed certificate (done on both JSSE client and server):
a. Select Personal certificate in the pull-down menu under the label Keystore
content.
b. Select New self signed.
c. Enter a key label. The name does not matter, but it should be unique.
d. Accept all defaults.
e. Click OK.
5. To export, click Extract certificate, which will create a .cert file to import a certificate.
Do this on both JSSE client and server.
6. Transfer the .cert file to the local file system of each other system used in the study.
7. Select Signer certificates in the pull down menu under the label Keystore content.
8. Click Add.
9. Type the path of the .cert file.
10. Click Open.
11. Click OK on the next box.
12. Type a label. The name does not matter as long as there is only one, so use the same
name as on the original keystore.
13. Click OK.
14. Add (import) the certificate after extracting to a cert.arm file on the other systems that
will be doing SSL handshakes with this system.
When reviewing the signer certificates, you should see a signed certificate from the
other system as well as the signed certificates generated locally on this system.
It is advisable to use meaningful names, such as: certClient.arm and certServer.arm instead of
the default name cert.arm. An alternative is to use host names in the file names, to indicate
which certificate came from which system.
Workload description
This is a description of the workload and the steps needed to run the workload for the JSSE
study.
The workload consists of Java programs (classes) that are invoked on the JSSE client and
JSSE server IBM System z LPAR. Different programs are invoked, depending on whether
hardware or software encryption is requested by the test program. For hardware encryption,
the hardware provider com.ibm.crypto.pkcs11impl.provider.IBMPKCS11Impl
and its associated functions are invoked. The IBMJSSE2 provider is also used. For software
encryption, only the IBMJSSE2 provider is used.
After the providers are established, other parameters are processed to establish an SSL
session between the JSSE client and server. The keystore name and password are passed into
the program for both hardware and software encryption. For hardware encryption, different
keystores and certificates are used than for software encryption. For software encryption, the
previously created self signed JKS certificate from the keystore, named testkeys.jks, is used.
See Define a keystore for software encryption using the iKeyman utility.
For hardware encryption, the PKCS11 configuration file is specified, which provides the
placeholder keystore name /usr/lib/pkcs11/PKCS11_API.so64. The provider understands
that this library is only a placeholder for the hardware-based keystore. The passwords
submitted are the same passwords established using the iKeyman utility for the hardware or
software keystores. See Define a keystore for hardware encryption using the iKeyman utility.
The test programs start 20 client threads on the client. Each thread establishes an SSL
session with the server. An SSL handshake is performed and the session or connection is
established. Data of varying byte lengths is written to input socket streams by the client and
then returned bytes are read back from the server. Except in the case of cached connections,
the session is invalidated and the process is repeated beginning with a new SSL handshake
and a new exchange of data.
Each of the twenty client threads runs for a designated period of time. Thus, the speed and
efficiency of running with hardware or software encryption can be measured by how many
connections and data exchanges are made within that time period. A high-speed network
using HiperSockets is used so that performance is not impacted by network latency.
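The per-thread client loop described above can be sketched as follows. This is not the test program used in the study. It is a minimal illustration that assumes the server echoes back exactly as many bytes as it receives, and the names HandshakeLoopSketch, payload, and runClient are hypothetical. Only standard JSSE calls are used (SSLSocket.startHandshake and SSLSession.invalidate).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

final class HandshakeLoopSketch {
    /** Builds a payload of the requested size; the content is arbitrary. */
    static byte[] payload(int packetBytes) {
        byte[] data = new byte[packetBytes];
        Arrays.fill(data, (byte) 'x');
        return data;
    }

    /**
     * Repeats connect, handshake, write, and read until the run time has elapsed.
     * Returns the number of completed round trips for this client thread.
     */
    static long runClient(SSLSocketFactory factory, String host, int port,
                          int packetBytes, long runMillis, boolean invalidate)
            throws IOException {
        byte[] out = payload(packetBytes);
        byte[] in = new byte[packetBytes];
        long roundTrips = 0;
        long deadline = System.currentTimeMillis() + runMillis;
        while (System.currentTimeMillis() < deadline) {
            SSLSocket socket = (SSLSocket) factory.createSocket(host, port);
            try {
                socket.startHandshake();
                OutputStream toServer = socket.getOutputStream();
                toServer.write(out);
                toServer.flush();
                // Assumption: the server returns the same number of bytes.
                new DataInputStream(socket.getInputStream()).readFully(in);
                roundTrips++;
                if (invalidate) {
                    // Uncached case: the next connection needs a full handshake.
                    socket.getSession().invalidate();
                }
            } finally {
                socket.close();
            }
        }
        return roundTrips;
    }
}
```

Running 20 such loops in parallel threads, each for the same run time, would correspond to the workload shape described here.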
The performance of these two operations is analyzed in the JSSE study:
• The SSL handshakes when running with hardware encryption are performed on the IBM
Crypto Express2 feature. While the exchanged data is encrypted with symmetric keys, the
negotiation of those symmetric keys is protected with
asymmetric keys. The key generation and exchange is a high-CPU load operation that
takes place during the SSL handshake. This operation runs on the IBM Crypto Express2
feature when hardware encryption is requested.
• The data encryption with symmetric keys takes place as the data is sent to and read from
the input and output buffers for the open socket. When hardware encryption is requested
and the cipher used is supported, the data is encrypted and decrypted by the Central
Processor Assist for Cryptographic Function (CPACF). Hashing of data is also performed
by the CPACF.
Thus, the performance characteristics of hardware encryption are affected by two different
devices or functions, the IBM Crypto Express2 feature and the CPACF.
Variations
Variations of setup characteristics and parameters were used in the JSSE study, to show the
effect of these variations on workload output and performance.
Ciphers
A different cipher suite is used for each run of a test case. Each test case is therefore run four times,
once using each cipher. The full cipher name is listed, followed by a shortened name in
parentheses. The shortened name is used throughout this document.
• SSL_RSA_WITH_RC4_128_MD5 (RC4-128)
• SSL_RSA_WITH_3DES_EDE_CBC_SHA (3DES)
• SSL_RSA_WITH_AES_128_CBC_SHA (AES-128)
• SSL_RSA_WITH_AES_256_CBC_SHA (AES-256)
These ciphers follow this naming schema:
• All ciphers use the RSA cipher for the SSL handshake (asymmetric algorithm for public-key cryptography).
• The symmetric ciphers for data encryption are RC4-128, 3DES, AES-128, and AES-256.
• CBC stands for cipher-block chaining, where each block of plaintext is XOR'd with the
previous ciphertext block before being encrypted. This way, each ciphertext block is
dependent on all plaintext blocks processed up to that point, which increases the
security level. See http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation.
• MD5 and SHA are the hash algorithms used to compute the message authentication code (MAC) of each record.
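The naming schema can be illustrated by splitting a suite name into its three parts. This is a minimal sketch that assumes names of the form SSL_(key exchange)_WITH_(cipher)_(hash), which holds for the four suites used in this study but not necessarily for every suite a JVM supports. The class CipherSuiteNames and its methods are hypothetical helpers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CipherSuiteNames {
    /** Returns { key exchange, symmetric cipher and mode, hash } for a suite name. */
    static String[] parts(String suite) {
        String body = suite.startsWith("SSL_") ? suite.substring(4) : suite;
        int with = body.indexOf("_WITH_");
        if (with < 0) {
            throw new IllegalArgumentException("Unexpected suite name: " + suite);
        }
        String keyExchange = body.substring(0, with);
        String rest = body.substring(with + "_WITH_".length());
        int lastUnderscore = rest.lastIndexOf('_');
        if (lastUnderscore < 0) {
            throw new IllegalArgumentException("Unexpected suite name: " + suite);
        }
        return new String[] {
            keyExchange,
            rest.substring(0, lastUnderscore),   // for example AES_128_CBC
            rest.substring(lastUnderscore + 1)   // for example SHA
        };
    }

    /** Maps the full suite names to the shortened names used in this document. */
    static Map<String, String> shortNames() {
        Map<String, String> names = new LinkedHashMap<String, String>();
        names.put("SSL_RSA_WITH_RC4_128_MD5", "RC4-128");
        names.put("SSL_RSA_WITH_3DES_EDE_CBC_SHA", "3DES");
        names.put("SSL_RSA_WITH_AES_128_CBC_SHA", "AES-128");
        names.put("SSL_RSA_WITH_AES_256_CBC_SHA", "AES-256");
        return names;
    }
}
```

A test program could pass one of the full names to SSLSocket.setEnabledCipherSuites to restrict a run to a single suite. Whether the study's programs did so in this way is not stated.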
Each cipher has different workload and performance characteristics. Some ciphers might not
be supported by either one or both of the cryptographic hardware devices (CPACF and
CEX2A).
Hardware versus software encryption
The performance characteristics of the hardware cryptographic operations and key storage
and retrieval are compared to the same operations running with software encryption on the
Java virtual machine. Different ciphers and packet sizes are used to understand performance
attributes under a variable workload.
Packet size
Different packet sizes are used to create six separate tests using packet sizes from 1 byte to
262144 bytes. The smaller packet sizes would have a higher relative percentage of the
resources dedicated to the SSL handshakes, which in hardware encryption mode are run on
the IBM Crypto Express2 feature.
The larger packet sizes would cause more resources to be devoted to the encryption,
decryption, and hashing of the data. For hardware encryption, the data encryption and
hashing would run on the Central Processor Assist for Cryptographic Function (CPACF).
Hardware encryption with and without polling thread (poll_thread=1)
The z90crypt device driver provides a polling thread. The polling thread queries the
cryptographic adapter for finished cryptography requests that were off-loaded to the
cryptographic adapter. The use of the polling thread provides a trade-off. The requests from
the cryptographic adapter are normally retrieved only once every 1/100th of a second,
because this is the rate of the Linux kernel timer. The limit then becomes 100 SSL
handshakes per second.
Enabling the polling thread relaxes this limit when the adapter is lightly loaded, such as with workloads with
fewer than eight parallel connections. There is a CPU cost to run an additional thread that
performs repeated polling operations. The throughput and CPU utilization are investigated by
running the z90crypt device driver with and without the polling thread enabled for all the
different ciphers and packet sizes.
Cached or uncached session
It is possible to pass an invalidate flag to the test program and have the invalidate() function
run using the session object. This invalidates the session, meaning any handshakes need to be
repeated. With session caching, even after session close, the keys and identity of the client are
maintained. This means that when a subsequent new connection and session is requested, a
new handshake is not required.
The JSSE routines that handle SSL handshakes therefore do not need to run to generate new
encryption keys using the resource-intensive asymmetric key generation. For hardware
encryption, this means that the use of the IBM Crypto Express2 feature will be minimal. With
this test and a very small packet size, it is possible to isolate the effect of handshakes on JSSE
throughput, because the effort for data encryption is at the minimum. Most of the tests in this
study are run with uncached sessions, in order to specifically exercise the SSL handshake
mechanism.
Client authentication
Client authentication is an authentication mechanism where the server requests a certificate
from the client to verify that the client is what it claims to be. The certificate must be an
X.509 certificate and signed by a Certificate Authority (CA) trusted by the server.
The client typically receives the server's certificates and checks to see if the server is on the
list of trusted Certificate Authorities (CA). If the server is on the list, or if the client decides to
trust the server, the certificate is accepted. Client authentication is an optional security step
that goes beyond this trust mechanism to increase the level of security.
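In JSSE, the server side decides whether a client certificate is required. The following minimal sketch shows the standard calls involved. The class ClientAuthSketch is hypothetical, and javax.net.ssl.SSLParameters is only available from Java Version 6 onward, so on Java Version 5 only the SSLServerSocket variant applies.

```java
import javax.net.ssl.SSLParameters;
import javax.net.ssl.SSLServerSocket;

final class ClientAuthSketch {
    /** Builds SSL parameters that do or do not require a client certificate. */
    static SSLParameters serverParameters(boolean requireClientCertificate) {
        SSLParameters parameters = new SSLParameters();
        parameters.setNeedClientAuth(requireClientCertificate);
        return parameters;
    }

    /** Applies the same setting directly to a listening SSL server socket. */
    static void configure(SSLServerSocket serverSocket, boolean requireClientCertificate) {
        // With true, the handshake fails if the client presents no trusted certificate.
        serverSocket.setNeedClientAuth(requireClientCertificate);
    }
}
```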
Workload output and performance data
These steps were used to collect the workload output and performance data for the JSSE
study.
Throughput and transaction rate
The transaction rate is the throughput as reported by the JSSE client summary reports. This
is calculated as the sum of the number of round trip packets per client multiplied by the
number of bytes per packet and then divided by the run time in seconds.
Throughput values are normalized so that the result from the test case using cipher
RC4-128 with a 2 KB packet size and software encryption becomes 1. For the 1 byte
packets, the throughput is very low due to the small packet size, so these results are given a
value of zero.
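The calculation described above can be written down as a short sketch. The class and method names are hypothetical, and the input values in the usage note are made-up illustration values, not results from the study.

```java
final class ThroughputMetrics {
    /**
     * Sum of round-trip packets over all clients, multiplied by the bytes per
     * packet, divided by the run time in seconds.
     */
    static double bytesPerSecond(long[] roundTripsPerClient, long bytesPerPacket,
                                 double runSeconds) {
        long totalRoundTrips = 0;
        for (long roundTrips : roundTripsPerClient) {
            totalRoundTrips += roundTrips;
        }
        return (totalRoundTrips * bytesPerPacket) / runSeconds;
    }

    /** Expresses a throughput relative to the baseline test case (baseline becomes 1). */
    static double normalized(double throughput, double baselineThroughput) {
        return throughput / baselineThroughput;
    }
}
```

For example, two clients with 10 and 20 round trips of 2048-byte packets in 2 seconds give (30 x 2048) / 2 = 30720 bytes per second. A test case with twice the baseline throughput has a normalized value of 2.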
CPU utilization
The CPU utilization is obtained during the steady-state part of the JSSE workload. To
compare the results of different CPU numbers, the values are normalized so that 100% CPU
utilization means that four CPUs are fully utilized.
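Because 100% is defined as four fully utilized CPUs, a normalized percentage can be converted into an equivalent number of CPUs. This is a minimal sketch of that arithmetic. The class CpuNormalization and its methods are hypothetical.

```java
final class CpuNormalization {
    // The study normalizes so that 100% means four fully utilized CPUs.
    static final int REFERENCE_CPUS = 4;

    /** Converts a number of fully busy CPUs into the normalized percentage. */
    static double normalizedPercent(double fullyBusyCpus) {
        return fullyBusyCpus / REFERENCE_CPUS * 100.0;
    }

    /** Converts a normalized percentage (or a percentage difference) into CPUs. */
    static double equivalentCpus(double normalizedPercent) {
        return normalizedPercent / 100.0 * REFERENCE_CPUS;
    }
}
```

For example, one fully busy CPU corresponds to 25%, and a difference of 47 percentage points corresponds to 1.88 CPUs.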
Collecting CPU utilization data
Data from the sar command is collected during the steady-state interval by running the sadc
command in a shell script as a background job.
/usr/lib64/sa/sadc $sample_interval $totsamps sadc.out.$HOSTNAME &
The sample_interval variable provides the time between sar snapshots. The total number of
samples is typically calculated based on the steady state time in seconds divided by the
sample interval.
For example, in a bash script: let totsamps=$steadystate_secs/$sample_interval
The sadc command output file is post-processed as follows:
sar -A -f sadc.out.$HOSTNAME >sar.out.$HOSTNAME
Results
The JSSE study produced performance data, which is analyzed to make observations and
draw conclusions.
Results are observed and analyzed according to these categories:
• Java 1.5 SR9 JSSE software and hardware comparison
• Java 1.5 SR9 JSSE hardware comparison with and without the z90crypt device driver
polling thread
• Java 1.5 SR9 JSSE SSL logon method comparison
• Java release to release comparisons
Java 1.5 SR9 JSSE software and hardware comparison
JSSE encryption with Java 1.5 SR9 was studied, using both software and hardware
encryption.
To compare the results of software and hardware JSSE encryption, test cases are run that
target both software and hardware cryptography. Four different ciphers were used, and for
each of the ciphers uncached SSL sessions were established. Data of six different packet sizes
is sent and received over an SSL socket to simulate real-life secure-socket layer (SSL)
handshakes, connections, and encrypted data transfer. Packets transmitted and CPU
utilization are measured on both the JSSE client and server LPARs.
Offloading work from the general purpose CPUs to specialized hardware (CEX2A, CPACF)
can improve two parameters:
• CPU resources
• Data throughput
While saving CPU resources is important, high data throughput is also important. To
determine if cryptographic hardware is able to provide equivalent or better performance than
software encryption, SSL handshakes are run on the IBM Crypto Express2 feature, and the
data encryption and decryption is run on the CPACF.
The results are summarized in tabular format in Appendix A, Results tables, in Table 3,
Table 4, and Table 5.
The results are displayed graphically in Figure 5, Figure 6, Figure 7, and Figure 8.
Figure 5. Java 1.5 SR9 JSSE software and hardware comparison: Throughput
[Chart: Java 5 SR9 - encryption performance - CPU load Server. Vertical axis: 0 to 100; horizontal axis: packet size in KB (0 to 250). Series: SW and HW for each of RC4-128, 3DES, AES-128, and AES-256.]
Figure 6. Java 1.5 SR9 JSSE software and hardware comparison: Server CPU utilization
Figure 7. Java 1.5 SR9 JSSE software and hardware comparison: Client CPU utilization
Figure 8. Java 1.5 SR9 JSSE software and hardware comparison: Normalized throughput and
CPU utilization, client and server with 2KB packets
Observations
The normalized throughput results show that JSSE encryption on hardware for packets larger
than 2K (2000) bytes has a significantly higher throughput than the equivalent cipher and
packet size running with software encryption. Hardware and software encryption throughput,
however, are equivalent for cipher RC4-128.
When using hardware encryption, the throughput scales linearly as packet size increases, with
a slight deviation from linearity for packets of size greater than 128 KB. For software
encryption, throughput only modestly improves as packet sizes are increased above 64 KB,
and for some ciphers throughput growth is nearly flat. On the cryptographic hardware, when
packet sizes are increased from 64 KB to 256 KB, throughput triples.
Overall, CPU utilization when using the cryptographic hardware on the client and server for
the JSSE workload is much lower than the equivalent workload running only with software
encryption.
The CPU utilization difference between hardware and software encryption is greatest on the
JSSE server. With 2 KB packet sizes (see Figure 8), the throughput for hardware and software
encryption is nearly equivalent. Server CPU utilization is less than 10% for all ciphers on
cryptographic hardware. In software mode, by contrast, server CPU utilization is nearly 45%
for most ciphers. On the client using a 2 KB packet size, the CPU utilization is 1% higher
running in hardware than running in software. This difference is in the range of normal
variation from run to run, and the results could be considered as equivalent. For 20 KB
packet sizes, the server CPU utilization is 40% lower running in hardware, and client CPU
utilization is 7% lower, saving 47%, or nearly two fully-utilized CPUs. For cipher 3DES, the
20 KB packet sizes do even better using hardware encryption, with a 16% improvement in
throughput using 68% fewer CPU resources, saving more than three fully-utilized CPUs.
On the server, when using the cryptographic hardware for packet sizes of 128KB using
ciphers AES-128 and AES-256, the CPU utilization is less than half of the CPU utilization
found on JSSE operations running with software encryption. For the SSL client running on
the cryptographic hardware with these ciphers, the CPU utilization is approximately one-half
the CPU utilization of running with software encryption. For cipher 3DES with a 128KB
packet size, the CPU savings are about half of the client CPU resources and also half of the
server CPU resources.
For software encryption, the CPU utilization is higher on the server than on the client. This is
especially true for the smaller packet sizes. As packet sizes increase, CPU utilization on the
client begins to grow more quickly than on the server. Running the encryption in software,
there are larger differences in both the CPU utilization and throughput among the different
ciphers. Running in cryptographic hardware mode, the ciphers perform much more uniformly
with respect to both CPU utilization and throughput.
Cipher RC4-128 is an exception. On the server when running on the cryptographic hardware,
there is a 30% or more reduction in CPU utilization. On the client however, the CPU
utilization with cipher RC4-128 is equivalent whether running with software encryption or
hardware encryption.
Conclusions
Offloading the computing effort for the cryptographic algorithms from the CPU to both the
IBM Crypto Express2 feature and the CPACF saves CPU cycles. Relieving the main
processors from the task of performing cryptography functions is one advantage of using
cryptographic hardware.
Despite lower CPU utilization, at packet sizes greater than 20KB, throughput is significantly
higher when running with cryptographic hardware. As packet sizes are scaled, using hardware
encryption provides an increasing throughput advantage when compared to the Java software
implementation.
Only for small packet sizes, including the 1 byte packets, software throughput is equivalent or
slightly superior to hardware throughput. This is consistent with the cost of having to perform
a switch from running on the CPUs to running on the cryptographic hardware and back to
the CPUs when using the cryptographic hardware for only a small amount of data. The path
length of these switches could be longer, and CPU savings, especially for very small packets,
come with the cost of reduced throughput.
Typical internet packets are normally between 1.4 KB and 9 KB in size, so the data at packet
sizes 2 KB and 20 KB in this study are of special interest. At the larger packet size of 20KB, for
example, the total CPU savings on the server are 41% versus approximately 34% for the 2 KB
packets. The CPU savings difference for 20 KB packets is even greater with cipher 3DES,
with a savings of 28% on the server using the 2 KB packets, and a 48% server CPU savings
when using the 20KB packets. This savings is another reason to recommend large frames or
jumbo frames instead of the typical default frame size of 1.4 KB, when setting up networks
that might exchange data with cryptographic hardware. Even when not every packet
requests a new session, the improvement from the CPACF is much higher with the larger
packets that jumbo frames provide.
The cryptographic hardware also provides more uniform performance than software
encryption when comparing the different ciphers and the cipher is fully supported by the
cryptographic hardware.
The cryptographic hardware, which includes the CPACF and the IBM Crypto Express2
feature, does not support all ciphers. Cipher RC4-128 is not supported on the CPACF feature.
There is traffic on the IBM Crypto Express2 feature reported in tests using this cipher, caused
by the RSA cipher used for the SSL handshakes.
When the packet sizes are small, most of the workload is in creating the asymmetric keys during
SSL handshakes. Therefore, small packet sizes show greatly reduced use of CPU resources
in cryptographic hardware mode, because most of the CPU-intensive work is running on the
IBM Crypto Express2 feature. For cipher RC4-128, as packet size increases the CPU
utilization savings decreases when comparing hardware and software encryption, because the
CPACF is not used for bulk data encryption with this cipher.
The fact that there is no throughput difference between hardware and software encryption
with cipher RC4-128 might mean that for the other three ciphers, the CPACF is providing
the major contribution to the higher throughput. The CPU-intensive SSL handshakes might
not impede throughput when running software encryption, as long as there are sufficient CPU
resources available, which is the case in this study.
The main advantage of the IBM Crypto Express2 feature running in CEX2A accelerator mode
might be the offloading of CPU cycles to the feature, and the resulting large reduction in CPU
utilization on both SSL client-side and especially SSL server-side operations. The main
advantage of the CPACF feature seems to be the improvement in throughput when
transferring data.
Java 1.5 SR9 JSSE hardware encryption with and without the z90crypt device driver polling
thread
JSSE hardware encryption, with Java 1.5 SR9, was studied both with and without the polling
thread used for the z90crypt device driver.
The z90crypt device driver at level 2.1.0 and higher provides a configurable polling thread.
Starting with Novell SUSE Linux Enterprise Server (SLES) 10 SP2 and Red Hat Enterprise
Linux (RHEL) 5.2, the polling thread default setting is disabled. In earlier releases, the
polling thread was enabled by default.
The reasons for running with the polling thread enabled are detailed in Variations. For the
JSSE workload with 20 concurrent threads, the benefits and costs to packet throughput and
CPU utilization are measured with and without the z90crypt polling thread enabled on both
the JSSE client and server. Four different ciphers are used with six different packet sizes. The
repeated creation, connection, and destruction of SSL connection sockets across the threads
is expected to drive high workloads on the IBM Crypto Express2 feature as it creates new
asymmetric keys for each new connection.
The results are summarized in tabular format in Appendix A, Results tables, tables Table 6,
Table 7, and Table 8.
Figure 9 shows the impact on the CPU cost.
[Chart: CEX2A usage with and without polling thread - CPU cost on the server (with server authentication). Vertical axis: normalized throughput per 1% CPU load (0 to 200); horizontal axis: packet size in KB (2, 20, 64, 128, 256). Series: RC4-128, 3DES, AES-128, and AES-256, each with and without polling.]
Figure 9. Java 1.5 SR9 JSSE hardware encryption with and without polling thread: CEX2A
usage
Observations
Enabling the polling thread on both the JSSE client and server provides small gains in
throughput on many of the JSSE tests at the expense of additional CPU utilization,
particularly on the JSSE server. The comparison of throughput values is derived by a ratio of
the normalized throughput values. The CPU comparisons are made using the differences
between CPU utilization values.
• For 2KB packets on all ciphers, there is an improvement in packet throughput between
10% and 12% when using the polling thread. This comes at the expense of additional
CPU utilization, which means a relative increase between 33% and 53% on the JSSE
server, and approximately 20% on the client.
• For 20KB packets using the polling thread, there is:
– A 10% increase in throughput for ciphers RC4-128 and AES-128
– A 26% increase in throughput for cipher AES-256
– A 1% decrease in throughput for cipher 3DES
• For the larger packet sizes, the impact of having the polling thread enabled diminishes.
An exception is cipher RC4-128, where the increase of throughput varies from 3% to 13%
for the different packet sizes, but not consistently.
The increase in throughput is mostly related to additional CPU cost for the polling thread. As
Figure 9 shows for the server, in most cases there is more throughput from 1% CPU without
the polling thread. The client behaves in a similar manner.
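The trade-off can be made concrete by computing how the throughput obtained per 1% CPU changes when both throughput and CPU utilization increase. This is a minimal sketch. The class PollingEfficiency is hypothetical, and pairing the 10% throughput gain with the 33% relative CPU increase (both taken from the 2KB server observations) is an assumption made only for illustration.

```java
final class PollingEfficiency {
    /**
     * Ratio of throughput per 1% CPU with polling to the same value without
     * polling: (1 + throughput gain) / (1 + relative CPU increase).
     * A result below 1 means each percent of CPU yields less throughput.
     */
    static double throughputPerCpuRatio(double throughputGain, double relativeCpuIncrease) {
        return (1.0 + throughputGain) / (1.0 + relativeCpuIncrease);
    }

    public static void main(String[] args) {
        // 10% more throughput at 33% more CPU: about 0.83 of the original efficiency.
        System.out.println(throughputPerCpuRatio(0.10, 0.33));
    }
}
```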
Conclusions
It is possible to get approximately a 10% improvement in packet throughput rates using the
polling thread up to the 20KB packet size. This improvement comes at higher CPU cost,
which means higher CPU utilization per unit of throughput.
Generally, the use of the polling thread is recommended only where the number of parallel
SSL connection threads is less than eight. See
http://www.zjournal.com/printItem.cfm?section=article&aid=1149. In these cases, the
cryptography load of SSL handshakes on the IBM Crypto Express2 feature adapter in CEX2A
mode is rather light. When the load on the feature is light, there are often finished encryption
requests on the feature that are waiting for the Linux kernel timer to pop. In these cases,
throughput is improved by having an enabled polling thread query the IBM Crypto Express2
feature so that finished requests can be retrieved and processed without having to wait for the
timer pop.
From the results of the study, there is at least a slight improvement for most cases when using
the polling thread. The best gains in packet throughput come with packets between 2KB and
20KB in size, at increasing CPU cost.
If you are running Linux kernel version 2.6.27 or later in an LPAR or z/VM®, a high
resolution timer is used instead of the standard timer, enabling faster querying of the
cryptographic device. AP adapter interrupts are also replacing polling on newer kernel
releases on the z10™ EC or later models.
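Whether the polling thread is active can usually be checked from user space. The sketch below assumes that the AP bus driver exposes the setting as the sysfs attribute /sys/bus/ap/poll_thread (1 for enabled, 0 for disabled). That path is an assumption based on common Linux on System z kernels, not something stated in this section, so verify it against the device driver documentation for your distribution. The class name is illustrative.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class PollThreadCheck {
    // Assumed sysfs location; confirm it for your kernel level.
    static final Path DEFAULT_ATTRIBUTE = Paths.get("/sys/bus/ap/poll_thread");

    /** Maps the raw attribute content to a readable state. */
    static String interpret(String raw) {
        String value = raw.trim();
        if (value.equals("1")) {
            return "enabled";
        }
        if (value.equals("0")) {
            return "disabled";
        }
        return "unknown";
    }

    /** Reads the attribute; returns "unknown" if it is missing or unreadable. */
    static String state(Path attribute) {
        try {
            byte[] bytes = Files.readAllBytes(attribute);
            return interpret(new String(bytes, StandardCharsets.US_ASCII));
        } catch (IOException e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("AP polling thread: " + state(DEFAULT_ATTRIBUTE));
    }
}
```

On a system without the AP bus driver the attribute does not exist and the sketch reports "unknown" instead of failing.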
Java 1.5 SR9 JSSE SSL logon method comparison
JSSE SSL logon methods were compared using Java 1.5 SR9.
It is interesting to compare different SSL logon methods, to examine the throughput and
resource requirements of SSL handshakes and client authentication. Three different logon
methods are explored:
• Client authentication
• Cached - where SSL sessions are not invalidated and therefore are cached
• Uncached - where the SSL session is invalidated.
  This is the method used in all the other studies of this paper, because it forces a new SSL
  handshake when each new session is created.
Client authentication data for the cryptographic hardware are not available in this study. For
client authentication, the SSL sessions are also uncached. For these studies, the packet size is
1 byte, in order to minimize the effects of bulk data encryption.
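The three logon methods map onto standard JSSE calls. The following is a minimal sketch under stated assumptions, not the test program used in the study: it uses SSLEngine and the JVM default SSLContext for brevity, and the class, enum, and method names are hypothetical. Client authentication is requested with setNeedClientAuth, and an uncached logon is produced by invalidating the SSLSession so that the next connection must perform a full handshake.

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLSession;

final class LogonMethods {
    /** The three logon variations compared in this section. */
    enum Logon { CLIENT_AUTH, CACHED, UNCACHED }

    /** Creates a server-side engine; only CLIENT_AUTH demands a client certificate. */
    static SSLEngine newServerEngine(Logon logon) {
        try {
            SSLEngine engine = SSLContext.getDefault().createSSLEngine();
            engine.setUseClientMode(false);
            engine.setNeedClientAuth(logon == Logon.CLIENT_AUTH);
            return engine;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("no default SSLContext available", e);
        }
    }

    /** Client authentication runs are also uncached, so only CACHED keeps the session. */
    static boolean forcesFullHandshake(Logon logon) {
        return logon != Logon.CACHED;
    }

    /** Call when a connection is finished to apply the chosen logon method. */
    static void endOfConnection(SSLSession session, Logon logon) {
        if (forcesFullHandshake(logon)) {
            session.invalidate(); // the session can no longer be resumed
        }
    }

    public static void main(String[] args) {
        for (Logon logon : Logon.values()) {
            SSLEngine engine = newServerEngine(logon);
            System.out.println(logon + ": needClientAuth=" + engine.getNeedClientAuth()
                    + ", fullHandshakeEachTime=" + forcesFullHandshake(logon));
        }
    }
}
```

The same two calls, setNeedClientAuth and SSLSession.invalidate, are also available when working with SSLServerSocket and SSLSocket.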
The results are summarized in tabular format in Appendix A, Results tables, in Table 9
and Table 10.
The results are displayed graphically in Figure 10 and Figure 11. Because the throughput
values for the different cipher suites are very close, only CPU utilization results are displayed.
[Bar chart: %CPU utilization on the JSSE server (scale 0 to 45) for each logon variation
(RC4-128, 3DES, AES-128, and AES-256, each cached and uncached); series: Software, Hardware]
Figure 10. Java 1.5 SR9 JSSE SSL logon method comparison: Cached and uncached, JSSE
server, cryptographic hardware and software
[Bar chart: %CPU utilization on the JSSE client (scale 0 to 45) for each logon variation
(RC4-128, 3DES, AES-128, and AES-256, each cached and uncached); series: Software, Hardware]
Figure 11. Java 1.5 SR9 JSSE SSL logon method comparison: Cached and uncached, JSSE
client, cryptographic hardware and software
Observations
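Client authentication using software encryption requires 40% or more additional CPU
resource on the JSSE client (about 49% utilization, compared with 6% to 7% for uncached
sessions without client authentication; see Table 10). Client authentication also reduces
throughput slightly, at least when running in software.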
Cached SSL sessions greatly reduce the CPU load on the JSSE server when running with
software encryption (from about 40% to between 5% and 8%; see Table 10), with little or no
change on the client side. Running on the cryptographic hardware, the cached SSL sessions
have a CPU load of approximately 16% on both the client and the server, except for cipher
RC4-128, where it is approximately 6% on both. Caching the sessions almost doubles the
throughput, whether running with hardware or software encryption. The throughput for
cached sessions is equivalent with hardware and software encryption.
Comparing uncached sessions with hardware and software encryption, there are significant
CPU savings on the server when using cryptographic hardware instead of software. While
the CPU load on the client increases by about 3%, the CPU load on the server drops by
approximately 32% (36% for cipher RC4-128), measured in absolute CPU utilization (see
Table 10). The throughput for uncached sessions with 1 byte packets is slightly lower on
hardware than on software.
Conclusions
Client authentication running with software encryption requires significantly more CPU
resources on the client because the additional keys in the certificate presented by the client
require CPU-intensive operations on the client.
When running with software encryption, the 30% to 35% savings in CPU resources using
cached SSL sessions probably represent the typical cost of running the SSL handshakes in
the Java software stack. Cached sessions have nearly double the throughput of uncached
sessions with either hardware or software encryption. This shows that caching the SSL
information is a very effective way to improve throughput and reduce CPU utilization.
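In JSSE, cached sessions are held in an SSLSessionContext, whose size and timeout control how long a session remains available for reuse. The following is a minimal sketch, assuming a server that wants to keep sessions cached; the class name is hypothetical and the cache size and timeout values are arbitrary illustrations, not settings from the study. A real server would also pass its own key managers and trust managers to init.

```java
import java.security.GeneralSecurityException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSessionContext;

final class SessionCacheConfig {
    /** Builds an SSLContext whose server-side session cache uses the given limits. */
    static SSLContext newServerContext(int maxSessions, int timeoutSeconds) {
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            // null arguments select the default key managers, trust managers, and RNG.
            context.init(null, null, null);
            SSLSessionContext cache = context.getServerSessionContext();
            cache.setSessionCacheSize(maxSessions);  // maximum number of cached sessions
            cache.setSessionTimeout(timeoutSeconds); // lifetime of a cached session
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("cannot create SSLContext", e);
        }
    }

    public static void main(String[] args) {
        SSLSessionContext cache = newServerContext(1000, 300).getServerSessionContext();
        System.out.println("cache size=" + cache.getSessionCacheSize()
                + ", timeout=" + cache.getSessionTimeout() + "s");
    }
}
```

A longer timeout or a larger cache keeps more sessions resumable, at the cost of memory for the cached session state.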
For the uncached sessions, the throughput is slightly lower with cryptographic hardware
encryption compared to software encryption for the 1 byte packets. This is again consistent
with the cost of using the cryptographic hardware for packets of a trivial size. Using very small
packets does not demonstrate the strength of the cryptographic hardware, but even with the
small reduction in throughput, the 1 byte packets still show a minimum of 28% savings in
overall CPU utilization.
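The overall savings figure can be reproduced from the uncached columns of Table 10 by adding server and client CPU utilization for software and for hardware and taking the difference. The following is a small sketch of that arithmetic; the class and method names are illustrative and the result is a difference in absolute CPU utilization.

```java
final class OverallCpuSavings {
    static final String[] CIPHERS = {"RC4-128", "3DES", "AES-128", "AES-256"};
    // Uncached columns of Table 10, in %CPU (1 byte packet size).
    static final double[] SW_SERVER = {40.29, 40.32, 41.68, 41.52};
    static final double[] SW_CLIENT = {5.78, 6.66, 6.48, 6.75};
    static final double[] HW_SERVER = {3.99, 9.08, 8.92, 9.16};
    static final double[] HW_CLIENT = {4.52, 9.75, 9.54, 9.83};

    /** Combined server plus client CPU saved by hardware for cipher index i. */
    static double savings(int i) {
        return (SW_SERVER[i] + SW_CLIENT[i]) - (HW_SERVER[i] + HW_CLIENT[i]);
    }

    /** Smallest combined saving across all ciphers. */
    static double minimumSavings() {
        double min = Double.MAX_VALUE;
        for (int i = 0; i < CIPHERS.length; i++) {
            min = Math.min(min, savings(i));
        }
        return min;
    }

    public static void main(String[] args) {
        for (int i = 0; i < CIPHERS.length; i++) {
            System.out.printf("%-8s saves %.2f points of CPU%n", CIPHERS[i], savings(i));
        }
        System.out.printf("minimum: %.2f%n", minimumSavings());
    }
}
```

With the Table 10 values, the smallest combined saving is for cipher 3DES, at roughly 28 points of CPU utilization, which agrees with the minimum quoted above.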
Java release to release comparisons
All the other tests of the JSSE study used Java Version 5.0 SR9. The tests were repeated with
two other Java versions, to assess the impact on the results.
To identify differences with the Java releases before and after the release used in this study
(Java Version 5.0 SR9), the tests were repeated with all the possible combinations of packet
sizes and ciphers using:
• Java Version 5.0 SR7
• Java Version 6.0 SR4
The only difference was that Java Version 6.0 SR4 did not support the hardware encryption,
so in this case only software encryption was tested.
No significant difference in either throughput or CPU utilization was observed between
these two Java versions and Java Version 5.0 SR9. The JSSE APIs in the Java software
performed equivalently in all tests.
Appendix A. Results tables
These tables contain the raw data obtained in the JSSE study.
Tables for Java 1.5 SR9 JSSE software and hardware comparison
Table 3. Java 5 SR9 JSSE hardware and software encryption: Normalized throughput

Normalized throughput by packet size. H = hardware, S = software JSSE functions.

H/S  Cipher    1 byte    2 KB   20 KB   64 KB  128 KB  256 KB
S    RC4-128      0.0     1.0     8.8    29.5    54.8    91.5
S    3DES         0.0     1.0     7.7    16.8    21.1    23.3
S    AES-128      0.0     0.9     9.0    24.5    37.8    47.0
S    AES-256      0.0     1.0     8.8    23.1    33.5    41.0
H    RC4-128      0.0     0.9     9.2    29.5    57.2    93.2
H    3DES         0.0     0.9     9.2    28.7    54.1    88.4
H    AES-128      0.0     0.9     9.2    28.8    54.4    94.2
H    AES-256      0.0     0.9     9.0    29.3    55.4    93.1
Table 4. Java 5 SR9 JSSE hardware and software encryption: Server CPU utilization

Server CPU utilization (%) by packet size in bytes. H = hardware, S = software JSSE functions.

H/S  Cipher    1 byte    2048   20480   65536  131072  262144
S    RC4-128    40.29   42.65   41.12   55.83   65.97   81.98
S    3DES       40.32   44.21   62.84   87.50   94.01   96.92
S    AES-128    41.68   38.67   53.35   73.55   88.97   95.88
S    AES-256    41.52   43.03   57.11   77.32   90.38   94.99
H    RC4-128     3.99    4.62    9.44   20.06   35.91   56.17
H    3DES        9.08    9.62   14.14   26.23   42.35   62.62
H    AES-128     8.92    9.49   13.34   24.26   38.70   58.97
H    AES-256     9.16    9.63   13.40   25.31   39.35   61.21
Table 5. Java 5 SR9 JSSE hardware and software encryption: Client CPU utilization

Client CPU utilization (%) by packet size in bytes. H = hardware, S = software JSSE functions.

H/S  Cipher         1    2048   20480   65536  131072  262144
S    RC4-128    5.779   7.179   9.899  21.558  35.111  55.431
S    3DES       6.663   9.992  33.867  65.554  83.455  91.587
S    AES-128    6.48    8.586  20.876  45.959  66.664  79.375
S    AES-256    6.752   9.023  23.362  50.276  70.6    84.682
H    RC4-128    4.52    5.095   9.512  21.223  36.108  56.424
H    3DES       9.751  10.339  15.186  26.841  42.708  62.774
H    AES-128    9.536  10.287  14.506  25.278  39.212  62.457
H    AES-256    9.826  10.373  14.715  26.094  39.974  61.917
Tables for Java 1.5 SR9 JSSE hardware encryption with and without the z90crypt device
driver polling thread
Table 6. Java V5.0 SR9: Normalized throughput with and without polling thread

Normalized throughput by packet size in bytes. Hardware with (W) or without (W/O) polling thread.

Polling  Cipher      2048   20480   65536  131072  262144
W/O      RC4-128      1.0     9.9    30.9    61.4    97.3
W/O      3DES         1.0     9.9    31.4    58.5    95.4
W/O      AES-128      1.0     9.9    30.7    59.8   101.2
W/O      AES-256      1.0     8.6    30.9    60.1   100.3
W        RC4-128      1.1    10.8    35.0    63.1   105.2
W        3DES         1.1     9.7    30.7    61.7    99.0
W        AES-128      1.1    10.8    30.8    59.2   101.9
W        AES-256      1.1    10.8    31.7    59.6   101.4
Table 7. Java V5.0 SR9: Server CPU utilization with and without polling thread

Server CPU utilization by packet size. Hardware with (W) or without (W/O) polling thread.

Polling  Cipher    1 byte    2 KB   20 KB   64 KB  128 KB  256 KB
W/O      RC4-128    3.99%   4.61%   9.70%  19.42%  35.53%  53.93%
W/O      3DES       9.18%   9.60%  14.72%  27.57%  41.95%  65.37%
W/O      AES-128    9.08%   9.55%  13.43%  24.84%  38.92%  60.99%
W/O      AES-256    9.07%   9.63%  11.90%  24.77%  39.57%  60.49%
W        RC4-128    7.17%   7.07%  15.06%  24.03%  37.91%  58.35%
W        3DES      13.50%  12.91%  15.96%  28.06%  46.45%  65.36%
W        AES-128   13.10%  14.19%  18.28%  26.69%  40.84%  61.72%
W        AES-256   11.27%  12.83%  17.94%  27.65%  41.79%  63.35%
Table 8. Java V5.0 SR9: Client CPU utilization with and without polling thread

Client CPU utilization by packet size. Hardware with (W) or without (W/O) polling thread.

Polling  Cipher    1 byte    2 KB   20 KB   64 KB  128 KB  256 KB
W/O      RC4-128    4.42%   5.10%   9.44%  19.90%  36.03%  54.75%
W/O      3DES       9.77%  10.26%  15.24%  28.34%  42.34%  63.13%
W/O      AES-128    9.73%  10.26%  14.54%  25.40%  39.50%  61.08%
W/O      AES-256    9.68%  10.38%  12.87%  25.06%  41.38%  61.42%
W        RC4-128    6.44%   6.27%  11.00%  23.20%  37.32%  58.11%
W        3DES      11.81%  12.45%  15.49%  27.72%  45.32%  66.96%
W        AES-128   10.79%  11.99%  16.98%  26.25%  40.95%  61.28%
W        AES-256   10.51%  12.91%  17.50%  26.40%  41.06%  62.60%
Tables for Java 1.5 SR9 JSSE SSL logon method comparison
Table 9. Java 1.5 SR9 JSSE SSL Logon method comparison: Normalized packet throughput

Normalized throughput (1 byte packet size). H = hardware, S = software JSSE functions.

H/S  Cipher    Client authentication  Cached SSL session  Uncached
S    RC4-128   1.0                    2.0                 1.1
S    3DES      1.1                    2.0                 1.1
S    AES-128   1.1                    2.0                 1.1
S    AES-256   1.0                    2.0                 1.1
H    RC4-128   0.0                    2.0                 1.0
H    3DES      0.0                    1.9                 1.0
H    AES-128   0.0                    2.0                 1.0
H    AES-256   0.0                    2.0                 1.0
Table 10. Java 1.5 SR9 JSSE SSL Logon method comparison: CPU utilization client and
server

CPU utilization (1 byte packet size). H = hardware, S = software JSSE functions.

               Client authentication  Cached SSL session  Uncached
H/S  Cipher    Server    Client       Server    Client    Server    Client
S    RC4-128   43.05%    49.99%        5.40%     5.94%    40.29%     5.78%
S    3DES      42.41%    48.83%        8.12%     9.22%    40.32%     6.66%
S    AES-128   44.05%    49.37%        6.48%     6.63%    41.68%     6.48%
S    AES-256   42.71%    48.62%        6.51%     6.53%    41.52%     6.75%
H    RC4-128   N/A       N/A           5.63%     6.94%     3.99%     4.52%
H    3DES      N/A       N/A          16.38%    16.13%     9.08%     9.75%
H    AES-128   N/A       N/A          15.85%    16.11%     8.92%     9.54%
H    AES-256   N/A       N/A          16.07%    16.37%     9.16%     9.83%
Bibliography
• http://www.ibm.com/developerworks/linux/linux390/perf/tuning_res_security_crypto.html
Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other
countries. Consult your local IBM representative for information on the products and services
currently available in your area. Any reference to an IBM product, program, or service is not
intended to state or imply that only that IBM product, program, or service may be used. Any
functionally equivalent product, program, or service that does not infringe any IBM
intellectual property right may be used instead. However, it is the user's responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in
this document. The furnishing of this document does not grant you any license to these
patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.
For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual
Property Department in your country or send inquiries, in writing, to:
IBM World Trade Asia Corporation
Licensing 2-31 Roppongi 3-chome, Minato-ku
Tokyo 106-0032, Japan
The following paragraph does not apply to the United Kingdom or any other country
where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS
MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT
WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not
allow disclaimer of express or implied warranties in certain transactions, therefore, this
statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are
periodically made to the information herein; these changes will be incorporated in new
editions of the publication. IBM may make improvements and/or changes in the product(s)
and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only
and do not in any manner serve as an endorsement of those Web sites. The materials at those
Web sites are not part of the materials for this IBM product and use of those Web sites is at
your own risk.
IBM may use or distribute any of the information you supply in any way it believes
appropriate without incurring any obligation to you.
Licensees of this program who wish to have information about it for the purpose of enabling:
(i) the exchange of information between independently created programs and other programs
(including this one) and (ii) the mutual use of the information which has been exchanged,
should contact:
IBM Corporation
Software Interoperability Coordinator, Department 49XA
3605 Highway 52 N
Rochester, MN 55901
U.S.A.
Such information may be available, subject to appropriate terms and conditions, including in
some cases, payment of a fee.
The licensed program described in this information and all licensed material available for it
are provided by IBM under terms of the IBM Customer Agreement, IBM International
Program License Agreement, or any equivalent agreement between us.
Any performance data contained herein was determined in a controlled environment.
Therefore, the results obtained in other operating environments may vary significantly. Some
measurements may have been made on development-level systems and there is no guarantee
that these measurements will be the same on generally available systems. Furthermore, some
measurements may have been estimated through extrapolation. Actual results may vary. Users
of this document should verify the applicable data for their specific environment.
Information concerning non-IBM products was obtained from the suppliers of those products,
their published announcements or other publicly available sources. IBM has not tested those
products and cannot confirm the accuracy of performance, compatibility or any other claims
related to non-IBM products. Questions on the capabilities of non-IBM products should be
addressed to the suppliers of those products.
All statements regarding IBM's future direction or intent are subject to change or withdrawal
without notice, and represent goals and objectives only.
All IBM prices shown are IBM's suggested retail prices, are current and are subject to change
without notice. Dealer prices may vary.
This information is for planning purposes only. The information herein is subject to change
before the products described become available.
This information contains examples of data and reports used in daily business operations. To
illustrate them as completely as possible, the examples include the names of individuals,
companies, brands, and products. All of these names are fictitious and any similarity to the
names and addresses used by an actual business enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate
programming techniques on various operating platforms. You may copy, modify, and
distribute these sample programs in any form without payment to IBM, for the purposes of
developing, using, marketing or distributing application programs conforming to the
application programming interface for the operating platform for which the sample programs
are written. These examples have not been thoroughly tested under all conditions. IBM,
therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.
Each copy or any portion of these sample programs or any derivative work, must include a
copyright notice as follows:
(C) (your company name) (year). Portions of this code are derived from IBM Corp. Sample
Programs. (C) Copyright IBM Corp. _enter the year or years_. All rights reserved.
If you are viewing this information in softcopy, the photographs and color illustrations may
not appear.
Edition notices
(C) Copyright International Business Machines Corporation 2009. All rights reserved.
U.S. Government Users Restricted Rights -- Use, duplication, or disclosure restricted by GSA
ADP Schedule Contract with IBM Corp.
Copyright IBM Corporation 2010
IBM Systems and Technology Group
Route 100
Somers, New York 10589
U.S.A.
Produced in the United States of America,
02/2010
All Rights Reserved
IBM, the IBM logo, ibm.com, DB2, DB2 Universal Database, DS8000, ECKD, Express, HiperSockets,
Resource Link, System z, System z9, System z10, WebSphere, z/VM, and z9 are trademarks or registered
trademarks of the International Business Machines Corporation.
Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks
of Adobe Systems Incorporated in the United States, and/or other countries.
Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other
countries, or both and is used under license therefrom.
InfiniBand and InfiniBand Trade Association are registered trademarks of the InfiniBand Trade Association.
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other
countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the
United States, other countries, or both.
Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel
SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its
subsidiaries in the United States and other countries.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
ITIL is a registered trademark, and a registered community trademark of the Office of Government
Commerce, and is registered in the U.S. Patent and Trademark Office.
IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency,
which is now part of the Office of Government Commerce.
Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using
standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience
will vary depending upon considerations such as the amount of multiprogramming in the user’s job stream,
the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can
be given that an individual user will achieve throughput improvements equivalent to the performance ratios
stated here.
ZSW03153-USEN-01