IBM Endpoint Manager for Software Use Analysis
Version 9

Scalability Guide
Version 3
This edition applies to versions 9.1 and 9.0 of IBM Endpoint Manager for Software Use Analysis (product number
5725-F57) and to all subsequent releases and modifications until otherwise indicated in new editions.
© Copyright IBM Corporation 2002, 2014.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.
Contents

Scalability Guidelines
   Introduction
   Scanning and uploading scan data
   Extract, Transform, Load (ETL)
   Decision flow
   Planning and installing Software Use Analysis
      Hardware requirements
      Network connection and storage throughput
   Dividing the infrastructure into scan groups
   Good practices for running scans and imports
      Plan the scanning schedule
      Avoid scanning when it is not needed
      Limit the number of computer properties that are to be gathered during scans
      Ensure that scans and imports are scheduled to run at night
      Run the initial import
      Review import logs
      Maintain frequent imports
      Disable collection of usage data
   Make room for end-of-scan-cycle activities
   Configuring the application and its database for medium and large environments
      Configuring the transaction logs size
      Configuring the transaction log location
      Increasing Java heap size
   Preventive actions
   Limiting the number of scanned signature extensions
   Recovering from accumulated scans
      Cleaning up high volume of Software Use Analysis scans uploaded since the last import
      Verifying the removal of scan data
   IBM PVU considerations
   REST API considerations
   Improving user interface performance
   Using relays to increase the performance of IBM Endpoint Manager
      Reducing the Endpoint Manager server load
Appendix. Executive summary
Notices
   Trademarks
   Privacy policy considerations
Scalability Guidelines
This guide is intended to help system administrators plan the infrastructure of
IBM® Endpoint Manager for Software Use Analysis and to provide
recommendations for configuring the Software Use Analysis server to achieve
optimal performance. It explains how to divide computers into scan groups,
schedule software scans, and run data imports. It also provides information about
other actions that you can take to avoid performance degradation.
Introduction
IBM Endpoint Manager clients report data to the Endpoint Manager server that
stores the data in its file system or database. The Software Use Analysis server
periodically connects to the Endpoint Manager server and its database, downloads
the stored data and processes it. The process of transferring data from the
Endpoint Manager server to the Software Use Analysis server is called Extract,
Transform, Load (ETL). By properly scheduling scans and distributing them over the
computers in your infrastructure, you can reduce the length of the ETL process and
improve its performance.
Scanning and uploading scan data
To evaluate whether particular software is installed on an endpoint, you must run
a scanner. It collects information about files with particular extensions, package
data, and software identification tags. It also gathers information about the running
processes to measure software usage. The software scan data must be transferred
to the Endpoint Manager server, from which it can later be imported to
Software Use Analysis.
To discover software that is installed on a particular endpoint and collect its usage,
you must first install a scanner by running the Install Scanner fixlet. After the
scanner is successfully installed, the Initiate Software Scan fixlet becomes relevant
on the target endpoint. The following types of scans are available:
Catalog-based scan
In this type of scan, the Software Use Analysis server creates scanner
catalogs that are sent to the endpoints. The scanner catalogs do not include
signatures that can be found based on the list of file extensions or entries
that are irrelevant for a particular operating system. Based on those
catalogs, the scanner discovers exact matches and sends its findings to the
Endpoint Manager server. This data is then transferred to the Software Use
Analysis server.
File system scan
In this type of scan, the scanner uses a list of file extensions to create a list
of all files with those extensions on an endpoint.
Package data scan
In this type of scan, the scanner searches the system registry (Windows) or
package management system (Linux, UNIX) to gather information about
packages that are installed on the endpoints. Then, it returns the findings
to the Endpoint Manager server where the discovered packages are
compared with the software catalog. If a particular package matches an
entry in the catalog, the software is discovered.
Application usage statistics
In this type of scan, the scanner gathers information about processes that
are running on the target endpoints.
Software identification tags scan
In this type of scan, the scanner searches for software identification tags
that are delivered with software products.
You should run the catalog-based, file system, package data, and software
identification tags scans on a regular basis as they are responsible for software
discovery. The application usage statistics scan gathers usage data and can be disabled
if you are not interested in this information.
When the status of the Initiate Software Scan fixlet shows complete (100%), it
indicates that the scan was successfully initiated. It does not mean that the relevant
data was already gathered. After the scan finishes, the Upload Software Scan
Results fixlet becomes relevant on the targeted endpoint. It means that the relevant
data was gathered from the endpoint. When you run this fixlet, the scan data is
uploaded to the Endpoint Manager server. It is then imported to Software Use
Analysis during the Extract, Transform, Load (ETL) process.
Extract, Transform, Load (ETL)
Extract, Transform, Load (ETL) is a database process that combines three
functions that transfer data from one database to another. The
first stage, Extract, involves reading and extracting data from various source
systems. The second stage, Transform, converts the data from its original form into
the form that meets the requirements of the target database. The last stage, Load,
saves the new data into the target database, thus finishing the process of
transferring the data.
In Software Use Analysis, the Extract stage involves extracting data from the
Endpoint Manager server. The data includes information about the infrastructure,
installed agents, and detected software. ETL also checks whether a new software
catalog is available, gathers information about the software scan and files that are
present on the endpoints, and collects data from VM managers.
The extracted data is then transformed to a single format that can be loaded to the
Software Use Analysis database. This stage also involves matching raw data with
the software catalog, calculating processor value units (PVUs), processing the
capacity scan, and converting information that is contained in the XML files. After
the data is extracted and transformed, it is loaded into the database and can be
used by Software Use Analysis.
[Figure: Extract, Transform, and Load. Endpoint Manager clients on Windows,
Linux, and UNIX send XML scan data (catalog-based, file system, capacity,
package, and software identification tag scans) and usage data through a relay to
the Endpoint Manager server; clients on Linux on System z send the same data,
and VM manager data comes from Windows and Linux x86/x64 clients only.
1. Extract: infrastructure information, installed agents, scan data files, software
use data files, capacity data files, package data files, and files with VM manager
information are pulled from the Endpoint Manager file system (raw scan files)
and database over a high-speed network connection. 2. Transform: information
from the XML files is processed, data is transformed to a single format, raw data
is matched with the software catalog, PVU and RVU values are calculated, and
the capacity scan is processed. 3. Load: data is loaded into the Software Use
Analysis database tables, where the core business logic and the web user
interface serve it to client computers that run the console or a browser.]
The heaviest load on the Software Use Analysis server occurs during ETL when the
following actions are performed:
v A large number of small files is retrieved from the Endpoint Manager server
(Extract).
v Many small and medium files that contain information about installed software
packages and process usage data are parsed (Transform).
v The database is populated with the parsed data (Load).
At the same time, Software Use Analysis prunes large volumes of old data that
exceeds its data retention period.
The performance of the ETL process depends on the number of scan files, usage
analyses, and package analyses that are processed during a single import. The
main bottleneck is storage performance because many small files must be read,
processed, and written to the Software Use Analysis database in a short time. By
properly scheduling scans and distributing them over the computers in your
infrastructure, you can reduce the length of the ETL process and improve its
performance.
Decision flow
To avoid running into performance issues, you should divide the computers in
your infrastructure into scan groups and properly set the scan schedule. You
should start by creating a benchmark scan group on which you can try different
configurations to achieve optimal import time. After the import time is satisfactory
for the benchmark group, you can divide the rest of your infrastructure into
analogous scan groups.
Start by creating a single scan group that will be your benchmark - when you are
satisfied with the performance that you achieve for this group, you will create
other scan groups based on it. The size of the scan group might vary depending
on the size of your infrastructure; however, avoid creating a group larger than
20 000 endpoints.
Scan the computers in this scan group. When the scan finishes, upload its results to
the Endpoint Manager server and run an import. Check the import time and
decide whether it is satisfactory. For information about running imports, see
section “Good practices for running scans and imports” on page 8.
If you are not satisfied with the import time, check the import log and try
undertaking one of the following actions:
v If you see that the duration of the import of raw file system scan data or
package data takes longer than one third of the ETL time and the volume of the
data is large (a few million entries), create a smaller group. For additional
information, see section “Dividing the infrastructure into scan groups” on page
7.
v If you see that the duration of the import of raw file system scan data or
package data takes longer than one third of the ETL time but the volume of the
data is low, fine-tune the hardware. For information about processor and RAM
requirements as well as network latency and storage throughput, see section
“Planning and installing Software Use Analysis” on page 6.
v If you see that processing of usage data takes an excessive amount of time and
you are not interested in collecting usage data, disable gathering of usage data.
For more information, see section “Disable collection of usage data” on page 10.
After you adjust the first scan group, run the software scan again, upload its
results to the Endpoint Manager server and run an import.
When you achieve an import time that is satisfactory, decide whether you want to
have a shorter scan cycle. For example, if you have an environment that consists of
42 000 endpoints and you created a scan group of 6000 endpoints, your scan cycle
will last seven days (assuming that you create seven equal groups). To
shorten the scan cycle, you can try increasing the number of computers in a scan
group, for example, to 7000. This shortens the scan cycle to six days. After you
increase the scan group size, observe the import time to ensure
that its performance remains on an acceptable level.
When you are satisfied with the performance of the benchmark scan group, create
the remaining groups. Schedule scans so that they fit into your preferred scan
cycle. Then, schedule imports of data from Endpoint Manager. Observe the
import time. If it is not satisfactory, adjust the configuration as you did in the
benchmark scan group. When you achieve suitable performance, plan for
end-of-cycle activities.
Use the following diagram to get an overview of actions and decisions that you
will have to undertake to achieve optimal performance of Software Use Analysis.
[Figure: Decision flow. Installation: plan and install Software Use Analysis.
Configuration: create a scan group (up to 20 000 computers), initiate the scan and
upload scan results, then run an import and check its time. If the import time is
not satisfactory, create a smaller scan group, fine-tune hardware (if possible), or
disable gathering of usage data (if you do not need it), and run the import again.
When the import time is satisfactory, decide whether you want a shorter scan
cycle; if so, increase the size of the scan group and check the import time again.
Otherwise, create the remaining scan groups, schedule the scans to fit into the
scan cycle, and schedule daily imports. If the import time is no longer
satisfactory, apply the same corrective actions. When it is, plan for end-of-cycle
activities.]
Planning and installing Software Use Analysis
Your deployment architecture depends on the number of endpoints that you want
to have in your audit reports.
For information about the Endpoint Manager requirements, see Server
requirements available in Software Use Analysis documentation.
Hardware requirements
If you already have the Endpoint Manager server in your environment, plan the
infrastructure for the Software Use Analysis server. The Software Use Analysis server
stores its data in a dedicated DB2® database.
The following table applies to environments with the following
configuration parameters: a weekly software scan, daily imports, and 60
applications that are installed on an endpoint (on average).
Table 1. Processor and RAM requirements for Software Use Analysis

Small environment (up to 5 000 endpoints)
   Topology: 1 server that hosts IBM Endpoint Manager, Software Use Analysis,
   and DB2
   Processor: at least 2.5 GHz, 4 cores
   Memory: 8 GB

Medium environment (5 000 - 50 000 endpoints*)
   Topology: 2 or 3 servers
   IBM Endpoint Manager: 2-3 GHz, 4 cores; 16 GB
   Software Use Analysis and DB2: at least 2 GHz, 4 cores; 24 GB
   A distributed environment is advisable. If you separate DB2 from Software
   Use Analysis, the DB2 server should have at least 16 GB RAM.

Large environment (more than 50 000 endpoints**)
   Topology: 3 servers
   IBM Endpoint Manager: 2-3 GHz, 4-16 cores; 16-32 GB
   Software Use Analysis: at least 2 GHz, 8 cores; 16 GB
   DB2: at least 2 GHz, 16 cores; 64 GB

* For environments with up to 35 000 endpoints, there is no requirement to create
scan groups. If you have more than 35 000 endpoints in your infrastructure, you
must create scan groups. For more information, see section “Dividing the
infrastructure into scan groups” on page 7.
** For larger environments, scan groups are required.
Medium-size environments
You can use virtual environments for this deployment size, but it is
advisable to have dedicated resources for processor, memory, and virtual
disk allocation. The virtual disk that is allocated for the virtual machine
should have dedicated RAID storage, with dedicated input-output
bandwidth for that virtual machine.
Large environments
For large deployments, use dedicated hardware. For optimum
performance, use a DB2 server that is dedicated to Software Use Analysis
and is not shared with Endpoint Manager or other applications.
Additionally, you might want to designate a separate disk that is attached
to the computer where DB2 is installed to store the database transaction
logs. You might need to do some fine-tuning based on the
recommendations described.
[Figure: Planning the installation. Plan and prepare for installation, then choose
the topology that matches the size of your environment. Small environment (up
to 5 000 endpoints): install or reuse IBM Endpoint Manager, and install Software
Use Analysis and DB2 on one computer. Medium environment (5 000 - 50 000
endpoints): install or reuse IBM Endpoint Manager, and install Software Use
Analysis and DB2 on one computer. Large environment (50 000 - 250 000
endpoints): install or reuse IBM Endpoint Manager, and install Software Use
Analysis and DB2 on two computers; a separate disk or storage might be
necessary for the DB2 server.]
Network connection and storage throughput
The Extract, Transform, Load (ETL) process extracts a large amount of scan data
from the Endpoint Manager server, processes it on the Software Use Analysis
server, and saves it in the DB2 database. The following two factors affect the time
of the import to the Software Use Analysis server:
Gigabit network connection
Because of the nature of the ETL imports, you are advised to have at least
a gigabit network connection between the Endpoint Manager, Software Use
Analysis, and DB2 servers.
Disk storage throughput
For large deployments, you are advised to have dedicated storage,
especially for the DB2 server. The expected disk speed for writing data is
approximately 400 MB/second.
Dividing the infrastructure into scan groups
It is critical for Software Use Analysis performance that you properly divide your
environment into scan groups and then schedule scans in those scan groups
accurately. If the configuration is not well-balanced, you might experience long
import times.
For environments larger than 35 000 endpoints, divide your endpoints into
separate scan groups. The system administrator can then set a different scanning
schedule for every scan group in your environment.
Example
If you have 60 000 endpoints, you can create six scan groups (every group
containing 10 000 endpoints). The first scan group has the scanning schedule set to
Monday, the second to Tuesday, and so on. Using this configuration, every
endpoint is scanned once a week. At the same time, the Endpoint Manager server
receives data only from 1/6 of your environment daily and for every daily import
the Software Use Analysis server needs to process data only from 10 000 endpoints
(instead of 60 000 endpoints). This environment configuration shortens the
Software Use Analysis import time.
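
As a rough sketch of this arithmetic (illustrative only; the group size and the
day assignment are assumptions for planning, not product behavior):

import math

def plan_scan_groups(total_endpoints, group_size):
    # One scan group per day; the scan cycle length equals the group count
    groups = math.ceil(total_endpoints / group_size)
    days = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
    return [(g + 1, days[g % 7]) for g in range(groups)]

# 60 000 endpoints in groups of 10 000: six groups, scanned Monday to Saturday
for group, day in plan_scan_groups(60000, 10000):
    print("Scan group %d: %s" % (group, day))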
The image below presents a scan schedule for an infrastructure that is divided into
six scan groups. You might achieve such a schedule after you implement
recommendations that are contained in this guide. The assumption is that both
software scans and imports of scan data to Software Use Analysis are scheduled to
take place at night, while uploads of scan data from the endpoints to the Endpoint
Manager server occur during the day.
If you have a powerful server computer and longer import time is not problematic,
you can create fewer scan groups with a greater number of endpoints in the
Endpoint Manager console. Remember to monitor the import log to analyze the
amount of data that is processed and the time it takes to process it.
For information about how to create scan groups, see the topic Computer groups that is
available in Endpoint Manager documentation.
Good practices for running scans and imports
After you enable the Software Use Analysis site in your Endpoint Manager
console, you should carefully plan the scanning activities and their schedule for
your deployment.
Plan the scanning schedule
After you find the optimal size of the scan group, set the scanning schedule. It is
the frequency of software scan on an endpoint. The most common scanning
schedule is weekly so that every endpoint is scanned once a week. If your
environment has more than 100 000 endpoints, consider performing scans less
frequently, for example monthly.
Avoid scanning when it is not needed
The frequency of scans depends both on how often software products change on
the endpoints in your environment and also on your reporting needs. If you have
systems in your environment that have dynamically-changing software, you can
group such systems into a scan group (or groups) and set more frequent scans, for
example once a week. The remaining scan groups that contain computers with a
more stable set of software can be scanned less frequently, for example once a
month.
Limit the number of computer properties that are to be gathered
during scans
By default, the Software Use Analysis server includes four primary computer
properties from the Endpoint Manager server that is configured as the data source:
Computer Name, DNS Name, IP address, and Operating System. Imports can be
substantially longer if you specify more properties to be extracted from the
Endpoint Manager database and copied into the Software Use Analysis database
during each data import. As a good practice, limit the number of computer
properties to 10 (or fewer).
Ensure that scans and imports are scheduled to run at night
Some actions in the Software Use Analysis user interface cannot be processed
when an import is running. Thus, try to schedule imports when the application
administrator and Software Asset Manager are not using Software Use Analysis or
after they have finished their daily work.
Run the initial import
It is a good practice to run the first (initial) import before you schedule any
software scans and activate any analyses.
Examples of when imports can be run:
v The first import uploads the software catalog from the installation directory to
the application and extracts the basic data about the endpoints from the
Endpoint Manager server.
v The second import can be run after the scan data from the first scan group is
available in the Endpoint Manager server.
v The third import should be started after the scans from the third scan group are
finished, and so on.
Review import logs
Review the following INFO messages in the import log to check how much data
was transferred during an ETL.
1. Infrastructure
Computer items:
The total number of computers in your environment. A computer is a system
with an Endpoint Manager agent that provides data to Software Use Analysis.
2. Software and hardware
SAM::ScanFile items
The number of files that have input data for the following items:
v File system scan information (SAM::FileFact items)
v Catalog-based scan information (SAM::CitFact items)
v Software identification tag scan information (SAM::IsotagFact items)
SAM::FileFact items
The total count of information pieces about files from all computers in your
environment (contained in the processed scan files).
SAM::CitFact items
The total count of information pieces from catalog-based scans (contained in
the processed scan files).
SAM::IsotagFact items
The total count of information pieces from software identification tag scans
(contained in the processed scan files).
3. Installed packages
SAM::PackageFact items
The total count of information pieces about Windows packages that have been
gathered by the package data scan.
SAM::UnixPackageFact items
The total count of information pieces about UNIX packages that have been
gathered by the package data scan.
4. Software usage
SAM::AppUsagePropertyValue items
The total number of processes that were captured during scans on the systems
in your infrastructure.
Example:
INFO: Computer items: 15000
INFO: SAM::AppUsagePropertyValue items: 4250
INFO: SAM::ScanFile items: 30000
INFO: SAM::FileFact items: 15735838
INFO: SAM::IsotagFact items: 0
INFO: SAM::CitFact items: 149496
INFO: SAM::PackageFact items: 406687
INFO: SAM::UnixPackageFact items: 1922564
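
If you want to track these counters across imports, a small script can pull them
out of the log. A minimal sketch (the log file name is a placeholder; point it at
your import log):

import re

pattern = re.compile(r"INFO:\s*(Computer items|SAM::\w+ items):\s*(\d+)")
counts = {}
with open("import.log") as log:   # placeholder file name
    for line in log:
        match = pattern.search(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
# For the example above, counts["SAM::FileFact items"] == 15735838
print(counts)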
Maintain frequent imports
After the installation, imports are scheduled to run once a day. Do not change this
configuration. However, you might want to change the hour when the import
starts. If your import takes longer than 24 hours, you can:
v Improve the scan groups configuration.
v Preserve the current daily import configuration because Software Use Analysis
handles overlapping imports gracefully. If an import is running, no other import
is started.
Disable collection of usage data
Software usage data is gathered by the Application Usage Statistics analysis. If the
analysis is activated, usage data is gathered from all endpoints in your
infrastructure. However, the data is uploaded to the Endpoint Manager server only
for the endpoints on which you run software scans. For the remaining endpoints,
the data is stored on the endpoint until you run the software scan.
About this task
If you do not need usage data or the deployment phase is not finished, do not
activate the analysis. It can be activated later on, if needed. If the analysis is
already activated, but you decide that processing of usage data takes too much
time or you are not interested in usage statistics, disable the analysis.
Procedure
1. Log in to the Endpoint Manager console.
2. In the navigation tree, open the IBM Endpoint Manager for Software Use
Analysis v9 > Analyses.
3. In the upper-right pane, right-click Application Usage Statistics, and click
Deactivate.
Make room for end-of-scan-cycle activities
Plan to have an import from SmartCloud Control Desk through IBM Tivoli®
Integration Composer at the end of a 1- or 2-week cycle. Include in your
end-of-scan-cycle activities the catalog update and the time for extracting Software
Use Analysis compliance reports.
Configuring the application and its database for medium and large
environments
To avoid performance issues in medium and large environments, configure the
location of the transaction log and adjust the log size. Apart from that, you can
also adjust the Java™ heap size.
Configuring the transaction logs size
If your environment consists of many endpoints, increase the transaction logs size
to improve performance.
About this task
The transaction logs size can be configured through the LOGFILSIZ DB2 parameter
that defines the size of a single log file. To calculate the value that can be used for
this parameter, you must first calculate the total disk space that is required for
transaction logs in your specific environment and then divide it, thus obtaining the
size of one transaction log. The required amount of disk space depends on the
number of endpoints in your environment and the number of endpoints for which
new scan results are available and processed during the data import.
Procedure
1. Use the following formula to calculate the disk space for your transaction logs:
<The number of endpoints> x 1 MB + <the number of endpoints
for which new scan results are imported> x 1 MB + 1 GB
2. Divide the result by 0.00054 to obtain the size of a single transaction log file.
3. Run the following command to update the transaction log size in your
database. Substitute value with the size of a single transaction log.
UPDATE DATABASE CONFIGURATION FOR SUADB USING LOGFILSIZ value
Example
v Calculating the single transaction log size for 100 000 endpoints and 15 000 scan
results:
100 000 x 1 MB + 15 000 x 1 MB + 1 GB = 114 GB
114 / 0.00054 = 211111
UPDATE DATABASE CONFIGURATION FOR SUADB USING LOGFILSIZ 211111
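
If you prefer to script the calculation, a minimal sketch that mirrors the formula
and the division from the procedure (the counts are the example values above):

endpoints = 100000          # endpoints in the environment
new_scan_results = 15000    # endpoints with new scan results per import
# Step 1: 1 MB per endpoint plus 1 MB per new scan result plus 1 GB, in GB
space_gb = (endpoints + new_scan_results) / 1000.0 + 1
# Step 2: divide by 0.00054 to get the size of a single transaction log file
logfilsiz = round(space_gb / 0.00054)
# The guide's example rounds the space to 114 GB, which yields LOGFILSIZ 211111
print(space_gb, logfilsiz)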
Configuring the transaction log location
To increase database performance, move the DB2 transaction log to a file system
that is separate from the DB2 file system.
About this task
Medium environments: strongly advised.
Large environments: required.
Procedure
To move the DB2 transaction log to a file system that is separate from the DB2 file
system, update the DB2 NEWLOGPATH parameter for your Software Use Analysis
database:
UPDATE DATABASE CONFIGURATION
FOR SUADB USING NEWLOGPATH value
Where value is a directory on a separate disk (different from the disk where the
DB2 database is installed) where you want to keep the transaction logs. This
configuration is strongly advised.
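
For example, to keep the logs in a directory on a separate disk (the path below
is illustrative; use any directory that is not on the DB2 database disk):
UPDATE DATABASE CONFIGURATION FOR SUADB USING NEWLOGPATH /db2logs/sua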
Increasing Java heap size
The default settings for the Java heap size might not be sufficient for medium and
large environments. If your environment consists of more than 5000 endpoints,
increase the memory available to Java client processes by increasing the Java heap
size.
Procedure
1. Go to the <INSTALL_DIR>/wlp/usr/servers/server1/ directory and edit the
jvm.options file.
2. Set the maximum Java heap size (Xmx) to one of the following values,
depending on the size of your environment:
v For medium environments (5000 - 50 000 endpoints), set the heap size to
6144m.
v For large environments (over 50 000 endpoints), set the heap size to 8192m.
3. Restart the Software Use Analysis server.
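
For reference, after step 2 the relevant entry in the jvm.options file for a medium
environment is a single line (leave any other options in the file unchanged):
-Xmx6144m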
Preventive actions
Turn off scans if the Software Use Analysis server is to be unavailable for a few
days due to routine maintenance or scheduled backups.
If imports of data from Endpoint Manager to Software Use Analysis are not
running, the unprocessed scan data is accumulated on the Endpoint Manager
server. After you turn on the Software Use Analysis server, a large amount of data
will be processed leading to a long import time. To avoid prolonged imports, turn
off scans for the period when the Software Use Analysis server is not running.
Limiting the number of scanned signature extensions
The scanner scans the entire infrastructure for files with particular extensions. For
some extensions, the discovered files are matched against the software catalog
before the scan results are uploaded to the Endpoint Manager server. It ensures
that only information about files that produce matches is uploaded.
For other extensions, the scan results are not matched against the software catalog
on the side of the endpoint. They are all uploaded to the Endpoint Manager server.
Thus, you avoid rescanning the entire infrastructure when you import a new
catalog or add a custom signature. The new catalog is matched against the
information that is available on the server. However, this behavior might cause
large amounts of information about files that do not produce matches to be
uploaded to the server, which in turn might lead to performance issues during
the import.
To reduce the amount of information that is uploaded to the server, limit the list of
file extensions that are not matched against the software catalog on the side of
endpoint.
Procedure
1. Stop the Software Use Analysis server by running one of the following
commands, depending on the service name in your installation:
/etc/init.d/wlpserver stop
/etc/init.d/SUAserver stop
2. To limit the number of extensions that are not matched against the software
catalog on the side of the endpoint, edit the following files. They are in the
<SUA_install_dir>\wlp\usr\servers\server1\apps\tema.war\WEB-INF\domains\
sam\config directory.
v In the file_names_all.txt file, leave the following extension:
\.ear$
v In the file_names_unix.txt file, leave the following extensions:
\.sh$
\.bin$
\.pl$
\.ear$
\.SH$
\.BIN$
\.PL$
\.EAR$
v In the file_names_windows.txt file, leave the following extensions:
\.exe$
\.sys$
\.com$
\.ear$
\.ocx$
Note: Do not remove file extensions that you used to create custom signatures.
They are likely to produce matches with the software catalog, so they can be
uploaded to the Endpoint Manager server.
3. Start the Software Use Analysis server by running one of the following
commands, depending on the service name in your installation:
/etc/init.d/wlpserver start
/etc/init.d/SUAserver start
4. Upload the software catalog. If a new version of the catalog is available, upload
the new version. If it is not available yet, reupload the catalog that is currently
imported to Software Use Analysis.
v If you are using Software Knowledge Base Toolkit for catalog management
and upload a new version of the catalog, see: Updating the software catalog
in Software Knowledge Base Toolkit.
v If you are using Software Knowledge Base Toolkit but reupload the same
software catalog, perform the following steps:
a. Log in to Software Use Analysis.
b. In the top navigation bar, click Management > Catalog Servers.
c. To automatically import the software catalog from Software Knowledge
Base Toolkit, click Update Catalog.
v If you are using the built-in catalog management functionality that is
available in Software Use Analysis, perform the following steps:
a. Optional: Download the software catalog by using the Software Catalog
Download fixlet. Download the compressed file that contains only the
software catalog in the XML format.
b. In the navigation bar, click Management > Catalog Upload.
c. Click Browse and select the
IBMSoftwareCatalog_canonical_2.0_form_date.zip file.
d. To upload the file, click Upload.
What to do next
After you modify the list of extensions that are uploaded directly to the server,
wait for the scheduled import or run it manually. During the import, the changes
are propagated to the specified endpoints. After the next scheduled scan and
import, the changed list of file extensions is used. More file extensions are matched
against the software catalog on the side of the endpoint and the import time is
shorter.
Recovering from accumulated scans
To recover from a situation when you have a large amount of accumulated scan
data, you need to clean up the Software Use Analysis scans and then verify if the
data was removed correctly.
Cleaning up high volume of Software Use Analysis scans
uploaded since the last import
To clean up accumulated scan data, use an SQL query that removes scan file
entries from the BFEnterprise tables.
Procedure
1. Back up the BFEnterprise and tem_analytics databases.
2. Back up and remove all files from the UploadManager/sha1 directory on the
Endpoint Manager server.
3. To delete Software Use Analysis scan file entries from the BFEnterprise tables, run
the following SQL statement against the BFEnterprise database:
use BFEnterprise
delete from dbo.uploads where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
delete from dbo.uploads_availability where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
4. Run a Software Use Analysis import.
What to do next
You can either:
v Restore the Software Use Analysis scans in the sha1 directory in small chunks
and run imports (that is, restore 10 000 scans, run an import, restore the next
10 000 scans, run another import, and so on; Software Use Analysis imports
the scans incrementally). While you restore the scans in the sha1 directory, the
Endpoint Manager FillDB process monitors changes in the UploadManager\sha1
directory and updates the dbo.uploads and dbo.uploads_availability tables.
v Perform the following steps:
1. Rescan computers.
2. Upload new scan files.
3. Run an import.
Repeat the procedure for every 10 000 computers.
Tip: Avoid importing more than 10 000 or 20 000 scans within one Software Use
Analysis import. Such an import takes 10 - 15 hours even on systems that meet
the Software Use Analysis hardware requirements.
Verifying the removal of scan data
After you remove the scan data, verify whether the removal was successful.
Procedure
1. Stop all fixlets that scan and upload data.
2. Back up the Endpoint Manager databases:
SQL Server
Back up the BFEnterprise and BESReporting databases. In SQL Server
Management Studio, right-click the database and select Tasks > Back Up....
DB2
Back up the BFENT and BESREPOR databases. If you get a SQL1015N The
database is in an inconsistent state. SQLSTATE = 55025 error when
attempting to back up the DB2 database, check whether the database is in an
inconsistent state by running the following command:
db2 get db cfg for database_name
Note: Verify whether the parameter All committed transactions
have been written to disk is set to NO.
If the database is in an inconsistent state, run the following commands:
a. db2 restart database database_name
b. db2 force applications all
c. db2stop
d. db2start
3. Move all files in the UploadManagerData/BufferDir/sha1 directory on the
Endpoint Manager server to a backup location.
v Linux: install_dir/UploadManagerData/BufferDir/sha1 (for example,
install_dir = /var/opt/BESServer)
v Windows: install_dir\UploadManagerData\BufferDir\sha1 (for example,
install_dir = C:\Program Files (x86)\BigFix Enterprise\BESServer)
4. Delete scan file entries from the BFEnterprise tables:
SQL Server
use BFEnterprise
delete from dbo.uploads where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
delete from dbo.uploads_availability where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
DB2
connect to BFENT
delete from dbo.uploads where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
delete from dbo.uploads_availability where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
5. Run a Software Use Analysis import.
6. Move a subset of the sha1 folders from the backup location from step 3 back
to the original sha1 folder.
v The sha1 directory contains folder names that match the last 2 digits of the
endpoint (computer) ID. The folder names can be used to identify a specific
endpoint or endpoints from a specific scan group.
v The Presentation Debugger can be used to run session relevance queries to
retrieve computer information. The following sample queries show how to
retrieve the information for computers that are associated with sha1 folders:
– (id of it, hostname of it, name of it, ip addresses of it, operating system
of it) of bes computers whose (id of it mod 100 =
name_of_sha1_subdirectory_moved)
When name_of_sha1_subdirectory_moved is replaced with 14, the query returns:
7423114, nc9048149178.tivlab.austin.ibm.com, nc9048149178, 9.48.149.178,
Linux Red Hat Enterprise Server 6.4 (2.6.32-358.el6.x86_64)
– (id of it, hostname of it, name of it, ip addresses of it, operating system
of it) of bes computers whose ((id of it mod 100 =
name_of_sha1_subdirectory_moved) or (id of it mod 100 =
name_of_sha1_subdirectory_moved))
When the two occurrences of name_of_sha1_subdirectory_moved are replaced
with 14 and 5, the query returns:
7423114, nc9048149178.tivlab.austin.ibm.com, nc9048149178, 9.48.149.178,
Linux Red Hat Enterprise Server 6.4 (2.6.32-358.el6.x86_64)
16389405, nc038067.tivlab.raleigh.ibm.com, NC038067, 9.42.38.67, Win2003 5.2.3790
7. Verify that new rows are created in the BFEnterprise tables after copying the
sha1 folders back to the original location.
SQL Server:
use BFEnterprise
select * from dbo.uploads where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
select * from dbo.uploads_availability where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
DB2:
connect to BFENT
select * from dbo.uploads where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
select * from dbo.uploads_availability where (FileName LIKE '%itsitsearch%' OR FileName LIKE '%citsearch%')
8. Run a Software Use Analysis import to load endpoint data into Software Use
Analysis.
9. Repeat steps 6 to 8 until all of the original sha1 folders have been imported
into Software Use Analysis.
10. Restart all fixlets that were stopped in step 1.
IBM PVU considerations
If you need to generate PVU reports for IBM compliance purposes, the best
practice is to generate the report at least monthly.
For organizations that span continents, from an IBM compliance perspective, the
License Metric Tool Regions must be applied, which requires separate Software Use
Analysis deployments. For more information, see Virtualization Capacity License
Counting Rules.
REST API considerations
You can use REST API for software licensing information to retrieve large amounts
of data that is related to computer systems, software instances, and license usage
in your environment. Such information can then be passed to other applications for
further processing and analysis.
Although using single API requests to retrieve data only from a selected subset of
computers does not greatly impact the performance of Software Use Analysis, this
is not true when retrieving data in bulk for all your computer systems at the same
time. Such an action requires the processing of large amounts of data and it always
influences the application performance.
In general, the API requests should not be used together with other performance
intensive tasks, such as software scans or data imports. Each user who is logged
in to the application, and each action that is performed in the web user interface
during REST API calls, further decreases performance.
Important: Each time you want to retrieve data through REST API, ensure that
the use of Software Use Analysis is at a moderate level so that the extra workload
that results from REST API calls does not overload the application and create
performance problems.
When you retrieve data in bulk, you can also make several API requests and use
the limit and offset parameters to paginate your results instead of retrieving all the
data at the same time:
v Use the limit parameter to specify the number of retrieved results:
https://hostname:port/api/sam/computer_systems?token=token&limit=100000
v If you limit the first request to 100 000 results, append the next request with the
offset=100000 parameter to omit the records that you already retrieved:
https://hostname:port/api/sam/computer_systems?token=token&limit=100000&offset=100000
Note: The limit and offset parameters can be omitted if you are retrieving data
from up to about 50 endpoints. For environments with approximately 200 000
endpoints, you are advised to retrieve data in pages of 100 000 rows for computer
systems, 200 000 rows for software instances, and 300 000 rows for license usage.
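
A minimal sketch of paginated retrieval, assuming only the endpoint and
parameters shown above (hostname, port, and token are placeholders, and the
sketch assumes the endpoint returns a JSON array; written in Python with the
requests library):

import requests

BASE_URL = "https://hostname:port/api/sam/computer_systems"
TOKEN = "token"   # placeholder: use your API token
PAGE = 100000     # advised page size for computer systems

def fetch_all_computer_systems():
    rows, offset = [], 0
    while True:
        # Request one page; limit and offset mirror the documented parameters
        response = requests.get(
            BASE_URL,
            params={"token": TOKEN, "limit": PAGE, "offset": offset},
            verify=False,  # self-signed certificates are common; adjust as needed
        )
        response.raise_for_status()
        page = response.json()
        rows.extend(page)
        if len(page) < PAGE:   # a short page means the last one
            break
        offset += PAGE
    return rows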
Improving user interface performance
If you experience problems with the performance of the user interface while
working with reports, increase the number of rows that are loaded into the user
interface. Adjust the number of rows to the size of your environment.
About this task
When you open a report, 50 rows of data are loaded to the user interface by
default. When you scroll past those 50 rows, next 50 rows must be loaded. To
improve the response time of the user interface, you can increase the number of
rows that are loaded to the user interface.
Procedure
1. Stop the Software Use Analysis server.
2. On the computer where the Software Use Analysis server is installed, go to the
sua_installation_dir\TEMA\work\tema\webapp\javascripts\report_components
directory and open the grid.js file.
3. Find the following lines:
$.widget("bigfix.grid",
{options:
{pageSize: 50,
gridOptions: {}
}
4. Increase the value of the pageSize parameter according to the size of your
environment.
Table 2. Number of rows loaded into the Software Use Analysis user interface
Environment size                 Value of the pageSize parameter
10 000 - 15 000 computers        800
15 000 - 30 000 computers        1600
over 30 000 computers            2500
5. Start the Software Use Analysis server.
6. Clear the cache in the web browser.
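
For example, for a 15 000 - 30 000 computer environment, the fragment from
step 3 would read as follows after the change:
$.widget("bigfix.grid",
{options:
{pageSize: 1600,
gridOptions: {}
}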
Using relays to increase the performance of IBM Endpoint Manager
To take advantage of the speed and scalability that is offered by IBM Endpoint
Manager, it is often necessary to tune the settings of the Endpoint Manager
deployment.
A relay is a client that is enhanced with a relay service. It performs all client
actions to protect the host computer, and in addition, delivers content and software
downloads to child clients and relays. Instead of requiring every networked
computer to directly access the server, relays can be used to offload much of the
burden. Hundreds of clients can point to a relay for downloads, which in turn
makes only a single request to the server. Relays can connect to other relays as
well, further increasing efficiency.
Reducing the Endpoint Manager server load
For all but the smallest Endpoint Manager deployments (< 500 Endpoint Manager
clients), a primary Endpoint Manager relay should be set for each Endpoint
Manager client even if they are not in a remote location.
The reason for this is that the Endpoint Manager server performs many tasks
including:
v Gathering new Fixlet content from the Internet
v Distributing new Fixlet content to the clients
v Accepting and processing reports from the Endpoint Manager clients
v Providing data for the Endpoint Manager consoles
v Sending downloaded files (which can be large) to the Endpoint Manager client,
and much more.
By using Endpoint Manager relays, the burden of communicating directly with
every client is effectively moved to a different computer (the Endpoint Manager
relay computer), which frees the Endpoint Manager server to do other tasks. If the
relays are not used, you might observe that performance degrades significantly
when an action with a download is sent to the Endpoint Manager server (you
might even see errors).
Setting up Endpoint Manager relays in appropriate places and correctly
configuring clients to use them is the single change with the highest impact on
performance. To configure relays, you can:
v Allow the clients to auto-select their closest Endpoint Manager relay.
v Manually configure the Endpoint Manager clients to use a specific relay.
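
As an illustration of the first option: automatic relay selection is controlled by a
standard Endpoint Manager client setting. A sketch of an action script line that
enables it (an assumption based on common practice, not taken from this guide):
setting "_BESClient_RelaySelect_Automatic"="1" on "{now}" for client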
For more information, see Managing relays.
Appendix. Executive summary
Table 3. Summary of the scalability best practices

Step 1. Environment planning
Review the summary information that matches your environment size:
Small
v Up to 5 000 endpoints
v Software Use Analysis and DB2 installed on the same server
v Scan groups (optional)
Medium
v 5 000 - 50 000 endpoints
v Scan groups (advisable)
v Software Use Analysis and DB2 installed on separate computers
v It is possible to use virtual environments for this deployment size; however, it
is advisable to have dedicated resources for processor, memory, and virtual
disk allocation.
Large
v 50 000 - 250 000 endpoints
v Scan groups (required)
v Software Use Analysis and DB2 installed on separate computers, with
dedicated storage for DB2
v Fine-tuning might be required

Step 2. Good practices for creating scan groups
v Plan the scan group size
v Create a benchmark scan group
v Check the import time and decide whether it is satisfactory
v When you achieve an import time that is satisfactory, decide whether you want
to have a shorter scan cycle
v When you are satisfied with the performance of the benchmark scan group,
create the remaining groups

Step 3. Good practices for running scans, imports, and uploads
v Run the initial import before scanning the environment
v Plan the scanning schedule
v Avoid scanning when it is not needed
v Limit the number of computer properties to the ones that are relevant for
software inventory management
v Ensure that scans and imports are scheduled to run at night
v Disable gathering of usage data in the initial rollout phase
v Carefully plan for gathering of usage data in large environments; testing is
required
v Configure regular imports and maintain them; daily imports are advisable
v Review import logs
v Configure the upload schedule (daily)

Step 4. End-of-cycle activities
v Regularly import a new software catalog, for example monthly
v Periodically import data from SmartCloud Control Desk through IBM Tivoli
Integration Composer, for example at the end of the 1- or 2-week cycle
v Generate an audit snapshot
Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in
other countries. Consult your local IBM representative for information on the
products and services currently available in your area. Any reference to an IBM
product, program, or service is not intended to state or imply that only that IBM
product, program, or service may be used. Any functionally equivalent product,
program, or service that does not infringe any IBM intellectual property right may
be used instead. However, it is the user's responsibility to evaluate and verify the
operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter
described in this document. The furnishing of this document does not grant you
any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785 U.S.A.
For license inquiries regarding double-byte character set (DBCS) information,
contact the IBM Intellectual Property Department in your country or send
inquiries, in writing, to:
Intellectual Property Licensing
Legal and Intellectual Property Law
IBM Japan, Ltd.
1623-14, Shimotsuruma, Yamato-shi
Kanagawa 242-8502 Japan
The following paragraph does not apply to the United Kingdom or any other
country where such provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS
PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS
FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or
implied warranties in certain transactions, therefore, this statement may not apply
to you.
This information could include technical inaccuracies or typographical errors.
Changes are periodically made to the information herein; these changes will be
incorporated in new editions of the publication. IBM may make improvements
and/or changes in the product(s) and/or the program(s) described in this
publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for
convenience only and do not in any manner serve as an endorsement of those Web
sites. The materials at those Web sites are not part of the materials for this IBM
product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it
believes appropriate without incurring any obligation to you.
Licensees of this program who wish to have information about it for the purpose
of enabling: (i) the exchange of information between independently created
programs and other programs (including this one) and (ii) the mutual use of the
information which has been exchanged, should contact:
IBM Corporation
2Z4A/101
11400 Burnet Road
Austin, TX 78758 U.S.A.
Such information may be available, subject to appropriate terms and conditions,
including in some cases, payment of a fee.
The licensed program described in this information and all licensed material
available for it are provided by IBM under terms of the IBM Customer Agreement,
IBM International Program License Agreement, or any equivalent agreement
between us.
Information concerning non-IBM products was obtained from the suppliers of
those products, their published announcements or other publicly available sources.
IBM has not tested those products and cannot confirm the accuracy of
performance, compatibility or any other claims related to non-IBM products.
Questions on the capabilities of non-IBM products should be addressed to the
suppliers of those products.
This information contains examples of data and reports used in daily business
operations. To illustrate them as completely as possible, the examples include the
names of individuals, companies, brands, and products. All of these names are
fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of
International Business Machines Corp., registered in many jurisdictions worldwide.
Other product and service names might be trademarks of IBM or other companies.
A current list of IBM trademarks is available on the Web at “Copyright and
trademark information” at www.ibm.com/legal/copytrade.shtml.
Java and all Java-based trademarks and logos are trademarks or registered
trademarks of Oracle and/or its affiliates.
Linux is a registered trademark of Linus Torvalds in the United States, other
countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of
Microsoft Corporation in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other
countries.
Privacy policy considerations
IBM Software products, including software as a service solutions, (“Software
Offerings”) may use cookies or other technologies to collect product usage
information, to help improve the end user experience, to tailor interactions with
the end user or for other purposes. In many cases no personally identifiable
information is collected by the Software Offerings. Some of our Software Offerings
can help enable you to collect personally identifiable information. If this Software
Offering uses cookies to collect personally identifiable information, specific
information about this offering’s use of cookies is set forth below.
This Software Offering does not use cookies or other technologies to collect
personally identifiable information.
If the configurations deployed for this Software Offering provide you as customer
the ability to collect personally identifiable information from end users via cookies
and other technologies, you should seek your own legal advice about any laws
applicable to such data collection, including any requirements for notice and
consent.
For more information about the use of various technologies, including cookies, for
these purposes, See IBM’s Privacy Policy at http://www.ibm.com/privacy and
IBM’s Online Privacy Statement at http://www.ibm.com/privacy/details the
section entitled “Cookies, Web Beacons and Other Technologies” and the “IBM
Software Products and Software-as-a-Service Privacy Statement” at
http://www.ibm.com/software/info/product-privacy.