IBM z/OS V2R2 Communications Server TCP/IP Implementation Volume 1
Bill White
Octavio Ferreira
Teresa Missawa
Teddy Sudewo
Redbooks
International Technical Support Organization
November 2016
SG24-8360-00
Note: Before using this information and the product it supports, read the information in “Notices” on
page xi.
Contents
Notices
Trademarks
Preface
Authors
Now you can become a published author, too!
Comments welcome
Stay connected to IBM Redbooks
5.6.1 Commands to diagnose networking connectivity problems
5.6.2 Diagnosing an OMPROUTE problem
5.7 Additional information
Chapter 10. IBM z/OS in an ensemble
10.1 Basic concepts
10.2 zEnterprise Unified Resource Manager
10.3 Connectivity
10.3.1 Intranode management network
10.3.2 Intraensemble data network
10.4 Enabling z/OS as a member of the ensemble
10.4.1 Enabling z/OS for IPv6
10.4.2 Enabling VTAM for the ensemble
10.4.3 Validating the INMN interfaces in z/OS
10.4.4 Displaying information about the OSM interfaces
10.5 Adding z/OS Communications Server into the ensemble
10.5.1 Configuring the OSA CHPID to OSX in HCD
10.5.2 Creating a VLAN definition on Unified Resource Manager in the HMC
10.5.3 Adding hosts to the virtual network
10.5.4 Configuring OSX interfaces in the TCP/IP stack
10.5.5 Displaying information about the OSX interfaces
10.5.6 HiperSockets connectivity to the intraensemble data network
10.5.7 Enabling HiperSockets access to the intraensemble data network
10.5.8 Verifying the HiperSockets IQDX implementation
10.6 Additional information
x IBM z/OS V2R2 Communications Server TCP/IP Implementation Volume 1
Notices
This information was developed for products and services offered in the US. This material might be available
from IBM in other languages. However, you may be required to own a copy of the product or product version in
that language in order to access it.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user’s responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not grant you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, MD-NC119, Armonk, NY 10504-1785, US
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any
manner serve as an endorsement of those websites. The materials at those websites are not part of the
materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you provide in any way it believes appropriate without
incurring any obligation to you.
The performance data and client examples cited are presented for illustrative purposes only. Actual
performance results may vary depending on specific configurations and operating conditions.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
Statements regarding IBM’s future direction or intent are subject to change or withdrawal without notice, and
represent goals and objectives only.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to actual people or business enterprises is entirely
coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the sample
programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore,
cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are
provided “AS IS”, without warranty of any kind. IBM shall not be liable for any damages arising out of your use
of the sample programs.
The following terms are trademarks or registered trademarks of International Business Machines Corporation,
and might also be trademarks or registered trademarks in other countries.
AIX® MVS™ VTAM®
CICS® NetView® WebSphere®
CICS Explorer® OMEGAMON® z Systems®
FICON® Parallel Sysplex® z/OS®
Global Business Services® PR/SM™ z/VM®
HiperSockets™ RACF® z/VSE®
IBM® Redbooks® z10™
IBM z Systems® Redpapers™ z13™
IBM z13® Redbooks (logo) ® z13s™
IBM z13s™ RMF™ z9®
IMS™ System z10® zEnterprise®
Language Environment® System z9®
Lotus® Tivoli®
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other
countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Other company, product, or service names may be trademarks or service marks of others.
For more than 50 years, IBM® mainframes have supported an extraordinary portion of the
world’s computing work, providing centralized corporate databases and mission-critical
enterprise-wide applications. IBM z Systems™, the latest generation of the IBM distinguished
family of mainframe systems, has come a long way from its IBM System/360 heritage.
Likewise, its IBM z/OS® operating system is far superior to its predecessors in providing,
among many other capabilities, world-class and state-of-the-art support for the TCP/IP
internet protocol suite.
TCP/IP is a large and evolving collection of communication protocols that is managed by the
Internet Engineering Task Force (IETF), an open, volunteer organization. Because of its
openness, the TCP/IP protocol suite has become the foundation for the set of technologies
that form the basis of the internet. The convergence of IBM mainframe capabilities with
internet technology, connectivity, and standards (particularly TCP/IP) is dramatically changing
the face of information technology and driving requirements for even more secure, scalable,
and highly available mainframe TCP/IP implementations.
This IBM Redbooks® publication is for people who install and support z/OS Communications
Server. It introduces z/OS Communications Server TCP/IP, describes the system resolver,
and shows the implementation of global and local settings for single and multi-stack
environments. It presents implementation scenarios for TCP/IP base functions, connectivity,
routing, and subplexing.
For more specific information about z/OS Communications Server standard applications, high
availability, and security, see the other volumes in the series:
IBM z/OS V2R2 Communications Server TCP/IP Implementation Volume 2: Standard
Applications, SG24-8361
IBM z/OS V2R2 Communications Server TCP/IP Implementation Volume 3: High
Availability, Scalability, and Performance, SG24-8362
IBM z/OS V2R2 Communications Server TCP/IP Implementation Volume 4: Security and
Policy-Based Networking, SG24-8363
For comprehensive descriptions of the individual parameters for setting up and using the
functions that are described in this book, along with step-by-step checklists and supporting
examples, see the following publications:
z/OS Communications Server: IP Configuration Guide, SC27-3650
z/OS Communications Server: IP Configuration Reference, SC27-3651
z/OS Communications Server: IP System Administrator’s Commands, SC27-3661
z/OS Communications Server: IP User’s Guide and Commands, SC27-3662
This book does not duplicate the information in those publications. Instead, it complements
them with practical implementation scenarios that can be useful in your environment. To
determine at what level a specific function was introduced, see z/OS Communications Server:
New Function Summary, GC27-3664. For complete details, review the documents that are
listed in the additional information section at the end of each chapter.
Bill White is a Project Leader and Senior IBM z Systems® Networking and Connectivity
Specialist at IBM Redbooks, Poughkeepsie, NY.
Octavio Ferreira is a Consulting IT Specialist with IBM Brazil. He has 34 years of experience
in IBM software support. His areas of expertise include z/OS Communications Server, SNA
and TCP/IP, and Communications Server on all platforms. For the last 15 years, Octavio has
worked in the Area Program Support Group, providing guidance and support to clients and
designing networking solutions such as SNA-TCP/IP Integration, z/OS Connectivity,
Enterprise Extender design and implementation, and SNA-to-APPN migration. He has also
co-authored other IBM Redbooks publications.
Teddy Sudewo is an IT Specialist at IBM Indonesia, working with large bank customers. He
has over 3 years of experience with IBM z Systems and IBM Systems Storage hardware. He
holds a bachelor's degree in Electrical Engineering from Institut Teknologi Sepuluh
Nopember, Surabaya, Indonesia. His areas of expertise include z Systems hardware, z/OS,
TCP/IP, encryption, STP, and storage products that are related to the IBM mainframe
infrastructure. He has written extensively about basic TCP/IP configurations, FTP TLS, FTP
AT-TLS, and z/OSMF.
Thanks to the following people for their contributions to this project:
Doris Bunn, Mike Fox, Michael Gierlach, Randall Kunkel, Sam Reynolds, Jerry Stevens
IBM z/OS Communications Server Development, IBM Raleigh
Finally, we want to thank the authors of the previous z/OS Communications Server TCP/IP
Implementation series for creating the groundwork for this series:
Rufus P. Credle, Mike Ebbers, Rama Ayyar, Octavio L Ferreira, Yohko Ojima, Mike Riches,
Maulide Xavier, Valirio Braga, WenHong Chen, Demerson Cilloti, Sandra Elisa Freitag, Gwen
Dente, Marco Giudici, Adi Horowitz, Michael Jensen, Gazi Karakus, Shizuka Katoh, Uma
Maheswari Kumaraguru, Sherwin Lake, Bob Louden, Garth Madella, Yukihiko Miyamoto,
Hajime Nagao, Shuo Ni, Carlos Bento Nonato, Gilson Cesar de Oliveira, Roland Peschke,
Joel Porterie, Marc Price, Frederick James Rathweg, Micky Reichenberg, Georg Senfleben,
Rutsakon Techo, Larry Templeton, Rudi van Niekerk, Thomas Wienert, and Andi Wijaya.
Find out more about the residency program, browse the residency index, and apply online at:
ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us!
We want our books to be as helpful as possible. Send us your comments about this book or
other IBM Redbooks publications in one of the following ways:
Use the online “Contact us” review Redbooks form found at:
ibm.com/redbooks
Send your comments in an email to:
[email protected]
Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400
1
The z/OS Communications Server product includes ACF/VTAM, in addition to TCP/IP.
This chapter covers the topics that are shown in Table 1-1.
1.1, “Overview and basic concepts” on page 2: Basic concepts of Communications Server for z/OS IP
1.2, “Featured functions” on page 3: Key characteristics of Communications Server for z/OS IP
1.3, “Communications Server for z/OS IP implementation” on page 4: Functional overview of how Communications Server for z/OS IP is implemented
1.4, “Additional information” on page 19: Lists IBM publications that provide further details for implementing Communications Server for z/OS IP
z/OS Communications Server gives organizations the freedom to distribute workload to the
environments that are best suited to their needs. Communications Server for z/OS IP,
therefore, adds the z/OS environment to the list of environments in which an organization
can share data and computer processing resources in a TCP/IP network.
The TCP/IP address space is where the TCP/IP protocol suite is implemented for
Communications Server for z/OS IP. The TCP/IP address space is commonly referred to as a
stack.
Communications Server for z/OS IP has highly efficient direct communication between the
UNIX System Services address space (OMVS) and a TCP/IP stack that was integrated in
UNIX System Services. This communication path includes the UNIX System Services
Physical File System (PFS) component for AF_INET and AF_INET6 (Addressing
Family-Internet) sockets communication.
The TCP/IP protocol suite is implemented by an MVS started task within the TCP/IP address
space along with z/OS UNIX (UNIX System Services).
Communications Server for z/OS IP offers an environment that is accessible to the enterprise
IP network and the internet. It establishes the z/OS environment as a viable platform by making
z/OS applications and systems available to the non-z/OS environment, which is typically
UNIX or Windows centric. As a result, it eliminates the issues and challenges that many large
corporations face when they migrate to or integrate with a more accessible platform and newer
technologies.
The following list includes many of the technologies that are implemented in the z/OS
environment to complement TCP/IP:
High-speed connectivity, such as the following items:
– OSA-Express up to 10-Gigabit Ethernet in QDIO mode.
– IBM HiperSockets in internal queued direct I/O (iQDIO) mode.
– SMC-R.
– SMC-D.
High availability for applications that use IBM Parallel Sysplex® technology with the
following items:
– Dynamic Virtual IP Address (VIPA), which provides TCP/IP application availability
across z/OS systems in a sysplex and allows participating TCP/IP stacks to provide
backup and recovery for each other, for planned and unplanned TCP/IP outages.
– Sysplex Distributor, which provides intelligent load balancing for TCP/IP application
servers in a sysplex, and along with Dynamic VIPA provides a single system image for
client applications connecting to those servers.
– The Load Balancing Advisor (LBA), which provides z/OS Sysplex server application
availability and performance data to outboard load balancers through the Server
Application State Protocol (SASP).
As shown in Figure 1-1, many DLC protocols are provided with the z/OS Communications
Server by the VTAM component.
[Figure 1-1: applications shown include LPD client, LPD server, NDB, NICS, RPC, Kerberos,
TN3270 server, FTP server, FTP client, Telnet server, Telnet client, MISC server, Portmapper,
NPF, X-Windows client, SNMP Agent, SNMP query, SNMP Command, DPI library, OMPROUTE,
SMTP server, Netstat, Ping, Tracerte, R-commands, REXEC, RSH, Sendmail, and CSSMTP]
With Communications Server for z/OS IP, two worlds converge, providing access to the z/OS
UNIX environment and the traditional MVS environment.
ASIDs that are used for the TCP/IP stack, the resolver, VTAM, and TN3270 are non-reusable
because they provide PC-entered services that must be accessible to other address spaces.
If these address spaces are terminated enough times, all available ASIDs can be exhausted,
preventing the creation of an address space on the system. That situation might require an
initial program load (IPL).
To avoid this situation, start these address spaces with reusable ASIDs. To enable the reuse
ASID function, you must specify both of the following settings:
REUSASID(YES) in member DIAGxx of your PARMLIB
REUSASID=YES on the start command when starting the address space
The REUSASID parameter cannot be coded in the JCL of the started task because the Master
Scheduler needs to know this information before the JCL is read and the ASID is assigned.
Consideration: Do not specify REUSASID=YES when you are starting the VMCF and TNF
subsystems or any applications that use these subsystems.
The resolver started task always uses a reusable ASID when started during z/OS UNIX
initialization through the BPXPRMxx statement RESOLVER_PROC, but uses a non-reusable
ASID by default if it is later stopped and restarted. When you restart the resolver, specify the
REUSASID=YES parameter on the start command.
This book includes examples of REUSASID coding and its results in Appendix B, “Additional
parameters and functions” on page 471.
With the increasing demand for processing and memory capacity, the storage in 31-bit
addressing mode (below the bar) is of special concern. Over the past several releases, code
changes moved storage that used to be obtained below the bar to 64-bit addressing mode
(above the bar), and by doing so, helped reduce the overall costs of its delivered services.
These changes allow for improved networking scalability because TCP/IP’s usage of data
space, ECSA, and private virtual storage is not significantly affected by the scale of
networking activity.
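As a rough illustration of the difference in scale between the two addressing modes, the following Python arithmetic compares the ranges that 31-bit and 64-bit addresses can reach. It is a sketch only; the variable names are illustrative and are not taken from any IBM interface.

```python
# Sizes of the address ranges reachable in each addressing mode.
GIB = 2**30

below_bar_bytes = 2**31      # 31-bit addressing reaches 2 GiB (the "bar")
full_64bit_bytes = 2**64     # range reachable with 64-bit addressing

below_bar_gib = below_bar_bytes // GIB
ratio = full_64bit_bytes // below_bar_bytes

print(f"below the bar: {below_bar_gib} GiB")
print(f"64-bit range is 2**{ratio.bit_length() - 1} times larger")
```

Storage obtained above the bar therefore does not compete for the comparatively small range that 31-bit callers can address.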
Other types of TCP/IP network connectivity, for example XCF, MPCPTP, LCS, or CTC, are still
31-bit types and are 64-bit stack compatible. These drivers do not provide 64-bit exploitation.
When you use the 31-bit types of network connectivity, your network performance and CPU
cost might not be as efficient as they were in previous releases because extra data copies might
be required. One example of data traffic where this situation might occur is sysplex
distributor forwarding.
Tip: Use VIPAROUTE over OSA-Express QDIO or HiperSockets for sysplex distributor
forwarding to avoid using 31-bit network connectivity. Also, consider migrating your
connectivity environment to use only those drivers that support 64-bit mode.
For more information about 64-bit exploitation and how it might affect your z/OS environment,
see z/OS Communications Server: New Function Summary, GC27-3664.
The VTAM component of z/OS Communications Server provides the I/O support for each of
these communication interfaces, and requires the creation (dynamically or through definition)
of Transport Resource List Entries (TRLEs) to represent each interface. TRLEs must be
defined for the following communication interfaces:
MPCOSA
MPCIPA
MPCPTP
The DLCs that are implemented by z/OS Communications Server are described here:
CTC provides connectivity through a channel-to-channel (CTC) connection that is
established over an IBM z Systems FICON® environment.
LCS provides connectivity through special devices like the OSA-Express feature
1000BASE-T Ethernet, in LAN emulation mode (defined as channel-path identifier
(CHPID) type OSE in the I/O configuration).
MPCPTP allows a Communications Server for z/OS IP environment to connect to a peer
IP stack in a point-to-point configuration. With MPCPTP, a Communications Server for
z/OS IP stack can be connected to the following items:
– Another Communications Server for z/OS IP stack.
– An IP router with corresponding support.
– A non-z/OS server.
MPCPTP Samehost, also referred to as IUTSAMEH, is used to connect two or more
Communications Server for z/OS IP stacks running on the same z/OS LPAR. In addition, it
can be used to connect these Communications Server for z/OS IP stacks to z/OS VTAM
for the use of Enterprise Extender.
MPCIPA allows an Open Systems Adapter-Express (OSA-Express) port to act as an
extension of the z/OS Communications Server TCP/IP stack and not as a peer TCP/IP
stack, as with MPCPTP:
– OSA-Express provides a mechanism for communication called QDIO. Although it uses
the MPC protocol for its control signals, the QDIO interface is different from channel
protocols. It uses Direct Memory Access (DMA) to avoid the overhead that is associated
with channel programs. A partnership between Communications Server for z/OS IP
and the OSA-Express adapter offloads compute-intensive functions from the
z Systems server to the adapter.
– OSA-Express collaborates with z/OS Communications Server TCP/IP to support
10-Gigabit Ethernet, 1000BASE-T, Fast Ethernet, and High-Speed Token Ring
networks. TCP/IP hosts support all models of OSA-Express features.
– HiperSockets (iQDIO) provides high-speed, low-latency IP message passing between
logical partitions (LPARs) within a single z Systems server. The communication takes
place in processor system memory by using DMA. The virtual servers that are
connected through HiperSockets form a virtual LAN (VLAN). HiperSockets uses
internal QDIO at memory speeds to pass traffic between virtual servers.
The IBM 10 GbE RoCE Express feature enables the use of Remote Direct Memory
Access (RDMA) processing by using SMC-R protocols for TCP connections to remote
peers on external networks that also support this function.
SMC-D allows TCP/IP stacks on different LPARs within the same central processor
complex (CPC) to share the Internal Shared Memory (ISM) device.
For more information about devices and connectivity options, see Chapter 4, “Connectivity”
on page 139.
This book describes these items in more detail in the following sections.
Pascal API
You can use the Pascal API to develop TCP/IP applications in the Pascal language.
Supported environments are normal MVS address spaces. Unlike the other APIs, the Pascal
API does not interface directly with the LFS. It uses an internal interface to communicate with
the TCP/IP protocol stack. The Pascal API supports only AF_INET.
However, in a CINET PFS configuration, they function differently from z/OS UNIX APIs. In this
type of configuration, the z/OS Communications Server APIs always bind to a single PFS
transport provider, and the transport provider must be the TCP/IP stack that is provided by the
z/OS Communications Server.
For complete documentation of the z/OS UNIX C sockets APIs, see z/OS XL C/C++ Compiler
and Run-Time Migration Guide for the Application Programmer, GC09-4913. You can also
find further guidance in z/OS UNIX System Services Programming Tools, SA22-7805.
REXX sockets
The REXX sockets programming interface implements facilities for socket communication
directly from REXX programs by using an address rxsocket function. REXX socket programs
can run in TSO, online, or batch. The REXX sockets programming interface supports
AF_INET and AF_INET6.
For complete documentation of the TCP/IP Services APIs, see z/OS Communications Server:
IP Sockets Application Programming Interface Guide and Reference, SC31-8788.
These applications are described in more detail in IBM z/OS V2R2 Communications Server
TCP/IP Implementation Volume 2: Standard Applications, SG24-8361 and z/OS
Communications Server: IP Configuration Guide, SC27-3650.
Communications Server for z/OS IP offers two variants of the UNIX shell environment:
The z/OS shell, which is the default shell
The tcsh shell, which is an enhanced version of the Berkeley UNIX C shell
Communications Server for z/OS IP requires that UNIX System Services be customized
in full-function mode before the TCP/IP stack can initialize successfully. For this reason, this
book presents an overview of UNIX System Services, including the coding and security
considerations that are involved.
For a useful description of the UNIX System Services customization process and TCP/IP, see
z/OS UNIX System Services Planning, GA22-7800.
With the APIs, programs can run in any environment (including batch jobs, jobs submitted
by Time Sharing Option Extensions (TSO/E) interactive users, and most other started
tasks) or in any other MVS application task environment. The programs can request:
Only MVS services
Only z/OS UNIX services
Both MVS and z/OS UNIX services
In z/OS UNIX System Services, address spaces are created by the fork() or spawn()
functions of the Open Edition callable services:
For a fork() function, the system copies one process, called the parent process, into a
new process, called the child process, and places the child process in a new address
space, the forked address space.
A spawn() function also starts a new process in a new address space. Unlike a fork(), in
a spawn() call, the parent process specifies the name of a program to be run in the child
process.
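The difference between the two calls can be sketched on any POSIX system. The following Python example is an analogy only: it uses the standard os.fork() call and, as a stand-in for spawn(), starts a named program with subprocess.run(). It does not call the z/OS callable services, it does not show address space creation, and the helper names are hypothetical.

```python
import os
import subprocess
import sys


def run_forked_child(exit_code: int) -> int:
    """fork(): the child starts as a copy of the parent process."""
    pid = os.fork()
    if pid == 0:
        # Child process: it runs the same program image as the parent,
        # so it simply ends with the requested status.
        os._exit(exit_code)
    # Parent process: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


def run_spawned_child(exit_code: int) -> int:
    """spawn-style start: the parent names the program the child runs."""
    program = [sys.executable, "-c", f"import sys; sys.exit({exit_code})"]
    return subprocess.run(program).returncode


fork_status = run_forked_child(3)
spawn_status = run_spawned_child(4)
print(fork_status, spawn_status)
```

In the forked case the child continues from the point of the call with a copy of the parent's state, while in the spawn-style case the child begins at the entry point of the named program.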
A process can have one or more threads. A thread is a single flow of control within a process.
Application programmers create multiple threads to structure an application in independent
sections that can run in parallel for more efficient use of system resources.
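As a minimal sketch of this structure, the following Python example splits one computation into two independent sections, runs each in its own thread within a single process, and combines the results. The function and variable names are illustrative assumptions.

```python
import threading


def sum_range(start: int, stop: int, results: list, slot: int) -> None:
    """Independent section of work: sum one slice of the number range."""
    results[slot] = sum(range(start, stop))


results = [0, 0]
threads = [
    threading.Thread(target=sum_range, args=(0, 500, results, 0)),
    threading.Thread(target=sum_range, args=(500, 1000, results, 1)),
]
for worker in threads:
    worker.start()
for worker in threads:
    worker.join()    # wait until both sections have finished

total = sum(results)
print(total)
```

Each thread writes to its own slot of the shared list, so the sections do not need a lock to stay independent.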
To the z/OS system, the UNIX file hierarchy appears as a collection of z Systems File System
data sets. Each z/OS UNIX file system data set is a mountable file system. The root file
system is the first file system mounted. Subsequent file systems can be mounted logically on
a directory within the root file system or on a directory within any mounted file system.
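To show how a path relates to nested mount points, the following Python sketch finds the mounted file system that contains a given path by choosing the longest matching mount point. The mount table and the data set names in it are hypothetical, and the lookup is a simplified model rather than the logic that z/OS UNIX uses internally.

```python
import posixpath

# Hypothetical mount table: mount point -> file system data set name.
MOUNTS = {
    "/": "OMVS.ROOT.ZFS",
    "/etc": "OMVS.ETC.ZFS",
    "/u": "OMVS.USERS.ZFS",
    "/u/projects": "OMVS.PROJECTS.ZFS",
}


def owning_mount(path: str) -> str:
    """Return the mount point of the file system that contains path."""
    path = posixpath.normpath(path)
    best = "/"   # the root file system is always mounted first
    for mount_point in MOUNTS:
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if len(mount_point) > len(best):
                best = mount_point
    return best


for example in ("/etc/resolv.conf", "/u/projects/a.c", "/usr/bin"):
    mount_point = owning_mount(example)
    print(example, "->", mount_point, MOUNTS[mount_point])
```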
Each mountable file system is in a z/OS UNIX file system data set on direct-access storage.
DFSMS/MVS manages the z/OS UNIX file system data sets and the physical files.
For more information about the z/OS UNIX file system, see z/OS CS: IP Migration,
GC31-8773, and z/OS UNIX System Services Planning, GA22-7800.
An important part of your z/OS UNIX file system is in the /etc directory. The /etc directory
contains some basic configuration files of UNIX System Services, and most applications keep
their configuration files in there. To avoid losing all of your configuration when you upgrade
your operating system, put the /etc directory in a separate z/OS UNIX file system data set
and mount it at the /etc mountpoint. For more information about the /etc directory, see z/OS
UNIX System Services Planning, GA22-7800.
If a unit of work in MVS uses z/OS UNIX functions, this unit of work must have, in addition to
a valid MVS identity, a z/OS UNIX identity. A z/OS UNIX identity is based on a UNIX user ID
(UID) and a UNIX group ID (GID). Both UID and GID are numeric values in the range
0 - 2147483647 (2^31 - 1).
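The upper bound can be checked with a line of arithmetic. The following Python sketch also wraps the range in a small validation helper; the helper name is an assumption made for illustration.

```python
UID_GID_MAX = 2**31 - 1   # 2147483647, the upper bound for a UID or GID


def is_valid_id(value: int) -> bool:
    """True if value is an integer within the UID/GID range."""
    return isinstance(value, int) and 0 <= value <= UID_GID_MAX


print(UID_GID_MAX, is_valid_id(0), is_valid_id(2**31))
```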
In a z/OS UNIX system, the UID is defined in the OMVS segment in the user’s RACF user
profile, and the GID is defined in an OMVS segment in the group’s RACF group profile. What
an MVS environment calls the user ID is normally termed the user name or the login name in a
UNIX environment. It is the name that users use to present themselves to the
operating system. In both a z/OS UNIX system and other UNIX systems, this user name is
correlated to a numeric user identification, the UID, which is used to represent this user
wherever such information has to be stored in the z/OS UNIX environment. One example of
this is in the Hierarchical File System, where the UID of the owning user is stored in the file
security portion of each individual file.
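The relationship between a user name, a UID, and the owner that is recorded with a file can be observed on any POSIX system. The following Python sketch creates a temporary file, reads the numeric owner and group that are stored with it, and maps the UID back to a user name where the platform offers a user database. It uses only standard library calls and does not query RACF; the helper name is hypothetical.

```python
import os
import tempfile

try:
    import pwd   # POSIX-only module that maps UIDs to user names
except ImportError:
    pwd = None


def user_name_for(uid: int) -> str:
    """Look up the login name for a UID, falling back to the number."""
    if pwd is not None:
        try:
            return pwd.getpwuid(uid).pw_name
        except KeyError:
            pass   # no user database entry for this UID
    return str(uid)


with tempfile.NamedTemporaryFile() as handle:
    info = os.stat(handle.name)
    owner_uid = info.st_uid   # numeric owner recorded with the file
    owner_gid = info.st_gid   # numeric group recorded with the file

print(owner_uid, owner_gid, user_name_for(owner_uid))
```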
Access to z/OS UNIX resources is granted only if the MVS user ID has a valid OMVS
segment with an OMVS UID, or if a default user is configured. Access to resources in the
Hierarchical File System is based on the UID, the GID, and file access permission bits that are
stored with each file. The permission bits are three groups of three bits each (read, write,
and execute). The groups describe the access that is granted to the following users:
The owner of the file itself
The users whose GID matches the group of the file
The rest of the world
The superuser UID has a special meaning in all UNIX environments, including the z/OS UNIX
environment. This user has a UID of zero and can access every resource.
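The permission check described above can be sketched as follows. This Python example models only the classic UNIX rules (owner, group, and other bits, plus the superuser); it is a simplification that ignores access control lists and any RACF-specific controls, and the function name and its parameters are assumptions made for illustration.

```python
import stat


def may_access(uid, gids, file_uid, file_gid, mode, want):
    """Simplified permission-bit check.

    want is a 3-bit mask: 4 = read, 2 = write, 1 = execute.
    """
    if uid == 0:
        return True                     # superuser can access everything
    if uid == file_uid:
        granted = (mode >> 6) & 0o7     # owner bits
    elif file_gid in gids:
        granted = (mode >> 3) & 0o7     # group bits
    else:
        granted = mode & 0o7            # bits for everyone else
    return (granted & want) == want


mode = 0o640   # owner: read and write, group: read, others: nothing
print(stat.filemode(stat.S_IFREG | mode))
print(may_access(200, {100}, 100, 100, mode, 4))   # group member, read
print(may_access(300, {300}, 100, 100, mode, 4))   # unrelated user, read
```

Only one group of bits is consulted for a given caller: the owner's bits apply to the owner even when the group or other bits would grant more.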
In lieu of or in addition to RACF definitions for individual users, you can define a default user.
The default user is used to allow users without an OMVS segment defined to access UNIX
System Services. The default user concept should be used with caution because it might
become a security exposure.
For more information about the RACF security aspects of implementing the Communications
Server for z/OS IP, see IBM z/OS V2R2 Communications Server TCP/IP Implementation
Volume 4: Security and Policy-Based Networking, SG24-8363.
There are two shells: the z/OS shell and the tcsh shell. The login shell is determined by the
PROGRAM parameter in the RACF OMVS segment for each user. The default is the z/OS shell.
For more information about the z/OS UNIX shells, see z/OS UNIX System Services User’s
Guide, SA22-7801.
Operating mode
When a user first logs on to the z/OS UNIX shell, the user is operating in line mode.
Depending on the method of accessing the shell, the user can then use utilities that require
raw mode (such as vi) or run an X Window System application.
When you obtain a socket by using the socket() system call, you pass a parameter that tells
the socket library to which addressing family the socket should belong. All socket addresses
within one addressing family use the same syntax to identify sockets.
Note: Throughout this book, information regarding AF_INET (IPv4) also applies to
AF_INET6 (IPv6).
z/OS UNIX System Services implements support for a given addressing family through
different physical file systems (PFSs). There is one physical file system for the AF_INET
addressing family, and there is another for the AF_UNIX addressing family. A PFS is the part
of the z/OS UNIX operating system that handles the storage of data and its manipulation on a
storage medium.
You can configure either AF_INET or both AF_INET and AF_INET6. You cannot define the
stack as IPv6 only. Although coding AF_INET6 alone is not prohibited, TCP/IP does not start
because the master socket is AF_INET and the call to open it fails.
For more information, see Chapter 3, “Base functions” on page 73 or z/OS UNIX System
Services Planning, GA22-7800.
The AF_INET physical file system relies on other products to provide the AF_INET transport
services to interact with UNIX System Services and its sockets programs.
Figure: UNIX application, LFS, PFS (AF_INET, Type(INET)), CS for z/OS IP
The sockets/PFS effectively transforms the sockets calls from the z/OS UNIX interface to the
TCP/IP stack regardless of the version of MVS or TCP/IP. The sockets/PFS handles the
communication between the TCP/IP address space and the z/OS UNIX address space in
much the same manner as High Performance Native Socket (HPNS) handles the
communication between the TCP/IP address space and the TCP/IP client and server address
spaces.
A simple example of a situation where you run more than one TCP/IP stack in your z/OS
system is when you have two separate IP networks, one production and one test (or one secure
and one not). You do not want routing between them, but you do want to give hosts on both IP
networks access to your z/OS environment. In this situation, you can implement two TCP/IP
stacks, one connected to the production IP network and another connected to the test
network.
If a single AF_INET(6) transport provider is sufficient, then use INET. If you need more than
one AF_INET(6) transport provider (multiple TCP/IP stacks), then you must use CINET.
You can customize z/OS to use the Common INET physical file system with just a single
transport provider (AF_INET(6)), but it is not preferred because of a slight performance
decrease as compared to the INET. However, you might consider doing this if you expect to
run multiple stacks in the future.
The PFS is also known under the name INET, and this appears in UNIX System Services
definitions when a FILESYSTYPE and NETWORK TYPE must be defined in the BPXPRMxx
member of SYS1.PARMLIB.
Figure: OE application, OE LFS, C-INET PFS, IP network
The resolver function allows applications to use names instead of IP addresses to connect to
other partners. The mapping of IP addresses and names is managed by name servers or
local definitions. The resolver queries those name servers, or searches local definitions, to
convert the name to an IP address or the IP address to a name. Using the resolver relieves
users of having to remember the decimal or hexadecimal IP addresses.
The resolver is important for enabling TCP/IP stacks or TCP/IP applications to establish
connections to other hosts.
This chapter covers the topics that are shown in Table 2-1.
Section                                          Topic
2.2, “The resolver address space” on page 24     Key characteristics of the resolver address space
2.3, “Implementing the resolver” on page 50      The configuration and verification tasks of resolver implementation
In most systems, for an application to reach a remote partner, it uses two calls
to ask the resolver for the IP address of a host name, or vice versa. The calls are
gethostbyname(nnnnn) and gethostbyaddr(aaa.aaa.aaa.aaa). The IPv6-enabled
equivalent calls are getaddrinfo(nnnn) and getnameinfo(IPaddress).
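The following sketch shows the IPv6-enabled calls through the Python socket module, which exposes getaddrinfo() and getnameinfo() directly. The helper names resolve_all and reverse_lookup are hypothetical, and the demonstration passes a numeric address so that it does not depend on a name server being reachable.

```python
import socket

def resolve_all(host: str, port: int = 0) -> list:
    """Return the distinct IP addresses that getaddrinfo() reports for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        # sockaddr[0] is the address string for both AF_INET and AF_INET6.
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses

def reverse_lookup(address: str) -> str:
    """Return the host name that getnameinfo() reports for an IP address."""
    host, _service = socket.getnameinfo((address, 0), 0)
    return host

# A numeric host is returned as-is, without a resolver query.
print(resolve_all("127.0.0.1"))
```

Calling resolve_all() with a host name instead of a numeric address causes the resolver to consult its local hosts file or name servers, as described in this section.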
Figure 2-1 illustrates the information request and response flows. The resolver gets a request
and based on its own configuration file, either looks at a local hosts file or sends a request to
a DNS server. After the relationship between the host name and IP address is established,
the resolver returns the response to the application.
As mentioned, the resolver function allows applications to use names instead of IP addresses
to connect to other partners. Although using an IP address might seem to be an easy way to
establish such a connection, for applications that need to connect to numerous partners, or
for applications that are accessed by thousands of clients, using names is a much easier and
more reliable form of establishing access.
Table 2-2 Comparison of direct addressing with name resolution
Technology
  Hardcoded IP addresses: None. Use the entered IP address directly on the connect() or
  sendto() socket call.
  Local hosts file: Use gethostbyname() and let the resolver find an IP address in the
  locally configured hosts file.
  Domain Name System: Use gethostbyname() and let the resolver contact the configured
  name server for an IP address.
Benefits
  Hardcoded IP addresses: Fast (no name resolution). Good in some debugging situations
  (you know exactly which IP address is being used).
  Local hosts file: Fast (local name resolution).
  Domain Name System: IP address changes can be done without any local changes. All host
  names (in the entire network) can be resolved. A hierarchical name space.
Note: Use of the z/OS UNIX services requires the resolver to have an OMVS segment that
is associated with its user ID. If you do not have a user ID defined for the resolver that has
an associated OMVS segment, you must act before starting the resolver. Otherwise, the
resolver address space initialization fails and the initialization of all TCP/IP stacks is
delayed. Complete one of the following steps:
If you already have a resolver user ID but it does not have an OMVS segment, then you
must define an OMVS segment for the resolver user ID.
If you do not have a resolver user ID, then you must create one that includes an OMVS
segment.
For more information, see “Creating a user ID for the resolver and assigning an OMVS
segment” on page 50.
For more information about defining and assigning a user ID for started procedures, see
“Using Started Procedures” in z/OS Security Server RACF Security Administrator’s Guide:
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ichza7c0/5.9?SHELF=EZ2ZO213&DT=20110620175910
For more information about defining an OMVS segment, see “RACF and z/OS UNIX” in
z/OS Security Server RACF Security Administrator’s Guide:
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ichza7c0/17.0?SHELF=EZ2ZO213&DT=20110620175910#HDRRUSS
The resolver must be started before TCP/IP stacks or any TCP/IP applications issue the
resolver calls. It can be started in one of the following ways:
Default z/OS UNIX resolver
If no customized resolver address space is configured, the z/OS UNIX System Services
starts the default resolver. The default resolver is named RESOLVER. To use the default
RESOLVER address space, specify the RESOLVER_PROC(DEFAULT) statement or do not
specify any RESOLVER_PROC statements in BPXPRMxx.
Customized resolver address space
The customized resolver address space can specify additional options to control the use
of the resolver configuration file. To create the customized resolver address space, create
a resolver started procedure and a SETUP data set to specify the additional options. The
customized resolver address space can be started automatically with the
RESOLVER_PROC(procname) statement in BPXPRMxx.
Although the resolver address space can be started manually, as a preferred practice, start
the resolver address space automatically during initialization of the UNIX System Services by
defining the RESOLVER_PROC() statement within BPXPRMxx.
After the resolver address space is activated, the global TCPIP.DATA statements cannot be
overridden unless the MODIFY command is issued.
The configuration file can be an MVS data set or a z/OS UNIX Hierarchical File System (HFS)
file.
The TCP/IP applications run a set of commands in the Sockets API Library to initiate a
request to the resolver in z/OS. The Sockets API Library uses one of the following socket
environments:
Native MVS environment
z/OS UNIX environment
Table 2-3 lists some of the APIs, z/OS applications, and user commands that use the native
MVS environment and the z/OS UNIX environment.
Table 2-3 Socket APIs, applications, and commands in Native MVS or z/OS UNIX environment
Items Native MVS environment z/OS UNIX environment
Each socket environment uses a different search order of the resolver configuration file, as
shown in Figure 2-2.
Native MVS environment
1. GLOBALTCPIPDATA
2. //SYSTCPD DD statement
3. userid/jobname.TCPIP.DATA
4. SYS1.TCPPARMS(TCPDATA)
5. DEFAULTTCPIPDATA
6. TCPIP.TCPIP.DATA
z/OS UNIX environment
1. GLOBALTCPIPDATA
2. RESOLVER_CONFIG environment variable
3. /etc/resolv.conf
4. //SYSTCPD DD statement
5. userid/jobname.TCPIP.DATA
6. SYS1.TCPPARMS(TCPDATA)
7. DEFAULTTCPIPDATA
8. TCPIP.TCPIP.DATA
Figure 2-2 The resolver configuration file search order for each socket environment
Note: UNIX System Services Callable sockets use the z/OS UNIX environment search
order, but the z/OS UNIX API does not have access to the IBM XL C/C++ environment
variables (for example, RESOLVER_CONFIG and RESOLVER_TRACE).
These separate search orders provide the flexibility to control the resolver lookup
differently, depending on which socket API the application uses. However, because the search
orders differ, address resolution can sometimes produce an unexpected result.
For example, if you set up /etc/resolv.conf as your resolver configuration file, the FTP
server application that uses the z/OS UNIX search order can resolve the name-to-address or
address-to-name successfully. However, the TN3270 server, which uses the native MVS
search order, fails because /etc/resolv.conf is not included in its search list.
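The effect of the two search orders can be modeled with a small sketch. It is a simplification under stated assumptions: each candidate is represented only by its label from Figure 2-2, the set of available candidates is supplied by the caller, and the function name first_config is hypothetical. It does not model the statement-level merging that applies to the global TCPIP.DATA file.

```python
# Candidate labels in the order that each socket environment searches them.
NATIVE_MVS_ORDER = [
    "GLOBALTCPIPDATA",
    "//SYSTCPD DD statement",
    "userid/jobname.TCPIP.DATA",
    "SYS1.TCPPARMS(TCPDATA)",
    "DEFAULTTCPIPDATA",
    "TCPIP.TCPIP.DATA",
]
ZOS_UNIX_ORDER = [
    "GLOBALTCPIPDATA",
    "RESOLVER_CONFIG environment variable",
    "/etc/resolv.conf",
    "//SYSTCPD DD statement",
    "userid/jobname.TCPIP.DATA",
    "SYS1.TCPPARMS(TCPDATA)",
    "DEFAULTTCPIPDATA",
    "TCPIP.TCPIP.DATA",
]

def first_config(search_order, available):
    """Return the first candidate in search_order that is available, or None."""
    for candidate in search_order:
        if candidate in available:
            return candidate
    return None

# Only /etc/resolv.conf exists: the z/OS UNIX order finds it, the native MVS order does not.
only_resolv_conf = {"/etc/resolv.conf"}
print(first_config(ZOS_UNIX_ORDER, only_resolv_conf))
print(first_config(NATIVE_MVS_ORDER, only_resolv_conf))
```

Adding "GLOBALTCPIPDATA" to the available set makes both environments select the same first candidate, which is the motivation for the GLOBALTCPIPDATA statement.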
Using GLOBALTCPIPDATA
To deal with the complexity of the different search orders in the environments, the
GLOBALTCPIPDATA statement was introduced. Using the GLOBALTCPIPDATA statement, you can
use the same resolver configuration file throughout the z/OS system because it is the first
choice in all socket search orders. This consolidation allows for consistent name resolution
processing across all TCP/IP applications.
The TCPIP.DATA file that is specified by the GLOBALTCPIPDATA statement is often called the
global TCPIP.DATA file. If you define GLOBALTCPIPDATA, the following statements can be
included only in the global TCPIP.DATA file:
DomainOrigin/Domain or Search
NSInterAddr/NameServer
NSPortAddr
ResolveVia
ResolverTimeOut
ResolverUDPRetries
SortList
Other TCPIP.DATA statements can be optionally included in the global TCPIP.DATA file, and the
definition in the global TCPIP.DATA always has precedence. If TCPIPJobname is specified in
both the global TCPIP.DATA file and the local (non-global) TCPIP.DATA file, then the one in the
global TCPIP.DATA file is used.
If other TCPIP.DATA statements, such as HostName and TCPIPJobname, cannot be found in the
global TCPIP.DATA file, then the resolver continues its search according to the search order of
each socket environment. The search stops at the first file that is found.
If statements such as HostName and TCPIPJobname cannot be found in that file either, the
defaults are applied. The resolver does not continue searching the list. A maximum of two
files can be used (the global TCPIP.DATA file and one TCPIP.DATA file in the search order list).
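The precedence rules above can be sketched as a merge of at most two files plus defaults. This is an illustrative model only: files are represented as dictionaries keyed by uppercase statement name, the function name effective_statements and the default values are hypothetical, and real TCPIP.DATA parsing is not attempted.

```python
# Statements that are honored only in the global TCPIP.DATA file when one is defined.
GLOBAL_ONLY = {
    "DOMAINORIGIN", "DOMAIN", "SEARCH", "NSINTERADDR", "NAMESERVER",
    "NSPORTADDR", "RESOLVEVIA", "RESOLVERTIMEOUT", "RESOLVERUDPRETRIES",
    "SORTLIST",
}

def effective_statements(global_file, local_file, defaults):
    """Combine the global file, one local file, and defaults into effective values."""
    result = dict(defaults)
    if local_file:
        for name, value in local_file.items():
            # With a global file defined, global-only statements in the local file are ignored.
            if global_file is not None and name in GLOBAL_ONLY:
                continue
            result[name] = value
    if global_file:
        # A definition in the global file always takes precedence.
        result.update(global_file)
    return result

settings = effective_statements(
    global_file={"NSINTERADDR": "10.1.2.10"},
    local_file={"TCPIPJOBNAME": "TCPIPA", "NSINTERADDR": "10.9.9.9"},
    defaults={"HOSTNAME": "assumed-default"},  # placeholder default, not a product value
)
print(settings)
```

In this run, NSINTERADDR comes from the global file, TCPIPJOBNAME from the local file, and HOSTNAME falls back to the default because neither file defines it.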
Using GLOBALTCPIPDATA, the administrators can specify which statements should be applied
throughout the z/OS image, and decide which statements can be customized by each socket
environment by omitting those statements in the global TCPIP.DATA file.
Note: In the CINET multi-stack environment, omit the TCPIPJobname statement from the
global TCPIP.DATA file so that each TCP/IP stack, or the applications that have affinity to a
stack, can specify a local TCPIP.DATA with its own TCPIPJobname statement.
When using GLOBALTCPIPDATA in the CINET environment, the name server that is specified
by NSInterAddr or NameServer in the global TCPIP.DATA file must be accessible from all
TCP/IP stacks that issue resolver calls.
Figure 2-3 on page 29 depicts the relationship between global TCPIP.DATA and local
TCPIP.DATA.
Global TCPIP.DATA
  DOMAINORIGIN ITSO.IBM.COM
  NSINTERADDR 10.1.2.10
  NSPORTADDR 53
  RESOLVEVIA UDP
  RESOLVERTIMEOUT 5
  RESOLVERUDPRETRIES
  LOOKUP LOCAL DNS
Local TCPIP.DATA
  TCPIPJOBNAME TCPIPA
  HOSTNAME WTSC30A
  DATASETPREFIX TCPIPA
  MESSAGECASE MIXED
Local TCPIP.DATA (TCPIPZ, FTPDZ server)
  TCPIPJOBNAME TCPIPZ
  HOSTNAME WTSC30Z
  DATASETPREFIX TCPIPZ
  MESSAGECASE MIXED
Using DEFAULTTCPIPDATA
DEFAULTTCPIPDATA can be specified in the resolver SETUP data set to define the last choice of
the TCPIP.DATA in the search order. The file that is specified by DEFAULTTCPIPDATA is used
when the application does not specify the local (non-global) TCPIP.DATA.
Using COMMONSEARCH
When the local hosts file is searched, the search orders for the native MVS environment and
the z/OS UNIX environment are different. The difference in the search orders adds complexity
to configuration tasks and can lead to unexpected name resolution results.
The local hosts files looked up in this search order are typically called ETC.IPNODES files.
When COMMONSEARCH is specified in the resolver SETUP data set, it uses the same search
order for both IPv4 and IPv6 queries. You can list both IPv4 and IPv6 addresses in the
ETC.IPNODES file.
Native MVS environment
1. GLOBALIPNODES
2. userid/jobname.ETC.IPNODES
3. hlq.ETC.IPNODES
4. DEFAULTIPNODES
5. /etc/ipnodes
z/OS UNIX environment
1. GLOBALIPNODES
2. RESOLVER_IPNODES environment variable
3. userid/jobname.ETC.IPNODES
4. hlq.ETC.IPNODES
5. DEFAULTIPNODES
6. /etc/ipnodes
Figure 2-4 Local hosts file search order with COMMONSEARCH specified
If COMMONSEARCH is not specified in the resolver SETUP data set, then the default is
NOCOMMONSEARCH and the default search order that is shown in Figure 2-5 on page 31 is used.
Using GLOBALIPNODES
The GLOBALIPNODES statement specifies the global local host file that is used in the entire z/OS
image, regardless of which environment (native MVS or z/OS UNIX) that the applications or
sockets API use. To put the GLOBALIPNODES statement into effect for the name resolution of
IPv4 addresses, also specify COMMONSEARCH in the resolver SETUP data set.
Using DEFAULTIPNODES
The DEFAULTIPNODES statement specifies the last candidate of the local host file search. To put
the DEFAULTIPNODES statement into effect for the name resolution of IPv4 addresses, also
specify COMMONSEARCH in the resolver SETUP data set.
Figure 2-5 Local hosts file search order with NOCOMMONSEARCH specified (default)
Two of the new resolver setup file statements are CACHE and NOCACHE. The CACHE statement,
which is the default, explicitly indicates that resolver caching is active across the entire
system. The NOCACHE statement explicitly indicates that resolver caching is not active across
the entire system. You must code NOCACHE if you want to maintain the current level of resolver
processing. The setting of CACHE or NOCACHE can be changed dynamically by running the
MODIFY RESOLVER,REFRESH,SETUP command. If you change from a setting of CACHE to a setting
of NOCACHE dynamically, any existing cache records are immediately deleted.
Figure: query for host.pok.ibm.com on a z/OS LPAR (flow steps 1 - 6), with TCPIP.DATA
specifying NSINTERADDR 10.1.1.1 and NSINTERADDR 10.1.1.2 as name servers
Using CACHESIZE(size)
CACHESIZE indicates how much storage the cache function can use to hold resolver cache
information. The valid range for size is 1 - 999 MB. The default is 200 MB. For planning
purposes, assume a megabyte of data holds slightly more than 400 entries and consider
coding a CACHESIZE at least 50% greater than your expected needs.
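The planning guidance can be turned into simple arithmetic. The sketch below assumes exactly 400 entries per megabyte (slightly conservative, because the text says a megabyte holds slightly more than that) and 50% headroom; the function name suggested_cachesize_mb is hypothetical.

```python
import math

ENTRIES_PER_MB = 400   # planning figure from the text, rounded down
HEADROOM = 1.5         # at least 50% greater than the expected need
MIN_MB, MAX_MB = 1, 999  # valid CACHESIZE range in megabytes

def suggested_cachesize_mb(expected_entries: int) -> int:
    """Estimate a CACHESIZE value in MB for an expected number of cache entries."""
    needed_mb = expected_entries / ENTRIES_PER_MB
    with_headroom = math.ceil(needed_mb * HEADROOM)
    # Clamp the estimate to the range that the CACHESIZE statement accepts.
    return max(MIN_MB, min(MAX_MB, with_headroom))

# 20,000 entries need about 50 MB; with 50% headroom the estimate is 75 MB.
print(suggested_cachesize_mb(20000))
```

Because the value can be raised but not lowered without restarting the resolver, an estimate of this kind is best treated as a starting point.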
Important: You can modify the CACHESIZE by using the MODIFY RESOLVER,REFRESH,SETUP
command, but you can only increase the storage amount or keep it the same. To decrease the
value of CACHESIZE, you must stop and restart the resolver.
Using MAXTTL(time)
MAXTTL indicates the longest amount of time that the resolver cache can use saved
information. The valid range for time is 1 - 2147483647 (seconds). The default is
2,147,483,647, which is the largest TTL a name server can return.
When a list of IP addresses is cached for a host name, the Getaddrinfo process reorders the
list. If you need more control over the preferable IP address in the list, the statement SORTLIST
in the Resolver setup data set member can be used to define which IP address or network is
the one the cache returns to the requester. However, this statement is used only the first time
a list is returned from the DNS, and only the chosen IP address is returned from that moment
onward.
You can configure the resolver to enable reordering of the cached list of IP addresses that
was returned in response to a host name resolution request. With this function, the list of IP
addresses is reordered in a round-robin fashion each time the list in the cache is
queried, which allows the connection requests to be distributed among the addresses that are
provided.
This function can be enabled for a single application, or for the entire system, depending
on where the statement is coded.
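Round-robin reordering of a cached address list can be sketched as follows. This illustrates the general technique only; the class name ReorderingCacheEntry is hypothetical, and the exact rotation that the resolver performs is an assumption.

```python
from collections import deque

class ReorderingCacheEntry:
    """A cached address list that rotates by one position after every query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def lookup(self):
        """Return the current ordering, then rotate so the next query starts one later."""
        current = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the end of the list
        return current

entry = ReorderingCacheEntry(["10.1.1.1", "10.1.1.2", "10.1.1.3"])
for _ in range(4):
    # The first address in each answer is the one a client typically connects to.
    print(entry.lookup()[0])
```

Successive queries start with a different address each time and wrap around after the last one, which spreads connection requests across the list.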
Attention: Avoid using the statements CACHEREORDER and SORTLIST together. With both
applied, the results might not be as expected.
After changing the resolver setup data set member to include the statement, run the modify
resolver,refresh,setup=<file name> command to apply the change. The messages
resulting from the refresh command show that the parameter is in place, as shown in
Example 2-2.
After CACHEREORDER is defined in the resolver setup data set member, it applies to all z/OS
LPARs. However, it is possible to disable it for a single application by including the statement
NOCACHEREORDER in the application’s resolver setup data set member.
To verify that the cache reorder function is working, follow the steps that are described in
“Testing the resolver DNS cache with CACHEREORDER in place” on page 35.
1 When EZZ9304I CACHEREORDER is displayed, system-wide cache reordering is in effect.
2. Run the netstat RESCache command in the console or Time Sharing Option (TSO)
command line to display information regarding the resolver cache. In the UNIX System
Services (now called z/OS UNIX) environment, the same command is netstat -q. Two
main types of information can be displayed: statistical information and actual resource
information.
You can specify additional modifiers or filters to influence the amount of cache data that is
displayed. For statistical information, you can add the DNS modifier to have the overall
statistics broken into statistical information on a name server IP address basis. You have
even more options for detailed entry information reports. You can filter the information by
the IP address of the name server that provided the information. You can filter the
information so that only entries that are related to a specific host name or IP address value
are displayed. You can display only negative cache information from the cache, either all
entries or subsets of entries based on name server IP address, host name value, or IP
address value.
3. Create a cache entry by running a ping command to resolve the host name
zoscs.lab.itso.ibm.com. To verify it, run the netstat -q command, as shown in
Example 2-4 and Example 2-5.
The next ping command to the same host shows that the cache provided the next address
in the list, as shown in Example 2-7.
This is a partial example of a netstat report showing a detailed cache entry. The reports are
formatted so that DNS A and AAAA records are displayed as one group, and DNS PTR
records are displayed as a second group. Negative cache entries can appear in either group,
in any order, and are identified by using the following notation:
***NA***
For each record, the cache entry key, or the target resource that was searched for to acquire
this cache information, is the first line of the entry. After that, the two types of entries are
similar. The IP address of the DNS name server that supplied this particular information is
displayed, allowing you to see what values were provided by what name servers. In the case
of DNS A and AAAA record entries, the host name that is used to create the record might
really be an alias or nickname for the official name of the resource. For that reason, the
display includes the official, or canonical, name, regardless of whether the names match.
There is no canonical name concept for DNS PTR records.
Two time values are displayed: one is the time and the date of cache entry creation. The other
is the time and date when the entry expires, based on the name server TTL or MAXTTL setting.
The netstat RESCACHE report does not include any resources that are in the cache that
represent expired information. The number of times this entry is reused is displayed as the
“Hits” value. Finally, for DNS A and AAAA entries, up to 35 IP addresses that are provided by
the specified name server for the host name value are included. For DNS PTR entries, the
one host name that is associated with the input IP address (either IPv4 or IPv6) is included.
The resolver also provides statistics for each currently unresponsive name server regarding
the number of queries that are attempted and the number of queries that received no
response during a sliding 5-minute interval.
Communications Server for z/OS IP considers a DNS name server to be unresponsive when
the number of unsuccessful queries exceeds a percentage threshold of the total queries that
are sent during a 5-minute interval. By default, the percentage threshold is 25% of the total
queries. This percentage can be customized by using the UNRESPONSIVETHRESHOLD
configuration statement in the resolver setup file.
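The threshold test can be expressed as a short sketch. It assumes that the comparison is a strict "exceeds", as the wording above suggests, and it treats a threshold of 0 as disabling the check; the function name is_unresponsive is hypothetical.

```python
def is_unresponsive(queries_sent: int, queries_failed: int,
                    threshold_percent: int = 25) -> bool:
    """Decide whether a name server counts as unresponsive for one interval."""
    # A threshold of 0 turns the notification function off; no queries means no verdict.
    if threshold_percent == 0 or queries_sent == 0:
        return False
    # Integer arithmetic avoids rounding issues: failed/sent > threshold/100.
    return queries_failed * 100 > threshold_percent * queries_sent

print(is_unresponsive(100, 26))   # 26% failures exceeds the default 25% threshold
print(is_unresponsive(100, 25))   # exactly at the threshold does not exceed it
```

Lowering threshold_percent makes the check more sensitive, which corresponds to coding a smaller UNRESPONSIVETHRESHOLD value.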
Note: The autonomic function uses shorter time intervals than the sliding 5-minute
intervals that are described here. For more information, see “Polling for unresponsiveness”
on page 41.
The percentage threshold value can also be changed while the resolver is active by changing
the UNRESPONSIVETHRESHOLD configuration statement in the resolver setup file and running the
MODIFY resolver,REFRESH,SETUP=setup_file_name command.
The unresponsive DNS notification function is enabled by default. It can be turned off by
specifying the UNRESPONSIVETHRESHOLD configuration statement with a value of 0.
If by the end of a subsequent monitor interval the resolver determines that the name server’s
failure rate dropped below the threshold value, the resolver considers this name server to be
responsive again, clears message EZZ9308E from the operator console, and issues a
message indicating the DNS is responsive again, as shown in Example 2-11.
Implementation
You must explicitly enable this function. By default, the resolver performs only the network
operator notification level of monitoring.
The resolver polls only name servers that are defined in the global TCPIP.DATA file. The
resolver generates a new poll every six seconds regardless of the timeout value being used.
After the resolver starts polling a name server, it continues polling until the name server
becomes healthy.
Note: The resolver uses a separate socket for each name server being polled. You might
need to increase your MAXSOCKETS value by the number of name servers in your global
TCPIP.DATA file to assure that the resolver can always get a socket for polling.
When the resolver finds an unresponsive name server, it does the following tasks:
1. Stops forwarding queries to the name server.
2. Alerts the network operator with an action message, EZZ9311E, that remains on the screen
until cleared by the operator, or until the resolver determines the name server is
responsive again.
3. Issues a second message, EZZ9313I, that provides the statistics that the
resolver used to determine that the name server was unresponsive.
F RESOLV33,DISPLAY
EZZ9298I DEFAULTTCPIPDATA - None
EZZ9298I GLOBALTCPIPDATA - SYS1.TCPPARMS(GLBLDATA)
EZZ9298I DEFAULTIPNODES - None
EZZ9298I GLOBALIPNODES - SYS1.TCPPARMS(IPNODES)
EZZ9304I COMMONSEARCH
EZZ9304I CACHE
EZZ9298I CACHESIZE - 10M
EZZ9298I MAXTTL - 600
EZZ9298I UNRESPONSIVETHRESHOLD - 5
EZZ9304I AUTOQUIESCE
EZD2035I NAME SERVER 9.12.6.7 184
STATUS: ACTIVE FAILURE RATE: *NA*
EZZ9293I DISPLAY COMMAND PROCESSED
...
...
Affinity server
An affinity server is an application that has affinity to a specific TCP/IP stack; it provides
service to the clients that are connected through the TCP/IP stack to the applications.
In this case, you must code a TCPIPJobname statement that represents the application to
direct traffic to a specific stack. So, when designing the global definitions in the resolver
address space, do not code a TCPIPJobname statement in GLOBALTCPIPDATA. Instead, allow it
to be coded in the local TCPIP.DATA.
A native TCP/IP sockets program always uses one stack only, and by default, it is the stack
that is identified in the TCPIPJOBNAME option in the chosen resolver configuration file. However,
the stack can also be chosen through the program configuration and API calls to associate
the program with a chosen stack, as shown in Figure 2-8.
Figure 2-8: native MVS socket program with TCPIPJobname TCPB; application-specific
configuration data with socket(), bind(8001, Y), and listen(); inbound and outbound flows
through BPX callable sockets, the pre-routing table, and C-INET to stacks X, Y, Z, and V
Generic server
A generic server is a server without an affinity to a specific stack, and it provides service to
any clients that are connected to any TCP/IP stacks on the system.
When using the generic bind, it does not matter whether the chosen resolver configuration file
has a TCPIPJobname; it is not used when the server is a pure generic server.
Figure: application-specific configuration data with socket(), bind(8001, Y), and listen();
inbound and outbound flows through BPX callable sockets, the pre-routing table, and C-INET
to stacks X, Y, Z, and V
Outbound connections or UDP datagrams are processed by the CINET pre-router, and the
stack with the best route to the destination is chosen.
When using a generic bind, the server port number must be reserved in all stacks. If one
stack has it reserved to another address space, the bind() call fails.
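The difference between a generic bind and a bind to a specific address can be sketched with the Python socket module. The sketch is platform-neutral: on a system without CINET, a generic bind covers all local interfaces rather than all TCP/IP stacks. The helper name bind_listener is hypothetical, and port 0 is used so that the system picks a free port.

```python
import socket

def bind_listener(address: str = "", port: int = 0) -> socket.socket:
    """Create a listening TCP socket; an empty address requests a generic bind."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # "" maps to INADDR_ANY (generic bind); a literal IP address is a specific bind.
    sock.bind((address, port))
    sock.listen()
    return sock

generic = bind_listener()               # accepts connections on any local address
specific = bind_listener("127.0.0.1")   # accepts connections on one address only
print(generic.getsockname()[0], specific.getsockname()[0])
generic.close()
specific.close()
```

A server that binds to a specific address behaves like the affinity case, because it receives only the traffic that arrives for that address.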
Consideration: The res_state structure (nsaddr_list) contains only the IPv4 addresses
coded on the NSINTERADDR or NAMESERVER statements. Applications that examine or update
the nsaddr_list cannot manipulate the IPv6 addresses.
The IPv6 search order is the same as the COMMONSEARCH search order, as shown in Figure 2-4
on page 30. If you do not want to use the COMMONSEARCH search order for existing IPv4 local
hosts files, you might need to maintain two separate local host files (for example, IPv4
addresses in HOSTS.LOCAL, and IPv6 and IPv4 addresses in ETC.IPNODES).
The default destination address selection algorithm takes a list of destination addresses and
sorts them to generate a new list. The algorithm sorts together both IPv6 and IPv4 addresses
by a set of rules.
Rule 4 Prefer matching address formats. If one address format matches its associated
source address format and the other destination does not meet this criterion,
then place the destination with the matching format before the other address.
Rule 5 Prefer higher precedence. If the precedence of one address is higher than the
precedence of the other address, then the address with the higher precedence
is placed before the other destination address.
Rule 6 Use the longest matching prefix. If one destination address has a longer
CommonPrefixLength with its associated source address than the other
destination address has with its source address, then the address with the
longer CommonPrefixLength is placed before the other address.
Rule 7 Leave the order unchanged. No rule selected a better address of these two;
they are equally good. Choose the first address as the better address of these
two and the order is not changed.
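Rule 6 and Rule 7 can be illustrated with a short sketch that computes CommonPrefixLength as the number of leading bits that two addresses share. It is a simplification: it compares only addresses of the same IP version, it ignores the earlier rules, and the function names are hypothetical. The stable sort leaves equally good candidates in their original order, which matches Rule 7.

```python
import ipaddress

def common_prefix_length(addr_a: str, addr_b: str) -> int:
    """Count the leading bits that two addresses of the same IP version share."""
    a = ipaddress.ip_address(addr_a)
    b = ipaddress.ip_address(addr_b)
    if a.version != b.version:
        raise ValueError("addresses must have the same IP version")
    # XOR leaves a 1 at every differing bit; bit_length locates the highest one.
    differing = int(a) ^ int(b)
    return a.max_prefixlen - differing.bit_length()

def sort_by_longest_prefix(pairs):
    """Order (destination, source) pairs so that longer common prefixes come first."""
    return sorted(pairs, key=lambda pair: -common_prefix_length(pair[0], pair[1]))

candidates = [("10.2.0.1", "10.1.1.10"), ("10.1.1.20", "10.1.1.10")]
print(sort_by_longest_prefix(candidates))
```

In this run, 10.1.1.20 shares 27 leading bits with its source address and 10.2.0.1 shares 14, so 10.1.1.20 is placed first.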
Extension Mechanisms for DNS (EDNS0) was introduced in RFC 2671 to address limitations,
such as the maximum size of UDP messages, that were imposed by the traditional DNS
implementation. The IBM implementation of the EDNS0 standard allows DNS communication of
up to 3072 bytes by using UDP. This implementation improves the ability of DNS to
communicate a large amount of data, such as IP version 6 (IPv6) addresses.
In rare situations where the DNS server was recently upgraded to support EDNS0, a refresh
of the z/OS resolver is required so that it can relearn the DNS server EDNS0 capabilities. To
refresh, issue the MODIFY RESOLVER,REFRESH command against the resolver address space.
2.2.9 Considerations
To implement the resolver address space, it is important to first determine whether your
environment requires a single TCP/IP stack or multiple TCP/IP stacks. In both cases, the
resolver is an independent address space and must be running before the TCP/IP stack is
started.
The statements that are defined in the global TCPIP.DATA file cannot be overridden by the
local TCPIP.DATA file of each TCP/IP stack. A statement in the local TCPIP.DATA file takes
effect only if it is not already defined in the global TCPIP.DATA file.
Important: In certain resolver environments, the use of the trace functions (such as
SockDebug or TraceResolver) might affect performance. Therefore, as a preferred
practice, use the method that is described in 2.4.3, “CTRACE: RESOLVER (SYSTCPRE)”
on page 66.
In the multiple-stack environment, as a preferred practice, create a global TCPIP.DATA file if all
the statements that are needed in the global TCPIP.DATA file (see “Using
GLOBALTCPIPDATA” on page 27) can be applied to all the stacks, as shown in Figure 2-3 on
page 29. If not, do not use the global TCPIP.DATA file and use only the local TCPIP.DATA file for
each stack.
Figure: system SC30 with stack TCPIPA (TN3270 server) and stack TCPIPZ (FTPDZ server)
Preferred practice: Although there are specialized cases where multiple stacks per LPAR
can provide value, as a preferred practice, implement only one TCP/IP stack per LPAR.
The reasons for this preferred practice are as follows:
A TCP/IP stack can use all available resources that are defined to the LPAR in which it
is running. Therefore, starting multiple stacks does not yield any increase in throughput.
When running multiple TCP/IP stacks, additional system resources, such as memory,
CPU cycles, and storage, are required.
Multiple TCP/IP stacks add a significant level of complexity to TCP/IP system
administration tasks.
It is not necessary to start multiple stacks to support multiple instances of an application
on a given port number, such as a test HTTP server on port 80 and a production HTTP
server also on port 80. This type of support can instead be implemented by using
BIND-specific support where the two HTTP server instances are each associated with
port 80 with their own IP address, by using the BIND option on the PORT reservation
statement.
One example where multiple stacks can have value is when an LPAR must be connected
to multiple isolated security zones in such a way that there is no network level connectivity
between the security zones. In this case, a TCP/IP stack per security zone can be used to
provide that level of isolation, without any network connectivity between the stacks.
Figure 2-12 depicts the environment that we use for this implementation.
SC30, TCPIP stack, RESOLV30

Global TCPIP.DATA
  DOMAINORIGIN ITSO.IBM.COM
  NSINTERADDR 10.1.2.10
  NSPORTADDR 53
  RESOLVEVIA UDP
  RESOLVERTIMEOUT 5
  RESOLVERUDPRETRIES
  LOOKUP LOCAL DNS

Local TCPIP.DATA
  TCPIPJOBNAME TCPIPA
  HOSTNAME WTSC30A
  DATASETPREFIX TCPIPA
  MESSAGECASE MIXED

Global ETC.IPNODES
  10.1.1.10 WTSC30A
  10.1.1.20 WTSC31B
  10.1.1.30 WTSC32C
  10.1.2.240 router1
  10.1.2.220 router2 ...
For information about defining and assigning a user ID for started procedures, go to the
following website:
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ichza7c0/5.9?SHELF=EZ2ZO213&DT=20110620175910
To create the procedure, copy the sample procedure hlq.SEZAINST(EZBREPRC) and customize
it to the environment, as shown in Example 2-12. The procedure has only one DD card that
must be configured, the SETUP DD card 1, which describes where the SETUP data set is.
Important: When the resolver is started by UNIX System Services, you must pay attention
to the following information:
The resolver address space is started by SUB=MSTR. This means that JES services are
not available to the resolver address space. Therefore, no DD cards with SYSOUT can
be used.
The resolver start procedure must be in a data set that is specified by the MSTJCLxx
PARMLIB member’s IEFPDSI DD card specification. Otherwise, the procedure is not
found and the resolver does not start. SYS1.PROCLIB is usually one of the libraries
that are specified there.
Important: Be careful when creating these global parameters. The definitions in the
resolver SETUP data set are applied to all TCP/IP stacks or applications.
To implement our resolver address space, we halt the running resolver by using the STOP
command, as shown in Example 2-20.
Important: Stop and restart the resolver only if you install a new level of the resolver code.
Notes:
If you want to start the default z/OS UNIX resolver, run the following command instead:
START IEESYSAS.RESOLVER,PROG=EZBREINI,SUB=MSTR
The resolver uses non-reusable address spaces. To start the resolver by using a
reusable address space ID (REUSASID), see 1.3.3, “Reusable address space ID” on
page 6.
If you want to reload the SETUP data set content changes, run the MODIFY command to
refresh the resolver. To show how this is done, we create a SETUP data set named NEWSETUP,
with the same configuration as the RESOLV30 setup file, change the UNRESPONSIVETHRESHOLD
statement to 35%, and refresh the resolver to reflect the changes, as shown in Example 2-22.
The resolver might encounter both correct and incorrect parameter values for the
same setup statement. If that occurs, the resolver uses the last correct specification, ignoring
any subsequent or previous specification. If no correct specification is found, the default value
is used.
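The selection rule in this paragraph can be sketched in a few lines of Python. This is a model of the rule only. The validity check (a whole number from 0 to 100) and the default value of 25 are assumptions made for the illustration, not values taken from the product documentation.

```python
def effective_value(specifications, is_valid, default):
    """Return the last valid specification, or the default if none is valid."""
    chosen = default
    for value in specifications:
        if is_valid(value):
            chosen = value  # a later valid value replaces an earlier one
    return chosen


def is_percentage(text):
    """Hypothetical validity rule: a whole number from 0 to 100."""
    return text.isascii() and text.isdigit() and 0 <= int(text) <= 100


# Values coded for one statement, in file order; "abc" and "500" are invalid.
coded = ["35", "abc", "50", "500"]
result = effective_value(coded, is_percentage, "25")
```

With these inputs, `result` is the string "50": the invalid entries before and after it are ignored, and the earlier valid entry "35" is superseded.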
During initialization, even if errors are encountered in the file, the resolver continues to parse
the file and issue messages identifying specific errors. You can use one single setup file for all
systems regardless of release level.
The resolver stops parsing the setup file if errors are encountered during MODIFY
RESOLVER,REFRESH,SETUP= command processing (see Example 2-22 on page 58).
The main reason for continuing with this behavior is that the resolver assumes that the MODIFY
command is a full replacement of the resolver configuration, which means that if a setup
statement is not coded in the setup file, the resolver assumes that the default value for the
statement should be used. This is true even if a non-default setting was specified for the
statement previously. If the resolver were to ignore errors in the setup file during MODIFY
processing, the behavior of the resolver might change drastically, and unintentionally,
after the MODIFY command.
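The full-replacement behavior can be modeled as follows. The statement names STMT_A and STMT_B and their values are hypothetical. The sketch also assumes that a rejected refresh leaves the running configuration unchanged, which the text implies but does not state outright.

```python
def refresh(current, defaults, new_setup, new_setup_has_errors):
    """Model of MODIFY REFRESH as a full replacement of the configuration."""
    if new_setup_has_errors:
        # Assumption: a refresh that fails leaves the configuration as it was.
        return dict(current)
    # Start from the defaults, then apply only what the new file codes.
    return {**defaults, **new_setup}


defaults = {"STMT_A": "default", "STMT_B": "default"}
current = {"STMT_A": "custom", "STMT_B": "custom"}

# The new setup file codes STMT_A only, so STMT_B falls back to its default.
after = refresh(current, defaults, {"STMT_A": "changed"}, False)
```

Here `after` holds "changed" for STMT_A and "default" for STMT_B, even though STMT_B had a non-default setting before the refresh.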
Two new messages were introduced with the resolver resiliency function:
A message is issued after resolver initialization completes when one or more errors are
detected.
The message EZD2039I is issued when the MODIFY RESOLVER,DISPLAY command is issued and
lists the errors that the resolver encountered during initialization.
The new message EZD2039I is included only if resolver setup file errors were detected and no
MODIFY RESOLVER,REFRESH command was successfully processed since resolver initialization
completed. Because MODIFY REFRESH processing succeeds only if there are no resolver setup
file errors, the assumption is that MODIFY REFRESH processing fixes any previous setup file
errors. Therefore, the message is no longer displayed when a successful MODIFY REFRESH is
performed.
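The condition under which EZD2039I appears can be summarized as a small state model. The class and method names in this Python sketch are hypothetical. It captures only the rule described here, not the actual display output.

```python
class ResolverMessages:
    """Tracks whether the display output would include EZD2039I."""

    def __init__(self, init_errors):
        self.init_errors = list(init_errors)  # errors found at initialization
        self.refreshed_ok = False             # no successful refresh yet

    def successful_refresh(self):
        """Record that a MODIFY REFRESH completed without setup file errors."""
        self.refreshed_ok = True

    def display_includes_ezd2039i(self):
        """True only if init errors exist and no refresh has succeeded since."""
        return bool(self.init_errors) and not self.refreshed_ok


state = ResolverMessages(["invalid value on one setup statement"])
before = state.display_includes_ezd2039i()
state.successful_refresh()
after_refresh = state.display_includes_ezd2039i()
```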
Automation
The new resolver messages are designed to assist with automation that detects errors during
system startup and to highlight previous errors that might not have been detected.
The TSO PING command was also successful, as shown in Example 2-28.
Another way to verify where the resolver is looking is to use the TRACE RESOLVER
parameter in the stack’s or application’s TCPIP.DATA file. For an explanation of how this is
done and what the trace contains, see 2.4, “Problem determination” on page 61.
This section offers a brief explanation of when to debug, which trace must be used, and how
to use these trace facilities. For more information about resolver diagnosis, see z/OS
Communications Server: IP Diagnosis Guide, GC31-8782.
Symptom: Succeeds, but another application fails when resolving the same host name.
Cause: The problem is with the resolver configuration for the application in the user’s environment.
Action: Use the Trace Resolver statement on the local TCPIP.DATA file that is used by the application that has the problem.

Symptom: Fails, but the host name is converted to an IP address.
Cause: The resolution is successful but the host is not reachable or active.
Action: This problem is related to connectivity, not a resolver problem.

Symptom: Fails to convert the name to an IP address.
Cause: The problem might be with the resolver configuration, searching local host files, or using DNS.
Action: Use Trace Resolver to solve the problem.
Tip: If the problem seems to be related to the DNS, use the LOOKUP LOCAL DNS statement
to check the local files first.
Tip: When directing Trace Resolver output to a TSO terminal, define the screen size to be
only 80 columns wide. Otherwise, the trace output is difficult to read.
Example 2-31 Using the OPTIONS DEBUG to get a trace of the resolver
OPTIONS DEBUG 1
TCPIPJOBNAME TCPIPA
HOSTNAME WTSC30A
DOMAINORIGIN ITSO.IBM.COM
DATASETPREFIX TCPIPA
MESSAGECASE MIXED
NSINTERADDR 10.1.2.10
NSPORTADDR 53
In this example, specify OPTIONS DEBUG (1) or TRACE RESOLVER to enable Trace Resolver.
Example 2-32 Trace Resolver partial output: z/OS UNIX shell environment
Resolver Trace Initialization Complete -> 2010/09/27 15:04:49.709930
res_init Resolver values:
Global Tcp/Ip Dataset = TCPIPA.TCPPARMS(GLOBAL) 1
Default Tcp/Ip Dataset = TCPIPA.TCPPARMS(DEFAULT)
Local Tcp/Ip Dataset = /etc/resolv.conf 2
...
...
(G) LookUp = LOCAL DNS 3
(*) Cache
res_init Succeeded
res_init Started: 2010/09/27 15:04:49.741620
res_init Ended: 2010/09/27 15:04:49.741624
***************************************************************************
GetAddrInfo Started: 2010/09/27 15:04:49.741646
GetAddrinfo Invoked with following inputs:
Host Name: admin 4
...
...
GetAddrInfo Only IPv4 Interfaces Exist
GetAddrInfo Searching Local Tables for IPv4 Address
Global IpNodes Dataset = TCPIPA.TCPPARMS(IPNODES) 5
Default IpNodes Dataset = None
Search order = CommonSearch
...
...
- Lookup for admin.ITSO.IBM.COM
- Lookup for admin
res_search(admin, C_IN, T_A)
res_search Host Alias Search found no alias 6
res_querydomain(admin, ITSO.IBM.COM, C_IN, T_A)
res_querydomain resolving name: admin.ITSO.IBM.COM
res_query(admin.ITSO.IBM.COM, C_IN, T_A)
The CTRACE support allows for JOBNAME, ASID filtering, or both. The trace buffer is in the
resolver private storage. The trace buffer minimum size is 128 KB. The maximum size is
128 MB. The default size is 16 MB. Trace records can optionally be written to an external
writer.
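The buffer limits stated here translate into a simple range check. The following Python sketch assumes that 1 KB is 1024 bytes and that an out-of-range request is rejected instead of being adjusted. Both points are assumptions for illustration, and the function name is hypothetical.

```python
KB = 1024
MB = 1024 * KB

MIN_BUFFER = 128 * KB      # minimum trace buffer size
MAX_BUFFER = 128 * MB      # maximum trace buffer size
DEFAULT_BUFFER = 16 * MB   # used when no size is requested


def buffer_size(requested=None):
    """Return the buffer size in bytes, applying the default and the limits."""
    if requested is None:
        return DEFAULT_BUFFER
    if not MIN_BUFFER <= requested <= MAX_BUFFER:
        raise ValueError(f"buffer size {requested} is outside 128 KB to 128 MB")
    return requested
```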
The resolver CTRACE can be started at any time by using the TRACE CT command, or it
can be activated during resolver procedure initialization.
Note: If you suspect an error exists in the operation of the resolver cache, you must collect
CTRACE records because there are no Trace Resolver trace entries for cache processing.
2. Using the sample resolver procedure that is included with the product, run the following
console command:
S RESOLV30,PARMS='CTRACE(CTIRESxx)'
The xx is the suffix of the CTIRESxx PARMLIB member to be used. To customize the
parameters that are used to initialize the trace, you can update CTIRES00 (the
SYS1.PARMLIB member), as shown in Example 2-34.
3. Run the TRACE CT command to define the options, as shown in Example 2-35.
To help in situations like these, z/OS Communications Server has the Resolver CTRACE
TRACERES option to dynamically enable or disable the Resolver trace process for one or more
applications without the need to recycle the application (stop and start). The TRACERES option
collects the trace resolver entries and saves them as Resolver CTRACE records.
To activate the Resolver trace by using the TRACERES option when the Resolver procedure is
started, complete the following steps:
1. Specify the CTRACE TRACERES option in the CTRACE PARMLIB member CTIRESxx, as
shown in Example 2-39.
2. Start the Resolver with the PARMS keyword. This command activates the Resolver
CTRACE component (SYSTCPRE), and also starts the external writer to receive the
collected data:
S RESOLV30,PARMS='CTRACE(CTIRES00)'
3. After the CTRACE is active, the TRACERES option can be activated or inactivated at any
point for any specific application by using the CTRACE command, without changing the
application status (see Example 2-40).
Example 2-40 Use the CTRACE command to activate TRACERES for a specific application
TRACE CT,ON,COMP=SYSTCPRE,SUB=(RESOLV30)
*007 ITT006A SPECIFY OPERAND(S) FOR TRACE CT COMMAND.
R 07,OPTIONS=(TRACERES),JOBNAME=(CS01),END
IEE600I REPLY TO 007 IS;OPTIONS=(TRACERES),JOBNAME=(CS01),END
ITT038I ALL OF THE TRANSACTIONS REQUESTED VIA THE TRACE CT COMMAND
WERE SUCCESSFULLY EXECUTED.
Tip: If the CTRACE command is run to activate a trace for a specific jobname, it might
override any previous active filter. Before you activate the TRACERES option for a specific
application, run D TRACE,COMP=SYSTCPRE,SUB=(resolver_proc) to verify whether any
previous definition is already active.
4. To verify whether the TRACERES option is active, display the current settings for the
Resolver CTRACE component (SYSTCPRE) by running the Display Trace command that
is shown in Example 2-41.
8. Use IPCS to format and view the formatted Trace Resolver output in the Resolver
CTRACE component (SYSTCPRE), and run the IPCS CTRACE,FULL command to format
the information in a similar manner, as shown in Example 2-43.
This chapter covers the topics that are shown in Table 3-1.
3.2, “Common design scenarios for base functions” on page 74: Key characteristics of base functions and why they might be important in your environment
3.3, “z/OS UNIX System Services setup for TCP/IP” on page 79: Selected implementation scenarios, tasks, configuration examples, and problem determination suggestions
3.4, “Configuring z/OS TCP/IP” on page 93: Configuration details for the z/OS TCP/IP environment
3.5, “Implementing the TCP/IP stack” on page 107: Implementation tasks for the TCP/IP stack
3.6, “Activating the TCP/IP stack” on page 114: Messages that are used to verify the accuracy of the current environment customization data sets that are used in z/OS UNIX and TCP/IP initialization
3.7, “Reconfiguring the system with z/OS commands” on page 132: z/OS commands that are used to reconfigure the system
3.8, “Job log versus syslog as a diagnosis tool” on page 136: Information about using job log versus syslog when diagnosing issues
Most of these functions are implemented at the lower layers. Certain base functions are
implemented at the application layer (such as Telnet and FTP). The details of the standard
applications can be found in IBM z/OS V2R2 Communications Server TCP/IP Implementation
Volume 2: Standard Applications, SG24-8361. This chapter describes the configuration
that provides the infrastructure of the TCP/IP protocol suite in the z/OS Communications
Server environment.
The z/OS TCP/IP stack (a TCP/IP instance) is a fully functional implementation of the
standard RFC protocols that are fully integrated and tightly coupled between z/OS and UNIX
System Services. It provides the environment that supports the base functions, and also the
many traditional TCP/IP applications. Here are the two environments that must be created
and customized to support the z/OS Communications Server for TCP/IP:
A native z/OS environment on which users can use the TCP/IP protocols in a standard
z/OS application environment, such as batch jobs (with JES interface), started tasks, Time
Sharing Option (TSO), CICS, and IMS applications.
A z/OS UNIX System Services environment that lets you develop and use applications
and services that conform to the POSIX or XPG4 standards (UNIX specifications). The
z/OS UNIX environment also provides some of the base functions to support the z/OS
environment and vice versa.
Because the z/OS Communications Server uses z/OS UNIX services even for traditional
z/OS environments and applications, a full-function mode z/OS UNIX environment, including
a Data Facility Storage Management Subsystem (DFSMS), a z/OS UNIX file system, and a
security product (such as Resource Access Control Facility (RACF)), is required before the
z/OS Communications Server can be started successfully and the TCP/IP environment
initialized.
Important: Although there are specialized cases where multiple stacks per LPAR can
provide value, it is a preferred practice to implement only one TCP/IP stack per LPAR.
Dependencies
To achieve a successful implementation of the z/OS Communications Server - TCP/IP
component, there are certain dependencies, as explained here:
Implement a full-function UNIX System Services system on z/OS. Detailed information
about this topic is available in z/OS UNIX System Services Planning, GA22-7800, and in
z/OS MVS Initialization and Tuning Reference, SA22-7592. Also, see z/OS Program
Directory, GI10-0670, which is available at the following address:
http://publibz.boulder.ibm.com/epubs/pdf/iea2p1c0.pdf
Define a RACF environment for the z/OS Communications Server - TCP/IP component.
This includes defining RACF groups to z/OS UNIX groups to manage resources, profiles,
user groups, and user IDs.
An OMVS UID must be defined with UID (0) and assigned to the started task name of the
Communications Server for z/OS IP system address space. Detailed information is
available in IBM z/OS V2R2 Communications Server TCP/IP Implementation Volume 4:
Security and Policy-Based Networking, SG24-8363, z/OS Security Server RACF Security
Administrator's Guide, SA22-7683, z/OS Security Server RACF System Programmer's
Guide, SA22-7681, and z/OS Security Server RACF Command Language Reference,
SA22-7687.
Customize SYS1.PARMLIB members with special reference to BPXPRMxx to use the
integrated sockets INET with the AF_INET and AF_INET6 physical file system. Detailed
information is available in z/OS MVS Initialization and Tuning Reference, SA22-7592,
z/OS UNIX System Services Planning, GA22-7800, and z/OS V1R7.0 Program Directory
GI10-0670.
Customize the TCP/IP configuration data sets:
– PROFILE.TCPIP
– TCPIP.DATA
– Other configuration data sets
Use fully functional VTAM, which is required to support the interfaces that are used by
TCP/IP.
Advantages
A single-stack environment has the following advantages:
Fewer CPU cycles are spent processing TCP/IP traffic because there is only one logical
instance of each physical interface in a single-stack environment versus a multiple-stack
environment.
Servers use fewer CPU cycles when certain periodic updates arrive (OMPROUTE
processing routing updates). Multiple stacks mean multiple copies of OMPROUTE.
Each stack requires a certain amount of storage, the most significant being virtual storage.
Multiple TCP/IP stacks add a level of complexity to TCP/IP system administration tasks.
Communications Server for z/OS IP uses the tightly coupled design of the z/OS
Communications Server, the integration of z/OS and UNIX System Services, and the
provision of RACF services. Coordination is the key to a successful implementation of the
TCP/IP stack.
Dependencies
The dependencies for the multiple-stack environment are the same as for the single-stack
environment, with the following additional dependencies:
Additional storage, especially virtual storage
Additional CPU cycles for processing subsequent interfaces and services performing
periodic functions, such as OMPROUTE routing updates
Advantages
There are advantages to running a multiple-stack environment because it provides you with
the flexibility to partition your networking environment. Here are advantages to consider:
You might want to establish separate stacks to separate workloads based on availability
and security. For example, you might have different requirements for a production stack, a
system test stack, and a secure stack.
This approach can, for example, be used to establish a test TCP/IP stack, where new
socket applications are tested before they are moved into the production system. You
might also want to apply maintenance to a non-production stack so it can be tested before
you apply it to the production stack.
Your strategy might be to separate your workload onto multiple stacks based on the
functional characteristics of applications, as with UNIX (OpenEdition) applications and
non-UNIX (z/OS) applications.
You might be running z/OS servers and UNIX (OpenEdition) servers on the same
well-known port (TN3270 and telnet on port 23). An alternative to this approach is the
BIND for INADDR_ANY function.
Whatever the reason, the ability to configure multiple stacks and have them fully functional,
independently and concurrently, can be used in many different ways.
TSO clients
TSO client functions can be directed against any number of TCP/IP stacks. The client must
be able to find the TCPIP.DATA data set appropriate for the stack of interest. You can modify
your TSO logon procedure with a SYSTCPD DD statement, or use a common TSO logon
procedure without the SYSTCPD DD statement and allocate the TCPIP.DATA data set to the
appropriate stack of interest.
Stack affinity
Any server or client must reference the appropriate stack if the needed stack is not the default
stack that is defined in the BPXPRMxx member of SYS1.PARMLIB. Servers can use the
_BPXK_SETIBMOPT_TRANSPORT environment variable to override the choice of the default
stack. There might also be applications that have affinity to the wrong stack and do not have
the option of establishing stack affinity. In those instances, you can run BPXTCAFF before the
application execution step. For example:
//AFFINITY EXEC PGM=BPXTCAFF,PARM='TCPIPA'
Port management
When there is a single stack and the relationship of the server to stack is 1:1, port
management is relatively simple. Using the PORT statement, the port number can be reserved
for the server in the PROFILE.TCPIP for that given stack.
Port management becomes more complex in an environment where there are multiple stacks
and a potential for multiple combinations of the same server (for example, UNIX System
Services TELNET and TN3270 TELNET). With use of VIPA, it is possible to use the same
“well-known” port number, in this case 23, for both services. The distinction is made by
different names mapping to different IP addresses (VIPAs). Therefore, in a multiple-stack
environment, you must answer several questions based on the following concepts:
Generic server
A generic server is a server without affinity for a specific stack, and it provides service to
any client in the network. FTP is an example because the stack is merely a connection
linking client and server. The service File Transfer is not related to the internal functioning
of the stack, and the server can communicate concurrently over any number of stacks.
Servers with an affinity for a specific stack
There must be an explicit binding of the server application to the chosen stack when the
service (for example, z/OS UNIX DNS, SNMP, and NETSTAT) is related to the internal
functioning of the stack.
This bind is made by using the setibmopt() socket call (to specify the chosen stack) or by
using the C function _iptcpn(), which allows applications to search in the TCPIP.DATA file
to find the name of a specific stack.
CPU resources
Provisions must be made for additional CPU cycles and storage (especially virtual storage).
These increases in resources are for the existence of the additional stacks running
concurrently.
For more information about MTU sizes for OSA-Express and HiperSockets, see IBM z/OS
V2R2 Communications Server TCP/IP Implementation Volume 3: High Availability, Scalability,
and Performance, SG24-8362.
RACF implementation
Each unit of work in the z/OS system that requires UNIX System Services must be
associated with a valid UNIX System Services identity. A valid identity refers to the presence
of a valid UNIX user ID (UID) and a valid UNIX group ID (GID) for each such user. The UID
and the GID are defined through the OMVS segment in the user’s RACF user profile and in
the group’s RACF group profile.
Each functional RACF access group must be authorized to access a specific TCP/IP RACF
resource with a specific access attribute. The details of this process are described in IBM
z/OS V2R2 Communications Server TCP/IP Implementation Volume 4: Security and
Policy-Based Networking, SG24-8363.
RACF offers you two techniques to assign user IDs and group IDs to started tasks:
The started procedure name table (ICHRIN03)
The RACF STARTED resource profiles
By using the STARTED resources, you can add new started tasks to RACF, and
immediately make those new definitions active, for example:
IEF695I START T03DNS WITH JOBNAME T03DNS IS ASSIGNED TO USER TCPIP3, GROUP
OMVSGRP
The user ID and default group must be defined in RACF, which then treats the user ID as any
other RACF user ID for its resource access checking. RACF allows multiple started procedure
names to be assigned to the same RACF user ID. In this example, this method is used to
assign RACF user IDs to all TCP/IP started tasks.
TCP/IP tasks need RACF user IDs with the OMVS segment defined. The user ID that is
associated with the main TCP/IP address space must be defined as a superuser; the
requirements for the individual servers vary, but most need to be a superuser also.
NETSTAT command
Access to the TSO NETSTAT command, the UNIX shell command onetstat, and command
options can be controlled by RACF, by defining NETSTAT resources to the RACF generic class
SERVAUTH. This command might also need to be restricted because it can be used to alter
or drop connections or to stop the TN3270 server.
SEZALOAD is one library that must be made part of your LNKLST concatenation. Because of
the LNKAUTH=LNKLST specification, it is APF-authorized when it is accessed through the
LNKLST concatenation. The SEZALOAD library holds the TCP/IP system code that is used
by both servers and clients.
In addition to the LNKLST libraries, there are libraries that are not accessed through the
LNKLST concatenation, but have to be APF-authorized. The SEZATCP library holds the
TCP/IP system code that is used by servers. This library is normally placed in the STEPLIB or
JOBLIB concatenation, which is part of the server JCL.
Every APF-authorized online application might have to be reviewed to ensure that it matches
the security standards of the installation. A program is a “well-behaved program” if it meets
the following requirements:
Logged-on users cannot access or modify system resources for which they are not
authorized.
The program does not require any special credentials to be able to run.
Or, in the case of RACF, the program does not need the RACF authorization attribute
OPERATIONS for execution.
Note: User IDs with the RACF attribute OPERATIONS have ALTER access to all data
sets in the system. The access authority to single data sets can be lowered or
excluded.
SYS1.PARMLIB is the single most important data set in the z/OS environment. It contains
most of the parameters that define z/OS and also many other subsystems. The
SYS1.PARMLIB data set definition parameters are critical to the proper initialization and
functioning of UNIX System Services and to the TCP/IP implementation. Several members of
interest are as follows:
IEASYS00
BPXPRMxx
Integrated Sockets PFS definitions
IEASYS00
Because the z/OS Communications Server uses z/OS UNIX services even for traditional MVS
environments and applications, a full-function mode z/OS UNIX environment, including a
DFSMS and z/OS File Systems (including z/OS UNIX file system), is required before the
z/OS Communications Server can be started and the TCP/IP environment successfully
established.
BPXPRMxx
All the parameters that are defined in BPXPRMxx should be reviewed and tailored to
individual installation specification and resource utilization. The following resources explain
the details and significance of each parameter in the BPXPRMxx member:
z/OS UNIX System Services Planning, GA22-7800
z/OS MVS Initialization and Tuning Guide, SA22-7591
The following resources detail the structure, design, installation, and implementation of the
z/OS UNIX environment:
z/OS UNIX System Services Planning, GA22-7800
z/OS UNIX System Services User's Guide, SA22-7801
z/OS V1R7.0 Program Directory GI10-0670
z/OS V1R7.0 Program Directory GI10-0670 is available at the following address:
http://publibz.boulder.ibm.com/epubs/pdf/iea2p1c0.pdf
Concepts such as Logical and Physical File Systems (PFS) are design components of z/OS
UNIX and are not described here.
Specifying NETWORK definitions for both AF_INET and AF_INET6 provides dual support. If IPv6
support is not what you want, you may omit the NETWORK DOMAINNAME(AF_INET6) statement
and subsequent parameters.
Example 3-1 BPXPRMxx definitions for a single stack supporting dual mode
FILESYSTYPE TYPE(UDS)
ENTRYPOINT(BPXTUINT)
NETWORK DOMAINNAME(AF_UNIX)
DOMAINNUMBER(1)
MAXSOCKETS(10000)
TYPE(UDS)
FILESYSTYPE TYPE(INET)
ENTRYPOINT(EZBPFINI) 3
/* IPv4 support */
NETWORK DOMAINNAME(AF_INET) 1
DOMAINNUMBER(2)
MAXSOCKETS(25000)
TYPE(INET) 2
INADDRANYPORT(10000)
INADDRANYCOUNT(2000)
/* IPv6 support */
NETWORK DOMAINNAME(AF_INET6) 4
DOMAINNUMBER(19)
TYPE(INET)
INET specifies a single stack with TCP/IP (by default) as the stack name. In Example 3-1 on
page 83, the numbers correspond to the following information:
1. AF_INET specifies the IPv4 support for the physical file type for the socket address that is
used by this stack (TCP/IP).
2. Specify TYPE(INET) for a single-stack environment. If you specify INET, you cannot start
multiple TCP/IP stacks.
3. EZBPFINI identifies a TCP/IP stack (this is the only valid value).
4. AF_INET6 specifies IPv6 support for the physical file type for the socket address that is
used by this stack (TCP/IP).
Example 3-2 BPXPRMxx definitions for a multiple stack supporting dual mode
FILESYSTYPE TYPE(UDS) ENTRYPOINT(BPXTUINT)
NETWORK DOMAINNAME(AF_UNIX)
DOMAINNUMBER(1)
MAXSOCKETS(10000)
TYPE(UDS)
FILESYSTYPE TYPE(CINET)
ENTRYPOINT(BPXTCINT)
NETWORK DOMAINNAME(AF_INET) 1
DOMAINNUMBER(2)
MAXSOCKETS(10000)
TYPE(CINET) 2
INADDRANYPORT(10000)
INADDRANYCOUNT(2000)
NETWORK DOMAINNAME(AF_INET6) 3
DOMAINNUMBER(19)
MAXSOCKETS(10000)
TYPE(CINET)
SUBFILESYSTYPE NAME(TCPIPA) 4
TYPE(CINET) 2
ENTRYPOINT(EZBPFINI) 5
DEFAULT
SUBFILESYSTYPE NAME(TCPIPB) 4
TYPE(CINET) 2
ENTRYPOINT(EZBPFINI) 5
.....
Note: The hlq.SEZALPA module must be cataloged into the MVS master catalog. The
hlq.SEZALOAD and hlq.SEZALNK2 link libraries can be cataloged into the MVS master
catalog. You can omit them from the MVS master catalog if you identify them to include
a volume specification:
TCPIP.SEZALOAD(WTLTCP),
TCPIP.SEZALNK2(WTLTCP)
If the three data sets that are mentioned are renamed during the installation process,
then use these names instead.
3. PROGnn or IEAAPFxx
Add the following TCP/IP libraries for APF authorization:
– hlq.SEZATCP
– hlq.SEZADSIL
– hlq.SEZALOAD
– hlq.SEZALNK2
– hlq.SEZALPA
– SYS1.MIGLIB
N03 NAME=SC30,SNA,NETAUTH
Before you make this update, make sure that the hlq.SEZALOAD definition is added to
LNKLSTxx and the library itself is APF-authorized. z/OS initializes the address spaces
of the TNF and VMCF subsystems during IPL as part of the master scheduler
initialization.
5. SCHEDxx
You must specify certain Communications Server for z/OS IP modules as privileged
modules in MVS. The following entries are present in the IBM-supplied program properties
table (PPT); however, if your installation has a customized version of the PPT, ensure that
these entries are present:
– For Communications Server for z/OS IP:
PPT PGMNAME(EZBTCPIP) KEY(6) NOCANCEL PRIV NOSWAP SYST LPREF SPREF
– If you use restartable VMCF and TNF:
PPT PGMNAME(MVPTNF) KEY(0) NOCANCEL NOSWAP PRIV SYST
PPT PGMNAME(MVPXVMCF) KEY(0) NOCANCEL NOSWAP PRIV SYST
– For NPF:
PPT PGMNAME(EZAPPFS) KEY(1) NOSWAP
PPT PGMNAME(EZAPPAAA) NOSWAP
– For SNALINK:
PPT PGMNAME(SNALINK) KEY(6) NOSWAP SYST
6. COMMNDxx
VMCF and TNF might be required for some of the Communications Server for z/OS IP
facilities and components you are using. If you use restartable VMCF and TNF, procedure
EZAZSSI must be run during your IPL sequence (EZAZSSI starts VMCF and TNF).
Either use your operation’s automation software to start EZAZSSI, or add a command to
your COMMNDxx member in SYSx.PARMLIB:
COM='S EZAZSSI,P=your_node_name'
Example 3-3 BPXPRMxx member with port range that is provided by a single-stack environment
/* IPv4 support */
NETWORK DOMAINNAME(AF_INET)
DOMAINNUMBER(2)
MAXSOCKETS(25000)
TYPE(INET)
INADDRANYPORT(10000) 8
INADDRANYCOUNT(2000) 8
/* IPv6 support */
NETWORK DOMAINNAME(AF_INET6)
DOMAINNUMBER(19)
TYPE(INET)
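The port range that results from the INADDRANYPORT and INADDRANYCOUNT values in this example can be worked out with simple arithmetic. The Python sketch below assumes that INADDRANYPORT names the first port of the range and INADDRANYCOUNT the number of consecutive ports. The function name is hypothetical.

```python
def inaddr_any_range(start_port, count):
    """Return the first and last port of the reserved range."""
    last_port = start_port + count - 1
    if start_port < 1 or count < 1 or last_port > 65535:
        raise ValueError("range must lie within ports 1 to 65535")
    return start_port, last_port


# Values from the example: INADDRANYPORT(10000) and INADDRANYCOUNT(2000).
first, last = inaddr_any_range(10000, 2000)
```

With the values from the example, the range runs from port 10000 through port 11999.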
Review the values that are specified in BPXPRMxx for MAXPROCSYS, MAXPROCUSER,
MAXUIDS, MAXFILEPROC, MAXPTYS, MAXTHREADTASKS, and MAXTHREADS.
12. IFAPRDxx or PROGxx
Use these to add product and feature information in a z/OS environment.
Update your TCP/IP startup JCL procedure. The sample for the Communications Server for
z/OS IP procedure is in hlq.SEZAINST(TCPIPROC).
Any TSO user can run any TCP/IP command and use a TCP/IP client function to access any
other TCP/IP server host through the attached TCP/IP network. If these TCP/IP servers have
not implemented adequate password protection, then any TSO client user can log on to these
servers and access all data.
Superuser mode
Certain commands and operations from OMVS or from the ISHELL are authorized only for
superusers. There are two alternatives for running as a superuser:
The user ID can have permanent superuser status.
The ID was created with a UID value of zero. TCP/IP started tasks and some of its servers
are also defined with a UID of zero.
The user ID can have temporary authority for the superuser tasks.
The defined UID is set up as a non-zero value in RACF, but the user is granted READ
access to the RACF facility class of BPX.SUPERUSER. Also, RACF provides superuser
granularity enhancements to assign functions to users that need them.
If you need only temporary authority to enter superuser mode, then granting simple READ
permission to the BPX.SUPERUSER facility class allows the user to switch back and forth
between superuser mode and standard mode. You can enter su from the OMVS shell, or you
can select SETUP OPTIONS from the ISHELL and specify Option #7 to obtain superuser
mode.
The user is then authorized to enter commands that are authorized for the superuser function
from the ISHELL, or switch to an OMVS shell the user has already signed onto. The basic
prompt level, indicated by the dollar sign ($) prompt, is changed when in superuser mode to a
pound sign (#). The exit command takes the user out of superuser mode and also the OMVS
(UNIX) shell. Running the whoami command shows the change of user IDs.
This error occurred because the home directory that is associated with the user in the OMVS
segment is not defined or authorized. You can determine the home directory with the RACF
listuser command (if you have the RACF authorization to use the command). However, you
still have access to the z/OS file, even though the message was displayed.
A similar problem occurs when trying to access the ISHELL environment, as shown in
Example 3-5.
In both cases, the user had an OMVS segment defined in RACF. However, the home directory
that was associated with the user in the user’s OMVS segment was not defined or authorized.
(You can determine the home directory with the RACF listuser command.) Authorization is
provided with the permission bits.
The same symptom shows up for users without an OMVS segment that is defined if the
BPX.DEFAULTUSER facility was activated with an inaccessible home directory.
In this case, although the user has the UNIX permission bit settings of 755 on the /u/cs01/
directory, the permission bits are set at 600 for the /u/ directory. Thus, you must ensure that
all directories in the entire path are authorized with suitable permission bits. After the settings
are changed to 755 for the /u/ directory, access to the subdirectory is allowed.
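The rule that every directory in the path needs suitable permission bits can be illustrated with a short Python sketch. It is only a model under stated assumptions: the function name, the dictionary of modes (standing in for what ls -alF would report), and the choice of the "other" permission class are all hypothetical, and superuser authority and access control lists are not considered.

```python
from pathlib import PurePosixPath

# Search (execute) bit for each UNIX permission class.
SEARCH_BIT = {"owner": 0o100, "group": 0o010, "other": 0o001}


def blocking_directories(path, modes, who="other"):
    """Return the directories on the way to `path` that `who` cannot search.

    `modes` is a hypothetical mapping of directory name to permission bits.
    """
    target = PurePosixPath(path)
    # parents lists the nearest ancestor first, so reverse it to walk from the root down.
    chain = list(reversed(target.parents)) + [target]
    bit = SEARCH_BIT[who]
    return [str(d) for d in chain if str(d) in modes and not modes[str(d)] & bit]


before = {"/": 0o755, "/u": 0o600, "/u/cs01": 0o755}
after = {**before, "/u": 0o755}
print(blocking_directories("/u/cs01/", before))  # ['/u']
print(blocking_directories("/u/cs01/", after))   # []
```

With 600 on /u/, the search bit is missing for that directory, so the 755 setting on /u/cs01/ is never reached.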
You can display UNIX permission bits from the ISHELL environment or by running the ls
-alF command from the shell.
Otherwise, it should be expanded to the following entry, depending on whether you want the
current directory searched first or last:
.:/bin:/usr/sbin
The instructions for setting up this user profile are in z/OS UNIX System Services User’s
Guide, SA22-7801, and z/OS UNIX System Services Planning, GA22-7800.
Note: To view the search path that is established for you, run echo $PATH from the shell
environment.
A user might attempt to run a simple TCP/IP command, such as oping, and receive an error
that the command is not found, as shown in Example 3-7.
In this case, you must preface the command with the directory path to find it:
/usr/lpp/tcpip/bin/oping
If you experience such a problem, check that the symbolic links are correct. Part of the
installation is to run the UNIX MKDIR program to set up the symbolic links for the various
commands and programs from their real path to /bin or /usr/sbin, where they can be found
by using the default search path.
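The effect of the search path and of a missing symbolic link can be sketched in Python. This is a simplified model, not the shell's actual lookup logic: the function name and the sets of existing files are hypothetical, and only the path /usr/lpp/tcpip/bin/oping and the search path .:/bin:/usr/sbin come from the text.

```python
def find_command(name, search_path, existing_files):
    """Return the first PATH candidate for `name` that exists, else None."""
    for directory in search_path.split(":"):
        candidate = directory.rstrip("/") + "/" + name
        if candidate in existing_files:
            return candidate
    return None


path = ".:/bin:/usr/sbin"
# Hypothetical file system contents: only the real path exists, no symbolic link.
without_link = {"/usr/lpp/tcpip/bin/oping"}
# The same system after a symbolic link to the command is created in /bin.
with_link = without_link | {"/bin/oping"}

print(find_command("oping", path, without_link))  # None
print(find_command("oping", path, with_link))     # /bin/oping
```

Without the link, the command exists but is not reachable through the search path, which matches the "command not found" symptom.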
The purpose here is to give an introduction to the data set naming and allocation techniques
that z/OS Communications Server uses. If you choose, you can allocate some of the
configuration data sets either implicitly or explicitly. In addition, you must ensure that both the
MVS and the z/OS UNIX functions can find the data sets.
3.4.2 PROFILE.TCPIP
Before you start your TCP/IP stack, you must configure the operational and address space
characteristics. These definitions are specified in the configuration data set, which is often
called PROFILE.TCPIP. The PROFILE.TCPIP data set is read by the TCP/IP address space
during initialization.
The PROFILE data set contains the following major groups of TCP/IP configuration
parameters:
Operating characteristics
Port number definitions
Network interface definitions
Network routing definitions
You can find detailed information about TCP/IP connectivity and routing definitions in
Chapter 4, “Connectivity” on page 139, and Chapter 5, “Routing” on page 223.
PROFILE.TCPIP statements
This section shows several essential statements for configuring the TCP/IP stack.
The syntax for the parameters in PROFILE can be found in z/OS Communications Server: IP
Configuration Reference, SC27-3651. Additional profile statements and descriptions are
available in “PROFILE.TCPIP statements” on page 95.
Most PROFILE parameters that are required in a basic configuration have default values that
allow the stack to be initialized and ready for operation. However, there are a few parameters
that must be modified or must be unique to the stack.
Appendix D, “Our implementation environment” on page 519 describes the environment that
was used to create this book.
Note: You can instead define IPv4 OSA-Express devices (IPAQENET), HiperSockets, and
Static VIPA with the INTERFACE statement. This is a preferred practice, and is described in
“INTERFACE” on page 96.
Each device type has a different set of parameters that you can define. For details about each
device type and its definition, see Chapter 4, “Connectivity” on page 139.
The following DEVICE and LINK statements are an example of defining one VIPA:
DEVICE VIPA1 VIRTUAL 0
LINK VIPA1L VIRTUAL 0 VIPA1
INTERFACE
The INTERFACE statement defines all IPv6 interfaces and is enhanced to define IPv4
IPAQENET and HiperSockets devices, and Static VIPA. This statement combines the
definitions of DEVICE, LINK, and HOME into a single statement for IPv4 and IPv6. It allows
multiple VLAN support for HiperSockets and IPAQENET devices in both IPv4 and IPv6.
The INTERFACE statement references the PORTNAME that is defined in the QDIO TRLE
definition, as the DEVICE and LINK definitions do, and assigns an IP address to it with the
IPADDR operand, as the HOME statement does. Optional operands include the subnet mask,
which is coded as a /number-of-mask-bits suffix on the IPADDR value; the MTU size, which is
otherwise taken from the BEGINROUTES or BSDROUTINGPARMS statements; and SOURCEVIPAINT,
which associates a specific VIPA with this INTERFACE only.
Note: If SOURCEVIPAINT is coded, you must place the entire INTERFACE definition block in
PROFILE after the definition of the VIPA that it references.
You can define the VLANID and VMAC on the INTERFACE statement as you can on the LINK
statement, with the additional benefit that the INTERFACE statement lets you set multiple
VLANs on the same OSA port. You cannot define multiple VLANs on the same OSA port with
the LINK statement.
The devices that are defined through the INTERFACE statement return displays that differ from
devices that are defined through the DEVICE or LINK statements. See examples in B.3.8,
“INTERFACE statement” on page 493.
Example 3-8 INTERFACE statement in profile TCP/IP for IPv4 IPAQENET devices
INTERFACE OSA20A0I
DEFINE IPAQENET
PORTNAME OSA20A0
IPADDR 10.1.2.12/24
MTU 1492
VLANID 20
VMAC
SOURCEVIPAINT VIPA2L
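To make the /24 suffix on the IPADDR operand concrete, the following Python sketch uses the standard ipaddress module to derive the subnet mask and subnet that the suffix implies. The function name is hypothetical; the address is the one from Example 3-8. It illustrates only the arithmetic, not how the stack parses the statement.

```python
import ipaddress


def describe_ipaddr(value):
    """Split an address/prefix string into address, subnet mask, and subnet."""
    iface = ipaddress.ip_interface(value)
    return {
        "address": str(iface.ip),
        "netmask": str(iface.netmask),
        "subnet": str(iface.network),
    }


print(describe_ipaddr("10.1.2.12/24"))
# {'address': '10.1.2.12', 'netmask': '255.255.255.0', 'subnet': '10.1.2.0/24'}
```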
You can delete a previously defined interface from the stack, after you stop it, with the
INTERFACE DELETE command (Example 3-9) by running the VARY TCPIP,,OBEYFILE command.
To use the multiple VLAN option in HiperSockets, configure an INTERFACE statement for each
VLAN connecting to the HiperSockets CHPID. The DEVICE and LINK definitions and IPv6
interface definitions share only a single DATAPATH for the same CHPID. However, each
HiperSockets INTERFACE statement requires a separate DATAPATH device from the Transport
Resource List Entry (TRLE) to the CHPID. VTAM automatically creates the TRLE for
HiperSockets.
Note: DATAPATH requires a certain amount of fixed storage, which can be defined by
using the IQDIOSTG VTAM start option and READSTORAGE parameter on the INTERFACE
statement.
Example 3-10 INTERFACE statement in profile TCP/IP for IPv4 IPAQIDIO devices
INTERFACE IUTIQDF4L
DEFINE IPAQIDIO
CHPID F4
IPADDR 10.1.4.31/24
SOURCEVIPAINTerface VIPA1L
READSTORAGE GLOBAL
SECCLASS 255
NOMONSYSPLEX
Notes:
OMPROUTE checks for any mismatch in the MTU or subnet mask parameter that is
defined in the INTERFACE statement in the stack profile and OMPROUTE configuration.
If it detects a mismatch, it issues messages and uses the value that is configured in the
OMPROUTE configuration.
When you convert a HiperSockets definition from DEVICE, LINK, or HOME statements to
INTERFACE statements, you must restart VTAM to delete the existing TRLE node of the
HiperSockets interface, which was created dynamically when the HiperSockets were
first configured.
The INTERFACE statement for static VIPA is also similar to the IPv6 statement that has the
IPADDR parameter, which specifies a single home IP address (Example 3-11).
Example 3-11 INTERFACE statement in profile TCP/IP for IPv4 static VIPA
INTERFACE VIPA1L DEFINE VIRTUAL IPADDR 10.1.1.10
INTERFACE VIPA2L DEFINE VIRTUAL IPADDR 10.1.2.10
TIP: You can use the CONVERT parameter on the TCPIPCS PROFILE subcommand to help you
migrate all IPv4 DEVICE and LINK definitions to INTERFACE statements. From a dump, this
function displays all the DEVICE, LINK, and HOME definitions in the form of INTERFACE
statements for all OSA, HiperSockets, and static VIPA. Review the output of this command
before making any profile changes.
More examples and displays are available in Appendix B, “Additional parameters and
functions” on page 471.
HOME
The HOME statement is used for assigning an IP address for each interface you defined with
DEVICE and LINK statements. The following HOME statement is an example:
HOME
10.1.1.10 VIPA1L
10.1.2.12 OSA20A0I
Note: The HOME statement (with DEVICE and LINK) is mutually exclusive with the INTERFACE
statement. You must use one or the other. INTERFACE is the preferred choice, as described
in “INTERFACE” on page 96.
The TCP/IP stack uses an IP address of 127.0.0.1 for IPv4 and ::1 for IPv6 as the loopback
interfaces. If there is a requirement to represent the loopback IP address of 14.0.0.0 for
compatibility with earlier TCP/IP versions, you must code an entry in the HOME statement. The
link label that is specified is LOOPBACK and you can define multiple IP addresses with the
LOOPBACK interface, as in the following example:
HOME
14.0.0.0 LOOPBACK
The PRIMARYINTERFACE statement can be used to specify which link is to be designated as the
default local host address for the GETHOSTID() function.
BEGINROUTES
Use this statement to define static routes for the TCP/IP routing table. This statement is
optional when you use the OMPROUTE dynamic routing daemon. However, if you do not
configure the OMPROUTE dynamic routing daemon, BEGINROUTES is necessary for a TCP/IP
stack to communicate with other hosts. For details about static and dynamic routing, see
Chapter 5, “Routing” on page 223.
VIPADYNAMIC
Use this statement to define dynamic VIPA or the functions that are related to dynamic VIPA,
such as sysplex distributor and dynamic VIPA takeover. For details about high availability and
load balancing functions that use dynamic VIPA, see IBM z/OS V2R2 Communications
Server TCP/IP Implementation Volume 3: High Availability, Scalability, and Performance,
SG24-8362.
AUTOLOG
The procedures that are specified in the AUTOLOG statement are initialized at TCP/IP startup,
so you do not have to start the TCP/IP applications manually after the TCP/IP startup.
AUTOLOG also monitors procedures that are started under its auspices, and restarts a
procedure that terminates for any reason unless NOAUTOLOG is specified on the PORT
statement.
AUTOLOG 1
FTPDA JOBNAME FTPDA1 ; FTP Server
ENDAUTOLOG
START
Specify a device name or an interface name on a START statement to initialize the interface at
the TCP/IP stack start. The following example is of a START statement for an OSA and a
HiperSockets device. VIPA does not need to be started because it is virtual and always active.
START OSA20A0
START IUTIQDF4L
If you do not specify a device name or an interface name on a START statement, you can
initialize the device with the VARY TCPIP,procname,START,devicename command after the
TCP/IP stack start.
IPCONFIG
IPv4 features are defined within IPCONFIG. There is a separate configuration section for IPv6
parameters. For commonly used IPCONFIG statements, see B.3, “PROFILE.TCPIP
statements” on page 478.
TCPCONFIG
TCP features are defined within TCPCONFIG. TCP/IP on z/OS is enhanced to allow the
configuration of many TCP parameters externally in the TCP/IP profile. The default values for
several TCP parameters are changed, as listed in Table 3-2.
Table 3-2 TCP parameters for which the default values are changed
TCP parameter     Old default value     New default value
SOMAXCONN         10                    1024
Note: Because default values for these parameters are changed, if your environment
requires the old default values, code them explicitly in the TCP/IP configuration.
New TCP parameters that can be configured in the TCPCONFIG statement are as follows:
TIMEWAITINTERVAL
FINWAIT2TIME
MAXIMUMRETRANSMITTIME
Note: The lowest configurable value for the FINWAIT2TIME parameter is 1 second.
UDPCONFIG
UDP features are defined within UDPCONFIG. For commonly used UDPCONFIG statements, see
B.3, “PROFILE.TCPIP statements” on page 478.
GLOBALCONFIG
The GLOBALCONFIG statement defines the parameters that affect the entire TCP/IP stack. For
commonly used GLOBALCONFIG statements, see B.3, “PROFILE.TCPIP statements” on
page 478.
IPCONFIG6
All IPv6 features are defined within IPCONFIG6.
Locating PROFILE.TCPIP
The following search order is used to locate the PROFILE.TCPIP configuration file:
1. //PROFILE DD
//PROFILE DD DSN=TCPIPA.TCPPARMS(PROFA30)
2. jobname.nodename.TCPIP
3. TCPIP.nodename.TCPIP
4. jobname.PROFILE.TCPIP
5. TCPIP.PROFILE.TCPIP
PROFILE must exist, or the TCP/IP address space terminates abnormally with the following
messages:
EZZ0332I DD:PROFILE NOT FOUND. CONTINUING PROFILE SEARCH
EZZ0325I INITIAL PROFILE COULD NOT BE FOUND
Use the //PROFILE DD statement in the TCP/IP address space JCL procedure to explicitly
allocate the PROFILE data set.
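The search order above can be written down as a small Python sketch that lists the candidate names in the order in which they are tried. The function name and the sample jobname and nodename values are hypothetical; the name patterns are the ones listed in the search order.

```python
def profile_search_order(jobname, nodename, profile_dd=None):
    """Return the PROFILE.TCPIP candidates in the order they are searched."""
    candidates = []
    if profile_dd is not None:
        # Step 1: an explicit //PROFILE DD allocation is tried first.
        candidates.append(profile_dd)
    candidates += [
        f"{jobname}.{nodename}.TCPIP",   # step 2
        f"TCPIP.{nodename}.TCPIP",       # step 3
        f"{jobname}.PROFILE.TCPIP",      # step 4
        "TCPIP.PROFILE.TCPIP",           # step 5
    ]
    return candidates


# Hypothetical values for illustration only.
for name in profile_search_order("TCPIPA", "SC30", "TCPIPA.TCPPARMS(PROFA30)"):
    print(name)
```

Because the //PROFILE DD statement is first in the list, coding it explicitly removes any doubt about which data set the stack reads.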
Example 3-14 shows the command syntax for checking the TCPIPA profile statements in the
HOMEOBY member of the TCPIPA.TCPIPPARMS data set on LPAR SC31.
The command processes all INCLUDE files specified (even nested ones), in the same way as
the VARY TCPIP,,OBEYFILE command.
Note: If system symbols are being used, they are resolved based on the system symbols
configuration of the system where the command is run. If your profile statements use MVS
system symbols, run the command on the MVS system where you plan to use the profile
for consistent resolution of the MVS system symbols.
In Example 3-15, the VARY TCPIP,,SYNTAXCHECK command found an error on line 50. The
value for TCPSENDBFRSIZE is incorrect.
Changing the incorrect value XXXX, for example, to 128K, results in no errors being found by
the VARY TCPIP,,SYNTAXCHECK command. For a successful output of this command, see
Example 3-16.
OSA-Express QDIO connections are configured through a TRLE definition. All TRLEs are
defined as VTAM major nodes. For more information about MPC-related devices/interfaces,
see Chapter 4, “Connectivity” on page 139.
A TRLE definition that is used for the example OSA-Express in QDIO mode is shown in
Example 3-17.
Example 3-17 TRLE VTAM major node definition for device OSA2080
OSA2080 VBUILD TYPE=TRL
OSA2080T TRLE LNCTL=MPC, *
READ=2080, *
WRITE=2081, *
DATAPATH=(2082-2087), *
PORTNAME=OSA2080, 1 *
MPCLEVEL=QDIO
Because VTAM provides the DLC layer for TCP/IP, VTAM must be started before TCP/IP. The
major node (in this example, OSA2080) should be activated when VTAM is initializing. This
ensures that the TRLE is active when the TCP/IP stack is started. This is accomplished by
placing an entry for OSA2080 in the VTAM startup list ATCCONxx. The port name 1 (in
Example 3-17) must also be the same as the device name that is defined in the
PROFILE.TCPIP data set on the DEVICE and LINK statements.
With multi-port OSA-Express features, you can use both ports on the same TRL statement,
as shown in Example 3-18.
Example 3-18 TRL VTAM major node definition for two ports for device OSA2080
OSA2080 VBUILD TYPE=TRL
OSA200T TRLE LNCTL=MPC, *
READ=2080, *
WRITE=2081, *
DATAPATH=(2082-2087), *
PORTNAME=OSA2080, *
PORTNUM=0, *
MPCLEVEL=QDIO
OSA201T TRLE LNCTL=MPC, *
READ=2088, *
WRITE=2089, *
DATAPATH=(208A-208D), *
PORTNAME=OSA2081, *
PORTNUM=1, *
MPCLEVEL=QDIO
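When you plan device numbers for several TRLEs, it can help to expand each DATAPATH range into its individual device numbers and check that the ranges do not overlap. The following Python sketch does that for the ranges in Example 3-18. The function name is hypothetical, and the sketch assumes that device numbers are four-digit hexadecimal values written as start-end.

```python
def expand_datapath(spec):
    """Expand a range such as '2082-2087' into individual hex device numbers."""
    start, _, end = spec.partition("-")
    first = int(start, 16)
    last = int(end or start, 16)  # a single device number has no '-end' part
    return [f"{n:04X}" for n in range(first, last + 1)]


port0 = expand_datapath("2082-2087")
port1 = expand_datapath("208A-208D")
print(port0)  # ['2082', '2083', '2084', '2085', '2086', '2087']
print(port1)  # ['208A', '208B', '208C', '208D']
print(set(port0) & set(port1))  # set(), so the two ranges do not overlap
```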
The TCPIP.DATA configuration data set is read during initialization of all TCP/IP server and
client functions. TCPIP.DATA contains the configuration for the resolver address space. You
define the way name-to-address or address-to-name resolution is performed by the resolver.
TCPIP.DATA is also used by the TCP/IP applications to specify the TCP/IP stack with which
they establish an affinity. The associated TCP/IP stack name is specified with the
TCPIPJOBNAME statement. Other stack-specific statements are HOSTNAME, which is the host
name of the TCP/IP stack, and DATASETPREFIX, which is the data set prefix (hlq) to be used for
searching a configuration data set.
The syntax for the parameters in the TCPIP.DATA file is in z/OS Communications Server: IP
Configuration Guide, SC27-3650. A sample TCPIP.DATA configuration file is provided in
hlq.SEZAINST(TCPDATA). You can define the TCPIP.DATA parameters in an MVS data set or
z/OS UNIX file system file.
For more information about the TCPIP.DATA file and the resolver address space, see
Chapter 2, “The resolver” on page 21.
If you must resolve host names outside your local area, you can configure the resolver to use
a domain name server (see the NSINTERADDR or NAMESERVER statement in the TCPIP.DATA
configuration file). A domain name server can be used with the local hosts file. If you
configured your resolver to use a name server, it always tries to do so, unless your
applications are written with a RESOLVE_VIA_LOOKUP symbol in the source code.
For further explanation and details, see Chapter 2, “The resolver” on page 21.
[Figure: network diagram; surviving labels are XCF 10.1.7.x1, VLAN 10, VLAN 11, 10.1.2.240, 10.1.3.240, and SWITCH]
To implement the TCP/IP stack to support base functions, complete the following steps:
Creating a TCPIP.DATA file
Creating the PROFILE.TCPIP file
Checking BPXPRMxx
Creating a TCP/IP cataloged procedure
Adding RACF definitions
Creating a VTAM TRL major node for MPCIPA OSA
Allocate the TCPPARMS library to be used for explicitly allocated configuration data sets
for the stack, or create a member in your existing TCPPARMS library. For this example,
we allocated TCPIPA.TCPPARMS(DATAA30).
Example 3-20 shows a local TCPIP.DATA file for the TCPIPA stack.
INTERFACE statement
We configure two OSA-Express features, each having four ports. We configure only two ports
on each card with the INTERFACE statement. For redundancy, we define two VLANs, with each
pair using one port per feature and each pair attached to the same VLAN. This facilitates ARP
Takeover.
HOME statement
We assign an IP address for each interface that was configured with a DEVICE and LINK
statement pair.
BEGINROUTES statement
We define static routes with the BEGINROUTES statement to route traffic to other hosts on a
network by using the OSA-Express or HiperSockets interfaces.
PORT statement
We reserve TCP ports for some applications with the PORT statement.
PORTRANGE statement
We reserve TCP ports for some wild card job name applications with the PORTRANGE
statement.
START statement
We define a START statement to initialize the interfaces at the TCP/IP stack startup.
DYNAMICXCF statement
We use a DYNAMICXCF statement to dynamically define the device to join the sysplex.
;
;STATIC VIPA DEFINITIONS
DEVICE VIPA1 VIRTUAL 0
LINK VIPA1L VIRTUAL 0 VIPA1
DEVICE VIPA2 VIRTUAL 0
LINK VIPA2L VIRTUAL 0 VIPA2
;
HOME
10.1.1.10 VIPA1L
10.1.2.10 VIPA2L
10.1.4.11 IUTIQDF4L
10.1.5.11 IUTIQDF5L
10.1.6.11 IUTIQDF6L
Example 3-24 TRLE VTAM major node definition for device OSA2080
OSA2080 VBUILD TYPE=TRL
OSA2080T TRLE LNCTL=MPC, *
READ=2080, *
WRITE=2081, *
DATAPATH=(2082-2087), *
PORTNAME=OSA2080, *
PORTNUM=0, *
MPCLEVEL=QDIO
OSA2081T TRLE LNCTL=MPC, *
READ=2088, *
WRITE=2089, *
DATAPATH=(208A-208D), *
PORTNAME=OSA2081, *
PORTNUM=1, *
MPCLEVEL=QDIO
Example 3-25 displays the TRLE VTAM major node definitions for devices OSA20A0 and
OSA20A1.
Example 3-25 TRLE VTAM major node definition for device OSA20A0
OSA20A0 VBUILD TYPE=TRL
OSA20A0T TRLE LNCTL=MPC, *
READ=20A0, *
WRITE=20A1, *
DATAPATH=(20A2-20A7), *
PORTNAME=OSA20A0, *
PORTNUM=0, *
MPCLEVEL=QDIO
*
OSA20A1 VBUILD TYPE=TRL
OSA20A1T TRLE LNCTL=MPC, *
READ=20A8, *
WRITE=20A9, *
DATAPATH=(20AA-20AE), *
PORTNAME=OSA20A1, *
PORTNUM=1, *
MPCLEVEL=QDIO
Messages that are issued by z/OS UNIX begin with the prefix BPX.
The next set of messages shows the initialization of SMS. SMS is a critical component
because the zFSs are SMS-managed. Subsequently, the file systems are mounted starting
with the root.
2. This message indicates that the Language Environment is available to be exploited by IBM
TCP/IP, Lotus®, WebSphere®, and parts of the z/OS base, and also by languages such as
C/C++, COBOL, and others.
Initialization of devices must be completed before they achieve READY status (displayed by
using NETSTAT DEVLNKS) and are connected to the network.
5. The EZB6473I and EZAIN11I messages are the final initialization messages to complete
the successful initialization of the TCP/IP stack.
Example 3-28 is the output from the OMVS display that shows the address space identifiers
for all z/OS UNIX processes on this LPAR.
What is also significant here is that OMVS=DEFAULT is not displayed in the output. In the
previous review of the z/OS UNIX environment, we mentioned that the z/OS UNIX System
Services must be customized in full-function mode. The display tells you that, at the least,
your system is not running in default mode (minimal mode).
Notice the various TCP/IP stacks and the tasks that are associated with them. Both TCPIPA
and TCPIP (the default stacks) are running EZBTCPIP. There are also multiple tasks that are
associated with the same RACF user ID, TCPIP. This offers the advantage of easier
maintenance and system definitions. However, this also presents the disadvantage of having
no distinguishing features among messages for individual tasks. Many users of TCP/IP and
UNIX System Services assign individual RACF user IDs to each OMVS user for easier
problem determination.
For a thorough description about the use and implementation of RACF, see IBM z/OS V2R2
Communications Server TCP/IP Implementation Volume 4: Security and Policy-Based
Networking, SG24-8363.
Example 3-30 shows several files that are defined in the active BPXPRM3A member for
comparative purposes only. You can compare the names that are defined in the active
BPXPRM3A member with the names that are actually active by using the D OMVS,F command.
MOUNT FILESYSTEM('WTSCPLX5.&SYSNAME..SYSTEM.ROOT')
MOUNTPOINT('/&SYSNAME.')
UNMOUNT
TYPE(HFS) MODE(RDWR)
MOUNT FILESYSTEM('OMVS.&SYSLEVEL..&SYSR1..ROOT')
MOUNTPOINT('/$VERSION')
AUTOMOVE
TYPE(HFS) MODE(READ)
MOUNT FILESYSTEM('OMVS.&SYSNAME..ETC')
MOUNTPOINT('/&SYSNAME./etc')
UNMOUNT
TYPE(HFS) MODE(RDWR)
MOUNT FILESYSTEM('OMVS.&SYSNAME..VAR')
MOUNTPOINT('/&SYSNAME./var')
UNMOUNT
TYPE(HFS) MODE(RDWR)
MOUNT FILESYSTEM('/&SYSNAME./TMP')
TYPE(TFS) MODE(RDWR)
MOUNTPOINT('/&SYSNAME./tmp')
PARM('-s 500')
UNMOUNT
MOUNT FILESYSTEM('/DEV')
MOUNTPOINT('/dev')
TYPE(TFS)
PARM('-s 10')
UNMOUNT
MOUNT FILESYSTEM('OMVS.&SYSLEVEL..&SYSR1..JAVA31V5')
MOUNTPOINT('/usr/lpp/java/J5.0')
TYPE(HFS) MODE(READ)
MOUNT FILESYSTEM('OMVS.&SYSLEVEL..&SYSR1..JAVA64V5')
MOUNTPOINT('/usr/lpp/java/J5.0_64')
TYPE(HFS) MODE(READ)
MOUNT FILESYSTEM('OMVS.&SYSLEVEL..&SYSR1..JAVA31V6')
MOUNTPOINT('/usr/lpp/java/J6.0')
TYPE(HFS) MODE(READ)
MOUNT FILESYSTEM('OMVS.&SYSLEVEL..&SYSR1..JAVA31M1')
The OMVS processes can also be displayed within the z/OS UNIX environment, and similar
comparisons can be made. Use the shell environment to look at UNIX processes and to run
the UNIX command ps -ef. This displays all processes and their environments in forest or
family tree format.
For detailed information about UNIX commands in the z/OS UNIX environment, see z/OS
UNIX System Services Planning, GA22-7800 and z/OS UNIX System Services User's Guide,
SA22-7802.
Example 3-31 shows that the UNIX System Services, after this initialization, is running with
user ID SYSPROG. This is because RACF cannot map a UNIX System Services UID to an
MVS user ID correctly if there are multiple MVS user IDs defined with the same UID. In that
case, RACF uses the last referenced MVS user ID.
Example 3-31 UNIX System Services processes display from the shell
CS03 @ SC30:/u/cs03>ps -ef
UID PID PPID C STIME TTY TIME CMD
SYSPROG 1 0 - Jul 11 ? 0:01 BPXPINPR
SYSPROG 65538 1 - Jul 11 ? 0:00 EZBREINI
SYSPROG 65539 1 - Jul 11 ? 0:00 EZBREUPS
SYSPROG 65540 1 - Jul 11 ? 0:09 HZSTKSCH
SYSPROG 65542 1 - Jul 11 ? 0:00
CEA 16842759 1 - Jul 11 ? 0:00 CEAPSRVR
SYSPROG 33619977 1 - Jul 11 ? 0:00 /usr/sbin/syslogd -c -
i -u -f /etc/syslog.conf
NET 65547 1 - Jul 11 ? 0:41 ISTMGCEH
SYSPROG 65550 1 - Jul 11 ? 0:15 EZBTNINI
SYSPROG 65551 1 - Jul 11 ? 14:53 ERB3GMFC
SYSPROG 65552 1 - Jul 11 ? 0:15 EZBTTSSL
SYSPROG 65554 1 - Jul 11 ? 0:15 EZBTXLUR
SYSPROG 65555 1 - Jul 11 ? 0:15 EZBTXLUS
SYSPROG 65556 1 - Jul 11 ? 0:15 EZBTMCTL
SYSPROG 65557 1 - Jul 11 ? 0:15 EZBTTMST
SYSPROG 65558 1 - Jul 11 ? 0:15 EZBTTMST
SYSPROG 65559 1 - Jul 11 ? 0:15 EZBTTMST
SYSPROG 65560 1 - Jul 11 ? 0:33 EZBTCPIP
SYSPROG 16842778 1 - Jul 11 ? 0:33 EZACFALG
SYSPROG 16842779 1 - Jul 11 ? 0:01 IAZNJTCP
SYSPROG 65564 1 - Jul 11 ? 0:33 EZASASUB
SYSPROG 16842781 1 - Jul 11 ? 0:00 /usr/sbin/inetd /etc/i
netd.conf
SYSPROG 50397214 1 - Jul 11 ? 0:01 GFSCINIT
SYSPROG 65570 1 - Jul 11 ? 0:00 RSHD
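The shared-UID situation that explains the SYSPROG entries in Example 3-31 can be detected with a simple grouping step. The following Python sketch is a hypothetical illustration: the function name and the sample mapping of MVS user IDs to UID values are assumptions, and in practice the data would come from the OMVS segments that the RACF listuser command reports.

```python
from collections import defaultdict


def shared_uids(omvs_uids):
    """Return {uid: [user IDs]} for every UID assigned to more than one user ID."""
    by_uid = defaultdict(list)
    for user_id, uid in omvs_uids.items():
        by_uid[uid].append(user_id)
    return {uid: sorted(ids) for uid, ids in by_uid.items() if len(ids) > 1}


# Hypothetical OMVS segment data; the UID values are illustrative only.
sample = {"SYSPROG": 0, "TCPIP": 0, "CS03": 1003}
print(shared_uids(sample))  # {0: ['SYSPROG', 'TCPIP']}
```

Any UID that appears in the result can show up in a ps -ef display under a user ID other than the one that started the process.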
Note: Both obrowse and oedit are TSO commands. If you used telnet or rlogin to get to
the UNIX System Services shell, you have to use the cat command and the vi editor.
ISHELL provides an ISPF look and feel. The OMVS shell provides a more UNIX or DOS look
and feel, and of course for UNIX users there is the vi editor.
The final act of the verification is starting z/OS Communications Server TCP/IP. Example 3-32
shows the start of the TCP/IP stack.
Important: Because TCP/IP shares its DLCs with VTAM, you must restart TCP/IP if you
restart VTAM.
Note: If you want to run TCPIP in a reusable address space ID, see 1.3.3, “Reusable
address space ID” on page 6.
If the message is not displayed, the messages that are issued by the TCP/IP address space
describe why TCP/IP did not start.
Example 3-33 shows the output from the NETSTAT CONFIG display.
Check your storage utilization to ensure that you made the correct allocations. Storage usage
can also be controlled by using the GLOBALCONFIG ECSALIMIT and GLOBALCONFIG POOLLIMIT
parameters. ECSALIMIT allows you to specify the maximum amount of extended common
service area (ECSA) that TCP/IP can use. POOLLIMIT allows you to specify the maximum
amount of authorized private storage that TCP/IP can use within the TCP/IP address space.
You can also use the MVS command D TCPIP,tcpproc,STOR to display TCP/IP storage
usage, as illustrated in Example 3-36.
Tip: When directing trace resolver output to a TSO terminal, define the screen size to be
only 80 columns wide. Otherwise, the trace output is difficult to read.
Note: For TCP connections, the start date and time indicate the occurrence of the following
socket functions for the TCP socket:
Bind
Listen
Connection establishment
For UDP endpoints, the start date and time indicate the occurrence of the bind socket
function for the UDP socket.
Socket establishment time is useful for performance and problem analysis. For more
information about the syntax and output of the commands, see z/OS Communications Server:
IP System Administrator’s Commands, SC27-3661.
The VARY command is a z/OS console command. It allows you to add, delete, or redefine all
devices dynamically, and also change TN3270 parameters, routing, and almost any TCP/IP
parameter in the profile. These changes are in effect until the TCP/IP started task is started
again, or another VARY OBEYFILE command overrides them.
For further details about the VARY OBEYFILE command, see z/OS Communications Server: IP
System Administrator’s Commands, SC27-3661. For more information about RACF definitions, see
IBM z/OS V2R2 Communications Server TCP/IP Implementation Volume 2: Standard
Applications, SG24-8361.
The sequence for deleting and adding back a resource that was defined by using the
INTERFACE statement is as follows:
1. Stop the device.
2. Delete the interface.
3. Add the new or changed interface.
4. Start the device.
To delete and add back a resource that was defined by using the DEVICE, LINK, or HOME
statements, complete the following steps:
1. Stop the device.
2. Remove the HOME address by excluding it from the full stack’s HOME list.
3. Delete the link.
4. Delete the device.
Note: You can delete and redefine OSA-Express resources that are defined with either the
INTERFACE statement or the DEVICE, LINK, or HOME statements by following the same
procedure but by creating different OBEYFILE commands. Because the INTERFACE
statement is now the preferred way of defining OSA devices, we use that procedure first in
the following examples.
Example 3-38 and Example 3-39 show the interface OSA2080I, or link OSA2080L, that is
active with associated IP address of 10.1.2.11.
Example 3-38 Display netstat device before deletion (for INTERFACE defined)
D TCPIP,TCPIPA,N,DE
.................................................................... Lines deleted
INTFNAME: OSA2080I INTFTYPE: IPAQENET INTFSTATUS: READY
PORTNAME: OSA2080 DATAPATH: 2082 DATAPATHSTATUS: READY
CHPIDTYPE: OSD
SPEED: 0000001000
IPBROADCASTCAPABILITY: NO
VMACADDR: 020002776873 VMACORIGIN: OSA VMACROUTER: ALL
ARPOFFLOAD: YES ARPOFFLOADINFO: YES
CFGMTU: NONE ACTMTU: 8992
IPADDR: 10.1.2.11/24
VLANID: 10 VLANPRIORITY: DISABLED
DYNVLANREGCFG: NO DYNVLANREGCAP: YES
READSTORAGE: GLOBAL (4096K)
INBPERF: BALANCED
CHECKSUMOFFLOAD: YES SEGMENTATIONOFFLOAD: YES
SECCLASS: 255 MONSYSPLEX: NO
ISOLATE: NO OPTLATENCYMODE: NO
MULTICAST SPECIFIC:
MULTICAST CAPABILITY: YES
GROUP REFCNT SRCFLTMD
----- ------ --------
224.0.0.1 0000000001 EXCLUDE
SRCADDR: NONE
.................................................................... Lines deleted
Example 3-39 Display netstat home before deletion (for DEVICE/LINK/HOME defined)
D TCPIP,TCPIPA,N,HO
.................................................................... Lines deleted
INTFNAME: OSA2080I
ADDRESS: 10.1.2.11
FLAGS:
.................................................................... Lines deleted
Because the STOP command is run as the last statement within an OBEYFILE regardless of its
position within the file, you cannot run STOP and DELETE in one step. Trying to do so results in
error messages. You should stop the interface or device with the console command, as
shown in Example 3-40.
Enter either the NETSTAT DEV or NETSTAT HOME command to check that the device you wanted
to delete is missing from the list.
Example 3-42 and Example 3-43 show the statements that are necessary to delete the
device.
Example 3-42 OBEYFILE member to delete the device OSA2080I (INTERFACE defined)
INTERFACE OSA2080I
DELETE
Example 3-43 OBEYFILE member to delete the device OSA2080 (DEVICE/LINK/HOME defined)
HOME
10.1.1.10 VIPA1L
10.1.2.10 VIPA2L
;;;10.1.2.11 OSA2080I
10.1.3.11 OSA20C0I
10.1.3.12 OSA20E0I
10.1.2.12 OSA20A0I
10.1.4.11 IUTIQDF4L
10.1.5.11 IUTIQDF5L
10.1.6.11 IUTIQDF6L
;
DELETE LINK OSA2080I
DELETE DEVICE OSA2080
Note: With DEVICE/LINK/HOME defined devices, you must provide the complete HOME
definition that excludes the device that you want to delete because the new HOME statement
replaces the existing one. This step is not necessary with devices that are defined by using
the INTERFACE statement.
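Because the new HOME statement replaces the existing one, the replacement list must carry every address except the one being removed. The following Python sketch builds such a list from the entries in Example 3-43. The function names are hypothetical, and the output is only a text rendering that you would still review before placing it in an OBEYFILE member.

```python
def home_without(entries, link_name):
    """Return the HOME entries with the named link removed."""
    return [(addr, name) for addr, name in entries if name != link_name]


def render_home(entries):
    """Render the entries as a HOME statement, one address and link per line."""
    return "\n".join(["HOME"] + [f"  {addr} {name}" for addr, name in entries])


current = [
    ("10.1.1.10", "VIPA1L"),
    ("10.1.2.10", "VIPA2L"),
    ("10.1.2.11", "OSA2080I"),
    ("10.1.3.11", "OSA20C0I"),
    ("10.1.3.12", "OSA20E0I"),
    ("10.1.2.12", "OSA20A0I"),
    ("10.1.4.11", "IUTIQDF4L"),
    ("10.1.5.11", "IUTIQDF5L"),
    ("10.1.6.11", "IUTIQDF6L"),
]
print(render_home(home_without(current, "OSA2080I")))
```

The rendered list contains eight entries and no longer includes 10.1.2.11, which corresponds to the commented-out line in Example 3-43.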
Run the command that is shown in Example 3-46 to add the device and link that are
associated with its own IP address.
Then, follow with a display to verify the addition to the stack, as shown in Example 3-47.
Messages with prefix BPX
    z/OS MVS System Messages, Vol 3 (ASB-BPX), SA22-7633
Messages with prefix EZA
    For Communications Server for z/OS IP, see z/OS Communications Server: IP Messages Volume 1 (EZA), SC31-8783
Messages with prefix EZB
    For Communications Server for z/OS IP, see z/OS Communications Server: IP Messages Volume 2 (EZB, EZD), SC31-8784
Messages with prefix EZY
    For Communications Server for z/OS IP, see z/OS Communications Server: IP Messages Volume 3 (EZY), SC31-8785
Messages with prefix EZZ and SNM
    For Communications Server for z/OS IP, see z/OS Communications Server: IP Messages Volume 4 (EZZ, SNM), SC31-8786
Messages with prefix FOMC, FOMM, FOMO, FSUC, and FSUM
    z/OS UNIX System Services Messages and Codes, SA22-7807
Eight-digit SNA sense codes and DLC codes
    z/OS Communications Server: IP and SNA Codes, SC31-8791
UNIX System Services return codes and reason codes
    z/OS UNIX System Services Messages and Codes, SA22-7807
Chapter 4. Connectivity
In today’s networked world, the usability of a computer system is defined by its connectivity.
Although there are many ways for TCP/IP traffic to reach IBM mainframes, this chapter
describes the most commonly used and the most dynamic types of mainframe connectivity.
Detailed topics about these interfaces are provided, including implementation information,
design scenarios, and setup examples.
This chapter covers the topics that are shown in Table 4-1.
Figure 4-1 shows the physical interfaces (and device types) that are provided by z Systems
servers. The physical network interface is enabled through z/OS Communications Server
(TCP/IP) definitions.
[Figure 4-1: physical interfaces and device types that connect z Systems servers to a sysplex (CF environment), to IBM System p or OEM servers, and to the network. Labels shown: MPCPTP (XCF), MPCPTP (samehost), CTC (FICON/ESCON), Token Ring, MPCIPA (HiperSockets), MPCOSA (OSA2), LCS/MPCIPA (1000BASE-T), ATM (LANE), Ethernet, MPCIPA (GbE), and MPCIPA (10GbE).]
z Systems network connectivity is handled by the physical and logical interfaces to enable the
transport of IP datagrams. Using the OSI model as an example, it spans Layer 1 (physical
layer) and Layer 2 (data link control (DLC) layer). The z/OS Communications Server supports
several types of interfaces connecting to separate networking environments. These
environments vary from point-to-point connections (such as MPCPTP, CTC, and CLAW), to
LAN connections (such as LCS and MPCIPA).
For more information about these protocols, see z/OS Communications Server: IP
Configuration Reference, SC27-3651.
Both interfaces use the System z I/O architecture that is called queued direct input/output
(QDIO).
QDIO is a highly efficient data transfer mechanism that satisfies the increasing volume of
applications and bandwidth demands. It dramatically reduces system processing impact and
improves throughput: the OSA-Express microprocessor and the network software exchange
data directly through queues in main memory, by using a signaling protocol and Direct
Memory Access (DMA).
The components that comprise QDIO are DMA, Priority Queuing, dynamic OSA Address
Table (OAT) building, LPAR-to-LPAR communication, and Internet Protocol (IP) Assist
functions.
With QDIO, I/O interrupts and I/O path-lengths are minimized, resulting in significantly
improved performance versus non-QDIO mode, reduction of system assist processor
(SAP) utilization, improved response time, and server cycle reduction.
z/OS Communications Server can transport only IP traffic over OSA-Express in QDIO mode
and HiperSockets. However, SNA can be transported over IP connections by using
encapsulation technologies, such as Enterprise Extender (EE) and TN3270.
For more information about EE, see Enterprise Extender Implementation Guide, SG24-7359.
For TN3270 details, see IBM z/OS V2R2 Communications Server TCP/IP Implementation
Volume 2: Standard Applications, SG24-8361.
Table 4-3 lists the OSA-Express Ethernet features that are available on the z Systems
platforms. The mode of operation in which they can run and the necessary TCP/IP and VTAM
definition types are included.
Note: The 1000Base-T feature can also support native SNA data flows to VTAM when
configured in Non-QDIO mode. The VTAM device type protocol is called Link Station
Architecture (LSA).
z/OS Communications Server registers IPv4 addresses in the OSA OAT for two distinct
purposes:
Inbound routing
ARP offload
Several factors contribute to the types of IPv4 addresses in a TCP/IP stack that are registered
in the OAT. These factors are summarized in the following questions:
Does the adapter interface definition include a virtual MAC (VMAC) keyword?
Is VMAC ROUTEALL coded or defaulted for the adapter interface?
Is VMAC ROUTELCL coded for the adapter interface?
Depending on these factors, separate addresses are registered in the OSA as described here
for the purposes of inbound routing and ARP offload:
Inbound routing:
– For INTERFACE or DEVICE/LINK definitions with VMAC ROUTEALL, do not register any
IP addresses for inbound routing. Register only an IP address for supporting ARP
offload.
– For INTERFACE or DEVICE/LINK definitions without VMAC ROUTEALL, register the entire
home list for inbound routing.
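The inbound-routing rule can be summarized in a few lines. The following Python sketch is illustrative only: the function name and the list-of-strings home list are assumptions, the sample addresses are hypothetical, and it models only the inbound-routing decision, not ARP offload registration.

```python
from typing import List


def inbound_routing_registrations(home_list: List[str], vmac_routeall: bool) -> List[str]:
    """Return the addresses a stack would register in the OAT for inbound routing.

    Hypothetical helper that models only the rule stated in the text.
    """
    if vmac_routeall:
        # With VMAC ROUTEALL, no IP addresses are registered for inbound routing.
        return []
    # Otherwise, the entire home list is registered.
    return list(home_list)


home = ["10.1.1.10", "10.1.2.11"]  # hypothetical home list
print(inbound_routing_registrations(home, vmac_routeall=True))
print(inbound_routing_registrations(home, vmac_routeall=False))
```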
Displaying registered addresses: OSA/SF has a Get OAT function that retrieves the
registered IP addresses in the OAT. However, the displayed table is incomplete, containing
only a limited number of the addresses that the stack registered with the OSA device.
When performing problem determination for the OSA, do not assume that OSA/SF is
showing you everything that you need to know. You might have to solicit the help of Level 2
defect support to see everything that is registered in the OSA.
Note: The INTERFACE statement is required if one stack is attaching to multiple VLANs
through a single OSA port.
When a VLAN ID is configured for an OSA-Express interface in the TCP/IP stack, the
following operations occur:
The TCP/IP stack becomes VLAN-enabled, and the OSA-Express port is considered to be
part of a VLAN.
During activation, the TCP/IP stack registers the VLAN ID value to the OSA-Express port.
A VLAN tag is added to all outbound packets.
The OSA-Express port filters all inbound packets based on the configured VLAN ID.
If the TCP/IP stack is also configured with PRIRouter or SECRouter for an OSA-Express port
that has a VLAN ID defined, then the stack serves as an IP router for the configured VLAN ID.
If OSA-Express ports are shared across multiple TCP/IP routing stacks, consider using virtual
MAC support for your environment instead of the PRIRouter and SECRouter options. For
details, see Chapter 6, “Virtual LAN and virtual MAC support” on page 285.
Figure 4-2 shows how the PRIRouter function works in a shared OSA environment.
[Figure 4-2: PRIRouter function in a shared OSA environment. Labels shown: PRIRouter, SECRouter, OSA 1, OSA 2. Addresses shown for the four stacks:]
VIPA   10.1.1.10   10.1.1.20   10.1.1.30   10.1.1.40
OSA    10.1.2.11   10.1.2.21   10.1.2.31   10.1.3.41
HS4    10.1.4.11   10.1.4.21   10.1.4.31   10.1.4.41
In Figure 4-2, the terminal user connects to 10.1.4.41. Each stack sharing OSA1 registered
the IP addresses for its VIPAs, OSAs, and HiperSockets in the OAT. However, the address
10.1.4.41 is not represented in OSA1’s OAT. Therefore, the packet from the terminal that
arrives at OSA1 is sent to the primary routing TCP/IP stack in LPAR A. The TCP/IP stack in
LPAR A uses its routing table to forward the packet to LPAR D, where IP address 10.1.4.41
resides.
If LPAR A becomes unavailable, the TCP/IP stack in LPAR B or C takes over the routing
responsibility for OSA1.
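The forwarding decision that is described for OSA1 can be sketched as a table lookup with a fallback. The following Python model is a simplification, not the OSA's internal logic: the addresses come from Figure 4-2, but the dictionary layout, the function name, the assignment of the middle address columns to LPAR B and LPAR C, and the takeover order of the secondary routers are all assumptions.

```python
from typing import Dict, List, Optional, Set

# Addresses registered in the OAT of OSA1 (from Figure 4-2).
# Which column belongs to LPAR B and which to LPAR C is an assumption.
OSA1_OAT: Dict[str, str] = {
    "10.1.1.10": "LPAR A", "10.1.2.11": "LPAR A", "10.1.4.11": "LPAR A",
    "10.1.1.20": "LPAR B", "10.1.2.21": "LPAR B", "10.1.4.21": "LPAR B",
    "10.1.1.30": "LPAR C", "10.1.2.31": "LPAR C", "10.1.4.31": "LPAR C",
}

# Primary router first, then secondary routers (takeover order assumed).
ROUTERS: List[str] = ["LPAR A", "LPAR B", "LPAR C"]


def select_stack(dest_ip: str, oat: Dict[str, str],
                 routers: List[str], active: Set[str]) -> Optional[str]:
    """Pick the stack that receives an inbound packet arriving at the OSA."""
    owner = oat.get(dest_ip)
    if owner is not None:
        # The destination address is registered: deliver to its owner.
        return owner
    # Unregistered destination: hand the packet to the first active router.
    for router in routers:
        if router in active:
            return router
    return None  # no routing stack is available in this model


all_up = {"LPAR A", "LPAR B", "LPAR C"}
print(select_stack("10.1.4.41", OSA1_OAT, ROUTERS, all_up))
print(select_stack("10.1.4.41", OSA1_OAT, ROUTERS, all_up - {"LPAR A"}))
```

The first call models the normal case from the text (the packet for 10.1.4.41 goes to the primary router), and the second models the takeover when LPAR A is unavailable.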
Therefore, if an OSA interface is configured with a specific VLAN ID and also configured as a
primary or secondary router, that stack serves as a router for just that specific VLAN. This
allows each OSA-Express (CHPID) to have a primary router per VLAN. Configuring primary
routers (one per VLAN) has many advantages and preserves traffic isolation for each VLAN.
If OSA-Express ports are shared across multiple TCP/IP routing stacks, consider using virtual
MAC support for your environment instead of the PRIRouter and SECRouter options. For
more information, see Chapter 6, “Virtual LAN and virtual MAC support” on page 285.
The left side of Figure 4-3 on page 147 depicts a high-latency network where the TCP
window size is too small. The round-trip time (RTT) is relatively long and the window size is
relatively small. Therefore, the sender fills the window before it receives an ACK for the data
at the start of the window. This forces the sender to delay sending additional data until it
receives an ACK or a window update. Over a long-distance connection, this can cause
transmission stalls and suboptimal performance.
The right side demonstrates a situation where the window size is large enough for the
high-latency network. The sender has not yet sent a full window of data before it
receives an ACK for the data at the start of the current window. The z/OS TCP maximum window size
is 512 KB (defined in TCPMAXRCVBUFRSIZE in the TCPCONFIG section). However, a window size
of 512 KB might not always be enough to achieve this behavior.
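Whether a given window is large enough follows from the bandwidth-delay product: the window must hold at least the amount of data that the path can carry in one round trip. The following Python sketch works through that arithmetic. The 1 Gbps link speed and 50 ms RTT are hypothetical values chosen for illustration, and the function names are not part of any z/OS interface.

```python
def required_window_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to keep the pipe full."""
    return bandwidth_bps * rtt_seconds / 8


def max_throughput_bps(window_bytes: float, rtt_seconds: float) -> float:
    """Upper bound on throughput when at most one window is sent per round trip."""
    return window_bytes * 8 / rtt_seconds


MAX_WINDOW = 512 * 1024    # 512 KB, the maximum window size named in the text
LINK_BPS = 1_000_000_000   # hypothetical 1 Gbps path
RTT = 0.050                # hypothetical 50 ms round-trip time

needed = required_window_bytes(LINK_BPS, RTT)
print(f"window needed: {needed:,.0f} bytes")
print(f"512 KB is enough: {MAX_WINDOW >= needed}")
print(f"ceiling with 512 KB: {max_throughput_bps(MAX_WINDOW, RTT) / 1e6:.1f} Mbps")
```

Under these assumed values, the path needs a window of several megabytes, so a 512 KB window caps throughput well below the link speed.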
[Figure 4-3: TCP window size and round-trip time (RTT) over time. Left: the window size is too small for the RTT. Right: the window size is large enough for the RTT.]
The goal of the dynamic right sizing (DRS) function is to keep the pipe full for inbound
streaming TCP connections over networks with large capacity and high latency, and to
prevent the sender from being constrained by the receiver’s advertised window size.
If a TCP connection uses a receive buffer size larger than 64 KB and the stack detects that
it is a high-latency inbound streaming TCP connection, the stack dynamically increases the
receive buffer size for the connection (in an attempt to not constrain the sender). This in turn
adjusts the advertised receive window and allows the window size to grow as high as 2 MB.
The TCP receive buffer size can grow as high as 2 MB for certain TCP connections regardless
of the TCPMAXRCVBUFRSIZE value. The stack disables the function for a connection if the
application is not keeping up with the inbound data.
DRS does not take effect for applications that set a value less than 64 KB on the SO_RCVBUF
socket option on SETSOCKOPT().
If TCPRCVBUFRSIZE is less than 64 KB, then DRS does not take effect for applications that do
not use the SO_RCVBUF socket option.
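The two exclusions above can be expressed as a single check on the effective receive buffer size. This Python sketch is a reading aid under stated assumptions: the function name is hypothetical, `None` stands for an application that does not set SO_RCVBUF, and a result of `True` means only that neither exclusion applies, not that the stack will actually resize the buffer.

```python
from typing import Optional

KB = 1024
DRS_THRESHOLD = 64 * KB  # the 64 KB limit named in the text


def drs_not_excluded(so_rcvbuf: Optional[int], tcprcvbufrsize: int) -> bool:
    """Return False when one of the two documented exclusions applies.

    so_rcvbuf: value set with SO_RCVBUF, or None if the application did not set it.
    tcprcvbufrsize: the stack-wide TCPRCVBUFRSIZE default, in bytes.
    """
    # An explicit SO_RCVBUF value takes precedence over the stack default.
    effective = so_rcvbuf if so_rcvbuf is not None else tcprcvbufrsize
    # Values less than 64 KB are excluded, following the wording of the text.
    return effective >= DRS_THRESHOLD


print(drs_not_excluded(32 * KB, 256 * KB))   # application set a small buffer
print(drs_not_excluded(None, 32 * KB))       # small stack default, no SO_RCVBUF
print(drs_not_excluded(None, 256 * KB))      # neither exclusion applies
```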
Implementation
To configure an OSA-Express4S or OSA-Express3 feature to operate in optimized latency
mode, use the INTERFACE statement with the OLM parameter. Because optimized latency
mode affects both inbound and outbound interrupts, it supersedes other inbound
performance settings set by the INBPERF parameter.
Guidelines
Because of the operating characteristics of optimized latency mode, other configuration
changes might be required:
For outbound traffic to gain the benefit of optimized latency mode, traffic must be directed
to priority queues 1, 2, or 3 by using the WLMPRIORITYQ parameter in the GLOBALCONFIG
statement or by using Policy Agent and configuring a policy with the
SetSubnetPrioTosMask statement.
Although an OSA-Express feature supports multiple outbound write priority queues,
outbound optimized latency mode is performed only for traffic on priority queue 1 (priority
level 1). The TCP/IP stack combines all the traffic that is directed to priority queues 1, 2,
and 3 into priority queue 1 for any OSA-Express4S or OSA-Express3 feature operating in
optimized latency mode.
Configure the WLMPRIORITYQ parameter with no subparameters, which assigns a default
mapping of service class importance levels to OSA-Express outbound priority queues.
This default mapping directs traffic that is assigned to the higher priority service class
importance levels 1 - 4 to queues that operate in optimized latency mode, and enables the
appropriate types of traffic to benefit from optimized latency mode.
Ensure that there are no more than four concurrent users of an OSA-Express4S or
OSA-Express3 feature that are configured with optimized latency mode.
When enabling multipath routing by using the PERPACKET option, do not configure a
multipath group that contains an OSA-Express feature that is configured with optimized
latency mode and any other type of device.
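The queue handling that is described in these guidelines can be modeled as a small mapping. The following Python sketch is a simplification with assumed names: it treats the outbound priority queues as numbered 1 to 4 and shows only that queues 1, 2, and 3 are combined into queue 1 when optimized latency mode (OLM) is active.

```python
def effective_outbound_queue(priority_queue: int, olm_enabled: bool) -> int:
    """Return the queue on which traffic is actually written.

    Hypothetical model: queues are assumed to be numbered 1 through 4.
    """
    if priority_queue not in (1, 2, 3, 4):
        raise ValueError("priority queue must be 1, 2, 3, or 4")
    if olm_enabled and priority_queue in (1, 2, 3):
        # In optimized latency mode, queues 1, 2, and 3 are combined into queue 1.
        return 1
    return priority_queue


def benefits_from_olm(priority_queue: int, olm_enabled: bool) -> bool:
    """Outbound OLM is performed only for traffic on priority queue 1."""
    return olm_enabled and effective_outbound_queue(priority_queue, True) == 1


for queue in (1, 2, 3, 4):
    print(queue, effective_outbound_queue(queue, olm_enabled=True),
          benefits_from_olm(queue, olm_enabled=True))
```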
For more information about OSA-Express features and capabilities, see OSA-Express
Implementation Guide, SG24-5948.
The left side of Figure 4-4 on page 149 depicts OSA single inbound queue support. All
inbound QDIO traffic is received on a single read queue regardless of the data type. This
includes both batch and interactive traffic and both traffic that is destined for this TCP/IP stack
and traffic to be forwarded by this TCP/IP stack. The maximum amount of storage available
for inbound traffic is limited to the read buffer size (64 KB read SBALs) times the maximum
number of read buffers (126).
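As a quick check of that limit, the following Python snippet multiplies the two values from the text. The constant and function names are illustrative only.

```python
READ_BUFFER_KB = 64      # size of one read buffer (64 KB read SBALs)
MAX_READ_BUFFERS = 126   # maximum number of read buffers


def max_inbound_storage_kb(buffers: int = MAX_READ_BUFFERS,
                           buffer_kb: int = READ_BUFFER_KB) -> int:
    """Maximum storage for inbound traffic on a single read queue, in KB."""
    return buffers * buffer_kb


total_kb = max_inbound_storage_kb()
print(f"{total_kb} KB = {total_kb / 1024} MB")
```

The product is 8064 KB, or just under 8 MB, for all inbound traffic on the single read queue.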
[Figure 4-4: inbound QDIO processing in z/OS (application, TCP/IP, VTAM, CPs, and OSA). Left: a single inbound queue. Right: separate inbound queues labeled SD, EE, bulk, and other.]
A single process is used to package the data, queue it, and schedule the TCP/IP stack to
process it. This same process also performs acceleration functions, such as Sysplex
Distributor connection routing accelerator. The TCP/IP stack separates the traffic types to be
forwarded to the appropriate stack component that processes them. Multiple processes run
for inbound traffic only when data is accumulating on the read queue (typically during burst
periods when z/OS Communications Server is not keeping up with the OSA). This can cause
bulk data packets for a single TCP connection to arrive at the TCP layer out of order. Each
time the TCP layer on the receiving side sees out-of-order data, it transmits a duplicate ACK.
The supported traffic types are streaming bulk data, Sysplex Distributor, and Enterprise
Extender. Examples of bulk data traffic are FTP, TSM, NFS, and IBM TDMF. Both IP versions
are supported for all types of traffic.
The dynamic LAN idle timer is updated independently for each read queue. This ensures the
most efficient processing of inbound traffic based on the traffic type.
Implementation
The QDIO inbound workload queuing function is enabled with the INBPERF DYNAMIC
WORKLOADQ setting on the IPAQENET and IPAQENET6 INTERFACE statements. WORKLOADQ is not
supported for INBPERF DYNAMIC on IPAQENET LINK statements. The VMAC parameter can be
specified with or without macaddr.
For more information, see the information about the IPAQENET INTERFACE and IPAQENET6
INTERFACE statements in z/OS Communications Server: IP Configuration Reference,
SC27-3651.
Verification
Check the WorkloadQueuing field in the netstat DEVLINKS/-D report to determine whether the
QDIO inbound workload queuing function is enabled. This information can also be returned
by the GetIfs callable NMI.
Moreover, you can use other commands to obtain more information about the QDIO inbound
workload queuing function for the QDIO interface. The output for the Display ID=trlename
and Display TRL,TRLE=trlename commands shows whether this function is in use for the
QDIO interface as follows:
For each input queue, it includes the queue ID and queue type in addition to the read
storage. The queue type is PRIMARY for the primary input queue, BULKDATA for the bulk
data AIQ, and SYSDIST for the sysplex distributor connection routing AIQ.
The queue type value N/A indicates that the queue is initialized but is not in use by the
TCP/IP stack.
In addition, the queue ID and queue type can be used to correlate with VTAM tuning statistics,
packet trace, and OSA-Express Network Traffic Analyzer (OSAENTA) trace output for the
QDIO interface. The netstat ALL/-A report includes the interface name for bulk data TCP
connections that are using this function. This information can also be returned by the
GetConnectionDetail callable NMI. The netstat STATS/-S report includes the total number of
segments that are received for all connections from the bulk data AIQ of this function. This
information can also be returned by the GetGlobalStats callable NMI.
Considerations:
Access to the INMN is restricted to authorized management applications, and is only
available through Port 0 of any OSA-Express CHPID that is configured with type OSM.
Port 1 is not available for these communications.
Connectivity to the INMN is restricted to stacks that are enabled for IPv6.
Connectivity to the INMN and to the IEDN is allowed only when the central processor
complex (CPC) is a member of an ensemble.
[Figure: zEnterprise node (z196 and zBX with BladeCenter chassis) and its networks: the customer-managed management network, the intraensemble data network (IEDN) through OSX, and the customer-managed data networks through OSD.]
Consolidated servers that have to access corporate data on the z Systems server can do so
at memory speeds, bypassing all the network processing impact and delays.
Because there is no server-to-server traffic outside the z Systems server, a much higher level
of network availability, security, simplicity, performance, and cost effectiveness is achieved as
compared with servers communicating across a LAN. For example:
HiperSockets has no external components. It provides a secure connection. For security
purposes, servers can be connected to separate HiperSockets or VLANs within the same
HiperSockets. All security features, such as IPSec or IP filtering, are available for
HiperSockets interfaces as they are with other TCP/IP network interfaces.
HiperSockets looks like any other TCP/IP interface; it is transparent to applications and
supported operating systems.
HiperSockets can also improve TCP/IP communications within a sysplex environment
when the DYNAMICXCF is used (for example, in cases where Sysplex Distributor uses
HiperSockets within the same z Systems server to transfer IP packets to the target
systems).
The HiperSockets device is represented by the IQD channel path ID (CHPID) and its associated
subchannel devices. All LPARs that are configured in HCD/IOCP to use the same IQD CHPID
have internal connectivity and can communicate by using HiperSockets.
VTAM builds a single HiperSockets MPC group by using the subchannel devices that are
associated with a single IQD CHPID. VTAM uses two subchannel devices for the read and
write control devices, and 1 - 8 devices for data devices. Each TCP/IP stack is assigned a
single data device.
Therefore, to build the MPC group, there must be a minimum of three subchannel devices
defined (within HCD) and associated with the same IQD CHPID. The maximum number of
subchannel devices that VTAM uses is 10 (supporting eight data devices or eight TCP/IP
stacks) per LPAR or MVS image.
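The device counts follow directly from the layout of the MPC group: two control devices plus one data device per TCP/IP stack. The following Python sketch restates that arithmetic. The function and constant names are illustrative.

```python
CONTROL_DEVICES = 2    # one read control device and one write control device
MAX_DATA_DEVICES = 8   # one data device per TCP/IP stack, at most eight


def subchannel_devices_needed(stacks: int) -> int:
    """Subchannel devices VTAM uses on one IQD CHPID for the given number of stacks."""
    if not 1 <= stacks <= MAX_DATA_DEVICES:
        raise ValueError("the MPC group supports 1 to 8 TCP/IP stacks")
    return CONTROL_DEVICES + stacks


print(subchannel_devices_needed(1))  # minimum MPC group
print(subchannel_devices_needed(8))  # maximum per LPAR or MVS image
```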
An IQD CHPID can be viewed as a logical LAN within the server. z Systems servers allow up to
16 separate IQD CHPIDs, creating the capability of having up to 16 separate logical LANs
within the same server.
Each IQD CHPID can be assigned to a set of LPARs (configured in HCD) so isolating these
LPARs in separate logical LANs becomes possible, as Figure 4-6 shows.
[Figure 4-6: LPARs within one z Systems server isolated in separate logical LANs (Production and Test).]
Restriction: HiperSockets multiple write is effective only on IBM System z10 or later and
when z/OS is not running as a guest in an IBM z/VM® environment.
To enable the HiperSockets multiple write facility on all HiperSockets interfaces, including
interfaces that are created for dynamic XCF, add the IQDMULTIWRITE parameter to the
GLOBALCONFIG statement.
For more information, see Appendix B, “Additional parameters and functions” on page 471.
For a review of the scenarios that we used to test HiperSockets multiple write, see
Appendix A, “HiperSockets Multiple Write,” in IBM z/OS V2R2 Communications Server
TCP/IP Implementation Volume 3: High Availability, Scalability, and Performance,
SG24-8362.
To enable HiperSockets traffic that is using the multiple write facility to be processed on
available zIIPs, specify the ZIIP IQDIOMULTIWRITE parameter on the GLOBALCONFIG statement.
Note: The VLAN ID that is assigned to a HiperSockets device applies to both IPv4 and
IPv6 connections over that CHPID.
HiperSockets Accelerator
z/OS Communications Server IP takes advantage of the technological advances and
high-performing nature of the I/O processing that are offered by HiperSockets with the
z Systems servers and OSA-Express by using the QDIO architecture. This is achieved by
optimizing IP packet forwarding processing that occurs across these two types of
technologies. This function is referred to as HiperSockets Accelerator. It is a configurable
option, and is activated by defining the IQDIORouting option on the IPCONFIG statement.
When the TCP/IP stack is configured with HiperSockets Accelerator, it allows IP packets that
are received from HiperSockets to be forwarded to an OSA-Express port (or vice versa)
without the need for those IP packets to be processed by the TCP/IP stack.
When using this function, one or more LPARs contain the routing stack, which manages
connectivity through OSA-Express ports to the LAN; the other LPARs connect to the routing
stack by using the HiperSockets, as shown in Figure 4-7 on page 155.
[Figure 4-7: HiperSockets Accelerator. Labels shown: LPAR E, HiperSockets (CHPID FE), HiperSockets (CHPID FD), OSA ports, and the Gigabit Ethernet network.]
Note: This example is intended purely to demonstrate IP traffic flow. Do not implement
HiperSockets Accelerator by using a single LPAR.
Dynamic XCF is required to support Sysplex Distributor and nondisruptive dynamic VIPA
movement, as described in IBM z/OS V2R2 Communications Server TCP/IP Implementation
Volume 3: High Availability, Scalability, and Performance, SG24-8362.
Dynamic XCF creates definitions for DEVICE, LINK, HOME, and BSDROUTINGPARMS statements
and the START statement dynamically. When activated, the dynamic XCF devices and links
appear to the stack as though they are defined in the TCP/IP profile. They can be displayed
by using standard commands, and they can be stopped and started.
During TCP/IP initialization, the stack joins the XCF group, ISTXCF, through VTAM. When
other stacks in the group discover the new stack, the definitions are created automatically, the
links are activated, and the remote IP address for each link is added to the routing table. After
the remote IP address is added, IP traffic can flow across one of the following interfaces:
IUTSAMEH (within the same LPAR)
HiperSockets (within the same server)
XCF signaling (different server, either by using the coupling facility link or a CTC
connection)
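The choice among these three interface types depends on where the two stacks run. The following Python sketch captures that rule. The `Stack` class, the function name, and the sample names are hypothetical, and the `hipersockets_available` flag is an assumption that models a server where HiperSockets is not usable for dynamic XCF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stack:
    name: str
    lpar: str
    server: str


def dynamic_xcf_interface(a: Stack, b: Stack, hipersockets_available: bool = True) -> str:
    """Return the interface type that IP traffic between two stacks can use."""
    if a.server == b.server and a.lpar == b.lpar:
        return "IUTSAMEH"        # both stacks are in the same LPAR
    if a.server == b.server and hipersockets_available:
        return "HiperSockets"    # different LPARs in the same server
    return "XCF signaling"       # coupling facility link or CTC connection


stack_a = Stack("TCPIPA", "LPAR 1", "Server 1")
stack_b = Stack("TCPIPB", "LPAR 1", "Server 1")
stack_c = Stack("TCPIPC", "LPAR 2", "Server 1")
stack_d = Stack("TCPIPD", "LPAR 3", "Server 2")

print(dynamic_xcf_interface(stack_a, stack_b))
print(dynamic_xcf_interface(stack_a, stack_c))
print(dynamic_xcf_interface(stack_a, stack_d))
```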
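[Figure: dynamic XCF connectivity. LPAR 1 and LPAR 2 run in z Systems Server 1, and LPAR 3 (TCP/IP stack D) runs in z Systems Server 2. The servers are connected by XCF signaling over coupling facility (CF) links or channel-to-channel (CTC) connections.]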
When an IPv4 DYNAMICXCF HiperSockets device and link are created and successfully
activated, a subnetwork route is created across the HiperSockets link. The subnetwork is
created by using the DYNAMICXCF IP address and mask. This allows any LPAR within the same
server to be reached, even ones that are not within the sysplex. To do that, the LPAR that is
outside of the sysplex environment must define at least one IP address for the HiperSockets
endpoint that is within the subnetwork that is defined by the DYNAMICXCF IP address and mask.
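To check whether an address planned for an LPAR outside the sysplex falls within that subnetwork, the standard Python `ipaddress` module is sufficient. In this sketch, the function name is illustrative and all addresses and the mask are hypothetical examples, not values from our configuration.

```python
import ipaddress


def in_dynamicxcf_subnet(xcf_address: str, mask: str, candidate: str) -> bool:
    """Return True if candidate lies in the subnet defined by the DYNAMICXCF address and mask."""
    # strict=False lets a host address (not only a network address) define the subnet.
    subnet = ipaddress.ip_network(f"{xcf_address}/{mask}", strict=False)
    return ipaddress.ip_address(candidate) in subnet


# Hypothetical DYNAMICXCF address and mask.
print(in_dynamicxcf_subnet("10.1.7.11", "255.255.255.0", "10.1.7.51"))
print(in_dynamicxcf_subnet("10.1.7.11", "255.255.255.0", "10.1.8.51"))
```

An address that fails this check is not covered by the subnetwork route that is created across the HiperSockets link.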
When multiple stacks are within the same LPAR that supports HiperSockets, both IUTSAMEH
and HiperSockets links or interfaces coexist. In this case, it is possible to transfer data across
either link. Because IUTSAMEH links have better performance, it is better to use them
for communication between stacks in the same LPAR. A host route is created by DYNAMICXCF
processing across the IUTSAMEH link, but not across the HiperSockets link.
For more information about dynamic XCF, Sysplex Distributor, and nondisruptive dynamic
VIPA movement, see IBM z/OS V2R2 Communications Server TCP/IP Implementation
Volume 3: High Availability, Scalability, and Performance, SG24-8362.
To design connectivity in a z/OS environment, you must account for the following
considerations:
Because z/OS is a server environment, network connectivity to the external corporate network
must be carefully designed to provide a high-availability environment that avoids single
points of failure.
If a z/OS LPAR is seen as a stand-alone server environment on the corporate network, it
should be designed as an endpoint.
If a z/OS LPAR is used as a front-end concentrator (for example, using HiperSockets
Accelerator), it should be designed as an intermediate network or node.
One example where multiple stacks can have value is when an LPAR must be connected
to multiple isolated security zones in such a way that there is no network level connectivity
between the security zones. In this case, a TCP/IP stack per security zone can be used to
provide that level of isolation without any network connectivity between the stacks.
Based on these considerations, the following sections present preferred practice scenarios for
building a z/OS Communications Server TCP/IP configuration by using OSA-Express (QDIO),
HiperSockets (iQDIO), and dynamic XCF.
We built our connectivity scenarios with two OSA-Express3 1000BASE-T features (four ports
each) that are connected to the LAN environment (one Layer 3 switch). We also implemented
a HiperSockets internal LAN to interconnect all LPARs within the same System z10. Finally,
we used dynamic XCF connectivity for the sysplex environment.
Note: Although in our environment we connected all the OSA ports to one switch, in a
production implementation, the preferred approach is to connect your OSAs to at least two
switches.
CHPID PATH=(CSS(1),0A),SHARED,
PARTITION=((A11,A13,A16,A18),(=)),CHPARM=02,PCHID=531,
TYPE=OSM 1
CHPID PATH=(CSS(1),0B),SHARED,
PARTITION=((A11,A13,A16,A18),(=)),CHPARM=02,PCHID=101,
TYPE=OSM
CNTLUNIT CUNUMBR=2340,PATH=((CSS(1),0A)),UNIT=OSM
IODEVICE ADDRESS=(2340,015),MODEL=M,UNITADD=00,CUNUMBR=(2340),
UNIT=OSA,MODEL=M,DYNAMIC=YES,LOCANY=YES
CNTLUNIT CUNUMBR=2360,PATH=((CSS(1),0B)),UNIT=OSM
IODEVICE ADDRESS=(2360,015),MODEL=M,UNITADD=00,CUNUMBR=(2360),
UNIT=OSA,MODEL=M,DYNAMIC=YES,LOCANY=YES
CHPID PATH=(CSS(1),18),SHARED,
PARTITION=((A11,A13,A16,A18),(=)),PCHID=590,TYPE=OSX 1
CHPID PATH=(CSS(1),19),SHARED,
PARTITION=((A11,A13,A16,A18),(=)),CHPARM=02,PCHID=510,
TYPE=OSX
CNTLUNIT CUNUMBR=2300,PATH=((CSS(1),18)),UNIT=OSX
IODEVICE ADDRESS=(2300,015),MODEL=X,CUNUMBR=(2300),UNIT=OSA,
MODEL=X,DYNAMIC=YES,LOCANY=YES
CNTLUNIT CUNUMBR=2320,PATH=((CSS(1),19)),UNIT=OSX
IODEVICE ADDRESS=(2320,015),MODEL=X,UNITADD=00,CUNUMBR=(2320),
UNIT=OSA,MODEL=X,DYNAMIC=YES,LOCANY=YES
CHPID PATH=(CSS(1),02),SHARED,
PARTITION=((A11,A13,A16,A18),(=)),PCHID=530,TYPE=OSD
CHPID PATH=(CSS(1),03),SHARED,
PARTITION=((A11,A13,A16,A18),(=)),PCHID=100,TYPE=OSD
CHPID PATH=(CSS(1),04),SHARED,
PARTITION=((A11,A13,A16,A18),(=)),PCHID=181,TYPE=OSD
CHPID PATH=(CSS(1),05),SHARED,
PARTITION=((A11,A13,A16,A18),(=)),PCHID=291,TYPE=OSD
CNTLUNIT CUNUMBR=2080,PATH=((CSS(1),02)),UNIT=OSA
IODEVICE ADDRESS=(2080,015),UNITADD=00,CUNUMBR=(2080),UNIT=OSA
IODEVICE ADDRESS=208F,UNITADD=FE,CUNUMBR=(2080),UNIT=OSAD
CNTLUNIT CUNUMBR=20A0,PATH=((CSS(1),03)),UNIT=OSA
IODEVICE ADDRESS=(20A0,015),UNITADD=00,CUNUMBR=(20A0),UNIT=OSA
IODEVICE ADDRESS=20AF,UNITADD=FE,CUNUMBR=(20A0),UNIT=OSAD
CNTLUNIT CUNUMBR=20C0,PATH=((CSS(1),04)),UNIT=OSA
IODEVICE ADDRESS=(20C0,015),UNITADD=00,CUNUMBR=(20C0),UNIT=OSA
IODEVICE ADDRESS=20CF,UNITADD=FE,CUNUMBR=(20C0),UNIT=OSAD
CHPID PATH=(CSS(1),F4),SHARED,
PARTITION=((A11,A13,A16,A18),(=)),TYPE=IQD
CHPID PATH=(CSS(1),F5),SHARED,
PARTITION=((A11,A13,A16,A18),(=)),TYPE=IQD
CHPID PATH=(CSS(1),F6),SHARED,
PARTITION=((A11,A13,A16,A18),(=)),TYPE=IQD
CHPID PATH=(CSS(1),F7),SHARED,
PARTITION=((A11,A13,A16,A18),(=)),TYPE=IQD
CNTLUNIT CUNUMBR=E800,PATH=((CSS(1),F4)),UNIT=IQD
IODEVICE ADDRESS=(E800,032),CUNUMBR=(E800),UNIT=IQD
CNTLUNIT CUNUMBR=E900,PATH=((CSS(1),F5)),UNIT=IQD
IODEVICE ADDRESS=(E900,032),CUNUMBR=(E900),UNIT=IQD
CNTLUNIT CUNUMBR=EA00,PATH=((CSS(1),F6)),UNIT=IQD
IODEVICE ADDRESS=(EA00,032),CUNUMBR=(EA00),UNIT=IQD
CNTLUNIT CUNUMBR=EB00,PATH=((CSS(1),F7)),UNIT=IQD
IODEVICE ADDRESS=(EB00,032),CUNUMBR=(EB00),UNIT=IQD
Important: The CHPIDs, type OSM and OSX (1), are used only if z/OS is part of an
ensemble.
In addition to Example 4-1 on page 159, there are other ways to build the IOCDS for an
OSA-Express adapter. This applies particularly to OSA-Express4S and OSA-Express3
GbE features, which can contain more than a single port on the same CHPID. However, in our labs, we
used the method that is shown in Example 4-1 on page 159. To see other alternatives to
define the IOCDS and to review our suggestions, see 4.4.1, “Dependencies: CHPID, IOCDS,
port numbers, port names, and port sharing” on page 162.
z/OS Communications Server provides a set of High Performance Data Transfer (HPDT)
services that includes multipath channel (MPC), which is a high-speed channel interface that
is designed for network protocol use (for example, APPN or TCP/IP).
Multiple protocols can either share or have exclusive use of a set of channel paths to an
attached platform. With MPC, you can have multiple device paths that are defined as a single
logical connection.
The term MPC group is used to define a single MPC connection that can contain multiple
read and write paths. The number of read and write paths does not have to be equal, but
at least one read path and one write path must be defined within each MPC group.
MPC groups are defined by using the Transport Resource List (TRL), where each defined
MPC group becomes a TRL entry (TRLE) in the TRL table. The configuration and control of
the MPC interfaces are provided by VTAM. They are enabled in VTAM as TRLE minor nodes.
You must define the channel paths that are a part of the group in the TRLE. Each TRLE is
identified by a resource name. For OSA-Express, the TRLE also has a port name to identify
the association between VTAM and TCP/IP, allowing connectivity to the OSA-Express port.
For details about defining a TRLE, see z/OS Communications Server: SNA Resource
Definition, SC31-8778.
Because you are dealing with multiple LPARs in the example server, for redundancy purposes
share the OSA-Express ports (CHPID type OSD) across all LPARs.
In this scenario, there are two OSA-Express3 1000BASE-T features, each with four ports, two
ports per channel. One port of each channel was used unless the second port was needed
for the testing of new functions. This allows you to have four CHPIDs (02, 03, 04, and 05),
shared by four LPARs (SC30, SC31, SC32, and SC33), as shown in Figure 4-9.
[Figure 4-9: shared OSA-Express ports connected to one switch, with VLAN 10 (10.1.2.240) and VLAN 11 (10.1.3.240).]
To make better use of the OSA-Express ports and to control data traffic patterns, define one
port on each OSA-Express feature with a separate VLAN ID, creating two subnetworks to be
used by all LPARs. In a high availability configuration, these OSA-Express ports are the path
to all of the IP addresses for the LAN environment.
Figure 4-10 Comparison: OSA-E2 2-port adapter and OSA-E3 4-port adapter
Each port of the OSA-Express2 adapter that is shown in Figure 4-10 is on a separate CHPID:
CHPID x and CHPID y. Each port on each CHPID is defined with a separate port name and is
at port number 0.
The OSA-Express3 is engineered with two ports on each CHPID: CHPID x and CHPID y. The
two ports on each CHPID are numbered port 0 and port 1. Note how the top half of the
OSA-E3 is the mirror image of the bottom half with regard to the port number assignments;
reading from top to bottom, you see Port 0, Port 1, Port 1, Port 0. As with any OSA port, the
port names on the multi-port OSA-E3 must be unique to a CHPID. An explanation of this port
name assignment is in “Considerations for assigning the OSA port name” on page 168.
Example 4-2 Sample CNTLUNIT and IODEVICE for an OSA on CHPID Type OSD (QDIO)
CNTLUNIT CUNUMBR=2080,PATH=((CSS(2),02)),UNIT=OSA
IODEVICE ADDRESS=(2080,015),CUNUMBR=(2080),UNIT=OSA 1
Example 4-2 corresponds to what you must code in a VTAM TRLE definition to support a
QDIO connection of a TCP/IP stack. Look at Example 4-3, where you see that the VTAM
TRLE that defines port number 0 (which is at A, the only port number on an OSA-Express2)
uses only the first nine addresses (2080 - 2088) of the allocated 15 addresses (2080 - 208E)
on this CNTLUNIT.
To add the OSA-Express3 port that is at port number 1 of the same CHPID, use the same
IOCDS as before, but add a TRLE definition for PORTNUM=1 (B). See the TRLE in Example 4-4.
In the example, we have simply started the addresses for PORTNUM=1 at 2089 of the IOCDS C.
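The address split can be verified with a little hexadecimal arithmetic. The following Python sketch uses the starting address and counts from this example; the helper name is illustrative, and the five-device count for port 1 is read from the device range 2089 - 208D that the accompanying figure shows.

```python
from typing import Tuple


def device_range(first: int, count: int) -> Tuple[str, str]:
    """Return the first and last device numbers of a range as 4-digit hex strings."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return f"{first:04X}", f"{first + count - 1:04X}"


# IODEVICE ADDRESS=(2080,015) allocates 15 device numbers on the control unit.
print("CNTLUNIT:", device_range(0x2080, 15))
# The TRLE for PORTNUM=0 uses the first nine of them.
print("Port 0:  ", device_range(0x2080, 9))
# The TRLE for PORTNUM=1 starts at the next free number.
print("Port 1:  ", device_range(0x2089, 5))
```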
[Figure: device address allocation for two ports on one OSA-Express3 CHPID. Port 0 uses IODEVICE 2080 - 2088 and port 1 uses IODEVICE 2089 - 208D.]
As Example 4-2 on page 163 shows, the IOCP definitions have no awareness of the OSA
adapter’s two ports and simply assign device addresses; the VTAM definition for z/OS does
care about the port numbers and maps the number to the addresses (Example 4-3 on
page 163 and Example 4-4 on page 163). This address allocation scheme worked well for us
because we did not have to reconfigure the IOCP for our test. Other schemes might work
better for you, particularly if you are consolidating OSA ports from separate CHPIDs onto the
same CHPID of a new OSA-Express3.
Note: Our examples show how to point to the two separate ports with the PORTNUM
parameter in a z/OS example. Other z Systems operating systems, such as z/VM, Linux on
z, IBM z/VSE®, or TPF, have similar coding parameters to allocate addresses to port
number 0 versus port number 1. See the appropriate operating system documentation for
those definitions.
A migration to OSA-Express3 can affect more than just the IOCDS. You also have other
types of definitions in the operating system and potentially in access methods (like VTAM) to
migrate. The more you can keep the definitions the same across migrations, the easier and
more efficient the migration to a new platform or release becomes. This is where the next two
alternatives can make a difference for you.
Example 4-6 VTAM definitions for OSA-E3 port numbers 0 and 1 (two device ranges)
OSA1000 VBUILD TYPE=TRL
OSA1000P TRLE LNCTL=MPC, *
READ=1000, *
WRITE=1001, *
DATAPATH=(1002), *
PORTNAME=OSA1000, *
PORTNUM=0, 1 *
MPCLEVEL=QDIO
Figure 4-12 shows how the device addresses are allocated for this example.
[Figure: OSA-E2: CHPID x, port 0, IODEVICE 1000 - 101F, and CHPID y, port 0, IODEVICE 2000 - 201F. OSA-E3: CHPID x, port 0, IODEVICE 1000 - 101F, and port 1, IODEVICE 2000 - 201F.]
Figure 4-12 Consolidate two OSA ports from OSA-E2 onto a single CHPID of OSA-E3
Example 4-7 shows the device range for port number 0 under CUADD=0 (A) and the device
range for port number 1 under CUADD=1 (B).
Example 4-7 Separate logical control unit for each OSA-E3 port
CNTLUNIT CUNUMBR=3000,CUADD=0 A,PATH=((CSS(0),02),(CSS(1),02)),UNIT=OSA
IODEVICE ADDRESS=(3000,032),UNITADD=00,CUNUMBR=(3000),UNIT=OSA
IODEVICE ADDRESS=3020,UNITADD=FE,CUNUMBR=(3000),UNIT=OSAD
CNTLUNIT CUNUMBR=3500,CUADD=1 B,PATH=((CSS(0),02),(CSS(1),02)),UNIT=OSA
IODEVICE ADDRESS=(3500,032),UNITADD=00,CUNUMBR=(3500),UNIT=OSA
The VTAM definitions look similar to what you have seen before. Examine the coding in
Example 4-8.
Example 4-8 VTAM TRLEs for two logical control units and port numbers of an OSA-E3
OSA3000 VBUILD TYPE=TRL
OSA3000P TRLE LNCTL=MPC, *
READ=3000, *
WRITE=3001, *
DATAPATH=(3002), *
PORTNAME=OSA3000, *
PORTNUM=0, 1 *
MPCLEVEL=QDIO
[Figure: OSA-E2: CHPID x, port 0, IODEVICE 3000 - 301F, CUADD=0 (default), and CHPID y, port 0, IODEVICE 3500 - 351F, CUADD=0 (default). OSA-E3: CHPID x, port 0, IODEVICE 3000 - 301F, CUADD=0, and port 1, IODEVICE 3500 - 351F, CUADD=1.]
Figure 4-13 Distinguish OSA-E3 port numbers in the IOCDS with a CUADD parameter
As with the second alternative, you might find it easier to merge what were OSA connections
on two separate CHPIDs into a single CHPID and distinguish them with separate address
ranges and separate logical control unit numbers.
Notes:
- In all the IOCDS definitions that are illustrated so far, we coded the Open Systems
  Adapter/Support Facility (OSA/SF) device on CUADD=0, either by default or through
  explicit coding. The OSA/SF device must be on CUADD=0.
- OSA supports outbound priority queuing (multiple outbound queues) when no more
  than 480 valid subchannels are defined for all LPARs sharing a CHPID. Each LPAR
  sharing a CHPID gets a subchannel for every device that is defined on that CHPID.
  Therefore, if you define a CHPID that is shared by 15 LPARs and define 32 devices
  (either on one port or across two ports), you have used 480 valid subchannels (15 * 32
  = 480). If your definition requires more than 480 valid subchannels (up to a maximum of
  1920), you must explicitly turn off outbound priority queuing on the CHPID
  definition by specifying CHPARM=02 in the IOCP or by specifying it in HCD.
  HCD prevents a device definition that would exceed the 480-subchannel limit.
  IOCP issues an error message and does not create an IOCDS if the limit is exceeded.
- If you must define more than 254 devices for an unshared OSD channel path, multiple
  control units must be defined. Specify a unique logical address for each control unit by
  using the CUADD keyword.
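The subchannel arithmetic in the notes above can be sketched as a small planning check. The function name and the wording of the returned messages are our own; only the limits (480 and 1920) and the CHPARM=02 setting come from the notes.

```python
PRIORITY_QUEUING_LIMIT = 480   # valid subchannels with outbound priority queuing
ABSOLUTE_LIMIT = 1920          # maximum when priority queuing is turned off


def subchannel_plan(lpars, devices):
    """Each LPAR sharing the CHPID gets one subchannel per defined device."""
    total = lpars * devices
    if total <= PRIORITY_QUEUING_LIMIT:
        return total, "priority queuing can stay enabled"
    if total <= ABSOLUTE_LIMIT:
        return total, "turn off priority queuing (CHPARM=02)"
    return total, "exceeds the maximum; reduce LPARs or devices"


print(subchannel_plan(15, 32))   # the case described in the note
print(subchannel_plan(16, 32))   # one more LPAR crosses the 480 limit
```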
This rule seems obvious, but you might find yourself confused when you contemplate a
migration from certain configurations of the OSA-Express2 to an implementation of a new
OSA-Express3. Consider Figure 4-14.
[Figure 4-14: two panels. The left panel shows an OSA-E2 with a single port per CHPID: CHPID x, port 0, port name GIGx [port name GIG0], and CHPID y, port 0, port name GIGy [port name GIG0]. The right panel shows an OSA-E3 with multiple ports per CHPID: CHPID x, port 0, port name GIG0x, and port 1, port name GIG1x; CHPID y, port 0 and port 1, both [port name GIG0].]
Text in the left panel: On an OSA CHPID, the port name value must be unique to the CHPID.
This example depicts a single port per CHPID, as in the design of an OSA-E2. The port names
are unique to the CHPID and also different from each other. However, certain configurations
permit the port names to be the same as in "GIG0." For example, if different VTAMs control the
OSA TRLE definitions, the port names can be the same (for example, GIG0) across the two
CHPIDs.
Text in the right panel: On an OSA CHPID, the port name value must be unique to the CHPID.
This example depicts multiple ports per CHPID, as in the design of an OSA-E3. The port names
at the top of the graphic are not only unique to the CHPID but also different from each other:
"GIG0x" and "GIG1x." No configuration can allow two OSA ports on the same CHPID to be
assigned the same port name. For example, the port names in the bottom half of the graphic
must bear unique port names or one port fails to activate.
Figure 4-14 shows that if you attempt to move both ports that are named GIG0 to CHPID y of
the OSA-E3, one port does not activate because the names are no longer unique to the
CHPID. The presence of duplicate names on the same CHPID generates an SNA sense code
of 8010311B.
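Before a migration, you can scan your planned port names for this condition. The following sketch is our own illustration, not an IBM tool: the function name and the tuple layout are assumptions, and it checks only the per-CHPID rule (it does not model which VTAM owns each TRLE).

```python
from collections import defaultdict


def duplicate_port_names(ports):
    """ports: iterable of (chpid, port_number, port_name) tuples.

    Returns the (chpid, port_name) pairs that are used by more than one
    port, which is the condition reported with sense code 8010311B.
    """
    seen = defaultdict(list)
    for chpid, port_number, name in ports:
        seen[(chpid, name)].append(port_number)
    return {key: nums for key, nums in seen.items() if len(nums) > 1}


# OSA-E2 layout: one port per CHPID, same name on two different CHPIDs.
osa_e2 = [("x", 0, "GIG0"), ("y", 0, "GIG0")]
# OSA-E3 layout: both ports moved to CHPID y without renaming.
osa_e3 = [("y", 0, "GIG0"), ("y", 1, "GIG0")]

print(duplicate_port_names(osa_e2))
print(duplicate_port_names(osa_e3))
```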
When planning connectivity for a LAN environment, there might not be a requirement to
isolate data traffic or services for certain servers or clients as we show in this scenario. In
such cases, VLAN IDs can be omitted.
If there is a requirement for VLANs, add the VLAN IDs to your IP addressing scheme to aid in
the mapping of IP addresses to VLANs based on data traffic patterns or access to resources.
Also, to simplify administration and management of VLANs, consider using GARP VLAN
Registration Protocol (GVRP) where possible. For details, see “VLAN support of
Generic Attribute Registration Protocol” on page 145.
Table 4-4 OSA-Express and switch port assignment with VLAN IDs
OSA-Express port Connects to switch Switch port VLAN ID (mode)
For all OSA-Express ports in our scenarios, we used the following port names:
OSA2080
OSA20A0
OSA20C0
OSA20E0
The device definition of an OSA-Express port must be set as an MPCIPA device type 1. The
link definition describes the type of transport used (in our case, QDIO Ethernet, which is
defined as IPAQENET 2). VLAN ID 3 defines the VLAN number the packets are tagged with
as they are being sent out to the switch.
Note: You can define only a single VLAN for each OSA port with DEVICE and LINK
statements. If you want to define multiple VLANs on a single OSA port, you must define it
with the INTERFACE statement.
In Example 4-12, the alternative INTERFACE statement for OSA-Express ports combines the
definitions that are otherwise coded in the DEVICE, LINK, HOME, BEGINROUTES, and
BSDROUTINGPARMS statements. It requires a label 1, the type of transport used (QDIO
Ethernet, defined as IPAQENET 2, which is the only type allowed for IPv4 devices), and a
port name 3 that matches the TRLE port name. It also takes an IP address and optional
subnet mask 4, an optional MTU size 5, a VLANID 6, a VMAC 7 (required when setting
multiple VLANs on the same physical OSA port), and SOURCEVIPAINT 8, which associates a
specific VIPA with this interface.
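The two VLAN rules stated in this section (DEVICE and LINK statements allow only one VLAN per OSA port, and VMAC is required when multiple VLANs share a port) can be turned into a simple planning check. This is a minimal sketch under our own assumptions: the dictionary keys and the function name are hypothetical, and the check does not parse a real TCP/IP profile.

```python
from collections import defaultdict


def vlan_definition_problems(definitions):
    """definitions: list of dicts with keys port, vlan, statement, vmac."""
    by_port = defaultdict(list)
    for d in definitions:
        by_port[d["port"]].append(d)

    problems = []
    for port, defs in by_port.items():
        if len({d["vlan"] for d in defs}) < 2:
            continue  # a single VLAN on the port has no extra requirements
        for d in defs:
            if d["statement"] != "INTERFACE":
                problems.append(f"{port}: VLAN {d['vlan']} needs an INTERFACE statement")
            elif not d["vmac"]:
                problems.append(f"{port}: VLAN {d['vlan']} needs VMAC")
    return problems


plan = [
    {"port": "OSA2080", "vlan": 10, "statement": "INTERFACE", "vmac": True},
    {"port": "OSA2080", "vlan": 11, "statement": "INTERFACE", "vmac": False},
    {"port": "OSA20A0", "vlan": 10, "statement": "LINK", "vmac": False},
]
print(vlan_definition_problems(plan))
```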
Note: This step is not required when you define OSA ports through the INTERFACE
statement.
If not supplied, defaults are used from static routing definitions in BEGINROUTES or the
OMPROUTE configuration (dynamic routing definitions), if implemented.
If the link characteristics, BEGINROUTES statements, or the OMPROUTE configuration are not
defined, then the stack’s interface layer (based on hardware capabilities) and the
characteristics of devices and links are used. However, this might not provide the
performance or function that you want.
Note: Static and dynamic routing definitions override or replace the link characteristics that
are defined through the BSDROUTINGPARMS statements. For more information about static
and dynamic routing, see Chapter 5, “Routing” on page 223.
Because the device driver resources are provided by VTAM, you can display the resources by
using VTAM display commands. To display a list of all TRLEs active in VTAM, use the D
NET,TRL command, as shown in Example 4-16.
You can also display information about TRLEs that are grouped by control type, such as MPC
or XCF devices, as shown in Example 4-17.
You can also get specific information about a single TRLE by using the TRLE name, as shown
in Example 4-18, for an OSA-Express device.
Note: You might want to revisit the description of IPv4 address registration in
“OSA-Express QDIO IPv4 address registration” on page 143.
For performance reasons, the OSA-Express bypasses the LAN and routes packets directly
between the stacks when possible. Figure 4-15 shows two TCP/IP stacks, TCPIPA and
TCPIPB, which share the OSA port that is connected to subnet 10.1.2.0/24.
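[Figure 4-15: TCPIPA (SC30, profile PROFA30X, 10.1.2.11) and TCPIPB (SC31, profile PROFB31X, 10.1.2.21) share a QDIO OSA port on subnet 10.1.2.0/24. Both stacks register multicast addresses 224.000.000.001 and 224.000.000.005. Paths A, B, and C are marked; the OSA port connects through a switch and a router to the IP network.]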
For unicast packets, OSA internally routes the packet when the next-hop IP address is
registered on the same LAN or VLAN by another stack sharing the OSA port. Figure 4-15
illustrates examples of this action.
Thus, you see that stacks sharing an OSA-Express port can communicate over the OSA.
Some customers might express concerns about this efficient communication path and want to
disable it because traffic flowing internally through the OSA adapter bypasses any security
features that are implemented on the external LAN. For example, the customer might have
used the virtualization features of the z Systems and of 10-Gigabit OSA adapters to build a
perimeter network (DMZ) on several LPARs of a z Systems and also several production
LPARs on the same z Systems footprint. Although they can implement firewall and intrusion
detection technologies within the LPARs to isolate the two zones (DMZ and production) from
each other, they might have already invested in external security mechanisms on the LAN. If
traffic through a shared OSA bypasses the security on the LAN, they must find a way to
prevent the internal routing across the shared OSA path.
Several network designs are available to provide isolation, either by forcing traffic to bypass
the shared OSA path or by preventing traffic from using it:
- Implement IP filtering on the stacks in the adjacent zones by using z/OS Policy Agent with
  IP filtering and Intrusion Detection Services (IDS).
- Implement routing filters that block the advertisement of certain routing zones to parts of
  the network from which they should remain concealed. Examples of such features are
  OSPF range checking, RIP, or EIGRP routing filters.
- Implement policy-based routing (PBR) to eliminate the internal OSA path where it is not
  wanted.
- Define static routes so that paths to a stack sharing the OSA are forced to hop through a
  router on the LAN.
- Configure the TCP/IP stacks in separate zones (IP subnets) with separate VLANs that
  extend into the stacks themselves.
- Implement OSA connection isolation.
Coding ISOLATE on your INTERFACE statement enables the function. It tells the OSA-Express
not to allow communications to this stack other than over the LAN. OSA-Express requires that
both stacks sharing the port are non-isolated for direct routing to occur.
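The forwarding rule can be summarized in a short model. This is a sketch of the behavior as this chapter describes it, not the OSA firmware logic: the class, the function, and the three result labels are our own, and "blocked" simply means that the packet is not delivered over the internal path. The addresses and ISOLATE settings follow the test scenario in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stack:
    name: str
    ip: str
    vlan: int
    isolate: bool = False


def osa_path(sender, next_hop_ip, sharers):
    """Classify how a unicast packet leaves `sender` on a shared OSA port."""
    target = next(
        (s for s in sharers
         if s.name != sender.name and s.ip == next_hop_ip and s.vlan == sender.vlan),
        None,
    )
    if target is None:
        return "lan"        # next hop is not registered by a sharing stack
    if sender.isolate or target.isolate:
        return "blocked"    # direct routing needs both stacks non-isolated
    return "internal"       # OSA routes the packet without using the LAN


STACKS = [
    Stack("TCPIPA", "10.1.2.11", 10, isolate=True),
    Stack("TCPIPB", "10.1.2.21", 10, isolate=True),
    Stack("TCPIPC", "10.1.2.31", 10),
    Stack("TCPIPD", "10.1.2.41", 10),
]
a, b, c, d = STACKS

print(osa_path(c, d.ip, STACKS))          # both non-isolated
print(osa_path(a, b.ip, STACKS))          # ISOLATE on both ends
print(osa_path(c, a.ip, STACKS))          # ISOLATE on the target only
print(osa_path(a, "10.1.2.240", STACKS))  # next hop is the external router
```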
Because this function is specific to security, an interface that is coded with ISOLATE cannot
be activated if its OSA-Express does not support the connection isolation function. Examine
the messages at 1 and 2 in Example 4-19, which show an unsuccessful activation attempt for
a QDIO interface whose OSA does not support the ISOLATE function that was coded on it.
Example 4-19 Failure to activate an OSA interface that does not support the ISOLATE feature
V TCPIP,TCPIPF,START,OSA2080X
EZZ0060I PROCESSING COMMAND: VARY TCPIP,,START,OSA2080X
EZZ0053I COMMAND VARY START COMPLETED SUCCESSFULLY
EZD0022I INTERFACE OSA2080X DOES NOT SUPPORT THE ISOLATE FUNCTION 1
EZZ4341I DEACTIVATION COMPLETE FOR INTERFACE OSA2080X 2
To remove the ISOLATE specification from the interface so that you can activate it
successfully, you must STOP the interface before you use the V TCPIP,,OBEYFILE command to
modify the ISOLATE parameter.
If you implement static routing where connection isolation is in effect, it is simple to code the
appropriate routing statements to bypass the direct path through the OSA. If you are running
a dynamic routing protocol, you might see routing errors when the routing protocol attempts to
send packets over the isolated OSA port. Such errors are “working as designed” when
ISOLATE is introduced into the configuration.
If the visibility of such errors is unwanted, you can take other measures to avoid the failure
messages. If you are simply attempting to bypass the direct route in favor of another, indirect
route, you can accomplish this also with some thoughtful design.
For example, you might purposely bypass the direct path by using PBR or by coding static
routes that supersede the routes that are learned by the dynamic routing protocol. You might
also adjust the weights of connections to favor alternative interfaces over the interfaces that
are coded with ISOLATE.
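[Figure 4-16: TCPIPA (SC30, profile PROFA30X, 10.1.2.11, ISOLATE) and TCPIPB (SC31, profile PROFB31X, 10.1.2.21, [ISOLATE]) share a QDIO OSA port on subnet 10.1.2.0/24. Both stacks register multicast addresses 224.000.000.001 and 224.000.000.005. The direct path through the OSA port (1) is marked with an X; path 2 runs through the switch and the router.]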
Both stacks are running a dynamic routing protocol that informs them that there is a direct
path (1) through the OSA port between each other. The routing protocol knows nothing of the
ISOLATE function that was introduced to prevent packets from using the direct route.
(ISOLATE needs to be coded on only one of the two TCP/IP stacks, although in this diagram
you can code it on both.)
If TCPIPA and TCPIPB do not communicate with each other at all, then there is no need to
alter the appearance of the existing routing table. A route failure in this instance might be
wanted. To produce a message that explains that the two endpoints are ineligible for routing
to each other at all, you can introduce an IP filter.
Note: The routing failure itself has no failure message that indicates that ISOLATE is at
fault.
However, if TCPIPA and TCPIPB do need to exchange information, you must deploy an
effective route that bypasses the direct route between them. Therefore, at TCPIPA, you might
add a non-replaceable static route to an IP address in TCPIPB; the static route in the
BEGINROUTES block points to the next-hop router on the path indicated with 2 in Figure 4-16
on page 180.
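[Figure 4-17: TCPIPA (SC30, profile PROFA30X) and TCPIPB (SC31, profile PROFB31X) are connected by an alternative path (2) that is shown with Cost = 90 at each end.]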
Figure 4-17 shows a lower-cost route at 2. The dynamic routing protocol continues to run, but
now the favored route is the one over HiperSockets, XCF, CTC, or over an alternative LAN
connection. Although the dynamic routing protocol continues its awareness of the direct OSA
path, it prefers the path at 2.
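The preference for the lower-cost path can be illustrated with a minimal sketch. The cost of 90 for path 2 is taken from the figure; the cost of 100 for the direct OSA path is a hypothetical value that we chose only to make the comparison visible, and the function name is our own.

```python
def preferred_route(candidates):
    """candidates: list of (path_name, cost). The lowest cost is preferred."""
    return min(candidates, key=lambda item: item[1])


paths = [
    ("direct OSA path (1)", 100),   # hypothetical cost for the shared OSA port
    ("alternative path (2)", 90),   # cost shown in the figure
]

print(preferred_route(paths)[0])
```

The routing protocol still knows both paths. If the alternative path fails, the direct OSA path becomes the preferred one again, and with ISOLATE in effect the traffic fails.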
[Figure: the stacks attach through a switch to VLAN 10 (10.1.2.240) and VLAN 11 (10.1.3.240).]
Figure 4-18 Stacks that are started with test profiles PROFA30X, B31X, C32X, and D33X
All of the z Systems TCP/IP stacks are members of an OSPF Totally Stubby Network. The
TCP/IP profiles at each stack are named PROFA30X, PROFB31X, PROFC32X, and
PROFD33X. Each stack shares each of the four OSA ports that are depicted. In VLAN 10 and
on subnet 10.1.2.0/24, you see two OSA ports on each stack: OSA2080 and OSA20A0. In
VLAN 11 and on subnet 10.1.3.0/24, you see two OSA ports on each stack: OSA20C0 and
OSA20E0. Each stack also has a static VIPA in subnet 10.1.1.0/24. The OSA and VIPA
interfaces are all advertised with OSPF protocols. However, the connections that are
implemented with the DYNAMICXCF keyword use only static routing.
[Figure: CHPID 03, OSA20A0, devices 20A0-20AF, registered addresses 10.1.2.12, 10.1.2.22, 10.1.2.32, and 10.1.2.42 (10.1.2.x2), next-hop VMACs, VLAN ID 10. CHPID 04, OSA20C0, devices 20C0-20CF, registered addresses 10.1.3.11, 10.1.3.21, 10.1.3.31, and 10.1.3.41 (10.1.3.x1), next-hop VMACs, VLAN ID 11. Each port is labeled as a communication path through the OSA port.]
Figure 4-19 OAT entries for the stacks sharing the four OSA ports
The revised diagram shows you how stacks communicate with each other over the shared
OSA ports when the next-hop router IP address is registered in the OSA. For performance
reasons, the OSA-Express bypasses the LAN and routes packets directly between the stacks
when possible.
[Figure: CHPID 02, OSA2080 (addresses 10.1.2.x1, devices 2080-208F), attached in trunk mode to the switch on VLAN 10 (10.1.2.240). X marks appear on paths through the OSA port.]
In our testing, we do not permit TCPIPA or TCPIPB to be reached directly over the shared
OSA port. Using the ISOLATE function, we prevent direct communication between TCPIPA
and TCPIPB by way of this port; we also prevent direct communication between either
TCPIPA or TCPIPB and either of the two remaining stacks in our configuration: TCPIPC and
TCPIPD.
We continue to permit TCPIPC and TCPIPD to share the OSA path between each other.
Note: You might choose to design your OSA ISOLATE function so that no sharing
TCP/IP stack can use the direct path through the OSA. However, if you have abundant
bandwidth on the OSA port, you might choose to implement ISOLATE on only selected
sharing TCP/IP stacks, as we have done in our test.
Example 4-20 ISOLATE coding on CHPID2 (OSA2080X) for PROFA30X and PROFB31X
INTERFACE OSA2080X
DEFINE IPAQENET
PORTNAME OSA2080
IPADDR 10.1.2.11/24
MTU 1492
VLANID 10
VMAC ROUTEALL
ISOLATE 1
INTERFACE OSA2080X
DEFINE IPAQENET
PORTNAME OSA2080
IPADDR 10.1.2.21/24
MTU 1492
VLANID 10
VMAC ROUTEALL
ISOLATE 2
The definitions for the interface in stacks TCPIPC and TCPIPD contain NOISOLATE, which is
also the default. See 3 and 4 in Example 4-21.
Example 4-21 NOISOLATE coding on CHPID2 (OSA2080X) for PROFC32X and PROFD33X
INTERFACE OSA2080X
DEFINE IPAQENET
PORTNAME OSA2080
IPADDR 10.1.2.31/24
MTU 1492
VLANID 10
VMAC ROUTEALL
NOISOLATE 3
INTERFACE OSA2080X
DEFINE IPAQENET
PORTNAME OSA2080
IPADDR 10.1.2.41/24
MTU 1492
VLANID 10
VMAC ROUTEALL
NOISOLATE 4
VMAC IP address
HOME 020005749925 010.001.002.011
...
************************************************************************
Image 2.4 (A24 ) CULA 0
80(2080)* MPC N/A OSA2080 (QDIO control) SIU ALL
82(2082) MPC 00 No4 No6 OSA2080 (QDIO data) Isolated X SIU ALL
VLAN 10 (IPv4)
VMAC IP address
HOME 020004749925 010.001.002.021
...
************************************************************************
First, we examine the routing table at TCPIPA to determine whether we have routes that take
us to those destinations, as shown in Example 4-24.
We run traceroute against the three target addresses in network 10.1.2.0/24, as shown in
Example 4-25.
Example 4-25 Test traceroute from TCPIPA to native OSA Home address of TCPIPB
===> tracerte 10.1.2.21 (tcp tcpipa V srcip 10.1.2.11 Intf OSA2080X
The results are the same when trying to reach TCPIPC and TCPIPD from either TCPIPA or
TCPIPB: Because the route table indicates a direct path through the OSA, the stack attempts
to send the packet over the direct route and experiences a failure. This is what we expect
because we coded ISOLATE on OSA2080X in TCPIPA (and TCPIPB).
Can we reach the VIPAs over the OSA port that is indicated as a route in Example 4-24 on
page 188? We run a traceroute to the VIPAs and discover that the available routes cannot be
reached, as shown in Example 4-26.
The results are the same when trying to reach the VIPAs at TCPIPC and TCPIPD from either
TCPIPA or TCPIPB: Because the route table indicates a direct path through the OSA, the
stack attempts to send the packet over the direct route and experiences a failure. This is what
we expect because we coded ISOLATE on OSA2080X in TCPIPA (and TCPIPB).
VMAC IP address
HOME 02004F776872 010.001.002.010
HOME 02004F776872 010.001.002.011 Y
Note: The ARP takeover function still works as expected if you start a second device on
the same subnet in the same stack. ISOLATE does not alter this function.
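[Figure: four stacks at 10.1.2.11, 10.1.2.21, 10.1.2.31, and 10.1.2.41 share the OSA port; each registers multicast addresses 224.000.000.001 and 224.000.000.005. Paths X1, X2, and X3 are marked between stacks, path 4 is marked between the last two stacks, and paths 5 run through the switch and router to 10.1.100.221 and 10.1.100.224 on network 10.1.100.0/24.]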
1, 2, 3 = Direct routes from TCPIPA or TCPIPB to any other stack are unsuccessful.
4 = Direct route between TCPIPC and TCPIPD is successful.
5 = Routes from any stack to terminals reached through the router are successful.
Figure 4-21 Available paths when ISOLATE is defined and dynamic routing is enabled
Those tests show that the existing basic routing table at each of the stacks allows you to
communicate with TCP/IP networks that are reached through the external router (5).
The routing tables also permit TCPIPC and TCPIPD to communicate with each other (4).
NOISOLATE is either coded or defaulted on the INTERFACE in the two stacks. However, TCPIPC
and TCPIPD cannot communicate with either TCPIPA or TCPIPB, and TCPIPA and TCPIPB
cannot communicate with each other over the internal OSA path (1, 2, 3). Example 4-28
shows the typical responses when a target cannot be reached.
Unfortunately, the only path that TCPIPC and TCPIPD have for reaching TCPIPA and
TCPIPB is the direct route through the OSA port, but this port prevents internal routing
because the parameter ISOLATE is coded at TCPIPA and TCPIPB. The routing table in
Example 4-24 on page 188 shows that the table points to a network route for 10.1.2.0/24,
which is reached by way of a directly attached next-hop router (0.0.0.0):
10.1.2.0/24 0.0.0.0 UO 0000000000 OSA2080X
There is no route for any of the stacks to reach each other over the external router.
Again, the issue is that the dynamic routing table knows nothing about the ISOLATE feature
because ISOLATE is not a Layer 3 function. The dynamic routing protocol is working according
to the protocol standards. So, how do we rectify this situation if we really want the stacks to
communicate with each other, but just not directly over the OSAs? It is a matter of adjusting
the routing table by adding some non-replaceable static routes.
We tested only one of these options: coding static routes to supersede the dynamically
learned routes.
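[Figure 4-22: four stacks at 10.1.2.11, 10.1.2.21, 10.1.2.31, and 10.1.2.41 share the OSA port; each registers multicast addresses 224.000.000.001 and 224.000.000.005. Paths X1, X2, and X3 are marked between stacks, path 4 is marked between the last two stacks, and paths 5 run through the switch and router to 10.1.100.221 and 10.1.100.224 on network 10.1.100.0/24.]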
1, 2, 3 = Direct routes from TCPIPA or TCPIPB to any other stack are unsuccessful.
4 = Direct route between TCPIPC and TCPIPD is successful.
5 = Routes from any stack to terminals reached through the router are successful.
6 = Indirect routes among the stacks through external router are successful if present in routing table.
The routes to the remote TCP/IP nodes (5) through the OSA ports continue to be successful
in our scenario; no changes are necessary here. The routing table between TCPIPC and
TCPIPD continues to function as expected to permit direct routing between the two stacks (4);
changes to the routing table are also unnecessary here.
The routing paths that are indicated with 1, 2, and 3 in Figure 4-22 continue to be
unsuccessful in this test because we want to enforce ISOLATE. However, we can make the
two-hop paths through the external router (6) available if we code non-replaceable static
routes. These routes supersede the dynamically learned routes in the stack’s routing table.
Example 4-29 Static non-replaceable routes at TCPIPA to override the direct route through the OSA port
;TCPIPA.TCPPARMS(ROUTA30X)
BEGINRoutes
; Direct Routes - Routes that are directly connected to my interfaces
; Destination Subnet Mask First Hop Link Name Packet Size
;;;;;;;;;;;;;;;;;;;;;;;;;BELOW IS FOR TESTING ISOLATION;;;;;;;;;;;;;;;;
ROUTE 10.1.2.0/24 10.1.2.240 OSA2080X mtu 1492 1
ROUTE 10.1.1.0/24 10.1.2.240 OSA2080X mtu 1492 2
ROUTE 10.1.1.20/32 10.1.2.240 OSA2080X mtu 1492 3
ROUTE 10.1.1.30/32 10.1.2.240 OSA2080X mtu 1492 3
ROUTE 10.1.1.40/32 10.1.2.240 OSA2080X mtu 1492 3
;;;;;;;;;;;;;;;;;;;;;;;;ABOVE IS FOR TESTING ISOLATION;;;;;;;;;;;;;;;;;
ENDRoutes
The example shows, at 1 and 2, the indirect route to both the native OSA port IP subnet and
the VIPA IP subnet. In our scenario, these two statements do not suffice because our OSPF
configuration indicates that we are advertising HOST routes for the VIPAs. As a result, we
also need the statements you see at 3, that is, the statements that point to a route over the
external router to reach the specific host VIPA addresses. If we do not code these statements,
OSPF advertises HOST routes and our stack always tries unsuccessfully to reach the target
VIPAs over the OSA port.
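The need for the host routes at 3 follows from longest-prefix matching, which the following sketch illustrates. It is a simplified model and not the stack's route selection code: the dictionary layout and the function name are our own, the OSPF next hop 10.1.2.21 is an assumed value for a host route over the direct OSA path, and the tie-break that prefers a static route over an OSPF route for the same destination stands in for the non-replaceable static route superseding the learned route.

```python
import ipaddress


def best_route(table, destination):
    """Pick the most specific matching route; static wins a tie."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in table if dest in ipaddress.ip_network(r["dest"])]
    return max(
        candidates,
        key=lambda r: (ipaddress.ip_network(r["dest"]).prefixlen,
                       r["source"] == "static"),
        default=None,
    )


# OSPF advertises a host route to the VIPA over the direct OSA path
# (the gateway value is an assumption for illustration).
OSPF_HOST = {"dest": "10.1.1.20/32", "gateway": "10.1.2.21", "source": "ospf"}
# Statements 2 and 3 from the example, both pointing to the external router.
STATIC_SUBNET = {"dest": "10.1.1.0/24", "gateway": "10.1.2.240", "source": "static"}
STATIC_HOST = {"dest": "10.1.1.20/32", "gateway": "10.1.2.240", "source": "static"}

without_host_route = best_route([OSPF_HOST, STATIC_SUBNET], "10.1.1.20")
with_host_route = best_route([OSPF_HOST, STATIC_SUBNET, STATIC_HOST], "10.1.1.20")

print(without_host_route["gateway"])  # the /32 from OSPF beats the static /24
print(with_host_route["gateway"])     # the static /32 takes over
```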
We add the static routing statements that are shown in Example 4-30 to TCPIPB. The only
difference from the statements at TCPIPA is the absence of TCPIPB’s VIPA address and the
presence of TCPIPA’s VIPA address.
Example 4-30 Static non-replaceable routes at TCPIPB to override the direct route through the OSA port
;TCPIPB.TCPPARMS(ROUTB31X)
BEGINRoutes
; Direct Routes - Routes that are directly connected to my interfaces
; Destination Subnet Mask First Hop Link Name Packet Size
;
;;;;;;;;;;;;;;;;;;;;;;;;;BELOW IS FOR TESTING ISOLATION;;;;;;;;;;;;;;;;
ROUTE 10.1.2.0/24 10.1.2.240 OSA2080X mtu 1492
ROUTE 10.1.1.0/24 10.1.2.240 OSA2080X mtu 1492
ROUTE 10.1.1.10/32 10.1.2.240 OSA2080X mtu 1492
ROUTE 10.1.1.30/32 10.1.2.240 OSA2080X mtu 1492
ROUTE 10.1.1.40/32 10.1.2.240 OSA2080X mtu 1492
;;;;;;;;;;;;;;;;;;;;;;;;ABOVE IS FOR TESTING ISOLATION;;;;;;;;;;;;;;;;;
ENDRoutes
We test only a subset of all addresses that are available at the four stacks, that is, the
connectivity with the VIPAs and the native OSA port addresses. Therefore, we limit our
BEGINROUTES coding only to these two address types.
Note: If you also need connectivity to other addresses, such as CTC or HiperSockets, you
might have to add more routes to your list of non-replaceable routes.
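Because the route lists for the two isolated stacks differ only in which VIPA is left out, you can generate them from one table. The following sketch is a convenience script of our own, not part of z/OS; the function name is hypothetical, and the VIPA-to-stack mapping is the one used in our test configuration. It reproduces the ROUTE statements of Example 4-29 and Example 4-30 with single spaces between fields.

```python
# VIPA addresses of the four stacks in our test configuration.
VIPAS = {
    "TCPIPA": "10.1.1.10",
    "TCPIPB": "10.1.1.20",
    "TCPIPC": "10.1.1.30",
    "TCPIPD": "10.1.1.40",
}


def isolation_routes(stack, router="10.1.2.240", link="OSA2080X", mtu=1492):
    """Build the ROUTE statements for an isolated stack (TCPIPA or TCPIPB)."""
    suffix = f"{router} {link} mtu {mtu}"
    lines = [
        f"ROUTE 10.1.2.0/24 {suffix}",   # native OSA port subnet
        f"ROUTE 10.1.1.0/24 {suffix}",   # VIPA subnet
    ]
    # Host routes to every VIPA except the stack's own.
    lines += [
        f"ROUTE {vipa}/32 {suffix}"
        for name, vipa in VIPAS.items()
        if name != stack
    ]
    return lines


for line in isolation_routes("TCPIPB"):
    print(line)
```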
In Example 4-31, A shows that OSPF reaches the VIPAs in subnet 10.1.1.0/24 over the OSA
port; B shows that OSPF informed the stack that the network 10.1.2.0/24 is directly attached.
Example 4-33 TCPIPC: non-replaceable static routes to other TCP/IP nodes on z Systems
;TCPIPC.TCPPARMS(ROUTC32X)
BEGINRoutes
; Direct Routes - Routes that are directly connected to my interfaces
; Destination Subnet Mask First Hop Link Name Packet Size
;;;;;;;;;;;;;;;;;;;;;;;;;BELOW IS FOR TESTING ISOLATION;;;;;;;;;;;;;;;;
ROUTE 10.1.2.11/32 10.1.2.240 OSA2080X mtu 1492 1
ROUTE 10.1.2.21/32 10.1.2.240 OSA2080X mtu 1492 1
ROUTE 10.1.1.10/32 10.1.2.240 OSA2080X mtu 1492 2
ROUTE 10.1.1.20/32 10.1.2.240 OSA2080X mtu 1492 2
;;;;;;;;;;;;;;;;;;;;;;;;ABOVE IS FOR TESTING ISOLATION;;;;;;;;;;;;;;;;;
ENDRoutes
At TCPIPC and TCPIPD, we need to override the routes that are learned from OSPF that
point to the addresses at TCPIPA and TCPIPB. In Example 4-33 at 1, we define host routes to
the native OSA port IP addresses at TCPIPA and TCPIPB that point to the external router. We
do not code any static routes to the TCPIPD stack because the direct path to it remains
available. At 2, we add routes to the host VIPAs that are in TCPIPA and TCPIPB, but not to
the VIPA in TCPIPD.
We must make the same types of routing changes at TCPIPD. See the statements that we
add to this stack in Example 4-34.
Example 4-34 TCPIPD: non-replaceable static routes to other TCP/IP nodes on z Systems
;TCPIPD.TCPPARMS(ROUTD33X)
BEGINRoutes
; Direct Routes - Routes that are directly connected to my interfaces
; Destination Subnet Mask First Hop Link Name Packet Size
;;;;;;;;;;;;;;;;;;;;;;;;;BELOW IS FOR TESTING ISOLATION;;;;;;;;;;;;;;;;
ROUTE 10.1.2.11/32 10.1.2.240 OSA2080X mtu 1492 1
ROUTE 10.1.2.21/32 10.1.2.240 OSA2080X mtu 1492 1
ROUTE 10.1.1.10/32 10.1.2.240 OSA2080X mtu 1492 2
ROUTE 10.1.1.20/32 10.1.2.240 OSA2080X mtu 1492 2
;;;;;;;;;;;;;;;;;;;;;;;;ABOVE IS FOR TESTING ISOLATION;;;;;;;;;;;;;;;;;
ENDRoutes
At 1, we define host routes to the native OSA port IP addresses that point to the external
router. We do not code any static routes to the TCPIPC stack because the direct path to it
remains available. At 2, we add routes to the host VIPAs that are in TCPIPA and TCPIPB, but
not to the VIPA in TCPIPC.
Example 4-35 Routing table at TCPIPC with entries that are provided by OSPF and by static routes
D TCPIP,TCPIPC,N,ROUTE
IPV4 DESTINATIONS
DESTINATION GATEWAY FLAGS REFCNT INTERFACE
DEFAULT 10.1.2.240 UGO 0000000000 OSA2080X
10.1.1.10/32 10.1.2.240 UGHS 0000000000 OSA2080X A
10.1.1.20/32 10.1.2.240 UGHS 0000000000 OSA2080X A
10.1.1.30/32 0.0.0.0 UH 0000000000 VIPA1L
10.1.1.40/32 10.1.2.41 UGHO 0000000000 OSA2080X B
10.1.2.0/24 0.0.0.0 UO 0000000000 OSA2080X C
10.1.2.11/32 10.1.2.240 UGHS 0000000000 OSA2080X D
10.1.2.21/32 10.1.2.240 UGHS 0000000000 OSA2080X D
10.1.2.31/32 0.0.0.0 UH 0000000000 OSA2080X
10.1.2.32/32 0.0.0.0 H 0000000000 OSA20A0X
10.1.3.0/24 10.1.2.41 UGO 0000000000 OSA2080X
10.1.3.0/24 10.1.2.240 UGO 0000000000 OSA2080X
10.1.3.31/32 0.0.0.0 H 0000000000 OSA20C0X
10.1.3.32/32 0.0.0.0 H 0000000000 OSA20E0X
10.1.7.0/24 0.0.0.0 US 0000000000 IQDIOLNK0A01071F
10.1.7.11/32 0.0.0.0 UHS 0000000000 IQDIOLNK0A01071F
10.1.7.31/32 0.0.0.0 H 0000000000 EZASAMEMVS
10.1.7.31/32 0.0.0.0 UH 0000000000 IQDIOLNK0A01071F
10.1.7.41/32 0.0.0.0 UHS 0000000000 IQDIOLNK0A01071F
10.1.100.0/24 10.1.2.240 UGO 0000000000 OSA2080X
127.0.0.1/32 0.0.0.0 UH 0000000002 LOOPBACK
192.168.1.0/24 10.1.2.240 UGO 0000000000 OSA2080X
192.168.2.0/24 10.1.2.240 UGO 0000000000 OSA2080X
192.168.3.0/24 10.1.2.240 UGO 0000000000 OSA2080X
IPV6 DESTINATIONS
DESTIP: ::1/128
GW: ::
INTF: LOOPBACK6 REFCNT: 0000000000
FLGS: UH MTU: 65535
25 OF 25 RECORDS DISPLAYED
END OF THE REPORT
Look more closely at Example 4-35. The entries that are marked with A are static host routes
to the VIPAs in TCPIPA and TCPIPB; they override the routes that are learned from OSPF.
The entries at B and C remain as OSPF originally advertised them. These are for addresses in
TCPIPD or for other 10.1.2.0/24 addresses that are not to be found in TCPIPA or TCPIPB.
The entries that are marked with D are static host routes to the native OSA port addresses of
TCPIPA and TCPIPB; they also override the routes that are learned from OSPF.
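When you review a larger routing table, it can help to separate the static overrides from the OSPF entries programmatically. The sketch below works on a few rows copied from Example 4-35. It assumes that an S in the flags column marks a static route and an O marks a route learned from OSPF, which is consistent with the annotated entries; the function names are our own.

```python
# (destination, gateway, flags, interface), copied from Example 4-35
ROUTES = [
    ("10.1.1.10/32", "10.1.2.240", "UGHS", "OSA2080X"),
    ("10.1.1.40/32", "10.1.2.41", "UGHO", "OSA2080X"),
    ("10.1.2.0/24", "0.0.0.0", "UO", "OSA2080X"),
    ("10.1.2.11/32", "10.1.2.240", "UGHS", "OSA2080X"),
    ("10.1.2.31/32", "0.0.0.0", "UH", "OSA2080X"),
]


def origin(flags):
    """Classify a route by its flags (assumed meanings: S static, O OSPF)."""
    if "S" in flags:
        return "static"
    if "O" in flags:
        return "OSPF"
    return "other"


def static_overrides(routes, router):
    """Destinations that are forced through `router` by a static route."""
    return [dest for dest, gateway, flags, _ in routes
            if origin(flags) == "static" and gateway == router]


print(static_overrides(ROUTES, "10.1.2.240"))
```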
As Example 4-36 shows, our command executions are successful and point to a two-hop
route across the router (A) between the two isolated TCPIP stacks (TCPIPA and TCPIPB).
Our tests to the external terminals from TCPIPA are also successful. (See Figure 4-22 on
page 193 for a diagram of where the terminals are.) Our test in Example 4-37 shows a
verbose ping to the terminal at address 10.1.100.221.
Example 4-37 Connectivity through the ISOLATED OSA to the remote network
===> ping 10.1.100.221 (tcp tcpipa V srcip 10.1.1.10
Pinging host 10.1.100.221
with 256 bytes of ICMP data
Ping #1 from 10.1.100.221: bytes=264 seq=1 ttl=127 time=1.28 ms
***
Ping #2 from 10.1.100.221: bytes=264 seq=2 ttl=127 time=0.37 ms
Ping #3 from 10.1.100.221: bytes=264 seq=3 ttl=127 time=0.91 ms
Ping statistics for 10.1.100.221
Packets: Sent=3, Received=3, Lost=0 (0% loss)
Approximate round-trip times in milliseconds:
Minimum=0.37 ms, Maximum=1.28 ms, Average=0.85 ms, StdDev=0.46 ms
***
We must test our connectivity from TCPIPA to TCPIPC and TCPIPD to see whether the
two-hop route is successful now that we updated the routing tables at all four stacks. See the
indications of a two-hop route (2) in Example 4-38.
Finally, we test the connectivity between TCPIPC and TCPIPD to ensure that we are still
taking the direct path through the OSA port despite the addition of our static routes.
Example 4-39 shows that we are indeed taking the one-hop route (A).
If you are using static routing at z/OS and must isolate traffic over shared OSA
ports, either deploy a VLAN implementation with separate VLAN IDs assigned to
separate IP subnets or use the ISOLATE feature and remember to disable ICMP redirects.
If you are using a dynamic routing protocol at z/OS and must isolate traffic over shared OSA
ports, use a VLAN implementation with separate VLAN IDs that are assigned to separate IP
subnets for each of the sharing TCP/IP stacks.
If you are using a dynamic routing protocol at z/OS and must isolate traffic over shared
OSA ports but are reluctant to deploy VLANs in the z Systems TCP/IP stacks, use the
OSA connection isolation feature. When doing so, plan a strategy to include some
non-replaceable static routes in the TCP/IP stack’s routing table that force a hop over
an external router. Create a robust testing plan to ensure that you are permitting only the type
of routing that you want.
To create this scenario, we define the HiperSockets, which is represented by the IQD CHPID
and its associated devices. All LPARs that are configured to use the shared IQD CHPID have
internal connectivity, and therefore can communicate by using HiperSockets.
Our environment uses three IQD CHPIDs (F4, F5, and F6). Each creates a separate logical
LAN with its own subnetwork. Figure 4-23 depicts these interfaces of our scenario.
Note: In both cases, the TRLE is dynamically built by VTAM. The IQDCHPID VTAM start
option controls the VTAM selection of which IQD CHPID (and related devices) to
include in the HiperSockets MPC group (IUTIQDIO) when it is dynamically built for
DYNAMICXCF connectivity.
4.6.2 Considerations
For isolation of IP traffic between LPARs through HiperSockets, consider using VLANs so you
can logically subdivide the internal LAN for a HiperSockets CHPID into multiple VLANs.
Therefore, stacks that configure the same VLAN ID for the same CHPID can communicate
over HiperSockets; stacks that have no VLAN ID or a different VLAN ID configured cannot.
For HiperSockets, the VLAN ID applies to IPv4 and IPv6 connections. HiperSockets
VLAN IDs can be defined by using the VLANID parameter on a LINK or INTERFACE statement.
Valid VLAN IDs are 1 - 4094.
Define the DEVICE and LINK statements for each HiperSockets CHPID being implemented, as
shown in Example 4-40 on page 203. A HiperSockets CHPID must be defined as an MPCIPA
type of device 1. The link definition describes the type of transport being used. A
HiperSockets link is defined as IPAQIDIO 2.
Important: The hexadecimal value that is specified here represents the CHPID, and it
cannot be the same value as that used for the dynamic XCF HiperSockets interface.
If the link characteristics, BEGINROUTES statements, or the OMPROUTE configuration are not
defined, the stack’s interface layer (based on hardware capabilities) and the characteristics of
devices and links are used. However, this might not provide the performance or function you
want.
Example 4-43 Using the command D TCPIP,TCPIPA,N,DEV to verify the HiperSockets connection
DEVNAME: IUTIQDF4 DEVTYPE: MPCIPA
DEVSTATUS: READY
LNKNAME: IUTIQDF4L LNKTYPE: IPAQIDIO LNKSTATUS: READY
IPBROADCASTCAPABILITY: NO
CFGROUTER: NON ACTROUTER: NON
ARPOFFLOAD: YES ARPOFFLOADINFO: YES
ACTMTU: 8192
VLANID: NONE
READSTORAGE: GLOBAL (2048K)
SECCLASS: 255 MONSYSPLEX: NO
IQDMULTIWRITE: ENABLED (ZIIP)
ROUTING PARAMETERS:
MTU SIZE: 8192 METRIC: 80
DESTADDR: 0.0.0.0 SUBNETMASK: 255.255.255.0
MULTICAST SPECIFIC:
MULTICAST CAPABILITY: YES
GROUP REFCNT SRCFLTMD
----- ------ --------
224.0.0.5 0000000001 EXCLUDE
SRCADDR: NONE
224.0.0.1 0000000001 EXCLUDE
SRCADDR: NONE
LINK STATISTICS:
BYTESIN = 196650
INBOUND PACKETS = 1647
INBOUND PACKETS IN ERROR = 0
INBOUND PACKETS DISCARDED = 0
INBOUND PACKETS WITH NO PROTOCOL = 0
BYTESOUT = 82841
OUTBOUND PACKETS = 670
OUTBOUND PACKETS IN ERROR = 0
OUTBOUND PACKETS DISCARDED = 0
Because the device driver resources are provided by VTAM, you can display the resources by
using VTAM display commands.
For TRLEs that are generated dynamically, the device type and address can be decoded from
the generated TRLE name. The format of the TRLE name is IUTtaaaa:
IUT Fixed for all TRLEs that are generated dynamically.
t Shows the device type, which indicates the following information:
C CDLC device
H HYPERCHANNEL device
I QDIO device
L LCS device
S SAMEHOST device
W CLAW device
X CTC device
aaaa The read device number. For SAMEHOST connections, this is a sequence
number.
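The IUTtaaaa naming rule can be illustrated with a short Python sketch. It is not part of z/OS Communications Server. The function name decode_trle_name, the returned dictionary layout, and the sample name in the usage note are assumptions made for this illustration only.

```python
# Device-type letters exactly as listed in the TRLE name format above.
TRLE_DEVICE_TYPES = {
    "C": "CDLC",
    "H": "HYPERCHANNEL",
    "I": "QDIO",
    "L": "LCS",
    "S": "SAMEHOST",
    "W": "CLAW",
    "X": "CTC",
}


def decode_trle_name(name):
    """Split a dynamically generated TRLE name (IUTtaaaa) into its parts.

    Hypothetical helper for illustration; it only applies the naming rule.
    """
    name = name.strip().upper()
    if len(name) != 8 or not name.startswith("IUT"):
        raise ValueError("not in IUTtaaaa format: " + name)
    type_letter = name[3]
    if type_letter not in TRLE_DEVICE_TYPES:
        raise ValueError("unknown device type letter: " + type_letter)
    return {
        "device_type": TRLE_DEVICE_TYPES[type_letter],
        # Read device number, or a sequence number for SAMEHOST connections.
        "device_or_sequence": name[4:],
    }
```

For a hypothetical name such as IUTX0D20, the sketch reports a CTC device with read device number 0D20. Names that do not follow the IUTtaaaa pattern are rejected rather than guessed at.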
To display a list of all TRLEs active in VTAM, run the D NET,TRL command, as shown in
Example 4-44.
The D NET,TRL,TRLE command that is used to obtain information about a HiperSockets device
is shown in Example 4-45.
After DYNAMICXCF is defined, it provides connectivity between stacks under the same LPAR by
using the IUTSAMEH device (SAMEHOST) and between LPARs through HiperSockets that
use an IUTIQDIO device. To connect other z/OS images or other servers, an XCF coupling
facility link is created.
4.7.1 Dependencies
The dependencies are as follows:
All z/OS hosts must belong to the same sysplex.
VTAM must have XCF communications that are enabled by specifying XCFINIT=YES or
XCFINIT=DEFINE as a startup parameter or by activating the VTAM XCF local SNA major
node, ISTLSXCF. For details about configuration, see z/OS Communications Server: SNA
Network Implementation, SC31-8777.
DYNAMICXCF must be specified in the TCP/IP profile of each stack.
The IQD CHPID that is used for the DYNAMICXCF device cannot be the user-defined
HiperSockets device (IQD CHPID). To avoid this, a VTAM start option, IQDCHPID, can be
used to identify which IQD CHPID is used by DYNAMICXCF.
4.7.2 Considerations
z/OS Communications Server improved and optimized Sysplex IP routing. In a sysplex
environment, you might prefer to use a connection other than a coupling facility link for
cross-server connectivity because XCF is heavily used by other workloads (in particular, for
distributed application data sharing).
This option can be configured with the VIPAROUTE statement in the VIPADYNAMIC statement. It
allows for the use of OSA-Express features, such as 1000BASE-T Ethernet, Gigabit Ethernet,
and 10-Gigabit Ethernet. For details, see IBM z/OS V2R2 Communications Server TCP/IP
Implementation Volume 3: High Availability, Scalability, and Performance, SG24-8362.
Figure 4-24 on page 207 shows the DynamicXCF implementation in our environment by
using HiperSockets CHPID F7.
When you use dynamic XCF for sysplex configuration, make sure that XCFINIT=YES or
XCFINIT=DEFINE is coded in the VTAM start options.
If XCFINIT=NO was specified, run the VARY ACTIVATE command for the ISTLSXCF major node.
This ensures that XCF connections between TCP stacks on separate VTAM nodes in the
sysplex can be established.
The VTAM ISTLSXCF major node must be active for DYNAMICXCF to work, except for the following
scenarios:
Multiple TCP/IP stacks on the same LPAR. A dynamic SAMEHOST definition is generated
whether or not ISTLSXCF is active.
HiperSockets is configured and enabled across multiple z/OS LPARs that are in the same
sysplex and the same server. If this is the case, a dynamic IUTIQDIO link is created
whether or not ISTLSXCF is active.
Note: The link name for device IUTIQDIO is defined dynamically as IQDIOLNK0A01070B.
In the link name, 0A01070B is the hexadecimal value of the assigned IP address
(10.1.7.11).
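The relationship between the assigned IP address and the generated link name can be checked with a few lines of Python. This is a sketch only. The function name dynamic_iqdio_link_name is an assumption, and the code reproduces the naming pattern that is described in the note rather than any stack internals.

```python
import ipaddress


def dynamic_iqdio_link_name(ipv4_address):
    """Return IQDIOLNK followed by the IPv4 address as 8 hexadecimal digits.

    Hypothetical helper that mirrors the naming pattern in the note above.
    """
    packed = ipaddress.IPv4Address(ipv4_address).packed  # 4 bytes, network order
    return "IQDIOLNK" + packed.hex().upper()


print(dynamic_iqdio_link_name("10.1.7.11"))  # IQDIOLNK0A01070B
```

Each octet becomes two hexadecimal digits (10 is 0A, 1 is 01, 7 is 07, and 11 is 0B), which yields the name that is shown in the note.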
Because the device driver resources are provided by VTAM, you can display the resources by
using VTAM display commands.
For TRLEs that are generated dynamically, the device type and address can be decoded from
the generated TRLE name. The format of the TRLE name is IUTtaaaa:
IUT Fixed for all TRLEs that are generated dynamically.
t Shows the device type, which indicates the following information:
C Indicates this is a CDLC device.
H Indicates this is a HYPERCHANNEL device.
I Indicates this is a QDIO device.
L Indicates this is an LCS device.
S Indicates this is a SAMEHOST device.
W Indicates this is a CLAW device.
X Indicates this is a CTC device.
aaaa The read device number. For SAMEHOST connections, this is a sequence
number.
For XCF links, the format of the TRLE name is ISTTxxyy. ISTT is fixed, xx is the SYSCLONE
value of the originating VTAM, and yy is the SYSCLONE value of the destination VTAM.
To display a list of all TRLEs active in VTAM, run the D NET,TRL command, as shown in
Example 4-48.
You can also display XCF TRLE-specific information, as shown in Example 4-50.
The DYNAMICXCF configuration created a HiperSockets TRLE named IUTIQDIO. The related
TRLE status can also be displayed, as shown in Example 4-51.
The DYNAMICXCF configuration created a SAMEHOST TRLE named IUTSAMEH. The related
TRLE status can be displayed, as shown in Example 4-52 on page 213.
The DYNAMICXCF statement dynamically generates the DEVICE, LINK, and HOME statements. It
also starts the device when the TCP/IP stack is activated, as the messages in Example 4-53
show.
Using any of the starting methods results in a series of messages, as shown in Example 4-55.
When you stop a device, messages are displayed, as shown in Example 4-56.
Authorization to use this command is through the user’s RACF profile. The datasetname
variable cannot be a z/OS UNIX file system file. The data set contains the modified TCP/IP
configuration statements. See Example 4-57.
;BSDROUTINGPARMS TRUE
; Link name MTU Cost metric Subnet Mask Dest address
;OSA2080L 1492 0 255.255.255.0 0
;ENDBSDROUTINGPARMS
BSDROUTINGPARMS TRUE
; Link name MTU Cost metric Subnet Mask Dest address
OSA2080L 1024 0 255.255.255.0 0
ENDBSDROUTINGPARMS
Important: Dynamic XCF cannot be changed by using the OBEYFILE command. If you want
to change the IPCONFIG DYNAMICXCF parameters, stop TCP/IP, code a new IPCONFIG
DYNAMICXCF statement in the initial profile, and restart TCP/IP.
When you run a ping command, you might receive any of the responses that are listed in
Table 4-5. For more details about running the ping command, see 9.4.1, “The ping command
(TSO or z/OS UNIX)” on page 355.
Command: ping 10.1.2.11 (intf osa2080l)
Response: Pinging host 10.1.2.11
sendMessage(): EDC8130I Host cannot be reached.
Explanation: The interface being tested has a problem. Run the netstat command to verify
the interface status.
Command: ping 10.1.2.11 (intf osa2080l)
Response: Pinging host 10.1.2.11
Ping #1 timed out.
Explanation: The ICMP packet was sent to the network, but the destination address is either
invalid or it cannot answer. Correct the destination address or verify the destination host
status. This problem should be verified in the network.
Command: ping 10.1.2.11 (intf osa2080l)
Response: Pinging host 10.1.2.11
Ping #1 response took 0.000 seconds.
Explanation: This is the expected response. The interface is working.
netstat DEVLINKS/-d
Displays the status of each interface, physical and logical, that is defined in the TCP/IP
stack, as illustrated in Example 4-59 (only one interface is shown as a sample).
These commands can help you discover connectivity problems. If they do not, the next step in
debugging a direct-attached network problem is to gather documentation that shows more
detailed information about traffic problems that are related to the interface and network.
To get this detailed information, the z/OS Communications Server typically uses the
component trace to capture event data and save it to an internal buffer, or writes the internal
buffer to an external writer, if requested. You can later format these trace records by using the
Interactive Problem Control System (IPCS) subcommand CTRACE.
To debug a network connectivity problem, you can use the component trace with either of the
two specific components, as follows:
SYSTCPIP component trace with the following options:
– VTAM, which shows all of the non-data-path signaling occurring between the devices
and VTAM
– VTAMDATA, which shows data-path signaling between the devices and VTAM,
including a snapshot of media headers and some data
Chapter 5. Routing
One of the major functions of a network protocol such as TCP/IP is to efficiently interconnect
several disparate networks. These networks can include LANs and WANs, fast and slow,
reliable and unreliable, and inexpensive and expensive connections.
To interconnect these networks, some level of intelligence is needed at the boundaries to look
at the data packets as they pass, and make rational decisions as to where and how they
should be forwarded. This is known as IP routing. This chapter looks at the various types of
IP routing that are supported in a z/OS Communications Server environment.
This chapter covers the topics that are shown in Table 5-1.
5.4, “Implementing static routing in z/OS” on page 245
The implementation tasks and configuration examples for static routing
5.5, “Implementing OSPF routing in z/OS with OMPROUTE” on page 252
The implementation tasks and configuration examples for Open Shortest Path First (OSPF)
dynamic routing
Bridging is often compared with routing, which might seem to accomplish precisely the same
goal. However, consider the primary differences between these functions:
Bridging occurs at Layer 2 (the data link control (DLC) layer) of the OSI reference model.
Routing occurs at Layer 3 (the network layer).
This distinction provides bridging and routing with different information to use while moving
information from source to destination, so the two functions accomplish their tasks in different
ways.
5.1.1 Terminology
To help you understand concepts, Table 5-2 lists several common IP routing terms. Most
functions or protocols that are listed are supported by z/OS Communications Server.
Routing
The process that is used in an IP network to deliver a datagram to the correct
destination.
Routing daemon
A server process that manages the IP routing table. OMPROUTE is the z/OS
Communications Server component that acts as the routing daemon.
Dynamic routing
Routing that is dynamically managed by a routing daemon and automatically
changes in response to network topology changes.
Static routing
Routing that is manually configured and does not change automatically in response
to network topology changes.
Autonomous system (AS)
A group of routers exchanging routing information through a common routing
protocol. A single AS can represent many IP networks.
Router
A device or host that interprets protocols at the Internet Protocol (IP) layer and
forwards datagrams on a path toward their correct destination.
Gateway
A router that is placed between networks or subnetworks. The term is used to
represent routers between ASs.
Interior gateway protocols (IGP)
Dynamic route update protocols that are used between dynamic routers running on
TCP/IP hosts within a single AS.
Exterior gateway protocols (EGP)
Dynamic route update protocols that are used between routers that are placed
between two or more ASs.
To route packets in the network, each network interface must have a unique IP address
assigned. Whenever a packet is sent, the destination and source IP addresses are included in
the packet’s header information. The network layer (Layer 3) of the TCP/IP stack examines
the destination IP address to determine how the packet should be forwarded. The packet is
either sent to its destination on the same network (direct routing) or, based on a routing table
entry, to another network by using a router (indirect routing).
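Figure: Sample network in which Host A (10.1.1.101) and Host B (10.1.1.102) are attached to
network 10.1.1.0/24 together with Router A (10.1.1.1) and Router B (10.1.1.2). The other
networks shown are 192.168.1.0/24, 172.16.1.0/24, and 192.168.2.0/24, which contains
Host E (192.168.2.105).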
This example has hosts and routers in multiple networks, and to achieve connectivity between
these hosts, the routers are connected to multiple networks, creating a path between them.
127.0.0.1 Loopback
The routing table contains routes to various routers in this network. When host A has an IP
datagram to forward, it determines which IP address to forward it to by using the IP routing
algorithm and the routing table.
Note: The suffix of /24 represents the length of the subnet mask (a 24-bit mask, in this case).
Because Host A is directly attached to network 10.1.1.0/24, it maintains a direct route to this
network. To reach other networks such as 192.168.1.0/24 and 172.16.1.0/24, it must have
an indirect route through router A and router B respectively because these networks are not
directly attached to it. Another option is to define a default route. If the indirect route to the
network is not defined explicitly, the default route is used.
In this example, Host A reaches Host B by using the direct route. To reach Host C
(192.168.1.103), it uses the indirect route to 192.168.1.0/24 and forwards the packet to
Router A (10.1.1.1).
Likewise, to reach Host D, it uses the indirect route to 172.16.1.0/24 and forwards the packet
to Router B (10.1.1.2). The indirect route to Host E (192.168.2.105) is not explicitly defined
in Host A, so the default route is used and Host A forwards the packet to Router A
(10.1.1.1).
To reach any given IP network address, each host or router in the network needs to know only
the next hop’s IP address and not the full network topology.
If two or more indirect routes are defined for the same destination, the route selection
depends on the implementation of the routers or hosts. Some implementations always use
the top entry in the list, and some use all routes to distribute the packets. In
some cases, the behavior is configurable through parameters that the implementation provides.
If two or more indirect routes are defined for the same destination but with different subnet
mask lengths, the route with the longest mask length is selected. This method is called the
longest match.
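The route selection for Host A that is described above, including the longest match rule and the fallback to the default route, can be sketched in Python. The table layout, the name select_route, and the Host D address in the usage note are assumptions made for this illustration. The prefixes and next hops are the ones from the example network.

```python
import ipaddress

# Host A's routes as described in the text. A next hop of None marks a
# direct route. The list-of-tuples layout is an assumption for this sketch.
HOST_A_ROUTES = [
    ("10.1.1.0/24", None),            # direct route to the attached network
    ("192.168.1.0/24", "10.1.1.1"),   # indirect route through Router A
    ("172.16.1.0/24", "10.1.1.2"),    # indirect route through Router B
    ("0.0.0.0/0", "10.1.1.1"),        # default route through Router A
]


def select_route(destination, routes):
    """Return (prefix, next_hop) of the matching route with the longest mask."""
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            # Keep the match with the longest subnet mask seen so far.
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, next_hop)
    if best is None:
        raise LookupError("no route to " + destination)
    return str(best[0]), best[1]


print(select_route("10.1.1.102", HOST_A_ROUTES))     # direct route
print(select_route("192.168.1.103", HOST_A_ROUTES))  # through Router A
print(select_route("192.168.2.105", HOST_A_ROUTES))  # default route
```

Because the default route 0.0.0.0/0 has a mask length of zero, it matches every destination but is chosen only when no more specific route exists. This is why Host E is reached through Router A in the example.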
Static routing
Static routing requires you to manually configure the routing tables. This task is part of the
configuration steps you follow when customizing TCP/IP. It implies that you know the address
of every network you want to communicate with and how to get there. You must know the
address of the first router on the way.
The task of statically defining all necessary routes can be simple for a small network. It offers
the advantage of avoiding the network traffic processing impact of a dynamic route update
protocol. It also allows you to enforce rigid control of the allocation of addresses and resource
access. However, it requires manual reconfiguration if you move or add a resource.
Another disadvantage of static routing is that, even if a network failure occurs in the
intermediate path to the destination, the routing table remains unchanged and packets
continue to be sent to the statically defined next hop router. This can make the destination
network unreachable. Also, if you define an incorrect next hop router in a route
entry, packets continue to be forwarded by using that entry. Even if there is a better
route, the next hop router does not change until the static route entry is changed.
If your network environment is small and manageable, with few to no network changes
anticipated, then using static routes is an option (keeping in mind that your z/OS system is
basically an application server environment). A preferred practice is to define only the default
gateways to the exterior networks, and let the routers do the exterior routing. You can
implement static routing between the z/OS system and the external router, and still let the
external routers use a dynamic routing protocol to exchange route information.
If your routing tables are complex because of network growth, or if the system must act as a
gateway, it is far easier to let the system do the work for you by using dynamic routing.
The drawback of dynamic routing is the burden of route information exchange. There are
some configuration techniques that you can use to reduce this burden, as explained in 5.2,
“Routing in the z/OS environment” on page 230.
The administrator must assess the importance of each of these requirements when
determining the appropriate routing protocol for an environment.
Static routing can be combined with dynamic routing by using the OMPROUTE routing
daemon. If the ROUTE statement in the BEGINROUTES statement block is coded with
NOREPLACEABLE, then the static route is always preferred over the dynamically learned route for
the same destination with the same subnet mask length.
If two or more routes to the same destination with the same subnet mask length are defined in the
z/OS Communications Server routing table, then the TCP/IP stack always uses the first active
entry, by default. If you specify an IPCONFIG MULTIPATH statement in the TCP/IP profile, all
routes for the same destination are used per connection or per packet, depending on which
option you specify for MULTIPATH.
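The difference between the default behavior and multipath behavior can be pictured with the following Python sketch. It is a conceptual model only. The route dictionaries, the function names, the sample next hop addresses, and the simple rotation across active routes are assumptions, and the sketch does not claim to reproduce how the stack distributes traffic internally.

```python
import itertools


def first_active_next_hop(routes):
    """Default behavior described above: use the first active entry."""
    for route in routes:
        if route["active"]:
            return route["next_hop"]
    raise LookupError("no active route")


def multipath_next_hops(routes):
    """Return an endless iterator that rotates across all active routes.

    The rotation policy is an assumption made for this illustration.
    """
    active = [route["next_hop"] for route in routes if route["active"]]
    if not active:
        raise LookupError("no active route")
    return itertools.cycle(active)


# Hypothetical equal-mask routes to one destination.
routes = [
    {"next_hop": "10.1.2.240", "active": False},
    {"next_hop": "10.1.2.230", "active": True},
    {"next_hop": "10.1.3.240", "active": True},
]
print(first_active_next_hop(routes))
chooser = multipath_next_hops(routes)
print([next(chooser) for _ in range(3)])
```

In this model, calling next() once per new connection corresponds to per-connection distribution, and calling it once per outbound packet corresponds to per-packet distribution.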
For IPv4, OMPROUTE implements the OSPF protocol that is described in RFC 1583 (OSPF
version 2), the OSPF subagent protocol that is described in RFC 1850 (OSPF version 2
Management Information Base), and the RIP protocols that are described in RFC 1058
(Routing Information Protocol) and in RFC 1723 (RIP V2 - Carrying Additional Information).
For IPv6, OMPROUTE implements the IPv6 RIP protocol that is described in RFC 2080
(RIPng for IPv6) and the IPv6 OSPF protocol that is described in RFC 2740 (OSPF for IPv6).
OMPROUTE does not use the BSDROUTINGPARMS statement. Instead, its parameters are
defined in the OMPROUTE configuration file. The OMPROUTE configuration file is used to
define both OSPF and RIP environments.
Note: If the INTERFACE statement is used in the TCP/IP stack to define an interface, the
subnet mask and MTU that are coded in OMPROUTE must agree with the stack definition, or
OMPROUTE issues an error message and uses the values that you configured in OMPROUTE.
For IPv4, the OSPF and RIP protocols are communicated over interfaces that are defined
with the OSPF_INTERFACE and RIP_INTERFACE configuration statements. Interfaces that are not
involved in the communication of the RIP or OSPF protocol are configured with the INTERFACE
configuration statement (unless it is a non-point-to-point interface and all default values that
are specified on the INTERFACE statement are acceptable).
If both OSPF and RIP protocols are used in an OMPROUTE environment, then OSPF takes
precedence over RIP. OSPF routes are preferred over RIP routes to the same destination.
OMPROUTE does not replace a NOREPLACEABLE static route, even if it detects a dynamic
route to the same destination, so the TCP/IP stack continues to use the NOREPLACEABLE
static route to forward packets. OMPROUTE replaces a REPLACEABLE static route if it detects
a dynamic route to the same destination. A REPLACEABLE static route therefore serves as a
route of last resort that is used only while OMPROUTE has not detected a dynamic route to
the destination.
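The precedence rules in this section can be summarized as a ranking. The following Python sketch encodes that ranking for routes to the same destination with the same subnet mask length. The source labels and the function name preferred_source are assumptions, and the sketch models only the order that is described in the text, not OMPROUTE itself.

```python
# Lower rank wins. The order follows the text: a NOREPLACEABLE static route
# is never replaced, OSPF is preferred over RIP, and a REPLACEABLE static
# route is used only when no dynamic route is known.
ROUTE_PREFERENCE = {
    "static_noreplaceable": 0,
    "ospf": 1,
    "rip": 2,
    "static_replaceable": 3,
}


def preferred_source(candidates):
    """Return the route source that wins among the candidate sources.

    Raises ValueError if candidates is empty and KeyError for an unknown
    source label.
    """
    return min(candidates, key=lambda source: ROUTE_PREFERENCE[source])


print(preferred_source(["rip", "ospf"]))                # ospf
print(preferred_source(["static_replaceable", "rip"]))  # rip
print(preferred_source(["ospf", "static_noreplaceable"]))
```

A table such as this one is a convenient checklist when you decide whether to code a static route as REPLACEABLE or NOREPLACEABLE.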
Also, take care to ensure that the z/OS Communications Server host is not overly burdened
with routing work. Unlike routers or other network boxes whose sole purpose is routing, a
z/OS Communications Server application host does many things other than routing, and it
is undesirable for a large percentage of machine resources (memory and CPU) to be used
for routing tasks, as can happen in complex or unstable networks.
The most common and preferred way to use dynamic routing in the z/OS environment is to
define the stack as an OSPF Stub Area or, even better, as a Totally Stubby Area. Stub and
Totally Stubby Areas minimize the amount of routing work that z/OS must perform.
Given the need for a responsive OMPROUTE node, a storage shortage in the node can lead
to lost connectivity in the network. For example, OMPROUTE might exit if the stack cannot
allocate storage for OMPROUTE dispatchable unit control blocks or for sending routing
updates to neighbor routers. Messages that advise you about storage shortages are shown in
Example 5-1.
Proper design of the dynamic routing environment can eliminate or reduce the likelihood of
storage shortages that affect OMPROUTE. For example, the most common and preferred
way to use dynamic routing in the z/OS environment is to define the stack as an OSPF Stub
Area or, even better, as a Totally Stubby Area.
Stub Areas minimize storage and CPU processing at the nodes that are part of the Stub Area
because they maintain less knowledge about the topology of the AS than do other types of
non-backbone routers. They maintain knowledge only of intra-area destinations and
summaries of inter-area destinations and default routes within the AS to reach external
destinations.
A Totally Stubby Area receives even less routing information than a Stub Area. It knows of
only intra-area destinations and default routes within the Stub Area to reach external
destinations. Thus, its storage and CPU processing requirements are even less than what is
required for a Stub Area.
The TCP/IP stack ensures that there are always control blocks available for dispatchable units
doing work for OMPROUTE. In addition, the stack satisfies requests for stack storage that is
made on behalf of OMPROUTE while storage remains available. Requests made on behalf of
other applications are not satisfied during a storage shortage.
These actions temporarily keep OMPROUTE from deleting routes during a storage shortage
when OMPROUTE fails to receive the usual periodic routing updates from neighboring
routers. In addition, they decrease the likelihood that OMPROUTE exits, times out routes, or
fails to send routing updates to neighbor routers during a storage shortage. This temporary
reprieve lasts for 5 minutes, at which time OMPROUTE automatically resumes the
requirement for periodic routing table updates.
With RIP routes, you might discover that OMPROUTE is responding to the shortage event
when several route displays reveal that the age of RIP routes ceases to increase. See an
example of such a display at 2 in Example 5-4. Several iterations of the OMPROUTE command
showed that the age of the route never increased beyond 10.
A trace of OMPROUTE activity by using a trace level of -t2 and a debug level of -d1 also
provides information about OMPROUTE’s automatic tolerance of a storage shortage
condition. Messages that are shown in Example 5-5 advise you that OMPROUTE is reacting
as designed to a storage shortage. In the example, the value of the type field can be begin or
end and the ip_version field can be IPv4 or IPv6.
Example 5-5 OMPROUTE trace messages for toleration of storage shortage
EZZ8166I Received type storage shortage notification for ip_version
EZZ8167I OSPF dead router checking is resumed for ip_version
EZZ8168I OSPF dead router checking is suspended for ip_version
EZZ8169I RIP route aging is resumed for ip_version
EZZ8170I RIP route aging is suspended for ip_version
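When you review a long trace, it can help to reduce these messages to the functions that are currently suspended. The following Python sketch does that for messages EZZ8167I through EZZ8170I. The function name, the state keys, and the assumption that each trace line contains the message ID followed by the message text as shown in Example 5-5 are illustrative only.

```python
import re

# Message ID -> (function affected, True if the message reports suspension).
_MESSAGES = {
    "EZZ8167I": ("ospf_dead_router_checking", False),
    "EZZ8168I": ("ospf_dead_router_checking", True),
    "EZZ8169I": ("rip_route_aging", False),
    "EZZ8170I": ("rip_route_aging", True),
}
_PATTERN = re.compile(r"\b(EZZ81(?:67|68|69|70)I)\b.*\bfor (IPv4|IPv6)\b")


def suspended_functions(trace_lines):
    """Return the set of (function, ip_version) pairs that are still suspended."""
    state = {}
    for line in trace_lines:
        match = _PATTERN.search(line)
        if match:
            function, suspended = _MESSAGES[match.group(1)]
            # A later message for the same function and IP version wins.
            state[(function, match.group(2))] = suspended
    return {key for key, suspended in state.items() if suspended}
```

For example, if a trace shows EZZ8168I and EZZ8170I for IPv4 followed only by EZZ8169I for IPv4, the sketch reports that OSPF dead router checking for IPv4 is still suspended.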
Policy-based routing (PBR) determines the destination based on the defined policy. Traffic
descriptors such as TCP/UDP port numbers, application name, and source IP addresses can
be used to define the policy to enable the optimized route selection.
PBR can use both static routes and dynamic routes, which are obtained with the OMPROUTE
routing daemon.
For more information about PBR, see IBM z/OS V2R2 Communications Server TCP/IP
Implementation Volume 4: Security and Policy-Based Networking, SG24-8363.
The OSPF protocol is based on link-state or shortest path first technology. OSPF routing
tables contain details of the connections between routers, their status (active or inactive),
their cost (desirability for routing), and so on.
Updates are broadcast when a link changes status, and consist merely of a description of the
changed status. OSPF can divide its network into topology subsections, which are known as
areas, within which broadcasts are confined. OSPF is designed for the TCP/IP internet
environment. In Communications Server for z/OS IP, OSPF is configured by using the UNIX
daemon OMPROUTE.
OSPF terminology
Several of the common IP routing-related terms and concepts that are used in OSPF are as
follows:
Router ID
This is a 32-bit number that is allocated to each router in the OSPF network protocol. This
number is unique in the AS. It represents the IP address of an interface that is defined on
the OSPF node.
For the z/OS implementation of the Router ID in OSPF, use a static VIPA address. Do not
use a Dynamic VIPA as the Router ID because the movement of the Router ID causes
confusion in the OSPF routing protocol exchanges.
Areas
OSPF networks can be divided into areas. An area consists of networks and routers that
are logically grouped. All routers within an area maintain the same topology database.
All OSPF networks consist of at least one area, typically the backbone area. If you define
more than one area, one of the areas must be the backbone area and the other area or
areas are defined as non-backbone areas.
Backbone area
All OSPF networks should have a backbone area. The area identifier of the backbone area
is always 0.0.0.0. The backbone area is special in that it distributes routing information to
all areas connected to it.
Area border routers
These routers connect two or more areas. The area border router maintains a topology
database of each area to which it is attached. All area border routers must have at least
one interface in the backbone area. A virtual link can be used to satisfy this requirement.
AS boundary routers
These routers connect the OSPF internetwork and exchange reachability information with
other routers in other ASs. They can use the EGPs. The AS boundary routers are used to
import static routes and RIP routes into the OSPF network (and vice versa).
Virtual link
This logical link connects an area that does not have a physical link to a backbone area.
The link is treated as a point-to-point link.
For example, you might integrate a mainframe network running OSPF with a router
network running Enhanced Interior Gateway Routing Protocol (EIGRP) to take
advantage of the filtering capabilities of EIGRP, thus reducing the amount of protocol
traffic between the OSPF network and the EIGRP network.
Designated router
A designated router (DR) is a router on a shared multi-access medium such as a LAN or
ATM network. A DR performs most of the OSPF protocol activities for that network, such
as synchronizing database information and informing members of the broadcast network
of changes to the network. The DR must be adjacent to all other routers on the broadcast
medium. Every network or subnetwork on a broadcast network must have a DR and
preferably a backup designated router (BDR).
Note: Define non-z/OS routers that are attached to z/OS OSPF LAN broadcast
networks as the DRs. z/OS CPU utilization is reduced if a non-z/OS router performs the
work of the DR.
There is one exception to this rule when dealing with a HiperSockets network. A
HiperSockets network is also a broadcast network; however, only z/OS, z/VM, or Linux
on z Systems nodes participate in a HiperSockets network. Therefore, at least some
nodes inside the mainframe must be a DR on a HiperSockets LAN.
Complications can occur if the z/OS node is the DR on a LAN network when parallel
interfaces into the LAN over a shared OSA exist. Shared OSAs can route over the
shared OSA port without entering the network.
If the packet arrives over the backup interface instead of the primary parallel interface,
the recipient discards the packet. The databases at the nodes become corrupted
because of missing information, and lost adjacencies can result.
Therefore, do not allow z/OS nodes with parallel interfaces and shared LANs to be the
DR. If a z/OS node must be the DR, it should be connected to the broadcast medium
through a non-shared OSA port.
Each area has its own topology and has a gateway that connects it to the rest of the network.
It dynamically detects and establishes contacts with its neighboring routers by periodically
sending Hello packets.
Link-state database
The link-state database is a collection of OSPF LSAs. OSPF, being a dynamic IP routing
protocol, does not need to have routes that are defined to it. It dynamically discovers all the
routes and the attached routers through the OSPF Hello protocol, which transmits Hello
packets to all of its router neighbors to establish connections. After the neighbors are
discovered, the connections are made.
After the Hello protocol concludes that all the connections are established, the link state
databases are synchronized. This exchange is performed starting with the most recently
updated LSAs. The link state databases are synchronized until all router LSAs in the network
(within an area) have the same information. The link state protocol maintains loop-free
routing because of the synchronization of the link state databases.
RIP uses a hop count (distance vector) to determine the best possible route to a network or
host. The hop count is also known as the routing metric, or the cost of the route. A router is
defined as being zero hops away from its directly connected networks, one hop away from
networks that can be reached through one gateway, and so on. The fewer hops, the better.
The route that has the fewest hops is the preferred path to a destination. A hop count of 16
means infinity, or that the destination cannot be reached. Thus, large networks with more than
15 hops between potential partners cannot use RIP.
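The hop-count rule can be sketched in a few lines of Python. This is a minimal illustration and not OMPROUTE code: the function name merge_advertisement, the table layout, and the sample addresses are hypothetical. Only the metric rule (add one hop per gateway, prefer the fewest hops, treat 16 as unreachable) comes from the text above.

```python
INFINITY = 16  # a hop count of 16 means the destination cannot be reached


def merge_advertisement(table, neighbor, advertised):
    """Merge one neighbor's advertised routes into a local routing table.

    table maps destination -> (hop_count, next_hop); advertised maps
    destination -> hop count as seen by the neighbor. Hypothetical helper
    for illustration only.
    """
    for dest, hops in advertised.items():
        # Reaching dest through the neighbor costs one more hop, capped at 16.
        new_hops = min(hops + 1, INFINITY)
        current_hops, current_next = table.get(dest, (INFINITY, None))
        if new_hops < current_hops:
            # Fewer hops: prefer the new path.
            table[dest] = (new_hops, neighbor)
        elif current_next == neighbor and new_hops != current_hops:
            # The current next hop reports a changed metric: accept it.
            table[dest] = (new_hops, neighbor)
    return table


table = {"10.1.2.0/24": (0, None)}  # directly connected: zero hops
merge_advertisement(table, "10.1.2.240", {"10.1.9.0/24": 0, "10.1.99.0/24": 15})
print(table)
```

In this sketch the network that the neighbor reports at 15 hops would be 16 hops away locally, so it is never installed, which is the 15-hop limit in practice.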
RIP V1
RIP is a protocol that manages IP routing table entries dynamically. The gateways that use
RIP exchange their routing information to allow the neighbors to learn of topology changes.
The RIP server updates the local routing tables dynamically, resulting in current and accurate
routing tables. The protocol is based on the exchange of protocol data units (PDUs) between
RIP servers (such as OMPROUTE). Although various types of PDUs exist, the following two
are most important:
REQUEST PDU
This PDU is sent from a RIP server as a request to other RIP servers to transmit their
routing tables immediately.
RESPONSE PDU
This PDU is sent from a RIP server to other RIP servers either as a response to a
REQUEST PDU or as a result of expiration of the broadcast timer (every 30 seconds).
RIP V1 limitations
Because RIP is designed for a specific network environment, it has several limitations, as
described here. Consider the following limitations before implementing RIP in your network:
RIP V1 declares a route invalid if it passes through 16 or more gateways. Therefore,
RIP V1 places a limitation of 15 hops on the size of a large network.
RIP V1 compares alternative routes by using fixed metrics rather than actual parameters, such
as measured delay, reliability, and load. This means that the number of hops is the only
parameter that differentiates a preferred route from non-preferred routes.
The routing tables can take a relatively long time to converge or stabilize.
RIP V1 does not support variable subnet masks or variable subnetting because it does not
pass the subnet mask in its routing advertisements. Variable subnet masking refers to the
capability of assigning different subnet masks to interfaces that belong to the same Class
A, B, or C network.
RIP V2
Rather than being another protocol, RIP V2 is an extension to the functions that are provided
by RIP V1. To use these new functions, RIP V2 routers exchange the same RIP V1
messages. The version field in the message specifies version number 2 for RIP messages
that use authentication or carry information in any of the newly defined fields.
Depending on the configuration in the adjacent routers, the following types of routes can be
learned from the received router advertisements:
Default route, for which the originator of the router advertisement is the next hop
Direct routes (no next hop) to prefixes that are on the link that is shared by the z/OS
Communications Server and the originator of the router advertisement
The z/OS host running with OMPROUTE becomes an active OSPF or RIP router in a TCP/IP
network. Either or both of these routing protocols can be used to dynamically maintain the
host IPv6 routing table. For example, OMPROUTE can detect when a route is created, is
temporarily unavailable, or if a more efficient route exists. If both IPv6 OSPF and IPv6 RIP
protocols are used simultaneously, then IPv6 OSPF routes are preferred over IPv6 RIP routes
to the same destination.
RIPng or RIP V2
RIP Next Generation (RIPng) is a distance vector routing protocol for IPv6 that is defined in
RFC 2080. RIPng for IPv6 is an adaptation of the RIP V2 protocol to advertise IPv6 network
prefixes. RIPng for IPv6 uses UDP port 521 to periodically advertise its routes, respond to
requests for routes, and advertise route changes.
RIPng for IPv6, like other distance vector protocols, has a maximum distance of 15, in which
15 is the accumulated cost (hop count). Locations that are a distance of 16 or further are
considered unreachable. RIPng for IPv6 is a simple routing protocol with a periodic
route-advertising mechanism that is designed for use in small to medium-sized IPv6
networks. RIPng for IPv6 does not scale well to a large or very large IPv6 network.
IPv6 OSPF is classified as an IGP. This means that it distributes routing information between
routers belonging to a single AS, which is a group of routers all using a common routing
protocol. The IPv6 OSPF protocol is based on link-state or shortest path first (SPF)
technology.
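As a rough illustration of the shortest path first idea, the following Python sketch runs Dijkstra's algorithm over a toy link-state database. The router names and link costs are invented for the example, and the dictionary layout is only an assumed stand-in for real LSAs; it is not how OMPROUTE stores its database.

```python
import heapq


def shortest_paths(lsdb, root):
    """Return the lowest total cost from root to every reachable router.

    lsdb maps router -> {neighbor: link cost}, a toy stand-in for the
    link-state database that every router in an area shares.
    """
    cost = {root: 0}
    queue = [(0, root)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > cost.get(node, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for neighbor, link_cost in lsdb.get(node, {}).items():
            candidate = dist + link_cost
            if candidate < cost.get(neighbor, float("inf")):
                cost[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return cost


# Invented topology: router names and costs are illustrative only.
lsdb = {
    "A": {"B": 10, "C": 1},
    "B": {"A": 10, "D": 1},
    "C": {"A": 1, "D": 5},
    "D": {"B": 1, "C": 5},
}
print(shortest_paths(lsdb, "A"))
```

Because every router in the area runs the same computation over the same synchronized database, each router arrives at a consistent view of the best paths.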
At a glance, the OSPF implementation for IPv6 is basically the same as it is for IPv4, except
for the following primary differences.
New LSA types are added (to carry addressing and link-local information). Because IP
addressing is removed from certain basic LSA types, new LSA types are provided to
communicate IP addresses, which routers then correlate to topology information in other LSA
types.
The Concept of Flooding Scope is added (scopes are link, area, and AS). It indicates how far
an advertisement can be flooded. For example, link scope means that an LSA can be flooded
only on the originating link.
Support for Unknown LSA types is added (this makes the protocol more extensible).
Unknown LSA types can be ignored, or they can be stored and forwarded by the router,
depending on the settings of bits in the LSA type field. This vastly improves interoperability
between routers running separate versions of the protocol. For example, a DR can
conceivably have a lower level of support than another router on the same link; because the
DR floods on behalf of the other routers on the link, it can store and forward unknown LSA
types that are received from its peers.
Multiple OSPF instances are supported on a link. An instance ID field is added to OSPF
headers, and OSPF processes only process packets whose instance ID matches their own.
This opens the possibility of one link belonging to completely different ASs.
[Figure: static routing environment. TCPIPA on SC30 (profile PROFAS30, static routes) and TCPIPB on SC31 (profile PROFBS31, static routes); router addresses 10.1.2.240 and 10.1.3.240; SWITCH 1]
5.4.1 Dependencies
All subnetworks that are defined in the TCP/IP stack that are used by the application servers,
including static and dynamic VIPAs, must also have static routing definitions in the routers. In
our case, the layer 3 switches (routers) do not need static route definitions for direct routes.
We define indirect routes for TCPIPA and TCPIPB VIPAs in the routers.
Routing table management is manual, which increases the possibility of outages that are
caused by definition errors. If a destination (sub)network becomes unreachable, then the static routes
for that (sub)network remain in the routing table, and packets are still forwarded to the
destination. The only way to remove static routes from the routing table is for the network
administrator to update the routing table.
Define as few static routing definitions as possible when implementing a static routing
environment, keeping in mind that the z/OS system is basically an application server
environment. It is a preferred practice to define only the default gateways to the exterior
networks, and let the routers do the exterior routing. You can implement the static routing
between the z/OS system and external router and still let the external router use the dynamic
routing protocol.
In the router, define only the route definitions to the VIPA subnetworks. The interior
subnetworks, such as XCF and HiperSockets, do not usually need to be reached by the
corporate network, so they do not need to be defined.
Important: If you choose to implement the OSA connection isolation feature together with
dynamic routing and yet still must communicate between two or more nodes sharing the
OSA adapter port, you must override the dynamically generated subnet or host route
between the two TCP/IP stacks with a non-replaceable static route that indicates a
next-hop address of an external router. See the information about OSA connection
isolation in “Considerations for assigning the OSA port name” on page 168.
All static routes are then listed in the TCP/IP routing table.
Important: When using the OBEYFILE command, include all static routes that you want to
define. The OBEYFILE command replaces the entire BEGINROUTES block.
In a CINET environment where multiple TCP/IP stacks are configured, use the TCP option for
the TSO PING command and the -p option for the z/OS UNIX ping command to specify the
TCP/IP stack name from which you want to issue the ping command.
You do not need to specify these options if your environment is an INET environment where
only one TCP/IP stack is configured.
In a CINET environment where multiple TCP/IP stacks are configured, use the TCP option for
the TSO TRACERTE command and the -a option for the z/OS UNIX traceroute command to
specify the TCP/IP stack name from which you want to issue the TRACEROUTE command.
You do not need to specify these options if the user issuing this command is already
associated with the TCP/IP stack (with SYSTCPD DD, for example), or if your environment is
an INET environment where only one TCP/IP stack is configured.
Figure 5-3 depicts the environment that we use for the OSPF scenario. The TCPIPA stack is
running on SC30. We create the OMPROUTE procedure OMPA to establish affinity to
TCPIPA. We also create OMPB for TCPIPB on SC31.
[Figure 5-3: OSPF scenario environment. TCPIPA on SC30 (profile PROFA30, dynamic routes) and TCPIPB on SC31 (profile PROFB31, dynamic routes); router addresses 10.1.2.240 and 10.1.3.240; VLAN10 and VLAN11; SWITCH 1; Area 0.0.0.0]
We define the z/OS TCP/IP stack as a member of an OSPF Totally Stubby Area. The external
routers (Layer 3 switches) are the ABRs between the Totally Stubby Area and the backbone
area. We made the external routers the DR and BDR to reduce the routing workload that is
required in z/OS.
Because the configuration examples for TCPIPB and OMPB on SC31 are similar to those
examples for TCPIPA and OMPA on SC30, we show configuration examples only on SC30.
5.5.2 Considerations
A z/OS Communications Server host is usually used as an application server and the routing
daemon is running primarily to provide access to network resources and vice versa. Ensure
that the z/OS Communications Server host is not overly burdened with routing work.
The z/OS Communications Server must not be configured as a backbone router, either
intentionally or inadvertently. Careful network design can minimize the routing burdens on the
z/OS Communications Server (application host) without compromising accessibility.
5.5.3 Suggestions
Define the z/OS Communications Server environment as an OSPF Stub Area to reduce the
CPU processing that is needed for managing the routing table. A Stub Area can be configured
so that route summaries from other areas are not flooded into the Stub Area by the area border
routers. When this is done, only routes to destinations within the Stub Area are shared among
the hosts. Default routes are used to represent all destinations outside the Stub Area. The
Stub Area’s resources are still advertised to the network at large by the area border routers.
This optimization is sometimes referred to as a Totally Stubby Area.
Also, make the external routers DR or BDR, and do not allow z/OS systems to be DR or BDR,
to reduce the routing burden for z/OS systems. DR or BDR is selected in each LAN segment
or VLAN. However, on HiperSockets links, z/OS systems are the only participants. One of the
z/OS systems on the HiperSockets network must take the role of DR (optionally, another one
can take the role of BDR).
Note: Recall the earlier warning in this chapter about the use of OSA connection isolation:
It is generally incompatible with a dynamic routing protocol such as OSPF. If implemented,
you might need to introduce non-replaceable static routes pointing to external next-hop
routers. For more information, see “Considerations for assigning the OSA port name” on
page 168.
Tip: OMPROUTE can be started as a z/OS procedure, or from the z/OS shell, or from
AUTOLOG.
Important: When you define the STDENV (_CEE_ENVFILE) file with a z/OS data set, the data
set must be allocated with RECFM=V. RECFM=F or FB is not recommended because fixed-length
records allow the environment variables to be padded with blanks.
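The padding concern can be illustrated with a short Python sketch. It only models the general effect of fixed-length records and makes no claim about how Language Environment actually reads the file. The 80-byte record length, the helper names, and the file path are assumptions made for the example.

```python
LRECL = 80  # assumed fixed record length for the illustration


def fixed_record(text, lrecl=LRECL):
    """Return text as it would be stored in a fixed-length record."""
    return text.ljust(lrecl)  # short records are padded with blanks


def parse_env(record):
    """Split NAME=value without trimming, as a naive reader would."""
    name, _, value = record.partition("=")
    return name, value


# Hypothetical setting; the path is invented for the example.
name, value = parse_env(fixed_record("OMPROUTE_DEBUG_FILE=/tmp/ompa.debug"))
print(repr(value))           # the value carries trailing blanks
print(repr(value.rstrip()))  # the value that was intended
```

With variable-length records there is no padding to begin with, so the stored value and the intended value are the same.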
Although you can include a UNIX time zone variable (TZ=...) in either the JCL or the
environment variable file, the preferred procedure is to insert the appropriate time zone for all
applications into the z/OS SYS1.PARMLIB(CEEPRMxx) member, as shown in
Example 5-16. You should define the TZ environment variable for all three LE option sets
(CEEDOPT, CEECOPT, and CELQDOPT).
Example 5-16 Set the time zone variable for all applications
CEECOPT(ALL31(ON), ENVAR('TZ=EST5EDT') )
CEEDOPT(ALL31(ON), ENVAR('TZ=EST5EDT') )
CELQDOPT(ALL31(ON), ENVAR('TZ=EST5EDT') )
In the CINET environment, the Global Resolver configuration file contains keywords that are
shared with all TCP/IP stacks on the z/OS image, and should omit the stack-specific
keywords such as TCPIPJobname and Hostname. Those parameters should be specified in the
local TCPIP.DATA file. If a specific parameter is not found in the global TCPIP.DATA, the local
TCPIP.DATA file is searched according to the search order. You can read more about the
resolver in Chapter 2, “The resolver” on page 21.
Example 5-17 shows the global TCPIP.DATA file that is used in our example.
In an INET environment, usually only a global TCPIP.DATA file is used. It should contain the
keywords (TCPIPJobname and DATASETPREFIX) that are used by OMPROUTE. The
TCPIPJobname parameter specifies the name of TCP/IP stack with which OMPROUTE
establishes an affinity.
Important: If you fail to take one of these actions, OMPROUTE is periodically canceled
and restarted by TCP/IP.
The OMPROUTE configuration file in the next example can be shared across multiple stacks by
using MVS system symbols and the INCLUDE statement. The INCLUDE statement groups
OMPROUTE configuration statements that are common to several OMPROUTE instances into a
single file, so you do not need to repeat the configuration information in multiple places.
Support for MVS system symbols and the INCLUDE statement in the OMPROUTE configuration
file is provided by z/OS Communications Server. Example 5-23 shows their use.
Example 5-23 Shareable OMPROUTE configuration file by using MVS system symbols and INCLUDE
OSPF
RouterID=10.1.&SYSCLONE..10 1
Comparison=Type2
Demand_Circuit=YES;
Global_Options
Ignore_Undefined_Interfaces=YES
Routesa_Config Enabled=No;
; Static vipa
OSPF_Interface IP_address=10.1.&SYSCLONE..10 2
Subnet_mask=255.255.255.0
Name=VIPA3L
Attaches_To_Area=0.0.0.2
Advertise_VIPA_Routes=HOST_ONLY
Cost0=10
MTU=65535;
INCLUDE //'TCPIPA.TCPPARMS(OMPA30IN)' 3
This OMPROUTE configuration file is now shareable. We fully used wildcards, the MVS
system symbolics, and the INCLUDE statement.
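To see why Example 5-23 codes two periods after &SYSCLONE, the following Python sketch mimics the substitution. It is an illustration only, not the z/OS symbol processor: the function name is hypothetical, and the value 30 for SYSCLONE on system SC30 is an assumption. The sketch relies on the rule that a period directly after a symbol name ends the symbol and is consumed.

```python
import re


def substitute_symbols(line, symbols):
    """Replace &NAME. or &NAME with its value, leaving unknown symbols as-is.

    A period that directly follows the symbol name ends the symbol and is
    consumed, which is why the example codes two periods before the last octet.
    """
    def replace(match):
        name = match.group(1)
        if name not in symbols:
            return match.group(0)  # unknown symbol: leave the text untouched
        return symbols[name]

    return re.sub(r"&([A-Z][A-Z0-9_]*)\.?", replace, line)


symbols = {"SYSCLONE": "30"}  # assumed value for system SC30
print(substitute_symbols("RouterID=10.1.&SYSCLONE..10", symbols))
```

With a different SYSCLONE value on each system, the same configuration member yields a different router ID and VIPA address per stack.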
To configure router 1, we used the configuration statements that are shown in Example 5-25.
Starting OMPROUTE
OMPROUTE can be started from a z/OS procedure, from the z/OS shell, or by AUTOLOG.
You can use the AUTOLOG statement to start OMPROUTE automatically during TCP/IP
initialization. Insert the name of the OMPROUTE start procedure in the AUTOLOG statement of
the PROFILE.TCPIP data set (see Example 5-27).
Several of the most useful DISPLAY commands and outputs are described in this section. For
other display command options and to find more detailed information about specific
commands, see z/OS Communications Server: IP System Administrator’s Commands,
SC31-8781.
D TCPIP,TCPIPA,OMPR,OSPF,NBRS
EZZ7851I NEIGHBOR SUMMARY 392
NEIGHBOR ADDR NEIGHBOR ID STATE LSRXL DBSUM LSREQ HSUP IFC
10.1.3.240 10.1.3.240 128 0 0 0 OFF OSA20E0I 1
10.1.3.41 10.1.3.10 8 0 0 0 OFF OSA20E0I 2
10.1.2.22 10.1.1.20 8 0 0 0 OFF OSA20A0I 2
10.1.2.240 10.1.3.240 128 0 0 0 OFF OSA20A0I 1
10.1.2.22 10.1.31.10 8 0 0 0 OFF OSA20A0I 2
10.1.4.21 10.1.31.10 128 0 0 0 OFF IUTIQDF4L 3
10.1.5.21 10.1.31.10 128 0 0 0 OFF IUTIQDF5L 3
10.1.6.21 10.1.31.10 128 0 0 0 OFF IUTIQDF6L 3
* -- LINK NAME TRUNCATED
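When the neighbor summary is captured to a file or log, a small script can make the STATE column easier to read. The Python sketch below is hypothetical tooling, not part of z/OS. The mapping of state numbers to OSPF neighbor state names is an assumption that should be confirmed against the EZZ7851I message description before relying on it.

```python
# Assumed mapping of the STATE column to OSPF neighbor states; verify it
# against the EZZ7851I message documentation before relying on it.
NEIGHBOR_STATES = {
    1: "Down", 2: "Attempt", 4: "Init", 8: "2-Way",
    16: "ExStart", 32: "Exchange", 64: "Loading", 128: "Full",
}


def parse_neighbors(display_text):
    """Extract (address, router ID, state name, interface) from the display."""
    neighbors = []
    for line in display_text.splitlines():
        fields = line.split()
        # Data rows start with a dotted-decimal neighbor address.
        if len(fields) < 8 or fields[0].count(".") != 3:
            continue
        if not fields[2].isdigit():
            continue
        state = NEIGHBOR_STATES.get(int(fields[2]), "Unknown")
        neighbors.append((fields[0], fields[1], state, fields[7]))
    return neighbors


sample = """\
NEIGHBOR ADDR NEIGHBOR ID STATE LSRXL DBSUM LSREQ HSUP IFC
10.1.3.240 10.1.3.240 128 0 0 0 OFF OSA20E0I
10.1.3.41 10.1.3.10 8 0 0 0 OFF OSA20E0I
"""
for neighbor in parse_neighbors(sample):
    print(neighbor)
```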
IPV4 DESTINATIONS
DESTINATION GATEWAY FLAGS REFCNT INTERFACE
DEFAULT 10.1.3.240 UGO 0000000000 OSA20C0I 1
DEFAULT 10.1.3.240 UGO 0000000002 OSA20E0I
10.1.1.10/32 0.0.0.0 UH 0000000000 VIPA1L
10.1.1.20/32 10.1.5.21 UGHO 0000000001 IUTIQDF5L 2
10.1.1.20/32 10.1.4.21 UGHO 0000000000 IUTIQDF4L
10.1.1.30/32 10.1.5.31 UGHO 0000000001 IUTIQDF5L
10.1.1.30/32 10.1.4.31 UGHO 0000000000 IUTIQDF4L
10.1.1.40/32 10.1.4.41 UGHO 0000000000 IUTIQDF4L
10.1.1.40/32 10.1.5.41 UGHO 0000000000 IUTIQDF5L
10.1.2.0/24 0.0.0.0 UO 0000000000 OSA20A0I
10.1.2.0/24 0.0.0.0 UO 0000000000 OSA2080I
10.1.2.10/32 0.0.0.0 UH 0000000000 VIPA2L
10.1.2.11/32 0.0.0.0 UH 0000000000 OSA2080I
10.1.2.12/32 0.0.0.0 UH 0000000000 OSA20A0I
10.1.2.14/32 0.0.0.0 H 0000000000 OSA2081I
10.1.2.30/32 10.1.5.31 UGHO 0000000000 IUTIQDF5L
10.1.2.30/32 10.1.4.31 UGHO 0000000000 IUTIQDF4L
10.1.2.40/32 10.1.4.41 UGHO 0000000000 IUTIQDF4L
10.1.2.40/32 10.1.5.41 UGHO 0000000000 IUTIQDF5L
10.1.3.0/24 0.0.0.0 UO 0000000000 OSA20E0I
10.1.3.0/24 0.0.0.0 UO 0000000000 OSA20C0I
10.1.3.11/32 0.0.0.0 UH 0000000000 OSA20C0I
10.1.3.12/32 0.0.0.0 UH 0000000000 OSA20E0I
10.1.4.0/24 0.0.0.0 UO 0000000000 IUTIQDF4L
10.1.4.11/32 0.0.0.0 UH 0000000000 IUTIQDF4L
10.1.5.0/24 0.0.0.0 UO 0000000000 IUTIQDF5L
10.1.5.11/32 0.0.0.0 UH 0000000000 IUTIQDF5L
10.1.6.11/32 0.0.0.0 UH 0000000000 IUTIQDF6L
10.1.7.0/24 0.0.0.0 US 0000000000 IQDIOLNK0A01070B
10.1.7.11/32 0.0.0.0 H 0000000000 EZASAMEMVS
10.1.7.11/32 0.0.0.0 UH 0000000000 IQDIOLNK0A01070B
10.1.7.21/32 0.0.0.0 UHS 0000000000 IQDIOLNK0A01070B
10.1.7.31/32 0.0.0.0 UHS 0000000000 IQDIOLNK0A01070B
10.1.7.41/32 0.0.0.0 UHS 0000000000 IQDIOLNK0A01070B
10.1.8.10/32 0.0.0.0 UH 0000000000 VIPL0A01080A
10.1.8.40/32 10.1.4.41 UGHO 0000000000 IUTIQDF4L
10.1.8.40/32 10.1.5.41 UGHO 0000000000 IUTIQDF5L
10.1.8.41/32 10.1.4.41 UGHO 0000000000 IUTIQDF4L
10.1.8.41/32 10.1.5.41 UGHO 0000000000 IUTIQDF5L
10.1.8.42/32 10.1.4.41 UGHO 0000000000 IUTIQDF4L
10.1.8.42/32 10.1.5.41 UGHO 0000000000 IUTIQDF5L
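A route display like this one is easier to check when the routes are grouped by destination, because destinations with more than one entry have multiple paths. The following Python sketch is hypothetical tooling that works on captured output. It decodes only the flag letters that appear in this display (U, G, H, S, and O) and passes any other letter through unchanged.

```python
from collections import defaultdict

# Flag letters used in the sample display. Other letters can appear in
# real output; they are passed through unchanged here.
FLAG_MEANINGS = {
    "U": "up", "G": "gateway", "H": "host", "S": "static", "O": "OSPF",
}


def parse_routes(display_text):
    """Group routes by destination: destination -> [(gateway, flags, interface)]."""
    routes = defaultdict(list)
    for line in display_text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] == "DESTINATION":
            continue  # skip titles, the column header, and short lines
        destination, gateway, flags, _refcnt, interface = fields[:5]
        routes[destination].append((gateway, flags, interface))
    return dict(routes)


def describe(flags):
    """Translate a flag string such as UGHO into readable words."""
    return [FLAG_MEANINGS.get(letter, letter) for letter in flags]


sample = """\
IPV4 DESTINATIONS
DESTINATION GATEWAY FLAGS REFCNT INTERFACE
DEFAULT 10.1.3.240 UGO 0000000000 OSA20C0I
DEFAULT 10.1.3.240 UGO 0000000002 OSA20E0I
10.1.1.10/32 0.0.0.0 UH 0000000000 VIPA1L
"""
for destination, entries in parse_routes(sample).items():
    print(destination, [(gw, describe(fl), ifc) for gw, fl, ifc in entries])
```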
In a CINET environment where multiple TCP/IP stacks are configured, use the TCP option for
the TSO PING command and the -p option for the z/OS UNIX ping command to specify the
TCP/IP stack name from which you want to run the ping command.
You do not need to specify those options if the user running this command is already
associated with the TCP/IP stack (with SYSTCPD DD, for example), or if your environment is
an INET environment where only one TCP/IP stack is configured.
In a CINET environment where multiple TCP/IP stacks are configured, use the TCP option for
the TSO TRACERTE command and the -a option for the z/OS UNIX traceroute command to
specify the TCP/IP stack name from which you want to issue the TRACEROUTE command.
You do not need to specify those options if the user running this command is already
associated with the TCP/IP stack (with SYSTCPD DD, for example), or if your environment is
an INET environment where only one TCP/IP stack is configured.
From the z/OS UNIX shell, run the ps -ef command, as shown in Example 5-39 (1).
Using the PID number, stop OMPROUTE with the kill pidnumber command, as shown in
Example 5-39 (2).
In OSPF environments in which there might be a problem with some remote hardware (for
example, a router, switch, or network cable) that is beyond detection by the z/OS hardware or
software, OMPROUTE can get into an infinite neighbor state loop over one of its interfaces
with a neighbor. This loop might contribute to increased workload. In LAN configurations with
parallel OSPF interfaces that can reach the same neighbor for adjacency formation, the
backup interfaces are not used until an outage occurs on the OSPF interface that was initially
involved in the adjacency formation attempt with a DR, unless you use OMPROUTE futile
neighbor state loop detection or fix the problem manually.
You can use the MODIFY (F) command to suspend an OSPF interface and, after fixing the
problem, to activate it again. The command format is
F procname,OSPF,INTERFACES,NAME=interfname,SUSPEND or
F procname,OSPF,INTERFACES,NAME=interfname,ACTIVATE.
D TCPIP,TCPIPA,OMP,OSPF,INTERFACES
EZZ7849I INTERFACES 803
IFC ADDRESS PHYS ASSOC. AREA TYPE STATE #NBRS #ADJS
10.1.6.11 IUTIQDF6L 0.0.0.2 BRDCST 64 1 1
10.1.5.11 IUTIQDF5L 0.0.0.2 BRDCST 32 2 2
10.1.4.11 IUTIQDF4L 0.0.0.2 BRDCST 32 2 2
10.1.3.12 OSA20E0I 0.0.0.2 BRDCST 32 4 1
10.1.3.11 OSA20C0I 0.0.0.2 BRDCST 2 0 0
10.1.2.12 OSA20A0I 0.0.0.2 BRDCST 1* 0 0 2
10.1.2.14 OSA2081I 0.0.0.2 BRDCST 1 0 0
F OMPA,OSPF,INTERFACES,NAME=OSA20A0I,ACTIVATE 3
EZZ7866I OMPA MODIFY COMMAND ACCEPTED
EZZ8160I OMPA MODIFY ACTIVATE COMMAND FOR OSPF IPV4 INTERFACE
OSA20A0I IS SUCCESSFUL
Note: Run the MODIFY SUSPEND command to stop OSPF traffic on an OSPF interface,
rather than running the VARY TCPIP command to deactivate the corresponding physical
interface in TCPIP. This way, existing sessions that use static routes over the affected
interface are not disrupted.
[Figure 5-4: flowchart for verifying IP routing to a destination host. Recoverable elements: decision "Ping first hop OK?" (tag 4); decision "The device is ready?" (tag 5); "External network problem" (tag 4a); "Verify and correct the interface config problem" (tag 5a); "Verify and correct the device problem" (tag 5b); "Verify and correct the dynamic route definition"; "Route verified OK"]
The descriptions for the tags, which are shown in Figure 5-4, are as follows:
1. Use the ping command to determine whether there is connectivity to the destination IP
address. More information about the ping command can be found in “PING command
(TSO or z/OS UNIX)” on page 274.
2. If the ping command fails immediately, there might not be a route to the destination host or
subnet. Run the netstat ROUTE/ -r command to display routes to the network, as shown
in Example 5-10 on page 250. Verify that TCP/IP has a route to the destination address.
The ping command can be run as the TSO PING command or as the z/OS UNIX ping
command. Example 5-41 on page 275 shows the output of the TSO PING command. The ping is
successful.
In a CINET environment where multiple TCP/IP stacks are configured, use the TCP option for
the TSO PING command and the -p option for the z/OS UNIX ping command to specify the
TCP/IP stack name from which you want to issue the ping command. You do not need to
specify those options if you are issuing this command from a user that is already associated
with the TCP/IP stack (with SYSTCPD DD, for example), or if your environment is an
INET environment where only one TCP/IP stack is configured.
Example 5-42 shows the display of the z/OS UNIX ping command.
TRACEROUTE command
TRACEROUTE can be invoked by either the TSO TRACERTE command or the z/OS UNIX shell
traceroute or tracert command.
TRACEROUTE displays the route that a packet takes to reach the requested target. TRACEROUTE
starts at the first router and uses a series of UDP probe packets with increasing IP time-to-live
(TTL) or hop count values to determine the sequence of routers that must be traversed to
reach the target host. The output that is generated by this command is shown in
Example 5-43.
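The TTL technique can be pictured with a small simulation. The Python sketch below sends no packets and does not reproduce the TRACERTE or traceroute implementation. It only models the idea that a probe with TTL n gets no farther than the n-th hop, and the path that it walks is hypothetical.

```python
def trace_route(path, max_ttl=30):
    """Simulate TTL-limited probing along a known path (no packets are sent).

    path lists the routers in order and ends with the target host. A probe
    sent with TTL n gets no farther than the n-th hop, which replies and
    thereby reveals its address.
    """
    hops = []
    for ttl in range(1, max_ttl + 1):
        if ttl > len(path):
            break                      # nothing lies beyond the target
        responder = path[ttl - 1]      # hop that answers this probe
        hops.append((ttl, responder))
        if ttl == len(path):
            break                      # the target answered; stop probing
    return hops


# Hypothetical path: one external router, then the target host.
for ttl, address in trace_route(["10.1.2.240", "10.1.3.41"]):
    print(ttl, address)
```

The max_ttl parameter plays the role of a probe limit: a target that lies beyond that many hops is never reached in the simulation.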
In a CINET environment where multiple TCP/IP stacks are configured, use the TCP option for
the TSO TRACERTE command and the -a option for the z/OS UNIX traceroute command to
specify the TCP/IP stack name from which you want to issue the TRACEROUTE command.
You do not need to specify those options if the user running this command is already
associated with the TCP/IP stack (with SYSTCPD DD, for example), or if your environment is
an INET environment where only one TCP/IP stack is configured.
Tip: Using a host name instead of an IP address requires the resolver or DNS to do the
translation. This adds more variables to problem determination, and should be avoided when
you are diagnosing network problems. Use the host IP address instead.
Useful commands
In addition to the commands in 5.6.1, “Commands to diagnose networking connectivity
problems” on page 274, you can use other commands to diagnose OMPROUTE problems, as
described here.
D TCPIP,TCPIPA,OMP,OSPF,NBRS command
This command displays all the OSPF neighbors. Make sure that the neighbor relationships with
the other routers are established. Example 5-30 on page 265 shows a display.
D TCPIP,TCPIPA,OMP,RTTABLE command
This command displays the OMPROUTE routing table. Make sure that the expected route is
listed in the table. If you have multiple routes for the destination, with different
costs, only the best route (least cost route) is added to the OMPROUTE and TCP/IP routing
tables. Example 5-32 on page 267 shows a display.
D TCPIP,TCPIPA,OMP,RTTABLE,DELETED command
This command displays all of the route destinations that were deleted from the OMPROUTE
routing table since the initialization of OMPROUTE at this node. Routes whose next hop has
changed are not considered deleted, and are therefore not displayed with this
command. Example 5-45 on page 277 shows the results of this display after OMPROUTE is
terminated at SC31 (OMPB), another member of the SYSPLEX.
If there is no apparent error message that can help you to solve the problem, then prepare
OMPROUTE to generate more detailed information by using the debug tools that are
available in OMPROUTE. This can be activated by coding the Debug and Trace options in the
start procedure, or by using the MODIFY command to implement these options.
An OMPROUTE trace from startup can be enabled by coding the trace options after the
forward slash (/) in the PARM field of the OMPROUTE cataloged procedure, as shown in
Example 5-46.
Example 5-46 Trace options that are defined in the OMPROUTE startup procedure
//OMP30A PROC STDENV=STDENV&SYSCLONE
//OMP30A EXEC PGM=OMPROUTE,REGION=4096K,TIME=NOLIMIT,
// PARM=('POSIX(ON) ALL31(ON)',
// 'ENVAR("_BPXK_SETIBMOPT_TRANSPORT=TCPIPA"',
// '"_CEE_ENVFILE=DD:STDENV")/-t2 -d1')
//*
//STDENV DD DISP=SHR,DSN=TCPIPA.OMPROUTE.&STDENV
If a trace cannot be enabled from startup, the following commands can dynamically enable
and disable tracing:
Enable tracing:
– MODIFY omproute,TRACE=2 (TRACE6=2 for IPv6)
– MODIFY omproute,DEBUG=1 (DEBUG6=1 for IPv6)
Disable tracing:
– MODIFY omproute,TRACE=0 (TRACE6=0 for IPv6)
– MODIFY omproute,DEBUG=0 (DEBUG6=0 for IPv6)
Important: Using the OMPROUTE TRACE and DEBUG options and directing the output to
z/OS UNIX file system files generates additional processing impact that might cause OSPF
adjacency failures or other routing problems. To prevent that, change the output destination
to the CTRACE Facility.
You can start the OMPROUTE CTRACE anytime by using the command TRACE CT, or it can
be activated during OMPROUTE initialization. If not defined, the OMPROUTE component
trace is started with a buffer size of 1 MB and the MINIMUM tracing option.
A parmlib member can be used to customize the parameters and to initialize the trace.
The default OMPROUTE Component Trace parmlib member is the SYS1.PARMLIB
member CTIORA00. The parmlib member name can be changed by using the
OMPROUTE_CTRACE_MEMBER environment variable.
In addition to specifying the trace options, you can also change the OMPROUTE trace buffer
size. (The buffer size can be changed only at OMPROUTE initialization.) The maximum
OMPROUTE trace buffer size is 100 MB. The OMPROUTE REGION size in the OMPROUTE
catalog procedure must be large enough to accommodate a large buffer size.
When OMPROUTE is initialized by using the DEBUGTRC option, use a larger internal CTRACE
buffer or an external writer. When using the internal CTRACE buffer, you must get a DUMP of
OMPROUTE to see the trace output.
The following steps illustrate how to start the CTRACE for OMPROUTE and direct the trace
output to an external writer:
1. Create a CTWTR procedure in your SYS1.PROCLIB, as shown in Example 5-47.
2. Prepare the SYS1.PARMLIB member CTIORA00 to get the output data. Example 5-48
shows a sample of CTIORA00 contents.
You can also use the TRACE CT command to define the options that you want after
OMPROUTE is initialized, and send the trace to an external writer, by following these steps:
1. Start the CTRACE external writer, as shown in Example 5-51.
Example 5-51 Start the CTRACE external writer, CTWTR, partial console output
TRACE CT,WTRSTART=CTWTR
ITT038I ALL OF THE TRANSACTIONS REQUESTED VIA THE TRACE CT COMMAND
WERE SUCCESSFULLY EXECUTED.
...
IRR812I PROFILE ** (G) IN THE STARTED CLASS WAS USED
TO START CTWTR WITH JOBNAME CTWTR.
...
IEF196I DSNAME=SYS1.SC30.OMPA.CTRACE,VOL=SER=COMST2,UNIT=3390,
IEF196I SPACE=(CYL,10),DISP=(NEW,
IEF196I CATLG),DSORG=PS
...
ITT110I INITIALIZATION OF CTRACE WRITER CTWTR COMPLETE.
3. Modify the trace or debug trace levels as needed, running one or both of the following
modify commands, as shown in Example 5-53:
– modify omp_proc,trace=x
– modify omp_proc,debug=x
Example 5-53 Modify the omproute to use the trace and debug levels
F OMPA,TRACE=1
EZZ7866I OMPA MODIFY COMMAND ACCEPTED
F OMPA,DEBUG=2
EZZ7866I OMPA MODIFY COMMAND ACCEPTED
6. Save the trace contents in the trace file that is created by the CTWTR procedure by
running the command that is shown in Example 5-55.
8. Change the OMPROUTE debug and trace level, as shown in Example 5-57, to avoid
performance problems. Run the MODIFY command.
After these steps, format the trace file by using the IPCS command that is shown in
Example 5-58 on the IPCS Subcommand screen (option 6).
Example 5-59 shows the OMPROUTE debug entries.
For more information about OMPROUTE diagnosis, see z/OS Communications Server: IP
Diagnosis Guide, GC31-8782.
You need a switch to communicate across VLANs, but typically separate VLANs are in
separate IP subnets; therefore, you often need a router to communicate across VLANs.
Virtual Medium Access Control (VMAC) support for z/OS Communications Server is a
function that affects the operation of an OSA interface at the OSI layer 2 level. This is the data
link control (DLC) layer with its sublayer Medium Access Control (MAC) layer.
This chapter covers the topics that are shown in Table 6-1.
6.1, “Virtual MAC overview” on page 286: The VMAC concept, and the environment on which it can be used
6.4, “VLAN implementation on z/OS” on page 295: Single VLAN and multiple VLAN implementation scenarios on z/OS
VMAC support enables an OSA interface to have a physical MAC address and many distinct
virtual MAC addresses for each device or interface in a stack. Each stack can define up to
eight VMACs per protocol (IPv4 or IPv6) for each OSA interface.
Using VMACs, forwarding decisions in the OSA can be made without having to involve the
OSI Layer 3 level (network layer / IP layer). From a LAN perspective, the OSA interface with a
VMAC appears as a dedicated device or interface to a TCP/IP stack. Packets that are
destined for a TCP/IP stack are identified by an assigned VMAC address and packets that are
sent to the LAN from the stack use the VMAC address as the source MAC address. This
means that all IP addresses that are associated with a TCP/IP stack are accessible by using
their own VMAC address instead of sharing a single physical MAC address of an OSA
interface.
“OSA-Express router support” on page 145 explains that the PRIRouter and SECRouter
functions enable routing through a TCP/IP stack to IP addresses that are not registered in the
OSA. The stack that has the OSA interface that is defined with PRIRouter receives packets
that are destined for IP addresses that are not in the given stack. The stack then forwards the
packets to the next hop.
Only one PRIRouter can be defined per OSA interface, although multiple SECRouters can be
defined to an OSA interface for other TCP/IP routing stacks. However, only one SECRouter
function can take over services if the PRIRouter is not available. If the first SECRouter
function is not available, then the next defined SECRouter forwards IP packets to the
associated stack. This means that the OSA interface cannot serve multiple TCP/IP routing
stacks concurrently even with the use of the PRIRouter and SECRouter functions.
Another challenge with shared OSA interfaces is one that requires load balancing of traffic
across multiple TCP/IP stacks and IP addresses. For example, certain load balancing
technologies use a concept of distributing packets to the appropriate adjacent systems based
on knowledge of the MAC address.
In our example, we use load balancing (LB) with Sysplex Distributor to illustrate this
challenge. If there is a shared OSA environment, the MAC address is attached to the Sysplex
Distributor and to the selected target system. However, the target IP address can be on a
system other than the Sysplex Distributor.
As a result, the LB forwarding agent sends the packets to be distributed to the OSA’s physical
MAC address, but the OSA knows only to send the information to the system that registered
the target address; it does not know to forward the information to the actual target stack.
Mechanisms that are in place to overcome this challenge are Generic Routing
Encapsulation (GRE) and network address translation (NAT).
For more information about load balancing modes (directed and dispatch), see z/OS
Communications Server: IP Configuration Guide, SC27-3650.
[Figure: Sysplex Distributor (Service Manager) in LPAR A and target stacks in LPAR B, LPAR C, and LPAR D, connected through XCF; network 10.1.7.0/24; connections to 10.1.2.31 and 10.1.2.41]
This simplifies a shared OSA configuration significantly. Defining VMACs has little
administrative impact. It is also an alternative to GRE or NAT when load balancing
technologies are used. In Figure 6-1, the Dynamic VIPA targets are found without the use
of GRE and without routing through the Sysplex Distributor. One of the options for defining
VMACs permits the OSA to bypass IP address lookup. As a result, when the packet arrives
at the correct VMAC, it is routed to the stack even though the DDVIPA is not registered in
the OAT.
For IPv6, TCP/IP uses the VMAC address for all neighbor discovery address resolution flows
for that stack’s IP addresses, and likewise uses the VMAC as the source MAC address for all
IPv6 packets sent from that stack. Again, from a LAN perspective, the OSA interface with a
VMAC appears as a dedicated device to that stack.
Note: VMAC definitions on a device in a TCP/IP stack override any NONRouter, PRIRouter,
or SECRouter parameters on devices in a TCP/IP stack. If necessary, selected stacks on a
shared OSA can define the device with VMAC and others can define the device with
PRIRouter and SECRouter capability.
Note: Allow the OSA to generate the VMACs instead of assigning an address in the
TCP/IP profile. If VMACs are defined in the LINK statement, they must be defined as locally
administered MAC addresses, and should be unique to the LAN on which they are located.
The same VMAC can be defined for both IPv4 and IPv6 usage, or a stack can use one VMAC
for IPv4 and one for IPv6. Also, a VLAN ID can be associated with an OSA-Express device or
interface that is defined with a VMAC.
Note: To enable virtual MAC support, you must be running at least an IBM System z9
Enterprise Class (z9 EC) or z9 Business Class (z9 BC), and an OSA-Express feature with
OSA Layer 3 Virtual MAC support.
[Figure: OSA20C0 and Router 1]
We omitted the DEVICE, LINK, and HOME statements for OSA20C0 on TCPIPB and TCPIPD,
and modified the IP routing definitions on all stacks.
Figure 6-2 is used only for demonstration purposes. We do not recommend implementing any
configuration with single-points-of-failure.
Example 6-1 Device and link statements: VMAC definition for TCPIPA
DEVICE OSA20C0 MPCIPA
LINK OSA20C0L IPAQENET OSA20C0 VLANID 11 VMAC 020012345678 1
DEVICE IUTIQDF4 MPCIPA
LINK IUTIQDF4L IPAQIDIO IUTIQDF4
DEVICE VIPA1 VIRTUAL 0
LINK VIPA1L VIRTUAL 0 VIPA1
If VMAC is defined without a MAC address 2, then OSA generates a VMAC by using a part of
the “burned-in” MAC address of the OSA. You can also specify the MAC address for VMAC 1.
If you decide to specify a MAC address, it must be a locally administered address, which
means that bit 6 of the first byte is 1 and bit 7 of the first byte is 0. Bits are numbered from 0
at the most significant bit, so the first byte has the binary form xxxxxx10 (for example, X'02').
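A candidate address can be checked before it is coded on the VMAC parameter. The following Python sketch is illustrative only; the function name is hypothetical. With bits counted from the most significant bit, bit 6 of the first byte corresponds to the mask X'02' and bit 7 to the mask X'01'.

```python
import string


def is_locally_administered_vmac(vmac: str) -> bool:
    """Check a 12-digit hexadecimal MAC address against the rule for configured VMACs."""
    if len(vmac) != 12 or not all(c in string.hexdigits for c in vmac):
        return False
    first_byte = int(vmac[:2], 16)
    bit6_is_one = (first_byte & 0x02) != 0   # locally administered address
    bit7_is_zero = (first_byte & 0x01) == 0  # individual (not group) address
    return bit6_is_one and bit7_is_zero
```

The address 020012345678 from Example 6-1 passes this check because its first byte is X'02'.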
There is no need to define PRIRouter or SECRouter on the DEVICE statement. When VMAC is
specified on the LINK statement, PRIRouter or SECRouter is ignored.
Note: z/OS Communications Server is enhanced so that IPv4 interfaces and their VLANs can be
defined by using the INTERFACE statement. More details are available in “INTERFACE
statement” on page 109.
6.2.2 Verification
We verify that VMAC is correctly defined in TCPIPA (see Example 6-3). We specify a MAC
address 1 for the OSA in TCPIPA, so VMACORIGIN is CFG 2.
We also see the VMAC in the OSA Address Table (OAT) as queried by OSA/SF
(Example 6-5). OSA registers all IP addresses (including VIPA) in the TCP/IP stack, and
maps them to the VMAC address.
VMAC IP address
HOME 02000E776C05 010.001.003.011 7
VMAC IP address
HOME 02000F776C05 010.001.003.023 7
VMAC IP address
HOME 020007776C05 7 010.001.002.030
HOME 020007776C05 010.001.002.033
The last 3 bytes of the OSA-generated VMAC 7 are identical to those of the universal MAC
address (“burned-in” address) of the OSA 5. The first byte of the OSA-generated VMAC is
always 02 to make the VMAC a locally administered address. To make the VMAC unique
among all TCP/IP stacks, the second and third bytes are used as a counter that is
incremented each time OSA generates a MAC address.
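The layout of an OSA-generated VMAC can be illustrated by splitting one of the addresses from the OAT output into its three parts. This is a minimal Python sketch; the function and field names are ours and are not part of any IBM tool.

```python
import string
from typing import NamedTuple


class GeneratedVmac(NamedTuple):
    prefix: str            # first byte, expected to be "02"
    counter: int           # second and third bytes, read as one number
    burned_in_suffix: str  # last 3 bytes, taken from the burned-in MAC address


def split_generated_vmac(vmac: str) -> GeneratedVmac:
    """Split a 12-digit hexadecimal VMAC into prefix, counter, and suffix."""
    if len(vmac) != 12 or not all(c in string.hexdigits for c in vmac):
        raise ValueError(f"not a 12-digit hexadecimal MAC address: {vmac!r}")
    vmac = vmac.upper()
    return GeneratedVmac(vmac[0:2], int(vmac[2:6], 16), vmac[6:12])
```

For example, 02000E776C05 and 02000F776C05 share the suffix 776C05 and differ only in the counter bytes (000E and 000F).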
Example 6-6 shows the ARP cache of the router. IP address 10.1.3.11 in TCPIPA is mapped
to the VMAC that is defined in TCPIPA 8, and IP address 10.1.3.31 in TCPIPC is mapped to
the VMAC that is defined in TCPIPC 9.
Each IP address is mapped to a different MAC address even if these stacks share an OSA
interface. OSA responds to ARP requests for all registered IP addresses by using a VMAC
instead of a “burned-in” MAC address.
According to the routing table, the router chooses 10.1.3.11 as the next hop for destination
address 10.1.1.20, and chooses 10.1.3.31 as the next hop for destination address
10.1.1.40. The router forwards the packet with the destination IP address 10.1.1.20 to the
destination MAC address 0200.1234.5678. When the packet reaches the OSA interface, OSA
forwards the packet to TCPIPA because OSA knows the VMAC 0200.1234.5678 is mapped to
TCPIPA. The same can be said for the TCPIPC VMAC.
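The router shows the VMAC in dotted notation (0200.1234.5678), whereas the TCP/IP profile uses 12 contiguous hexadecimal digits (020012345678). When you compare the two displays, a small conversion helper can be convenient. The following Python sketch uses function names of our own choosing.

```python
import string


def _hex12(mac: str) -> str:
    """Remove common separators and return 12 uppercase hexadecimal digits."""
    digits = mac.replace(".", "").replace(":", "").replace("-", "").upper()
    if len(digits) != 12 or not all(c in string.hexdigits for c in digits):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return digits


def to_profile_format(mac: str) -> str:
    """Return the MAC address as it is coded on the VMAC parameter."""
    return _hex12(mac)


def to_dotted_format(mac: str) -> str:
    """Return the MAC address as three dot-separated groups of four digits."""
    digits = _hex12(mac).lower()
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))
```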
Example 6-7 shows that the two stacks (TCPIPA and TCPIPC) sharing one OSA interface are
able to route packets correctly.
Router1#traceroute 10.1.1.40
1 10.1.3.31 4 msec 0 msec 0 msec
2 10.1.1.40 0 msec 0 msec 0 msec
Ports that are used to attach VLAN-unaware equipment are called access ports; ports that are
used to connect to other switches or VLAN-aware servers are known as trunk ports. Network
frames that are generated by VLAN-aware equipment are marked with a tag, which identifies
the frame to the VLAN.
[Figure: z/OS with VLAN A and VLAN B on one physical LAN]
The z/OS stack registers the VLAN ID to OSA, which means that the OSA does the following
tasks:
Appends a Layer 2 VLAN tag with this VLAN ID on all outbound packets. (For IPv6 unicast
packets, the stack, not the OSA, appends the VLAN tags.)
Filters out any inbound packets that have a VLAN tag containing a different VLAN ID.
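The two OSA tasks can be illustrated with the tag format itself. The following Python sketch builds and checks a 4-byte VLAN tag. The tag layout (the value X'8100' followed by a 3-bit priority, one drop-eligible bit, and a 12-bit VLAN ID) is taken from IEEE 802.1Q, not from this book, and the function names are hypothetical. Untagged frames are outside the scope of this sketch.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that identifies an IEEE 802.1Q tag


def build_vlan_tag(vlan_id: int, priority: int = 0) -> bytes:
    """Build the 4-byte tag that is added to an outbound frame."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be in the range 1 - 4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be in the range 0 - 7")
    tci = (priority << 13) | vlan_id  # the drop-eligible bit is left at 0
    return struct.pack("!HH", TPID_8021Q, tci)


def vlan_id_of(tag: bytes) -> int:
    """Extract the 12-bit VLAN ID from a 4-byte tag."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an IEEE 802.1Q tag")
    return tci & 0x0FFF


def accept_inbound(tag: bytes, registered_vlan_id: int) -> bool:
    """Accept a tagged inbound frame only if its VLAN ID matches the registered one."""
    return vlan_id_of(tag) == registered_vlan_id
```

For VLAN ID 11, which is used in Example 6-1, the tag is the byte string X'8100000B' when the priority is 0.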
VLANs on a single footprint, as shown in Figure 6-3, typically map to separate IP subnets.
This one-to-one mapping is not a requirement because the same IP subnet (a Layer 3
construct) can be subdivided into separate VLANs. Likewise, separate IP subnets on the
same footprint can be mapped to the same VLAN. Nevertheless, it is more common to assign
a separate IP subnet to separate VLAN IDs, as shown in Figure 6-3. The latter type of
network design simplifies network topology and the planning of a Layer 3 routing
infrastructure.
[Figure: Multiple VLAN IDs per OSA. App A and App B run over TCP/IP, with VLAN A and VLAN B on one physical LAN]
Note: By using the ROUTEALL attribute, you allow the interface to forward IP packets. You
can use the ROUTELOCAL attribute if you do not want the interface to forward IP packets.
Configure a unique subnet for each IPv4 interface for this OSA-Express feature by using
the subnet mask specification on the IPADDR parameter on the INTERFACE statement.
To use multiple VLANs for an OSA port, you must configure a separate interface to the
OSA port for each VLAN. Each of these interfaces requires a separate DATAPATH device
in the TRLE definition. Furthermore, each DATAPATH device requires a certain amount of
fixed storage. For more information, see “VTAM considerations” on page 299.
VLAN IDs must be unique on a single OSA port within a single stack. If you code multiple
INTERFACE statements from one stack to the same OSA and do not configure a VLAN ID
for one INTERFACE, the INTERFACE definition is rejected.
If one INTERFACE within a stack that is connecting to an OSA port is implemented with
VLAN/VMAC, then all INTERFACE statements connecting to the same OSA port within that
stack must specify VLAN/VMAC.
If more than one INTERFACE is defined for a particular IP version for a single OSA port
within a stack, then the VLANID, VMAC, and IP subnet values must be unique on each of the
INTERFACE statements. If parallel interfaces are needed with the same IP subnet and same
VLANID, then the parallel INTERFACE statements must be coded on different OSA ports.
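The uniqueness rules above lend themselves to a quick check before a profile is activated. The following Python sketch is only an approximation of those rules, not the validation that the stack performs. The InterfaceDef class, the second interface name (OSA20C0J), and the VLAN ID 10 are hypothetical, and a VMAC of None stands for a VMAC that is coded without an address.

```python
import ipaddress
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class InterfaceDef:
    name: str                     # INTERFACE name
    portname: str                 # OSA port (PORTNAME)
    ip_version: int               # 4 or 6
    ipaddr: str                   # address with prefix length, for example "10.1.3.11/24"
    vlanid: Optional[int] = None
    vmac: Optional[str] = None    # None: no MAC address is coded


def check_interfaces(defs: List[InterfaceDef]) -> List[str]:
    """Report rule violations for interfaces that share an OSA port and IP version."""
    problems: List[str] = []
    groups: Dict[Tuple[str, int], List[InterfaceDef]] = {}
    for d in defs:
        groups.setdefault((d.portname, d.ip_version), []).append(d)
    for group in groups.values():
        if len(group) < 2:
            continue  # the rules apply only when a port has several interfaces
        seen: Dict[Tuple[str, object], str] = {}
        for d in group:
            if d.vlanid is None:
                problems.append(f"{d.name}: VLANID is required")
            subnet = ipaddress.ip_interface(d.ipaddr).network
            for kind, value in (("VLANID", d.vlanid), ("VMAC", d.vmac), ("subnet", subnet)):
                if value is None:
                    continue
                if (kind, value) in seen:
                    problems.append(f"{d.name}: {kind} {value} is already used by {seen[(kind, value)]}")
                else:
                    seen[(kind, value)] = d.name
    return problems
```

An empty list means that no violation was found among the rules that this sketch covers.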
Note: Some switch vendors use VLAN ID 1 as the default value when a VLAN ID value is
not explicitly configured. You should avoid the value of 1 when configuring a VLAN ID
value. By convention, the “Native VLANID” is often coded as the number 1 (one).
Source VIPA
Use the following guidelines when selecting a source VIPA:
In earlier CS releases, for IPv4, when source VIPA is in effect, the stack selects a source
VIPA based on the order of the home list (from the ordering of IP addresses in the HOME
statement in the profile). So, for IPv4, the user controls source VIPA selection by using the
HOME statement.
For IPv6, there is no HOME statement. The user controls source VIPA selection by using the
SOURCEVIPAINTERFACE parameter on the INTERFACE statement.
The source VIPA selection for interfaces that are defined with the IPv4 INTERFACE
statement works the same way as IPv6 (by using the SOURCEVIPAINTERFACE parameter,
which must point to the link name of an IPv4 static VIPA).
For IPv4 interfaces that are defined by using DEVICE/LINK, source VIPA selection continues
to work based on the ordering of the home list.
You can specify SOURCEVIPAINTERFACE for every VLAN you define. The VIPA IP address
can be in the same or different subnet from the IP address of the OSA interface.
ARP processing
In QDIO mode, the OSA performs all Address Resolution Protocol (ARP) processing for IPv4.
The z/OS stack informs the OSA of the IP addresses for which it should perform ARP
processing. Because the z/OS stack also supports configurations where ARPs flow for VIPAs
(which one might see on some flat network configurations by using static routing), the stack
also informs the OSA of the VIPAs for which it should perform ARP processing. OSA sends
gratuitous ARPs for these IP addresses during interface takeover scenarios to provide fault
tolerance.
VTAM considerations
The QDIOSTG VTAM start parameter specifies how much storage VTAM keeps available for all
OSA QDIO devices. Each OSA-Express QDIO DATAPATH device consumes a large amount
of fixed storage. The QDIOSTG value can be overridden by using the READSTORAGE parameter on
the IPAQENET LINK or the INTERFACE statement in the TCP/IP profile. Because every VLAN adds
another OSA DATAPATH device, as a preferred practice, use VTAM tuning statistics in a
multi-VLAN environment to evaluate the storage needs.
6.4.4 Verification
In our example, we perform TCPIP device displays and retrieve the OAT to present how
multiple VLANs are recognized by the system. Example 6-9 shows the output of the TCPIP
device display. We define two VLANs and a source VIPA on the INTERFACE statement.
D TCPIP,TCPIPA,N,DEV,INTFN=OSA20C0I VLAN 11
INTFNAME: OSA20C0I INTFTYPE: IPAQENET INTFSTATUS: READY
PORTNAME: OSA20C0 DATAPATH: 20C2 DATAPATHSTATUS: READY
CHPIDTYPE: OSD
SPEED: 0000001000
IPBROADCASTCAPABILITY: NO
VMACADDR: 02000E776C05 VMACORIGIN: OSA VMACROUTER: ALL
ARPOFFLOAD: YES ARPOFFLOADINFO: YES
CFGMTU: 1492 ACTMTU: 1492
IPADDR: 10.1.3.11/24 1
VLANID: 11 VLANPRIORITY: DISABLED 2
DYNVLANREGCFG: NO DYNVLANREGCAP: YES
Example 6-10 shows the OAT of a channel-path identifier (CHPID) that is defined as multiple
VLANs and source VIPA.
Example 6-10 OAT of a CHPID that is defined as multiple VLANs and source VIPA
Image 1.1 (A11 ) CULA 0
00(20C0)* MPC N/A OSA20C0 (QDIO control) SIU ALL
02(20C2) MPC 00 No4 No6 OSA20C0 (QDIO data) SIU ALL
VLAN 11 (IPv4)
VMAC IP address
HOME 02000E776C05 010.001.003.011 3
VMAC IP address
HOME 020011776873 010.001.002.023 1
HOME 020011776873 010.001.002.025 2
Note: The same VMAC is assigned for the VLAN IP address and the source VIPA IP
address. Because VLAN 11 belongs to a different IP subnet from the source VIPA,
the source VIPA is not displayed on this VLAN.
This chapter describes the SMC capabilities that are implemented on the z Systems platform
and contains the topics that are shown in Table 7-1.
7.1, “What is Shared Memory Communications” on page 304: Introduction of the SMC protocols on the z Systems platform.
7.2, “Enabling SMC support” on page 313: What is needed to support SMC on z Systems platforms, with dependencies and considerations.
7.3, “Setting up the SMC-R environment” on page 317: Configuration examples and implementation steps for building an SMC-Remote Direct Memory Access (SMC-R) environment.
7.4, “Setting up our SMC-D environment” on page 324: Configuration examples and implementation steps for building an SMC-Direct Memory Access (SMC-D) environment.
Both SMC protocols use shared memory architectural concepts, eliminating TCP/IP
processing in the data path, yet preserving TCP/IP quality of service (QoS) for connection
management purposes.
RDMA technology
One of the key InfiniBand transport mechanisms is RDMA, which allows the transfer of data to
or from memory on a remote system with low latency, high throughput, and low CPU
utilization. RDMA over Converged Ethernet (RoCE) is the part of the InfiniBand Architecture
Specification that provides transport over Ethernet fabrics. It encapsulates InfiniBand
transport headers into Ethernet frames by using an IEEE-assigned Ethertype.
A RoCE transport performs best when the underlying Ethernet fabric provides a lossless
capability, where packets are not routinely dropped. This goal can be accomplished by using
Ethernet flow control where Global Pause frames are enabled for both transmission and
reception on each of the Ethernet switches in the path between the 10GbE RoCE Express
features. This capability is enabled by default in the 10GbE RoCE Express feature.
RoCE uses a Layer 2 Ethernet fabric (switches with Global Pause enabled) and requires
advanced Ethernet hardware (RDMA-capable NICs).
PF driver
The PF driver communicates with the physical function (PF) in a PCIe adapter. The PF driver
has the following functions:
Discover, configure, and manage resources.
Perform hardware error handling.
Perform code updates.
Run diagnostic tests.
VF driver
The VF driver is a function that shares a PCIe adapter across multiple LPARs.
SR-IOV in the z Systems platform provides isolation of virtual functions (VFs) within the
10GbE RoCE Express feature. For example, the 10GbE RoCE Express feature can be shared by up
to 31 LPARs in the z13 and z13s, and one LPAR cannot cause errors visible to other VFs or other LPARs.
Each operating system LPAR has its own VF driver and application queue in its memory
space.
SMC-R model
SMC-R is a hybrid solution, as shown in Figure 7-3. It uses an existing TCP connection to
establish the SMC-R connection. A TCP option (SMCR) controls switching from TCP to “out
of band” SMC-R. The SMC-R information is exchanged within the TCP data stream. Socket
application data is exchanged through RDMA (write operations). The TCP connection
remains established to control the SMC-R connection.
[Figure: on each side, Middleware/Application over Sockets, TCP and SMC-R, IP, and Interface, connected by an IP network (Ethernet)]
Dynamic (in-line) negotiation for SMC-R is initiated by the presence of the TCP option (SMCR).
The TCP connection transitions to SMC-R, allowing application data to be exchanged by using RDMA.
SMC-D over ISM does not use QP technology like SMC-R. Therefore, links and Link Groups
based on QPs (or other hardware constructs) are not applicable to ISM. The SMC-D protocol has
a design concept of a “logical point-to-point connection” called an SMC-D link.
Note: The SMC-D information in the netstat command displays is related to ISM link
information (not Link Groups).
Virtual memory is managed by each z/OS (similar to SMC-R logically shared memory)
following the existing z Systems PCIe I/O translation architecture.
[Figure: a server and a client, each with Sockets over SMC]
Figure 7-4 Connect two z/OS LPARs in the same z Systems platform by using SMC-D
SMC-D model
SMC-D is a protocol that allows TCP socket applications to transparently use ISM. ISM is a
virtual channel similar to IQD for HiperSockets. A virtual adapter is created in each z/OS
LPAR and by using the SMC protocol, the memory is logically shared. The virtual network is
provided by firmware.
SMC is based on a TCP/IP connection and preserves the entire network infrastructure.
SMC-D is also a “hybrid” solution, as shown in Figure 7-5. It uses a TCP connection to
establish the SMC-D connection. The TCP path can be either through an OSA-Express port
or HiperSockets connection. A TCP option (called SMCD) controls switching from TCP to “out
of band” SMC-D. The SMC-D information is exchanged within the TCP data stream. Socket
application data is exchanged through ISM (write operations). The TCP connection remains
established to control the SMC-D connection.
[Figure: on each side, Sockets over TCP and SMC-D, IP, and Interface. The TCP connection transitions to SMC-D, allowing application data to be exchanged by using Direct Memory Access (LPAR-to-LPAR)]
Figure 7-5 Dynamic transition from TCP to SMC-D by using two OSA-Express features
Both SMC protocols can coexist in the same z Systems platform. Figure 7-6 shows a
three-tier solution using both SMC-D and SMC-R across two z Systems platforms.
Function ID
The 10GbE RoCE features and the ISM adapters are identified by a hexadecimal Function
Identifier (FID) with a range of 00 - FF. A FID can be used only by one LPAR at a time, but is
reconfigurable. Only one FID can be defined for z Systems platforms before the z13 or z13s.
Up to 31 FIDs can be defined for shared mode (on a z13 and a z13s) for each physical card.
A PCHID on a FUNCTION statement must be unique and cannot match a PCHID value on the
CHPID statement.
VCHID specifies the virtual channel identification number (7E0 - 7FF) that is associated with
the VF. VCHID is required for FUNCTION TYPE=ISM. A VCHID on a FUNCTION statement must be
unique and cannot match a VCHID value on the CHPID statement.
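The value ranges that are described in this section can be checked with a short script before the definitions are entered in HCD. The following Python sketch covers only the rules that are stated here (FID range, TYPE keyword values, and the VCHID requirement and range for ISM). The function name and its parameters are hypothetical, and the sketch does not parse real IOCP input.

```python
import string
from typing import List, Optional


def _is_hex(value: str) -> bool:
    return len(value) > 0 and all(c in string.hexdigits for c in value)


def check_function(fid: str, func_type: str, vchid: Optional[str] = None) -> List[str]:
    """Return a list of problems for one FUNCTION definition (empty if none are found)."""
    problems: List[str] = []
    if not (_is_hex(fid) and len(fid) <= 2):
        problems.append("FID must be a hexadecimal value in the range 00 - FF")
    if func_type not in ("ISM", "ROCE", "ROC"):
        problems.append("TYPE must be ISM, ROCE, or ROC")
    if func_type == "ISM":
        if vchid is None:
            problems.append("VCHID is required for TYPE=ISM")
        elif not (_is_hex(vchid) and 0x7E0 <= int(vchid, 16) <= 0x7FF):
            problems.append("VCHID must be in the range 7E0 - 7FF")
    return problems
```

Uniqueness across FUNCTION and CHPID statements is not checked here; that would require the complete I/O configuration.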
Physical network ID
The physical network ID (PNETID) is used to logically group interfaces and adapters based
on connectivity. Operating systems (for example, z/OS) dynamically learn the PNETID (from
the I/O configuration) and then group OSA-Express ports and 10GbE RoCE Express ports
based on matching PNETIDs for SMC-R. For SMC-D, OSA-Express ports or HiperSockets
and ISM are grouped based on matching PNETIDs.
Important: If you do not configure a PNETID for the RoCE adapter, activation fails. If you
do not configure a PNETID for the OSA-Express port, activation is successful, but the
interface is not eligible for SMC-R use.
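The PNETID matching rule can be pictured as a simple grouping step. The following Python sketch is a simplified model, not the algorithm that z/OS uses: the Adapter class and the adapter name ROCE-A are hypothetical, whereas the interface names and PNETID values are the ones that appear in the displays in this chapter.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Adapter:
    name: str
    kind: str               # "OSA", "HIPERSOCKETS", "ROCE", or "ISM"
    pnetid: Optional[str]   # None: no PNETID is configured


def eligible_ip_interfaces(adapters: List[Adapter], protocol: str) -> List[str]:
    """List the IP interfaces whose PNETID matches an adapter for the given SMC protocol."""
    if protocol == "SMC-R":
        accelerator, ip_kinds = "ROCE", {"OSA"}
    elif protocol == "SMC-D":
        accelerator, ip_kinds = "ISM", {"OSA", "HIPERSOCKETS"}
    else:
        raise ValueError("protocol must be SMC-R or SMC-D")
    pnetids = {a.pnetid for a in adapters if a.kind == accelerator and a.pnetid}
    return sorted(a.name for a in adapters if a.kind in ip_kinds and a.pnetid in pnetids)
```

An OSA-Express port without a PNETID never appears in the result, which corresponds to an interface that activates successfully but is not eligible for SMC-R use.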
TYPE
Specifies the type of function adapters that are supported for SMC-R and SMC-D. The
following TYPE keyword values are allowed:
ISM specifies that the Function ID is associated with an SMC-D (internal virtual network
connection).
ROCE or ROC specifies that the Function ID is associated with a 10GbE RoCE Express
feature.
PART
The PART keyword specifies the availability of the FID to LPARs. All LPAR names that are
specified must match those that are specified in the RESOURCE statement.
[Figure: two RoCE features, PCHID=100 and PCHID=12C, both with PNETID=PNETA]
Example 7-1 shows a sample FUNCTION configuration to define 10GbE RoCE Express ports
that are shared between LPARs.
Physical 10GbE RoCE Express features on PCHID 100 and PCHID 12C can be shared
between other LPARs in the z Systems platform by adding FUNCTION statements with different
FIDs and VFs.
[Figure: two LPARs, each with Sockets over SMC, share one ISM adapter (VCHID=7E1, PNETID=PNET1); one LPAR uses FID=17 with VF=1 and the other uses FID=18 with VF=2]
Figure 7-8 ISM adapters that are shared between LPARs in a single z Systems platform
Note: In Figure 7-8, the ISM network “PNET1” is referenced by the PNETID statement. ISM
(like HiperSockets) does not use physical cards or card slots (PCHID), but instead uses a
logical instance that is defined as a VCHID.
Workloads can be logically isolated on separate ISM VCHIDs or RoCE PCHIDs. Alternatively,
workloads can be isolated by using VLANs. The VLAN definitions are inherited from the
associated IP network definitions of the OSA-Express ports or HiperSockets with the same
PNETID. The VLANs are registered or inherited up front when the RNIC is first activated. The
VLANs are already registered to the RoCE feature before the TCP connection is set up.
Configuration considerations
The IOCDS (HCD) definitions for ISM PCI VFs are not directly related to the software
(SMC-D) usage of ISM (the z/OS TCP/IP and SMC-D implementation and usage are not
directly related to the I/O definition).
The user defines a list of ISM FIDs (VFs) in IOCDS (HCD), and z/OS dynamically selects an
eligible FID based on the required PNETID. FIDs or VFs are not defined in Communications
Server for z/OS TCP/IP. Instead, z/OS selects an available FID for a specific PNETID. Access to
additional VLANs does not require configuration of additional VFs.
Note: For future use, consider over-provisioning the number of FIDs and VFs for each ISM
VCHID.
For native PCI devices, FIDs must be defined. Each FID in turn also defines a corresponding
VF. In terms of operating system administration tasks, the administrator typically references
FIDs. Usually VFs (and VF numbers) are transparent.
SMC-R needs the following items in each IBM z13, IBM z13s™, IBM zEC12, or IBM zBC12:
10 Gigabit Ethernet (10GbE) RoCE Express features. Up to sixteen 10GbE RoCE
Express features are supported per platform. The ports must be dedicated to an LPAR on
a zEC12 or zBC12. On a z13 or z13s, the ports can be shared across LPARs.
OSA-Express ports in queued direct input/output (QDIO) mode (channel-path identifier
(CHPID) type OSD). The supported OSA-Express features include the 10 GbE, the 1
GbE, and the 1000BASE-T.
A standard 10 GbE switch is optional and does not have to be RoCE-enabled.
Input/output configuration data set (IOCDS) with PCHID, FID, VF (for sharing), and
PNETID defined to the FUNCTION statement for the 10GbE RoCE Express ports, and a
matching PNETID that is defined to the CHPID statement for the OSA-Express ports.
SMC-D requires the following items in each IBM z13 or IBM z13s:
HiperSockets connections or OSA-Express ports in queued direct input/output (QDIO)
mode (CHPID type OSD). The supported OSA-Express features include the 10 GbE, the
1 GbE, and the 1000BASE-T.
Input/output configuration data set (IOCDS) with VCHID, FID, VF, and PNETID defined to
the FUNCTION statement for the ISM with a matching PNETID in the CHPID statement for
HiperSockets connections or the OSA-Express ports.
An SMC-R point-to-point connection is a viable option for test scenarios, but is not a preferred
practice for production deployment because the connection does not allow for connectivity
with other LPARs (multiple SMC-R peers).
If the 10GbE RoCE Express ports are connected to 10 GbE switches, the switch ports must
be set to the following settings:
Global Pause: IEEE 802.3x port-based Flow Control should be enabled.
Priority Flow Control (PFC): IEEE 802.1Qbb, priority-based Flow Control should be
disabled.
The maximum supported unrepeated point-to-point distance is 300 meters (984.25 feet)
between the 10GbE RoCE Express port and the 10 GbE switch port.
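These switch port requirements can be captured in a short checklist script. The following Python sketch is illustrative only: the SwitchPort class and its field names are hypothetical and do not correspond to any switch vendor's interface.

```python
from dataclasses import dataclass
from typing import List

MAX_UNREPEATED_DISTANCE_M = 300  # maximum unrepeated point-to-point distance in meters


@dataclass
class SwitchPort:
    global_pause_enabled: bool  # IEEE 802.3x port-based flow control
    pfc_enabled: bool           # IEEE 802.1Qbb priority-based flow control
    distance_m: float           # cable length to the 10GbE RoCE Express port


def check_switch_port(port: SwitchPort) -> List[str]:
    """Return the settings that do not match the requirements for RoCE Express ports."""
    problems: List[str] = []
    if not port.global_pause_enabled:
        problems.append("Global Pause (IEEE 802.3x) should be enabled")
    if port.pfc_enabled:
        problems.append("Priority Flow Control (IEEE 802.1Qbb) should be disabled")
    if port.distance_m > MAX_UNREPEATED_DISTANCE_M:
        problems.append("distance exceeds 300 meters")
    return problems
```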
Table 7-2 shows the port characteristics of the 10GbE RoCE Express feature and supported
z Systems platforms.
Table 7-2 Characteristics of the 10GbE RoCE Express feature per z Systems platform
z Systems platform | Supported ports | Shared ports (a) | Dedicated ports
The configuration and operations tasks follow the same process (HCD or IOCDS) as existing
PCI functions, such as RoCE Express and zEDC Express. ISM supports dynamic I/O and
provides adapter virtualization (VFs) with the following characteristics:
Up to 32 ISM VCHIDs per z13 or z13s. A VCHID represents a unique ISM network, each
with a unique PNETID.
Each VCHID supports up to 255 VFs (the maximum is 8,000 VFs per z13 or z13s).
VCHIDs support VLANs.
A Global Identifier (GID) that is internally generated to correspond with each ISM FID.
Virtual MACs (VMACs), MTU, physical ports, and frame size are not applicable.
z/VM is supported in pass-through mode (PTF is required).
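The limits in this list interact: 32 VCHIDs with 255 VFs each would amount to 8,160 VFs, which exceeds the platform maximum of 8,000. A planned configuration can be checked against all three limits, as in the following Python sketch. The function name and the dictionary layout are our own assumptions.

```python
from typing import Dict, List

MAX_VCHIDS = 32
MAX_VFS_PER_VCHID = 255
MAX_VFS_PER_PLATFORM = 8000


def check_ism_plan(vfs_per_vchid: Dict[str, int]) -> List[str]:
    """Check a plan (VCHID -> number of VFs) against the ISM limits for a z13 or z13s."""
    problems: List[str] = []
    if len(vfs_per_vchid) > MAX_VCHIDS:
        problems.append(f"{len(vfs_per_vchid)} VCHIDs exceed the limit of {MAX_VCHIDS}")
    for vchid, count in sorted(vfs_per_vchid.items()):
        if count > MAX_VFS_PER_VCHID:
            problems.append(f"VCHID {vchid}: {count} VFs exceed the limit of {MAX_VFS_PER_VCHID}")
    total = sum(vfs_per_vchid.values())
    if total > MAX_VFS_PER_PLATFORM:
        problems.append(f"{total} VFs in total exceed the limit of {MAX_VFS_PER_PLATFORM}")
    return problems
```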
https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.2.0/com.ibm.zos.v2r2.halz001/globalconfigstatement.htm
For more information about SMC-R planning and security considerations, go to:
http://www.ibm.com/software/network/commserver/SMC-R/
For more information about SMC-D planning and security considerations, go to:
http://www.ibm.com/software/network/commserver/SMC-D/
[Figure: z Systems (z13) with four systems that are connected through an Ethernet switch]
System  Stack   Profile                  VIPA1I      VIPA2I
SC30    TCPIPA  PROFA30F (Flat network)  10.1.10.10  10.1.20.10
SC31    TCPIPB  PROFB31F (Flat network)  10.1.10.20  10.1.20.20
SC32    TCPIPC  PROFC32F (Flat network)  10.1.10.30  10.1.20.30
SC33    TCPIPD  PROFD33F (Flat network)  10.1.10.40  10.1.20.40
We use two RoCE and two OSA-Express interfaces that are shared across four z/OS LPARs.
The I/O configuration that is shown in this section is defined in HCD, with the resulting IOCDS
definitions shown in Example 7-3.
To use the RoCE adapter in shared mode, create one FUNCTION statement for each LPAR in
our scenario, defining a specific FID for each LPAR and a Virtual Function ID for each TCP/IP
stack.
The resulting IOCDS for our scenario is shown in Example 7-3 on page 317.
Both OSAs are shared by the same partitions as the RoCE adapters.
The profiles for each LPAR and TCP/IP stack that are going to be part of the same
environment must have at least one PFID that is associated with a specific RoCE adapter.
PORTNUM defaults to port 1 of the RoCE feature. If you want to use port 2 of the RoCE
feature, then PFID xxxx PORTNUM 2 must be defined on the GLOBALCONFIG statement.
In our environment, each TCP/IP stack uses both RoCE adapters, as shown in Example 7-5.
Defining the OSA interfaces that are not for SMC-R use
After the global statement SMCR is configured in the TCP/IP profile, all IPAQENET interfaces
with CHPID TYPE OSD use the SMC-R function by default.
In our test environment, we use two OSA interfaces. To compare the throughput with and
without SMC-R, we define one of the interfaces with the NOSMCR parameter, as shown in
Example 7-6.
During the TCP/IP startup process, RNIC interfaces are dynamically created and associated
with the OSA interfaces where SMCR is defined. This can be verified by using a display
command, as shown in Example 7-8.
Example 7-8 Verify SMC-R through an OSA interface display (partial results)
D TCPIP,TCPIPA,N,DEV,INTFN=OSA23A0I
INTFNAME: OSA23A0I INTFTYPE: IPAQENET INTFSTATUS: READY
PORTNAME: OSA23A0 DATAPATH: 23A2 DATAPATHSTATUS: READY
CHPIDTYPE: OSD SMCR: YES 1
PNETID: COMMSRVA 2 SMCD: NO
...
INTERFACE STATISTICS:
BYTESIN = 0
INBOUND PACKETS = 0
INBOUND PACKETS IN ERROR = 0
INBOUND PACKETS DISCARDED = 0
INBOUND PACKETS WITH NO PROTOCOL = 0
BYTESOUT = 84
OUTBOUND PACKETS = 1
OUTBOUND PACKETS IN ERROR = 0
In the results from the display, you can see the following information:
1. The OSA interface OSA23A0I has SMC-R enabled.
2. OSA23A0I is using PNETID COMMSRVA.
3. OSA23A0I is associated with the RNIC interfaces that are created during TCP/IP stack
startup.
You also can verify the status of the RNIC interfaces by using a display command, as shown
in Example 7-9.
In the results from the display, you can verify the following relevant information:
1. The PFID that is associated with the LPAR in which this stack is running.
2. The dynamic TRLE that is created in VTAM to connect the RNIC physical interface.
3. The PNETID that represents the physical network to which this interface is connected. It
must be the same for all OSAs and TCP/IP stacks that are using this network.
Each batch job transfers data between different stacks and LPARs to verify that the RNIC
interface is being shared as expected. The results are similar in each stack, so we show the
results from one stack only.
The first test transferred data through the OSA network without SMC-R, by using
subnetwork 10.1.10.x. The results are shown in Example 7-10 on page 321.
During the data transfer process, each concurrent job being activated causes the overall
performance to drop. You can observe the CPU utilization of the FTP batch jobs and the
TCP/IP stacks, as shown in Example 7-11.
The OSA interfaces are getting most of the workload and are the bottleneck in this test.
Next, move to the SMC-R test. Before you initiate the data transfer by using SMC-R, check
the RNIC interfaces to verify their status, as shown in Example 7-12.
The display command shows that both RNIC interfaces are ready and no connections are
established through them.
Then, start the same concurrent jobs, now using subnetwork 10.1.20.x, which is defined to
use SMC-R, with the results that are shown in Example 7-13.
Example 7-13 Job log (partial) for the FTP data transfer by using SMC-R
EZA1736I FTP
EZY2640I Using dd:SYSFTPD=TCPIPA.TCPPARMS(FTPDA30) for local site configuration parameters.
...
EZA1466I FTP: using TCPIPA
EZA1456I Connect to ?
EZA1736I 10.1.20.30
EZA1554I Connecting to: 10.1.20.30 port: 21.
...
EZA1460I Command:
EZA1736I PUT 'cs03.seq1' seq12
EZA1701I >>> SITE VARrecfm LRECL=27998 RECFM=U BLKSIZE=27998
200 SITE command was accepted
EZA1701I >>> PORT 10,1,20,10,4,6
200 Port request OK.
EZA1701I >>> STOR seq12
125 Storing data set CS03.SEQ12
EZA1485I 937781382 bytes transferred - 10 second interval rate 93778.06 KB/sec - Overall
transfer rate 93778.06 KB/sec
250 Transfer completed successfully.
EZA1617I 1135494539 bytes transferred in 12.080 seconds. Transfer rate 93997.88
Kbytes/sec.
With SMC-R in use, you can see that the bottleneck moves to the application. This is the
wanted result: because the data is delivered faster, the application uses more CPU, but for a
shorter time. The TCP/IP stack uses less CPU, and the overall performance improves.
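The transfer rate that FTP reports in the job log can be recomputed from the byte count and the elapsed time in the EZA1617I message, which is useful when you compare runs with and without SMC-R. The following Python sketch parses only the message text that is shown in Example 7-13; the function name is hypothetical. Dividing by 1000 reproduces the reported figure to within rounding, so we assume that Kbytes means 1000 bytes here.

```python
import re

# Matches the byte count and elapsed time in an EZA1617I message.
_EZA1617I = re.compile(r"EZA1617I\s+(\d+) bytes transferred in\s+([\d.]+) seconds")


def recomputed_rate_kb(message: str) -> float:
    """Recompute the transfer rate in Kbytes/sec (assumption: 1 Kbyte = 1000 bytes)."""
    match = _EZA1617I.search(message)
    if match is None:
        raise ValueError("no EZA1617I transfer summary found")
    byte_count = int(match.group(1))
    seconds = float(match.group(2))
    return byte_count / seconds / 1000
```

Applying the same function to the job logs of both tests gives the rates in the same unit, so they can be compared directly.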
Looking at the RNIC interface display, you can observe that a connection between the LPARs
is created to transfer the data, as shown in Example 7-15.
The display command shows that SMC links are dynamically created between the stacks
and that data is transferred through these links.
This data is formatted as packet trace data, and we enable the trace process the same way it
is done for TCP/IP connections (for example, protocol, port, and IP address).
The application flow is sent as TCP data as usual. You also can use Connection Layer Control
(CLC) and Link Layer Control (LLC) flows, with full support for TCP/IP component trace
(CTRACE), data trace, and VTAM Internal Trace (VIT).
No additional configuration tasks are necessary, and we format this data through the IPCS
component.
For more information about diagnosing SMC-R, see z/OS Communications Server: IP
Diagnosis Guide, GC31-8782.
We use two ISM interfaces, which are shared by four LPARs and two HiperSockets CHPIDs.
The I/O configuration that is shown in this section is defined in HCD. We show only the
resulting IOCDS definitions in Example 7-16 on page 325.
The following checklist provides a task summary for enabling SMC-D support:
Configuring the ISM interfaces
Configuring the HiperSockets connections (see footnote 1)
Altering the TCP/IP profile to include SMC-D support
Footnote 1: An OSA interface can be used as described in “Configuring the OSA interfaces” on page 318 instead of a
HiperSockets connection.
To use the ISM adapter in shared mode, we create one FUNCTION statement for each LPAR in
our scenario, defining a specific Function ID (FID) for each LPAR and a Virtual Function ID for
each TCP/IP stack.
The resulting IOCDS configuration for ISM in our scenario is shown in Example 7-16.
Both HiperSockets are shared by the same partitions as the ISM adapters.
Footnote 2: The OSA interfaces must have NOSMCD added to the INTERFACE statement in the
TCP/IP profile if they are not used for SMC-D.
Important: Up to eight TCP/IP stacks can share an ISM VCHID (ISM feature) in a specific
LPAR (each TCP/IP stack must define a unique FID value).
In our test environment, we use two IPAQIDIO interfaces to compare the throughput with and
without SMC-D. We define the first interface, HIPERFEI, with the NOSMCD parameter, as shown
in Example 7-19.
The other HiperSockets interface HIPERFFI is defined to use SMC-D and it is connected to
PNETID COMMSRVD, as shown in Example 7-17 on page 325.
Example 7-21 Verify the SMC-D HiperSockets connection through an interface display (partial results)
D TCPIP,TCPIPA,N,DEV,INTFN=HIPERFFI
INTFNAME: HIPERFFI INTFTYPE: IPAQIDIO INTFSTATUS: READY
TRLE: IUTIQ4FF DATAPATH: 7F02 DATAPATHSTATUS: READY
CHPID: FF
PNETID: COMMSRVD 2 SMCD: YES 1
...
INTERFACE STATISTICS:
BYTESIN = 0
INBOUND PACKETS = 0
INBOUND PACKETS IN ERROR = 0
INBOUND PACKETS DISCARDED = 0
INBOUND PACKETS WITH NO PROTOCOL = 0
BYTESOUT = 0
OUTBOUND PACKETS = 0
OUTBOUND PACKETS IN ERROR = 0
OUTBOUND PACKETS DISCARDED = 0
ASSOCIATED ISM INTERFACE: EZAISM01 3
1 OF 1 RECORDS DISPLAYED
END OF THE REPORT
You also can verify the status of the ISM interface by using the display command, as shown in
Example 7-22.
Each batch job transfers data between different TCP/IP stacks and LPARs to verify that the
ISM interface is being shared as expected. The results are similar in each stack, so we show
only the results from one TCP/IP stack.
The first test was made transferring data through the HiperSockets network without SMC-D,
subnetwork 10.1.30.xx, and the results are shown in Example 7-23.
Example 7-23 Job log (partial) for the FTP data transfer by using HiperSockets without SMC-D
EZA1554I Connecting to: 10.1.30.30 port: 21.
...
230 CS03 is logged on. Working directory is "CS03.".
EZA1460I Command:
EZA1736I ebcdic
EZA1701I >>> TYPE E
200 Representation type is Ebcdic NonPrint
EZA1460I Command:
EZA1736I mode b
EZA1701I >>> MODE B
200 Data transfer mode is Block
EZA1460I Command:
EZA1736I site recfm=u blksize=27998 cylinders volume=COMDA2 unit=3390
EZA1701I >>> SITE recfm=u blksize=27998 cylinders volume=COMDA2 unit=3390
EZA1701I >>> PORT 10,1,30,10,4,34
200 Port request OK.
EZA1701I >>> STOR seq13
125 Storing data set CS03.SEQ13
EZA1485I 779980521 bytes transferred - 10 second interval rate 77998.00 KB/sec -
Overall transfer rate 77998.00 KB/sec
250 Transfer completed successfully.
EZA1617I 1135494539 bytes transferred in 14.620 seconds. Transfer rate 77667.19
Kbytes/sec.
By using a HiperSockets interface to transfer data, you have better throughput and higher
CPU utilization compared to the tests made by using the OSA interface.
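The transfer rate in the job log can be cross-checked from the byte count and the elapsed time. The following sketch parses the EZA1617I message from Example 7-23 and recomputes the rate. The function names are hypothetical, and the factor of 1000 bytes per Kbyte is an inference from the logged values, not a documented definition.

```python
import re

# Pattern for the FTP client summary message as it appears in the job log.
EZA1617I = re.compile(
    r"EZA1617I\s+(\d+)\s+bytes transferred in\s+([\d.]+)\s+seconds\."
    r"\s+Transfer rate\s+([\d.]+)\s+Kbytes/sec\."
)


def parse_transfer_summary(log_text):
    """Return (bytes, seconds, reported_rate) from an EZA1617I message."""
    # Job logs wrap long messages, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    match = EZA1617I.search(flat)
    if match is None:
        raise ValueError("no EZA1617I message found")
    return int(match.group(1)), float(match.group(2)), float(match.group(3))


def rate_kbytes_per_sec(byte_count, seconds):
    """Recompute the rate, assuming 1 Kbyte = 1000 bytes (an inference)."""
    return byte_count / seconds / 1000.0


# Message text copied from Example 7-23 (HiperSockets without SMC-D).
hipersockets_log = """EZA1617I 1135494539 bytes transferred in 14.620 seconds. Transfer rate 77667.19
Kbytes/sec."""

size, secs, reported = parse_transfer_summary(hipersockets_log)
recomputed = rate_kbytes_per_sec(size, secs)
print(f"reported {reported} Kbytes/sec, recomputed {recomputed:.2f} Kbytes/sec")
```

The recomputed value differs from the reported one only in the second decimal place, which is consistent with the elapsed time being rounded to milliseconds in the message.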
Before you initiate the data transfer by using SMC-D, check the ISM interface to verify its
status, as shown in Example 7-25.
The display command shows that the ISM interface is ready, but no traffic is using this path.
Next, start the same concurrent jobs by using subnetwork 10.1.40.x, which is defined to use
SMC-D. The results are shown in Example 7-26.
Example 7-26 Job log (partial) for the FTP data transfer by using HiperSockets with SMC-D
EZA1736I FTP
EZY2640I Using dd:SYSFTPD=TCPIPA.TCPPARMS(FTPDA30) for local site configuration
parameters.
...
EZA1459I NAME (10.1.40.30:CS01):
EZA1701I >>> USER cs03
...
EZA1701I >>> STOR seq14
125 Storing data set CS03.SEQ14
Using SMC-D, we saw better throughput compared to the test using HiperSockets interface
without SMC-D, as shown in Example 7-23 on page 328.
To confirm that the SMC-D interface is used to transfer our data, we run the command Netstat
DevLink,SMC again, as shown in Example 7-27.
Next, we look at the CPU utilization during the data transfer, and we see that the overall
utilization is reduced significantly, as shown in Example 7-28.
This data is formatted as packet trace data, and we enable the trace process the same way
as is done for TCP/IP connections (protocol, port, and IP address).
No additional configuration processing is necessary, so we format this data through the IPCS
component.
For more information about diagnosing SMC-R, see z/OS Communications Server: IP
Diagnosis Guide, GC31-8782.
As mentioned, the subplexing support is also for VTAM nodes. However, this chapter
describes subplexing only for TCP/IP stacks. For information about VTAM subplexing, see
SNA Network Implementation Guide, SC31-8777.
This chapter covers the topics that are shown in Table 8-1.
8.1, “Introduction” on page 334: The subplexing concept, and the environment on which it
can be used.
8.3, “Load Balancing Advisor and subplexing” on page 337: The Load Balancing Advisor (LBA)
allows any external load balancing solution to become sysplex aware.
[Figure: Dedicated LPARs with single TCP/IP stacks and LPARs with stacks TCPIPA and TCPIPB;
communications with all TCP/IP stacks through HiperSockets, XCF, and IUTSAMEH]
Concept of subplexing
A subplex is a subset of a sysplex that consists of selected members. Those members are
connected and they communicate through the dynamic cross-system coupling facility (XCF)
groups to each other, using the following methods:
XCF links (for cross-system IP and VTAM connections)
IUTSAMEH (for IP connections within an LPAR)
HiperSockets (IP connections cross-LPAR in the same server)
Subplexes do not communicate with members outside the subset of the sysplex. For
example, in Figure 8-2, TCP/IP stacks with connectivity to the internal network can be
isolated from TCP/IP stacks that are connected to the external network by using subplexing.
[Figure 8-2: Subplex 1 and Subplex 2 across z/OS LPARs. Dedicated LPARs with single TCP/IP
stacks and multi-purpose LPARs with dual TCP/IP stacks (TCPIPA, TCPIPB) communicate within
the same subplex through HiperSockets, XCF, and IUTSAMEH. No communications to dissimilar
subplexes.]
TCP/IP stacks are defined as members of a subplex group with a defined group ID. For
example, in Figure 8-2, TCP/IP stacks within subplex 1 can communicate only with stacks
within the same subplex group. They cannot communicate with stacks in subplex 2.
Note: Although there are specialized cases where multiple stacks per LPAR can provide
value, as a preferred practice, implement only one TCP/IP stack per LPAR when possible.
Figure 8-3 illustrates our TCP/IP subplexing environment with the following attributes:
The first subplex is a VTAM subplex, which is not within the scope of this book. However,
when defining only a TCP/IP subplex, a default VTAM subplex is defined automatically.
Note: A TCP/IP subplex uses VTAM XCF support for DYNAMICXCF connectivity.
Therefore, a TCP/IP stack cannot span different VTAM subplexes.
The second subplex is configured with TCP/IP C stacks running in LPARs A11 and A13,
representing the internal subplex.
The third subplex is configured with TCP/IP D stacks running in LPARs A13 and A16,
representing the external subplex.
[Figure 8-3: VTAM subplex; IP Subplex 11 with the TCP/IP C stacks; IP Subplex 22 with the
TCP/IP D stacks]
With subplex support for LBA, more than one advisor can be active in the sysplex at any given
time. In fact, there should be one advisor active for each subplex in the sysplex that
participates in load balancing through the LBA. Each advisor reads configuration data from a
file, which can exist as a z/OS UNIX file, a PDS or PDSE member, or a sequential data set.
In the configuration file for each advisor, the sysplex_group_name statement specifies the
TCP/IP sysplex group name in the form of EZBTvvtt, where vv is the VTAM subplex group ID
that is specified on the VTAM XCFGRPID start option and tt is the TCP/IP subplex group ID that
is specified by the XCFGRPID parameter on the GLOBALCONFIG statement in the TCP/IP profile. If
no VTAM subplex ID is specified when VTAM is started, then vv is CP. If no TCP/IP subplex ID
is specified in the TCP/IP profile, then tt is CS. If you have a default subplex in your sysplex
(that is, a subplex in which both the VTAM and TCP/IP subplex IDs are not specified),
configure the LBA for that subplex with a sysplex group name of EZBTCPCS.
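The naming rule can be expressed as a short function. This is a minimal sketch; the function name and the use of None for "not specified" are assumptions made for illustration, and the two-digit zero padding follows the EZBT2102 example in this chapter.

```python
def sysplex_group_name(vtam_xcfgrpid=None, tcp_xcfgrpid=None):
    """Build the TCP/IP sysplex group name in the form EZBTvvtt.

    None means that the corresponding XCFGRPID was not specified,
    so the default suffix (CP for VTAM, CS for TCP/IP) is used.
    """
    vv = "CP" if vtam_xcfgrpid is None else f"{int(vtam_xcfgrpid):02d}"
    tt = "CS" if tcp_xcfgrpid is None else f"{int(tcp_xcfgrpid):02d}"
    return f"EZBT{vv}{tt}"


# Default subplex: neither ID is specified.
print(sysplex_group_name())
# VTAM XCFGRPID 21 and TCP/IP XCFGRPID 02.
print(sysplex_group_name(21, 2))
# Only the TCP/IP XCFGRPID is specified.
print(sysplex_group_name(tcp_xcfgrpid=11))
```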
Figure 8-4 shows that an LBA application is configured to allow an external LBA to connect to
the internet subplex and the intranet production subplex.
LB2 is balancing connections to applications running TCP/IP stacks in the intranet production
IP subplex on LPAR3, LPAR4, and LPAR5. The TCP/IP sysplex group name is EZBT2102
(VTAM XCFGRPID 21 and TCP/IP XCFGRPID 02). The TCP/IP subplex ID is 2102. LB2
connects to an LBA in this subplex. The Advisor, LBAD2102, is configured to use stacks that
are members of the TCP/IP subplex ID of 2102. A single instance of this Advisor can run in
LPAR3, LPAR4, or LPAR5. It is running in LPAR3. Three agents are configured to use stacks
that are members of TCP/IP subplex ID of 2102. The three agent job names are as follows:
LBAG2102 on LPAR3
LBAG2102 on LPAR4
LBAG2102 on LPAR5
Note: Although there are two TCP/IP stacks in LPAR5 in subplex 2102, there is only
one Load Balancing Agent for that subplex on that LPAR. The one Agent reports on all
servers in that LPAR in that subplex.
There is no load balancing for applications that are running in the intranet development IP
subplex. Therefore, no advisor and no agents need to run in this subplex. If you want to load
balance in the intranet development IP subplex, configure an Advisor instance to run on either
LPAR4 or LPAR5. Also, configure an Agent instance to run on both LPAR4 and LPAR5, and
configure the Advisor and Agent applications to use stacks that are members of TCP/IP
subplex ID 2104 (TCPIP6 and TCPIP7).
There are two subplexes in the three LPARs on the right side of the figure. The production IP
subplex has TCP/IP subplex ID 2102 because the VTAM XCF group ID is 21 and the TCP/IP
XCF group ID is 02. Subplex 2102 spans LPAR3, LPAR4, and LPAR5. The TCP/IP sysplex
group name is EZBT2102. This subplex includes the following stacks:
Stack TCPIP3 on LPAR3
Stack TCPIP4 on LPAR4
Stacks TCPIP5 and TCPIP8 on LPAR5
The Development IP subplex spans only LPAR4 and LPAR5. This subplex has a TCP/IP
subplex ID of 2104, which is VTAM XCF group ID 21 and TCP/IP XCF group ID 04. The
TCP/IP sysplex group name is EZBT2104. This subplex includes the following stacks:
Stack TCPIP6 on LPAR4
Stack TCPIP7 on LPAR5
Note: A TCP/IP subplex cannot span multiple VTAM subplexes because all TCP/IP stacks
on an LPAR use the same VTAM for their dynamic XCF communication.
If the IP traffic for a defined subplex uses HiperSockets, which is the preferred method for
cross-LPAR connectivity within the same server, then an additional parameter (IQDVLANID) in
the GLOBALCONFIG is needed for the HiperSockets VLAN ID of the HiperSockets connection
that is built with the DYNAMICXCF definition. Values 2 - 31 are valid for XCFGRPID, and IQDVLANID
allows values 1 - 4094. If you define HiperSockets with DEVICE and LINK statements, the
parameter VLANID on the LINK statement is required for assigning the VLAN for the subplex.
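The value ranges above lend themselves to a quick check before a profile is activated. The following sketch is illustrative only; the function name is hypothetical, and it checks nothing beyond the two numeric ranges stated in the text.

```python
def check_subplex_ids(xcfgrpid, iqdvlanid):
    """Return a list of problems; an empty list means both values are in range."""
    problems = []
    # XCFGRPID on the GLOBALCONFIG statement: values 2 - 31 are valid.
    if not 2 <= xcfgrpid <= 31:
        problems.append(f"XCFGRPID {xcfgrpid} is outside 2 - 31")
    # IQDVLANID on the GLOBALCONFIG statement: values 1 - 4094 are valid.
    if not 1 <= iqdvlanid <= 4094:
        problems.append(f"IQDVLANID {iqdvlanid} is outside 1 - 4094")
    return problems


# Values used for the internal and external subplexes in this chapter.
print(check_subplex_ids(11, 11))
print(check_subplex_ids(22, 22))
# Two out-of-range values produce two findings.
print(check_subplex_ids(1, 5000))
```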
Figure 8-5 depicts our subplexing environment: three LPARs with a VTAM subplex, and two IP
subplexes (11 and 22). Because we did not define the VTAM subplex, the XCFGRPID value for
the VTAM subplex automatically defaults to CP.
[Figure 8-5: Subplexing environment. IP Subplex 11 (internal subplex): TCP/IP C stacks,
VIPAs 10.30.1.230 and 10.30.1.241, XCFGRPID 11, IQDVLANID 11, HiperSockets and XCF
addresses .100 and .101 on 10.30.20.0/24. IP Subplex 22 (external subplex): TCP/IP D stacks,
VIPAs 10.40.1.241 and 10.40.1.221, XCFGRPID 22, IQDVLANID 22, HiperSockets and XCF
addresses .101 and .102 on 10.20.40.0/24. VTAM IQDCHPID: F7. OSPF_Area=0.0.0.2,
Stub_Area=YES.]
For TCP/IP, both the VTAM group ID suffix and the TCP group ID suffix are used to build the
TCP/IP group name. This group name is also used to join the sysplex. Remember, when
starting TCP/IP under Sysplex Autonomics control in previous z/OS releases, the stack joined
the sysplex group with the name EZBTCPCS. You can verify this by using the D XCF,GROUP
command.
EZBTCPCS is the default TCP/IP group name. The format of this group name is EZBTvvtt,
where vv is a 2-digit VTAM group ID suffix that is specified on the VTAM XCFGRPID start option
(the default is CP if not specified) and tt is a 2-digit TCP group ID suffix that is specified on the
XCFGRPID parameter of the GLOBALCONFIG statement (the default is CS if not specified).
In our scenario (see 3 in Example 8-3 on page 342), we define XCFGRPID 11 for TCP/IP; we
do not define XCFGRPID for VTAM. The result is an XCF group name of EZBTCP11 (6 in
Example 8-4 on page 343).
You might recognize that both XCFGRPIDs are important in creating the subplex group name.
Changing the VTAM XCFGRPID changes the XCF group name for the TCP/IP stack. Thus, the
stack is no longer a member of the previous TCP/IP subplex group.
Although nothing was changed in the TCP/IP profile definitions in this example, the TCP/IP
stack with the new subplex group name is no longer a member of the previous subplex
(EZBTCP11). Thus, the TCP/IP stack loses the connectivity to the subplex.
Important: If VTAM is brought down and restarted with a different XCFGRPID, the TCP/IP
stacks must be stopped and restarted to pick up the new VTAM subplex group ID.
Otherwise, the TCP/IP stacks continue to act as though they were in the original sysplex
group, resulting in unpredictable connectivity.
If TCP and VTAM Coupling Facility structures are used, names must also be unique for each
subplex to preserve separation between the subplexes. This means that the TCP structures
EZBDVIPA and EZBEPORT must be appended with the VTAM and TCP XCF group ID
suffixes to the end of the structure names (for example, EZBDVIPAvvtt and EZBEPORTvvtt,
where vv is the 2-digit VTAM group ID suffix that is specified on the XCFGRPID start option and
tt is the 2-digit TCP group ID specified in the TCP/IP profile).
The TCP structure names, including the suffixes, must be defined in the sysplex CFRM policy
(see Example 8-1).
Note: Example 8-1 is only a sample. The size depends on the number of source DVIPAs
and concurrently established TCP outbound connections from all TCPSTACKSOURCEVIPA of
the participating stacks within the sysplex. The ephemeral port number for each
connection is stored to avoid duplicate source port numbers.
For more information about TCP and VTAM structures, see z/OS MVS Setting Up a Sysplex,
SA22-7625.
The following sections describe the implementation for each subplex in detail.
[Figure: IP Subplex 11. TCP/IP C stacks, VIPAs 10.30.1.230 and 10.30.1.241, XCFGRPID 11,
IQDVLANID 11, HiperSockets and XCF addresses .100 and .101 on 10.30.20.0/24]
VTAM needs the IQDCHPID start option (see 1 in Example 8-2) to automatically create the
Transmission Resource List Element (TRLE) for the HiperSockets interface of the stack. The
TRLE points to the name IUTIQDIO, which is used as the DEVICE name in the TCP/IP profile.
The PORTNAME that is created by VTAM is IUTIQDxx, where xx is the Channel Path ID (CHPID)
in use.
The DYNAMICXCF function also requires the VTAM start option XCFINIT=YES (see 2 in
Example 8-2), which creates the XCF major node dynamically.
Tip: You can check your VTAM start options by using the D NET,VTAMOPTS command.
Example 8-2 ATCSTRxx definitions that are needed for DYNAMICXCF and the HiperSockets interface
SYS1.VTAMLST(ATCSTR31)
IQDCHPID=F7, 1
XCFINIT=YES 2
Example 8-3 shows the TCP/IP profile definitions that are needed for assigning stack C in
LPAR A13 to subplex 11. Based on the parameters XCFGRPID 3 and IQDVLANID 4, stack C
belongs to subplex 11. The group interface is defined by using the IPCONFIG parameter
DYNAMICXCF with its IP address 10.30.20.101 5.
Example 8-3 TCP/IP profile: subplex definitions for stack C in LPAR A13
GLOBALCONFIG
XCFGRPID 11 3
IQDVLANID 11 4
;
IPCONFIG
DYNAMICXCF 10.30.20.101 255.255.255.0 8 5
The definitions for LPAR A11 are not shown because the XCFGRPID is the same. Only the
DYNAMICXCF IP address 5 is different (10.30.20.100).
In our scenarios, we did not define the VTAM start option XCFGRPID. A display from LPAR A13
TCP/IP stack C (see Example 8-4) shows that the stack is a member of the VTAM subplex
group ID CP and TCP/IP subplex group 11, with the name EZBTCP11 6.
In the same LPAR, there is another stack member of subplex group 22 with the name
EZBTCP22 7 (see 8.4.4, “Subplex 22: External subplex” on page 345).
The number in parentheses is related to the number of stacks that are active in the XCF
group.
Example 8-5 displays that the stack in LPAR A13 is in subplex 11 with its name EZBTCP11 9.
The definitions for the subplex 22 (EZBTCP22) are described in 8.4.4, “Subplex 22: External
subplex” on page 345.
D TCPIP,TCPIPD,SYSPLEX,GROUP
EZZ8270I SYSPLEX GROUP FOR TCPIPD AT SC31 IS EZBTCP22
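The group name in the EZZ8270I message can be split back into its VTAM and TCP/IP suffixes. The following sketch assumes the message layout shown in the display above; the function name and the returned dictionary keys are hypothetical.

```python
import re


def decode_sysplex_group(message):
    """Extract the stack, system, and subplex suffixes from an EZZ8270I message."""
    match = re.search(
        r"SYSPLEX GROUP FOR (\S+) AT (\S+) IS (EZBT(..)(..))\b", message
    )
    if match is None:
        raise ValueError("not an EZZ8270I sysplex group message")
    stack, system, group, vv, tt = match.groups()
    return {
        "stack": stack,
        "system": system,
        "group": group,
        # CP and CS are the defaults when no XCFGRPID is specified.
        "vtam_subplex": None if vv == "CP" else vv,
        "tcp_subplex": None if tt == "CS" else tt,
    }


result = decode_sysplex_group(
    "EZZ8270I SYSPLEX GROUP FOR TCPIPD AT SC31 IS EZBTCP22"
)
print(result)
```

In this display, the VTAM suffix is the default (CP), so only the TCP/IP subplex group ID (22) is reported as explicitly defined.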
Example 8-6 NETSTAT CONFIG with XCFGRPID and IQDVLANID for stack C
D TCPIP,TCPIPC,NETSTAT,CONFIG
GLOBAL CONFIGURATION INFORMATION:
TCPIPSTATS: NO ECSALIMIT: 0000000K POOLLIMIT: 0000000K
MLSCHKTERM: NO XCFGRPID: 11 10 IQDVLANID: 11 11
SEGOFFLOAD: NO SYSPLEXWLMPOLL: 060 MAXRECS: 100
EXPLICITBINDPORTRANGE: 00000-00000 IQDMULTIWRITE: NO
WLMPRIORITYQ: NO
SYSPLEX MONITOR:
TIMERSECS: 0060 RECOVERY: NO DELAYJOIN: NO AUTOREJOIN: NO
MONINTF: NO DYNROUTE: NO JOIN: YES
The command NETSTAT DEV also shows the HiperSockets connection with its VLANID (12), which
is the same value as IQDVLANID, as shown in Example 8-7.
[Figure: IP Subplex 22. TCP/IP D stacks in z/OS LPARs A13 and A16, VIPAs 10.40.1.241 and
10.40.1.221, XCFGRPID 22, IQDVLANID 22, HiperSockets and XCF addresses .101 and .102 on
10.20.40.0/24]
Example 8-8 TCP/IP profile: subplex definitions for stack D in LPAR A13
GLOBALCONFIG
XCFGRPID 22 1
IQDVLANID 22 2
;
IPCONFIG
DYNAMICXCF 10.20.40.101 255.255.255.0 8 3
Example 8-9 TCP/IP profile: subplex definitions for stack D in LPAR A16
GLOBALCONFIG
XCFGRPID 22 4
IQDVLANID 22 5
;
IPCONFIG
DYNAMICXCF 10.20.40.102 255.255.255.0 8 6
Chapter 9. Diagnosis
A key topic in any TCP/IP network infrastructure is documenting and analyzing problems.
This chapter describes tools that are available in z/OS Communications Server and
techniques to gather and diagnose problems that are related to the TCP/IP environment.
This chapter covers the topics that are shown in Table 9-1.
9.1, “Debugging a problem in a z/OS TCP/IP environment” on page 350: Problem determination
techniques and the tools that are available to debug a problem in z/OS Communications
Server - TCP/IP component.
9.2, “Logs to diagnose Communications Server for z/OS IP problems” on page 353: Why logs
are important in problem analysis.
9.3, “Sysplex Autonomics function” on page 353: Using Sysplex Autonomics monitoring
functions to detect and act on TCP/IP Sysplex operations.
9.4, “Useful commands to diagnose Communications Server for z/OS IP problems” on page 355:
Commands that are used to debug network problems.
9.5, “Gathering traces in Communications Server for z/OS IP” on page 365: Using z/OS
Component Trace Service to capture trace data for the main z/OS Communications Server -
TCP/IP component.
9.7, “Additional tools for diagnosing Communications Server for z/OS IP problems” on
page 405: Other tools that can be used to diagnose network problems.
9.8, “MVS console support for selected TCP/IP commands” on page 410: Using EZACMD to run
z/OS CS UNIX commands from the MVS console, in NetView and in TSO.
9.9, “Additional information” on page 418: More information about the usage of logs,
standard commands, tools, and utilities.
When problems arise in a TCP/IP environment, they can sometimes be challenging to isolate.
Without the proper tools, techniques, and knowledge of the environment, debugging any
problem can be difficult. The culprit might be any one of the many components between the
affected endpoints.
Most problems can easily be placed into one of these categories, and the information that is
needed to debug them can be retrieved from logs, commands, or utilities.
Logs are the first and most important tool to help you understand the nature of the problem. In
logs, you find messages that might explain what happened or even lead you to the actions
that are needed to solve the problem.
However, sometimes problems such as connectivity or routing do not provide messages that
clearly show what went wrong. Therefore, you need further information, which can be
obtained by using commands such as netstat, ping, or traceroute. If the commands do not
provide enough information to solve or isolate the problem, then you can start the z/OS
Communications Server trace utilities that gather data as it passes through the devices and
the stack.
Many problems that are related to the TCP/IP stack are caused by configuration errors. Here,
you can use logs to find useful messages that indicate where the error is.
When a problem occurs, the first step is to verify that the operating environment is behaving
as expected. After this is confirmed, you can then focus on other areas. To help isolate the
problem, a useful approach is to answer various basic questions:
Is the TCP/IP stack running correctly?
This generic question can help determine whether the problem is stack-related. It can be
answered by verifying the behavior of the entire Communications Server for z/OS IP
environment.
Usually, the tools that are used to answer this question are the logs where messages that
are related to the problem can be found (see 9.2, “Logs to diagnose Communications
Server for z/OS IP problems” on page 353) and tools that receive information by using the
Network Management Interface (NMI) (see 9.7, “Additional tools for diagnosing
Communications Server for z/OS IP problems” on page 405).
If the problem is an abend, save the generated dump for analysis. The configuration
should also be checked for inconsistencies. If you conclude it is not a stack-related
problem, then the next step is based on your findings to determine whether it is a network-
or application-related problem.
Has this ever worked before? If so, what changed?
These two basic questions might seem obvious, but they are in fact the most common
reasons for problems that are encountered in a Communications Server for z/OS IP
environment.
If the problem is with a production and stable environment, you must first check whether
any changes were made. In some cases, changes do not take effect until a system or
stack recycle is done. The only useful approach in this case is to track any changes and
always use change management processes.
If you are dealing with a new implementation, was a step-by-step approach being used? If
so, you probably know in which step the problem occurred and can adapt your problem
determination procedure based on the step being implemented.
Are the physical connections and interfaces active and working properly?
This question is related to a connectivity problem, and it leads to checking interface
definitions and status. You also need to look at the log files, and use commands to
determine whether the interfaces are operational. The netstat command can be used to
verify this, as described in 9.4, “Useful commands to diagnose Communications Server for
z/OS IP problems” on page 355.
If it is an intermittent problem, or if you cannot find the cause of the problem,
Communications Server for z/OS IP provides a set of trace tools that you can use to
gather more information. See 9.5, “Gathering traces in Communications Server for z/OS
IP” on page 365.
In many situations, the information that is obtained during the problem determination process
comes from separate logs (system, application, and stack logs). To build a clear picture of the
problem or outage, all significant information must be correlated.
As a preferred practice, implement syslogd to control where all messages are sent. This
approach gives you a single place to refer to when debugging a problem. The syslogd process
is a UNIX process that logs UNIX application messages to one or more files.
TCP/IP services that run as UNIX processes log application messages by using syslogd. The
syslogd process can also consolidate logging information from several systems to one system
through UDP communications.
For more information about setting up syslogd, see IBM z/OS V2R2 Communications Server
TCP/IP Implementation Volume 2: Standard Applications, SG24-8361.
To control how the actions that are performed by Sysplex Autonomics take place, configure
the SYSPLEXMONITOR parameter of the GLOBALCONFIG statement in PROFILE.TCPIP by using the
following values:
TIMERSECS defines the interval at which the sysplex monitor checks the monitored
functions in the stack.
RECOVERY / NORECOVERY defines whether the sysplex monitor acts when a problem is
detected, or only issues messages regarding the problem and takes no further action.
To detect a problem and act upon it, a TCP/IP stack cannot be the only member of the TCP/IP
sysplex group, and it must be advertising DVIPAs. The RECOVERY value must be defined to
allow the autonomics to act when a problem is detected. Otherwise, only a message about the
detected problem is displayed.
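The conditions above combine into a single yes-or-no decision. The following sketch models only what this section states; the function and parameter names are hypothetical, and the real sysplex monitor evaluates more than these three inputs.

```python
def may_take_recovery_action(recovery, group_member_count, advertising_dvipas):
    """Return True if the stack can act on a detected problem.

    recovery            -- True when RECOVERY is coded, False for NORECOVERY
    group_member_count  -- number of stacks in the TCP/IP sysplex group
    advertising_dvipas  -- True when this stack is advertising DVIPAs
    """
    not_only_member = group_member_count > 1
    return bool(recovery and not_only_member and advertising_dvipas)


# RECOVERY coded, two stacks in the group, DVIPAs advertised.
print(may_take_recovery_action(True, 2, True))
# NORECOVERY: the monitor only issues messages.
print(may_take_recovery_action(False, 2, True))
# The stack is the only member of its sysplex group.
print(may_take_recovery_action(True, 1, True))
```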
For more information about Sysplex Autonomics, see z/OS Communications Server: IP
Configuration Guide, SC27-3650.
This section briefly describes the main commands that you can use to diagnose problems in a
Communications Server for z/OS IP environment. For additional help and detailed information
about the commands that are described and other commands that can be used for problem
determination, see z/OS Communications Server: IP System Administrator’s Commands,
SC27-3661 and z/OS Communications Server: IP Diagnosis Guide, GC31-8782.
Tip: Using names instead of IP addresses requires the resolver or DNS to do the translation,
which adds more variables to the problem determination task. Avoid names when diagnosing
network problems. Use the host IP address instead.
In most cases, the default options of ping are used. However, in a z/OS Communications
Server environment, using the default options might lead to a false conclusion, given the
number of interfaces that can be used to transport the ICMP request.
Table 9-2 shows the available options that can be used with the ping command in TSO and
z/OS UNIX environments.
Figure 9-1 illustrates the use of the ping command for problem determination.
[Figure 9-1: Using the ping command with the interface option. Stack TCPIPC on SC32
(10.1.1.31) with interface OSA20A0 (10.1.2.31) toward Router 1 (10.1.2.240) and interface
OSA20C0 (10.1.3.31) toward Router 2 (10.1.3.220).]
To avoid such confusion, indicate which path to verify by using the interface (intf) option, as
shown in Example 9-2.
After using the correct command, you can see that there is a problem using interface
OSA20C0L, which is the direct connection to the 10.1.3.0 subnetwork.
Example 9-3 shows a ping with a very large packet size, with no pmtu option specified. We
use the noname option to avoid a reverse DNS lookup on the IP address.
Example 9-4 shows that by adding the pmtu yes option 1 to the ping command, you can
determine at which hop 2 the fragmentation is necessary, and the MTU size 3.
Example 9-6 The ping command with the pmtu ignore option
ping 10.1.1.10 (noname tcp tcpipb l 25000 c 1 t 1 pmtu ignore
Pinging host 10.1.1.10
Ping #1 needs fragmentation at: 10.1.7.21
Next-hop MTU size is 8192
***
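When many path MTU checks are run, the two result lines can be extracted programmatically. The following sketch relies only on the output text shown in Example 9-6; the function name is hypothetical, and other ping output variants are not handled.

```python
import re


def parse_pmtu_result(output):
    """Return (hop_address, next_hop_mtu), or None if no fragmentation is reported."""
    hop = re.search(r"needs fragmentation at:\s*(\S+)", output)
    mtu = re.search(r"Next-hop MTU size is\s+(\d+)", output)
    if hop is None or mtu is None:
        return None
    return hop.group(1), int(mtu.group(1))


# Output lines copied from Example 9-6.
sample = """Pinging host 10.1.1.10
Ping #1 needs fragmentation at: 10.1.7.21
Next-hop MTU size is 8192
"""

pmtu = parse_pmtu_result(sample)
print(pmtu)
```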
On most platforms, the traceroute command sends UDP datagrams to the destination host.
These datagrams reference a port number outside the standard range. The source knows
when it reaches the destination host when it receives an ICMP “Port Unreachable” message.
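The probing logic can be illustrated with a small simulation that sends no packets. It models the common technique of raising the IP time-to-live by one for each probe, which this section does not describe but which is how traceroute discovers intermediate hops. The function name and the sample path are hypothetical.

```python
def simulate_traceroute(path, max_ttl=30):
    """Simulate traceroute over a known path.

    path is the ordered list of addresses a probe crosses, ending with
    the destination host. Returns (ttl, responder, icmp_message) tuples.
    """
    hops = []
    for ttl in range(1, max_ttl + 1):
        if ttl > len(path):
            break
        responder = path[ttl - 1]
        if ttl == len(path):
            # The probe reached the destination. The UDP port is outside the
            # standard range, so the host answers with Port Unreachable.
            hops.append((ttl, responder, "ICMP Port Unreachable"))
            break
        # The TTL expired at an intermediate router.
        hops.append((ttl, responder, "ICMP Time Exceeded"))
    return hops


# Hypothetical three-hop path, for illustration only.
trace = simulate_traceroute(["10.1.2.240", "10.1.7.21", "10.1.1.10"])
for ttl, responder, reply in trace:
    print(ttl, responder, reply)
```

If max_ttl is smaller than the path length, the simulation ends without a Port Unreachable entry, which corresponds to a trace that never reaches the target.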
The traceroute command displays the route that a packet takes to reach the requested
target. The output that is generated by this command is shown in Example 9-7.
Figure 9-2 shows various netstat options, and these can be further qualified by filter criteria,
depending on the option you choose. The output can be displayed to the terminal (default), to
a data set (report), or to the REXX data stack. The Output Format (short or long) supports
IPv6 addresses.
Figure 9-2 The netstat command options: Target output (filter select)
The remainder of this section shows examples of netstat commands that are used for
diagnostic purposes, and their outputs.
You can optionally display additional application connection data by using the APPLDATA
parameter on the NETSTAT CONN and NETSTAT ALLCONN commands. Example 9-11 contrasts
the output of two NETSTAT CONN commands: one without the APPLDATA parameter 1, the other
with the APPLDATA parameter 2. The TN3270 server populates the APPLDATA field with
connection data, as documented in z/OS Communications Server: IP Configuration
Reference, SC27-3651. The TN3270 APPLDATA fields that are shown for the connection are
the component ID, LU name, the SNA application name, connection mode, client type,
security method, security level, and security cipher 3.
Example 9-11 NETSTAT CONN without and with the APPLDATA option
D TCPIP,TCPIPB,N,conn 1
D TCPIP,TCPIPB,N,conn,appldata 2
You can optionally display the report that is provided by netstat ALL/-A. This report is now
available when you run the DISPLAY TCPIP,,NETSTAT command, in addition to being available
in the TSO or z/OS UNIX shell environment. You can filter this command by the client IPADDR
that you want and receive complete details of the session, such as the maximum segment size
in use, as shown in Example 9-12.
You can filter the output of the NETSTAT CONN,APPLDATA command by adding the APPLD filter
option and specifying the filter criteria. The APPLDATA field is a total of 40 bytes. By using an
asterisk (*) in the filter criteria, you can filter on any part of the 40 bytes. Example 9-13 shows
several filter criteria strings being used.
D TCPIP,TCPIPB,N,CONN,APPLDATA,APPLD=*SC31*
USER ID CONN STATE
TN3270B 00000111 ESTBLSH
LOCAL SOCKET: ::FFFF:10.1.1.20..23
FOREIGN SOCKET: ::FFFF:10.1.100.222..1028
APPLICATION DATA: EZBTNSRV SC31BB05 SC31TS03 3T B
1 OF 1 RECORDS DISPLAYED
END OF THE REPORT
Note: The MAXRECS parameter is available on the GLOBALCONFIG TCP/IP profile statement
for configuring a default value for the DISPLAY TCPIP,,NETSTAT command’s MAX parameter.
The default value is 100.
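The effect of an asterisk in the APPLD filter can be approximated offline with shell-style pattern matching. This is a sketch for reasoning about filter strings, not a reproduction of the NETSTAT filter logic: Python's fnmatchcase also gives special meaning to square brackets, which is an artifact of this illustration. The function name is hypothetical.

```python
from fnmatch import fnmatchcase

# The APPLDATA field is a total of 40 bytes.
APPLDATA_LENGTH = 40


def appldata_matches(appldata, pattern):
    """Return True if the first 40 bytes of appldata match the wildcard pattern."""
    field = appldata[:APPLDATA_LENGTH]
    return fnmatchcase(field, pattern)


# Application data copied from the display above.
appldata = "EZBTNSRV SC31BB05 SC31TS03 3T B"

print(appldata_matches(appldata, "*SC31*"))
print(appldata_matches(appldata, "EZBTNSRV*"))
print(appldata_matches(appldata, "*SC30*"))
```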
The VARY TCPIP,,DROP command allows all TCP connections that are associated with a
server matching the specified filter to be reset. If more than one server application is found to
match the input filter values, the command fails. Existing TCP connections are reset by this
command, but new connection requests are not quiesced. If necessary, you might quiesce
new connection requests to the server application before issuing this command.
Example 9-14 shows the output for two VARY TCPIP,,DROP commands: One with the PORT
parameter and the optional JOBNAME parameter 1, the other with the JOBNAME parameter 2. You
can optionally specify the address space ID (ASID). You can see EZD2013I message 3,
which includes the number of connections that were reset. The following messages depend
on the server application on which the command was issued.
V TCPIP,TCPIPB,DROP,JOBNAME=TN3270B 2
EZZ0060I PROCESSING COMMAND: VARY TCPIP,TCPIPB,DROP,JOBNAME=TN3270B
IKT100I USERID CANCELED DUE TO UNCONDITIONAL LOGOFF
IKT100I USERID CANCELED DUE TO UNCONDITIONAL LOGOFF
EZD2013I 3 CONNECTIONS WERE SUCCESSFULLY DROPPED 3
IKT122I IPADDR..PORT 10.1.100.221..4217
IKT100I USERID CANCELED DUE TO UNCONDITIONAL LOGOFF
IKT122I IPADDR..PORT 10.1.100.221..4218
IKT122I IPADDR..PORT 10.1.100.221..4216
EZZ6034I TN3270B CONN 000000BE LU SC31BB29 CONN DROP ERR 1010 158
IP..PORT: ::FFFF:10.1.100.221..4216 EZBTTRCV
EZZ6034I TN3270B CONN 000000C0 LU MULTIPLE CONN DROP ERR 1010 159
IP..PORT: ::FFFF:10.1.100.221..4217 EZBTTRCV
Note: The VARY TCPIP,,DROP command drops all connections for a server. The Netstat
DROP/-D command supports dropping only one connection per command invocation.
NETSTAT uses two message catalogs: netmsg.cat (for IPv4 messages) and netmsg6.cat (for
IPv6 messages). When NETSTAT is run, it tries to open both message catalogs. If a message
catalog cannot be opened, default messages are used. The default message text is included
in the NETSTAT command.
If the catalogs and command processor are not in sync, an ABEND0C4 can occur in the
ONETSTAT module. The following message might be issued:
EZZ0157I CONFIGURATION: THE CONFIGURATION COMPONENT HAS TERMINATED
This error can occur during z/OS migration when the new z/OS version is pointing to the old
load library (TCPIP.SEZALOAD).
When you use a previous z/OS catalog version, you get the message 2008 100 19:39
UTC.ØIBM-1047 instead of 2011 041 21:05 UTC.yIBM-1047.
The maintenance level for the catalog must be at least EZASERVICE Service Level HIP61D0.
This section covers the trace facilities that are available to analyze TCP/IP problems on z/OS
servers and clients. It also describes how to process those traces.
The MVS component trace can be used to diagnose most TCP/IP problems. Some
components of TCP/IP continue to maintain their own tracing mechanisms, for example, the
FTP server. For more information about the various trace options, see z/OS Communications
Server: IP Diagnosis Guide, GC31-8782, and z/OS MVS Diagnosis: Tools and Service Aids,
GA22-7589.
Figure 9-3 shows the traces that can be used for debugging. Some applications have their
own internal trace functions. The output from those traces can be sent to the window, a file, or
the syslogd logging function. The data from many of the traces that are captured by using the
z/OS Component Trace is written to either an external writer or 64-bit common storage
(HVCOMMON). The TN3270E Telnet server trace is written to either an external writer or
64-bit private storage. OMPROUTE, Resolver, and IKE traces are written to either an external
writer or private storage. The trace data can be dumped with the address space when
common and private storage is also dumped.
[Figure 9-3, diagram not reproduced. Recoverable labels: Event trace (SYSTCPIP); Network/Transport layer (OSPF, UDP, TCP, ICMP); 64-bit common; Packet trace (SYSTCPDA); Data Link layer (MPC, HiperSockets, ..., OSA); SYSTCPOT; Trace output; CTWTR; file]
Information APAR II12014 is a useful source of information about the TCP/IP component and
packet trace. For general information about the MVS component trace, see z/OS MVS
Diagnosis: Tools and Service Aids, GA22-7589.
Before starting the traces, create the external writer procedure in the SYS1.PROCLIB library,
which allocates the trace data set. This procedure is activated by running the TRACE
command. A sample procedure that is named CTWTR is shown in Figure 9-4.
//CTWTR PROC
//IEFPROC EXEC PGM=ITTTRCWR
//TRCOUT01 DD DSNAME=SYS1.&SYSNAME..CTRACE,
// VOL=SER=COMST2,UNIT=3390,
// SPACE=(CYL,10),DISP=(NEW,CATLG),DSORG=PS
//*
Figure 9-4 Sample External Write procedure
Next, complete the following steps by running the trace command to activate, capture data,
and stop the trace process:
1. Start the external writer (CTRACE writer):
TRACE CT,WTRSTART=ctwtr
Where ctwtr is the name of the procedure that is created to allocate the trace data set.
2. Start the CTRACE and connect to the external writer:
TRACE CT,ON,COMP=component,SUB=(proc_name)
R xx,OPTION=(valid_options),WTR=ctwtr,END
Where:
– component is the component name of the trace being started and can be any of these:
• SYSTCPIP (Event trace)
• SYSTCPDA (Packet trace)
• SYSTCPDA (Data trace)
• SYSTCPIS (Intrusion Detection Services trace)
• SYSTCPIK (IKE daemon trace)
• SYSTCPOT (OSAENTA trace)
• SYSTCPNS (Network Security Services (NSS) server trace)
• SYSTCPRT (OMPROUTE trace)
• SYSTCPRE (RESOLVER trace)
– proc_name is the procedure that is related to the component trace being started, and
can be any of these:
• tcpip_proc
• iked_proc
• nss_proc
• omp_proc
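The command sequence in steps 1 and 2 can be assembled programmatically, for example when preparing an operations runbook. The following Python sketch only builds the command text; it does not issue anything to the system. The function name build_ctrace_commands and its defaults are hypothetical, and the reply identifier stays as the placeholder xx because the real value is assigned by the system at run time.

```python
# Component names are taken from the list in step 2.
CTRACE_COMPONENTS = {
    "SYSTCPIP", "SYSTCPDA", "SYSTCPIS", "SYSTCPIK",
    "SYSTCPOT", "SYSTCPNS", "SYSTCPRT", "SYSTCPRE",
}


def build_ctrace_commands(component, proc_name, options,
                          writer="CTWTR", reply_id="xx"):
    """Return the operator commands for steps 1 and 2 as a list of strings."""
    component = component.upper()
    if component not in CTRACE_COMPONENTS:
        raise ValueError(f"unknown component trace name: {component}")
    if not options:
        raise ValueError("at least one trace option is required")
    return [
        # Step 1: start the external writer.
        f"TRACE CT,WTRSTART={writer}",
        # Step 2: start the trace, then reply with options and writer name.
        f"TRACE CT,ON,COMP={component},SUB=({proc_name})",
        f"R {reply_id},OPTION=({','.join(options)}),WTR={writer},END",
    ]
```

Calling build_ctrace_commands("SYSTCPIP", "TCPIPA", ["SOCKAPI"]) returns three strings that follow the pattern of the commands shown in the steps.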
The next sections describe each component trace that is used by the z/OS Communications
Server - TCP/IP component for documenting problems. For a detailed explanation of each
component trace, see z/OS Communications Server: IP Diagnosis Guide, GC31-8782.
z/OS Communications Server provides a default set of trace options in a SYS1.PARMLIB
member (CTIEZB00 for SYSTCPIP, and CTIEZBTN for the TN3270 Telnet server). To change
these options, create an alternate member with the options that you want (for example,
CTIEZBXX), and then specify that member on the CTRACE keyword of the PARM parameter in your
TCP/IP procedure, as shown in Figure 9-5.
Note: The buffer size option is defined during TCP/IP startup only, so any change must be
done by using the CTIEZBxx parmlib member and cannot be reset without restarting the
TCP/IP address space. The default is 8 MB.
If you want to specify different trace options after TCP/IP initialization, you can run the TRACE
CT command and either specify the new component trace options file or respond to prompts
from the command.
RESPONSE=SC30
IEE843I 17.09.02 TRACE DISPLAY 466
SYSTEM STATUS INFORMATION
ST=(ON,0001M,00004M) AS=ON BR=OFF EX=ON MO=OFF MT=(ON,024K)
TRACENAME
=========
SYSTCPIP
MODE BUFFER HEAD SUBS
=====================
OFF HEAD 4
NO HEAD OPTIONS
SUBTRACE MODE BUFFER HEAD SUBS
--------------------------------------------------------------
TCPIPA ON 0008M
ASIDS *NONE*
JOBNAMES *NONE*
OPTIONS MINIMUM
WRITER *NONE*
Figure 9-6 DISPLAY TRACE,COMP=SYSTCPIP,SUB=(TCPIPC) output
The MINIMUM trace option is always active. During minimum tracing, certain exceptional
conditions are being traced so the trace records for these events are available for easier
debugging in case the TCP/IP address space should encounter an abend condition.
When you must trace application-related problems by using the SOCKAPI option, consider the
following guidelines:
Trace only one application. Use the job name or ASID option when capturing the trace to
limit the trace data to one application.
Trace only the SOCKAPI option. To get the maximum number of SOCKAPI trace records,
specify only the SOCKAPI option.
Use an external writer. Use the external writer to save more trace data.
Trace only one TCP/IP stack.
Activate the data trace only if more data is required. The SOCKAPI trace contains the first
96 bytes of data that is sent or received, which is usually sufficient.
However, the SOCKET option is primarily intended for use by TCP/IP Service and provides
information that is meant to be used to debug problems in the TCP/IP socket layer, UNIX
System Services, or the TCP/IP stack. For more information about the SOCKAPI option, see
z/OS CS: IP Diagnosis, GC31-8782.
Note: If you use the Telnet option, do not specify the JOBNAME parameter when starting
CTRACE.
Note: You can use the parmlib member CTIEZBxx to provide the same options:
TRACE CT,ON,COMP=SYSTCPIP,SUB=(TCPIPC),PARM=(CTIEZBXX)
3. Display the active component trace options to verify that they are correct:
DISPLAY TRACE,COMP=SYSTCPIP,SUB=(TCPIPC)
IEE843I 12.12.22 TRACE DISPLAY 206
SYSTEM STATUS INFORMATION
ST=(ON,0001M,00004M) AS=ON BR=OFF EX=ON MO=OFF MT=(ON,024K)
TRACENAME
=========
SYSTCPIP
MODE BUFFER HEAD SUBS
=====================
OFF HEAD 3
NO HEAD OPTIONS
SUBTRACE MODE BUFFER HEAD SUBS
--------------------------------------------------------------
TCPIPC ON 0008M
ASIDS *NONE*
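When many stacks are checked, the subtrace rows of a DISPLAY TRACE response can be extracted from a saved copy of the console output. The following Python sketch is a minimal illustration that assumes the whitespace-separated layout seen in the samples in this chapter (a row with subtrace name, mode, and buffer size, followed by ASIDS, JOBNAMES, OPTIONS, and WRITER lines). The function name parse_subtraces is hypothetical, and real console output might need additional handling.

```python
import re

# A subtrace row looks like "TCPIPC ON 0008M" in the samples.
SUBTRACE_ROW = re.compile(
    r"^\s*(?P<name>\S+)\s+(?P<mode>ON|OFF|MIN)\s+(?P<buffer>\d+[KM])\s*$"
)
DETAIL_KEYS = ("ASIDS", "JOBNAMES", "OPTIONS", "WRITER")


def parse_subtraces(lines):
    """Return {subtrace_name: {mode, buffer, asids, ...}} from display lines."""
    result = {}
    current = None
    for line in lines:
        match = SUBTRACE_ROW.match(line)
        if match:
            current = {"mode": match["mode"], "buffer": match["buffer"]}
            result[match["name"]] = current
            continue
        parts = line.split(None, 1)
        # Detail lines belong to the most recent subtrace row.
        if current is not None and len(parts) == 2 and parts[0] in DETAIL_KEYS:
            current[parts[0].lower()] = parts[1].strip()
    return result
```

The returned dictionary can then be checked, for example, to confirm that the WRITER value is the expected external writer before the problem is re-created.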
After your event trace data is captured, the trace data set that is created by the external
writer procedure is saved and IPCS is used to format and analyze its contents. For further
details about the SYSTCPIP event trace, see z/OS CS: IP Diagnosis, GC31-8782.
You can also use the packet trace to capture data traffic that goes through improved fast
local sockets (local traffic). However, if packet trace is enabled, the connection flows by
using fast local sockets (the pre-V1R12 function) and continues to do so even after the
packet trace is turned off. Component trace (CTRACE) and data trace (DATTRACE) can be used
to gather diagnostic information for improved fast local socket connections.
With the VARY PKTTRACE command or PKTTRACE statement in PROFILE.TCPIP, you can specify
options such as IP address, port number, discard, and protocol type. If you plan to
gather a trace over a long period, or if your system experiences heavy traffic, specify
these filtering options so that TCP/IP does not gather unnecessary packets.
D TRACE,COMP=SYSTCPDA,SUB=(TCPIPA)
IEE843I 14.00.29 TRACE DISPLAY 388
SYSTEM STATUS INFORMATION
TRACENAME
=========
SYSTCPDA
MODE BUFFER HEAD SUBS
=====================
OFF HEAD 2
NO HEAD OPTIONS
SUBTRACE MODE BUFFER HEAD SUBS
--------------------------------------------------------------
TCPIPA MIN 0016M
ASIDS *NONE*
JOBNAMES *NONE*
OPTIONS MINIMUM
WRITER CTWTR
3. Start the trace through the PROFILE.TCPIP statement and the VARY OBEYFILE command, or
through the V TCPIP,,PKT command:
VARY TCPIP,TCPIPA,PKT,ON
EZZ0060I PROCESSING COMMAND: VARY TCPIP,TCPIPA,PKT,ON
EZZ0053I COMMAND VARY PKTTRACE COMPLETED SUCCESSFULLY
4. Optional: Modify the trace options to filter the data that is captured by using the VARY
command. If both options IPaddr and PORTNUM are specified in the same command, an
AND condition is created so data is captured only if both conditions are met.
You can also create an OR condition by issuing multiple VARY commands, each of which
adds a filter. For example, if you want to record all packets with destination port xx OR
source port yy, use the following commands:
VARY TCPIP,tcpprocname,PKT,DEST=xx
VARY TCPIP,tcpprocname,PKT,SRCP=yy
When VIPAROUTE statements are defined to a sysplex distributor to select routes, the
sysplex distributor encapsulates the packet with a new header before sending it to the
target stack. The IPaddr option allows filtering on both the outer packet and the inner
(encapsulated) packet.
Additionally, z/OS Communications Server provides the DISCARD option, which allows you
to filter inbound packets that are discarded by the stack. You can also filter packet trace
collection and formatting by using discard reason codes. For example, to record all
discarded packets, or only the packets that are discarded with a specific reason code such
as 4136, use these commands:
VARY TCPIP,TCPIPA,PKT,DISCARD=*
EZZ0060I PROCESSING COMMAND: VARY TCPIP,TCPIPA,PKT,DISCARD=*
EZZ0053I COMMAND VARY PKTTRACE COMPLETED SUCCESSFULLY
VARY TCPIP,TCPIPA,PKT,DISCARD=4136
EZZ0060I PROCESSING COMMAND: VARY TCPIP,TCPIPA,PKT,DISCARD=4136
EZZ0053I COMMAND VARY PKTTRACE COMPLETED SUCCESSFULLY
5. Check whether the packet trace options that are set are correct by using the netstat dev
(-d) command. Example 9-16 shows a sample packet trace setting. It shows the PORTNUM
= 23 option (1), the IPADDR = 10.1.8.21 option (2), and the Discard Code = 4136 option
(3).
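The AND and OR filter semantics that are described in step 4 can be modeled in a few lines. The following Python sketch is a conceptual model only, not the stack's implementation. The class and function names are hypothetical, and it assumes that the IP address and PORTNUM values are compared against both the source and the destination of a packet. The model covers only the case where at least one filter is defined.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


@dataclass(frozen=True)
class PacketFilter:
    """Models one VARY ...,PKT command; unset fields match anything."""
    ip_addr: Optional[str] = None
    port_num: Optional[int] = None
    dest_port: Optional[int] = None
    src_port: Optional[int] = None

    def matches(self, pkt):
        checks = []
        if self.ip_addr is not None:
            checks.append(self.ip_addr in (pkt.src_ip, pkt.dst_ip))
        if self.port_num is not None:
            checks.append(self.port_num in (pkt.src_port, pkt.dst_port))
        if self.dest_port is not None:
            checks.append(pkt.dst_port == self.dest_port)
        if self.src_port is not None:
            checks.append(pkt.src_port == self.src_port)
        # Options on the same command are ANDed together.
        return all(checks)


def is_traced(pkt, filters):
    # Filters from separate commands are ORed together.
    return any(f.matches(pkt) for f in filters)
```

In this model, PacketFilter(ip_addr="10.1.8.21", port_num=23) captures a packet only when both values match, whereas the list [PacketFilter(dest_port=21), PacketFilter(src_port=1028)] captures a packet when either filter matches.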
After the packet trace or the socket data is captured, the trace data set that is created by the
external writer procedure is saved. Use IPCS to format and analyze the saved contents. For
further details about these traces, see z/OS CS: IP Diagnosis, GC31-8782.
Note: The next hop IP address is provided for all outbound packets. This information is
only viewable if the packet trace is formatted with the FULL option, and also is available
externally by way of the real-time packet trace NMI. Additionally, CTRACE with
OPTIONS((LAST IPADDR(ipaddress) )) can select packets for the inner IP address.
A PORTNUM parameter is supported on the VARY TCPIP,,DATTRACE command that you can use
to trace only packets that have a source or destination port that matches a specific port
number.
To verify that the data trace option settings are correct, use the NETSTAT CONFIG
command. See Example 9-18.
Example 9-19 Sample formatted trace for start and end records
COMPONENT TRACE SHORT FORMAT
SYSNAME(SC30)
COMP(SYSTCPDA)SUBNAME((TCPIPA))
z/OS TCP/IP Packet Trace Formatter, Copyright IBM Corp. 2000, 2010; 2010.067
DSNAME('SYS1.SC30.TCPIPA.CTRACE')
**** 2010/09/28
RcdNr Sysname Mnemonic Entry Id Time Stamp Description
----- -------- -------- -------- --------------- -----------------------------
------------------------------------------------------------------------------
129 SC30 DATA 00000005 13:19:18.657635 Data Trace
To Jobname : FTPDA Full=0
Tod Clock : 2010/09/28 13:19:18.657635 Cid: 00000205
Domain : AF_Inet6 Type: Stream Protocol: TCP
State : API Data Flow Starts
Data trace records for the socket data flow start and end are supported only on TCP and
UDP sockets; they are not supported on RAW sockets.
The OMPROUTE CTRACE can be started at any time by using the TRACE CT command, or it can
be activated during OMPROUTE initialization. If no trace options are defined, the OMPROUTE
component trace is started with a buffer size of 1 MB and the MINIMUM tracing option.
A parmlib member can be used to customize the parameters and to initialize the trace. The
default OMPROUTE Component Trace parmlib member is the SYS1.PARMLIB member
CTIORA00. The parmlib member name can be changed by using the
OMPROUTE_CTRACE_MEMBER environment variable.
In addition to specifying the trace options, you can also change the OMPROUTE trace buffer
size. The buffer size can be changed only at OMPROUTE initialization. The maximum
OMPROUTE trace buffer size is 100 MB. The REGION size in the OMPROUTE cataloged
procedure must be large enough to accommodate a large buffer size.
Here are the necessary steps to start the CTRACE for OMPROUTE during OMPROUTE
initialization by using the parmlib member CTIORA00 and directing the trace output to an
external writer:
1. Prepare the SYS1.PARMLIB member CTIORA00 to get the wanted output data.
Example 9-20 shows a sample of CTIORA00 contents.
2. Start the OMPROUTE procedure by using the wanted DEBUG and TRACE options, as shown
in Example 9-21.
In Example 9-21, see item 1. Parameters -t (trace) and -d (debug) define how detailed
you want the output data to be. A preferred practice is to use -t2 and -d1.
3. Verify that CTRACE is started by running the following console command:
D TRACE,COMP=SYSTCPRT,SUB=(OMPC)
IEE843I 16.31.37 TRACE DISPLAY 058
SYSTEM STATUS INFORMATION
ST=(ON,0256K,00512K) AS=ON BR=OFF EX=ON MO=OFF MT=(ON,024K)
TRACENAME
=========
SYSTCPRT
MODE BUFFER HEAD SUBS
=====================
OFF HEAD 1
NO HEAD OPTIONS
SUBTRACE MODE BUFFER HEAD SUBS
-------------------------------------------------------------
After these steps, the generated trace file must be formatted by using IPCS. For more
information about OMPROUTE diagnosis, see z/OS Communications Server: IP Diagnosis
Guide, GC31-8782.
To gather the component trace for the resolver, use the commands that are listed in 9.5.1,
“Taking a component trace” on page 367 and, in step 2 on page 367, specify the comp=
parameter with the resolver component name SYSTCPRE and the sub= parameter with the
resolver proc_name.
The generated trace file that is created after the problem is reproduced must be formatted by
using IPCS. For more information about resolver diagnosis, see z/OS Communications
Server: IP Diagnosis Guide, GC31-8782.
Tip: The IKE daemon reads the IKED_CTRACE_MEMBER environment variable only during
initialization. Changes to IKED_CTRACE_MEMBER after daemon initialization have no effect.
After IKE daemon initialization, you must use the TRACE CT command to change
component trace options.
After the IKE daemon is initialized, you can start CTRACE to modify trace options or send
data to an external writer by using the commands that are listed in 9.5.1, “Taking a component
trace” on page 367 and, in step 2 on page 367, specify the comp= parameter with the IKE
daemon component name SYSTCPIK and the sub= parameter with the IKE proc_name.
The generated trace file that is created after the problem is reproduced must be formatted by
using IPCS. For more information about IKE daemon diagnosis, see z/OS
Communications Server: IP Diagnosis Guide, GC31-8782.
If the EZZ4210I message indicates the parmlib member name CTIIDS00, then the IDS
CTRACE space is set up by using the default BUFSIZE of 32 M.
The CTIIDS00 member is used to specify the IDS CTRACE parameters. To eliminate this
message, ensure that a CTIIDS00 member exists within parmlib and that the options are
correctly specified. A sample CTIIDS00 member is included with z/OS Communications
Server.
For details about the Intrusion Detection Services (IDS) trace, see z/OS Communications
Server: IP Diagnosis Guide, GC31-8782. For information about defining policy, see z/OS
Communications Server: IP Configuration Guide, SC27-3650.
The QDIOSYNC trace is not a traditional trace in which output is generated based on specific
events. Instead, the QDIOSYNC trace freezes and captures (logs) OSA-Express diagnostic
data in a timely manner. In addition to (or instead of) using the Hardware Management
Console (HMC) to manually capture the diagnostic data, you can arm the OSA-Express
adapter to automatically capture diagnostic data when one of the following situations occurs:
The OSA-Express adapter detects an unexpected loss of host connectivity.
Unexpected loss of host connectivity occurs when the OSA-Express adapter receives an
unexpected halt signal from the host or when the host is unresponsive to OSA requests.
The OSA-Express adapter receives a CAPTURE signal from the host.
A CAPTURE signal is sent by the host when one of the following situations occurs:
– The VTAM-supplied message processing facility (MPF) exit (IUTLLCMP) is driven.
– Either the VTAM or TCP/IP functional recovery routine (FRR) is driven with ABEND06F.
(ABEND06F is the result of a SLIP PER trap that specifies ACTION=RECOVERY).
When arming an OSA-Express adapter for QDIOSYNC, you can specify an optional filter that
alters what type of diagnostic data is collected by the OSA-Express adapter. This filtering
reduces the overall amount of diagnostic data that is collected, and decreases the likelihood
that pertinent data is lost.
If you have several OSAs to arm, but you do not want to arm all of them, consider first arming
all OSAs and then individually disarming those that you do not want armed.
You can change the parmlib member name by using the NSSD_CTRACE_MEMBER environment
variable.
Tip: The NSS server reads the NSSD_CTRACE_MEMBER environment variable only during
initialization. Changes to NSSD_CTRACE_MEMBER after server initialization have no effect.
After the NSS server is initialized, you can start CTRACE to modify trace options or send data
to an external writer by using the commands that are listed in 9.5.1, “Taking a component
trace” on page 367 and, in step 2 on page 367, specify the comp= parameter with the NSS
server component name SYSTCPNS and the sub= parameter with the nss_proc_name.
The generated trace file that is created after the problem is reproduced must be formatted by
using IPCS. For more information about NSS server diagnosis, see z/OS
Communications Server: IP Diagnosis Guide, GC31-8782.
To obtain a dump of the TCP/IP stack when no abend occurred, use the DUMP command.
Specify the CSA option for SDATA so that the dump includes the trace data that is held in
64-bit common (HVCOMMON) storage. Be sure to also include the region (RGN) option in the
SDATA dump options, as shown here:
DUMP COMM=(enter_dump_title_here)
Rxx,JOBNAME=tcpproc,CONT
Rxx,SDATA=(CSA,LSQA,NUC,PSA,RGN,SQA,SUM,TRT),END
To obtain a dump of the OMPROUTE, RESOLVER, IKED or TELNET address space (which
contains the trace table), use the DUMP command as shown here:
DUMP COMM=(enter_dump_title_here)
Rxx,JOBNAME=proc_started_task_name,SDATA=(RGN,CSA,ALLPSA,SQA,SUM,TRT,ALLNUC),END
The primary purpose of the component trace is to capture data that the IBM Support Center
can use in diagnosing problems. There is little information in the documentation on
interpreting trace data. If you want to analyze the packet trace or data trace, you can do so by
formatting the trace data by using a z/OS tool in TSO called IPCS. For more information about
trace and dump analysis by using IPCS, see z/OS Communications Server: IP Diagnosis
Guide, GC31-8782.
To assist in problem diagnosis, the OSAENTA function provides a way to trace inbound and
outbound frames for the OSA-Express2, OSA-Express3, and OSA-Express4S features. The
OSAENTA trace function is controlled and formatted by z/OS Communications Server, but is
collected in the OSA at the network port.
To determine the microcode level from the HMC, complete the following steps:
1. Select your system.
2. Double-click OSA Advanced Facilities.
3. Select the appropriate physical channel identifier (PCHID).
4. Select View code level.
Figure 9-7 shows the microcode level that is installed in one of the OSA-Express3 features.
Alternatively, you can issue the D NET,TRL,TRLE=OSA2080T command. Example 9-22 shows
the output.
The OSA-Express needs an additional DATAPATH statement on the TRL for OSAENTA tracing
(see Example 9-24).
VMAC IP address
HOME 00096B1A7490 010.001.000.010
HOME 00096B1A7490 010.001.001.010
HOME 00096B1A7490 010.001.002.010
HOME 00096B1A7490 010.001.002.011
REG 00096B1A7490 010.001.002.012
REG 00096B1A7490 010.001.003.011
REG 00096B1A7490 010.001.003.012
REG 00096B1A7490 010.001.004.011
REG 00096B1A7490 010.001.005.011
REG 00096B1A7490 010.001.006.011
REG 00096B1A7490 010.001.007.011
REG 00096B1A7490 010.001.008.010
REG 00096B1A7490 010.001.008.020
Important: If you use a zEnterprise (z196) at driver 86 and later, the HMC icon for Network
Traffic Analyzer (NTA) does not exist anymore. In this case, go to 9.6.5, “Defining a
resource profile in RACF” on page 392.
Use this task to select an OSAENTA Support Element (SE) control to customize the
OSA-Express NTA settings in Advanced Facilities, or to check the current OSA-Express NTA
authorization.
Customizing OSA-Express NTA allows the following activities for the SE:
Set up the OSA LAN Analyzer traces and capture data to the SE hard disk drive.
Change authorization to allow host operating systems to enable the NTA traces outside
their own partition.
Note: The OSA-Express NTA is mutually exclusive with the OSA LAN Analyzer for tracing
on a specified CHPID. Only one or the other can be enabled for a specified CHPID at any
one time.
2. Select the CPC that you want to work with, as shown in Figure 9-8.
3. Select and open the Service task list, as shown in Figure 9-9.
7. Click Allow the Support Element to allow Host Operating System to enable NTA.
13. In our case, we select PCHID 0390 (CHPID 02), and then double-click Advanced
Facilities. See Figure 9-14.
15. Select OSA-Express Host Network Traffic Analyzer Authorization, and then click OK.
See Figure 9-16.
Important: For checking the authorization of OSA-Express NTA support, the user
ID must have the Access Administrator Tasks role assigned.
The OSAENTA statement enables an installation to trace data from other hosts that are
connected to OSA-Express.
Important: The trace data that is collected should be considered confidential and TCP/IP
system dumps and external trace files containing this trace data should be protected.
To see the complete syntax of the OSAENTA command, see z/OS Communications Server: IP
Configuration Reference, SC27-3651.
Note: Update CTINTA00 to set the CTRACE buffer size. This setting uses up auxiliary
page space storage.
When the ON keyword of the OSAENTA parameter is used, VTAM allocates the next available
TRLE data path that is associated with the port. This data path is used only for inbound trace
data.
When the OFF keyword of the OSAENTA parameter is used (or the trace limits of the TIME, DATA,
or FRAMES keywords are reached), the data path is released.
In this case, OSAENTA traces the port name OSA2080 only for traffic matching the
following filters:
– Protocol = UDP
– IP address = 10.1.2.11
– Port number = 2323
The following filters are available to define the packets to be captured:
– MAC address
– VLAN ID
– Ethernet frame type
– IP address (or range)
– IP protocol
Note: Use filters to limit the trace records to prevent overconsumption of the OSA CPU
resources, the LPAR CPU resources, 64-bit common storage, memory, auxiliary page
space, and the IO subsystem writing trace data to disk.
Important: If you receive ERROR CODE 0003, it means that an attempt was made to
enable OSAENTA tracing for a specified OSA, but the current authorization level does
not permit it.
For directions about how to change the authorization to allow OSAENTA to be used on
this specified OSA, see 9.6.4, “Customizing OSA-Express Network Traffic Analyzer” on
page 385. For more information about this topic, see Support Element Operations
Guide, SC28-6860.
The NETSTAT display for devices shows the NTA interfaces. The interface name is the
OSA port name prefixed with EZANTA.
To display a specific NTA interface, use the INTFName=EZANTAosaportname keyword.
Traces are placed in an internal buffer, which can then be written out by using a CTRACE
external writer. The MVS TRACE command must also be issued for component SYSTCPOT
to activate the OSAENTA trace.
Important: If you receive ERROR CODE 0005, it means that an attempt was made to
enable OSAENTA tracing for a specified OSA that already has either OSAENTA or OSA
LAN Analyzer tracing enabled elsewhere on the system for this OSA.
Only one instance of active tracing (either OSAENTA or LAN Analyzer) for a specified
OSA is permitted on the system at any one time.
When the trace is started from OSA/SF, you can see that another device is allocated for
trace (Example 9-31).
VMAC IP address
HOME 00096B1A7490 010.001.000.010
HOME 00096B1A7490 010.001.001.010
HOME 00096B1A7490 010.001.002.010
HOME 00096B1A7490 010.001.002.011
REG 00096B1A7490 010.001.002.012
REG 00096B1A7490 010.001.003.011
REG 00096B1A7490 010.001.003.012
REG 00096B1A7490 010.001.004.011
REG 00096B1A7490 010.001.005.011
REG 00096B1A7490 010.001.006.011
REG 00096B1A7490 010.001.007.011
REG 00096B1A7490 010.001.008.010
REG 00096B1A7490 010.001.008.020
To display information about the status of the component trace for all active stack
procedures in a CINET environment, issue the following command:
DISPLAY TRACE,COMP=SYSTCPOT,SUBLEVEL,N=8
Example 9-34 displays the output.
From the IPCS PRIMARY OPTION MENU, select 0 DEFAULTS - Specify default dump and
options. See Example 9-35.
You may change any of the defaults listed below. The defaults shown before
any changes are LOCAL. Change scope to GLOBAL to display global defaults.
If you change the Source default, IPCS will display the current default
Address Space for the new source and will ignore any data entered in
the Address Space field.
Modify the DSNAME and OPTIONS to match your environment, and then select the following
options:
2 ANALYSIS - Analyze dump contents
7 TRACES - Trace formatting
1 CTRACE - Component trace
D DISPLAY - Specify parameters to display CTRACE entries
Fill in the parameters that are necessary to format the OSAENTA trace. See Example 9-36.
On the command line, enter the S command. Example 9-37 shows the trace that is formatted
by IPCS.
**** 2007/09/11
RcdNr Sysname Mnemonic Entry Id Time Stamp Description
----- -------- -------- -------- --------------- -----------------------------
------------------------------------------------------------------------------
365 SC30 OSAENTA 00000007 15:01:23.356987 OSA-Express NTA
To Interface : EZANTAOSA2080 Full=86
Tod Clock : 2010/09/24 14:20:25.931533
Sequence # : 0 Flags: Pkt Out Nta Vlan Lpar L3
Source : 10.1.2.11
Destination : 224.0.0.5
Source Port : 0 Dest Port: 0 Asid: 0000 TCB: 0000000
Frame: Device ID : 02030002 Sequence Nr: 372 Discard: 0 (OK)
EtherNet II : 8100 IEEE 802.1 Vlan Len: 0x0044 (68
Destination Mac : 01005E-000005 ()
Source Mac : 00096B-1A7490 (IBM)
Vlan_id : 10 Priority: 0 Type: 0800 (Int
IpHeader: Version : 4 Header Length: 20
Tos : 00 QOS: Routine Normal Service
Packet Length : 68 ID Number: 0AFD
Fragment : Offset: 0
TTL : 1 Protocol: OSPFIGP CheckSum: C253
Source : 10.1.2.11
Destination : 224.0.0.5
------------------------------------------------------------------------------
366 SC30 OSAENTA 00000007 15:01:33.360143 OSA-Express NTA
To Interface : EZANTAOSA2080 Full=86
Tod Clock : 2010/09/24 14:20:35.933003
Sequence # : 0 Flags: Pkt Out Nta Vlan Lpar L3
Source : 10.1.2.11
Destination : 224.0.0.5
Source Port : 0 Dest Port: 0 Asid: 0000 TCB: 0000000
Frame: Device ID : 02030002 Sequence Nr: 373 Discard: 0 (OK)
EtherNet II : 8100 IEEE 802.1 Vlan Len: 0x0044 (68
Destination Mac : 01005E-000005 ()
Source Mac : 00096B-1A7490 (IBM)
Vlan_id : 10 Priority: 0 Type: 0800 (Int
IpHeader: Version : 4 Header Length: 20
Tos : 00 QOS: Routine Normal Service
[Figure, diagram not reproduced. Recoverable labels: LPAR-1, LPAR-2; Interface1, Interface2, Interface3, Interface4; VLAN-2, VLAN-3; zSeries CEC-1; PCI-X; OSA]
If you have multiple interfaces in the same stack, you must issue this command for each
interface.
Note: You can use the OSAENTA trace facility to debug any problem with OSX and OSM
CHPID types.
When you use NTA, you must have a data path that is available for each NTA that you start.
An OSM CHPID type supports nine data paths; an OSX CHPID type supports 17 data
paths. For more information, see Chapter 10, “IBM z/OS in an ensemble” on page 419.
[Figure, diagram not reproduced. Recoverable labels: CS z/OS and components; Instrumentation; Local monitor; APIs; Commands/Utilities; Exit points; SNMP; z/OS]
The NMI API can interface with IBM Tivoli® OMEGAMON® XE for Mainframe Networks (or
other products) to provide the following types of functions:
Trace assistance: Real-time tracing and formatting for packet and data traces (including
OSA trace)
Information gathering:
– TCP connection initiation and termination notifications
– API for real-time access to TN3270 server and FTP event data and to IPSec
– APIs to poll information about currently active connections
– TCP listeners (server processes)
– TCP connections (detailed information about individual connections and UDP
endpoints)
– Communications Server storage usage
– API to receive and poll for Enterprise Extender management data
The resolver callable NMI is modeled after the TCP/IP callable NMI. You can use the same
triplet and quadruplet structures to identify the offset, length, and number of various types of
information about requests and responses. The calling application, which must be authorized,
provides an output buffer area to hold the NMI response data.
The resolver NMI provides the current resolver setup statement and global TCPIP.DATA
statement values. The resolver indicates in the NMI output whether the configuration
statements were explicitly defined or were defaulted. The resolver also includes the source
file names of the z/OS file system files from which the configuration data was retrieved.
The objective of the TCP/IP product is to define and generate the lowest level of detail that
is needed by all disciplines. To collect and analyze the SMF records that are created by
TCP/IP and to generate reports from them, a customer must use other products, such as
IBM RMF™, Performance Reporter for z/OS (PR), MVS Information Control System (MICS), or
SAS-based tools. In many cases, customer-written programs perform this work.
Note: SMF records that are produced by TCP/IP should not be viewed in isolation. Other
components in MVS produce SMF records for the same purposes as those produced by
TCP/IP. An installation is likely to combine information from a series of subsystems when
performing detailed performance or capacity planning.
The contents of SMF records can be used to generate reports in customized formats that help
customers to perform tasks such as the following ones:
Performance management
Customized reports can be generated to verify whether the defined service levels are met
and, if not, to identify possible causes. These reports usually cover a set of time
intervals, ranging from weeks down to days, that match the SMF interval. Examples of
potential reports that are related to performance management are as follows:
– TCP connection elapsed time per server port number per time of day (potentially
broken down by source IP address or netmask)
– Number of TCP connections per server port number per time of day (potentially broken
down by source IP address or netmask)
– Number of inbound/outbound bytes transferred in TCP connections per time of day
(potentially broken down in various ways: by destination or source port, by destination
IP address, netmask, or in total)
– Events that are related to dynamic VIPA environment, such as the following events:
• Status changes
• DVIPA removed or added
• Changes on the target server (stop/start)
Capacity planning
Capacity planning can be done by using the SMF records to generate reports showing
trends for resource utilization of central processing power, memory, channel-based I/O
subsystem, network attachments, and network bandwidth, over a period. These trends
can help with planned launches of new applications or use of existing applications to
predict capacity needs in the future. Some examples of potential reports that are related to
capacity planning are:
– Total number of TCP connections per reserved server port number per day including
analysis of average and variations around average during daily peak periods
– Total number of inbound/outbound UDP datagrams per reserved server port
number per day including average and variations around average during daily peak
periods
– Number of bytes or packets transferred inbound and outbound per interface (LINK) per
time of day (potentially broken down into unicasts, broadcasts, and multicasts)
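As a rough illustration of the kinds of reports listed above, the following Python sketch aggregates connection records by server port and hour of day. The record layout (dictionaries with `port`, `end_time`, `bytes_in`, and `bytes_out` keys) is hypothetical and stands in for fields that a site would first extract from SMF records with its own tooling; it is not the SMF record format.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical, already-decoded connection records (not the SMF layout).
records = [
    {"port": 23, "end_time": datetime(2016, 11, 1, 9, 15), "bytes_in": 1200, "bytes_out": 5300},
    {"port": 23, "end_time": datetime(2016, 11, 1, 9, 40), "bytes_in": 800, "bytes_out": 2100},
    {"port": 21, "end_time": datetime(2016, 11, 1, 14, 5), "bytes_in": 50000, "bytes_out": 900},
]

def summarize(records):
    """Count connections and total bytes per (server port, hour of day)."""
    summary = defaultdict(lambda: {"connections": 0, "bytes": 0})
    for rec in records:
        key = (rec["port"], rec["end_time"].hour)
        summary[key]["connections"] += 1
        summary[key]["bytes"] += rec["bytes_in"] + rec["bytes_out"]
    return dict(summary)

report = summarize(records)
for (port, hour), totals in sorted(report.items()):
    print(f"port {port:5d} hour {hour:02d}: "
          f"{totals['connections']} connections, {totals['bytes']} bytes")
```

The same grouping key could be extended with a source IP address or netmask to produce the finer breakdowns that the list mentions.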
Depending on the configuration for the z/OS Communications Server - TCP/IP component,
SMF records can be cut at multiple levels in the TCP/IP protocol stack, and the type of
information that can be included depends on where the SMF record is created:
At the IP and interface layer
Information about ICMP activity, IP packet fragmentation and reassembly activity, IP
checksum errors, IGMP activity, and ARP activity. This information is important to
generate reports that are related either to performance or capacity management.
At the transport protocol layer
Information about IP addresses, port numbers, and host names. It also has information
about TCP connections, such as byte counts, connection times, reliability metrics, and
performance metrics. For UDP-related workload, each UDP datagram is a separate entity;
the only way to aggregate information for UDP is on a UDP socket level, where SMF
records can be created every time a UDP socket is closed.
At the application layer
Currently, application-layer SMF recording is done for the TN3270 Telnet server (Telnet),
the FTP server, and the IKE daemon, but not for any other servers.
Table 9-3 Events that are logged by using SMF record type 118
Events Subtype records
TCP/IP statistics 5
SMF record type 118 provides basic information and does not have information that is related
to the TCP/IP stack. In a multiple-stack environment, it is not easy to determine which SMF
records relate to which TCP/IP stack.
SMF record type 119 contains additional values that identify the TCP/IP stack, which solves
the record 118 problem. It also provides other advantages such as uniformity of date and time
(UTC), common record format (self-defining section and TCP/IP identification section), and
support for IPv6 addresses and expanded field sizes (64-bit versus 32-bit) for some counters.
The SMF record type 119 subtype records that are available are shown in Table 9-4.
Table 9-4 Events that are logged by using SMF record type 119
Events Subtype records
DVIPA removed 33
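The expanded counter sizes in record type 119 matter when reports compute differences between interval samples. The following Python sketch uses made-up counter values, not real SMF data, to show how a 32-bit counter wraps and how a delta can still be computed, assuming that the counter wrapped at most once between the two samples.

```python
def counter_delta(previous, current, bits=32):
    """Return the increase between two samples of a wrapping counter.

    Assumes that the counter wrapped at most once between the samples.
    """
    modulus = 1 << bits
    return (current - previous) % modulus

# Illustrative values only: the counter wrapped between the two samples.
previous_sample = 4_294_967_000   # close to the 32-bit limit of 4,294,967,295
current_sample = 704              # the counter has wrapped past zero

delta_32 = counter_delta(previous_sample, current_sample, bits=32)
print(delta_32)
```

With 64-bit counters the same function applies, but a wrap is far less likely within any realistic reporting interval.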
9.8.1 Concept
You may use the EZACMD command to run selected z/OS Communications Server UNIX
commands from other command environments, such as the MVS console, IBM Tivoli NetView for
z/OS, and TSO.
The command is used as a common interface for the running of specific z/OS
Communications Server TCP/IP infrastructure policy-related commands (pasearch, trmdstat,
nssctl, ipsec, and ping) from other environments beyond z/OS CS UNIX.
However, if you want to use this feature, you must enable EZACMD. The following sections
describe how to enable the EZACMD functions.
Note: You must configure and enable the System REXX component to use EZACMD.
pasearch Queries information from the z/OS Communications Server policy agent.
nssctl Displays information about NSS clients that are currently connected to the local NSS
server.
trmdstat Displays a consolidated view of log messages that are written out by the Traffic
Regulation Management daemon (TRMD).
ipsec Displays and modifies IP security (IPSec) information for a local TCP/IP stack and the
IKE daemon. It is also used for the NSS IPSec client that uses the IPSec network
management service of the local NSS server.
ping Tests the connectivity between devices and the z/OS system.
The command that EZACMD runs is one of the supported z/OS UNIX commands (pasearch,
trmdstat, nssctl, ipsec, and ping). The command name is not case-sensitive.
For the command options, see the options for z/OS UNIX commands in z/OS Communications
Server: IP System Administrator’s Commands, SC31-8781, z/OS Communications Server:
Quick Reference, SX75-0124, and z/OS Communications Server: IP Configuration Guide,
SC31-8775. Options are case-sensitive and must be entered in the required case.
MAX is the optional maximum number of output lines. The default is 100, and the maximum
is 64000.
Note: Each environment has specific requirements and characteristics for using EZACMD,
which are described in 9.8.5, “Configuring z/OS for using the EZACMD” on page 412.
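The syntax rules described above can be sketched as a small validation routine. This Python example is only an illustration of the rules (supported command names, case handling, and the MAX limits); the function name, the lower bound of 1 for MAX, and the parsing approach are assumptions, and it does not reflect how EZACMD itself is implemented.

```python
SUPPORTED_COMMANDS = {"pasearch", "trmdstat", "nssctl", "ipsec", "ping"}
DEFAULT_MAX_LINES = 100
MAX_LINES_LIMIT = 64000

def parse_ezacmd_arguments(argument_string):
    """Split an EZACMD argument string into (command, options, max_lines)."""
    tokens = argument_string.split()
    max_lines = DEFAULT_MAX_LINES
    # An optional trailing MAX=n token limits the number of output lines.
    if tokens and tokens[-1].upper().startswith("MAX="):
        max_lines = int(tokens.pop()[4:])
        if not 1 <= max_lines <= MAX_LINES_LIMIT:  # lower bound is an assumption
            raise ValueError("MAX must be between 1 and 64000")
    if not tokens:
        raise ValueError("no command was given")
    command = tokens[0].lower()        # the command name is not case-sensitive
    if command not in SUPPORTED_COMMANDS:
        raise ValueError(f"unsupported command: {tokens[0]}")
    options = tokens[1:]               # options keep the case that was entered
    return command, options, max_lines

print(parse_ezacmd_arguments("ipsec -f display -r short -p tcpipc MAX=10"))
print(parse_ezacmd_arguments("PING -v w3.ibm.com"))
```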
In the figure, item 1 shows that REXX is the constant, and &SYSCLONE is the system symbol
defining a one- to two-character shorthand notation for the system name. This command
prefix is available sysplex-wide to route commands between images within a sysplex.
2. Follow the System REXX documentation for defining JCL procedures and RACF
definitions.
Table 9-8 Steps to configure the MVS support for selected TCP/IP commands
Task: Enable the use of EZACMD from the MVS console.
How to do it: Configure and enable the System REXX component on z/OS.
Reference: Chapter 8, “AXR00 (default System REXX data set concatenation)”, in MVS
Programming: Authorized Assembler Services Guide, SA22-7608, and Chapter 31, “System
REXX”, in MVS Initialization and Tuning Reference, SA22-7592
Task: Call z/OS Communications Server UNIX policy-related commands from the MVS
console, TSO, or NetView environments.
How to do it: Use the new EZACMD command, followed by a specific policy-related
command, such as pasearch, trmdstat, nssctl, ipsec, or ping, as input.
Reference: z/OS Communications Server: IP System Administrator’s Commands, SC31-8781
Example 9-44 Response of the ipsec command issued through EZACMD from the MVS console
REXX32EZACMD 'ipsec -f display -r short -p tcpipc MAX=10'
System REXX EZACMD: ipsec command - start - userID=CS03
System REXX EZACMD: ipsec -f display -r short -p tcpipc
FilterName |FilterNameExtension
|GroupName
System REXX EZACMD: Maximum number of output lines (10) has been
reached.
System REXX EZACMD: ipsec command - end - RC=4
Note: If you need help while in the console, type the following line, where (pref) is your
current sysplex/partition:
(pref)EZACMD ? /-? /help
Table 9-9 Steps to configure the use of EZACMD by TSO and NetView
Task: Enable the use of EZACMD from the z/OS TSO.
How to do it:
1. Copy EZACMD to a REXX library that is used by TSO.
2. Concatenate tcpip.SEZAEXEC to the SYSEXEC or SYSPROC DD name.
Reference: z/OS Communications Server: IP System Administrator’s Commands, SC31-8781
Note: To preserve the case of the entered arguments, prefix the EZACMD command with the
NetView NETVASIS command:
netvasis ezacmd ping -v w3.ibm.com max=20
Example 9-45 shows the response to the EZACMD command in Figure 9-24.
***
Example 9-46 EZACMD integrated in REXX through the NetView PIPE command
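/* NetView REXX */
/* Command to run: a verbose ping of the loopback address */
cmd = 'EZACMD ping -v 127.0.0.1'
/* Run the command in a pipeline and collect the output in stem cmdout. */
address NETVASIS 'PIPE NETV 'cmd' | Corrwait 10 | Stem cmdout.'
/* cmdout.0 holds the number of output lines that were collected */
if cmdout.0 > 0 then do nvix=1 to cmdout.0
say '**'||cmdout.nvix
end
exit(0)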
There are no specific requirements for using EZACMD in a TSO REXX program. It can be
invoked like any other TSO command by using an address command, as shown in
Example 9-47.
Create an OPERCMDS resource profile with the following name to protect the EZACMD
command:
MVS.SYSREXX.EXECUTE.EZACMD
Only logged-in console users who are authorized with READ access to that profile can use
the EZACMD command from the z/OS console. This level of security applies to the z/OS
console environment only.
The z/OS UNIX command security that is described in Table 9-10 applies to all environments
in which the EZACMD command is used.
SERVAUTH profiles are especially useful with the ipsec command, so consider using them
for that command.
For more information about this topic, see Appendix E, “Steps for preparing to run IP
security”, in z/OS Communications Server: IP Configuration Guide, SC27-3650.
Table 9-11 The five types of operator security that are supported by z/OS NetView
Columns: NetView SECOPTS.OPERSEC setting; _BPX_USERID passed to z/OS UNIX by EZACMD;
z/OS UNIX command; What it does
SAFDEF
_BPX_USERID passed to z/OS UNIX by EZACMD: NetView operator SAF user ID
z/OS UNIX command: NetView operator SAF user ID, UID, and GID
What it does: SAF checking for both logon passwords and attributes (NETVIEW segment).
SAFCHECK
_BPX_USERID passed to z/OS UNIX by EZACMD: NetView operator SAF user ID
z/OS UNIX command: NetView operator SAF user ID, UID, and GID
What it does: Logon passwords are checked by SAF. Attributes are specified in
DSIOPF/DSIPRF. DATASET and OPERCMDS classes are checked at the task level.
SAFPW
_BPX_USERID passed to z/OS UNIX by EZACMD: NetView operator SAF user ID
z/OS UNIX command: NetView operator SAF user ID, UID, and GID
What it does: Logon passwords are checked by SAF. Attributes are specified in
DSIOPF/DSIPRF. DATASET and OPERCMDS classes are checked at the NetView level.
NETVPW
_BPX_USERID passed to z/OS UNIX by EZACMD: NetView started task user ID
z/OS UNIX command: NetView started task SAF user ID, UID, and GID
What it does: Logon passwords are defined in DSIOPF or DSIEX12. Attributes are
specified in DSIOPF/DSIPRF.
MINIMAL
_BPX_USERID passed to z/OS UNIX by EZACMD: NetView started task user ID
z/OS UNIX command: NetView started task SAF user ID, UID, and GID
What it does: Ignore logon passwords and attributes.
In Example 9-49, the IP address cannot be reached from the location where the ping
command is issued, and System REXX times out the command after 30 seconds.
Information about z/OS Communications Server product support is at the following address:
http://www.ibm.com/software/network/commserver/zos/support/
Information about IBM Tivoli OMEGAMON XE for Mainframe Networks is at the following
address:
http://publib.boulder.ibm.com/tividd/td/IBMTivoliOMEGAMONXEforMainframeNetworks1.0.html
With zEnterprise, virtualized resources of both the z Systems platform and selected IBM
blades, which are housed in the zEnterprise BladeCenter Extension (zBX), are pooled and
jointly managed through the zEnterprise Unified Resource Manager.
This chapter covers the topics that are shown in Table 10-1.
10.3, “Connectivity” on page 422: The connections between z Systems and zBX.
10.4, “Enabling z/OS as a member of the ensemble” on page 423: Requirements for z/OS to
become a member of the ensemble.
10.5, “Adding z/OS Communications Server into the ensemble” on page 430: How to define
and verify the z/OS ensemble interfaces.
The zBX components are configured, managed, and serviced the same way as the other
components of z Systems. Although the zBX processors are not z Systems PUs and run
specific software, including hypervisors, the software that is intrinsic to the zBX components
does not require any additional administration effort or tuning by the user. In fact, it is handled
as z Systems Licensed Internal Code (LIC). The zBX hardware features are part of the
mainframe, not add-ons.
For more information, see Building an Ensemble Using IBM zEnterprise Unified Resource
Manager, SG24-7921.
[Figure: ensemble connectivity diagram. Labels: OSA-E3 1000 Base T Ethernet, OSA-E3
10 Gigabit Ethernet "OSX", intraensemble data network, top of rack switches, HMC]
The communication among LPARs in the same CPC can take advantage of using the internal
queued direct I/O extensions (IQDX) function.
Example 10-1 BPXPRMnn changes to add the IPv6 address family to the z/OS image
NETWORK DOMAINNAME(AF_INET) A
DOMAINNUMBER(2)
MAXSOCKETS(10000)
TYPE(CINET)
INADDRANYPORT(10000)
INADDRANYCOUNT(2000)
NETWORK DOMAINNAME(AF_INET6) B
DOMAINNUMBER(19)
MAXSOCKETS(10000)
TYPE(CINET)
With these definitions in BPXPRMnn, dual-mode TCP/IP stacks are supported (IPv4 with
AF_INET (A) and IPv6 with AF_INET6 (B)). If your z/OS image contains only one TCP/IP
stack, your definition is simpler, indicating a TYPE(INET) and omitting the INADDRANYPORT and
INADDRANYCOUNT parameters.
The INADDRANYPORT and INADDRANYCOUNT values for NETWORK AF_INET6 are taken from
the NETWORK AF_INET statement. These values are ignored if they are specified on the
NETWORK statement for AF_INET6.
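To make the effect of the two parameters concrete, the following Python sketch computes the port range that the values in Example 10-1 describe. It assumes the usual meaning of the parameters (INADDRANYPORT is the first port of the reserved range and INADDRANYCOUNT is the number of ports in it); the function name is illustrative only.

```python
def reserved_port_range(inaddrany_port, inaddrany_count):
    """Return the first and last port of the range reserved under common INET."""
    first = inaddrany_port
    last = inaddrany_port + inaddrany_count - 1
    return first, last

# Values from the AF_INET NETWORK statement in Example 10-1.
first, last = reserved_port_range(10000, 2000)
print(f"Reserved ports: {first}-{last}")
```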
Location 1 shows that common INET (CINET) is running with the IPv6 PFS and the address
family for IPv6 (AF_INET6).
Next, display the TCP/IP stack’s home list to verify that a LOOPBACK6 device appears there,
indicating that the stack itself is enabled for IPv6 (Example 10-3).
Example 10-3 A z/OS display of the dual-mode TCP/IP stack and its home list with IPv6 enabled
D TCPIP,TCPIPA,N,HOME
The ENSEMBLE start option must be changed to indicate ENSEMBLE=YES. You can enable the
option in either of two ways:
Change the VTAM Start Options to include the option setting.
Issue a MODIFY command.
Before you enable the ENSEMBLE option, examine the VTAM Transport Resource List Entries
(TRLEs) to determine whether any are built for the INMN network. Example 10-4 shows the
VTAM member ISTTRL.
In the example LPAR, the VTAM Start Options are SYS1.VTAMLST(ATCSTR30). We add a
parameter to this Start Option list so that the next recycle of VTAM can make z/OS ready for
the ensemble.
.....
ENSEMBLE=YES, A X
......
Fortunately, the start option ENSEMBLE (A) is dynamically modifiable; therefore, without
waiting for the next recycle of VTAM, you can change the default of ENSEMBLE=NO to
ENSEMBLE=YES. Use the following command from the z/OS console to change the setting of
the parameter:
F NET,VTAMOPTS,ENSEMBLE=YES
After the change in VTAM Start options, the next step to implement the ensemble is to recycle
the TCP/IP stack. This step dynamically creates the INMN interfaces and the corresponding
TRLEs in VTAM, as shown in 10.4.3, “Validating the INMN interfaces in z/OS” on page 426.
Important: If the host is not added to the ensemble in VTAM, the following message is
displayed when you start the TCP/IP stack:
EZZ4336I ERROR DURING ACTIVATION OF INTERFACE OSA230AI - CODE 10103037
DIAGNOSTIC CODE 01
Example 10-6 Display the TRLEs for the INMN Connections in VTAM
D NET,E,ID=ISTTRL
IST097I DISPLAY ACCEPTED
IST075I NAME = ISTTRL, TYPE = TRL MAJOR NODE 248
...
A simple display of one of the ensemble TRLEs in VTAM shows which device addresses from
the IOCDS are used to build the TRLE. As an example, we display the TRLE for an OSM
CHPID (see Example 10-7).
Example 10-7 TRLE display of the devices to be used for an OSM interface
D NET,E,ID=IUTMT00A
IST097I DISPLAY ACCEPTED
IST075I NAME = IUTMT00A, TYPE = TRLE 281
IST486I STATUS= ACTIV, DESIRED STATE= ACTIV
IST087I TYPE = LEASED , CONTROL = MPC , HPDT = YES
IST1954I TRL MAJOR NODE = ISTTRL
IST1715I MPCLEVEL = QDIO MPCUSAGE = SHARE
IST2263I PORTNAME = IUTMP00A PORTNUM = 0 OSA CODE LEVEL = 0932
IST2337I CHPID TYPE = OSM CHPID = 0A
IST1577I HEADER SIZE = 4096 DATA SIZE = 0 STORAGE = ***NA***
IST1221I WRITE DEV = 2341 STATUS = ACTIVE STATE = ONLINE
IST1577I HEADER SIZE = 4092 DATA SIZE = 0 STORAGE = ***NA***
IST1221I READ DEV = 2340 STATUS = ACTIVE STATE = ONLINE
IST924I -------------------------------------------------------------
IST1221I DATA DEV = 2342 STATUS = ACTIVE STATE = N/A
IST1724I I/O TRACE = OFF TRACE LENGTH = *NA*
IST1717I ULPID = TCPIP ULP INTERFACE = EZ6OSM01
IST2310I ACCELERATED ROUTING DISABLED
IST2331I QUEUE QUEUE READ
IST2332I ID TYPE STORAGE
IST2205I ------ -------- ---------------
IST2333I RD/1 PRIMARY 4.0M(64 SBALS)
IST2305I NUMBER OF DISCARDED INBOUND READ BUFFERS = 0
IST1757I PRIORITY1: UNCONGESTED PRIORITY2: ****NA****
IST1757I PRIORITY3: ****NA**** PRIORITY4: ****NA****
IST2190I DEVICEID PARAMETER FOR OSAENTA TRACE COMMAND = 01-01-00-03
IST1801I UNITS OF WORK FOR NCB AT ADDRESS X'2807E010'
IST1802I P1 CURRENT = 1 AVERAGE = 2 MAXIMUM = 4
IST924I -------------------------------------------------------------
IST1221I DATA DEV = 2343 STATUS = RESET STATE = N/A
IST1724I I/O TRACE = OFF TRACE LENGTH = *NA*
IST924I -------------------------------------------------------------
................
IST1500I STATE TRACE = OFF
IST314I END
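When many TRLEs must be checked, the device lines of such a display can be extracted programmatically. The following Python sketch parses the IST1221I lines that are shown in Example 10-7; the regular expression is written against this sample output only and is an assumption about the line layout, not a documented message format.

```python
import re

# IST1221I lines copied from Example 10-7.
display = """\
IST1221I WRITE DEV = 2341 STATUS = ACTIVE STATE = ONLINE
IST1221I READ DEV = 2340 STATUS = ACTIVE STATE = ONLINE
IST1221I DATA DEV = 2342 STATUS = ACTIVE STATE = N/A
IST1221I DATA DEV = 2343 STATUS = RESET STATE = N/A
"""

DEVICE_LINE = re.compile(
    r"^IST1221I\s+(WRITE|READ|DATA)\s+DEV\s*=\s*(\w+)"
    r"\s+STATUS\s*=\s*(\w+)\s+STATE\s*=\s*(\S+)"
)

def parse_devices(text):
    """Return one dictionary per IST1221I device line in the display text."""
    devices = []
    for line in text.splitlines():
        match = DEVICE_LINE.match(line.strip())
        if match:
            role, device, status, state = match.groups()
            devices.append(
                {"role": role, "device": device, "status": status, "state": state}
            )
    return devices

devices = parse_devices(display)
active_data = [d["device"] for d in devices
               if d["role"] == "DATA" and d["status"] == "ACTIVE"]
print(active_data)
```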
Example 10-8 Display the INMN and IEDN addresses and interface names in TCP/IP
D TCPIP,,N,HOME
Note how the addresses for the INMN interfaces (A and B) are IPv6 LINK_LOCAL addresses
that begin with the prefix FE80. The dynamically assigned names for the autoconfigured
interfaces are EZ6OSM01 and EZ6OSM02.
Example 10-9 Display the OSA Information for an OSM OSA interface
D TCPIP,TCPIPA,OSAINFO,INTFNAME=EZ6OSM01 1
EZZ0053I COMMAND DISPLAY TCPIP,,OSAINFO COMPLETED SUCCESSFULLY
Example 10-9 on page 428 provides valuable information with the OSAINFO display:
1 Syntax of command to display a single OSM, dynamically generated
interface.
2 Dynamically assigned port name for an OSM TRLE and interface.
3 The OSM must be on Port number 0 of the OSM adapter.
4 The data path assignment correlates with the IOCDS for the generated
TRLE.
5, 6, 7 The physical channel identifier (PCHID), CHPID number, and CHPID type
correlate with the IOCDS.
8 The MCL level of the OSA port.
9, 10, 11 This is a Copper OSA, capable of jumbo frames, operating in ISOLATE mode.
12 The physical MAC address of the OSA port.
13 The management OSA does not perform priority queuing in either direction.
14 The management OSA operates only in Layer 2 mode with no Layer 3
routing.
15 The management OSA port is operating in ACCESS mode; the TOR switch
assigns the VLAN ID, and the stack is unaware of any VLAN ID.
16, 17 A Virtual MAC is active and its address is displayed with the Ensemble prefix.
18 The Virtual MAC was fully generated by the OSA itself by using the Ensemble
prefix.
The next display is a typical NETSTAT output display. Use it for more information about the
OSM OSA port (Example 10-10).
Note: Only the OSA-Express3 10 GbE or OSA-Express4S can be defined as an OSX
interface.
2. On the Manage Virtual Networks window, select New Virtual Network (Figure 10-4).
2. Select the hosts for the VLAN and click Next to add them. See Figure 10-7.
Figure 10-9 shows the number of members that are added for a specific VLAN. The example
has two OSX interfaces for each host.
In the example, we create two VLANs (cs113res111 and cs113res112) for high availability
purposes.
Important: If the host was not added to the VLAN that is created in zEnterprise Unified
Resource Manager, the following messages are displayed at the activation of the interface:
EZD0004I ERROR SETTING VLAN ID FOR INTERFACE OSA232AI
EZZ4341I DEACTIVATION COMPLETE FOR INTERFACE OSA232AI
Example 10-12 shows how to define the INTERFACE statements by using the CHPID number in
SC30 LPAR for TCPIPA. We insert the statements for the interfaces on IEDN VLANs 111 and
112 into the TCP/IP profile.
Note: The IEDN interfaces can also be dynamically added with an OBEYFILE, but the INMN
connections require a stack initiation for the initial dynamic creation.
Example 10-13 Display the INMN and IEDN addresses and interface names in TCP/IP
D TCPIP,TCPIPA,N,HOME
Observe the names of the OSX interfaces at 1 - 4. These are the names that we preassigned
in our INTERFACE definitions in the TCP/IP profile. The interfaces are assigned the IPv4
addresses that we planned for.
Also, you may choose one of the following ways to define OSX interfaces for VTAM:
You can allow VTAM to build the TRLEs for the IP interfaces dynamically by referring to the
CHPID number.
You can predefine the VTAM TRLEs with a PORTNAME, and then code the IP interface
definitions by using the PORTNAME.
Example 10-14 Display the OSA information for an OSX OSA interface
D TCPIP,TCPIPA,OSAINFO,INTFN=OSA230AI
EZZ0053I COMMAND DISPLAY TCPIP,,OSAINFO COMPLETED SUCCESSFULLY
Display OSAINFO results for IntfName: OSA230AI
PortName: IUTXP018 PortNum: 00 Datapath: 2304 RealAddr: 0004
PCHID: 0590 CHPID: 18 CHPID Type: OSX OSA code level: 0D2F
Gen: OSA-E3 Active speed/mode: 10 gigabit full duplex
Media: Multimode Fiber Jumbo frames: Yes Isolate: No
PhysicalMACAddr: 001A643B2135 LocallyCfgMACAddr: 000000000000
Queues defined Out: 4 In: 1 Ancillary queues in use: 0
Connection Mode: Layer 3 IPv4: Yes IPv6: No
SAPSup: 000FF603 SAPEna: 0008A603
IPv4 attributes:
VLAN ID: 111 VMAC Active: Yes
VMAC Addr: 0207E300002D VMAC Origin: OSA VMAC Router: All
AsstParmsEna: 00300C57 OutCkSumEna: 0000001A InCkSumEna: 0000001A
Registered Addresses:
IPv4 Unicast Addresses:
ARP: Yes Addr: 10.1.111.11
Total number of IPv4 addresses: 1
IPv4 Multicast Addresses:
MAC: 01005E000001 Addr: 224.0.0.1
Total number of IPv4 addresses: 1
23 of 23 lines displayed
End of report
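One quick way to tell the two MAC addresses in this display apart is the locally administered bit, which is the second-least-significant bit of the first octet. This is a general IEEE 802 addressing convention rather than something that the display states, and the helper name in this Python sketch is illustrative.

```python
def is_locally_administered(mac_hex):
    """Return True if the locally administered bit of a MAC address is set."""
    first_octet = int(mac_hex[0:2], 16)
    return bool(first_octet & 0x02)

physical_mac = "001A643B2135"   # PhysicalMACAddr from Example 10-14
virtual_mac = "0207E300002D"    # VMAC Addr from Example 10-14

print(is_locally_administered(physical_mac))
print(is_locally_administered(virtual_mac))
```

The virtual MAC has the bit set and the physical MAC does not, which is what you would expect for an address that is generated rather than burned in.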
To provide this capability, configure the elected IQD CHPID by using a channel parameter (in
this case, it is the IQDX function in HCD), which enables the IQDX function of HiperSockets.
See Figure 10-10.
When the IQDX function is configured, the single IQD CHPID is integrated with the OSX to
the IEDN, inheriting the OSX configuration and eliminating the HiperSockets configuration
tasks within z/OS Communications Server and Unified Resource Manager.
[Figure: With AUTOIQDX, the IQDX interface is dynamically configured and transparently
managed ("tucked" under OSX). Only OSX connectivity must be configured; IQDX
connectivity to the internal IEDN is transparent.]
For more information about IQDX function and its benefits, see z/OS Communications Server
IP Configuration Guide, SC31-8775.
Important: To define the IQDX function, add sufficient subchannel addresses to enable
the creation of one dynamic IQDX TRLE for each OSX CHPID for each IP version
(Version 4 and Version 6). Define at least ten IQDX subchannel addresses for each
OSX CHPID for IPv4, and ten subchannel addresses for each OSX CHPID for IPv6,
regardless of the number of VLANs.
3. Authorize the LPAR to use a VLANID, replicating this authorization to all interfaces for this
LPAR on the IEDN. This setting is the default, so no further changes are needed in
zEnterprise Unified Resource Manager.
4. Enable HiperSockets access to the IEDN. Configure the appropriate value for the
AUTOIQDX parameter on the GLOBALCONFIG statement in the TCP/IP profile. You can code
the following values:
– NOAUTOIQDX: Do not use the IQDX interfaces.
– AUTOIQDX ALLTRAFFIC: This value is the default. Use IQDX interfaces for all eligible
outbound traffic on the IEDN.
– AUTOIQDX NOLARGEDATA: Large outbound TCP data is transported through the IEDN
over OSX interfaces; IQDX interfaces are used for all other eligible outbound traffic.
For more information about IQDX function implementation, see z/OS Communications Server
IP Configuration Guide, SC31-8775.
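The subchannel sizing rule from the Important note earlier in this section can be expressed as simple arithmetic. The following Python sketch assumes the stated minimum of ten subchannel addresses for each OSX CHPID and each IP version; the function name is illustrative.

```python
ADDRESSES_PER_TRLE = 10  # stated minimum for each OSX CHPID and each IP version

def min_iqdx_subchannels(osx_chpids, use_ipv4=True, use_ipv6=True):
    """Return the minimum number of IQDX subchannel addresses to define."""
    ip_versions = int(use_ipv4) + int(use_ipv6)
    return osx_chpids * ip_versions * ADDRESSES_PER_TRLE

# Two OSX CHPIDs with both IPv4 and IPv6 in use; the VLAN count does not matter.
print(min_iqdx_subchannels(2))
```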
IPv6 was developed to resolve impending problems that are related to the limitations of IPv4
and the rapidly growing demand for IP resources and functions; the most significant issue is
the diminishing supply and expected shortages of IPv4 addresses.
Using IPv4 32-bit addressing allows for over 4 billion nodes, each with a globally unique
address. This current IPv4 space cannot satisfy the huge expected increase in the number of
users on the internet. The expected shortage is exacerbated by the requirements of emerging
technologies such as PDAs, home area networks, and internet-connected commodities, such
as automotive and integrated telephone services. IPv6 uses 128-bit addressing and
generates a space large enough to last for the foreseeable future.
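The difference in scale between the two address spaces follows directly from the address widths, as this short Python calculation shows.

```python
# Number of distinct addresses for each address width.
ipv4_addresses = 2 ** 32
ipv6_addresses = 2 ** 128

print(f"IPv4: {ipv4_addresses:,} addresses")
print(f"IPv6: about {ipv6_addresses:.3e} addresses")
# The IPv6 space holds 2**96 times as many addresses as the IPv4 space.
print(ipv6_addresses // ipv4_addresses == 2 ** 96)
```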
A.3.1 Tunneling
Tunneling is the transmission of IPv6 traffic that is encapsulated within IPv4 packets over an
IPv4 connection. Tunnels are used primarily to connect remote IPv6 networks, or to simply
connect an IPv6 network over an IPv4 network infrastructure.
Dependencies
All tunnel mechanisms require that the endpoints of the tunnel run in dual-stack mode. A
dual-stack router is a router running both versions of IP. There are other dependencies based
on the tunneling mechanism that is used.
For example, an IPv6 manually configured tunnel requires an ISP-registered IP address. The
automatic tunnel mechanism requires IPv6 prefixes. Intra-Site Automatic Tunnel Addressing
Protocol (ISATAP) tunnels require only a dual-stack router, but they are not yet commercially
available and 6over4 tunnels are not supported by vendor router software.
Advantages
Tunneling allows the implementation of IPv6 without any significant upgrades to the existing
infrastructure, and therefore does not risk interrupting the existing services that are provided
by the IPv4 network.
Considerations
Various tunneling mechanisms are designed for different primary tasks, so you must carefully
consider the mechanism that you choose. Some are primarily used for stable and secure links
for regular communications. Others are primarily used for single hosts or small sites, with low
data traffic volumes.
Dependencies
Dual-stack routers with IPv6 and IPv4 addresses are required to provide access to the WAN.
Access to a Domain Name System (DNS) is needed to resolve IPv6 names and addresses.
Advantages
Use of the existing Layer 2 infrastructure makes this implementation less complex and
immediate. This implementation is not disruptive, apart from a schedule change for router
configuration, and there is little impact to the status quo.
Dependencies
Dependencies vary from router configuration to specific hardware requirements to software
upgrades, depending on the service provider solution.
Advantages
Using this strategy requires minor modifications to the infrastructure and minor
reconfigurations of the core routers. It is a strategy that might have little or no impact to your
environment, involving low costs and low risks.
Considerations
Considerations also vary, depending on the strategy that is chosen. For example, using the
Circuit Transport over MPLS strategy does not support a mix of IPv4 and IPv6 traffic. IPv6 on
service provider edge routers does not currently support virtual private networks (VPNs) or
virtual routing and forwarding (VRF).
Dependencies
Each site has the appropriate entries in a DNS to resolve both IPv4 and IPv6 names and
IP addresses.
Advantages
This is a basic and simple strategy for routing IPv4 and IPv6 traffic in a network.
Considerations
All routers in the network require a software upgrade to support dual stacks. Having dual
stacks requires additional router management of a dual addressing scheme and additional
router memory.
Dependencies
A z/OS Communications Server that is configured to support IPv6 requires OSA-Express
ports to be running in QDIO mode.
Advantages
There are no additional software or hardware requirements for users in a z/OS environment
that is configured with OSA-Express features. Dual-mode allows IPv4 and IPv6 applications
to coexist indefinitely, and applications can be migrated from IPv4 to IPv6 one at a time or
at the user’s convenience. This is an inexpensive, low-risk, and low-impact deployment
strategy.
Considerations
The only link layer protocol that supports IPv6 is MPC+. The devices that use the MPC+
protocol are XCF, MPCPTP, and MPCIPA (for example, OSA-Express3 in QDIO mode and
HiperSockets on the System z196).
A.3.6 Suggestion
Using dual-mode stacks is the preferred strategy for application migration from IPv4 to IPv6.
Alternative notations that are described in RFC 4291 are acceptable, as in the following
example:
fedc:ba98:7654:3210:fedc:ba98:7654:3210
Note: Although writing the leading zeros in an individual field is unnecessary, at least
one numeral must be in every field, except for the case that is described in the next list
item.
Because of some methods of allocating certain styles of IPv6 addresses, it is common for
addresses to contain long strings of zero bits. To simplify the writing of addresses that
contain zero bits, a special syntax is available to compress the zeros. The use of the
double colon (::) indicates one or more groups of 16 bits of zeros. The :: can appear only
once in an address; it can also be used to compress the leading or trailing zeros in an
address.
Consider the following addresses:
– 1080:0:0:0:8:800:200c:417a (unicast address)
– ff01:0:0:0:0:0:0:101 (multicast address)
– 0:0:0:0:0:0:0:1 (loopback address)
– 0:0:0:0:0:0:0:0 (unspecified address)
They can be represented as follows:
– 1080::8:800:200c:417a (unicast address)
– ff01::101 (multicast address)
– ::1 (loopback address)
– :: (unspecified address)
An alternative form that is sometimes more convenient to use when dealing with a mixed
environment of IPv4 and IPv6 nodes is to use x:x:x:x:x:x:d.d.d.d. Here, the x’s are the
hexadecimal values of the six high-order 16-bit pieces of the address. The d’s are the
decimal values of the four low-order 8-bit pieces of the address (standard IPv4
representation).
Consider these examples:
0:0:0:0:0:0:13.1.68.3
0:0:0:0:0:ffff:129.144.52.38
In compressed form, they are written as follows:
::13.1.68.3
::ffff:129.144.52.38
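The notations described in this list can be checked with the Python standard library `ipaddress` module. The following sketch parses the addresses from the text, produces their compressed forms, and confirms that the mixed IPv4 and IPv6 spellings denote the same 128-bit values. It is offered as a convenience for experimenting with the notation, not as part of any z/OS interface.

```python
import ipaddress

# Full forms from the text; the compressed forms are derived from them.
full_forms = [
    "1080:0:0:0:8:800:200c:417a",
    "ff01:0:0:0:0:0:0:101",
    "0:0:0:0:0:0:0:1",
    "0:0:0:0:0:0:0:0",
]
compressed_forms = [ipaddress.IPv6Address(text).compressed for text in full_forms]
for full, short in zip(full_forms, compressed_forms):
    print(f"{full:30} -> {short}")

# Mixed notation: the low-order 32 bits are written as a dotted-decimal IPv4 address.
mapped = ipaddress.IPv6Address("0:0:0:0:0:ffff:129.144.52.38")
compatible = ipaddress.IPv6Address("0:0:0:0:0:0:13.1.68.3")

# Both spellings of each address parse to the same 128-bit value.
same_mapped = mapped == ipaddress.IPv6Address("::ffff:129.144.52.38")
same_compatible = compatible == ipaddress.IPv6Address("::13.1.68.3")

# Recover the embedded IPv4 address from the low-order 32 bits.
embedded = ipaddress.IPv4Address(int(compatible) & 0xFFFFFFFF)
print(mapped.ipv4_mapped, embedded)
```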
With minimal router configuration and no manual configuration of local addresses, a host can
generate its own IPv6 addresses. An IPv6 public autoconfigured address is the combination
of a router advertised prefix and the interface ID that is provided by the OSA-Express QDIO
adapter or manually configured by using the INTFID parameter on the INTERFACE statement.
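The combination of an advertised prefix and an interface ID can be sketched as follows. The prefix (taken from the 2001:db8::/32 documentation range) and the interface ID value are hypothetical examples, and the sketch assumes a 64-bit prefix; it only illustrates how the two 64-bit halves form one 128-bit address.

```python
import ipaddress

def autoconfigured_address(prefix, interface_id):
    """Combine a 64-bit prefix with a 64-bit interface ID."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("this sketch assumes a 64-bit prefix")
    if not 0 <= interface_id < 2 ** 64:
        raise ValueError("the interface ID must fit in 64 bits")
    return ipaddress.IPv6Address(int(network.network_address) | interface_id)

# Hypothetical router-advertised prefix and interface ID.
address = autoconfigured_address("2001:db8:0:1::/64", 0x0207E3FFFE00002D)
print(address)
```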
[Figure: IPv6 address formats. Site-local scope (deprecated): bits 0 - 9 hold the 10-bit
prefix 1111 1110 11 (FEC0), followed by 54 bits of zeros and a 64-bit interface ID (MAC or
other interface ID). Global scope (RFC 2373 and RFC 4291): a 3-bit prefix (001, or variable
for anything else), followed by a 61-bit "subnet" field and a 64-bit interface ID (MAC or
other interface ID).]
Manual security associations typically use specific IP addresses for the endpoints. You can
use wild cards for the security endpoint addresses so that the data endpoints and security
endpoints are considered identical. Alternatively, you can use predictable IPv6 addresses for
the security endpoints. You can obtain predictable IPv6 addresses by configuring full 128-bit
IPv6 addresses on your INTERFACE statements, by specifying the INTFID keyword on your
INTERFACE statements, or by using VIPAs.
To enable temporary address support for a TCP/IP stack, specify TEMPADDRS on the IPCONFIG6
statement in the TCP/IP profile. TEMPPREFIX on an interface definition specifies the set of
prefixes for which temporary IPv6 addresses can be generated.
Each address begins with a format or scope prefix of 10 bits, followed by a second field and
then an interface identifier field. Each of these addresses serves a unique purpose:
Link-local scope
These are special addresses that are only valid on a link of an interface. Using this
address as the destination, the packet never passes through a router. A packet with a
link-local source or destination address does not leave its originating LAN. A router
receiving the packet does not forward it onto another physical LAN. An address of this type
bears the prefix of fe80.
A link-local address is assigned to each IPv6-enabled interface after stateless
auto-configuration, commonly used in IPv6 implementations. The link-local address is
used for link communications, such as the following examples:
– Neighbor discovery, that is, discovering whether anyone else is on this link
– Communication with a neighbor when a router is unnecessary
Figure A-2 on page 451 shows a LAN environment that is separated into two LAN
segments, which are represented by link scope zone A with three nodes and link scope
zone B with four nodes. The link local addresses in each zone begin with the prefix fe80.
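[Figure A-2: link scope zones A and B; the link-local addresses in both zones begin with
fe80, and Node X is attached to both zones]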
Within a zone, nodes communicate with each other by using link-local addresses. Across
zones, nodes must communicate with each other by using global scope addresses.
Node X has link-local addresses in two zones: in zone A and in zone B. Because link-local
addresses use the same prefix value, it is necessary to understand which zone a packet
should be sent to, particularly when a default route is used. So, if a route exists on Node X
for any destination address with a prefix of fe80, then the routing table must distinguish
between fe80 in zone A and fe80 in zone B.
Therefore, both the address and the zone index value must be specified in the routing
table. The zone index is a value that is assigned by the stack to represent the correct entry
(or interface) in the routing table. If the zone index is not present, then the stack uses the
“default route” for this configuration.
If the default route uses the interface that matches the IPv6 link-local address that was
specified, everything works fine. However, if the default route does not use the correct
interface for the specified IPv6 link-local address, then a routing error is encountered and
the application request fails or times out. So, the zone index helps the stack to distinguish
whether the routing path should flow into zone A or into zone B.
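As a minimal sketch of the lookup just described (not the stack's actual implementation), the following Python fragment selects an outbound interface for a link-local destination. The address%zone notation, the zone names zoneA and zoneB, and the interface names IF_A and IF_B are assumptions chosen for illustration only.

```python
import ipaddress


def split_zone(text):
    """Split 'address%zone' into (IPv6Address, zone name or None)."""
    addr, _, zone = text.partition("%")
    return ipaddress.IPv6Address(addr), (zone or None)


def pick_interface(dest, zone_interfaces, default_interface):
    """Pick the outbound interface for a destination address.

    A link-local destination (fe80::/10) that carries a zone index is sent
    through the interface for that zone. Without a zone index, the default
    route's interface is used, which is correct only if that interface
    happens to be in the right zone.
    """
    addr, zone = split_zone(dest)
    if addr.is_link_local and zone is not None:
        return zone_interfaces[zone]
    return default_interface


# Hypothetical Node X with one interface in each link scope zone.
ZONES = {"zoneA": "IF_A", "zoneB": "IF_B"}
```

With the zone index present, pick_interface("fe80::1%zoneB", ZONES, "IF_A") selects IF_B. Without it, the same address falls back to IF_A, which is the failure case that the zone index avoids.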
z/OS Communications Server supports scope zone information on Getaddrinfo and
Getnameinfo invocations, and also on the z/OS socket APIs that support IPv6, thus
satisfying requirements for IPv6 compliance. One of those supported socket APIs is, for
example, the source address selection that enables specifying whether your application
prefers temporary or public addresses (if a JOBNAME procname PUBLICADDRS or TEMPADDRS
statement is specified in the SRCIP block, the API statement is ignored). In addition,
scope zone information can be included on command-line operations and in configuration
files for ftp, ping, traceroute, rexec, orexec, rsh, and orsh.
Note: Anycast addresses cannot be used as source addresses. They are used only as
destination addresses.
Further details about configuration options that are not referenced here are available in z/OS
Communications Server: IP Configuration Reference, SC27-3651 and z/OS Communications
Server: IPv6 Network and Application Design Guide, SC31-8885.
Table A-1 summarizes the z/OS TCP/IP stack-related functions and the level of support,
which are based on the current release of the z/OS Communications Server. You can use this
table to determine whether a given function is applicable and supported.
Function                                               IPv4 support   IPv6 support
CTC                                                    Y              N
LCS                                                    Y              N
CLAW                                                   Y              N
CDLC (3745/3746)                                       Y              N
X.25 NPSI                                              Y              N
NSC HyperChannel                                       Y              N
ATM                                                    Y              N
Sysplex support
Sysplex distributor integration with Cisco MNLB        Y              N
IP routing functions
Configurable Device or Interface Recovery Interval     Y              Y
Transport-layer functions
SNMP agent                                             Y              Y
Distributed Protocol Interface                         Y              Y
OMPROUTE subagent                                      Y              N
TN3270 subagent                                        Y              Y
Security function
IP filtering                                           Y              Y
IKE daemon                                             Y              Y
NAT traversal                                          Y              N
Application Transparent TLS (AT-TLS)                   Y              Y
Intrusion Detection Services (IDS)                     Y              Y
Server applications
Based on the description in “Common design scenarios for IPv6” on page 445, here we
concentrate on a single-stack environment running in dual-mode. A single-stack environment
is one TCP/IP stack running in an LPAR.
Dual-mode stack
A TCP/IP stack that supports both IPv4 and IPv6 interfaces that can receive and send IPv4
and IPv6 packets over the corresponding interfaces is referred to as a dual-mode stack. A
dual-mode stack is a single stack supporting IPv4 and IPv6 protocols, which is different from
a dual-stack mode that uses two TCP/IP stacks running side by side, each supporting only
one of the protocols (either IPv4 or IPv6).
A z/OS dual-mode stack is enabled when both AF_INET and AF_INET6 are coded in
SYS1.PARMLIB(BPXPRMxx). You cannot code AF_INET6 without specifying AF_INET,
and doing so causes the TCP/IP stack initialization to fail.
If a partner is IPv6 enabled and running on an IPv6-only stack, then communication fails. The
partner has only a native IPv6 address (not an IPv4-mapped IPv6 address). The native IPv6
address for the partner cannot be converted into a form that the AF_INET application
understands.
Older AF_INET applications can communicate only by using IPv4 addresses. IPv6-enabled
applications that use AF_INET6 sockets can communicate by using both IPv4 and IPv6
addresses (on a dual-mode stack). AF_INET and AF_INET6 applications can communicate
with one another, but only by using IPv4 addresses.
If the socket libraries on the IPv6-enabled host are updated to support IPv6 sockets
(AF_INET6), applications can be IPv6-enabled. When an application on a dual-mode stack
is IPv6-enabled, the application can communicate with both IPv4 and IPv6 partners. This is
true for both clients and servers on a dual-mode stack.
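The addressing rule behind these statements can be sketched in a few lines of Python. This is an illustration that uses the standard ipaddress module, not z/OS code; the function name and the sample addresses are assumptions. It shows why an AF_INET application can reach a partner that is presented as an IPv4-mapped IPv6 address, but not a partner that has only a native IPv6 address.

```python
import ipaddress


def as_af_inet_partner(address):
    """Return the IPv4 form of a partner address, or None if there is none.

    An IPv4 address is returned unchanged. An IPv4-mapped IPv6 address
    (::ffff:a.b.c.d) is converted to a.b.c.d. A native IPv6 address has no
    IPv4 form, so an AF_INET application cannot use it.
    """
    addr = ipaddress.ip_address(address)
    if addr.version == 4:
        return str(addr)
    mapped = addr.ipv4_mapped
    return str(mapped) if mapped is not None else None
```

For example, "::ffff:192.168.2.10" converts to "192.168.2.10", and a native address such as "2001:db8::1" yields None.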
Table A-2 summarizes the application communication rules when running in dual-mode.
IPv6-only No Yes
[Figure: dual-mode stack with an IPv4-only application and an IPv6-enabled application above the network interfaces and their attached networks]
BPXPRMxx definitions
IPv6 is not enabled by default. You must specify a NETWORK statement with AF_INET6 in your
BPXPRMxx member.
To support our dual-mode stack (IPv4 and IPv6), we add the NETWORK statement, as shown in
Example A-1, to our BPXPRMxx member.
The TYPE option in our case is INET because we use a single stack.
For more information about the definitions that are required in BPXPRMxx to provide a dual-
stack, see z/OS Communications Server: IP Configuration Guide, SC27-3650.
VTAM definitions
One of the protocols that z/OS Communications Server TCP/IP supports is MPC+, and the
MPC+ protocols are used to define the DLCs for OSA-Express devices in QDIO.
OSA-Express QDIO connections are configured through a TRLE definition. Because VTAM
provides the DLCs for TCP/IP, all TRLEs are defined as VTAM major nodes (see
Example A-2).
The PORTNAME 1 is identical to the device name that is defined in the TCP/IP PROFILE data set
on the INTERFACE statement.
TCP/IP definitions
We add one INTERFACE statement for the OSA-Express3 1000BASE-T port to support IPv6.
This statement merges the DEVICE, LINK, and HOME definitions into a single statement. Several
different parameters are associated with the INTERFACE statement. To determine which of
them best fits your requirements, see z/OS Communications Server: IP Configuration
Reference, SC27-3651.
Note: To configure a single physical device for both IPv4 and IPv6 traffic, you must use
DEVICE/LINK/HOME for the IPv4 definition and INTERFACE for the IPv6 definition so that the
PORTNAME value on the INTERFACE statement matches the device name on the DEVICE
statement.
Example A-3 shows the TCP/IP profile for our environment by using SYSTEM SYMBOLS and
INCLUDE statements. The &SYSCLONE that you see throughout the example results in a two-digit
value (30 in our example, for system SC30) being inserted. By doing this, we can use the
same profile for each of several systems, each time translating to the appropriate system
value (systems 30, 31, and 32). The &SYSCLONE value is defined in SYS1.PARMLIB.
Example A-3 Profile definition with the use of SYSTEM SYMBOLS and INCLUDE
ARPAGE 20
;
GLOBALCONFIG NOTCPIPSTATISTICS
;
IPCONFIG NODATAGRAMFWD SOURCEVIPA 1
IPCONFIG6 NODATAGRAMFWD SOURCEVIPA 2
;
SOMAXCONN 240
;
TCPCONFIG TCPSENDBFRSIZE 64K TCPRCVBUFRSIZE 64K SENDGARBAGE FALSE
TCPCONFIG TCPMAXRCVBUFRSIZE 256K
TCPCONFIG RESTRICTLOWPORTS
;
UDPCONFIG RESTRICTLOWPORTS
;
INCLUDE TCPIPE.TCPPARMS(HOME&SYSCLONE.V6)
INCLUDE TCPIPE.TCPPARMS(STAT&SYSCLONE.V6)
;
AUTOLOG 5
FTPDE&SYSCLONE JOBNAME FTPDE&SYSCLONE.1
ENDAUTOLOG
;
PORT
20 TCP * NOAUTOLOG ; FTP Server
21 TCP FTPDE&SYSCLONE.1 ; control port
23 TCP TN3270XE NOAUTOLOG ; MVS Telnet Server
23 TCP OMVS ; Telnet Server
25 TCP SMTP ; SMTP Server
514 UDP OMVS ; UNIX Syslogd daemon
;
SACONFIG ENABLED COMMUNITY public AGENT 161
;
SMFCONFIG
FTPCLIENT TN3270CLIENT
TYPE119 FTPCLIENT TN3270CLIENT
;
Example A-5 shows static routes in a flat network (no dynamic routing protocol).
The messages that are shown in Example A-6 are written to the z/OS console when the
TCP/IP stack of TCPIPE is initializing on SC30. We also manually start our external TN3270E
server (TN3270XE).
------------------------------------------------
We manually started our external TN3270E server.
------------------------------------------------
S TN3270XE
$HASP100 TN3270XE ON STCINRDR
IEF695I START TN3270XE WITH JOBNAME TN3270XE IS ASSIGNED TO USER
TCPIP , GROUP TCPGRP
$HASP373 TN3270XE STARTED
IEE252I MEMBER CTIEZBTN FOUND IN SYS1.IBM.PARMLIB
EZZ6001I TN3270XE SERVER STARTED
EZZ6044I TN3270XE PROFILE PROCESSING BEGINNING FOR FILE 897
TCPIPE.TCPPARMS(TN3270XE)
EZZ6045I TN3270XE PROFILE PROCESSING COMPLETE FOR FILE
TCPIPE.TCPPARMS(TN3270XE)
EZZ6003I TN3270XE LISTENING ON PORT 23
In this example, 1 and 2 indicate that IPv6 support is enabled and that the interface is
initialized with IPv6 addresses.
Note: The NETSTAT display classifies them as GLOBAL. This classification adheres to
RFC 4291 - Site-Local IPv6 Unicast Addresses, which states the following information:
3 This is an auto-configured LINK_LOCAL address for the same OSA Express device.
4 This is the IPv6 Loopback address.
5 These are the auto-configured LINK_LOCAL addresses for the OSM channel-path
identifiers (CHPIDs), which are part of the intranode management network (INMN).
If the device does not have an LNKSTATUS or INTFSTATUS of READY (as with 1, 2, and 3),
you must resolve this before you continue. Several factors might cause the
LNKSTATUS or INTFSTATUS to not be READY. For example, the device might not be varied
online or defined to z/OS correctly, or the device might not be defined correctly in the
TCP/IP profile.
We use the TSO ping command to verify the local IPv4 and IPv6 interfaces (see
Example A-11).
ping feC0:0:0:1001::3302
Pinging host FEC0:0:0:1001::3302
Ping #1 response took 0.000 seconds.
ping 192.168.2.10
Pinging host 192.168.2.10
Ping #1 response took 0.000 seconds.
In our case, we use MVS system symbols to enable us to share the definitions for our TCP/IP
stacks across LPARs SC30, SC31, SC32, and SC33. MVS system symbols are used in
creating shared definitions for systems that are in a sysplex. With this facility, you use the
symbols that are defined during system startup as variables in configuring your TCP/IP stack.
This means that you must create and maintain only a template file for all the systems in the
sysplex.
For the use of MVS system symbols in other configuration files, use the symbol translator
utility, EZACFSM1. EZACFSM1 reads an input file that includes the system symbols, and
creates an output file with the symbols converted to the system-specific values. This process
is done before the files are read by TCP/IP.
SYSDEF HWNAME(SCZP301)
LPARNAME(A11)
SYSNAME(SC30)
SYSPARM(00)
SYMDEF(&SYSID1='0')
SYMDEF(&BROTHER='SC31M')
In this example, at location 1, the value of SYSCLONE is defined as two characters starting from
the third character of SYSNAME. Our SYSNAME is SC30, so SYSCLONE resolves to 30.
You can also define and use your own variable in configuring Communications Server for
z/OS IP aside from &SYSNAME or &SYSCLONE. For information about creating symbols
output data set, see z/OS Communications Server: IP Configuration Guide, SC27-3650.
Important: The system symbols are stored in uppercase by MVS. Because you can code
the TCP/IP configuration statements in either uppercase or lowercase, you must ensure
that you code the system symbol name in uppercase.
In our environment, all stacks across LPARs share OSAs and use the same HiperSockets
interfaces. We can share the device-related definitions: DEVICE, LINK, BEGINROUTES, and START.
We cannot share the definitions for HOME and VIPADynamic statements because they are
unique in each TCP/IP stack, so we make them separate members and use the INCLUDE
statement. We use the SYSCLONE value to point to those members (the member names must
include the SYSCLONE value).
Note: A dot (.) is needed at the end of &SYSCLONE because the next character is not a
space or a closing parenthesis.
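The substitution rules described in this section can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not the system's symbol processing: symbol names are limited to uppercase letters and digits, and the function names are invented for the illustration.

```python
import re

# Simplified assumption: a symbol is '&' plus uppercase letters and digits,
# optionally ended by a dot that is consumed during substitution.
SYMBOL = re.compile(r"&([A-Z][A-Z0-9]*)\.?")


def sysclone_from(sysname):
    """Two characters starting from the third character of SYSNAME."""
    return sysname[2:4]


def resolve(text, symbols):
    """Replace known symbols; leave unknown symbols untouched."""
    return SYMBOL.sub(lambda m: symbols.get(m.group(1), m.group(0)), text)


symbols = {"SYSCLONE": sysclone_from("SC30")}
```

With these definitions, resolve("TCPIPE.TCPPARMS(HOME&SYSCLONE.V6)", symbols) gives TCPIPE.TCPPARMS(HOME30V6). Without the dot, &SYSCLONEV6 is read as a different, undefined symbol and is left unresolved, which is why the dot is required.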
Example B-4 shows the sample definition of a separate member for a stack-specific
statement. It contains only the HOME statement for system SC30, called HOME30. This
member is included in the PROFILE.TCPIP file in SC30 system. Likewise, define separate
members for other LPARs.
Example B-6 shows how to enable this function by using the MVS SET (T) command.
Example B-7 Without REUSASID TCPIP the old ASID is unavailable and a new ASID is assigned
D A,TCPIPA
IEE115I 14.42.09 2010.298 ACTIVITY 318
JOBS M/S TS USERS SYSAS INITS ACTIVE/MAX VTAM OAS
00004 00024 00003 00034 00019 00003/00030 00022
TCPIPA TCPIPA TCPIPA NSW SO A=0082 PER=NO SMC=000 1
PGN=N/A DMN=N/A AFF=NONE
CT=000.223S ET=333.691S
WUID=STC09685 USERID=TCPIP
WKL=SYSTEM SCL=SYSSTC P=1
RGP=N/A SRVR=NO QSC=NO
ADDR SPACE ASTE=062B7080
DSPNAME=00000EDC ASTE=093D7500
DSPNAME=TCPIPDS1 ASTE=7EE44C00
P TCPIPA
EZZ4201I TCP/IP TERMINATION COMPLETE FOR TCPIPA
IEF352I ADDRESS SPACE UNAVAILABLE 2
$HASP395 TCPIPA ENDED
S TCPIPA 3
$HASP100 TCPIPA ON STCINRDR
IEF695I START TCPIPA WITH JOBNAME TCPIPA IS ASSIGNED TO USER
TCPIP , GROUP TCPGRP
$HASP373 TCPIPA STARTED
D A,TCPIPA
IEE115I 14.43.10 2010.298 ACTIVITY 446
JOBS M/S TS USERS SYSAS INITS ACTIVE/MAX VTAM OAS
00004 00023 00002 00034 00019 00002/00030 00021
TCPIPA TCPIPA TCPIPA NSW SO A=0085 PER=NO SMC=000 4
PGN=N/A DMN=N/A AFF=NONE
CT=000.107S ET=030.909S
WUID=STC09689 USERID=TCPIP
WKL=SYSTEM SCL=SYSSTC P=1
RGP=N/A SRVR=NO QSC=NO
ADDR SPACE ASTE=062B7140
DSPNAME=00000EDC ASTE=093D7500
DSPNAME=TCPIPDS1 ASTE=7EE44C00
Example B-8 With REUSASID enabled the old ASID is available and reused
S TCPIPA,REUSASID=YES 1
$HASP100 TCPIPA ON STCINRDR
IEF695I START TCPIPA WITH JOBNAME TCPIPA IS ASSIGNED TO USER
TCPIP , GROUP TCPGRP
$HASP373 TCPIPA STARTED
D A,TCPIPA
IEE115I 14.49.38 2010.298 ACTIVITY 711
JOBS M/S TS USERS SYSAS INITS ACTIVE/MAX VTAM OAS
00004 00023 00002 00034 00019 00002/00030 00021
TCPIPA TCPIPA TCPIPA NSW SO A=0085 PER=NO SMC=000 2
PGN=N/A DMN=N/A AFF=NONE
CT=000.121S ET=069.808S
WUID=STC09694 USERID=TCPIP
WKL=SYSTEM SCL=SYSSTC P=1
RGP=N/A SRVR=NO QSC=NO
ADDR SPACE ASTE=062B7140
DSPNAME=00000EDC ASTE=093D7500
DSPNAME=TCPIPDS1 ASTE=7EE44C00
P TCPIPA
EZZ4201I TCP/IP TERMINATION COMPLETE FOR TCPIPA 3
$HASP395 TCPIPA ENDED
S TCPIPA,REUSASID=YES 4
$HASP100 TCPIPA ON STCINRDR
IEF695I START TCPIPA WITH JOBNAME TCPIPA IS ASSIGNED TO USER
TCPIP , GROUP TCPGRP
$HASP373 TCPIPA STARTED
D A,TCPIPA
IEE115I 14.56.01 2010.298 ACTIVITY 868
JOBS M/S TS USERS SYSAS INITS ACTIVE/MAX VTAM OAS
00004 00023 00001 00034 00019 00001/00030 00021
TCPIPA TCPIPA TCPIPA NSW SO A=0085 PER=NO SMC=000 5
PGN=N/A DMN=N/A AFF=NONE
CT=000.111S ET=028.495S
WUID=STC09698 USERID=TCPIP
WKL=SYSTEM SCL=SYSSTC P=1
RGP=N/A SRVR=NO QSC=NO
ADDR SPACE ASTE=062B7140
DSPNAME=00000EDC ASTE=093D7500
DSPNAME=TCPIPDS1 ASTE=7EE44C00
SOURCEVIPA
When a packet is sent to a destination host, the source IP address is included in the
packet. In most cases, the other host uses that source IP address as the destination IP
address of its return packet. For responses to inbound traffic, z/OS
Communications Server uses the destination IP address of the incoming packet as the source
IP address of the return packet. However, for outbound traffic, the source IP address is
determined by several parameters.
By default (IPCONFIG NOSOURCEVIPA), z/OS Communications Server sets the IP address of the
interface that is used to send out a packet to a specific destination as the source IP address.
The sending interface is selected depending on the routing table of the TCP/IP stack.
When IPCONFIG SOURCEVIPA is set, outbound datagrams use a virtual IP address (VIPA)
as the source IP address of the packet instead of the physical interface IP address. Because
the VIPA is then also the destination IP address of the return packets from other
hosts, SOURCEVIPA provides tolerance of device and adapter failures.
The order of the HOME list is important if SOURCEVIPA is specified. The source IP address is the
first static VIPA that is found when reading upward in the HOME list from the interface that is
chosen for sending the packet. In Example B-9, if OSA20C0 2 is chosen as the actual physical
interface for sending the outbound packet, then the IP address of the first VIPA above it in the
HOME list, 10.1.2.10 (1), is the source IP address.
HOME
10.1.1.10 VIPA1L
10.1.2.10 VIPA2L 1
10.1.2.11 OSA2080I
10.1.3.11 OSA20C0I 2
....
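The HOME list rule can be modeled with a short Python sketch. This is an illustration of the selection rule as described here, not the stack's code. The flag that marks an entry as a static VIPA and the fallback to the interface's own address when no VIPA precedes it are assumptions made for the sketch.

```python
# (address, link name, is_static_vipa): the flag is an assumption derived
# from the link names in the example HOME list.
HOME = [
    ("10.1.1.10", "VIPA1L", True),
    ("10.1.2.10", "VIPA2L", True),
    ("10.1.2.11", "OSA2080I", False),
    ("10.1.3.11", "OSA20C0I", False),
]


def source_address(home, link_name):
    """Return the source IP address used when sending over link_name.

    Reads upward from the sending interface and returns the first static
    VIPA that is found. If none precedes the interface, the interface's own
    address is returned (assumed fallback).
    """
    index = next(i for i, (_, name, _) in enumerate(home) if name == link_name)
    for address, _, is_vipa in reversed(home[:index]):
        if is_vipa:
            return address
    return home[index][0]
```

For both OSA2080I and OSA20C0I, the sketch returns 10.1.2.10, because VIPA2L is the nearest VIPA above each of them.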
Note: The source IP address selection can be overridden with the SRCIP statement, as
described in “Source IP address” on page 501.
SOURCEVIPA has no effect on OSPF or RIP route information exchange packets that are
generated by the OMPROUTE routing daemon, which means that it is only applicable for
datagrams.
By default (IPCONFIG NOMULTIPATH), there is no multipath support and all connections use the
first active route to the destination network or host even if there are other, equal-cost routes
available.
PATHMTUDISCOVERY
Coding IPCONFIG PATHMTUDISCOVERY prevents the fragmentation of datagrams. It tells TCP/IP
to discover dynamically the path maximum transmission unit (PMTU), which is the smallest of the
MTU sizes of each hop in the path between two hosts.
When a connection is established, TCP/IP uses the minimum MTU of the sending host as the
starting segment size and sets the Don’t Fragment (DF) bit in the IP header. Any router along
the route that cannot forward a packet of that size returns an ICMP Destination Unreachable
message, which indicates that fragmentation is needed but the DF bit is set. The sending host
can then reduce the size of its assumed PMTU. You can find more information about PMTU discovery
in RFC 1191 - Path MTU Discovery.
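The discovery loop can be sketched as a small simulation. This is an illustrative model, not the stack's implementation. It assumes that each router that rejects a packet reports its own next-hop MTU in the ICMP message, and the MTU values are hypothetical.

```python
def discover_pmtu(sender_mtu, hop_mtus):
    """Simulate PMTU discovery with the DF bit set.

    Returns (pmtu, probes). Each probe is sent at the currently assumed
    PMTU. The first hop with a smaller MTU rejects it and reports its MTU,
    and the sender retries with that smaller size.
    """
    pmtu = sender_mtu
    probes = 0
    while True:
        probes += 1
        # First hop along the path that cannot forward a packet of this size.
        blocking = next((mtu for mtu in hop_mtus if mtu < pmtu), None)
        if blocking is None:
            return pmtu, probes
        pmtu = blocking
```

For a sender MTU of 8992 and hop MTUs of 8992, 1500, 1492, and 1500, the assumed PMTU is reduced to 1500 and then to 1492, which is the smallest MTU on the path.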
The default is IPCONFIG NOPATHMTUDISCOVERY. Aside from enabling PMTU during stack
initialization, you can also enable or disable PMTU discovery by using VARY OBEYFILE.
CHECKSUMOFFLOAD
When sending or receiving packets over OSA-Express in QDIO mode with checksum offload
support, TCP/IP offloads most IPv4 (outbound and inbound) checksum processing (IP
header, TCP, and UDP checksums) to the OSA. The TCP/IP stack still performs checksum
processing in the cases where checksum cannot be offloaded. With the OSA-Express4S
features, LPAR-LPAR and LAN checksum offload are supported for both IPv4 and IPv6 and
do not have to be performed by the stack.
SEGMENTATIONOFFLOAD
When sending packets over OSA-Express in QDIO mode with TCP segmentation offload
support, TCP/IP offloads most IPv4 outbound TCP segmentation processing to the OSA. The
TCP/IP stack still performs TCP segmentation processing in the cases where segmentation
cannot be offloaded. Segmentation offload is supported only for packets that go onto the
LAN. It is not supported, for example, for LPAR-LPAR traffic through a shared OSA.
Tip: Applications that use large TCP send buffers obtain the most benefit from TCP
segmentation offload. The size of the TCP receive buffer on the other side of the TCP
connection also affects the negotiated buffer size.
You can control the size of these buffers by using the TCPSENDBFRSIZE and TCPRCVBUFRSIZE
parameters on the TCPCONFIG statement to set the default TCP send/receive buffer size for
all applications. However, an application can use the SO_SNDBUF socket option to
override the default TCP send buffer sizes (for example, FTP).
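As a generic illustration of an application overriding the default send buffer size, the following sketch uses the standard Python socket module rather than a z/OS-specific API. The function name and the requested size are assumptions, and the size that the operating system reports afterward can differ from the size that was requested.

```python
import socket


def open_with_send_buffer(size):
    """Create a TCP socket and request a specific send buffer size."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_SNDBUF overrides the stack's default send buffer size for this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    return sock


def effective_send_buffer(sock):
    """Read back the send buffer size that the operating system applied."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
```

An application that streams large amounts of data might call open_with_send_buffer(256 * 1024) before it connects, and then check effective_send_buffer to see what was granted.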
Note: These offloads apply to QDIO mode for the OSD and OSX channel-path identifier
(CHPID) types. No offloads are supported for OSM (which is Layer 2).
IQDIOROUTING
When IPCONFIG IQDIOROUTING is configured, the inbound packets that are to be forwarded by
this TCP/IP stack use HiperSockets (also known as internal queued direct I/O (iQDIO)) and
queued direct I/O (QDIO) directly and bypass the TCP/IP stack. This type of routing is called
HiperSockets Accelerator because it allows you to concentrate external network traffic over a
single OSA-Express QDIO connection and then accelerates the routing over a HiperSockets
link, bypassing the TCP/IP stack. The default is NOIQDIOROUTING. For more information about
HiperSockets, see Chapter 4, “Connectivity” on page 139.
The UNIX shell command onetstat -R displays the current ARP cache entries. The
uppercase R in the option is required for this display. A third parameter can be coded that
specifies the IP address of the entry that you want to display, as the NETSTAT ARP ip_addr
command does from TSO. If you want to display the entire ARP cache, you can specify the
third parameter with the reserved word ALL (again, all in uppercase letters). If you do not
specify in uppercase letters, the reserved word is not recognized (see Example B-11).
TCPIPSTATISTICS
This statement prints the values of several TCP/IP counters to the output data set that is
designated by the CFGPRINT JCL statement. These counters include the number of
TCP retransmissions and the total number of TCP segments that are sent from the MVS
TCP/IP system. These TCP/IP statistics are written to the designated output data set only during
termination of the TCP/IP address space.
IQDMULTIWRITE | NOIQDMULTIWRITE
This statement allows the HiperSockets to move multiple buffers of data with a single write
operation. HiperSockets multiple write can reduce CPU use and increase throughput for
outbound streaming-type workloads, such as FTP transfers.
This parameter applies to all HiperSockets interfaces, including IUTIQDIO and IQDIOINTF6
interfaces that are created for Dynamic XCF.
NOIQDIOMULTIWRITE | IQDIOMULTIWRITE
This statement tells TCP/IP to displace the CPU cycles for HiperSockets multiple write
workload to a z Systems Integrated Information Processor (zIIP). Example B-12 shows the
output of the following z/OS command:
D TCPIP,TCPIPA,NETSTAT,CONFIG
IP Configuration Table:
Forwarding: Yes TimeToLive: 00064 RsmTimeOut: 00060
IpSecurity: No
ArpTimeout: 01200 MaxRsmSize: 65535 Format: Long
IgRedirect: No SysplxRout: Yes DoubleNop: No
StopClawEr: No SourceVipa: Yes
MultiPath: No PathMtuDsc: No DevRtryDur: 0000000090
DynamicXCF: No
QDIOAccel: No
IQDIORoute: No
TcpStackSrcVipa: No
ChecksumOffload: Yes SegOffload: No
SMF Parameters:
Type 118:
TcpInit: 00 TcpTerm: 00 FTPClient: 00
TN3270Client: 00 TcpIpStats: 00
Type 119:
TcpInit: No TcpTerm: No FTPClient: No
TcpIpStats: No IfStats: No PortStats: No
Stack: No UdpTerm: No TN3270Client: No
IPSecurity: No Profile: No DVIPA: No
SmcrGrpStats: No SmcrLnkEvent: No
Telnet, for example, is a server that binds to INADDR_ANY. Previously, a client that wanted to
access both Telnet servers, TN3270 and UNIX Telnet, connected to different ports or different
TCP/IP stacks, depending on which Telnet server it wanted to connect to. This led to cases
where either one server used a different, nonstandard port, or multiple TCP/IP stacks had to
be used. With this function, you do not need to have two separate ports or TCP/IP stacks. You
use the same port 23 for both TN3270 and UNIX Telnet. All that is needed is to code the BIND
keyword in the PORT statement for each server:
PORT
23 TCP TN3270A BIND 10.1.1.10
23 TCP OMVS BIND 10.1.1.20
In this case, the TN3270A is a job name for a TN3270 server. When it binds to port 23 and
INADDR_ANY, it is associated with IP address 10.1.1.10. The OMVS job name identifies any
UNIX server, including the UNIX Telnet server. When UNIX Telnet Server binds to port 23 and
INADDR_ANY, it is associated with IP address 10.1.1.20.
Both IP addresses can be dynamic VIPA addresses, static VIPA addresses, or real interface
addresses. You also can code a wildcard for the job name. This function works only for
servers that bind to INADDR_ANY, and it is not valid with the PORTRANGE statement.
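The effect of the BIND keyword can be modeled as a lookup that is keyed by protocol, port, and job name. The following Python sketch is a simplified illustration, not the stack's logic. The table layout and function name are assumptions, and wildcard job names are not modeled.

```python
INADDR_ANY = "0.0.0.0"

# Mirrors the two PORT entries in the example above.
PORT_BINDS = {
    ("TCP", 23, "TN3270A"): "10.1.1.10",
    ("TCP", 23, "OMVS"): "10.1.1.20",
}


def effective_bind(protocol, port, jobname, requested=INADDR_ANY):
    """Return the address that a server's bind() is associated with.

    Only a bind to INADDR_ANY is redirected. A bind to a specific address
    is left as the server requested it.
    """
    if requested != INADDR_ANY:
        return requested
    return PORT_BINDS.get((protocol, port, jobname), requested)
```

Both servers listen on port 23, but effective_bind("TCP", 23, "TN3270A") yields 10.1.1.10 and effective_bind("TCP", 23, "OMVS") yields 10.1.1.20.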
UDPCONFIG UNRESTRICTLOWPORTS
If you want the well-known ports to be used only by predefined application processes or
superuser-authorized application processes, then you can define the RESTRICTLOWPORTS
option on the TCPCONFIG and UDPCONFIG statements. This action prevents any non-authorized
socket application from acquiring a well-known port.
EPHEMERALPORTS
Typically, ephemeral ports are ports that TCP/IP assigns to a client when the client issues a
connect( ) socket call and the port number is not already known. In some cases, ephemeral
ports can also be used by servers. For example, FTP servers in passive mode use ephemeral
ports. Ephemeral ports are port numbers 1024 - 65,535. Security requirements can necessitate
configuring firewalls to limit the range of acceptable ports. The EPHEMERALPORTS parameter on
the TCPCONFIG and UDPCONFIG statements sets the low and high bounds of the range of
ephemeral ports that the stack assigns.
If EPHEMERALPORTS is not specified, the low value defaults to 1024 and the high value defaults
to 65,535. Separate definitions for TCPCONFIG and UDPCONFIG statements allow different ranges
per protocol. Example B-14 shows the assignment of a non-default range of the ephemeral
ports.
With the NETSTAT STATS/-S command in Example B-16, you can display four new TCP and
UDP statistics.
The following methods of assigning or reserving ports can have interactions and take
precedence over EPHEMERALPORTS:
PORT or PORTRANGE statement:
– Ports that are reserved by PORT and PORTRANGE statements are not available for use as
ephemeral ports.
– Ports that are reserved by PORT or PORTRANGE for a protocol that are also within the
defined EPHEMERALPORTS range for that protocol are skipped when assigning an
ephemeral port.
– PORT UNRSV does not affect ephemeral port assignment and applications can bind to
ports that are within the defined EPHEMERALPORTS range.
GLOBALCONFIG EXPLICITBINDPORTRANGE statement:
– The range that is defined by EXPLICITBINDPORTRANGE is not required to be within the
EPHEMERALPORTS range.
– If there is overlap between EXPLICITBINDPORTRANGE and EPHEMERALPORTS, ports within
that overlap are not available for general ephemeral port assignment.
SYSPLEXPORTS statement:
– If a bind() is made to a distributed DVIPA with SYSPLEXPORTS specified, a port is chosen
by the coupling facility from the EPHEMERALPORTS range.
– TCP/IP communicates its EPHEMERALPORTS range to the coupling facility to ensure that it
uses the correct range.
FTP passive data specification:
– If an FTP command uses PASSIVEDATAPORTS, the port is assigned from within the
PASSIVEDATAPORTS range, as defined in the FTP server’s configuration file.
– The PASSIVEDATAPORTS range must also be reserved on a PORTRANGE statement with the
AUTHPORT parameter:
• They are available for assignment only to the FTP server.
• This range of ports is not required to be within the defined EPHEMERALPORTS range.
• If these ranges do overlap, the AUTHPORT ports are not available for general
ephemeral port assignment.
BPXPRMxx INADDRANYPORT and INADDRANYCOUNT statements:
– Ports that are defined by BPXPRMxx INADDRANYPORT and INADDRANYCOUNT must be
restricted by the PORT or PORTRANGE statement to the job name OMVS:
• These ports are not assigned by the stack unless the user has a job name of
OMVS.
• This range is not required to be within the defined EPHEMERALPORTS range.
• If there is overlap with the defined EPHEMERALPORTS range, the ports that are reserved
for job name OMVS are not available for ephemeral port assignment.
All these methods reduce the number of ephemeral ports that are available for general
assignment by the stack. The “Configured ephemeral ports” values shown on NETSTAT STATS
reflect the actual number of ports available for assignment by the stack. Ensure that you have
sufficient EPHEMERALPORTS range available after accounting for the interactions that are
described. Use Ephemeral Ports Max Usage and Ephemeral Ports Exhausted statistics to
monitor whether the chosen port range is sufficient.
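To estimate how many ports remain after these interactions, you can subtract every reserved range from the configured EPHEMERALPORTS range. The following Python sketch shows the arithmetic only. The function name and the sample ranges are hypothetical, and the authoritative count is the one that NETSTAT STATS reports.

```python
def available_ephemeral_ports(low, high, reserved_ranges):
    """Count ports in low..high (inclusive) not covered by a reserved range.

    reserved_ranges is a list of (first, last) inclusive pairs, for example
    from PORT, PORTRANGE, or EXPLICITBINDPORTRANGE definitions. Ranges can
    overlap each other or extend outside the ephemeral range.
    """
    candidates = set(range(low, high + 1))
    for first, last in reserved_ranges:
        candidates -= set(range(first, last + 1))
    return len(candidates)
```

For example, an ephemeral range of 50000 - 50099 (100 ports) with reserved ranges 50010 - 50019 and 49990 - 50004 leaves 85 ports: 10 are removed by the first range and 5 by the part of the second range that overlaps.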
Network Management
The TCPCONFIG and UDPCONFIG EPHEMERALPORTS low and high values can be retrieved by the
Network Management Interface (NMI), System Management Facilities (SMF) records, and
Simple Network Management Protocol (SNMP).
The GetProfile NMI supports the low and high values for both TCP and UDP.
SMF record 119, subtype 4 supports the low and high values for both
TCP and UDP.
SMF record 119, subtype 5 supports the configured ephemeral ports,
in use, maximum usage, and exhausted values for both TCP and UDP.
PORT
The PORT reservations that are defined in the PROFILE data set are the ports that are used by
specific applications. You control access to particular ports by port number by reserving the
port by using the PORT or PORTRANGE profile statements. You can also use the optional SAF
parameter to provide additional access control.
You then must explicitly define PORT statements to reserve each port or define the process
with superuser authority in RACF. The reserved ports indicate that the port is not available for
use by any user. However, the unreserved port numbers 1024 - 65535 are available for use by
any application that issues an explicit bind to a specific unreserved port. These port numbers
are also used by the stack to provide stack-selected ephemeral ports.
Note: This control (UNRSV) does not affect the use of ports that are selected by the stack
either as a local ephemeral port or as a sysplex-wide port for a distributed DVIPA.
You reserve the ports by using PORT and PORTRANGE statements with a job name of OMVS, the
forked process job name, or a wildcard job name, such as asterisk (*) for UNIX applications.
The job can use fork() to another address space with a different name (for example, InetD or
an FTP server). Example B-17 shows the access control to the ports.
Normally, you can specify either OMVS or the job name in the PORT statement. However,
certain daemons have special considerations in this matter.
When the FTP server starts, it forks the listener process to run in the background, requiring
that the name of the forked address space (FTPDA1, in this example), not the original
procedure name, be used on the PORT statement of the control connection (2). You must
specify OMVS as the name on the PORT for FTP’s PORT 20 (1), which is used for the data
connection that is managed by the child process. If you specify the forked name on the data
connection (Port 20), the data connections fail.
You can also reserve UDP port 514 (3) to OMVS. This port is used by the SyslogD server in
OMVS to receive log messages from other SyslogD servers in the TCP/IP network. The
PORTRANGE statements (4) reserve a range of ephemeral TCP and UDP ports for UNIX System
Services. The PORTRANGE statement (5) reserves TCP ports 5000 - 5002 for any job name
starting USER1 that uses the wildcard feature. The PORT UNRSV statement (6) denies UDP
explicit bind access to application-specified unreserved ports by any job.
NETWORK DOMAINNAME(AF_INET6)
DOMAINNUMBER(19)
MAXSOCKETS(10000)
TYPE(INET)
To display the PORT reservation list, use the NETSTAT PORTL TSO/E command, the
D TCPIP,procname,NETSTAT PORTL MVS command, or the onetstat -p procname -o UNIX
shell command. Example B-19 shows the MVS command.
TCPCONFIG TCPSENDBFRSIZE
TCPCONFIG TCPSENDBFRSIZE specifies the TCP send buffer size. This value is used as the
default send buffer size for those applications that do not explicitly set the buffer size by using
SETSOCKOPT(). The default is 64K.
TCPCONFIG TCPRCVBUFRSIZE
TCPCONFIG TCPRCVBUFRSIZE specifies the TCP receive buffer size. This value is used as the
default receive buffer size for those applications that do not explicitly set the buffer size by
using SETSOCKOPT(). You can specify a value in the range 256 to TCPMAXRCVBUFRSIZE. The default is 64K.
Note: The FTP server and client applications override the default settings and use 64 KB
as the TCP window size and 180 KB for send/recv buffers. No changes are required in the
TCPCONFIG statement for the FTP server and client.
TCPCONFIG FINWAIT2TIME
The TCPCONFIG FINWAIT2TIME parameter allows you to specify the number of seconds a TCP
connection should remain in the FINWAIT2 state. When this time limit is reached, the system
waits a further 75 seconds before dropping the connection. The default is 600 seconds, but
you can specify a value as low as 60 seconds, which reduces the time that a connection
remains in the FINWAIT2 state and frees up resources for future connections.
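The total time before a connection is dropped is therefore the configured value plus the further 75 seconds. The following Python sketch shows this arithmetic. The function and constant names are assumptions, and only the lower limit that is stated above is checked.

```python
FINWAIT2_EXTRA_SECONDS = 75  # further wait after the FINWAIT2TIME limit is reached


def seconds_until_drop(finwait2time=600):
    """Total seconds a connection can stay in FINWAIT2 before it is dropped."""
    if finwait2time < 60:
        raise ValueError("FINWAIT2TIME cannot be lower than 60 seconds")
    return finwait2time + FINWAIT2_EXTRA_SECONDS
```

With the default of 600 seconds, the connection is dropped after 675 seconds. With the minimum of 60 seconds, it is dropped after 135 seconds.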
TCPCONFIG TCPTIMESTAMP
The TCP time stamp option is exchanged during connection setup. This option is enabled (by
default) by using the TCPCONFIG TCPTIMESTAMP parameter. Enabling the TCP time stamp
allows TCP/IP to better estimate the round-trip time (RTT), which helps avoid unnecessary
retransmissions and helps protect against the wrapping of sequence numbers.
B.3.4 DYNAMICXCF
You have the option of either defining the DEVICE, LINK, HOME, and START statements for MPC
XCF connections to another z/OS, or letting TCP/IP dynamically define them for you.
Dynamic XCF devices and links, when activated, appear to the stack as though they are
defined in the TCP/IP profile. They can be displayed by using standard commands, and they
can be stopped and started. For multiple-stack environments, IUTSAMEH links are
dynamically created for same-LPAR links. For more information, see IBM z/OS V2R2
Communications Server TCP/IP Implementation Volume 3: High Availability, Scalability, and
Performance, SG24-8362.
The SMFPARMS statement can also be used to turn on SMF logging. However, you are
encouraged to migrate to SMFCONFIG, which has the following advantages over the SMFPARMS
statement:
Using SMFCONFIG means that SMF records are written by using standard subtypes. With
SMFPARMS, you must specify the subtypes to be used.
You can use SMFCONFIG to record both type 118 and type 119 records. With SMFPARMS, only
type 118 records can be collected.
You can use SMFCONFIG to record a wider variety of information.
By using SMFCONFIG, you gain support for dynamic reconfiguration for all environments
under which Communications Server for z/OS IP is running (SRB mode, reentrant, XMEM
mode, and so on), and you can avoid duplicate SMF exit processes.
In the following example, type 118 FTP client records, type 119 TN3270 client records, and
type 119 IPSEC records are collected:
SMFCONFIG TYPE118 FTPCLIENT
TYPE119 TN3270CLIENT IPSECURITY
The preceding example can also be coded as follows because type 118 records are collected
by default:
SMFCONFIG FTPCLIENT
TYPE119 TN3270CLIENT IPSECURITY
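The equivalence of the two forms can be checked with a small Python sketch. It is a simplified reading of the two examples above, not a parser for the full SMFCONFIG syntax: the parse_smfconfig helper is a hypothetical name, and the sketch assumes that every option belongs to the most recent TYPE118 or TYPE119 keyword, with type 118 assumed when no keyword precedes it.

```python
def parse_smfconfig(statement: str) -> dict:
    """Group SMFCONFIG options by record type (simplified illustration)."""
    tokens = statement.split()
    if not tokens or tokens[0].upper() != "SMFCONFIG":
        raise ValueError("not an SMFCONFIG statement")
    current = 118  # type 118 is the default, as stated in the text
    result = {118: [], 119: []}
    for token in tokens[1:]:
        keyword = token.upper()
        if keyword == "TYPE118":
            current = 118
        elif keyword == "TYPE119":
            current = 119
        else:
            result[current].append(keyword)
    return result


EXPLICIT = "SMFCONFIG TYPE118 FTPCLIENT TYPE119 TN3270CLIENT IPSECURITY"
IMPLICIT = "SMFCONFIG FTPCLIENT TYPE119 TN3270CLIENT IPSECURITY"
```

Both strings produce the same grouping: FTPCLIENT under type 118, and TN3270CLIENT and IPSECURITY under type 119.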
SMFCONFIG is coded in the PROFILE.TCPIP, but it has related entries in both Telnet and in FTP.
(See IBM z/OS V2R2 Communications Server TCP/IP Implementation Volume 2: Standard
Applications, SG24-8361.)
In our example, we use the NETSTAT,CONFIG command to check whether the SMFCONFIG setup
is correct. Example B-21 shows the output.
For more information about TCP/IP SMF record layouts and standardized subtype numbers,
see z/OS Communications Server: IP Configuration Reference, SC27-3651.
B.3.7 NETMONITOR
Use the NETMONITOR statement to activate or deactivate selected real-time TCP/IP network
management interfaces (NMIs).
If you want your application to process only SMF 119 records by using these real-time SMF
services, you must configure only the NETMONITOR profile statement. You do not need to
request support for these SMF 119 records on the SMFCONFIG profile statement.
The SMFSERVICE parameter can be used to configure the real-time Internet Protocol network
monitoring NMI to support the new SMF 119 event records, subtypes 32 - 37, which provide
sysplex event information. To do so, specify the DVIPA subparameter, as shown in Example B-22.
Example B-23 Use the NETSTAT, CONFIG command to verify the network monitor statements
D TCPIP,TCPIPA,N,CONFIG
NETWORK MONITOR CONFIGURATION INFORMATION:
PKTTRCSRV: NO TCPCNNSRV: NO NTASRV: NO
SMFSRV: YES
IPSECURITY: YES PROFILE: YES CSSMTP: YES CSMAIL: NO DVIPA: YES
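When the display is captured for automated checking, the YES and NO settings can be extracted with a short script. The following Python sketch works on the output lines that are shown above. The parse_monitor_settings helper and the SAMPLE constant are hypothetical names, and the sketch assumes that every setting appears as a keyword, a colon, and YES or NO.

```python
import re

# Display lines copied from the example above.
SAMPLE = """\
NETWORK MONITOR CONFIGURATION INFORMATION:
PKTTRCSRV: NO TCPCNNSRV: NO NTASRV: NO
SMFSRV: YES
IPSECURITY: YES PROFILE: YES CSSMTP: YES CSMAIL: NO DVIPA: YES
"""

# A keyword, a colon, blanks, and then YES or NO as a whole word.
PAIR = re.compile(r"([A-Z0-9]+):[ \t]+(YES|NO)\b")


def parse_monitor_settings(text: str) -> dict:
    """Map each setting name to True (YES) or False (NO)."""
    return {key: value == "YES" for key, value in PAIR.findall(text)}
```

A check such as parse_monitor_settings(SAMPLE)["DVIPA"] then confirms that the DVIPA subparameter is active.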
For more information about NETMONITOR usage, see z/OS Communications Server: IP
Programmer’s Guide and Reference, SC31-8787.
INTERFACE IUTIQDF4L 9
DEFINE IPAQIDIO
CHPID F4 10
IPADDR 10.1.4.31/24
SOURCEVIPAINTerface VIPA1L
READSTORAGE GLOBAL 11
SECCLASS 255 12
NOMONSYSPLEX 13
;
In Example B-24 on page 493, the numbers correspond to the following information:
1 The INTERFACE statement replaces the DEVICE and LINK statements. The INTERFACE
statement label must be unique.
2 The INTERFACE statement can be used for all IPv4 and IPv6 devices.
3 The PORTNAME operand as defined in TRL node. For multiple VLAN configurations, the
same PORTNAME can be defined several times.
4 The IPADDR operand replaces the HOME statement. The optional subnetmask definition
replaces a similar definition that is coded in BEGINROUTES.
5 The optional MTU operand replaces a similar definition that is coded in BSDROUTINGPARMS.
6 The optional VLANID operand is required when defining multiple VLANs.
7 The optional VMAC operand, with or without set values, is required when defining VLANs.
8 SOURCEVIPAINT defines the VIPA that is associated with this INTERFACE.
9 The INTERFACE statement for HiperSockets.
10 IQD CHPID for HiperSockets interface.
11 GLOBAL is the default value for READSTORAGE, which specifies a fixed amount of storage that
should be kept available for read processing.
12 The parameter that is used to associate a security class for IP filtering with this interface.
13 NOMONSYSPLEX is the default value, which specifies that sysplex autonomics does not monitor
the link status.
Note: If SOURCEVIPAINT is coded, the whole INTERFACE definition block must be defined in
PROFILE after the VIPA DEVICE and LINK statements are defined.
Note: The netstat dev (-d) command always returns the resources that are defined with
the DEVICE/LINK statements first and the resources that are defined with the INTERFACE
statement later.
In Example B-26, the number 1 is the parameter to code to delete an INTERFACE. Note the
syntax differences from DEVICE/LINK deletion coding.
D TCPIP,TCPIPA,N,DE,INTFN=OSA20A0I 2
INTFNAME: OSA20A0I INTFTYPE: IPAQENET INTFSTATUS: NOT ACTIVE 2
PORTNAME: OSA20A0 DATAPATH: UNKNOWN DATAPATHSTATUS: NOT ACTIVE 2
.................................................................. Lines deleted
IPV4 LAN GROUP SUMMARY
LANGROUP: 00002
NAME STATUS ARPOWNER VIPAOWNER
---- ------ -------- ---------
OSA2081L ACTIVE OSA2081L YES
OSA20A0I 2 NOT ACTIVE OSA2081L NO
1 OF 1 RECORDS DISPLAYED
END OF THE REPORT
V TCPIP,TCPIPA,O,TCPIPA.TCPPARMS(OBDELINT) 3
EZZ0060I PROCESSING COMMAND: VARY TCPIP,TCPIPA,O,TCPIPA.TCPPARMS(OBDELINT)
EZZ0300I OPENED OBEYFILE FILE 'TCPIPA.TCPPARMS(OBDELINT)'
EZZ0309I PROFILE PROCESSING BEGINNING FOR 'TCPIPA.TCPPARMS(OBDELINT)'
D TCPIP,TCPIPA,N,DE,INTFN=OSA20A0I 4
0 OF 0 RECORDS DISPLAYED 4
END OF THE REPORT
In Example B-27 on page 496, the numbers correspond to the following information:
1 Stop the interface.
2 Check that the interface is not active.
3 Enter the OBEYFILE command.
4 Check that the interface is deleted.
Example B-28 shows the TRL nodes definition in VTAMLST for OSA-Express 3.
Note: The two TRLE resources that are associated with the two ports that are defined on
the same CHPID can be defined on either the same or different TRL major nodes.
D NET,TRL,TRLE=OSA20A1P
IST097I DISPLAY ACCEPTED
IST075I NAME = OSA20A1P, TYPE = TRLE
IST1954I TRL MAJOR NODE = OSA20A1
IST486I STATUS= ACTIV, DESIRED STATE= ACTIV
IST087I TYPE = LEASED , CONTROL = MPC , HPDT = YES
IST1715I MPCLEVEL = QDIO MPCUSAGE = SHARE
IST2263I PORTNAME = OSA20A1 PORTNUM = 1 1 OSA CODE LEVEL = 000C
IST1577I HEADER SIZE = 4096 DATA SIZE = 0 STORAGE = ***NA***
IST1221I WRITE DEV = 20A9 STATUS = ACTIVE 2 STATE = ONLINE
IST1577I HEADER SIZE = 4092 DATA SIZE = 0 STORAGE = ***NA***
IST1221I READ DEV = 20A8 STATUS = ACTIVE 2 STATE = ONLINE
IST1221I DATA DEV = 20AA STATUS = ACTIVE 2 STATE = N/A
IST1724I I/O TRACE = OFF TRACE LENGTH = *NA*
IST1724I I/O TRACE = OFF TRACE LENGTH = *NA*
IST1717I ULPID = TCPIPA
.................................................................. Lines removed
IST1221I DATA DEV = 20AB STATUS = ACTIVE 2 STATE = N/A
.................................................................. Lines removed
IST1221I DATA DEV = 20AC STATUS = ACTIVE 2 STATE = N/A
.................................................................. Lines removed
IST1221I DATA DEV = 20AD STATUS = RESET 3 STATE = N/A
.................................................................. Lines removed
IST314I END
Example B-30 shows the OSA-Express 3 devices online and allocated by NET.
Note: All OSA-Express 3 devices of both port 0 and port 1 are defined under the same
CHPID. Additional device addresses can be defined through HCD if required (see
OSA-Express Customer’s Guide and Reference, SA22-7935).
VLAN ID
Support is provided for the IEEE 802.1Q virtual local area network (VLAN) standard.
Implementing VLANs allows a physical LAN to be logically subdivided into separate logical
LANs. With VLANID specified, the TCP/IP stacks that share an OSA can each have an IP address
that is assigned from a separate IP subnet.
The VLAN ID is configured and implemented in the z/OS environment through the LINK
definitions in the PROFILE.TCPIP for OSA-Express in QDIO mode. VLANs support ARP
takeover in a flat network (no routing protocol) when connected appropriately. For more
information about this implementation, see Chapter 4, “Connectivity” on page 139.
Example B-32 shows a link definition example of OSA2080I that is attached to virtual LAN
(VLAN) 10.
Example B-33 shows the NETSTAT DEVLINKS display of an OSA-Express that has VLAN ID
enabled.
In this example, the VLAN tagging (1) is enabled on this device (VLAN 10).
Alternatively, you can designate the source IP addresses to be used for outbound TCP
connections that are initiated by specified jobs or destined for specified IP addresses,
networks, or subnets, by using the SRCIP statement, as described here:
You can set job-specific source IP addressing by using the JOBNAME option in the SRCIP
statement.
You can set destination-specific source IP addressing by using the DESTINATION option in
the SRCIP statement.
These source IP address definitions override any other source IP address specification in the
TCP/IP profile. However, the use of SRCIP can also be overridden directly by an application
through the use of specific socket API options.
You can relieve this restriction by adding the option EXPLICITBINDPORTRANGE. Unlike the
sysplexport pools, where each pool is associated with a specific distributed DVIPA, the port
range that is specified by EXPLICITBINDPORTRANGE is not associated with any specific
distributed DVIPA, and can be used for any distributed DVIPA.
The EZBEPORTvvtt structure in the coupling facility, where vv is the two-character VTAM group
ID suffix that is specified on the XCFGRPID start option and tt is the TCP group ID suffix that is
specified on the GLOBALCONFIG statement in the TCP/IP profile, coordinates this port range
among all members of the sysplex. The port range should be identical in all members of the
sysplex.
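The naming rule for the structure can be expressed as a one-line composition. The following Python sketch assumes that both suffixes are specified and that each is two characters long, as the vv and tt placeholders suggest. The ezbeport_structure_name helper and the sample suffix values are hypothetical.

```python
def ezbeport_structure_name(vv: str, tt: str) -> str:
    """Build the EZBEPORTvvtt name from the VTAM and TCP group ID suffixes."""
    for label, suffix in (("vv", vv), ("tt", tt)):
        if len(suffix) != 2:
            raise ValueError(f"{label} is assumed to be two characters long")
    return f"EZBEPORT{vv}{tt}"
```

For example, the hypothetical suffixes 11 and 21 give the name EZBEPORT1121.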
SRCIP
JOBNAME * 10.1.1.10 CLIENT 1
JOBNAME CUST* 10.1.2.10 SERVER 1
DESTINATION 10.1.2.240 10.1.1.10 2
DESTINATION 10.1.2.0/24 10.1.2.10 2
DESTINATION 10.1.100.0/24 10.1.8.10 3
ENDSRCIP
We use the NETSTAT,SRCIP command to verify our configuration, as shown in Example B-35.
To verify that the destination-specific source IP address feature functions correctly, we issue the
TSO telnet command with an IP address that is configured in an L3 Switch. Example B-36
shows the results of the show tcp brief command that is issued for the L3 Switch.
The example shows a separate source IP address that is used for each specific destination IP
address.
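The selection of a source address by destination can be sketched in Python with the standard ipaddress module. The rules are the three DESTINATION entries from the SRCIP block above. The sketch assumes that the most specific matching entry wins, which is an assumption for illustration and should be confirmed in the IP Configuration Reference. The source_for helper is a hypothetical name.

```python
import ipaddress

# DESTINATION entries from the SRCIP block: (destination, source address).
DESTINATION_RULES = [
    ("10.1.2.240", "10.1.1.10"),
    ("10.1.2.0/24", "10.1.2.10"),
    ("10.1.100.0/24", "10.1.8.10"),
]


def source_for(destination: str):
    """Return the source address of the most specific matching rule, or None."""
    dest = ipaddress.ip_address(destination)
    best = None  # (prefix length, source address)
    for target, source in DESTINATION_RULES:
        network = ipaddress.ip_network(target)  # a bare address becomes a /32
        if dest in network and (best is None or network.prefixlen > best[0]):
            best = (network.prefixlen, source)
    return best[1] if best else None
```

Under this assumption, a connection to 10.1.2.240 uses 10.1.1.10, and a connection to any other host in 10.1.2.0/24 uses 10.1.2.10.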
Both the IMS sockets and CICS sockets support provide a user exit that you can use to
validate each IMS or CICS transaction that is received by the listener function. How you code
this exit, and what data you require to be present in the transaction initiation request, is your
decision.
[Figure: test environment. z/OS LPARs A12, A13, A14, and A15 share HiperSockets CHPID F4
(devices E800-E81F, IPADDR 10.1.4.x), CHPID F5 (devices E900-E91F, IPADDR 10.1.5.x), CHPID F6
(devices EA00-EA1F, IPADDR 10.1.6.x), and CHPID F7 (devices EB00-EB1F, DYNAMICXCF, IPADDR
10.1.7.x). Coupling facilities CF38 (CF LPAR A2E) and CF39 (CF LPAR A2F). Networks 10.1.x.x and
192.168.x.x through Ethernet Switch 1 (10.1.x.240) and Ethernet Switch 2 (10.1.x.220). Windows XP
workstation with PCOM.]
We wrote our books (and ran our implementation scenarios) by using four logical partitions
(LPARs) on an IBM z13 (referred to as LPARs A12, A13, A14, and A15). We implemented and
started one TCP/IP stack on each LPAR. Each LPAR shared the following resources:
HiperSockets inter-server connectivity
Coupling Facility connectivity (CF38 and CF39) for Parallel Sysplex scenarios
Eight OSA-Express 1000BASE-T Ethernet ports that are connected to a switch
Finally, we shared four Windows workstations, representing corporate network access to the
z/OS networking environment. The workstations are connected to the switch. For verifying our
scenarios, we used applications such as TN3270 and FTP.
The IP addressing scheme that we used allowed us to build multiple subnetworks so that we
did not impede ongoing activities from other team members.
[Figure: TCP/IP stacks TCPIPE, TCPIPA, TCPIPB, TCPIPC, and TCPIPD, connected through XCF
(10.1.7.x1) and through SWITCH 1 with VLAN 10 (10.1.2.240), VLAN 11 (10.1.3.240), VLAN 12
(192.168.2.240), and a router (10.1.100.240, VLAN 30).]
You can use the z/OSMF Configuration Assistant to generate configuration files and policies
for the following technologies:
Application Transparent TLS (AT-TLS)
Defense Manager Daemon (DMD)
IP Security (IPSec)
Network Security Services (NSS)
Policy-based routing (PBR)
Quality of service (QoS)
Intrusion detection system (IDS)
TCP/IP profile
Configuration files can be created for any number of z/OS images, with any number of TCP/IP
stacks per image.
This section shows how to configure a simple TCP/IP stack by using the information that is
shown in Table D-1 (which is based on the A12 LPAR (SC30) configuration in Figure D-3 on
page 522).
Interface name  CHPID  VLAN ID  IP address
OSA2080I        02     10       10.1.2.11/24
OSA20A0I        03     10       10.1.2.12/24
OSA20C0I        04     11       10.1.3.11/24
OSA20E0I        05     11       10.1.3.12/24
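Before the values are entered, the interface list can be checked for consistency. The following Python sketch groups the IP subnets by VLAN ID and reports any VLAN that carries more than one subnet. It assumes that the four columns are interface name, CHPID, VLAN ID, and IP address. The one-subnet-per-VLAN check is a common convention that is used here for illustration, not a requirement that this book states. All helper names are hypothetical.

```python
import ipaddress
from collections import defaultdict

# Rows copied from the table above; the column meanings are assumed to be
# interface name, CHPID, VLAN ID, and IP address with prefix length.
ROWS = """\
OSA2080I 02 10 10.1.2.11/24
OSA20A0I 03 10 10.1.2.12/24
OSA20C0I 04 11 10.1.3.11/24
OSA20E0I 05 11 10.1.3.12/24
"""


def subnets_by_vlan(rows: str) -> dict:
    """Group the IP subnets of the listed interfaces by VLAN ID."""
    result = defaultdict(set)
    for line in rows.splitlines():
        name, chpid, vlan, address = line.split()
        result[int(vlan)].add(ipaddress.ip_interface(address).network)
    return dict(result)


def vlans_with_mixed_subnets(rows: str) -> list:
    """Return the VLAN IDs whose interfaces are spread over several subnets."""
    return sorted(v for v, nets in subnets_by_vlan(rows).items() if len(nets) > 1)
```

For the rows above, vlans_with_mixed_subnets(ROWS) returns an empty list because each VLAN maps to a single /24 subnet.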
You are prompted to proceed to the next step of adding a TCP/IP stack to the z/OS system
image (see Figure D-6 on page 525).
4. Click Proceed. In the New TCP/IP Stack Information window, type the TCP/IP stack name
and then click OK. We use the TCPIPA for our stack on SC30, as shown in Figure D-7.
3. Enter the IP address of the VIPA in the connectivity column, as shown in Figure D-10.
4. The z/OSMF Configuration Assistant can define multiple interfaces. We define OSA2080I
with CHPID 02, VLAN ID 10, and IP address 10.1.2.11/24, as shown in Figure D-11 on
page 527.
5. To define HiperSockets interfaces in this configuration, you must enter a name, type, and
IP address of this device:
a. Each type has a different set of parameters that you can define. HiperSockets
interfaces are defined as shown in Figure D-12.
6. In the Current Backing Store window (see Figure D-14), the Systems tab shows that the
Status of the TCP/IP Configuration file is now Complete. Click Action and select Install
Configuration File.
7. Select the current list configuration file and Select Action → Show Configuration File to
see the configuration TCP/IP profile details that are created by the z/OSMF Configuration
Assistant, as shown in Example D-1.
9. When the save is complete, click Close. You can optionally enter a comment for the
configuration file’s history log. Click OK → Close.
D.3.2 Starting a TCP/IP stack by using the TCP/IP profile from z/OSMF
In our example, we start TCPIPA in LPAR A12 and receive the messages that are shown in
Example D-2.
This message indicates the successful establishment of a connection to the UNIX System
Services Environment. The numbers have the following meaning:
1. Shows how the stack is bound to UNIX System Services.
2. The EZB6473I and EZAIN11I messages show that the TCP/IP stack initialization is
complete and the services are available.
3. Our environment is defined within a sysplex, so message EZD1176I indicates the
connectivity to the TCP/IP sysplex group EZBTCPCS.
If messages are displayed by the TCP/IP address space, they should describe errors or why
the TCP/IP stack did not start.
Example D-3 shows the output from the NETSTAT HOME command.
Example D-4 shows the output from the NETSTAT DEVLINKS command.
If the status for the interface is not READY, verify that the VTAM Major node is active. You can
do this by using the VTAM command D NET,TRL.
Also, make sure that the VLAN ID that is defined to the interface matches the one for the port
in the Ethernet switch.
The publications that are listed in this section are considered suitable for a more detailed
discussion of the topics that are covered in this book.
You can search for, view, or download Redbooks, IBM Redpapers™, Technotes, draft
publications and Additional materials, and order hardcopy Redbooks publications, at this
website:
ibm.com/redbooks
Other publications
The following publications are also relevant as further information sources:
IBM Health Checker for z/OS: User’s Guide, SA22-7994
OSA-Express Customer’s Guide and Reference, SA22-7935
z/OS Communications Server: CSM Guide, SC31-8808
z/OS Communications Server: IP Configuration Guide, SC27-3650
z/OS Communications Server: IP Configuration Reference, SC27-3651
z/OS Communications Server: IP Diagnosis Guide, GC31-8782
z/OS Communications Server: IP Messages Volume 1 (EZA), SC31-8783
Online resources
The following websites are also relevant as further information sources:
Mainframe networking
http://www.ibm.com/servers/eserver/zseries/networking/
z/OS Communications Server product overview
http://www.ibm.com/software/network/commserver/zos/
z/OS Communications Server product support
http://www.ibm.com/software/network/commserver/zos/support/
SG24-8360-00
ISBN 0738442097
Printed in U.S.A.