123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197 |
- Date : 2004-Nov-26
- Author: Gerald Schaefer (geraldsc@de.ibm.com)
- Linux API for read access to z/VM Monitor Records
- =================================================
- Description
- ===========
- This item delivers a new Linux API in the form of a misc char device that is
- usable from user space and allows read access to the z/VM Monitor Records
- collected by the *MONITOR System Service of z/VM.
- User Requirements
- =================
- The z/VM guest on which you want to access this API needs to be configured in
- order to allow IUCV connections to the *MONITOR service, i.e. it needs the
- IUCV *MONITOR statement in its user entry. If the monitor DCSS to be used is
- restricted (likely), you also need the NAMESAVE <DCSS NAME> statement.
- This item will use the IUCV device driver to access the z/VM services, so you
- need a kernel with IUCV support. You also need z/VM version 4.4 or 5.1.
- There are two options for being able to load the monitor DCSS (examples assume
- that the monitor DCSS begins at 144 MB and ends at 152 MB). You can query the
- location of the monitor DCSS with the Class E privileged CP command Q NSS MAP
- (the values BEGPAG and ENDPAG are given in units of 4K pages).
- See also "CP Command and Utility Reference" (SC24-6081-00) for more information
- on the DEF STOR and Q NSS MAP commands, as well as "Saved Segments Planning
- and Administration" (SC24-6116-00) for more information on DCSSes.
- 1st option:
- -----------
- You can use the CP command DEF STOR CONFIG to define a "memory hole" in your
- guest virtual storage around the address range of the DCSS.
- Example: DEF STOR CONFIG 0.140M 200M.200M
- This defines two blocks of storage, the first is 140MB in size an begins at
- address 0MB, the second is 200MB in size and begins at address 200MB,
- resulting in a total storage of 340MB. Note that the first block should
- always start at 0 and be at least 64MB in size.
- 2nd option:
- -----------
- Your guest virtual storage has to end below the starting address of the DCSS
- and you have to specify the "mem=" kernel parameter in your parmfile with a
- value greater than the ending address of the DCSS.
- Example: DEF STOR 140M
- This defines 140MB storage size for your guest, the parameter "mem=160M" is
- added to the parmfile.
- User Interface
- ==============
- The char device is implemented as a kernel module named "monreader",
- which can be loaded via the modprobe command, or it can be compiled into the
- kernel instead. There is one optional module (or kernel) parameter, "mondcss",
- to specify the name of the monitor DCSS. If the module is compiled into the
- kernel, the kernel parameter "monreader.mondcss=<DCSS NAME>" can be specified
- in the parmfile.
- The default name for the DCSS is "MONDCSS" if none is specified. In case that
- there are other users already connected to the *MONITOR service (e.g.
- Performance Toolkit), the monitor DCSS is already defined and you have to use
- the same DCSS. The CP command Q MONITOR (Class E privileged) shows the name
- of the monitor DCSS, if already defined, and the users connected to the
- *MONITOR service.
- Refer to the "z/VM Performance" book (SC24-6109-00) on how to create a monitor
- DCSS if your z/VM doesn't have one already, you need Class E privileges to
- define and save a DCSS.
- Example:
- --------
- modprobe monreader mondcss=MYDCSS
- This loads the module and sets the DCSS name to "MYDCSS".
- NOTE:
- -----
- This API provides no interface to control the *MONITOR service, e.g. specify
- which data should be collected. This can be done by the CP command MONITOR
- (Class E privileged), see "CP Command and Utility Reference".
- Device nodes with udev:
- -----------------------
- After loading the module, a char device will be created along with the device
- node /<udev directory>/monreader.
- Device nodes without udev:
- --------------------------
- If your distribution does not support udev, a device node will not be created
- automatically and you have to create it manually after loading the module.
- Therefore you need to know the major and minor numbers of the device. These
- numbers can be found in /sys/class/misc/monreader/dev.
- Typing cat /sys/class/misc/monreader/dev will give an output of the form
- <major>:<minor>. The device node can be created via the mknod command, enter
- mknod <name> c <major> <minor>, where <name> is the name of the device node
- to be created.
- Example:
- --------
- # modprobe monreader
- # cat /sys/class/misc/monreader/dev
- 10:63
- # mknod /dev/monreader c 10 63
- This loads the module with the default monitor DCSS (MONDCSS) and creates a
- device node.
- File operations:
- ----------------
- The following file operations are supported: open, release, read, poll.
- There are two alternative methods for reading: either non-blocking read in
- conjunction with polling, or blocking read without polling. IOCTLs are not
- supported.
- Read:
- -----
- Reading from the device provides a 12 Byte monitor control element (MCE),
- followed by a set of one or more contiguous monitor records (similar to the
- output of the CMS utility MONWRITE without the 4K control blocks). The MCE
- contains information on the type of the following record set (sample/event
- data), the monitor domains contained within it and the start and end address
- of the record set in the monitor DCSS. The start and end address can be used
- to determine the size of the record set, the end address is the address of the
- last byte of data. The start address is needed to handle "end-of-frame" records
- correctly (domain 1, record 13), i.e. it can be used to determine the record
- start offset relative to a 4K page (frame) boundary.
- See "Appendix A: *MONITOR" in the "z/VM Performance" document for a description
- of the monitor control element layout. The layout of the monitor records can
- be found here (z/VM 5.1): http://www.vm.ibm.com/pubs/mon510/index.html
- The layout of the data stream provided by the monreader device is as follows:
- ...
- <0 byte read>
- <first MCE> \
- <first set of records> |
- ... |- data set
- <last MCE> |
- <last set of records> /
- <0 byte read>
- ...
- There may be more than one combination of MCE and corresponding record set
- within one data set and the end of each data set is indicated by a successful
- read with a return value of 0 (0 byte read).
- Any received data must be considered invalid until a complete set was
- read successfully, including the closing 0 byte read. Therefore you should
- always read the complete set into a buffer before processing the data.
- The maximum size of a data set can be as large as the size of the
- monitor DCSS, so design the buffer adequately or use dynamic memory allocation.
- The size of the monitor DCSS will be printed into syslog after loading the
- module. You can also use the (Class E privileged) CP command Q NSS MAP to
- list all available segments and information about them.
- As with most char devices, error conditions are indicated by returning a
- negative value for the number of bytes read. In this case, the errno variable
- indicates the error condition:
- EIO: reply failed, read data is invalid and the application
- should discard the data read since the last successful read with 0 size.
- EFAULT: copy_to_user failed, read data is invalid and the application should
- discard the data read since the last successful read with 0 size.
- EAGAIN: occurs on a non-blocking read if there is no data available at the
- moment. There is no data missing or corrupted, just try again or rather
- use polling for non-blocking reads.
- EOVERFLOW: message limit reached, the data read since the last successful
- read with 0 size is valid but subsequent records may be missing.
- In the last case (EOVERFLOW) there may be missing data, in the first two cases
- (EIO, EFAULT) there will be missing data. It's up to the application if it will
- continue reading subsequent data or rather exit.
- Open:
- -----
- Only one user is allowed to open the char device. If it is already in use, the
- open function will fail (return a negative value) and set errno to EBUSY.
- The open function may also fail if an IUCV connection to the *MONITOR service
- cannot be established. In this case errno will be set to EIO and an error
- message with an IPUSER SEVER code will be printed into syslog. The IPUSER SEVER
- codes are described in the "z/VM Performance" book, Appendix A.
- NOTE:
- -----
- As soon as the device is opened, incoming messages will be accepted and they
- will account for the message limit, i.e. opening the device without reading
- from it will provoke the "message limit reached" error (EOVERFLOW error code)
- eventually.
|