AIX¶

CPU¶

Simulate CPU load¶

The trailing argument (4 here) is the number of busy-loop processes to run, so it loads that many logical CPUs.

perl -e 'while (--$ARGV[0] and fork) {}; while () {}' 4
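
The perl one-liner forks until its counter reaches zero, leaving the requested number of processes spinning in an empty loop until they are killed. A rough shell equivalent with a time limit is sketched below; the function name `cpu_load` and the seconds argument are illustrative additions, not part of the original command.

```shell
# cpu_load WORKERS SECONDS
# Start WORKERS busy-loop subshells, let them spin for SECONDS, then stop them.
# Hypothetical helper: unlike the perl one-liner it cleans up after itself.
cpu_load() {
    n=$1; secs=$2; pids=""
    i=0
    while [ "$i" -lt "$n" ]; do
        ( while :; do :; done ) &      # one spinning worker per iteration
        pids="$pids $!"
        i=$((i + 1))
    done
    sleep "$secs"
    kill $pids 2>/dev/null             # unquoted on purpose: one PID per word
    wait $pids 2>/dev/null || true     # reap the workers, ignore their exit status
    echo "stopped $n workers"
}

# Example: load 4 logical CPUs for 60 seconds
# cpu_load 4 60
```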

Trace per-process CPU usage¶

tprof -x sleep 60

High j2pg usage¶

j2pg is a kernel process integral to processing JFS2 I/O requests.

This kernel thread is responsible for managing I/O in JFS2 filesystems, so it is normal to see it running when there is a lot of I/O or when syncd runs. In this case j2pg was running syncHashList() very often. The sync is done in syncHashList(): all inodes are extracted from the hash list, and iSyncNeeded() then decides whether each inode needs to be synchronized.

Note that a sync() call causes the system to scan all the memory currently used for file caching to see which pages are dirty and have to be synced to disk.

Therefore, the j2pg spike is explained by the two calls that were being made (iSyncNeeded ---> syncHashList).

What is going on here is a flush/sync of the JFS2 metadata to disk. Apparently some program went recursively through the filesystem accessing files, forcing the inode access timestamps to change. These changes then have to be propagated to disk.

Here are a few reasons why j2pg would be active and consume high CPU:

If several processes are issuing sync, j2pg will be very active and use CPU resources.
If there is file system corruption, j2pg will use more CPU resources.
If the storage cannot move data fast enough, j2pg will use a high amount of CPU resources.

j2pg is started for any JFS2 directory activity, so if the system experiences a lot of JFS2 directory activity, j2pg will be active handling the I/O. syncd can also cause j2pg activity: since syncd flushes I/O from real memory to disk, any JFS2 directories with files in the buffer will also be hit.

Checking the syncd...

From data, we see:

$ grep -c sync psb.elfk
351 << this is high
$ grep sync psb.elfk | grep -c oracle
348 << syncd called by Oracle user only

It appears that the large number of sync calls, each of which causes j2pg to run, is what causes the spikes.
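
The two grep counts can be broken down per user in a single pass. This is a minimal sketch, assuming psb.elfk holds saved `ps -elfk` output with the user name in the third column; the function name `sync_by_user` is made up for illustration.

```shell
# sync_by_user FILE
# Count the lines that mention "sync" per user in saved ps output.
# Assumption: FILE looks like `ps -elfk` output, where column 3 is the user.
sync_by_user() {
    awk '/sync/ { count[$3]++ }
         END    { for (u in count) print count[u], u }' "$1" | sort -rn
}

# Example: sync_by_user psb.elfk
```

Like the plain grep, this also counts the syncd daemon itself, so a small count for root is expected.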

We see /usr/sbin/syncd 60

j2pg is responsible for flushing data to disk and is usually called by the syncd process. If you have a large number of sync processes running on the system, that would explain the high CPU for j2pg. The syncd setting determines the frequency with which the I/O disk-write buffers are flushed. The AIX default value for syncd as set in /sbin/rc.boot is 60. It is recommended to change this value to 10.

This causes syncd to run more often so that dirty file pages do not accumulate: it runs more frequently but for a shorter period of time. To make this permanent, edit the /sbin/rc.boot file and change the 60 to 10.

You may consider mounting all of the non-rootvg file systems with the 'noatime' option. This can be done without an outage, although doing it outside peak production hours is better:

mount -o remount,noatime /oracle
chfs -a options=noatime /oracle

noatime turns off access-time updates. Using this option can improve performance on file systems where a large number of files are read frequently and seldom updated. If you use the option, the last access time for a file cannot be determined. If neither atime nor noatime is specified, atime is the default value.

From the symptoms, it looks like update was intended to run an SQL query but instead invoked the /usr/sbin/update command. Check with the application team to find out what these processes are, and fix them so they do not call /usr/sbin/update unless the intent is to update the super block of the file systems. Removing all these sync processes should bring down the j2pg usage.

Thread migrations¶

# mpstat -w 1 10

System configuration: lcpu=8 ent=0.5 mode=Uncapped

cpu    min    maj    mpc    int     cs    ics     rq    mig   lpa   sysc    us    sy    wa    id    pc   %ec   lcs
  0     18      0      0    199    113      0      1      0 100.0    367  24.9  41.0   0.0  34.0  0.00   0.4   155
  1      0      0      0     15      0      0      0      0     -      0   0.0   0.8   0.0  99.2  0.00   0.2    16
  2      0      0      0     10      0      0      0      0     -      0   0.0   2.1   0.0  97.9  0.00   0.2    11
  3      0      0      0      9      0      0      0      0     -      0   0.0   0.5   0.0  99.5  0.00   0.2    10
  4      0      0      0      9      0      0      0      0     -      0   0.0   1.4   0.0  98.6  0.00   0.2    10
  5      0      0      0      9      0      0      0      0     -      0   0.0   0.5   0.0  99.5  0.00   0.2    10
  6      0      0      0      9      0      0      0      0     -      0   0.0   1.6   0.0  98.4  0.00   0.2    10
  7      0      0      0     18      0      0      0      0     -      0   0.0   1.6   0.0  98.4  0.00   0.2    19
  U      -      -      -      -      -      -      -      -     -      -     -     -   0.0  99.7  0.49  98.1     -
ALL     18      0      0    278    113      0      1      0   0.0    367   0.1   0.2   0.0  99.7  0.01   1.9   241
------------------------------------------------------------------------------------------------------------------------------------

Column	Description	Comment
`mig`	Total number of thread migrations (to another logical processor)
`lpa`	Logical processor affinity. The percentage of logical processor re-dispatches	Less than 100 means thread migrated to another socket. 100 means thread migration on same socket.

# mpstat -d 1

System configuration: lcpu=8 ent=0.5 mode=Uncapped

cpu     cs    ics  bound     rq   push S3pull  S3grd  S0rd  S1rd  S2rd  S3rd  S4rd  S5rd   ilcs   vlcs S3hrd S4hrd S5hrd  %nsp
  0    142      3      1      1      0      0      0  99.4   0.6   0.0   0.0   0.0   0.0      1    152 100.0   0.0   0.0   115
  1      0      0      0      0      0      0      0   0.0 100.0   0.0   0.0   0.0   0.0      0     16 100.0   0.0   0.0   115
  2      0      0      0      0      0      0      0   0.0 100.0   0.0   0.0   0.0   0.0      0     11 100.0   0.0   0.0   115
  3      0      0      0      0      0      0      0   0.0 100.0   0.0   0.0   0.0   0.0      0     11 100.0   0.0   0.0   115
  4      0      0      0      0      0      0      0   0.0 100.0   0.0   0.0   0.0   0.0      0     11 100.0   0.0   0.0   115
  5      0      0      0      0      0      0      0   0.0 100.0   0.0   0.0   0.0   0.0      0     11 100.0   0.0   0.0   115
  6      0      0      0      0      0      0      0   0.0 100.0   0.0   0.0   0.0   0.0      0     12 100.0   0.0   0.0   115
  7      0      0      0      0      0      0      0   0.0 100.0   0.0   0.0   0.0   0.0      0     19 100.0   0.0   0.0   115
ALL    142      3      1      1      0      0      0  95.1   4.9   0.0   0.0   0.0   0.0      1    243 100.0   0.0   0.0     0
------------------------------------------------------------------------------------------------------------------------------------------

Column	Description
`S0rd`	The percentage of thread redispatches within the same logical processor with scheduling affinity domain 0.
`S1rd`	The percentage of thread redispatches within the same physical processor or core with scheduling affinity domain 1.
`S2rd`	The percentage of thread redispatches within the same chip set, but not within the same processor core with scheduling affinity domain 2.
`S3rd`	The percentage of thread redispatches within the same MCM (multiple chip module) , but not within the same chip set with scheduling affinity domain 3.
`S4rd`	The percentage of thread redispatches on different MCMs within the same CEC or Plane with scheduling affinity domain 4.
`S5rd`	The percentage of thread redispatches on a different CEC or Plane with scheduling affinity domain 5.
`S3hrd`	The percentage of local thread dispatches on this logical processor.
`S4hrd`	The percentage of near thread dispatches on this logical processor.
`S5hrd`	The percentage of far thread dispatches on this logical processor.

Memory¶

Memory usage per process¶

svmon -P -O summary=basic,unit=MB

Memory usage per user¶

svmon -U -t 5 -O summary=basic,unit=MB

Processes taking up paging space¶

svmon -P -O sortseg=pgsp

Top 15 processes using memory¶

svmon -Pt15 | perl -e 'while(<>){print if($.==2||$&&&!$s++);$.=0 if(/^-+$/)}'

Processes using filesystem cache¶

svmon -Sl | more

svmon will list each PID that has a segment mapped. Any segments marked as Unused are NOT mapped to any process.

vmstat commands¶

vmstat -IWwt

Column	Description
`kthr:b`	queue-count of blocked threads underway
`kthr:p`	queue-count of raw IO threads underway
`kthr:w`	queue-count of JFS/JFS2 IO threads underway
`memory:avm`	Computational Memory in 4096byte memory pages
`memory:fre`	total real-time AIX Free Memory in 4K mempages
`page:fi`	count of default&rbr JFS/JFS2 4K page reads; no raw/CIO/mmfs/NFS
`page:fo`	count of default&rbw JFS/JFS2 4K page writes; no raw/CIO/mmfs/NFS
`page:pi`	count of paging space page-ins
`page:po`	count of paging space page-outs
`page:fr`	free rate of AIX:lrud adding to memory:fre
`page:sr`	scan rate of AIX:lrud scanning for page:fr
`faults:in`	count of device interrupts
`faults:sy`	count of system calls called
`faults:cs`	count of thread context switches
`cpu:us`	user% of cpu:pc when ec>100 (or ent) on SPLPARs
`cpu:sy`	system% of cpu:pc when ec>100 (or ent) on SPLPARs
`cpu:id`	idle% of cpu:pc when ec>100 (or ent) on SPLPARs
`cpu:wa`	wait% of cpu:pc when ec>100 (or ent) on SPLPARs

vmstat -v tuning¶

pending disk I/Os blocked with no pbuf¶

Number of pending disk I/O requests blocked because no pbuf was available. Pbufs are pinned memory buffers used to hold I/O requests at the logical volume manager layer. The count currently covers rootvg only.

Use AIX:lvmo to monitor the pervg_blocked_io_count of each active LVM volume group:

# lvmo -a -v rootvg
vgname = rootvg
pv_pbuf_count = 512
total_vg_pbufs = 512
max_vg_pbufs = 16384
pervg_blocked_io_count = 19
pv_min_pbuf = 512
max_vg_pbuf_count = 0
global_blocked_io_count = 1566

An acceptable tolerance is a 4-digit pervg_blocked_io_count per LVM volume group for any 90 days of uptime.

Otherwise, for each LVM volume group, adjust the value of AIX:lvmo:pv_pbuf_count accordingly:

If pervg_blocked_io_count has 5 digits, add ~2048 pbufs to total_vg_pbufs per 90-day cycle.
If pervg_blocked_io_count has 6 digits, add ~[4*2048] pbufs to total_vg_pbufs per 90-day cycle.
If pervg_blocked_io_count has 7 digits, add ~[8*2048] pbufs to total_vg_pbufs per 90-day cycle.
If pervg_blocked_io_count has 8 digits, add ~[12*2048] pbufs to total_vg_pbufs per 90-day cycle.
If pervg_blocked_io_count has 9 digits, add ~[16*2048] pbufs to total_vg_pbufs per 90-day cycle.
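
The digit-count rule above can be written as a small lookup. This is only a sketch of that rule of thumb: the function name `suggest_pbuf_increase` is hypothetical, and treating anything longer than 9 digits like the 9-digit case is an assumption.

```shell
# suggest_pbuf_increase COUNT
# Print how many pbufs the rule of thumb suggests adding to total_vg_pbufs
# per 90-day cycle, based on the number of digits in pervg_blocked_io_count.
# COUNT must be a plain decimal number without leading zeros.
suggest_pbuf_increase() {
    case ${#1} in
        0|1|2|3|4) echo 0 ;;                 # within tolerance, no change
        5)         echo 2048 ;;
        6)         echo $((4 * 2048)) ;;
        7)         echo $((8 * 2048)) ;;
        8)         echo $((12 * 2048)) ;;
        *)         echo $((16 * 2048)) ;;    # 9 digits; more is an assumption
    esac
}

# Example with the lvmo output shown earlier (pervg_blocked_io_count = 19):
# suggest_pbuf_increase 19
```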

Use AIX:lvmo to confirm/verify the value of total_vg_pbufs for each VG.

lvmo -v rootvg -o pv_pbuf_count=1024

filesystem I/Os blocked with no fsbuf¶

Number of filesystem I/O requests blocked because no fsbuf was available. Fsbuf are pinned memory buffers used to hold I/O requests in the filesystem layer.

filesystem I/Os blocked with no fsbuf # mostly JFS

If many, increase ioo:numfsbufs to 512,1024 or 2048 per severity of blocked I/Os
Default value of ioo:numfsbufs=192 ---> 1024
JFS fsbufs are per-filesystem static-allocations in pinned memory
Must re-mount (umount; mount) filesystems for effect

ioo -p -o numfsbufs=1024

external pager filesystem I/Os blocked with no fsbuf¶

Number of external pager client filesystem I/O requests blocked because no fsbuf was available. JFS2 is an external pager client filesystem. Fsbufs are pinned memory buffers used to hold I/O requests in the filesystem layer.

An acceptable tolerance is 5 digits per 90 days of uptime.

First tactic to attempt: if 6 digits, set j2_dynamicBufferPreallocation=128 with ioo -o.

If 7 or more digits, set j2_dynamicBufferPreallocation=256 with ioo -o.

ioo -h j2_dynamicBufferPreallocation=value

The number of 16K slabs to preallocate when the filesystem is running low on bufstructs. A value of 16 represents 256K. The bufstructs for Enhanced JFS (aka JFS2) are now dynamic; the number of buffers that start on the JFS2 filesystem is controlled by j2_nBufferPerPagerDevice (now restricted), but buffers are allocated and destroyed dynamically past this initial value. If the number of external pager filesystem I/Os blocked with no fsbuf increases, j2_dynamicBufferPreallocation should be increased for that file system, as the I/O load on the file system may be exceeding the speed of preallocation.

A value of 0 will disable dynamic buffer allocation completely.

Heavy IO workloads should have this value changed to 256.

File systems do not need to be remounted to activate.

ioo -po j2_dynamicBufferPreallocation=256

Network buffer memory usage¶

You'll want to capture the below values over time, and watch the increases to the allocated memory.

# echo "bucket -s" | /usr/sbin/kdb | /usr/bin/egrep 'allocated|thewall'
thewall............0000000000800000  kmemfreelater......0000000000000000
allocated..........000000000D66B000  bucket.......... @.F1000B04602A0158

The above values are in HEX, so you can convert them to bytes.

# echo "ibase=16; 000000000D66B000" | bc
224833536
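
When capturing these values repeatedly, the conversion can be done in the shell itself. The helper names below (`hex_to_bytes`, `hex_to_mb`, `hex_delta`) are illustrative and not part of kdb; this is a minimal sketch of the same hex-to-decimal step plus a delta between two samples.

```shell
# hex_to_bytes HEX -> decimal value of a hex string such as 000000000D66B000
hex_to_bytes() {
    printf '%d\n' "0x$1"
}

# hex_to_mb HEX -> the same value in whole MB (integer division)
hex_to_mb() {
    echo $(( $(hex_to_bytes "$1") / 1024 / 1024 ))
}

# hex_delta OLD NEW -> growth in bytes between two "allocated" samples
hex_delta() {
    echo $(( $(hex_to_bytes "$2") - $(hex_to_bytes "$1") ))
}

# Example: hex_to_bytes 000000000D66B000   # same value bc printed above
```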

Storage¶

fsck details of filesystem¶

# /sbin/helpers/jfs2/fscklog -p /opt/ibm/scratch2
*** Checking prior fsck log. ***

Found a valid superblock.  Continuing with fsck log check.

Time Stamps
s_time.tj_sec:          Fri Feb 15 14:56:15 2019
last mounted:           Mon Dec  7 16:35:49 2020
last unmounted:         Sun Jun  7 11:02:40 2020
last marked dirty:      Never marked dirty
last recovered:         Never recovered
last size change:       Never changed size

format LUN¶

dd if=/dev/zero of=/dev/rhdisk2 bs=1024k count=$(bootinfo -s hdisk2)

Manually remove a hdisk¶

To manually delete hard disks that refuse to be deleted:

odmget -q name=hdisk# CuAt          <-- (Should be 6 entries)
odmget -q name=hdisk# CuDv          <-- (Should be 1 entry)
odmdelete -q name=hdisk# -o CuAt    <-- (Should delete 6 entries)
odmdelete -q name=hdisk# -o CuDv    <-- (Should delete 1 entry)
rm /dev/hdisk# /dev/rhdisk#

Rename a volume group (VG)¶

# lspv
hdisk0 002322fa97605ea2 rootvg active
hdisk1 002322fa0f8c3457 oldvg active
hdisk2 002322fa84e6f325 oldvg active
# varyoffvg oldvg
# exportvg oldvg
# lspv
hdisk0 002322fa97605ea2 rootvg active
hdisk1 002322fa0f8c3457 None active
hdisk2 002322fa84e6f325 None active
# importvg -y newvg hdisk1 (or hdisk2)
newvg
# lspv
hdisk0 002322fa97605ea2 rootvg active
hdisk1 002322fa0f8c3457 newvg active
hdisk2 002322fa84e6f325 newvg active

To get a disk out of Missing/Removed state¶

chpv -va hdiskX

You may have to run varyonvg to get the volume group to re-probe for the disk and recognize its state has changed.

Manually bring up a path to a disk¶

chpath -l hdisk2 -p vscsi0 -s enable

Change a hdisk from removed to active¶

chpv -va <hdisk#>

Manually assign PVID to disk¶

chdev -a pv=yes -l <hdisk>

PowerPath commands¶

Display high level HBA info¶

powermt display

Display all devices¶

powermt display dev=all

Display particular device¶

powermt display dev=hdiskpower0

Retrieve PowerPath registration key¶

powermt check_registration

Display PowerPath options¶

powermt display options

Display HBA mode enabled/disabled¶

powermt display hba_mode

Display I/O paths¶

powermt display paths

Display port status¶

powermt display port_mode

Display PowerPath version¶

powermt version

Check I/O paths¶

If you have made changes to the HBAs or I/O paths, run powermt check so that PowerPath takes the appropriate action. For example, if you have manually removed an I/O path, the check command will detect the dead path and remove it from the EMC path list.

powermt check
powermt check force

Configure Power Path¶

powermt config

Save/Restore Power Path configuration¶

powermt save      <-- Saves to /etc/powermt.custom
powermt save file=/etc/powermt.21-Aug-2010
powermt load file=/etc/powermt.21-Aug-2010

Request Power Path to recheck I/O Paths¶

powermt restore dev=all

Change mode of specific HBA to active/standby¶

powermt set mode=[active|standby] hba=X     <-- X being the HBA number

Delete an I/O path¶

powermt remove dev=X              <-- X being the value in the I/O Paths column
powermt remove dev=hdiskpower0    <-- Will remove all I/O paths to a specific device

SDDPCM Commands¶

Query device paths¶

pcmpath query adapter

Remove failed paths¶

rmpath -p fscsi0 -d

Show adapter WWPN's¶

pcmpath query wwpn

Query ports¶

pcmpath query port

Check current and ODM queue depth value¶

Can use this to check if AIX has been rebooted since changing queue depth

# lsattr -El hdisk6 -a queue_depth
queue_depth 128 Queue DEPTH True          <-- Value in ODM
# echo scsidisk hdisk6 | kdb | grep queue_depth
   ushort queue_depth   = 0x80;           <-- Running config
# echo "ibase=16 ; 80" | bc
128                                       <-- Hex value conversion
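
The comparison between the ODM value and the running value can be scripted. This is a sketch only: `queue_depth_in_sync` is a hypothetical helper that takes the two numbers as arguments, so you still collect them with lsattr and kdb as shown above.

```shell
# queue_depth_in_sync ODM_DECIMAL RUNNING_HEX
# Exit status 0 when the ODM value equals the running value reported by kdb.
queue_depth_in_sync() {
    running=$(printf '%d' "$2")      # accepts 0x80 style input
    [ "$1" -eq "$running" ]
}

# Example with the values above:
# queue_depth_in_sync 128 0x80 && echo "in sync" || echo "change not active yet"
```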

Create ramdisk¶

mkramdisk 2G
mkfs -V jfs2 /dev/ramdisk0
mkdir -p /ramdisk0
mount -V jfs2 -o log=NULL -o dio,rbrw,noatime /dev/ramdisk0 /ramdisk0

Remove file by inode¶

ls -i
find . -inum <inode>
find . -inum <inode> -exec rm {} \;

Extended Logical Volume (LV) information¶

getlvcb -AT <lv_name>

Manually unmirror logical volumes¶

This removes the mirror copy of each logical volume from hdisk0, leaving a single copy.

rmlvcopy hd6 1 hdisk0
rmlvcopy hd5 1 hdisk0

List Filesystems in reverse sort order¶

Based on mount point string length. Useful for unmounting a large number of filesystems that have parent mounts.

lsvgfs rootvg | awk '{ print length, $0 }' | sort -n -r | cut -d" " -f2-

Sort /etc/filesystems by mountpoint string length¶

Should prevent parent/child mount conflicts

for FS in $(awk '(!/^\*/) && (/^\//){ print length, $0 }' /etc/filesystems | sort -n | cut -d" " -f2-); do grep -p "^${FS}" /etc/filesystems; done

Show permissions for all directories to a certain path¶

# dir=/export/nim/images/OpenSSH; while [ "$dir" != "/" ]; do ls -ald $dir; dir=`dirname $dir`; done
drwxr-x---    8 root     system         4096 Sep 01 09:18 /export/nim/images/OpenSSH
drwxrwxr-x   43 root     system         4096 Oct 20 09:45 /export/nim/images
drwxr-xr-x   20 root     system         4096 Jul 27 09:58 /export/nim
drwxrwxr-x    3 root     system          256 Nov 06 2015  /export

rsync delete files in the destination that are no longer in the source¶

Won't copy new/changed files, this is only to delete. Remove the --dry-run option to actually delete

rsync --recursive -x --delete --ignore-existing --existing --prune-empty-dirs --verbose --dry-run /kristian1/ /kristian2

Use rsync to resume SSH download¶

rsync --partial --progress -avz -e "ssh -p 22" <user>@<host>:~/IBM/Downloads/AIX/7100-04-00-ISO/*.iso .    <-- Pull
rsync --partial --progress -avz . <user>@<host>:~/aixtoolbox    <-- Push

Move a filesystem or logical volume from one volume group to another¶

Example below has the /app/IBMucd in rootvg, and we're moving it to kristianvg

1. Verify if the existing filesystem is using internal or external logging¶

# mount | grep /app/IBMucd
/dev/fslv00      /app/IBMucd      jfs2   Mar 16 09:10 rw,nodev,nosuid,log=INLINE

2. Umount existing filesystem¶

umount /app/IBMucd

3. Copy existing logical volume to another volume group with a new name¶

cplv -v kristianvg -y newfslv00 fslv00

4. Change the filesystem to use the new logical volume and log device¶

If using inline logging

chfs -a dev=/dev/newfslv00 -a log=INLINE /app/IBMucd

If using external logging

chfs -a dev=/dev/newfslv00 -a log=/dev/XXXX /app/IBMucd

Where XXXX is the external log for the existing volume group. If no external log exists, create one with mklv and logform.

5. Run fsck and mount filesystem¶

fsck -ofull -y /app/IBMucd
mount /app/IBMucd

6. Remove the old logical volume from rootvg¶

rmlv -f fslv00

JFS2 Internal/External Snapshots¶

Internal¶

Internal snapshots are only supported from AIX 6.1 and must be enabled when the filesystem is created (-a isnapshot=yes)

1. Create snapshot¶

 snapshot -o snapfrom=/km -n kmsnap1

snapfrom: Filesystem to snapshot
-n: Name of snapshot

2. Query snapshot¶

# snapshot -q /km
Snapshots for /km
Current  Name         Time
   *     kmsnap1      Wed Feb 10 19:55:18 CST 2010

3. Restore individual files¶

cd /km/.snapshot/kmsnap1
cp -p <source> <dest>

4. Restore entire filesystem¶

umount /km
rollback -v -n kmsnap1 /km

5. Remove snapshot¶

snapshot -d -n kmsnap1 /km

External¶

1. Create snapshot¶

snapshot -o snapfrom=/km -o size=128M

The size depends on how many changes you will be making. In this instance, /km is 256M, so the snapshot LV is half that size.

2. Query snapshot¶

# snapshot -q /km
Snapshots for /km
Current  Location      512-blocks        Free Time
   *     /dev/fslv08       262144      261376 Wed Feb 10 18:03:15 CST 2010
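
The query reports the snapshot size in 512-byte blocks, while it was created with size=128M. A small conversion helper makes the two easy to compare; `blocks_to_mb` is an illustrative name, not an AIX command.

```shell
# blocks_to_mb BLOCKS -> size in whole MB for a count of 512-byte blocks
blocks_to_mb() {
    echo $(( $1 * 512 / 1024 / 1024 ))
}

# Example: blocks_to_mb 262144   # the 512-blocks value from the query above
```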

3. Increase snapshot image¶

snapshot -o size=+1 /dev/fslv08
Snapshot /dev/fslv08 size is now 524288 512-blocks.

4. Restore individual files¶

mkdir /mnt/snapfs
mount -v jfs2 -o snapshot /dev/fslv08 /mnt/snapfs
cp -p /mnt/snapfs/<source> <dest>

5. Restore entire filesystem¶

umount /km
rollback -v /km /dev/fslv08

Considerations¶

If writes to an internal snapshot fail (out of space), all snapshots are marked as INVALID and an error is written to errpt. All snapshots then need to be removed and recreated.
Internal snapshots are removed if fsck is run against the filesystem.
Internal snapshots consume space inside the original filesystem.

Rename a device¶

To rename disk hdisk5 to hdisk2

rendev -l hdisk5 -n hdisk2

Filesystem throughput report¶

iostat -TsfV 1

Fibre channel statistics report¶

AIX 7.3 only.

# fcstat -t 1 -p scsi fcs0

FIBRE CHANNEL STATISTICS REPORT: fcs0
Device Type: FC Adapter (adapter/vdevice/IBM,vfc-client)

TP         Read Reqs(K)     Write Reqs(K)   Read (GB)  Write (GB)
===== ================= ================= =========== ===========
SCSI                207              2051           4          11
SCSI                207              2051           4          11
SCSI                207              2051           4          11
SCSI                207              2051           4          11
SCSI                207              2051           4          11

NFSv4 ACL's¶

The below is predicated on the filesystem that the directory/file resides in having version 2 extended attributes set (EAformat).

# lsfs -q /kristian
Name            Nodename   Mount Pt               VFS   Size    Options    Auto Accounting
/dev/fslv00     --         /kristian              jfs2  2097152 nodev,nosuid,rw yes  yes
(lv size: 2097152, fs size: 2097152, block size: 4096, sparse files: yes, inline log: yes, inline log size: 8, EAformat: v2, Quota: no, DMAPI: no, VIX: yes, EFS: no, ISNAPSHOT: yes, MAXEXT: 0, MountGuard: no)

Using the chmod command to manipulate the rwx permission bits, either in octal form (for example, 755) or in symbolic form (for example, u+x) will replace the NFSv4 ACL with an AIXC ACL, wiping out the original permissions that were on the directory/file. Once you've converted the directory/file to NFSv4 ACL's, use the acledit command to modify the base permissions.
- Allow splunk user to read a single file
  - Convert the single file from AIXC to NFS4 ACL's
```
# aclconvert -R -t NFS4 /kristian/app.log
# export EDITOR=/usr/bin/vim
```
  - Edit the ACL and add the u:splunk: a rRas entry. The entry needs to be added before the s:(EVERYONE@): d deny rule.
```
# acledit /kristian/app.log

*
* ACL_type   NFS4
*
*
* Owner: root
* Group: system
*
s:(OWNER@):     a       rwpRWaAdcCs
s:(OWNER@):     d       xo
s:(GROUP@):     a       rRadcs
s:(GROUP@):     d       wpWxACo
s:(EVERYONE@):  a       adcs
u:splunk:       a       rRas
s:(EVERYONE@):  d       rwpRWxACo
```
- Allow splunk user to read newly created files in a directory
  - Convert the directory from AIXC to NFS4 ACL's
```
# aclconvert -R -t NFS4 /kristian/log
# export EDITOR=/usr/bin/vim
```
  - Edit the ACL and add the u:splunk: a rRxas fidi entry. The entry needs to be added before the s:(EVERYONE@): d deny rule.
  - Add fidi to every entry (FILE_INHERIT/DIRECTORY_INHERIT) so newly created files and directories inherit the parent ACL's.
```
# acledit /kristian/log

*
* ACL_type   NFS4
*
*
* Owner: root
* Group: system
*
s:(OWNER@):     a       rwpRWxDaAdcCs fidi
s:(OWNER@):     d       o fidi
s:(GROUP@):     a       rRxadcs fidi
s:(GROUP@):     d       wpWDACo fidi
s:(EVERYONE@):  a       adcs fidi
u:splunk:       a       rRxas fidi
s:(EVERYONE@):  d       rwpRWxDACo fidi
```
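
The ordering rule (the allow entry must come before the s:(EVERYONE@): d deny entry) can also be applied to ACL text without an interactive editor. The function below is a hedged sketch that only transforms text on stdin; its name is made up, and feeding it from aclget and writing the result back with aclput is an assumed workflow you should verify on your system first.

```shell
# insert_before_everyone_deny ENTRY < acl_in > acl_out
# Copy the ACL text from stdin to stdout, inserting ENTRY immediately
# before the first "s:(EVERYONE@): d" deny line.
insert_before_everyone_deny() {
    awk -v entry="$1" '
        !done && /^s:\(EVERYONE@\):[ \t]+d[ \t]/ { print entry; done = 1 }
        { print }
    '
}

# Example (assumed workflow, not from the original notes):
# aclget /kristian/app.log | insert_before_everyone_deny 'u:splunk:       a       rRas'
```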

Network¶

Transfer IP address¶

ifconfig en2 1.2.3.4 transfer en1
ifconfig en2 down detach
rmdev -Rdl ent2; rmdev -Rdl et2; rmdev -Rdl en2
chdev -l en1 -a netaddr='1.2.3.4' -a netmask='255.255.255.0' -a state=up

iptrace¶

Start trace

iptrace -a -i en0 -p 25 /tmp/iptrace.`hostname`.out

Stop trace

kill -15 <pid>

Read trace

ipreport -n /tmp/iptrace.`hostname`.out | more

Can also be read with Wireshark

Map a port to a process¶

# netstat -aAn | grep 22
f10007000028bbb0 tcp4       0      0  *.22               *.*                LISTEN
# rmsock f10007000028bbb0 tcpcb
The socket 0x28b808 is being held by proccess 151996 (sshd).

or

lsof -i :PORT

Map a process to a port¶

lsof -Pp <PID>

Remove multiple default gateways¶

# odmget -q "attribute=route" CuAt
    CuAt:
            name = "inet0"
            attribute = "route"
            value = "net,-hopcount,0,,0,192.168.0.2"
            type = "R"
            generic = "DU"
            rep = "s"
            nls_index = 0

    CuAt:
            name = "inet0"
            attribute = "route"
            value = "net,-hopcount,0,,0,192.168.0.2"
            type = "R"
            generic = "DU"
            rep = "s"
            nls_index = 0

If there is more than one, you need to remove the excess route.

chdev -l inet0 -a delroute="net,-hopcount,0,,0,192.168.0.2"

Configure Dead Gateway Detection on the default route(DGD)¶

route change default -active_dgd

Add the command route change default -active_dgd to the /etc/rc.tcpip file to make this change permanent.

Change the frequency of the DGD pings¶

no -p -o dgd_ping_time=2

Default is 5 seconds (Lowering it will allow for faster recovery)

List all HBA's and WWPN's¶

AIX

lsdev -C | awk '/^fcs/{ print $1 }' | while read -r FCS; do echo "${FCS}\t$(lscfg -vl "${FCS}" | awk -F. '/Network Address/{ print $NF }')"; done

For Virtual I/O Servers so you don't include FCoE adapters

lsdev -C | awk '/^fcs/ && /16Gb/{ print $1 }' | while read -r FCS; do echo "${FCS}\t$(lscfg -vl "${FCS}" | awk -F. '/Network Address/{ print $NF }')"; done

The apply command can also be used

apply "lscfg -vl fcs%1" 0 1 2 3 | grep Net

You can format the WWPN's for the SAN team

echo c0507603a292007c | sed 's/../&:/g;s/:$//'

Miscellaneous¶

Quick HTTP web server using python3¶

Use --bind 127.0.0.1 if you want to make it local only

python3 -m http.server 8080

Estimate mksysb size¶

df -tk $(lsvgfs rootvg) | awk '{ total+=$3 } END { printf "Estimated mksysb size %d bytes, %.2f GB\n", total*1024, total/1024/1024 }'

Update adapter firmware without using diag menus¶

diag -c -d fcsXX -T "download -s /etc/microcode -l latest -f"

vi out of memory¶

export EXINIT="set ll=20000000"

Read audit file¶

auditpr -h elRtcrp -vX < /audit/trail.20160113

Use Java to unzip a file¶

export PATH=$PATH:/usr/java8/bin
jar -xvf zipfile.zip

Find a parent device¶

odmget -q name=rmt0 CuDv

Convert GB to 512-byte blocks¶

expr 150 \* 1024 \* 1024 \* 1024 \/ 512

cpio¶

Extract

cpio -icvdum < /tmp/file.cpio

Read

cpio -ictv < /tmp/file.cpio

Create

cpio -ov > /tmp/file.cpio

Restore file from mksysb backup¶

read

restore -Tqvf <file.mksysb>

restore

restore -xvqf <file.mksysb>

restore an individual directory and its contents

restore -xdvqf <file.mksysb> ./ibmsupt

restore a single file from the mksysb file

restore -xvqf <file.mksysb> ./etc/exports

Read /var/adm/wtmp file¶

/usr/sbin/acct/fwtmp < /var/adm/wtmp > /test/wtmp.txt

Create an empty file of any size¶

lmktemp <file> 10M

or

dd if=/dev/zero of=/etl/test bs=1M count=5120    <-- Will create a 5GB test file

Prevent SIGHUP on a process already running¶

nohup -p <PID>

getconf commands¶

What was the device the system was last booted from

getconf BOOT_DEVICE

What size is a particular disk in the system

getconf DISK_SIZE /dev/hdisk0

What partition size is being used on a disk in the system

getconf DISK_PARTITION /dev/hdisk0

Is the machine capable of running a 64-bit kernel

getconf HARDWARE_BITMODE

Is the system currently running a 64-bit or 32-bit kernel

getconf KERNEL_BITMODE

How much real memory does the system have

getconf REAL_MEMORY

Set attention LED light to normal from command line¶

/usr/lpp/diagnostics/bin/usysfault -s normal

Mount an ISO image¶

loopmount -i cdrom.iso -o "-V cdrfs -o ro" -m /mnt

Use openssl to get MD5 of a file¶

openssl dgst -md5 dynadock_1_3.iso

Use csum to get MD5/SHA1 of a file¶

csum -h MD5 MH01706_x86.iso
csum -h SHA1 MH01706_x86.iso

Create ISO image from mksysb¶

mkcd -L -S -I /export/images/mksysb/2011 -m /export/images/mksysb/2011/MC_MOD.20110117.mksysb

Debug a hung process¶

dbx -a <hung_pid>

Subcommand	Description
`thread`	List thread IDs. Look for threads in an abnormal state, WAIT or DEADLOCK.
`thread current <number>`	Set attention to a thread. The value is the number from $t1, e.g. "thread current 1".
`x`	Thread dump; check if you can see where it's hanging.
`where`	Can also give you an idea of where it's hung.
`detach`	Exit the dbx session. "quit" will exit but also kill the PID.

Show value range for chdev/lsattr parameters¶

# lsattr -l hdisk5 -a queue_depth -R
1...32 (+1)

sudo debugging¶

# touch /var/log/sudo_debug.log
# cat /opt/sysadm/etc/sudo.conf
Debug sudo /var/log/sudo_debug.log all@debug
Debug sudoers.so /var/log/sudo_debug.log all@debug

pam debugging¶

Add a *.debug entry in syslog.conf

touch /etc/pam_debug

Find HMC IP address¶

AIX 5.3

lsrsrc IBM.ManagementServer

AIX 6.1 or higher

lsrsrc IBM.MCP
lsrsrc IBM.MCP IPAddresses
lsrsrc IBM.MCP HMCIPAddr

Process creation time¶

# kdb
WARNING: Version mismatch between unix file and command kdb
           START              END <name>
0000000000001000 0000000007140000 start+000FD8
F00000002FF47600 F00000002FFE1000 __ublock+000000
000000002FF22FF4 000000002FF22FF8 environ+000000
000000002FF22FF8 000000002FF22FFC errno+000000
F1001104C0000000 F1001104D0000000 pvproc+000000
F1001104D0000000 F1001104D8000000 pvthread+000000
read vscsi_scsi_ptrs OK, ptr = 0x0
(0)> tpid -d 9044254 | head
                SLOT NAME     STATE    TID PRI   RQ CPUID  CL  WCHAN
pvthread+09A800 2472 pfcdaemo SLEEP 1A80157 03C    0         0  F1000915905EE310
(0)> u 2472 | grep ticks
   start..00000000604FDB75   ticks..0000000000001F04
(0)> hcal 00000000604FDB75
Value hexa: 604FDB75          Value decimal: 1615846261
(0)> quit
# perl -le 'print scalar localtime $ARGV[0]' 1615846261
Tue Mar 16 09:11:01 2021

multibos Error reading LVCB attribute¶

multibos -R fails, leaving two boot logical volumes (hd5 and bos_hd5) in rootvg, as shown in the lsvg output below.

# multibos -R
Initializing multibos methods ...
Initializing log /etc/multibos/logs/op.alog ... 
Gathering system information ...
+-----------------------------------------------------------------------------+ 
Remove Operation 
+-----------------------------------------------------------------------------+ 
Verifying operation parameters ...
+-----------------------------------------------------------------------------+ 
Boot Partition Processing 
+-----------------------------------------------------------------------------+ 
multibos: 0565-080 Error reading LVCB attribute "fs,mb" of logical volume hd5.
multibos: 0565-082 Unable to verify multibos tag for standby BOS logical volume hd5
multibos: 0565-084 Error processing primary boot partition.
multibos: 0565-002 ATTENTION: cleanup did not complete successfully.
Log file is /etc/multibos/logs/op.alog 
Return Status: FAILURE

# lsvg -l rootvg 
rootvg: 
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT 
hd5 boot 1 2 2 closed/syncd N/A
hd6 paging 128 256 2 open/syncd N/A 
hd8 jfs2log 1 2 2 open/syncd N/A 
hd4 jfs2 4 8 2 open/syncd / 
hd2 jfs2 54 108 2 open/syncd /usr 
hd9var jfs2 16 32 2 open/syncd /var 
hd3 jfs2 8 16 2 open/syncd /tmp 
hd1 jfs2 28 56 2 open/syncd /home 
hd10opt jfs2 4 8 2 open/syncd /opt 
hd11admin jfs2 1 2 2 open/syncd /admin 
lg_dumplv sysdump 12 12 1 open/syncd N/A 
livedump jfs2 1 2 2 open/syncd /var/adm/ras/livedump 
bos_hd5 boot 1 2 2 closed/syncd N/A

Fix the logical volume control blocks

putlvcb -f 'vfs=jfs2:log=/dev/hd8:mount=automatic:type=bootfs:vol=root:free=true:quota=no' hd4 
putlvcb -f 'vfs=jfs2:log=/dev/hd8:mount=automatic:type=bootfs:vol=/usr:free=false:quota=no' hd2 
putlvcb -f 'vfs=jfs2:log=/dev/hd8:mount=automatic:type=bootfs:vol=/var:free=false:quota=no' hd9var 
putlvcb -f 'vfs=jfs2:log=/dev/hd8:mount=true:check=true:vol=/opt:free=false:quota=no' hd10opt

Remove the multibos tags from the existing file systems

chfs -a mb= /opt
chfs -a mb= /var
chfs -a mb= /usr
chfs -a mb= /

Remove or move the multibos directory
```
mv /etc/multibos /tmp
```
Remove the leftover bos_hd5
```
rmlv -f bos_hd5
```
Remove the /bos_inst directory
```
rm -R /bos_inst
```

Remove the mbverify entry from inittab

cp /etc/inittab /etc/inittab.backup
rmitab mbverify

Recreate the boot image to ensure you have a good copy
```
bosboot -ad /dev/ipldevice
```
Verify the bootlist points to hd5 or the rootvg disk only
```
bootlist -om normal
```

Fixing underlying mount point permissions¶

Example of error

$ ls -al
ls: 0653-345 ./..: Permission denied.

Verify mount point permissions

#!/bin/ksh
#Show Mount Point Permissions

[ `whoami` = "root" ] || { echo "Run as root"; exit 1; }

tmpdir="/tmp/$$"
mkdir "$tmpdir"
for fs in `mount | grep jfs | awk '{print $2}'`; do
        parentmount=`df "/$fs/.." | tail -n 1 | awk '{print $7}'`
        mount -o ro "$parentmount" "$tmpdir"
        printf "%-24s" $fs
        ls -ald `echo $fs | sed "s%$parentmount%$tmpdir/%"`
        umount "$tmpdir"
done
rmdir "$tmpdir"
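
The script above works by mounting the parent filesystem a second time on a temporary directory, then rewriting the mount point path so it points into that temporary mount. As a rough sketch of just the path rewrite, assuming the paths contain no regular-expression metacharacters (the function name and sample paths are illustrative, not part of the script):

```python
def underlying_path(fs: str, parentmount: str, tmpdir: str) -> str:
    """Mirror sed "s%$parentmount%$tmpdir/%": replace the first match only."""
    return fs.replace(parentmount, tmpdir + "/", 1)


if __name__ == "__main__":
    # Mount point /usr/local whose parent filesystem is /usr
    print(underlying_path("/usr/local", "/usr", "/tmp/1234"))
    # Mount point /home whose parent filesystem is /
    print(underlying_path("/home", "/", "/tmp/1234"))
```

The doubled slash produced in the first case is harmless, since the kernel treats repeated slashes in a path as one.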

Fix underlying mount point permissions if you don't want to unmount the filesystem

#!/bin/ksh
#Add read/execute permissions to user/group/others on underlying mount point

fs="$1"

[ `whoami` = "root" ] || { echo "Run as root"; exit 1; }
if [ -z "$fs" ]; then
        echo "Enter Mount Point to change permissions on as argument"
        exit 1
fi

tmpdir="/tmp/$$"
mkdir "$tmpdir"
parentmount=`df "/$fs/.." | tail -n 1 | awk '{print $7}'`
mount "$parentmount" "$tmpdir"
echo "Original Permissions:"
ls -ald `echo $fs | sed "s%$parentmount%$tmpdir/%"`
chmod a+rx `echo $fs | sed "s%$parentmount%$tmpdir/%"`
echo; echo "New Permissions:"
ls -ald `echo $fs | sed "s%$parentmount%$tmpdir/%"`
umount "$tmpdir"
rmdir "$tmpdir"

installp BUILDDATE requisite failure¶

# lppchk -v
lppchk:  The following filesets need to be installed or corrected to bring
         the system to a consistent state:

  bos.rte.serv_aid 7.1.5.30               (usr: COMMITTED, root: not installed)

# lslpp -h bos.rte.serv_aid
  Fileset         Level     Action       Status       Date         Time
  ----------------------------------------------------------------------------
Path: /usr/lib/objrepos
  bos.rte.serv_aid
                  7.1.1.0   COMMIT       COMPLETE     02/12/13     12:36:21
                 7.1.1.16   COMMIT       COMPLETE     02/12/13     12:51:18
                 7.1.3.45   COMMIT       COMPLETE     08/11/15     18:18:09
                  7.1.4.0   COMMIT       COMPLETE     09/02/16     21:39:09
                  7.1.4.1   COMMIT       COMPLETE     02/22/17     16:30:57
                 7.1.4.30   COMMIT       COMPLETE     11/17/18     10:13:15
                 7.1.4.31   COMMIT       COMPLETE     11/17/18     10:13:17
                  7.1.5.0   COMMIT       COMPLETE     11/17/18     10:13:19
                 7.1.5.15   COMMIT       COMPLETE     08/14/19     17:13:09
                 7.1.5.30   COMMIT       COMPLETE     08/20/19     14:22:15

Path: /etc/objrepos
  bos.rte.serv_aid
                  7.1.1.0   COMMIT       COMPLETE     02/12/13     12:36:21
                 7.1.1.16   COMMIT       COMPLETE     02/12/13     12:51:18
                 7.1.3.45   COMMIT       COMPLETE     08/11/15     18:18:09
                  7.1.4.0   COMMIT       COMPLETE     09/02/16     21:39:09
                  7.1.4.1   COMMIT       COMPLETE     02/22/17     16:30:57
                 7.1.4.30   COMMIT       COMPLETE     11/17/18     10:13:15
                 7.1.4.31   COMMIT       COMPLETE     11/17/18     10:13:17
                  7.1.5.0   COMMIT       COMPLETE     11/17/18     10:13:19
                 7.1.5.15   COMMIT       COMPLETE     08/14/19     17:13:09
                       << -- 7.1.5.30 missing from this list -- >>

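Level 7.1.5.30 is present in the usr part (/usr/lib/objrepos) but missing from the root part (/etc/objrepos), which matches the lppchk output above.

To fix, copy the fileset to the host, and then run whichever of the installp commands below matches the missing part.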

# ls -l bos.rte.serv_aid.7.1.5.30.U
-rw-r-----    1 kristijan   staff      980992 Oct 04 2018  bos.rte.serv_aid.7.1.5.30.U
# installp -Or -ac bos.rte.serv_aid    <-- To reinstall the root part
# installp -Ou -ac bos.rte.serv_aid    <-- To reinstall the usr part

Get limits (ulimit) of a running process¶

# dbx -a 12517682
Waiting to attach to process 12517682 ...
Successfully attached to ovcd.
warning: Directory containing ovcd could not be determined.
Apply 'use' command to initialize source path.

Type 'help' for help.
reading symbolic information ...warning: no source compiled with -g

stopped in _event_sleep at 0x9000000005c5f54 ($t1)
0x9000000005c5f54 (_event_sleep+0x514) e8410028             ld   r2,0x28(r1)
(dbx) proc rlimit
rlimit name:          rlimit_cur               rlimit_max       (units)
 RLIMIT_CPU:         (unlimited)             (unlimited)        sec
 RLIMIT_FSIZE:       (unlimited)             (unlimited)        bytes
 RLIMIT_DATA:          134217728             (unlimited)        bytes
 RLIMIT_STACK:          33554432             (unlimited)        bytes
 RLIMIT_CORE:                  0                       0        bytes
 RLIMIT_RSS:            33554432             (unlimited)        bytes
 RLIMIT_AS:          (unlimited)             (unlimited)        bytes
 RLIMIT_NOFILE:             2000             (unlimited)        descriptors
 RLIMIT_THREADS:          262144             (unlimited)        per process
 RLIMIT_NPROC:            262144             (unlimited)        per user
(dbx) detach

Get umask of a running process¶

Start kdb, get the thread slot number, and query for the creation mask.

# kdb
           START              END <name>
0000000000001000 0000000008450000 start+000FD8
F00000002FF47600 F00000002FFE1000 __ublock+000000
000000002FF22FF4 000000002FF22FF8 environ+000000
000000002FF22FF8 000000002FF22FFC errno+000000
F1001004C0000000 F1001004D0000000 pvproc+000000
F1001004D0000000 F1001004D8000000 pvthread+000000
read vscsi_scsi_ptrs OK, ptr = 0x0
(0)> tpid -d 19530230
                SLOT NAME     STATE    TID PRI   RQ CPUID  CL  WCHAN

pvthread+09C900 2505 ruby     SLEEP 1C9011F 03C    0         0
pvthread+0B0500 2821 ruby     SLEEP 30501D7 03C    0         0

(0)> user 2505 | grep cmask
   cmask.........0017

19530230 in the tpid -d command is the PID of the process you're trying to get the umask of.
2505 in the user command is the slot of one of its threads. All threads of a process should have the same umask value. Selecting one here should be enough, but you can verify them all.

The output is in hex and needs to be converted to octal.

# printf "%o\n" 0x17
27

The process with PID 19530230 has a running umask of 0027.
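
The hex-to-octal step can also be done in Python. This is a small sketch rather than part of the original procedure, and the function name is made up for illustration:

```python
def cmask_hex_to_umask(cmask_hex: str) -> str:
    """Render kdb's hexadecimal cmask field as a four-digit octal umask."""
    return format(int(cmask_hex, 16), "04o")


if __name__ == "__main__":
    print(cmask_hex_to_umask("0017"))  # prints 0027
```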

Building an AIX bff package¶

The mkinstallp command comes as part of the bos.adt.insttools package.

Create a build location
```
mkdir -p /packagename/root
```
Copy package contents

Copy the package files into the build root, keeping each file's absolute path beneath it.
File/folder locations

If your file needs to be located in /app/package/file.txt then copy it into /packagename/root/app/package/file.txt. Set the folder/file permissions as required.
```
mkdir -p /packagename/root/app/package
cp -rp file.txt /packagename/root/app/package/file.txt
```

Create a package template file

The basic template below is enough to get a package built. You can find a complete list of options in /usr/lpp/bos/README.MKINSTALLP.

Package Name: PackageName
Package VRMF: 1.0.0.0
Update: N
Fileset
  Fileset Name: PackageName.rte
  Fileset VRMF: 1.0.0.0
  Fileset Description: Package description
  USRLIBLPPFiles
  Pre-installation Script: /packagename/pre_i
  Post-installation Script: /packagename/post_i
  Unconfiguration Script: /packagename/unconfig
  EOUSRLIBLPPFiles
  Bosboot required: N
  License agreement acceptance required: N
  Include license files in this package: N
  Requisites:
  USRFiles
  /app/package/file.txt
  EOUSRFiles
  ROOT Part: N
  ROOTFiles
  EOROOTFiles
  Relocatable: N
EOFileset

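The following template lines are most commonly changed:

Lines 1 & 5 (Package Name, Fileset Name) are the package and fileset name
Lines 2 & 6 (Package VRMF, Fileset VRMF) are the package and fileset version
Line 7 (Fileset Description) is the fileset description
Lines 9-11 are the scripts used during installation/uninstallation
Line 18 onwards, between USRFiles and EOUSRFiles, is the list of all the files that make up the package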

Build the package
```
mkinstallp -d /packagename/root -T /packagename/root/TemplateFile
```
The built package will be located in /packagename/root/tmp

NIM¶

Configure SSL communication between master and client¶

niminit -v -a master=<nim_hostname> -a name=$(hostname) -a connect=nimsh
/usr/sbin/nimclient -c

Verify you can pull down SSL cert using tftp¶

tftp -g - <nim_hostname> /tftpboot/server.pem

Oracle¶

Check Oracle ASM LUN member/candidate ownership¶

/app/oragrid/product/*/inventory/Scripts/ext/bin/kfod verbose=true, disks=all status=true op=disks asm_diskstring='/dev/rhdisk*'
/app/oragrid/product/*/inventory/Scripts/ext/bin/kfod verbose=true, disks=all status=true op=disks asm_diskstring='/dev/rhdisk*' | egrep -i member

Start/Stop Oracle RAC (when performing LPM)¶

/app/oragrid/product/11.2.0.3/bin/crsctl start has
/app/oragrid/product/11.2.0.3/bin/crsctl stop has [-f]

Check Oracle RAC/cluster status¶

/app/oragrid/product/*/bin/crsctl stat res -t
/app/oragrid/product/*/bin/crsctl check cluster -all

Scripts¶

deactivate_paging.sh¶

If paging space logical volumes are all the same size, AIX will round robin between them. If they're different sizes, the smallest will be used first, then the next smallest, and so on.

If additional paging space logical volumes have been added, they're likely larger than the default logical volume of hd6. In this case, at boot, we can deactivate it in favour of the larger one(s).

# Script      : deactivate_paging.sh
#
# Description : Script runs at boot (inittab), and will deactivate the default
#               AIX paging device (hd6) if an alternate paging space is
#               active, and greater in size.
#
# Usage       : Script takes no parameters.

if lsps -ac | egrep -v '^#|^hd6' > /dev/null 2>&1; then
  # Get size of default paging space
  hd6_size=`lsps -a | grep '^hd6' | awk '{print $4}'`

  # Create array of other paging space attributes
  set -A paging_name `lsps -a | grep -v '^hd6' | awk '(NR!=1) {print $1}'`
  set -A paging_size `lsps -a | grep -v '^hd6' | awk '(NR!=1) {print $4}'`
  set -A paging_active `lsps -a | grep -v '^hd6' | awk '(NR!=1) {print $6}'`

  # Default paging space (hd6) will be turned off if any single
  # alternate paging space is active, and greater in size than
  # the default paging space
  count=0
  while (( $count < ${#paging_name[*]} )); do
    if [[ ( ${paging_active[$count]} = "yes" ) && ( ${paging_size[$count]%??} -gt ${hd6_size%??} ) ]]; then
      echo "At least one alternate paging space detected and active [${paging_name[$count]}]" > /dev/console
      echo 'Deactivating default paging space hd6...' > /dev/console
      swapoff /dev/hd6 > /dev/console
      exit
    else
      let count="count + 1"
    fi
  done
fi
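
The decision the script makes can be sketched in Python for testing away from an AIX host. Everything here is an assumption for illustration: the PagingSpace type, the function names, and the sample rows stand in for parsed lsps -a output, and a missing hd6 is treated as "nothing to deactivate".

```python
from typing import List, NamedTuple


class PagingSpace(NamedTuple):
    name: str
    size: str    # as printed by lsps -a, for example "512MB"
    active: str  # "yes" or "no"


def size_mb(size: str) -> int:
    # Drop the two-character unit suffix, like ${var%??} in the script
    return int(size[:-2])


def should_deactivate_hd6(spaces: List[PagingSpace]) -> bool:
    """True if any active alternate paging space is larger than hd6."""
    hd6 = next((s for s in spaces if s.name == "hd6"), None)
    if hd6 is None:
        return False
    return any(
        s.active == "yes" and size_mb(s.size) > size_mb(hd6.size)
        for s in spaces
        if s.name != "hd6"
    )


if __name__ == "__main__":
    sample = [PagingSpace("hd6", "512MB", "yes"),
              PagingSpace("paging00", "4096MB", "yes")]
    print(should_deactivate_hd6(sample))
```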

extenddump.sh¶

The script runs from root's cron, checks the current dump device size against the dump size estimate, and extends the dump device if it is smaller. To prevent the dump device from taking up all the space in rootvg, it's capped at 32GB.

# Script      : extenddump.sh
#
# Description : Script runs from root's cron and checks the current dump device size
#               against the dump size estimate and increases the dump device if smaller.
#
#               To prevent the dump device taking up all the space in rootvg, it's capped at
#               32GB.
#
# Usage       : Script takes no parameters.

# Current dump devices
PRI_DMP=$(sysdumpdev -l | awk /^primary/'{ sub("/dev/","",$2); print $2 }')
SEC_DMP=$(sysdumpdev -l | awk /^secondary/'{ sub("/dev/","",$2); print $2 }')

# Estimate size of dump
EST_DMPSIZE_BYTES=$(sysdumpdev -e | awk -F: '{ gsub(" ", "", $2); print $2 }')

# Primary dump increase
if [ "${PRI_DMP}" != "sysdumpnull" ]; then
    PRI_VG=$(lslv "${PRI_DMP}" | awk -F : '/VOLUME GROUP/{ gsub(" ", "", $3); print $3 }')
    PRI_VGPPSIZE=$(lsvg "${PRI_VG}" | awk -F'[^0-9]*' '/PP SIZE/{ print $2 }')

    PRI_EST_DMPSIZE_PP=$((("${EST_DMPSIZE_BYTES}" / 1024 / 1024 / "${PRI_VGPPSIZE}") + 1))
    PRI_CUR_DMPSIZE_PP=$(getlvcb -AT "${PRI_DMP}" | awk -F= /"number lps"/'{ gsub(" ", "", $2); print $2 }')

    # Check if the increase will extend the dump lv size beyond 32GB
    EXTEND_LV_PP=$(("${PRI_EST_DMPSIZE_PP}" - "${PRI_CUR_DMPSIZE_PP}"))
    if [ $((("${PRI_CUR_DMPSIZE_PP}" + "${EXTEND_LV_PP}") * "${PRI_VGPPSIZE}")) -le 32768 ]; then
        # Check if the dump lv is already large enough to accommodate the
        # estimated dump size in PP's
        if [ "${EXTEND_LV_PP}" -gt 0 ]; then
            if extendlv "${PRI_DMP}" "${EXTEND_LV_PP}"; then
                echo "$(date) - Dump LV: $PRI_DMP extended by $EXTEND_LV_PP PP's successfully."
            else
                echo "$(date) - Dump LV: $PRI_DMP extend by $EXTEND_LV_PP PP's failed."
            fi
        fi
    else
        echo "$(date) - Dump LV: $PRI_DMP extend failed, as it would extend beyond the 32GB limit."
    fi
fi

# Secondary dump, if it exists, should be the same size as the primary
if [ "${SEC_DMP}" != "sysdumpnull" ]; then
    # If for some reason the primary and secondary dump lv's are in different volume
    # groups, they might have a different volume group PP size. Let's convert the lv
    # sizes into MB, and work out the PP value from there.
    # - Primary in MB
    PRI_CUR_DMPSIZE_PP=$(getlvcb -AT "${PRI_DMP}" | awk -F= /"number lps"/'{ gsub(" ", "", $2); print $2 }')
    PRI_CUR_DMPSIZE_MB=$(("${PRI_CUR_DMPSIZE_PP}" * "${PRI_VGPPSIZE}"))
    # - Secondary in MB
    SEC_VG=$(lslv "${SEC_DMP}" | awk -F : '/VOLUME GROUP/{ gsub(" ", "", $3); print $3 }')
    SEC_VGPPSIZE=$(lsvg "${SEC_VG}" | awk -F'[^0-9]*' '/PP SIZE/{ print $2 }')
    SEC_CUR_DMPSIZE_PP=$(getlvcb -AT "${SEC_DMP}" | awk -F= /"number lps"/'{ gsub(" ", "", $2); print $2 }')
    SEC_CUR_DMPSIZE_MB=$(("${SEC_CUR_DMPSIZE_PP}" * "${SEC_VGPPSIZE}"))

    # Check if the secondary dump lv is smaller than the primary dump lv
    if [ "${SEC_CUR_DMPSIZE_MB}" -lt "${PRI_CUR_DMPSIZE_MB}" ]; then
        EXTEND_LV_PP=$((("${PRI_CUR_DMPSIZE_MB}" - "${SEC_CUR_DMPSIZE_MB}") / "${SEC_VGPPSIZE}"))
        if [ "${EXTEND_LV_PP}" -gt 0 ]; then
            if extendlv "${SEC_DMP}" "${EXTEND_LV_PP}"; then
                echo "$(date) - Dump LV: $SEC_DMP extended by $EXTEND_LV_PP PP's successfully."
            else
                echo "$(date) - Dump LV: $SEC_DMP extend by $EXTEND_LV_PP PP's failed."
            fi
        fi
    fi
fi
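
The primary dump sizing arithmetic above can be checked by hand with a short Python sketch. The function name and the figures in the example (a 6 GiB estimate, 128MB PPs, 12 existing PPs) are hypothetical and only illustrate the calculation.

```python
from typing import Optional

MIB = 1024 * 1024
CAP_MB = 32768  # the 32GB cap used by the script


def primary_extension(est_bytes: int, pp_size_mb: int, current_pps: int) -> Optional[int]:
    """Return PPs to add, 0 if already large enough, None if over the cap."""
    # Integer division throughout, plus one PP of headroom, as in the script
    estimated_pps = est_bytes // MIB // pp_size_mb + 1
    extend_pps = estimated_pps - current_pps
    if (current_pps + extend_pps) * pp_size_mb > CAP_MB:
        return None
    return max(extend_pps, 0)


if __name__ == "__main__":
    # 6 GiB estimate, 128MB PP size, 12 PPs currently allocated
    print(primary_extension(6 * 1024 * MIB, 128, 12))
```

Because the current size plus the extension always equals the estimated size, the cap is effectively applied to the estimated dump size rather than to the current size.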

mksysb_check.py¶

Checks that there is a client mksysb resource on the NIM master, and checks that the creation date of the resource isn't older than 15 days.

Exit codes are used to determine the status.

Exit code	Description
`0`	mksysb found and is not older than 15 days
`1`	no mksysb found
`2`	all mksysbs found are older than 15 days

#!/opt/freeware/bin/python3
#
# Check that there is at least one mksysb for the client, and
# that the creation date of the mksysb is not older than 15 days

import sys
import socket
import subprocess
from datetime import datetime

# Get hostname
hostname = socket.gethostname()

# Get current time
current_time = datetime.now()

# Create list of mksysb resources from the NIM server
nim_mksysb_list = subprocess.check_output(f"/usr/sbin/nimclient -l -L -t mksysb {hostname} | /usr/bin/awk '/{hostname}/{{ print $1 }}'", shell=True, encoding='utf-8').split()
# Drop any empty entries and materialise the result as a list. A bare
# filter object is always truthy, which would make the empty check
# below unreachable and turn "no mksysb found" into exit code 2.
nim_mksysb_list = list(filter(None, nim_mksysb_list))

# Parse list of NIM mksysb backups and compare creation date
if nim_mksysb_list:
    for mksysb in nim_mksysb_list:
        mksysb_creation_time = subprocess.check_output(f"/usr/sbin/nimclient -l -l {mksysb} | /usr/bin/awk -F = '/creation_date/{{ print $2 }}'", shell=True, encoding='utf-8').strip()
        mksysb_creation_time = datetime.strptime(mksysb_creation_time, '%c')
        elapsed = current_time - mksysb_creation_time
        if elapsed.days < 15:
            sys.exit(0)
else:
    # List is empty, no backups on NIM.
    sys.exit(1)

# If we've made it this far, all mksysb's are older than 15 days
sys.exit(2)
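
The exit-code logic can be exercised without a NIM master by separating it from the nimclient calls. The sketch below assumes creation_date is in the C-locale '%c' layout (for example "Tue Mar 16 09:11:01 2021"); the function names are illustrative and not part of the script above.

```python
from datetime import datetime
from typing import List

MAX_AGE_DAYS = 15


def is_recent(creation_date: str, now: datetime) -> bool:
    """True if the mksysb was created fewer than MAX_AGE_DAYS days before now."""
    created = datetime.strptime(creation_date.strip(), "%c")
    return (now - created).days < MAX_AGE_DAYS


def exit_code(creation_dates: List[str], now: datetime) -> int:
    """Map a list of creation dates onto the 0/1/2 exit codes in the table."""
    if not creation_dates:
        return 1  # no mksysb found
    if any(is_recent(d, now) for d in creation_dates):
        return 0  # at least one mksysb is recent enough
    return 2      # every mksysb is too old


if __name__ == "__main__":
    print(exit_code(["Tue Mar 16 09:11:01 2021"], datetime(2021, 3, 20)))
```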
