PSP0 CONSUME HIGH CPU

PSP0 CONSUME HIGH CPU

At first, i thought PSP is PlayStation portable because i am addicted to video games. But not exactly, it is a process spawner background process which spawns additional processes according to the job requirements after instance startup.

There is not of much information in the internet of this background process. In one of the critical database, this PSP0 process was on top of the sessions in ASH all the time consuming 100% of the CPU.

If alert log is scrutinized, PSP0 is the process to be started after PMON which is also near critical process to schedule other required OS processes acting as a master process.

Initially, this PSP0 has consumed high CPU with an OEM alert.

After checking the ASH data, PSP0 is consistently consuming CPU but not high.

WHO     PGM   W_CHAINS                                 SQL_ID             TOTAL CHART_WCLASS         STATE       TAASH    PER%AAS
------- ----- ---------------------------------------- ------------- ---------- -------------------- ---------- ------ ----------
SYS     PSP0  On CPU                                                     178020  ################    BURN_CPU        0  1.8391192
SYS     PSP0  os thread creation                                             46                      LONG_WAIT       1 .826086957

To get more deeper, strace was attached to psp0 OS process to track the system calls. This should not practiced in production system.

[root@hydrupgrd oracle]# ps -ef|grep psp0
oracle     35045       1  0 20:17 ?        00:00:00 ora_psp0_hydrupgd
root       36811   36768  0 20:44 pts/0    00:00:00 grep --color=auto psp0
[root@hydrupgrd oracle]# strace -rp 35045 -o psp0.trc &
[1] 36858
[root@hydrupgrd oracle]# strace: Process 35045 attached

[root@hydrupgrd oracle]#
[root@hydrupgrd oracle]#
[root@hydrupgrd oracle]# ls -lrt|tail -1
-rw-r--r--   1 root   root          5339 Jan  4 20:45 psp0.trc
[root@hydrupgrd oracle]# vi psp0.trc

PSP0 process is one of that process in top which consume less CPU but consistent eater.

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  10847 oracle    20   0 2874964 217064   9380 S   1.3   2.8  77:55.21 java
   7607 oracle    20   0 2701968  12612  10408 S   0.7   0.2   0:03.40 ora_psp0_hydrup

To check the back stage of PSP0 from database centric, oradebug 10046 is enabled.

SQL> oradebug setospid 7607
Oracle pid: 3, Unix process pid: 7607, image: oracle@hydrupgrd (PSP0)
SQL> oradebug unlimit
Statement processed.
SQL> oradebug event 10046 trace name context forever,level 12;
Statement processed.
SQL> oradebug tracefile_name
/u01/app/oracle/diag/rdbms/hydrupgd/hydrupgd/trace/hydrupgd_psp0_7607.trc

Trace reveals that the PSP0 process sits idle most of the time waiting on ‘rdbms ipc message’ from the foreground process to allocate work to spawn a process.

In between, there is an event called ‘os thread creation‘ which creates alerts for the OS scheduler to spawn threads on behalf of the database. This is the reason why we call it Process spawner.

[oracle@hydrupgrd ~]$ tail -f /u01/app/oracle/diag/rdbms/hydrupgd/hydrupgd/trace/hydrupgd_psp0_7607.trc
WAIT #0: nam='rdbms ipc message' ela= 1018986 timeout=100 p2=0 p3=0 obj#=-1 tim=7526702932

*** 2023-01-05 18:17:54.048
WAIT #0: nam='rdbms ipc message' ela= 1019628 timeout=100 p2=0 p3=0 obj#=-1 tim=7527726928

WAIT #0: nam='rdbms ipc message' ela= 823545 timeout=100 p2=0 p3=0 obj#=-1 tim=8233583633
WAIT #0: nam='os thread creation' ela= 48236 pname=1244672048 is_process=1 p3=0 obj#=-1 tim=8233631937
WAIT #0: nam='rdbms ipc message' ela= 96816 timeout=13 p2=0 p3=0 obj#=-1 tim=8233728859
WAIT #0: nam='os thread creation' ela= 64997 pname=1244672049 is_process=1 p3=0 obj#=-1 tim=8233793909

PSP process uses select() call to monitor the file descriptors for an IO call, periodically accesses the oracle binary file and also check /proc/pid/stat about status of pmon

     0.000010 semtimedop(3, [{sem_num=7, sem_op=-1, sem_flg=0}], 1, {tv_sec=1, tv_nsec=0}) = -1 EAGAIN (Resource temporarily unavailable)
     1.019687 getrusage(RUSAGE_THREAD, {ru_utime={tv_sec=0, tv_usec=57133}, ru_stime={tv_sec=0, tv_usec=166010}, ...}) = 0
     0.000026 getrusage(RUSAGE_THREAD, {ru_utime={tv_sec=0, tv_usec=57133}, ru_stime={tv_sec=0, tv_usec=166014}, ...}) = 0
     0.000012 times(NULL)               = 431336006
     0.000009 times(NULL)               = 431336006
     0.000008 select(0, 0x7ffcaef35330, 0x7ffcaef35330, 0x7ffcaef35330, {tv_sec=0, tv_usec=1000}) = 0 (Timeout)
     0.001083 select(0, 0x7ffcaef35330, 0x7ffcaef35330, 0x7ffcaef35330, {tv_sec=0, tv_usec=1000}) = 0 (Timeout)
     0.001072 select(0, 0x7ffcaef35330, 0x7ffcaef35330, 0x7ffcaef35330, {tv_sec=0, tv_usec=1000}) = 0 (Timeout)
     0.001079 select(0, 0x7ffcaef35330, 0x7ffcaef35330, 0x7ffcaef35330, {tv_sec=0, tv_usec=1000}) = 0 (Timeout)
     0.001083 times(NULL)               = 431336006
     0.000010 semtimedop(3, [{sem_num=7, sem_op=-1, sem_flg=0}], 1, {tv_sec=1, tv_nsec=0}) = 0
     0.189959 access("/u01/app/oracle/product/12.1.0/dbhome_3/bin/oracle", X_OK) = 0
     0.000035 pipe([10, 11])            = 0
     0.000018 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f1840f57550) = 36977
     0.000175 close(11)                 = 0
     0.000017 read(10, 0x7ffcaef356d0, 16) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
     0.000694 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=36977, si_uid=54321, si_status=0, si_utime=0, si_stime=0} ---
     0.000010 read(10, "r\220\0\0\0\0\0\0\376\205\34\0\0\0\0\0", 16) = 16
     0.001724 close(10)                 = 0
     0.000016 wait4(36977, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 36977
     0.000030 times(NULL)               = 431336026

     0.000012 openat(AT_FDCWD, "/proc/35043/stat", O_RDONLY) = 10
     0.000027 read(10, "35043 (ora_pmon_hydrup) S 1 3504"..., 999) = 307
     0.000015 close(10)                 = 0

Below hidden parameter can be used to flow control diagnostics related to PSP0 to prevent flooding of the queues and resource allocation

SQL> set lines 200 pages 1000
col "Hidden_parameter" for a40
col "It's Value" for a20
col Desc for a40
select x.ksppinm as Hidden_parameter,
         y.ksppstvl "It's Value",
                   x.ksppdesc "Desc"
     from
  x$ksppi x
  inner join x$ksppcv y on (x.indx = y.indx)
and
  x.ksppinm like '%psp%'
order by x.ksppinm;SQL> SQL> SQL> SQL>   2    3    4    5    6    7    8    9

HIDDEN_PARAMETER                         It's Value           Desc
---------------------------------------- -------------------- ----------------------------------------
_min_time_between_psp0_diag_secs         300                  minimum time between PSP0 diagnostic use
                                                              d for flow control

In my case, it seem that PSP0 consume high cpu due to the fact that there is a need to spawn multiple parallel processes due to high load which was the cause of high cpu caused by psp0. Hence reducing the parallel usage reduced the cpu usage by psp0

Leave a Reply

Discover more from XscalibaL

Subscribe now to keep reading and get access to the full archive.

Continue reading