idle instance 是在 Oracle 运维过程中比较容易遇到的一个问题,现对其处理流程做一个梳理:

1. 检查实例是否起来了

ps aux | grep ora_

2. 检查ORACLE_SID是否正确

echo $ORACLE_SID

3. 检查 ORACLE_BASE 和 ORACLE_HOME 是否设置

echo $ORACLE_BASE
echo $ORACLE_HOME

4. 检查 ORACLE_BASE 和 ORACLE_HOME 是否设置正确

a.路径最后面不能有/
b.正在使用的(内存中的信息)要与即将使用的(配置文件中的信息)要匹配

查看内存中的信息:

# ps aux | grep [p]mon
oraadmin  5203  0.0  0.1 2561100 17152 ?       Ss    2014  89:23 ora_pmon_loan

# cat /proc/5203/environ 
HOSTNAME=loandb.xuwang.onlineSHELL=/bin/bashTERM=vt100HISTSIZE=1000SSH_CLIENT=10.129.2.247 3485 22NLS_LANG=SIMPLIFIED CHINESE_CHINA.ZHS16GBKSSH_TTY=/dev/pts/1USER=oraadminLD_LIBRARY_PATH=/home/oracle/oracle11/product/11.2.0/dbhome_1/lib:LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.sh=01;32:*.csh=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tz=01;31:*.rpm=01;31:*.cpio=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:*.png=01;35:*.tif=01;35:ORACLE_SID=loanVER10LIST=11ORACLE_HOME_LISTNER=/data/oracle/oracle11/product/11.2.0/dbhome_1PATH=MAIL=/var/spool/mail/oraadminPWD=/home/oracle/initdbINPUTRC=/etc/inputrcJAVA_HOME=/home/oracle/jdk1.5.0_22LANG=SIMPLIFIED CHINESE_CHINA.ZHS16GBKSSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpassHOME=/home/oracleSHLVL=2LANGUAGE=en_US.UTF-8:zh_CN.GB18030:zh_CN.UTF-8LOGNAME=oraadminSSH_CONNECTION=10.129.2.247 3485 10.128.0.46 22LC_CTYPE=SIMPLIFIED CHINESE_CHINA.ZHS16GBKLESSOPEN=|/usr/bin/lesspipe.sh %sSYNC_HOME=/home/oracle/SwapDir/SOFTORACLE_HOME=/home/oracle/oracle11/product/11.2.0/dbhome_1G_BROKEN_FILENAMES=1_=/home/oracle/oracle11/product/11.2.0/dbhome_1/bin/sqlplusORA_NET2_DESC=13,16ORACLE_SPAWNED_PROCESS=1SKGP_SPAWN_DIAG_PRE_FORK_TS=1402576421SKGP_SPAWN_DIAG_POST_FORK_TS=1402576421SKGP_HIDDEN_ARGS=<FATAL/S/PMON/x0/x1/x0/x1EC386/5092/5092/x0/x2/x1/x1EC3B8/1402576421/1402576421>0SKGP_SPAWN_DIAG_PRE_EXEC_TS=1402576421[root@loandb ~]# 

查看用户环境定义:

$ cat .bash_profile
--------
ORACLE_HOME=/data/oracle/oracle11/product/11.2.0/dbhome_1
export ORACLE_HOME
--------

可见,内存中的 ORACLE_HOME(/home/oracle/oracle11/product/11.2.0/dbhome_1) 与 .profile 文件中配置的 ORACLE_HOME(/data/oracle/oracle11/product/11.2.0/dbhome_1) 并不 致,而 oracle 通过 ORACLE_SID 和 ORACLE_HOME 来产生 SGA 的唯一 ID 号(ORACLE_SID and ORACLE_HOME are used to generate the IDs of the shared memory segments for the SGA)。所以,用户在登录的时候会提示 idle instance。

将.profile 文件中配置的 ORACLE_HOME 修改为正确的 ORACLE_HOME 后,问题解决。

该问题的产生是因为 /data 是到 /home 的一个软链接,应该是之前实例启动后,有人修改过用户配置文件中的环境变量所致。

该问题是XXX省农信不良贷款数据库存在一两年的问题,使用 sqlplus / as sysdba 连接后提示 idle instance,但数据库实例进程都在,且数据库服务一直正常,管理员每次连接数据库时只能通过 sqlplus 走IP连接。我2017年过去后将其解决。

5. 操作系统资源限制值过低

XXX省农信一套RAC数据库两个节点,其中一个节点隔一段时间后连接异常,使用 sqlplus / as sysdba连接后,提示 idle instance,强制杀死 pmon 进程,该节点数据库实例自动重启后,恢复正常,一段时间后,问题再次重现。

该问题已经存在一段时间了,2017年我去后将其解决。

解决办法:将 nproc限制由2048变大后,问题不再发生。
带来的思考:对于 nofile 和 nproc 的资源限制应该定制好基线,避免此类问题再次发生,同时做好相关监控。

6. oracle二进制文件权限问题

The same issue can be caused by wrong permissions of the oracle executable. The problem will be discovered only when the bequeath connection(解释见后) will be executed as another OS user than the one used to startup the database.

The correct permissions should be :
-rwsr-s--x 1 oracle oinstall 187777421 Jun 13 19:16 oracle

and they look like the following when encountering the problem:
-rwxr-xr-x 1 oracle oinstall 187777421 Jun 13 19:14 oracle

关于bequeath connection:

参考文档:
https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/process-architecture.html#GUID-13FE4098-61DF-4D76-882D-551A88E0EBB8

How Oracle Database Creates Server Processes?

The database creates server processes in various ways, depending on the connection methods.

The connection methods are as follows:

  • Bequeath

    • SQL*Plus, an OCI client, or another client application directly spawns the server process.
  • Oracle Net listener

    • The client application connects to the database through a listener.
  • Dedicated broker

    • This is a database process that creates foreground processes. Unlike the listener, the broker resides within the database instance. When using a dedicated broker, the client connects to the listener, which then hands off the connection to the dedicated broker.

When a connection does not use bequeath, the database creates the server process as follows:

  • 1.The client application requests a new connection from the listener or broker.
  • 2.The listener or broker initiates the creation of a new process or thread.
  • 3.The operating system creates the new process or thread.
  • 4.Oracle Database initializes various components and notifications.
  • 5.The database hands over the connection and connection-specific code.

Optionally, if you use of the dedicated broker connection method, then you can pre-create a pool of server processes with the DBMS_PROCESS package. In this case, the Process Manager (PMAN) background process monitors the pool of pre-created processes, which wait to be associated with a client request. When a connection requires a server process, the database skips Steps 2-4 of process creation and performs only Step 5. This optimization improves performance.

6. Oracle 实例异常

如果排除了以上原因,且数据库已经不能正常对外提供服务,那很可能是 Oracle 实例已经确实导常,此时,可以通过 kill PMON 进程对实例强制重启。

-- By 许望(RHCA、OCM、VCP)
最后修改:2023 年 09 月 26 日 11 : 42 AM
如果觉得我的文章对你有用,请随意赞赏