已索引

切换前检查:

Failover 到物理备库

总体思想:
尽可能的将主库的日志传输到备库,然后 CANCEL REDO APPLYING ---> STOP REDO APPLYING ---> SWITCH TO PRIMARY。
如果有日志缺失不能解决,则将备库强制激活。

Step 1 :将主库的日志传输到备库(主库如果可以mount的话)

如果主库能启动到 mount 阶段,在主库上执行如下命令将未发送到 standby 的 archived log 和 online redo log 发送到 standby:

SQL> ALTER SYSTEM FLUSH REDO TO STANDBY_DB_UNIQUE_NAME;

如果该步操作能够正常完成,则可以零数据丢失。正常,go to setup5。
如果不能 mount,或者上面的命令有任何报错,或者因为不想等待中断了上面的命令,go to step2。

Step 2 :保证备库拥有主库最新的归档日志(each thread)

在 standby 上通过如下查询,查询到最高的日志序列号及缺失的日志:

SQL>  SELECT UNIQUE THREAD# AS THREAD, MAX(SEQUENCE#) OVER (PARTITION BY thread#) AS LAST from V$ARCHIVED_LOG;

    THREAD     LAST
---------- ----------
     1           100
     2           101
     
SQL> SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE# FROM V$ARCHIVE_GAP;

THREAD#    LOW_SEQUENCE# HIGH_SEQUENCE#
---------- ------------- --------------
         1            90             92     

LOW_SEQUENCE#, HIGH_SEQUENCE# 分别表示缺少的最小和最大归档日志序号。

尽最大可能将缺失的日志从 primary 传输到 standby,然后使用如下命令进行注册:

SQL> ALTER DATABASE REGISTER PHYSICAL LOGFILE 'filespec1';

重复上面的步骤直至所有日志缺失都得到解决。如果不能解决所有缺失的日志,则 Failover 会丢失数据。

Step 3 :Stop Redo Apply.

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

Step 4 :Finish applying all received redo data.

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;

如果命令没有报错,proceed to Step 5.
如果有报错,尝试解决造成报错的原因,再次执行上面的命令,直至成功执行,然后 proceed to Step 5.

如果报错不能得到解决(例如日志缺失不能修复),此时只能将备库强制激活为主库(with some data loss) :

SQL> ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE;
SQL> ALTER DATABASE OPEN;

Step 5 :检查 standby 已经准备好成为主库

SQL> SELECT SWITCHOVER_STATUS FROM GV$DATABASE;

SWITCHOVER_STATUS
-----------------
TO PRIMARY
SESSIONS ACTIVE

Step 6 : 将 standby 切换为 primary

SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;

Step 7 :打开新主库

SQL> ALTER DATABASE OPEN;

新主库打开后,就可以在其它备库上开启日志应用了。此外,Oracle 建议对新主库做一个全备。

将故障的主库转换为备库(基于闪回)

对于故障的主库,可以基于 Flashback Database 或者 RMAN 转换为备库,或者重建。

官方文档:
Data Guard Concepts and Administration:
3 Creating a Physical Standby Database
3.2 Step-by-Step Instructions for Creating a Physical Standby Database
13 Data Guard Scenarios
13.2 Converting a Failed Primary Into a Standby Database Using Flashback Database
13.7 Converting a Failed Primary into a Standby Database Using RMAN Backups

Oracle 建议主库开启闪回数据库,通过闪回数据库将失败的主库转换为备库是最快速的方式(13.2)。
如果主库没有开启闪回,仍然可以通过本地备份将失败的主库转换为备库(13.7)。
或者,基于新主库的备份重建备库(3.2)。

这儿是基于闪回将原主库转换为备库的步骤。

Step 1:检查原备库切换为主库时的 SCN。

在新主库上执行:

SQL> SELECT TO_CHAR(STANDBY_BECAME_PRIMARY_SCN) FROM V$DATABASE;

Step 2:闪回失败的原主库

Shut down the old primary database (if necessary), mount it, 将其闪回到上面查询的 SCN。

[xoracle]$ srvctl stop database -d DB_UNIQUE_NAME
SQL> STARTUP MOUNT;
SQL> FLASHBACK DATABASE TO SCN standby_became_primary_scn;

Step 3:将原主库转换为备库

在原主库上执行:

SQL> ALTER DATABASE CONVERT TO PHYSICAL STANDBY; (对于RAC,只能有一个节点启动到 mounted 阶段)
This statement will dismount the database after successfully converting the control file to a standby control file.

关闭并重新挂载数据库:

SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP MOUNT;

Step 4:开始传输日志到新备库

在新主库上执行如下命令:

SQL> SELECT DEST_ID, DEST_NAME, STATUS, PROTECTION_MODE, DESTINATION, ERROR, SRL FROM V$ARCHIVE_DEST_STATUS;  <--- 查看归档路径的状态

SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_n=ENABLE; <--- 如果需要的话

SQL> ALTER SYSTEM SWITCH LOGFILE;  <--- 切换日志确保备库能接收到新主库的日志

SQL> SELECT DEST_ID, DEST_NAME, STATUS, PROTECTION_MODE, DESTINATION,ERROR, SRL FROM V$ARCHIVE_DEST_STATUS;  <--- 查看归档路径的状态

On the new standby database, you may also need to change the LOG_ARCHIVE_DEST_n initialization parameters so that redo transport services do not transmit redo data to other databases.

Step 5:在新备库上开启 Redo Apply

在新备库上执行如下命令:

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;

Redo Apply automatically stops each time it encounters a redo record that is generated as the result of a role transition, so Redo Apply will need to be restarted one or more times until it has applied beyond the SCN at which the new primary database became the primary database. Once the failed primary database is restored and is running in the standby role, you can optionally perform a switchover to transition the databases to their original (pre-failure) roles.

-- By 许望(RHCA、OCM、VCP)
最后修改:2024 年 09 月 19 日 05 : 47 PM
如果觉得我的文章对你有用,请随意赞赏