案例记录 --- 永久表空间中不能扩展临时段(ORA-1653)

已索引

该文章已被如下文章引用：

Oracle临时表空间

已索引临时表空间的作用临时表空间是用来存放临时段的，大体说来，有两大作用：1、排序；2、存放临时表如下这些操作会用...

故障描述

EDI 跑数做拉链表运行如下语句时报错 ORA-1653 错误。
INSERT into X SELECT …… [GROUP BY ……];

在数据库 alert 日志中也看到相关的报错信息如下：

Mon Dec 25 03:57:18 2023
ORA-1652: unable to extend temp segment by 8 in tablespace   TBS_HSDM_DATA

Mon Dec 25 04:18:44 2023
ORA-1653: unable to extend table EDI.F_NHX_KNA_DPAC_HL by 64 in    tablespace TBS_CFOP_DATA 
ORA-1653: unable to extend table EDI.F_NHX_KNA_DPAC_HL by 4096 in   tablespace TBS_CFOP_DATA

使用 oerr 命令可以查看到，这儿的数字是指 blocks。如 8 代表 8个 blocks。
另外，注意这儿的 TBS_CFOP_DATA 表空间是永久表空间，并非临时表空间。
这儿的 ORA-1652 报错跟语句中的 group by 没有关系（实际上并没有 group by 语句，这儿写出来只是为了对其引起注意）。

那现在就有两个问题：

报错中为什么是 "temp segment"？
TBS_CFOP_DATA 表空间空间是否充足？

为什么是 "temp segment"？

在 Doc ID 94178.1 (ORA-01652: in Tablespace Other than Temporary During ALTER INDEX REBUILD) 中，提到重建索引的过程中会用到两种临时段：

During an index rebuild, there are two types of temporary segments involved.

First, there are the temporary segments that are used to store partial sort data when the SORT_AREA_SIZE is too small to process the complete sort set. These segments are built in the user's default TEMPORARY tablespace.

Second, as the index is being rebuilt, it uses a segment which is defined as a temporary segment until the rebuild is complete. Once this segment is fully populated, the old index can be dropped and this temporary segment is redefined
as a permanent segment with the index name.

The error you are seeing is probably due to there being insufficient room in the index's tablespace to hold both the original index and the new version concurrently. The new version of the index, currently a temp segment, will be in the tablespace where the index is required.

使用第二种临时段的还有如下一些操作：
create index
create tables ... as select ...
insert into ... as select ....
create pk constraint
enable constraint
move table

表空间 TBS_CFOP_DATA 为什么空间不足？

下面查看下 TBS_CFOP_DATA 表空间当前的 free 空间大小：

select sum(bytes)/1024/1024/1024 as "size(G)"
from dba_free_space 
where tablespace_name = 'TBS_HSDM_DATA';

   size(G)
----------
200.732361

看一下这些 free 空间的分布情况：

SQL> select FILE_ID,sum(BYTES)/1024/1024/1024 as "SIZE(G)" 
from  dba_free_space t  
where TABLESPACE_NAME='TBS_CFOP_DATA' 
group by FILE_ID;

   FILE_ID    SIZE(G)
---------- ----------
       325 11.2111816
        13 19.8383789
        14 17.0698853
        20 17.4968262
        17 23.6897583
       301 15.6937256
        18 19.7184448
        15  22.697937
        16 19.2509155
        19 24.0870361
       324 9.97827148

12 rows selected.

可以发现每个数据文件都有较大的剩余空间。

再来用平时查询表空间使用率的语句来看一下该表空间的剩余大小：

TABLESPACE_NAME                  TOTAL_MB    USED_MB USED_PCT                                     FREE_MB
------------------------------ ---------- ---------- ----------------------------------------- ----------
TBS_HSDM_DATA                      419840 214290.063 51.04%                                    205549.938

可见表空间的剩余空间是很大，一般说来，不应该出现空间不足的问题。

那问题出现在哪儿呢？上面的查询只是查询的总剩余空间是否足够，但是表空间是存在碎片的，上面free空间的总值是由很多不连续的空间组成的。

我们看一下为该表分配区的历史记录：

SQL> select EXTENT_ID,BYTES/1024/1024 MB from dba_extents where SEGMENT_NAME='F_NHX_KNA_DPAC_HL' order by EXTENT_ID

 EXTENT_ID         MB
---------- ----------
         0       .125
         1      .0625
……
        14      .0625
        15          1
……
        77          1
        78          8
……注意并不是说会越来越大，中间出现的数字有2，52，64，43，21，2，64，32，7，24.5 等
……所谓的自动分配，就是完全由 Oracle 来决定是单 extent 的大小

       529         64
……
       660         64

661 rows selected.

可见，该表空间使用的是区大小自动管理的方式，大需要大量空间时，Oracle 分配 extent 时已经趁向一个 extent 64M 了(extent 是 oracle 最小的分配单位)。

DBA_FREE_SPACE describes the free extents in all tablespaces in the database.
我们看一下该表空间最大的一个 free 空间的大小：

select max(bytes)/1024/1024/1024 as "size(G)"
from dba_free_space 
where tablespace_name='TBS_CFOP_DATA';
 
   size(G)
----------
.001586914

大小只有1M多，Oracle 尝试分配 64M 的 extent时，出现 ORA-1652。

增加一个新的数据文件后，重新跑批后，该表空间最大的一个 free 空间的大小为：

SQL> select max(bytes)/1024/1024/1024 as "size(G)"
     from dba_free_space
     where tablespace_name='TBS_CFOP_DATA';

   size(G)
----------
.936523438

解决办法

临时解决办法是增加一个新的数据文件，但是后面问题会再现，且现有监控不能提前发现问题。
根本的解决办法是整理表空间碎片，将这些碎片化的 free 变成连续的，整理碎片的方法可以参考 Doc ID 1264470.1 或者下面的文章，简单地说就是 shrink 表、move 表或者通过 expdp/impdp 重建。

表空间碎片、表碎片、索引碎片

已索引该文章已被如下文章引用：表空间碎片、表碎片、索引碎片本质上是相通的，因为表空间里面放的就是表和索引，表空间的...
如果无法进行碎片整理，可以针对报错的表空间增加一个监控，对 dba_free_space 中该表空间的 max free 值进行监控。
但是，如果想对所有数据库实施该监控不现实，因为其它数据库的 max free 值可能只有几M也没有问题，这跟数据库的活动特征有关，它如果没有经常性的大量数据写入或更新，那碎片就很容易能用上。
建议 Oracle 表空间使用 uniform (1M) 的 extent 管理方式。