客戶反應對錶做delete非常緩慢,sql很簡單,也有走索引.
但是每次執行要1000多秒,單獨的select count(1) from GEMINI_V3.prod_repair_sheet_detail2則非常快。
1、檢視表的資訊:
表不大,本身也有索引,而且索引都不大,也沒有lob欄位。
2、monitor分析
語句跑起來,看來絕大部分在CPU上。這個很詭異,表本身不大,物理IO也不高,很快才是。
觀察CPU的使用情況,14167就是delete的對談程序,佔了一個核的CPU, 使用基本100%。
3、等待事件
直接看下對談的等待情況:
SID S# SPID USERNAME MACHINE SQL_ID PREV_SQL_ID EVENT P123 P123TEXT WT LAST_CALL_ET SECONDS_IN_WAIT BLOCKING
------ ------- ---------- ---------- ---------------------- ------------------ ------------------ ------------------------------ ------------------ ----------------------------------- ---- ------------ --------------- ------------
997 57 14167 SYS database 5x59cnqhb93x3 32j3k52kwag60 Disk file operations I/O 2/3/2 FileOperation/fileno/filetype -1 18 17 :
wait 40s
SID S# SPID USERNAME MACHINE SQL_ID PREV_SQL_ID EVENT P123 P123TEXT WT LAST_CALL_ET SECONDS_IN_WAIT BLOCKING
------ ------- ---------- ---------- ---------------------- ------------------ ------------------ ------------------------------ ------------------ ----------------------------------- ---- ------------ --------------- ------------
997 57 14167 SYS database 5x59cnqhb93x3 32j3k52kwag60 Disk file operations I/O 2/3/2 FileOperation/fileno/filetype -1 59 58 :
Disk file operations I/O等待事件分析,竟然是大量undo表空間的IO。表本身並不大,為什麼需要那麼多undo?
SELECT distinct filetype_id, filetype_name FROM DBA_HIST_IOSTAT_FILETYPE where FILETYPE_ID=2
FILETYPE_ID FILETYPE_NAME
----------- ------------------------------
2 Data File
SELECT name FROM v$datafile WHERE file#=3;
NAME
------------------------------------------
/home/app/oradata/genvictdb/undotbs01.dbf
4、10046分析
沒思路了,做個10046跟蹤下。很有意思,有一條「 select /*+ all_rows */ count(1) from "GEMINI_V3"."PROD_REPAIR_TEST_RESULT" where "SHEET_DETAIL_ID" = :1」。
為什麼刪除A表的資料,要查B表的資料?難不成有外來鍵?!!
*** 2020-10-19 14:52:15.015
WAIT #139665416019160: nam='SQL*Net message from client' ela= 51654897 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1603090335015370
CLOSE #139665416019160:c=0,e=17,dep=0,type=0,tim=1603090335015536
=====================
PARSING IN CURSOR #139665416019160 len=103 dep=0 uid=0 oct=7 lid=0 tim=1603090335018120 hv=2696187811 ad='a37e19770' sqlid='5x59cnqhb93x3'
delete from GEMINI_V3.prod_repair_sheet_detail2 where sheet_id = 'a8c8f190-d078-4cf2-b996-d6848f6b7ba9'
END OF STMT
PARSE #139665416019160:c=1999,e=2502,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=854670735,tim=1603090335018119
WAIT #139665416019160: nam='Disk file operations I/O' ela= 119 FileOperation=2 fileno=3 filetype=2 obj#=-1 tim=1603090335018949
=====================
PARSING IN CURSOR #139665416121272 len=104 dep=1 uid=0 oct=3 lid=0 tim=1603090335020122 hv=3610118026 ad='a47f9e058' sqlid='830yptrbkw1wa'
select /*+ all_rows */ count(1) from "GEMINI_V3"."PROD_REPAIR_TEST_RESULT" where "SHEET_DETAIL_ID" = :1
END OF STMT
PARSE #139665416121272:c=1000,e=658,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=0,tim=1603090335020121
BINDS #139665416121272:
Bind#0
oacdty=01 mxl=128(50) mxlc=00 mal=00 scl=00 pre=00
oacflg=12 fl2=0001 frm=01 csi=852 siz=128 off=0
kxsbbbfp=7f066383e2f0 bln=128 avl=32 flg=05
value="2a9411d563b544ab9a4e116bf3784f53"
EXEC #139665416121272:c=1000,e=982,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=1814594818,tim=1603090335021254
*** 2020-10-19 14:52:15.194
FETCH #139665416121272:c=172974,e=173229,p=0,cr=11430,cu=0,mis=0,r=1,dep=1,og=1,plh=1814594818,tim=1603090335194518
STAT #139665416121272 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=11430 pr=0 pw=0 time=173232 us)'
STAT #139665416121272 id=2 cnt=0 pid=1 pos=1 obj=88515 op='TABLE ACCESS FULL PROD_REPAIR_TEST_RESULT (cr=11430 pr=0 pw=0 time=173212 us cost=1989 size=33 card=1)'
CLOSE #139665416121272:c=0,e=3,dep=1,type=3,tim=1603090335194819
BINDS #139665416121272:
Bind#0
oacdty=01 mxl=128(50) mxlc=00 mal=00 scl=00 pre=00
oacflg=12 fl2=0001 frm=01 csi=852 siz=128 off=0
kxsbbbfp=7f066383e2f0 bln=128 avl=32 flg=05
value="614ee7ac0a8a462aae931d56d5d7b7ae"
EXEC #139665416121272:c=0,e=121,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,plh=1814594818,tim=1603090335195298
FETCH #139665416121272:c=90986,e=91291,p=0,cr=11429,cu=0,mis=0,r=1,dep=1,og=1,plh=1814594818,tim=1603090335286616
CLOSE #139665416121272:c=0,e=5,dep=1,type=3,tim=1603090335286826
BINDS #139665416121272:
Bind#0
oacdty=01 mxl=128(50) mxlc=00 mal=00 scl=00 pre=00
oacflg=12 fl2=0001 frm=01 csi=852 siz=128 off=0
kxsbbbfp=7f066383e2f0 bln=128 avl=32 flg=05
value="5fdd2fa148cb4c8ea403971ddcfb93e0"
EXEC #139665416121272:c=0,e=153,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,plh=1814594818,tim=1603090335287401
FETCH #139665416121272:c=89986,e=89983,p=0,cr=11429,cu=0,mis=0,r=1,dep=1,og=1,plh=1814594818,tim=1603090335377440
5、外來鍵分析
很顯示GEMINI_V3.PROD_REPAIR_TEST_RESULT是子表,GEMINI_V3.PROD_REPAIR_SHEET_DETAIL2是父表。而子表在外來鍵欄位SHEET_DETAIL_ID上並沒有建立索引。
原因很清晰了,由於子表外來鍵欄位沒有索引。導致父表每一行的delete,子表都要掃描一次。父表1203150行 * 子表748073行的掃描,當然非常慢!
從AWR中也可以看到這張表的邏輯讀非常高。這也造成我前面從執行計劃、monitor都看起來正常,分析上走不少彎路。
父表和子表都不大,物理IO一次後讀到資料庫buffer中,在記憶體中進行多次掃描,造成了大量邏輯讀!
子表的外來鍵欄位必須建索引!!
建立索引後,速度有質的飛越
14:52:13 sys@GENVICTD > delete from GEMINI_V3.prod_repair_sheet_detail2 where sheet_id = 'a8c8f190-d078-4cf2-b996-d6848f6b7ba9';
11256 rows deleted.
Elapsed: 00:16:51.89
15:09:06 sys@GENVICTD > rollback;
Rollback complete.
Elapsed: 00:00:00.50
15:16:58 sys@GENVICTD >
15:17:00 sys@GENVICTD >
15:17:00 sys@GENVICTD > delete from GEMINI_V3.prod_repair_sheet_detail2 where sheet_id = 'a8c8f190-d078-4cf2-b996-d6848f6b7ba9';
11256 rows deleted.
Elapsed: 00:00:00.92