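Last night I pulled an all-nighter restoring a 1 TB database. The restore succeeded at 6 a.m. and I started a backup right away. I went home to sleep at 7 and got up at 3 in the afternoon; perhaps that is simply a DBA's life. I meant to write up the recovery now, but I am sleepy again, so I will write it properly tomorrow.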
Updated @ 2010-12-07
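I. System background:
Hardware: P690_1 and P670_1
Operating system: AIX 5.2 ML04
Database: Oracle 9.2.0.1.0, 2-node RAC HA
Standby P570 (not Data Guard): works on a principle similar to Data Guard, applying the production database's archived logs on the P570
Time of failure: 2010-11-23 13:20
II. Symptoms:
At 13:19 the database reported that the SYSTEM tablespace was out of space and could not be extended. Because the system uses RAW devices, datafiles cannot autoextend; an administrator has to add a RAW device file by hand.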
    Tue Nov 23 13:18:41 2010
    Completed: alter database backup controlfile to '/p690_1_arch
    Tue Nov 23 13:19:20 2010
    ORA-1653: unable to extend table WWW.MLOG$_WWW_MONEY_T by 8192 in tablespace SYSTEM
    ORA-1653: unable to extend table WWW.MLOG$_WWW_MONEY_T by 8192 in tablespace SYSTEM
    ORA-1653: unable to extend table WWW.MLOG$_WWW_MONEY_T by 8192 in tablespace SYSTEM
    ……
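A burst of ORA-1653 entries like the one above is easy to miss in a long alert log. The following is a minimal sketch, not part of the original incident handling, of how such entries could be counted per tablespace and segment. The function name `count_extend_failures` and the regular expression are my own assumptions, modeled only on the message format shown in the excerpt.

```python
import re
from collections import Counter

# Matches lines such as:
# ORA-1653: unable to extend table WWW.MLOG$_WWW_MONEY_T by 8192 in tablespace SYSTEM
# The pattern is an assumption derived from the excerpt above.
EXTEND_FAILURE = re.compile(
    r"ORA-0*1653: unable to extend table (?P<segment>\S+) "
    r"by (?P<blocks>\d+) in tablespace (?P<tablespace>\S+)"
)


def count_extend_failures(alert_log_text):
    """Count ORA-1653 entries per (tablespace, segment) pair."""
    counts = Counter()
    for match in EXTEND_FAILURE.finditer(alert_log_text):
        counts[(match.group("tablespace"), match.group("segment"))] += 1
    return counts


# Sample text copied from the alert log excerpt above.
sample = """\
Tue Nov 23 13:19:20 2010
ORA-1653: unable to extend table WWW.MLOG$_WWW_MONEY_T by 8192 in tablespace SYSTEM
ORA-1653: unable to extend table WWW.MLOG$_WWW_MONEY_T by 8192 in tablespace SYSTEM
ORA-1653: unable to extend table WWW.MLOG$_WWW_MONEY_T by 8192 in tablespace SYSTEM
"""

for (tablespace, segment), n in count_extend_failures(sample).items():
    print(f"{tablespace}: {segment} failed to extend {n} time(s)")
```

Run periodically against the alert log, a check of this kind could raise a warning before users notice the hang, though on RAW devices the fix itself still has to be done by hand.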
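The customer's engineer created a new RAW device on node P670_1 and added it to the SYSTEM tablespace:
RAW device: /dev/rwwwdb_system04
SQL>alter tablespace system add datafile '/dev/rwwwdb_system04' reuse;
The system then reported errors: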
    ORA-01157: cannot identify/lock data file 424 - see DBWR trace file
    ORA-01110: data file 424: '/dev/rwwwdb_system04'
    ORA-27037: unable to obtain file status
    IBM AIX RISC System/6000 Error: 2: No such file or directory
    Additional information: 3
    Tue Nov 23 14:17:03 2010
    Errors in file /oracle/ora92/admin/wwwdb/bdump/wwwdb1_dbw0_516116.trc:
    ORA-01186: file 424 failed verification tests
    ORA-01157: cannot identify/lock data file 424 - see DBWR trace file
    ORA-01110: data file 424: '/dev/rwwwdb_system04'
    Tue Nov 23 14:17:03 2010
    File 424 not verified due to error ORA-01157
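The ORA-27037 entry reports an operating-system "No such file or directory" for the device path. A minimal sketch of a pre-check that could be run on each RAC node before a raw device is added to a tablespace is shown here. It is my own illustration, not something the original procedure included; the helper name `is_char_device` is hypothetical, and the check only confirms that the path exists as a character special file on the node where it runs.

```python
import os
import stat


def is_char_device(path):
    """Return True if path exists and is a character special file."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path or no permission to stat it: the OS-level
        # condition reported alongside ORA-27037 in the log above.
        return False
    return stat.S_ISCHR(mode)


# Device path taken from the log above; run this on every node.
device = "/dev/rwwwdb_system04"
print(device, "is a character device:", is_char_device(device))
```

A `False` result on any node would be a reason to stop and sort out the device before touching the tablespace.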
Checking the datafile with dbv reported a corrupt block:
    DBVERIFY - Verification starting : FILE = /dev/rwwwdb_system04
    Page 1 is influx - most likely media corrupt
    Corrupt block relative dba: 0x00000001 (file 0, block 1)
    Fractured block found during dbv:
    Data in bad block:
     type: 0 format: 0 rdba: 0x00000000
     last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
     spare1: 0x0 spare2: 0x0 spare3: 0x0
     consistency value in tail: 0x00000000
     check value in block header: 0x0
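By this point the database was hung because the SYSTEM tablespace was full. An attempt to take the failed datafile offline, drop it, and then restart the instance produced errors:
SQL>ALTER DATABASE DATAFILE '/dev/rwwwdb_system04' offline drop;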
    Tue Nov 23 14:56:18 2010
    ORA-1541 signalled during: ALTER DATABASE DATAFILE '/dev/rwwwdb_system04' off…
    This instance was first to open
    ORA-1147 signalled during: alter database open…
    ORA-01122: database file 424 failed verification check
    ORA-01110: data file 424: '/dev/rwwwdb_system04'
    ORA-01251: Unknown File Header Version read for file number 26
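Do not let [ID 422031.1] "ORA-01122, ORA-01251 or data blocks reported as corrupted" steer you; heading in that wrong direction costs a lot of time.
Time: 16:00
Because the data volume is large, the production database could not be recovered in a short time, so:
Action: start the P570 standby database
Result: the P570 standby failed to start
Cause: an error had occurred on October 19 when the datafile /dev/rphoto1026 was added, so the standby itself needed recovery.
Time: 16:20
Action: restore from the production system's RMAN backups
Result: recovery not possible
Cause: the scheduled 8 p.m. nightly backup failed on the 14th because the backup destination ran out of space, and no backup ran after that. Archived logs are kept for 7 days, so restoring from the last backup taken before the 14th leaves a one-day gap in the archived logs.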
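The failure mode here is a last good backup that is older than the oldest archived log still on disk. The arithmetic can be sketched as follows. This is an illustration under stated assumptions, not the site's actual monitoring: the function name `archive_log_gap` is hypothetical, and the timestamps are invented to produce a one-day gap rather than taken from the incident.

```python
from datetime import datetime, timedelta


def archive_log_gap(last_good_backup, oldest_archive_kept):
    """Return the span of redo not covered by any retained archived log.

    A zero timedelta means the retained logs reach back to the backup,
    so the backup can be rolled forward without a hole.
    """
    gap = oldest_archive_kept - last_good_backup
    return max(gap, timedelta(0))


# Hypothetical timestamps, chosen only to illustrate a one-day gap.
last_good_backup = datetime(2010, 11, 15, 20, 0)
failure_time = datetime(2010, 11, 23, 20, 0)
retention = timedelta(days=7)  # archived log retention from the text

oldest_archive_kept = failure_time - retention
gap = archive_log_gap(last_good_backup, oldest_archive_kept)

if gap == timedelta(0):
    print("backup can be rolled forward")
else:
    print(f"missing {gap} of archived logs")
```

A daily check like this, alerting whenever the gap is non-zero, would have flagged the problem on the first day the retention window slid past the last successful backup.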
