This month I am stealing a blog concept from a fellow blogger named Floyd Teter. A short time ago Floyd, an Oracle ACE Director and substantial blogger, blogged about mistakes made and lessons learned. I generally enjoy reading about the mistakes others make since those mistakes are good reminders that we need to be dilligent with our work as well as having a good sense of humor about it much of the time. With this in mind, I am going to share a recent mistake I made, since it drove my pucker factor into outer space.
My story begins after having completed several upgrades to an E-Business Suite environment on the way to a production upgrade that is a few weeks out. Now, I love to automate things, since I’m not a good typist. So, while completing a series of changes that needed to be input into the latest instance to bring it current with the initial testing instance, I took a few scripts that I had previously automated and thatched them together in an effort to reduce the number of manual steps I needed, as well as to improve performance. I pride myself on being very meticulous, but in this case I was a little over zealous.
The problem…
A script I used to automate the start and stops of the complete applications technology stack and database was inserted into this “master” script that I was using for efficiency reasons. So, the script was executed, and while the system was on its way up I noticed an error at the database mount stage. Now for those of you that are newer DBAs, a database comes up in 3 distinct stages (described below very simplistically):
NOMOUNT => this stage reads the init.ora file, allocates the shared memory structures and creates background processes
MOUNT => this stage reads the control file(s) for the names of its datafiles and redo logs
OPEN => this stage opens the online database datafiles and redo logs and checks them for consistency with the control files.
The Oracle Database Server will raise an error if any of the database files are out of sync with the control file, or vice versa, at which point the DBA will need to determine the cause and/or solution to the problem. So, with the preceeding explanation in mind, I was starting the database when I saw an error roll up on the screen during the mount stage. The error:
ERROR at line 1:
ORA-00201: control file version 10.2.0.0.0 incompatible with ORACLE version 9.2.0.0.0
ORA-00202: control file: ‘/u02/oradata/appststdata/cntrl01.dbf’
The error puzzled me since I had definitely completed the cutover from the 9i database to the 10g database weeks ago. In fact, I had already started and stopped the system several times in the intervening weeks. So I issued the shutdown immediate command so I could check the alert log. Hmmm, I didn’t find any entries in the alert log. Not one in the last few days? What was going on? Again, for those of you new DBAs, the alert log registers most Oracle database errors, which is why we want to look there first. And the fact that there were no entries in the last few days was concerning because the alert log records database start and stop information, which we have just done a few times. So I decided to start the database up again to see what would happen.
sqlplus /nolog
conn / as sysdba
startup mount
Again, I got the same error. What is the deal??? Then, in a flash of brilliance I remembered the script I used to start the system. Yep, the script had the 9.2.0 path code for the startup. Cool, now we were good since I knew what the problem was. At this point, I assumed that I had corrupted the control file in some way, so I decided to recreate it. Here is what I saw:
SQL> @noresetlog_appstst_ctrl_file.sql
ORACLE instance started.
Total System Global Area 612368384 bytes
Fixed Size 2071072 bytes
Variable Size 427820512 bytes
Database Buffers 167772160 bytes
Redo Buffers 14704640 bytes
CREATE CONTROLFILE REUSE DATABASE “APPSTST” NORESETLOGS NOARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-01130: database file version 10.2.0.0.0 incompatible with ORACLE version
9.2.0.0.0
ORA-01110: data file 1: ‘/u02/oradata/appststdata/system01.dbf’
Oh #%*&!!! Now, if you have never had one of these moments, imagine jumping into the Arctic Ocean in just your birthday suit! Yep, that bad. What now?
My personal preference is to start the Service Request (SR) with Oracle Support so that I have 2 working paths and can hopefully get the issue resolved faster. So after logging the SR, I headed back to troubleshooting. I could rule out an application issue for obvious reasons, so I needed to go through my database troubleshooting steps:
-
Log completely out for fresh start => successful
-
Log in with db user => successful
-
echo $ORACLE_HOME => correct home is displayed
-
check init.ora file for 9.2.0 references => no references
-
Alert log? => not helpful
-
opatch lsinventory => Hmmmm… interesting error
Finally, an error to check out. I did not post the specific error here for the sake of brevity. But, rest assured it was interesting. The specific “aha” for this error showed that the current database was not listed in the oraInventory. Well, I know that this was not correct, because I had specifically rebuilt the inventories due to an oraclehomeproperties.xml file error when I laid down the 10g database initially. So, the next step was to look at the orainst.loc file to see where it was directing oraInventory transactions. Yep, it was pointed to the old location from before the rebuild. This is an easy fix. Yes, repointing that file fixed the opatch lsinventory error I received previously. And, yes, I could then confirm that the database was upgraded successfully because the component versions all match.
Having successfully connected and retrieved inventory information gave me confidence to try starting the database again. Dog gone it… Another error. But this one stated that the “compatible” version was set incorrectly. Hmmm, how is that possible since I already checked the init.ora file in previous steps? Ohhhhhhhhhh, wait a minute. When you do an upgrade from 9i to 10g the dbua will convert your database to use an spfile rather than the init.ora pfile. As a quick side bar; the spfile is short for server parameter file and houses the initialization parameters inside the database, whereas a pfile is a file system file that contains the initialization parameters. While both of these files fulfill the same function, they do so in very different fashions.
NOTE*** I’d be happy to cover more intracacies of these two files in a future blog if anyone is interested ***
Although the spfile is binary in nature there is a “readable” file located in the $ORACLE_HOME/dbs directory. A quick check of that file showed:
*.compatible=’9.2.0′
Eureka!!!! This means that I was in the ballpark. But, there is some trickery involved with an 11i applications database versus a regular database. The question becomes, “how do you start your 11i database?” I used to just log into the database and issue the startup command. Which, coincidentally enough, is how my scripts started the database as well. Now, the trickery for those of you Applications DBAs out there, is in how you start your databases. Autoconfig-enabled databases have an appsutil directory pushed from the Apps Tier to the Database Tier that contains Autoconfig-related files and scripts. Now, in this pathing there is a directory of scripts that start and stop your 11i applications database. The full directory for most installations would be:
$ORACLE_HOME/appsutil/scripts/<SID>_<HOST>
This directory contains three scripts related to the starting and stopping of the database:
addbctl.sh => this is the master script generated by Autoconfig for starting and stopping the 11i database. It takes input passed from the command line and then executes one of two additional sql script listed below
adstrtdb.sql => this Autoconfig-generated sql script is called from the “master” script and starts the 11i database
adstopdb.sql => this Autoconfig-generated sql script is called from the “master” script and stops the 11i database
So, where is the trickery? Well, the start script (adstrtdb.sql) starts the database using a pfile as opposed to a spfile!
Aha!!!! The specific syntax is: ### where the $ORACLE_HOME is the full path not variable ###
startup pfile=$ORACLE_HOME/dbs/init.ora
Allright, then executing the following command should fix my issue:
addbctl.sh start
Crossing my fingers…. I saw memory allocation and background processes… It mounted… It opened! Whew!!! Crisis averted!
And there it is, my most recent embarassing moment, where not paying attention nearly cost me! In the end, we all make mistakes, but hopefully we don’t make the same ones twice. Now that you know my mistake, and what I worked through to fix it, maybe you can avoid the same one. And, if you can’t avoid it I’ll settle for you being able to fix it afterwards with a lot less lost time than I did. That being said, anyone else in blog land have an interesting mistake to share?
If you enjoyed this post, make sure you subscribe to my RSS feed!
2 responses so far ↓
1 admin // Oct 1, 2008 at 7:31 am
Lon, there were several places where the database path was pointing to the older version (9i). Do you plan to modify your scripts in the future to catch all those inconsistencies?
2 Floyd // Oct 1, 2008 at 10:51 pm
Well, I guess the upside is that things like this keep us humble…
Leave a Comment