Database recovery looks at reconstruction of a database after a sudden failure. Now that you know how to carry out database backup, lets us learn something about database recovery 😇. As a DBA, You will be in a position necessitating you to carryout a database recovery in order to make the database available to users. Knowing the different database failures will help you much in understanding the type of recovery to be done on the database. Therefore, lets start with looking at these failures.
Types of Database Failures
This occurs when a database operation fails due to the SQL statement issued. This could be because of an invalid syntax, insufficient privileges, non-existence of the object, logic application error, running out of space on tablespace, etc. This can be solved by providing appropriate privileges to users, create views on tables and provide privileges on views rather than tables, increase user’s quota space, DBAs should work with users to provide valid data, working with developers can help in solving application logic errors, intensify user education, etc.
User Process Failure
This failure occurs whenever there is an abrupt termination of the user connection to the instance. The abrupt termination could be from the user closing the SQL*Plus without logging out, application errors that closes the app abruptly without the user first logging out, reboots, etc. The DBA intervention is not needed for this failure. The PMON process periodically checks for user processes to ensure they are still connected. If it finds a disconnected session, it de-allocates the assigned resources from the session as well as rolling back uncommitted transactions.
This failure occurs whenever the instance shuts down without synchronizing the database files to the same SCN (System Change Number) thus requiring a recovery the next time the instance is started. A number of reasons that cause this failure are out of your control as a DBA. They include power outages, server failure, failure of a background process, use of SHUTDOWN ABORT mode, etc. The good news is that Oracle performs the instance recovery automatically the next time the instance is started. The only thing a DBA can do is to tune the time taken by the database to recover from an instance failure. This is done by configuring the FAST_START_MTTR_TARGET parameter to the time you need the database to take when recovering from the instance failure. This time can be between 1 minute to 1 hour.
Depending on your database, different network components may be needed to enable users access and use your database. For example, users may be required to traverse different routers and switches before they access your database. From a network perspective, this configuration provides a number of points where a failure can occur. This could be a failure of the server, failure of the network card, etc. To guard against this, you can provide redundant network path from your clients to the server, ensuring hardware redundant, proper network management, etc.
User Error Failure
Even if users have been trained well on how to disconnect from the database, human nature is human nature. There may be cases when users delete or modify data in tables, delete tables, etc. This is what is known as user error. Although these operations succeed from a statement point of view, they may not be logically correct for example, the DELETE command worked well but you really didn’t want to delete that record. The possible causes for this kind of error could be ignorance of users, malicious actions, high stress levels, carelessness, etc.
If data was deleted, a ROLLBACK command could be used to undo the change if a COMMIT command is not yet issued. If the transaction was already issued, we have a number of options at our disposal. We can use Flashback Queries to recover the deleted records from the database. A flashback query is like a normal query with AS OF TIMESTAMP keyword. It takes reproduces the table as it appeared before deleting the records. We can use INSERT keyword with the flashback query to recover the records. Let us simulate this error and we recover the records using flashback query.
In the snippet above, we begin with confirming there is data in the student table which we afterwards delete and no longer available. We then use a flashback query to show the content of the affected table a day before its data was deleted. We then use an INSERT statement with the flashback query to recover this data. Finally, the data is recovered.
Apart from flashback query, we also have Flashback Table used to recover tables deleted. Flashback Table recovers the deleted table with its content. Whenever a table is deleted, it is placed in a recycle bin in the tablespace. This table stays in the recycle bin for a time determined by the UNDO_RETENTION parameter until it is deleted permanently. To use the Flashback Table, we specify the table name we need to recover and then the keyword TO BEFORE DROP. Lets simulate this.
In the above code snippet, we start with confirming the available tables and locations table is part of the tables. We issue a DROP statement that deletes it which we later confirm that it is no longer available. It is represented with an encrypted wording which shows it is available in the database. We then use the flashback table statement to recover it. We after confirm that it’s recovered with its data.
Therefore, in case of a user error, we can use flashback queries and flashback table to recover the lost data and objects.
This is the last failure and occurs as a result of a loss of one or more database files (data file, control file or redo log file). This could be caused by a failure of a disk, server crash, deletion of the file, renaming of the file, corruption of the file, etc. The RMAN can help you in fixing this problem by recovering the affected files. However, the RMAN only helps in recovering the data file and control file only. The rest of the files can be recovered from the backup copies. That’s why backup is so crucial in database management.
Recovering From Data File Loss
We are going to simulate a media failure by deleting a data file and we try to recover it using the RMAN tool. Lets go ahead and delete one data file from the list of files. To be able to delete this file, we need to first stop the Oracle service from the task manager.
After successfully deleting this file, we can turn on the Oracle service again. Lets try to start to connect to the database.
On trying to connect, we land into an ORA-01033 error. Let us use the RMAN to help us with this error. We use the list failure command to get the problem with the database. RMAN shows us that one of the data files is missing. We can run the advise failure command to get some solutions available in RMAN in fixing this issue.
Let us use the repair failure command to enable RMAN help us in recovering the database. In snippet below, on running the command RMAN prompts you to either accept or reject the recovery. We issued YES to start the recovery process. Once the recovery is done, RMAN prompts if you need the database to be open or stay closed. On accepting, RMAN, opens the database and now it is available to all users.
That is how we can recover from the loss of a data file. We can also see how we can recover from the loss of a control file.
Recovering From Control File Loss
Lets delete the control file manually from the oradata folder like we did when we deleted the data file (first stop Oracle Service). Lets use the RMAN tool to recover the file. RMAN uses the auto backup to restore the control file.
From the above, the control file has been restored. Though that’s not enough to get the database to operation. We need to recover the database by using the command below:
Now we need to recover the database. Lets use the following commands:
Now we need to shutdown the database and startup in mount mode. Afterwards, we need open the database with resetlogs. After that, we can now open the database.
Huff! 😮💨. Now the database is open and available to users. it has been a long way to recover the control file but we finally made it. Congratulations if you managed to recover your database.
The above discussion is about the various database failures and how we can recover from them. Understanding the type of failure helps you in knowing the recovery efforts needed. Hope you’ve learnt how we recover from all the failures. Lets meet in the next section.
1. Database recovery is the reconstruction of the database after a failure.
2. Types of a database failure include user process failure, media failure, instance failure, user error, network failure and statement failure.
3. RMAN tool helps greatly in recovering from media failure and flashback query and table helps in user error failure.