Teach Yourself Oracle 8 In 21 Days
Teach Yourself Oracle 8 In 21 Days
- Day 17 -
Recovering the Database
Yesterday you learned how to back up your database; today you will learn how to recover your database in the event of an emergency. Several types of failures can necessitate a recovery:
- An instance failure necessitates an instance recovery. As long as no permanent
hardware failures have occurred, the instance recovery will be automatic and complete.
Enabling parallel recovery improves performance.
- A hardware failure that results in the loss of a datafile necessitates a media
recovery. This involves both recovering the lost data and performing an instance
recovery (if necessary). This process will be explained in the section titled "Media
- A user error that results in the loss of data might necessitate a point-in-time recovery, which allows you to recover up to a certain point before the failure occurred. This can be very useful, but very problematic. The point-in-time recovery is detailed in the section titled "The Point-in-Time Recovery."
No matter what type of recovery is required, the important task of recovery should be performed as quickly as possible. If a failure requiring recovery occurs, many users will have little to do until the recovery is complete. But even though the recovery operation must be completed quickly, you should not rush. Any mistakes during the recovery will only delay the resumption of normal operations.
Recovery Versus Restoration
Today you will learn about recovery and restoration operations. These two concepts might sometimes be thought of as the same, but they are actually quite different:
- Recovery is the act of bringing the database back to where it was an instant
before the failure, and refers to the process whereby Oracle rolls forward committed
transactions and rolls back noncommitted transactions. Recovery is automatic.
- Restoration is the act of replacing a datafile with a backup copy. You can restore a database only if it is not running in ARCHIVELOG mode and has overwritten redo log files. Restoration is a manual operation requiring operator intervention (either by hand or via the graphical administration tools).
Recovering Your Database
Despite how reliable hardware has become, there are still occasions when the system might fail (perhaps because of a component failure or a power failure). If the failure causes no data loss, Oracle can recover itself; this is referred to as an instance recovery. If data has been lost, data must be restored from a previous backup and recovered; this is referred to as media recovery. If some event, such as an accidental deletion of a table, has occurred, a point-in-time recovery must be performed to avoid recovering the table-drop statement and repeating the mistake.
New Term: When an instance failure occurs, the Oracle recovery process can completely recover the database upon instance startup. All transactions that were committed at the time of the failure will be recovered, or rolled forward, and all transactions that were in process (also known as in-flight transactions) will be rolled back.
NOTE: Instance recovery can be quite time consuming. How much time instance recovery takes depends on the number of dirty buffers in the SGA. The number of dirty buffers depends on how much time has passed since the last checkpoint and the number of data modifications.
The instance-recovery process is automatic. When the instance is started, the startup process examines the datafiles and redo log files to determine whether the instance was properly shut down. At this point, the redo log is read and the affected transactions are rolled forward or back. If the checkpoint launched by the last log switch was completed, the transactions in fewer than one log file will require recovery.
TIP: If you tune the checkpoint interval and recovery parallelism, you can shorten the recovery time.
Users cannot access the database during the recovery process; only after the instance recovery is complete can users access the database. For this reason, recovery time should be kept to a minimum. Using the parallel-recovery feature of the Parallel Query option can help you reduce the time it takes for instance recovery to be completed. As you will see on Day 19, "Advanced Oracle Options," the number of processes or threads that perform the instance recovery can be tuned. If you tune the number of recovery processes to run best with your system, you can optimize the recovery interval.
In the event of a media failure (if, for example, a datafile is damaged), data recovery is necessary. If a datafile is damaged, you must restore it from backup before recovery can occur. The Oracle recovery process then applies archive log files and redo log files to restore the damaged datafile to its prior-to-failure state. This process might require the use of all archive log files created since the backup to recover the restored datafile, which can be quite time consuming.
TIP: If you schedule frequent backups, you can shorten restoration and recovery time. Recovery time depends both on how much time has passed and how much data has been modified since the last backup.
By placing the latest backup files as well as any archive log files created since the last backup online, you can shorten the time it takes to start the recovery process. If you have a plan and have everything ready to go, things will run more smoothly.
In order to assist the recovery process, Oracle8 provides several different methods for database recovery:
- Backup Manager
- OS facilities
- The Import utility
- The NT Recovery utility
Each of these utilities can perform an effective backup; personal preference and your system needs dictate which one you use.
As with the database-backup procedure, you can perform a recovery using Enterprise Manager's Backup Manager utility. To do so, follow these steps:
- 1. After you invoke Backup Manager, you can start the recovery process
by selecting Recover | Restore Wizard.
2. The Restore wizard leads you through a series of screens that are similar to those of the Backup wizard. The first screen, shown in Figure 17.1, allows you to choose what type of restore you want to perform. Note that I have selected the Tablespaces radio button. After you decide what type of restore you want to perform, click Next.
Screen one of the Restore wizard.
NOTE: The available options in this case are Tablespaces and Datafiles. Because the instance is up and running in this example, the database recovery option is not available. If the instance was down, Database would be an option.
- 3. Because I chose the Tablespaces option, the Tablespaces screen (shown in Figure 17.2) appears. From here you select what tablespaces you want restored; you can choose as many or as few tablespaces as you want. Note that I have chosen to restore the DOGS tablespace. After you select a tablespace, click Next.
The Tablespaces screen of the Restore wizard.
- 4. After you select the tablespace(s) you want recovered, the Restore wizard determines which datafiles will be restored. The Rename screen, shown in Figure 17.3, allows you to change the name of the datafile to be used in the restore. This is valuable because the volume that contains that datafile might not be available due to a hardware failure. If you want, you can change the name of the datafile. When you finish, click Next.
The Rename screen of the Restore wizard.
- 5. The Channels screen, shown in Figure 17.4, allows you to select the channel(s) to be used during the recovery process. Depending on how many backup files and datafiles you are working with, a larger number of channels can improve performance by adding some parallelism. It might occasionally be necessary to change the channel from the one used in the backup operation. After you determine what channel to use, click Next.
The Channels screen of the Restore wizard.
- 6. Finally, the Restore wizard presents the Summary screen, shown in Figure 17.5. If all is correct, start the restoration process by clicking OK.
The Summary screen of the Restore wizard.
If you want to change the type of restore operation from an online to an offline restore, you can change the state of the instance from the main screen of the Backup Manager utility. To shut down the database, simply select the Shutdown radio button and click Apply (see Figure 17.6).
After you select the Shutdown option, you are asked what type of shutdown you want to perform. Your options are
- Normal--If you select the Normal option, the shutdown process waits for all users
to disconnect and then continues. During this time, no new connections are allowed.
- Immediate--If you select the Immediate option, all idle connections are disconnected,
current transactions are rolled back, and the instance is shut down.
- Abort--If you select the Abort option, the Oracle instance is immediately terminated. A shutdown with Abort causes media recovery to be necessary.
Change the state of the database from the main screen of Backup Manager.
- Transactional--If you select the Transactional option, currently running transactions are allowed to complete. As soon as a transaction has finished or aborted, the connection is terminated. During this time, no new connections are allowed.
I recommend that you always use the Immediate or Normal option. The Abort option should be used only in the case of an emergency.
As you have seen, it is easy to back up and restore a database using Backup Manager. Nonetheless, it is still important to test the recovery process periodically to make sure it is working correctly.
The method for restoring a datafile using OS facilities is as simple as the method used for creating the backup using OS facilities:
- 1. With the tablespace offline, copy the datafile from the backup to the
2. After you bring the tablespace back online (via the ALTER TABLESPACE command or the restarting of the Oracle instance), Oracle will realize that the datafile has been replaced and prompt you to apply the archived log files.
Tablespace and datafile recovery can be performed only if the RDBMS is running in ARCHIVELOG mode. Otherwise, you must restore the entire database from the last full, offline backup. No roll forward or rollback occurs.
The Import Utility
You can use the Import utility to restore the database if and only if the Export utility was used to create the backup. For more information about the Import and Export utilities, see Day 9, "Managing Data."
The NT Recovery Utility
NT Recovery Manager is very similar to NT Backup Manager. NT Recovery Manager, provided with your Oracle for Windows NT software, is not part of Enterprise Manager. Because of this, NT Recovery Manager can recover only a local database--it cannot provide recovery over a network. To recover a local database with NT Recovery Manager, perform these steps:
- 1. As with NT Backup Manager, the first step in invoking NT Recovery Manager
is to supply the password for the internal account.
2. After you supply the password for the internal account and connect with the instance, you will see NT Recovery Manager's first screen. This screen, shown in Figure 17.7, lists the following recovery options:
- Automatic recovery
- Restore from full database backup
- Restore datafile, then do recovery (supply the name of the datafile)
- Restore control file, then do recovery
Oracle Recovery Manager offers a variety of recovery options.
- 3. The screen you see after selecting an option and clicking Recover depends
on what option you select. In this case, I have selected the Restore datafile option
and specified the filename \DATABASE\DOGS\DBF. Consequently, I am presented
with the Data File Recovery screen shown in Figure 17.8.
4. After you specify which datafiles to recover, NT Recovery Manager automatically recovers them for you. If instance recovery is necessary, it is performed as well.
The Data File Recovery screen of NT Recovery Manager.
The Log File Recovery
You cannot restore from the loss of a log file. If a log file is damaged and the system is still functional, you can drop the log file, re-create it, and immediately perform a full backup. The backup is necessary because you will not be able to use the archive and redo log files for any recovery on this database. That is why I always recommend that redo log files reside on fault-tolerant disk volumes.
NOTE: I recommend using RAID mirroring on the redo log files to protect them in the event of a disk failure.
The Point-in-Time Recovery
The point-in-time recovery allows you to recover a database back to a specified point in time, allowing you to recover from user errors or software problems that caused a failure to occur. However, because of the potential side effects, the point-in-time recovery, a somewhat complex operation, should be performed only on rare occasions. When you recover back to a point in time, the entire database will be recovered to that point. If a table is deleted by accident and the point-in-time recovery is being used to recover to a point just before that deletion, you must keep in mind that all other work to other tables that occurred after that point will be lost.
WARNING: The point-in-time recovery can be dangerous to use, because you are restoring to a point in time in the past. Always perform a full backup of your current database before attempting the point-in-time recovery. A point-in-time recovery recovers the entire database to that point in time. There is no way to recover a single tablespace or table, so all work done after that point to other tables will be lost.
NOTE: I have mentioned the point-in-time recovery only so that you will know it exists. If you plan to use the point-in-time recovery, you should study the Oracle documentation and carefully plan a restore before you attempt it.
The point-in-time recovery can be performed via Recovery Manager or via the RECOVER administrative SQL command. In either case, the syntax for performing the point-in-time recovery is entered through the command-line interface; there is no GUI option for the point-in-time recovery.
What method you use to restore your data depends on what type of error necessitated the recovery. In the event of a media failure, you must restore the datafile, then perform the media recovery with the point-in-time recovery option. In the event of a user error, follow these steps:
- 1. Create a temporary copy of the database.
2. Restore this temporary database from a previous backup.
3. Specify a point in time immediately before the failure, then perform the point-in-time recovery on this temporary database.
4. Export the table that has been damaged, which now is in a state just before the damage.
5. Import that undamaged table to the real database.
This task can be time consuming and dangerous. If you confuse the temporary and real copies of the database, you might destroy valuable data.
Point-in-Time Recovery Using Recovery Manager
Recovery Manager can be used to perform a point-in-time recovery via its command-line syntax. Because of the complexity of the command-line interface to Recovery Manager and the rarity of its use, I will not go into great detail on how to use it here. You can find very complete documentation in the Oracle Backup and Recovery manual.
When using the command-line interface to Recovery Manager, you must specify this additional parameter:
This parameter takes the following qualifiers:
TIME date LOG SEQ number THREAD number SCN scn
With this syntax, you can specify the exact time you want the recovery to stop by specifying the time, the log sequence, or the system change number. By using the SET UNTIL parameter when performing a recovery via Recovery Manager, you restore until the specified time.
Point-in-Time Recovery Using the RECOVER Command
When you perform a recovery with the RECOVER command, you can restore to any point in time by using the RECOVER UNTIL syntax. The RECOVER command has a number of options, but the important ones for the point-in-time recovery are
- RECOVER UNTIL CANCEL
- RECOVER UNTIL TIMEdate
- RECOVER UNTIL CHANGEscn
Like Recovery Manager, the RECOVER command can be used to recover a database, a tablespace, or a datafile, and can include a parallel clause. If you increase the degree of parallelism, you might see a performance benefit.
Developing a Recovery Strategy
It is important that you plan your recovery strategy so that in the event of a system failure you will be ready. Several scenarios must be planned for, depending on the type of failure.
Instance Failure: Planning for Instance Recovery
Except for requiring you to set the degree of parallelism, the instance-recovery process is automatic; it requires no operator intervention. Therefore, your planning should concentrate on the parallel recovery option. Evaluate your system to determine the degree of parallelism for the recovery process. This is set as the Oracle initialization parameter RECOVERY_PARALLELISM.
The number of disk drives and CPUs typically determines the degree of parallelism in your system. My rule of thumb is to use one process (thread) for every two disk drives, up to 20. Try starting with a number like this, and adjust it to suit your system. Because every system is different, it is difficult to make across-the-board recommendations. If you set this number too high, you might cause additional overhead.
Hardware Failure: Planning for Media Recovery
When you plan for media recovery, you must consider not only the steps required to recover the damaged media, but the restoration of data as well. Some of the key items necessary for quick media recovery include
- Hardware availability--Most system vendors offer hot-swappable disk drives. A
failed disk drive need only be replaced with the same type disk in order to start
automatic recovery. For this reason, you should keep spares available.
- Backup files--Backup files should be kept until subsequent backups have been
completed and verified. If you make these files available on the system or on a network
drive, you can perform quick media recovery.
- Archived log files--Any archived log files created since the last backup should be kept available on the system itself or on a network server until the next backup occurs.
If downtime is not an option, consider using a fault-tolerant disk subsystem. You can avoid considerable downtime by protecting your disk drives with a RAID disk array.
TIP: The component in your system that is most likely to fail is probably a disk drive. The more disk drives you have in your system, the more likely one will fail. By protecting your disk drives with hardware or software RAID, you can avoid considerable downtime.
Planning for Operator Error Recovery
Recovering from operator error is similar to recovering from media failure, except with operator error recovery you might need a large amount of disk space to temporarily recover your database. Fortunately, users can typically continue working on the permanent database unless their duties require them to access the table(s) that was deleted or corrupted. To recover, you need the following components:
- Backup files
- Archived log files
- Large temporary space
Today you learned how to recover from various types of system failures. You learned how instance, media, and point-in-time recoveries work. You also learned how to recover from an operator error. You learned the importance of planning ahead to reduce downtime; in most businesses, downtime can be very expensive.
On Day 18, "Administering Oracle Replication," you will learn how to use Oracle replication to increase performance and quicken recovery. You will also learn about other Oracle options for quick recovery, including the standby database option. These features do not provide fault tolerance, but do help you to come back online quickly.
- Q What is the difference between restoring and recovering?
A Restoring involves copying a backup file to replace a damaged datafile, whereas recovery involves Oracle using the transaction log to roll forward transactions.
Q What is the difference between instance recovery and media recovery?
A With instance recovery, Oracle automatically rolls forward committed transactions to bring the database up to date; media recovery requires a damaged datafile to be replaced with a backup copy before commencing.
Q What is an in-flight transaction?
A This term is sometimes used to describe a transaction that was in the process of execution when a failure occurred.
Q What Oracle utility is needed to restore from an export file?
A An export file can only be restored with the Import utility.
The workshop provides quiz questions to help you solidify your understanding of the material covered and exercises to provide you with experience in using what you've learned. See Appendix A, "Answers," for the answers to quiz questions.
- 1. What types of recovery operations are there?
2. What transactions are recovered from an instance failure?
3. What transactions can be recovered from media failure?
4. What is ARCHIVELOG mode?
5. What is a dirty buffer?
6. What is a checkpoint?
7. What files are necessary for instance recovery?
8. What files are necessary for media recovery?
9. What is a point-in-time recovery?
10. What is a point-in-time recovery used for?
- 1. Using Backup Manager, restore the tablespace you backed up yesterday.
2. Using NT Recovery Manager, restore the tablespace you backed up yesterday.
3. On a small test system, use the SHUTDOWN ABORT command to abort an Oracle instance. (Do not do this on a production system or one that others are using.)
4. Restart the instance and observe the recovery process; try this again with a larger degree of parallelism. (NOTE: You must perform some transactions for recovery to have activity.)
© Copyright, Macmillan Computer Publishing. All rights reserved.