selectfix::> set diag
Warning: These diagnostic commands are for use by NetApp personnel only.
Do you want to continue? {y|n}: y
selectfix::*> system configuration backup settings show -instance
Backup Destination URL: -
Username for Destination: -
Schedule 1: 8hour
Number of Backups to Keep for Schedule 1: 2
Schedule 1 Enabled: true
Schedule 2: daily
Number of Backups to Keep for Schedule 2: 2
Schedule 2 Enabled: true
Schedule 3: weekly
Number of Backups to Keep for Schedule 3: 2
Schedule 3 Enabled: true
Make sure that the cluster-level backup jobs for the 8hour, daily, and weekly schedules are created:
selectfix::*> job show
                            Owning
Job ID Name                 Vserver    Node           State
------ -------------------- ---------- -------------- ----------
1      Certificate Expiry Check
                            selectfix  asdf           Queued
       Description: Certificate Expiry Check
2      Licensing            selectfix  asdf           Queued
       Description: License Checking
3      CLUSTER BACKUP AUTO 8hour
                            selectfix  -              Queued
       Description: Cluster Backup Job
4      CLUSTER BACKUP AUTO daily
                            selectfix  -              Queued
       Description: Cluster Backup Job
5      CLUSTER BACKUP AUTO weekly
                            selectfix  -              Queued
       Description: Cluster Backup Job
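If the full job list is long, the same check can be made with a filtered query; the wildcard pattern below is an illustration that matches the three backup job names shown above:
selectfix::*> job show -name "CLUSTER BACKUP AUTO*"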
If any of the preceding checks fails (a schedule is missing or disabled, or a cluster backup job is not created), restart mgwd from the FreeBSD shell or reboot the node.
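If restarting mgwd from the FreeBSD shell is not practical, rebooting the node also restarts it. A minimal example, assuming the node name asdf from the output above (plan for the resulting outage first):
selectfix::*> system node reboot -node asdf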
Cause 2: Failure of Backup Archive Creation
How to determine if this is the cause:
Log messages indicating failure of Backup Archive Creation:
cluster_backup_job::queue_local_backup_jobs - Failed to queue local job for %s. Error: %s
cluster_backup_job::check_for_node_backups - Node: %s Backup Errored.
cluster_backup_job::check_for_node_backups - Failed to get job info for node: %s
cluster_backup_job::queue_local_backup_jobs - The node %s is not included in cluster backup
config_backup_create_handler::create_local_tarball - Failed to get the value of the bootarg.init.cfimagebase variable
config_backup_create_handler::create_snapshot - Failed to get rdb version
fill_node_backup_list - Could not find the directory %s to be backed-up
generate_cluster_conf_manifest - Could not determine the cluster uuid
generate_cluster_conf_manifest - Could not determine the software version
generate_node_conf_manifest - Could not determine the software version
generate_node_conf_manifest - Could not determine the hardware model
config_backup_restore_ops::create_dir - Failed to create the backup dir because of: %d
config_backup_restore_ops::create_dir - Found a non-directory in place of the backup dir
config_backup_restore_ops::create_dir - Found invalid permissions on backup dir
config_backup_create_handler::create_snapshot - root snapshot delete zapi failed: retval: %s
config_backup_restore_ops::clean_snapshots - failed to delete snapshot %s
config_backup_create_handler - failed to delete the temp dir %s
The following EMS messages are generated:
mgmtgwd.configbr.backupFailed
Message Type - Backup file creation failed.
This message occurs when the configuration backup cannot be created.
Source - filer
Sample Message - The %s configuration backup %s cannot be created. Error: '%s'.
Corrective Action: If the reason is the root volume file system, ensure that it is available and has enough space to create a backup. If the reason is that a scheduled backup file could not be created, create the backup file manually, and upload it to the remote URL.
mgmtgwd.configbr.deleteFailed
Message Type - Backup file deletion failed.
This message occurs when an old configuration backup cannot be deleted.
Source - filer
Sample Message - The %s configuration backup %s on node %s cannot be deleted. Error: '%s'.
Corrective Action: Ensure that the node is accessible and healthy. If necessary, delete the backup manually.
mgmtgwd.configbr.snapshotDeleteFailed
Message Type - Deletion of root volume snapshot failed.
This message occurs when a Snapshot(tm) copy on the root volume cannot be deleted. The Snapshot copy was created during a configuration backup.
Source - filer
Sample Message - Snapshot copy %s on the root volume cannot be deleted on node %s. Error: '%s'.
Corrective Action: Ensure that the root volume file system is available.
Resolution:
Make sure that scheduled backups are created and distributed within the cluster:
selectfix::*> system configuration backup show
Node      Backup Tarball                            Time               Size
--------- ----------------------------------------- ------------------ -----
asdf      asdf.on_demand.917.2011-05-20.19_40_34.7z 05/20 19:40:34     9.53MB
asdf      selectfix.8hour.2011-07-07.18_15_00.7z    07/07 18:15:00     12.94MB
asdf      selectfix.8hour.2011-07-12.17_41_14.7z    07/12 17:41:14     8.52MB
asdf      selectfix.daily.2011-07-07.00_10_10.7z    07/07 00:10:10     15.41MB
asdf      selectfix.daily.2011-07-12.17_41_14.7z    07/12 17:41:14     8.52MB
asdf      selectfix.on_demand.1382.2011-06-22.18_30_05.7z
                                                    06/22 18:30:05     16.02MB
asdf      selectfix.weekly.2011-06-05.00_15_00.7z   06/05 00:15:00     12.06MB
Make sure that EMS messages are logged indicating that the backup files were created successfully:
selectfix::*> event log show -messagename mgmtgwd.configbr.*
Time Node Severity Event
------------------- ---------------- ------------- ---------------------------
7/12/2011 17:43:37 asdf NOTICE mgmtgwd.configbr.backupCompleted: Scheduled configuration backup selectfix.daily.2011-07-12.17_41_14.7z was created successfully.
7/12/2011 17:43:36 asdf NOTICE mgmtgwd.configbr.backupCompleted: Scheduled configuration backup selectfix.8hour.2011-07-12.17_41_14.7z was created successfully.
If the reason is the root volume file system, ensure that it is available and has enough space to create a backup. If the reason is that a scheduled backup file could not be created, create the backup file manually and upload it to the remote URL, as shown in the example below.
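A minimal sketch of creating a backup manually and uploading it to a remote URL, assuming the node name asdf; the backup name manual_backup and the destination URL are placeholders. The create command runs as a job, so confirm with job show that it completes, and confirm the exact tarball name with system configuration backup show before uploading:
selectfix::*> system configuration backup create -node asdf -backup-type cluster -backup-name manual_backup
selectfix::*> system configuration backup upload -node asdf -backup manual_backup.7z -destination ftp://backupserver/configs/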
Cause 3: Failure of Distribution of Backup Archive
How to determine if this is the cause:
Log messages indicating failure of distribution of the backup archive:
cluster_backup_job::transfer_node_backup - subscribe failed with error %s on %s
cluster_backup_job::transfer_node_backup - unpublish failed with error %s on %s
cluster_backup_job::check_for_transfering_nodes - unpublish failed with error %s
cluster_backup_job::check_for_distribute_completed - Node: %s Distribute Errored.
cluster_backup_job::check_for_distribute_completed - Failed to get job info for node: %s
config_backup_create_handler::handle_cluster_backup - distribute_and_rotate_cluster_backup failed with error
The following EMS message is generated: mgmtgwd.configbr.distributeFailed
Message Type - Distribution of a backup file to another node failed.
This message occurs when a configuration backup cannot be distributed to another node in the cluster.
Source - filer
Sample Message - Configuration backup %s cannot be distributed to node %s. Error: '%s'.
Corrective Action: See the Resolution below.
Resolution:
Ensure that the destination node is accessible and healthy. If necessary, upload the backup file to the remote URL to increase the availability of the backup file.
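As a sketch of those checks, verify that the other nodes are healthy and, if needed, copy the backup to another node manually. The node name asdf2 below is a hypothetical second node, the backup name is taken from the earlier output, and the copy command syntax should be verified against your ONTAP release:
selectfix::*> cluster show
selectfix::*> system configuration backup copy -from-node asdf -backup selectfix.daily.2011-07-12.17_41_14.7z -to-node asdf2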
Cause 4: Failure of Upload of Backup Archive
How to determine if this is the cause:
The following EMS message is generated: mgmtgwd.configbr.uploadFailed
Message Type - Failure to upload a backup file to a remote URL.
This message occurs when the configuration backup cannot be uploaded to the destination URL.
Source - filer
Sample Message - Configuration backup file %s cannot be uploaded to the destination URL %s. Error: '%s'.
Corrective Action: See the Resolution below.
Resolution:
Ensure that the destination URL is reachable, the protocol in the URL is supported, and the user credentials are valid.
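For example, the destination URL and credentials can be reviewed and corrected with the backup settings commands; the URL and username below are placeholders:
selectfix::*> system configuration backup settings modify -destination ftp://backupserver/configs/ -username backupuser
selectfix::*> system configuration backup settings set-password
The set-password command prompts for the password of the destination user.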
Cause 5: Failure When Obtaining Backup Information
How to determine if this is the cause:
Log messages indicating failure when obtaining backup information:
get_backup_file_info - failed to stat the source %s
get_backup_file_info - %s is not a regular file
get_backup_file_info - Failed to create the manifest dir(%s)
get_backup_file_info - failed to untar MANIFEST from the backup file: %s
get_backup_file_info - failed to read MANIFEST file: %s
get_backup_file_info - Failed to delete the temp manifest dir %s
config_backup_restore_ops::parse_manifest - Could not open the MANIFEST file %s for reading
Resolution:
If the reason is the root volume file system, ensure that it is available. Make sure that the backup files are in the root volume under the /mroot/etc/backups/config directory.
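To confirm that the files are present, the directory can be listed from the systemshell at diagnostic privilege; a sketch, assuming the node name asdf:
selectfix::*> system node systemshell -node asdf -command "ls -l /mroot/etc/backups/config"
Alternatively, system configuration backup show (as in the Cause 2 resolution) lists the backups that each node can see.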