Wednesday, December 24, 2014

Difference between SnapMirror and SnapVault

Most of my clients have had these doubts about SnapMirror and SnapVault:
1) What is the difference between the two, as both copy data from source to destination?
2) Why do we need two products for backup?
3) Why do I need to buy two product licenses instead of just one?
4) What are my RTO and RPO with these products, and which one is better for DR?
Here I will try to explain the exact difference between SnapMirror and SnapVault, starting with the basics.
What is SnapVault?

A SnapVault backup is a collection of Snapshot copies on a Flex volume that you can restore data from if the primary data is not usable. Snapshot copies are created based on a Snapshot policy. The SnapVault backup backs up Snapshot copies based on its schedule and SnapVault policy rules.
A SnapVault backup is a disk-to-disk backup solution that you can also use to offload tape backups. In the event of data loss or corruption on a system, backed-up data can be restored from the SnapVault secondary volume with less downtime and uncertainty than is associated with conventional tape backup and restore operations.
What is SnapMirror?

SnapMirror is a feature of Data ONTAP that enables you to replicate data. SnapMirror enables you to replicate data from specified source volumes or qtrees to specified destination volumes or qtrees, respectively. You need a separate license to use SnapMirror.
You can use SnapMirror to replicate data within the same storage system or with different storage systems.
Now let us see the difference between SnapVault and SnapMirror.
As a first statement, I would say: "SnapVault is a backup solution, whereas SnapMirror is a DR solution."
SnapVault is a backup solution: we can have long snapshot retention periods on the destination filer, and slower, lower-RPM disks can be used on the destination side to minimize the budget. If a disaster occurs, we can restore data from the destination filer to the source filer, but we cannot make the destination serve data as the source, because SnapVault destinations are read-only.
SnapMirror is a DR solution that supports Sync, Semi-Sync, and Async relationships. We can easily restore accidentally deleted or lost data to the source filer, provided no updates were performed in the meantime. In case of a total disaster on the source side, we can immediately make the destination volume/qtree read-write and provide access to the clients, which means low RTO and RPO, and therefore shorter outages. Once the source is ready, we can resync from the destination back to the source (a reverse SnapMirror) and continue with the source as before.
Notable difference
Qtree SnapMirror
More suitable for providing immediate failover capability.
Uses the same functionality and licensing on the source and destination systems.
Transfers can be scheduled at a maximum rate of once every minute.
Relationships can be reversed. This allows the source to be re-synchronized with changes made at the destination.
SnapVault
More suitable where data availability is less critical, and immediate failover is not required.
Uses SnapVault source system and SnapVault destination system, which provide different functionality.
Transfers can be scheduled at a maximum rate of once every hour.
Snapshot copies are retained and deleted on a specified schedule.
Relationships cannot be reversed; the destination can transfer data back to the source only to restore it.
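For reference, the two relationships are set up with different command families in 7-Mode. A rough sketch, using hypothetical filer, volume, and qtree names:

```
# SnapVault: qtree-level backup, started from the secondary (destination) filer
secondary> snapvault start -S primary:/vol/vol1/qtree1 /vol/vault_vol/qtree1

# SnapMirror: volume replication, initialized from the destination filer
destination> snapmirror initialize -S source:vol1 destination:vol1_mirror

# After a source-side disaster, a SnapMirror destination can be made writable
destination> snapmirror break vol1_mirror
```

Note there is no equivalent of `snapmirror break` for a SnapVault destination, which is exactly the read-only limitation described above.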
To summarize:
SnapVault is more of a backup solution than a disaster recovery solution, but if you are in deep trouble and need your production up ASAP, you can convert the SnapVault qtree into a SnapMirror qtree. (I haven't tried it myself, but it can be done from diag mode with the snapvault convert command.)
SnapMirror is purely a replication solution that saves us in case of disaster.
Based on the differences above, we can easily judge which solution will have the higher RTO and RPO.
The license structure was designed by NetApp; as we already know, we need both a primary and a secondary license.
Hope this helped !!

Friday, August 15, 2014

Setting up Netapp Passwordless Login

Usually we configure passwordless login for a NetApp filer by mounting or mapping vol0 through NFS or CIFS. If your environment doesn't have those licenses, you can use the procedure below; even if you do have a license, I prefer this method, as it is easier: you just need to unlock the diag account and proceed.

If you have neither a CIFS nor an NFS license, you can create the required directory using the diag account.

Note: take care when using this account.

First, enter in advanced mode:
filer> priv set advanced

Now, unlock and set a password to diag account:

filer*> useradmin diaguser unlock

filer*> useradmin diaguser password ( Set a password )

Enter in the systemshell, create the directory you need and put the pubkey generated in the authorized_keys file:

filer*> systemshell

login: diag

Password: the same you set in the previous step above

filer% mkdir -p /mroot/etc/sshd/root/.ssh

filer% vi /mroot/etc/sshd/root/.ssh/authorized_keys

Now copy your server's SSH public key here and save the file.

filer% sudo chown -R root:wheel /mroot/etc/sshd/root

filer% sudo chmod -R 0600 /mroot/etc/sshd/root

Then, exit systemshell, lock diag account and exit advanced mode:

filer% exit

Lock the diag user again:

filer*> useradmin diaguser lock

filer*> priv set admin
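The key pair itself is generated on the host you want to log in from. A minimal sketch (the key path below is just an example); the contents of the .pub file are what you paste into the authorized_keys file on the filer:

```shell
# Generate an RSA key pair with no passphrase in a scratch directory
KEYDIR=$(mktemp -d)
ssh-keygen -t rsa -b 2048 -N "" -f "$KEYDIR/filer_key" -q

# This public key line goes into /mroot/etc/sshd/root/.ssh/authorized_keys
cat "$KEYDIR/filer_key.pub"
```

Afterwards you should be able to log in with `ssh -i $KEYDIR/filer_key root@filer` without a password prompt.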

Saturday, June 21, 2014

Netbackup Restoration Error 147


I faced an issue during the restoration of a Unix (Solaris) filesystem backup.







I tried the restoration, but it failed with the error

“Status 147 - required or specified copy was not found”,

Later I examined the bprd logs and saw the error below:

zmasb1:/usr/openv/netbackup/logs/bprd #  more log.061814 | grep "copy #1 not found!"

11:27:27.411 [10605] <2> restore files:    proxy_copy = 0

11:27:27.411 [10605] <2> restore files:    alt_rest_copy_num = 0

11:27:27.696 [10605] <2> add_image_to_list: backupid zwftpt1_1278162514 copy #1 not found!

This is an issue with the primary image in NetBackup, where the PRIMARY_COPY/FRAGMENT entry is mismatched or corrupted. We have to duplicate the image to a good, valid backup image using the bpduplicate command. Please keep in mind that before you duplicate an image to make it the primary image, you should verify the image first. In my case I am going to duplicate the last monthly backup image as the primary image.

Verification:

Now go to your NBU console, open Catalog, and search for a valid image from which you can restore.



Now right-click on the displayed images, choose Verify, and check the results; you should see that it is successful. Now go back, right-click on the image, and make it the primary copy, or use the command below to duplicate the image.
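The verify and promote steps can also be done from the CLI. A sketch, assuming the backup ID from the bprd log above and that copy number 2 is the known-good copy (check the actual copy numbers with bpimagelist first):

```
# Verify the candidate image
/usr/openv/netbackup/bin/admincmd/bpverify -backupid zwftpt1_1278162514

# Duplicate it to create a fresh valid copy (pick your own destination storage unit)
/usr/openv/netbackup/bin/admincmd/bpduplicate -backupid zwftpt1_1278162514 -dstunit <storage_unit>

# Promote copy 2 to be the new primary copy
/usr/openv/netbackup/bin/admincmd/bpduplicate -npc 2 -backupid zwftpt1_1278162514
```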








Tuesday, June 10, 2014

EMC VNX Default Threshold Limits and SP Cache Behavior When It Goes Beyond the Watermark

Today we had a discussion when one of my customers' VNX storage took a performance hit. We started digging down to find the issue and finally learned that it is a unique behavior of FLARE: it has a condition called forced flushing, which occurs when the percentage of dirty cache pages crosses the high watermark and reaches 100%. At that point, the cache starts forcefully flushing unsaved (dirty) data to disk, suspending all host IO. Forced flushing continues until the percentage of dirty pages recedes below the low watermark.

Forced flushing affects the entire array and all workloads served by the array. It significantly increases the host response time until the number of cache dirty pages falls below the low watermark. The Storage Processor gives priority to writing dirty data to disk, rather than allocating new pages for incoming Host IO. The idea of high and low watermark functionality was implemented as a mechanism to avoid forced flushing. The lower the high watermark, the larger the reserved buffer in the cache, and the smaller chance that forced flushing will occur.


So why can "the SP Dirty Page % occasionally reach 95%"? Because there are too many inbound IOs and the back-end disks might be overloaded, so the cache does not have enough time to write the dirty pages to disk.
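The watermark mechanics can be sketched as a toy loop (the 80/60 values below are made-up illustrations, not the array's actual settings):

```shell
HIGH=80   # high watermark, % of dirty cache pages
LOW=60    # low watermark, % of dirty cache pages
dirty=96  # current dirty-page percentage

forced=0
if [ "$dirty" -ge "$HIGH" ]; then
  forced=1                       # crossing the high watermark triggers forced flushing
fi

while [ "$forced" -eq 1 ]; do
  dirty=$((dirty - 10))          # SP flushes dirty pages to disk; host IO is held off
  if [ "$dirty" -le "$LOW" ]; then
    forced=0                     # flushing stops once we recede below the low watermark
  fi
done

echo "dirty=${dirty}% forced=${forced}"
```

Lowering the high watermark enlarges the reserved buffer and makes the forced-flush branch less likely to trigger in the first place, which is exactly the tuning idea described above.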

Please also find below the default thresholds for different parameters on an EMC VNX:


FC                     160 IOPS
EFD                    2500 IOPS
ATA                    70 IOPS
SATA II                90 IOPS
SAS                    160 IOPS
NL-SAS                 120 IOPS
Dirty pages            95%
Disk response time     >15 ms (if total IOPS > 20)
Average seek distance  10% or >30 GB/s
LUN response time      >22 ms (if total IOPS > 20)
BE bandwidth           >320 MB/s (or 2160 MB/s for VNX)

Tuesday, June 3, 2014

Moving a Netapp Filer from an old domain to a New Domain

Today I had to change the domain of all my filers due to a major acquisition at my company.

Please note that changing the domain of a filer will disrupt storage accessed through the network (NAS). Make sure there are no open files at the time of the change, because it may corrupt those files; your LUNs will be just fine. It is recommended to perform this during off-peak hours.

After the change, ask users to remount the shares using the new fully qualified domain name, or they can just use the filer name followed by the share name.

Remember, before proceeding, make sure you have a Windows account with administrative privileges handy.

First, terminate CIFS:
   
Nayab> cifs terminate

   Then run cifs setup:

Nayab> cifs setup

    Now follow the prompts below and choose:

    Do you want to delete the existing filer account information? [no]

    Delete your existing account information by entering yes at the prompt.

    Note: You must delete your existing account information to reach the DNS server entry prompt.

    After deleting your account information, you are given the opportunity to rename the storage system:

    The default name of this filer will be 'Nayab'.
    Do you want to modify this name? [no]:

    Keep the current storage system name by pressing Enter; otherwise, enter yes and enter a new storage system name.

    Data ONTAP displays a list of authentication methods:

    Data ONTAP CIFS services support four styles of user authentication. Choose the one from the list below that best suits your situation.
    (1) Active Directory domain authentication (Active Directory domains only)
    (2) Windows NT 4 domain authentication (Windows NT or Active Directory domains)
    (3) Windows Work group authentication using the filer's local user accounts
    (4) /etc/passwd and/or NIS/LDAP authentication

Option 1 (Active Directory) is chosen by default.

 Selection (1-4)? [1]:

Now enter the new domain Name

What is the name of the Active Directory domain? [nayab.corp]: nayabrs.corp

        In Active Directory-based domains, it is essential that the filer's
        time match the domain's internal time so that the Kerberos-based
        authentication system works correctly. If the time difference between
        the filer and the domain controllers is more than 5 minutes,
        authentication will fail. Time services are currently not configured
        on this filer.

Would you like to configure time services? [y]: n

        In order to create an Active Directory machine account for the filer,
        you must supply the name and password of a Windows account with
        sufficient privileges to add computers to the NAYABRS.CORP domain.
Enter the name of the Windows user [Administrator@NAYABRS.CORP]:
Password for Administrator@NAYABRS.CORP:

    Respond to the remainder of the cifs setup prompts; to accept a default value, press Enter.

    Upon exiting, the cifs setup utility starts CIFS.

    Confirm your changes by entering the following command:

    Nayab> cifs domaininfo



cifs domaininfo
NetBios Domain:           NAYAB
Windows 2003 Domain Name: nayab.corp
Type:                     Windows 2003
Filer AD Site:            Singapore

Current Connected DCs:    \\DOMAINC01
Total DC addresses found: 4
Preferred Addresses:
                          None
Favored Addresses:
                          192.168.2.34    DOMAINC01         PDC
                          192.168.3.35                      PDC
                          192.168.2.20                      PDC
Other Addresses:
                          192.254.52.71                    BDC

Connected AD LDAP Server: \\domainc02.nayab.corp
Preferred Addresses:
                          None
Favored Addresses:
                          192.168.2.34    domainc02.nayab.corp
                          192.168.3.35    domainc02.nayab.corp
                          192.168.2.20    domainc01.nayab.corp
Other Addresses:
                          None


  




Wednesday, May 28, 2014

EMC ! EMC ! , Knowing about EMC Data Domain Filesystem



Hey all, nowadays there's a lot of buzz about EMC, and I see a lot of my customers switching to it. I have already started working on Data Domain and VNX, and I just wanted to share some of my experience and understanding.

The Data Domain file system exposes a /ddvar directory, which is like a root directory. It can be shared as an NFS/CIFS share, but it cannot be renamed or deleted because it holds the operating environment binaries and configuration.

You will also have a data directory with a sub-directory for collection 1, i.e. /data/col1 (by default there is only one collection). Under col1 you have folders, called MTrees; the default one is named backup, and that is where all your backups are stored. However, you can create more MTrees under /data/col1, and each MTree can be managed individually with its own permissions and quota settings (if you need to restrict usage). The maximum number of MTrees is 100. Any MTree can be mounted as an NFS share or mapped as a CIFS share.

Quotas can also be set on MTrees. There are two types, hard and soft: a hard quota stops users from writing any data once the limit is reached, while a soft quota only sends warnings and alerts.

Each MTree can be individually managed with its own policies and permissions. Although the maximum number of MTrees is 100, performance degrades beyond 14 MTrees in environments with many read/write streams to each MTree. Best practice is therefore to keep the count at 14 or fewer and to aggregate operations into one MTree where possible. You can mount MTrees like any other CIFS/NFS share, but mixing protocol types is not recommended. VTL and DD Boost clients create their own MTrees as well.
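As a sketch of the corresponding DD OS commands (the MTree name and quota sizes here are made up):

```
# Create an additional MTree under /data/col1
mtree create /data/col1/backup_sql

# Set soft and hard capacity quotas on it
quota capacity set mtree /data/col1/backup_sql soft-limit 500 GiB hard-limit 600 GiB

# List all MTrees and their usage
mtree list
```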

Thursday, May 22, 2014

Netapp Snap Restore at Volume Level

Today I performed a snap restore on a few production volumes, reverting to a snapshot dated 20-May-2014. Here I share the procedure with you all.

I restored the snapshot on all 12 prod volumes. Please note that snap restore can be performed at both the VOLUME LEVEL and the FILE LEVEL.

Volume level reverts the volume to whatever date you want, based on the snapshot; in my case my latest snapshot is from 20-May-2014, so I am reverting to it.

File level restores a single file; we have to mention the type with -t (vol | file) during the restore.

I'm doing it at the volume level....

Example:-


snap restore -t vol  <vol-name>

snap restore -t vol snap_prodcli_prod_fb

WARNING! This will revert the volume to a previous snapshot.

All modifications to the volume after the snapshot will be

irrevocably lost.

Volume snap_prodcli_prod_fb will be made restricted briefly before coming back online.

Are you sure you want to do this? yes

The following snapshots are available for volume snap_prodcli_prod_fb:

     date            name
------------    ---------

May 20 17:12    20140520_bkup  -->  I am Reverting to this Snapshot

May 12 05:50    20140512_bkup

May 09 18:35    filervltp1(0151751825)_snap_prodcli_prod_fb.1

May 09 17:55    20140509_coldbkup

Apr 10 15:41    20140410_coldbkup

Which snapshot would you like to revert volume snap_prodcli_prod_fb to?  20140520_bkup   ( Mention here the snapshot name to which you want to revert )

You have selected snap_prodcli_prod_fb,  snapshot 20140520_bkup

Proceed with revert? yes

Thu May 22 10:58:39 SGT [filervltp1:wafl.snaprestore.revert:notice]: Reverting volume snap_prodcli_prod_fb to a previous snapshot.


Volume snap_prodcli_prod_fb : revert successful.


It’s been successful :)
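For comparison, a file-level restore of a single file from the same snapshot would look like this (the file path is hypothetical):

```
snap restore -t file -s 20140520_bkup /vol/snap_prodcli_prod_fb/data/datafile01.dbf
```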

Wednesday, May 21, 2014

Restoring Oracle Archive Logs using RMAN from Tapes

I got a request from my client to restore archive logs of a particular sequence. Let me brief you all: our Oracle uses ASM, and our backup software is NetBackup.

Generally you can restore archive logs from the OS backup by starting the BAR utility from the NetBackup MASTER SERVER, but in my case these are ASM files, so they cannot be seen from the OS backups, and BAR has to be invoked from the CLIENT. When I tried to list the files in BAR, however, I faced a strange "Library Binaries Error", so I opened a case with NetBackup; they denied support, as my NBU version is 6.5.4, which is already EOS (End of Support). That left only one way to restore: through RMAN. After some struggle with Google, I was able to build a script to restore my archive logs.



Note: I am going to restore the archive logs to an alternate location on the same server (/tmp in my case), but please check that you have enough space in that directory before proceeding with the restore, and also get the LOGSEQUENCE numbers from your DBA.

Login to the Host then switch to oracle

prdbclient:~ #  su - oracle

prdbclient: : oracle:>

Now go to the instance for which you want to restore Archivelogs

 prdbclient: : oracle:> prdb1



 prdbclient:prdb1: oracle:>




Once you are logged in to the instance now go to the RECOVERY MANAGER ( RMAN )   PROMPT ( Check with your DBA for the home directory of Oracle to be set )

/OraBase/V10010/bin -->  Is my Oracle Home directory

prdb1/prdb1 -->  Oracle Userid and Password ( You can get userID and Password from you RMAN SCRIPT )


Now type in as below to go to RMAN prompt of instance prdb1

prdbclient:prdb1: oracle:> /OraBase/V10010/bin/rman target / catalog prdb1/prdb1@rman

Once done run the restoration script

RMAN>

 run {

set archivelog destination to '/tmp/PRDB1_Archlogs';

ALLOCATE CHANNEL t1 TYPE 'SBT_TAPE';

SEND 'NB_ORA_CLIENT=prdbclient, NB_ORA_SERV=masbak, NB_ORA_POLICY=prdbclient_prdb1_archlog_daily';

restore archivelog from logseq=433299 until logseq=443196;

RELEASE CHANNEL t1;

}



Monitor the Restoration Job from the Netbackup Console

 Once it completes successfully, let the DBA do his job :)

Wednesday, April 30, 2014

Netapp Not able to access CIFS shares

 We may not be able to access CIFS shares (including C$ and ETC$) on a new filer. I came across this issue, tried all possible ways to figure out the cause, and finally managed to fix it.

The reason: when you run cifs setup on your filer, it creates a user called "pcuser" (all Windows users are mapped to this user while accessing shares on the filer). You can check it with the command below; your file should be similar to this:

FAS3220> rdfile /etc/passwd

root::0:::/:
pcuser::65534:65534::/:
nobody::65534:65534::/:
ftp::65533:65533:FTP Anonymous:/home/ftp:


For some reason cifs setup did not create the pcuser user, and I noticed that /etc/passwd was empty. So all I did was copy /etc/passwd from one of my old filers; don't forget to run the source command after copying:

FAS3220> source  /etc/passwd

Once done, you should be able to access your shares.
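If there is no older filer to copy /etc/passwd from, the missing default entries can also be appended by hand with wrfile -a (a sketch; the UID/GID values match the default file shown above):

```
FAS3220> wrfile -a /etc/passwd pcuser::65534:65534::/:
FAS3220> wrfile -a /etc/passwd nobody::65534:65534::/:
FAS3220> rdfile /etc/passwd
```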

Happy Knowledge Sharing :) 

Wednesday, April 23, 2014

Migrating a root Volume in Netapp


Please remember that moving a root volume requires a reboot of the controller.

Therefore, I suggest completing this activity in your maintenance window.

It is always good to have a snapshot created before starting any migration:


FASProd>snap create -V vol0 vol0_snap


 1.       Disable the cluster:  

FASProd>cf disable

2.       Check the size of current vol0:  

FASProd>df -Vh  vol0

3.       Create a new root volume on destination aggr:    

FASProd>vol create  vol0_new  dest_aggr  <SIZE>

4.       Copy the data to the new volume:

FASProd> ndmpcopy /vol/vol0  /vol/vol0_new   (you can also use vol copy)

 5.       Once the copy is done, rename the old root volume:

FASProd>vol rename vol0 vol0_old

6.       Rename the new root volume:  

FASProd>vol rename vol0_new vol0

7.       Now set the new vol0 as the root volume:

FASProd> vol options vol0 root

8.       Reboot the controller:  

FASProd>reboot

9.       Confirm the destination aggr now hosts the root vol0:  

FASProd>vol status vol0  

10.   Once confirmed, proceed to offline and destroy the old volume:

FASProd> vol offline vol0_old  

FASProd> vol destroy vol0_old

Now I am an NTSP ( Nimble Technical Sales Professional )


Tuesday, April 15, 2014

Knowing what VMware vCenter, ESXi, and vSphere are



I was really confused by the terms vSphere, vCenter, and ESXi, so I thought I would write up the differences in the hope that it might help someone at some point.

What is vSphere ?

vSphere is a suite of products packaged and shipped by VMware, the virtualization company; it includes products like the VMware ESXi hypervisor, VMware vCenter Server, etc.

What is VMware ESXi ?

ESXi, or the hypervisor, is a bare-metal OS that virtualizes x86 server hardware. You use the ESXi installation media to install the OS on top of the hardware, just like you would install any other OS such as Windows or Linux.

Once you have VMware ESXi installed on x86 server hardware, it is ready for virtualization. What does that mean? It means you can now use the same server to create, configure, and run more than one virtual machine and install the required operating systems within the virtual machines; these are referred to as guest operating systems and can be Windows, Linux, etc.

 What is vCenter ?
 
vCenter is another VMware product, included in the vSphere software package, that is used to manage one or more ESXi servers.