Netapp Setup
Setup worksheet fields:
- Password, time zone, storage system location, language used for multiprotocol storage systems
- Administration host: host name, IP address
- Virtual interfaces: link names ( physical interface names such as e0, e5a, or e9b; simulator: ns0, ns1 ), number of links ( number of physical interfaces to include in the vif ), name of the virtual interface ( name of the vif, such as vif0 )
- Ethernet interfaces: interface name, IP address, subnet mask, partner IP address, media type ( network type ), are jumbo frames supported?, MTU size for jumbo frames
- Router ( if used ): gateway name, IP address
- Location of HTTP directory
- DNS: domain name, server address 1, server address 2, server address 3
- NIS
- Windows domain, WINS servers 1-3, Windows Active Directory domain administrator user name and password, Active Directory ( command-line setup only )
- RMC: MAC address, IP address, network mask ( subnet mask ), gateway, media type, mailhost
- RLM: MAC address, IP address, network mask ( subnet mask ), gateway
- AutoSupport mailhost, AutoSupport recipient(s)

Your values ( this lab ): host name fas121, password Netapp; filer IP 192.168.1.1, admin host / DNS server 192.168.1.212; DNS domain netappu.com; Windows Active Directory domain netappu.com ( DC 192.168.1.212 ); domain administrator user name Administrator, password Netapp.
There is a completely unsupported method to accomplish this on the filer itself: 1. priv set advanced 2. java netapp.cmds.jsh
http://192.168.1.21/na_admin/
Filer General
/etc/syslog.conf tells where to direct messages ( typically to the console and /etc/messages ). By default there is no such file, but if the user creates or modifies this file, they control where the messages go.
sysconfig -t   ( tape information )
source <file>  - this command reads and executes any file containing filer commands
Telnetting to the Filer
Only one user can telnet in at a time. See options telnet.
Autosupport Configuration
Filer> options autosupport
  autosupport.mailhost < mailhost >
  autosupport.support.to < [email protected] >
  autosupport.doit <string>
  autosupport.support.transport https or smtp
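A minimal configuration sketch of the options above, assuming an SMTP mailhost; the mailhost name and recipient address are placeholders:

  filer> options autosupport.enable on
  filer> options autosupport.mailhost mailhost.example.com
  filer> options autosupport.to [email protected]
  filer> options autosupport.support.transport smtp
  filer> options autosupport.doit test_message      ( sends a test autosupport )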
Autosupport troubleshooting
1. ping netapp.com from the filer
2. TCP 443 ( SSL ) should be open at the SMTP server; the SMTP server may sit on the DMZ side
3. A mail relay must be specified in Exchange: the filer's host name or IP address must be specified in the mail relay, and routing for netapp.com ( or routing by this host or by this IP ) must be enabled for the filer. The filer acts as an SMTP client, and in a general mail-system setup no SMTP client can send mail through the mail server to another SMTP server when the host's identity does not match the mail ID; relaying is generally blocked.
4. If a proxy server is used, http / https must pass the http URL
RAID scrub: 1. limit the scrub to run for only 6 hrs 2. force the scrub to start on Sunday at 1 am ( see the sketch below )
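A hedged sketch of the corresponding options; the raid.scrub.schedule string format is an assumption to verify against the options man page:

  filer> options raid.scrub.duration 360          ( limit the scrub to 6 hrs; value in minutes )
  filer> options raid.scrub.schedule 360m@sun@1   ( assumed format: duration@weekday@start_hour )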
RAID group
vol add vol0 -g rg0 2                       ( add 2 disks to raid group 0 of vol0 )
vol options yervol raidsize 16              ( changes the raidsize setting of the vol yervol to 16 )
vol create newvol -t raid_dp -r 16 32@36    ( newvol creation with RAID-DP protection; the RAID group size is 16 disks. Since the vol consists of 32 disks, they form 2 RAID groups, rg0 & rg1 )
Max RAID group size: RAID-DP 28, RAID4 14
Disk Fail/unfail
priv set advanced
When a disk goes bad: disk fail <disk> — the disk is partially failed and a prefail copy to a spare starts; this is seen when sysconfig -r is run. disk unfail reverses it. Sometimes the copy may just hang there, so fail the disk and check with sysconfig -d ( see the sketch below ).
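A short sketch of the sequence above; the disk name 0a.23 is a placeholder:

  filer> priv set advanced
  filer*> disk fail 0a.23      ( marks the disk; a prefail copy to a spare starts )
  filer*> sysconfig -r         ( the prefailed / failed disk shows up here )
  filer*> disk unfail 0a.23    ( advanced-mode command; returns the disk to the spare pool )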
Disk troubleshoot
priv set advanced
disk fail -i <disk name>   ( fails the disk immediately, so the RAID group reconstructs )
led_on <drive id, e.g. 1d.16>
led_off <drive id>
blink_on 4.19              ( the failed disk's LED will now be orange )
blink_off 4.19             ( release the disk )
Zeroing disks
priv set advanced
disk zero spares   ( zeroes out the data on all spare disks )
sysconfig -r       ( check the spare disks )
1. Check the % of inodes used: Filer> df -i
2. To increase: Filer> maxfiles <vol name> <max>
NVRAM
Battery check:
Filer> priv set diag
Filer> nv   => should show battery status as OK and voltage ( NVRAM3: 6V )
raid.timeout in `options raid` controls ( default 24 hr ) the trigger when the battery is low. In the 940 series, NVRAM5 is also used as the cluster interconnect card — two in one, on slot 11.
Time Daemon
( ports 123, 13, and 37 must be open )
When there is a large skew, there are lots of messages from CfTimeDaemon: displacements/skews: 10/3670, 10/3670, 11/3670. Because of this, hourly snapshot creation also fails or an "in progress" message appears. Because timed.max_skew is set to 30 min, we may see the above message every 30 min - 1 hr. If we set this to 5s and watch how the skew develops, and we still see lots of skew messages ( once timed.log is turned ON ), an MB ( motherboard ) replacement may be required. As a temporary measure, set cf.timed.enable ON on both cluster filers and watch whether those errors stop.
Checking from a unix host: # ntptrace -v <filername>
From the filer: Filer> options timed ( check all the options of this; see the sketch below )
From FilerView => Set Date and Time : Synchronize Now <ip of NTP server> => do Synchronize Now and check NTP from the unix host.
Tip: if there are multiple interfaces on the filer, make sure that they are properly listed in the NIS or DNS server — the same host name with multiple IP addresses may be required.
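A sketch of the related timed options; the NTP server IP and the skew value are placeholders:

  filer> options timed.enable on
  filer> options timed.proto ntp
  filer> options timed.servers 192.168.1.10
  filer> options timed.max_skew 5m
  filer> options timed.log on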
Environmental Status
The top line in each channel shows "failures: yes" if there are any. Subsequent lines should show Power, Cooling, Temperature, Embedded switching [ all "none" if there is no problem ].
Volume
vol options vol0
vol status -r vol0               ( raid info of the volume )
sysconfig -r
vol options vol0 raidsize 9
vol add vol0 <number of disks>
vol status -l                    ( to display all volumes )
Double Parity
vol create <vol> -t raid_dp -r <raidsize>   ( minimum of two ) ( there are two parity disks for holding parity and double-parity data )
RSH: security settings must be set with either IP or hostname, but with a matching username for the logon accounts ( not root, but the domain admin account ).
RSH access from a unix host: # rsh -l root <console p/w> <ip of filer> <command>
( add this unix host to /etc/hosts.equiv; this command can be cron'ed ). RSH port 514/TCP. The file is similar for a windows host as well.
Registry Walk
Filer> registry walk status.vol.<vol name>
P/W
To change the administration host administrator's p/w: Filer> passwd — login: administrator, new password: ..
Environment
environment status all
WAFL stuffs
WAFL_check corrects volume inconsistencies ( run it when inconsistencies happen, e.g. when a vol becomes restricted all of a sudden ). Press Ctrl-C while booting to get the boot options selection, then run wafl_check ( -z option available ).
NFS General
/etc/exports:
  /vol/test -rw,root=sun1
  /vol/vol1 -rw,root=sun1
# mkdir /mnt/filer
# mount filer1:/vol/vol1 /mnt/filer
/etc/rmtab          - maintains the mount points
/etc/hosts          - name and IP address
/etc/nsswitch.conf  - resolution order file
Filer> exportfs
Filer> rdfile /etc/exports
filer> exportfs -a
filer> exportfs -i -o rw=<ip address>,root=<ip address> <path>
NFS troubleshooting
wcc -u <unix user>                       ---------- unix credential
>exportfs -c host pathname ro|rw|root    # checks the access cache for host permission
>exportfs -s pathname                    # verifies the path to which a vol is exported
>exportfs -f                             # flush cached access entries and reload
>exportfs -r                             # ensures only persistent exports are loaded
NFS error 70 - stale file handle
>vol read_fsid <vol>
# mount                 ( on the unix host; will display what protocol is being used for mounting )
# mount -o tcp < >
Qtree security
portmap ( daemon )
# rpcinfo -p <filer ip>   ( on the unix host )
If I look in /var/log/messages I see the following error: Mar 30 19:44:59 bilbo kernel: nfs_refresh_inode: inode number mismatch
Told customer to get rid of the nosuid on the exports file and that solved the issue.
In the file handle: the first two numbers are the FSID; the next three are FID, inode, FID; the next three are the FID of the export point.
vol read_fsid => gives a hex number which should match one of the numbers above, indicating which volume's file has the problem. The hex number can be converted to a decimal value as well.
In unix side
# find . -inum <decimal value>
# find /mnt/clearcase -inum _________
# ls -li
# find . -inum <number> -print   ( sometimes the vol fsid number found must be reversed to get the exact place of the inode )
( Sometimes the filer permission ends up on top of the local permission at the unix box, so it cannot be seen — the files become hidden. ) To fix, use # chmod / # chown.
NFS Performance
pktt start e5a, pktt dump e5a, pktt stop    ( all three, start to end )
sysstat
nfsstat -d ( displays cache statistics ), -z ( zero out the stats ), -m ( mount point statistics )
perfstat -b -f <filername> > perfstat.begin
perfstat -e -f <filername> > perfstat.end
# time mkfile 10m test      ( time it takes )
# time cp test
windows host > sio_ntap_sol 100 100 4096 100m 10 2 a.file b.file noflock
perfstat -t 2 -f nasx -P flat > text.txt
-P domains ( SMP ): flat, kahuna, network, raid, storage
nfs mount: /remote_file_system_name: Stale NFS file handle — this error message means that an opened file or directory has been destroyed or recreated.
NFS error 70
File or directory that was opened by NFS client was either removed or replaced on the NFS filer server
1. 21048 is the pid of the process; check in solaris that it is running
2. take the value of 0x000006d7 and convert it to decimal to obtain the inode number ( in solaris: $ echo 0x000006d7=D | adb will convert it; 0x6d7 = 1751 )
3. to find the file, in solaris: $ find . -inum 1751 -print
Networking Troubleshooting
filer> traceroute
filer> ping
Filer> ifconfig         ( for IP address related issues )
Filer> routed status
Filer> routed off
Filer> routed on
DHCP
The filer cannot have a DHCP dynamic address. The address is stored in the /etc/rc file as a static entry even if DHCP is chosen during setup ( see the /etc/rc sketch below ).
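A hedged /etc/rc sketch showing how the address ends up as a static entry; the netmask and gateway are placeholders ( the hostname, interface, and IP are the lab values used above ):

  hostname fas121
  ifconfig ns0 192.168.1.1 netmask 255.255.255.0
  route add default 192.168.1.254 1
  routed on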
Packet
netstat -i
netstat -i <interface name, like ns0, e5a, etc.>
netdiag -v
ifstat -a        # flow control information at the bottom; 10/100/1Gb flow, etc.
Flow control is purely switch based: whatever the switch is set to, the filer takes.
Port
netstat -a    ( to check all open ports on the filer )
netstat       ( to see all established connections )
Port numbers
514/tcp       rsh
135/tcp,udp   rpc
udp rpc for sun
Network troubleshooting
Cannot ping to another subnet:
1. netstat -rn should have the default route address at the top
2. do routed status if there is no entry
3. Even if the rc file shows a default gateway address, add it manually: route add default <ip address> 1, and check the above again
Checking steps
rdfile /etc/rc
ifconfig -a
>netstat -rn        # ---- the gateway line must be there
>routed status
>routed on          # --- if the gateway is not there, add it manually
4. take a packet trace: pktt start <interface>
5. reproduce the problem, then pktt stop all
6. the trace file ( mytrace.trc ) is created at C$ of the filer
7. make a cifs connection to the filer and point to \\<filer>\C$ to retrieve the file
8. analyze the file with Ethereal or Packetyzer
Brocade Switch
# switchshow
# wwn
10:00:00:05:1e:34:b0:fc            - may be the output
# ssn "10:00:00:05:1e:34:b0:fc"    - setting the switch serial number to the wwn
MCData Switch
If a direct connection works but not through the McData, verify that OSMS is licensed and enabled.
config features show
config features opensysMS 1
storage show switch
switchshow
cfgshow
portdisable
portenable
switchdisable
portperfshow
CIFS setup
cifs setup
Cifs general
cifs shares
cifs access       ( permission )
cifs restart
cifs shares eng
cifs shares -add eng /vol/cifsvol/eng
cifs access eng full control
cifs sessions
CIFS performance
cifs stat
smb_hist, -z
sysstat -c 15 2       ( 15 iterations every 2 seconds )
statit
wafl_susp
ifstat -a
netstat -m, -r, -i    ( any one can be used )
netdiag -v, -sap
cifs sessions
CIFS home directory
1. volume snapvol is created
2. a qtree is created at the root of this vol => snapvol; its security is unix
3. a share named snaphome is created on this qtree as /vol/snapvol/home with everyone / full control
4. options cifs.home_dir /vol/snapvol/home
5. options cifs.home_dir_namestyle <blank>
6. edit the /etc/cifs_homedir.cfg file and add /vol/snapvol/home at the end ( see the sketch below )
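A command-level sketch of the steps above; the aggregate name, volume size, and the wrfile -a / cifs homedir load usage are assumptions to verify:

  filer> vol create snapvol aggr0 20g
  filer> qtree create /vol/snapvol/home
  filer> qtree security /vol/snapvol/home unix
  filer> cifs shares -add snaphome /vol/snapvol/home
  filer> options cifs.home_dir /vol/snapvol/home
  filer> options cifs.home_dir_namestyle ""
  filer> wrfile -a /etc/cifs_homedir.cfg /vol/snapvol/home
  filer> cifs homedir load -f        ( assumed command to reload /etc/cifs_homedir.cfg )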
CIFS troubleshooting
wcc -s domain\name      ( windows credential — should see the unix mapping, e.g. pcuser )
wcc -u username         ( unix credential )
cifs domaininfo
rdfile /etc/rc
options wafl
Notes: match the local groups against the /etc/lclgroups.cfg file ( any changes here require a reboot ); check the dns entry / dns info.
/etc/usermap.cfg
/etc/passwd
B. 1. Check the DNS servers; the DC must point to itself and must have at least 4-5 services registered - AD
C. 1. Check where the filer is currently pointing to ( DNS )
   Filer> priv set diag
   Filer> registry walk auth
   If cifs setup needs to be rerun, this registry branch can be deleted: Filer> registry deltree auth
D. net view \\filername should show all shares from the windows side, and cifs shares should show them from the filer side. But when a share is accessed from a windows machine, we get "No Network Provider Present". Ping works, iscsi works, the iscsi drives are OK and can be accessed, but the cifs shares do not work. On the filer side we see "Called name not present ( 0x82 )". cifs resetdc also gives the same message.
Check: 1. This is seen if the filer and the windows DC are rebooted at the same time because of a power failure; the filer needs to come up first and then the DC. 2. Make sure that there is no virus-related activity going on on that host; a virus scan against the windows host or the filer can also make this happen.
Disable WINS on interface e0 ( if required to go by pure DNS only ): Filer> ifconfig e0 -wins ( so that the filer does not talk to the WINS server )
/vol/test
Common CIFS issues - cannot access, access denied
1. time lag between the pc and the filer ( change it from FilerView )
2. qtree security [ unix | ntfs | mixed ] - change temporarily from ntfs to unix and back to ntfs, or from ntfs to mixed and back to ntfs ( when the folder is mapped on its drive letter you do not see the security tab as well )
Cifs Options
options cifs.show_snapshot on
options cifs.netbios_aliases.names    --- alternate names of the filer
ON*   ( see the note below for the asterisk )
* to take ownership of file by windows top level administrator when file is created from unix side and has only unix ACLs
Scenario A
1. a qtree in the vol is created with mixed security
2. share that qtree
3. group-wise user access in unix is defined in the /etc/group file. /etc/group is on the unix side ( client or NIS server ), e.g. eng::gid:khanna,uddhav. On the client side: ls -l ( file / directory listing ), chmod, chgrp, chown
( to see both permissions on cifs shares — the permission from unix and the permission from windows — use SecureShare Access )
4. In windows, create the group and give access
5. The /etc/usermap.cfg file is used to map user accounts in windows to their corresponding accounts in unix to access/manage resources ( see the sketch below ):
   win <direction> unix, where <= maps unix to windows, => maps windows to unix, == maps both
   Test\* ==               ( all users of the test windows domain )
   Domain\<user> => root   ( if the user is not able to see their home directory but can see all other users' folders: cifs restart and access home )
6. When a file is created in that cifs directory or nfs-mounted place, the ownership is kept by whoever created it, and access is granted per the usermap.cfg file
7. Make sure that wafl.nt_admin_priv_map_to_root is ON ( sometimes permissions get locked and some files appear corrupted; while accessing, it says you do not have access or the file is encrypted, while every other file works fine. In this case, change options cifs.nfs_root_ignore_acl from off to ON, change the permission from the NFS-mounted unix side with chmod 777, and access the file. Then change the option back to OFF; it will work after this all the time. This was the cause when a user upgraded from 6.4 to 6.5 and some files in mixed-qtree folders could not be accessed, nor could the permission be changed, even by the root user from the NFS side. The permission reset above made it work. )
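An illustrative /etc/usermap.cfg sketch based on the mappings described above; the domain and account names are placeholders:

  # two-way mapping for all users of the TEST windows domain
  TEST\* == *
  # windows-to-unix mapping only
  TEST\administrator => root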
Scenario B
1. a qtree is created; its security is unix
2. a share is created of that qtree, so the location is the same
3. a cifs client cannot chdir into a directory if the user has only execute permission, d-wx-wx-wx, e.g. MODE == 111
4. the user gets an NT_STATUS_ACCESS_DENIED message when it is accessed
5. if the user is granted read-only ( MODE == 444 ), chdir is successful
CIFS audit
options cifs.audit.enable on
options cifs.audit.file_access_events.enable on
options cifs.audit.logon_events.enable on
options cifs.audit.logsize 524288
options cifs.audit.saveas /etc/log/adtlog.evt
Filer> cifs audit save -f
Read /etc/log/adtlog.evt as an event log through windows.
CIFS errors
LSAOpenPolicy2: Exception rpc_s_assoc_grp_max_exceeded
Veritas Backup Exec 9.1: My Computer -> Manage -> Shares -> Sessions shows Veritas Backup Exec administrative account connections for every share on the filer. One connection per share, and it grows each and every day as well as stays there every day. These must be wiped out.
Virus Scan
1. vscan                                         ---- to see the status of virus scan
2. vscan on
3. vscan off
4. vscan options
5. vscan scanners
6. vscan options client_msgbox [on|off]
7. vscan scanners secondary_scanners ip1 [ip address]
Quotas
rdfile /etc/quotas

Cluster Prerequisite
vol options <vol> create_ucode on
options coredump.timeout.enable on
options coredump.timeout.seconds 60 ( or less )
cf takeover
cf giveback
Sometimes, due to an active state, this may not run. Make sure that no cifs sessions are running; snapmirror should also be off.

SAN / FCP
switch> cfgshow
>fcp show cfmode    ( standby, partner, mixed )
>fcp set cfmode mixed
>fcp show adapters
>fcp show initiators
>fcp setup
>fcp set cfmode [ dual_fabric | mixed | partner | standby ]
>fcp nodename
>fcp config
>fcp status
>fcp start
>fcp config 10b
>igroup show
>fcp stats vtic     ( virtual target interconnect adapter )
>fcp stats 5a
>sysstat -f 1
igroup show
lun show -m
lun show -v
/usr/sbin/lpfc/lputil
/opt/NTAPsanlun/bin/create_binding.pl -l root -n <filer ip>
/kernel/drv/sd.conf
lputilnt   ( make sure that the target ids and adapters are here )
#sanlun
LUN ( a provisioning sketch follows the list )
1. lun create
2. lun setup
3. lun show -m, -v
4. lun stats -a -o -i 2
5. lun destroy -f <lun path>   ( the -f flag destroys the lun even if it is mapped )
6. lun move
7. lun map | unmap <lun path> <initiator group> [<lun id>]
8. lun online
9. priv set diag
10. lun geometry
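A provisioning sketch tying the commands above together; the path, size, igroup name, and initiator node name are placeholders:

  filer> lun create -s 10g -t windows /vol/lunvol/lun0
  filer> igroup create -i -t windows win_ig iqn.1991-05.com.microsoft:host1
  filer> lun map /vol/lunvol/lun0 win_ig 0
  filer> lun show -m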
SnapDrive LUN creation process
1. create a qtree
2. share the qtree
3. create the lun — SnapDrive can be used so that the lun is created inside the qtree ( if the qtree is not set up properly, the cifs share cannot be accessed and an "access denied" error message appears )
LUN restore from snapshot ( snap restore of a lun; snap restore licensing required )
Filer> snap restore -t file -s snap1 /vol/lunvol/lun1/lun1.lun
Questions are asked — choose Y; Proceed => Y
Filer> lun unmap <lun path> <initiator group>
Filer> lun map <lun path> <initiator group> [lun id]
Filer> lun online <lun path>
Space reserve for volumes, qtrees, and files is disabled by default; to change this:
vol options vol1 create_reserved on | off
lun create -o noreserve; -f ( overrides the default setting at the file level )
Snapshot of LUN
Rws is the file created when a snapshot of a LUN is taken. Event ID 124 is generated by SnapDrive. When deletion of this snapshot LUN is tried, 134 is created as well. When there is a busy snapshot, other snapshots may hang and 134 is also generated. The sequence 124 -> 249 -> 134 can be seen ( see kb2370 ).
NDMP copy of LUN
ndmpcopy -da root:netapp /vol/vol0/lun/test.lun 10.1.1.1:/vol/vol0/lun/test.lun
( lun files can only be restored to either the root volume or qtree root directories )
( also, since the lun may not be full when copied, the copy may go fast )
After this, on the destination: Dest filer> reallocate start -o -n <lun path>
LUN backup from a snapmirrored volume
1. on both source and destination: create_ucode, convert_ucode ON
2. from the destination filer: snapmirror update [options] <dest_filer:dest_vol>
3. on the source filer: lun share <lun path> read
4. run the snapmirror update command
iSCSI / FCP
iscsi show interface
iswt interface show      ( iswta = software target adapter )
iscsi show initiator
igroup show
iscsi stats
sysstat -i 1
iscsi config
iscsi status             # to make sure that iscsi is running; also check that the iscsi license is enabled on the filer
SuSE iscsi LUN setup — CHAP authentication
filer> iscsi security generate
filer> iscsi security add -i <initiator> -s <method> -p <password> -n <inname> [ -o <outpassword> -m <outname> ]   # ( particular initiator connection )
OR
filer> iscsi security default -s <method> -p <inpassword> -n <inname> [ -o <outpassword> -m <outname> ]   ( any initiator connection ) [[ only this one works ]]
Troubleshooting
1. filer > iscsi config
2. linux # /etc/iscsi.conf
3. linux # /etc/fstab.iscsi
4. linux # uname -r
5. linux # cat /etc/issue
6. filer > iscsi show initiator
7. filer > iscsi security show
8. linux # cat /etc/initiatorname.iscsi
iSCSI private network connection
filer> iswt interface show
filer> iscsi show adapter
filer> iswt session show -v iswta   ( will show the tcp connection ip addresses )
To change this to use the private network only:
1. snapdrive -> iscsi management -> disconnect
2. from the filer, disable iscsi on the public nic: iswt disable adapter < >
3. then reconnect and use the private ip from SnapDrive
To see snapshots from a windows client, check two things:
a. vol options vol0 — nosnap = off, nosnapdir = off ( default ). These should be off; when turned ON, cifs windows clients cannot see or access the snapshots to restore from them.
b. options cifs.show_snapshot on
To get access to \\<ip of filer>\vol0\.snapshot from a windows cifs host, vol0 must be shared; otherwise \vol0\.snapshot cannot be accessed.
NFS snapshot: the .snapshot directory is shown only at the mount point, although it actually exists in every directory in the tree.
CIFS snapshot: the .snapshot directory appears only at the root of the share.
priv set advanced
snap status   ( blocks owned = x 4K = size )
snap list     ( generally snap reserve is 20% )

Solaris troubleshooting for lun
solaris_info [-d <directory name>] [-n <name>]

SnapDrive troubleshooting tool
SnapDrvDc.exe

SnapDrive snapshot lun restore from mirror site
1. Break the mirror
2. Check that the lun is online
3. If using terminal services and you get the "Failure in Checking Policies" error, Error Code 13040, then log off and log back in; if that does not work, reboot the windows host. Single File Snap Restore ( SFSR ) is done before SnapDrive makes the connection. During this time SnapDrive virtually does not work and issues the 13040 error, and no other lun restore can be done from the same host while SFSR is going on in the background. Sol: wait patiently, log off and log back in after a while, and the drive should come back.
Snap restore
Volume restore
snap restore -t vol <path_and_vol_name>
snap restore -t vol -s <snapshot_name> <path_and_vol_name>

File restore
snap restore -t file <path_and_file_name>
snap restore -t file -s <snapshot_name> -r <new_path_and_file_name> <path_and_file_name>

Qtree or directory restore
snap restore -f -t file -s <snapshot> /vol/vol0/<directory name>   - to restore a directory
vol restrict vol1
vol copy start vol0 vol1
vol online vol1
snap list vol1            ( snapshot_for_volcopy.0 )
snap create vol1 snap1

Snap Mirror
/etc/snapmirror.conf
vol status -b vol1                     ( size in blocks )
vol status vol1
options snapmirror.access host=filerA
filerB> vol restrict vol2
>wrfile /etc/snapmirror.conf
filerA:vol1 filerB:vol2 - * * * * (min hour day-mon day-week) filerA:vol1 filerB:vol2 45 10,11,12,13,14,15,16 * 1,2,3,4,5 snapmirror on
vol status -v
filerB> snapmirror initialize -S filerA:vol1 filerB:vol2   # baseline data transfer
snapmirror status
snapmirror status -l        ( more detailed info )
snapmirror off
snapmirror break filerb:vol2
---
snapmirror on
snapmirror quiesce filerB:/vol/vol0/mymirror    ( before breaking a qtree snapmirror )
snapmirror resync -S filerB:vol2 filerA:vol1
snapmirror update filerb:vol2
snapmirror off                     # disable snapmirror
snapmirror on                      # resume snapmirror, reread /etc/snapmirror.conf
snapmirror break vol2              # converts a mirror to a read/write vol or qtree on the dest filer
snapmirror destinations -s source_volname
snapmirror release vol1 filerc:vol1
snapmirror status -l vol1
Breaking snapmirror
1. snapmirror quiesce <destination path>
2. snapmirror off
3. snapmirror break <destination path>
To resume the operation, you have to resync.
snapmirror store       # initialize a volume snapmirror from tape, on the source vol
snapmirror retrieve    # on the mirror vol
# --- check from the /etc/snapmirror.conf file
Synchronous SnapMirror
/etc/snapmirror.conf:
  filera:/vol1 filerb:/vol2 - sync
# multi path
  src_con = multi()
  src_con:/vol1 dest:/vol2 - sync
  # src_con = failover()
Steps to create a mirror ( see the consolidated sketch below )
1. Enter the license on both filers
2. Use the snapmirror.access option to specify the destination filer
3. On the destination filer, edit the /etc/snapmirror.conf file
4. On both source and destination filers, enter the snapmirror on command
5. On the destination filer, run the snapmirror initialize <destination> command
Requirements: the destination vol must be restricted; everything in the destination will get deleted once it is initialized.
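A consolidated sketch of the steps above; filer and volume names are placeholders:

  filerA> options snapmirror.access host=filerB
  filerB> vol restrict vol2
  filerB> wrfile /etc/snapmirror.conf     ( add the line: filerA:vol1 filerB:vol2 - 45 10,11,12,13,14,15,16 * 1,2,3,4,5 )
  filerA> snapmirror on
  filerB> snapmirror on
  filerB> snapmirror initialize -S filerA:vol1 filerB:vol2
  filerB> snapmirror status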
SnapMirror optimization
filer> options snapmirror.window_size 199475234 ( default )
( this causes large bursts of packets and does not work well over a WAN; it may cause large packet drops, resulting in termination of the snapmirror transfer or very low throughput )
To change this: Dest filer> options snapmirror.window_size < between 8760 and 199475234 >
SnapMirror problem
On the source filer: "Snapmirror source transfer from <vol> to <destination filer>:<vol>: request denied, previous request still pending" / "Socket connect error: resource temporarily unavailable"
Sol: On the destination, 1. make sure that the vol is there 2. the source is pingable
Destination mirror filer> snapmirror abort netapp01:vol1 pcnetapp01:vol1   OR   snapmirror abort netapp01 pcnetapp01
Destination filer> snapmirror status       ( see that the transfer has stopped )
Destination filer> snapmirror resync -S netapp01:vol1 pcnetapp01:vol1

Snapvault
baseline qtree ( see the consolidated sketch after this list ):
>snapvault start -S filer:/vol/vol1/c1-v1-q1 vault:/vol/volx/t1-v1-q1
>snapvault modify -S src_filer:qtree_path
>snapvault update <dest>
>snapvault snap sched <volume> <snapname> count@day_list@hour_list
>snapvault snap sched vol1 sv_1900 4@mon-sun@19
>snapvault snap unsched
>snapvault snap create           # manually create a snapshot on the primary or secondary
>snapvault restore -s <snapname> -S srcfiler:/vol/volx/qtree <destination qtree>   # snapshot name must exist
>snapvault stop destfiler:/vol/volx/qtree
>snapvault abort dest_qtree
>snapvault release src_qtree dest_qtree
>snapvault status
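A consolidated sketch of a SnapVault baseline and schedule; filer, volume, and qtree names and the schedule strings are placeholders:

  primary>   options snapvault.enable on
  primary>   options snapvault.access host=vaultfiler
  secondary> options snapvault.enable on
  secondary> options snapvault.access host=srcfiler
  secondary> snapvault start -S srcfiler:/vol/vol1/q1 /vol/volx/q1
  primary>   snapvault snap sched vol1 sv_hourly 4@mon-fri@8-17
  secondary> snapvault snap sched -x volx sv_hourly 4@mon-fri@18    ( -x makes the secondary pull a transfer before creating its snapshot )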
SnapVault troubleshooting
If a backup relationship from OSSV is created and then deleted from the secondary, any attempt to recreate it fails with the error message: "Transfer aborted: the qtree is not the source for the snapmirror destination".
Example:
twain*> snapvault start -S fsr-pc1:E:\smalldir /vol/tiny/smalldir
( error at the console: snapvault: destination transfer from fsr-pc1:E:\smalldir to /vol/tiny/smalldir: the qtree is not the source for the snapmirror destination
Transfer aborted: the qtree is not the source for the snapmirror destination )
To work around this, release the relationship on the primary using:
snapvault.exe release E:\smalldir twain:/vol/tiny/smalldir
Backup with DFM
>options ndmpd.enable on
>options ndmpd.access dfm-host
>options ndmpd.authtype < challenge | plaintext >
For a non-root user, get the ndmp password with: ndmpd password <user name>
add snapvault license >options snapvault.enable on >options snapvault.access host >options ndmpd.preferred_interface e2 #optional
Importing an existing relationship: just add the relationship ( the schedule/retention are not imported ).
Diagnosis between DFM host and Filer At host where DFM is downloaded C:\> dfm host diag < primary filer >
C:\> dfm database get
  dbDir         c:/Program files/Network Appliance/Data Fabric Manager/DFM/Data
  dbLogDir
  dbCacheSize
Snaplock
vol create src1 -L 2   ( at this point a question is asked; read it carefully — this volume cannot be deleted, it is a one-time decision )
vol create dst1 2
vol status             ( you will see the snaplock compliance vol here )
snapmirror initialize -S giardia:src1 -L giardia:dst1
VIF
Create steps
a) vif create single vif1 e0 e7a e6b e8      -------- single mode
   OR
   vif create multi vif0 e4 e10              ---- multi mode
b) ifconfig vif1 <ip of vif> netmask 255.255.255.0 mediatype 100tx-fd
c) update the /etc/rc file ( see the sketch below )
d) reboot
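A hedged /etc/rc sketch for the multi-mode example above; the IP address and netmask are placeholders:

  vif create multi vif0 e4 e10
  ifconfig vif0 192.168.1.50 netmask 255.255.255.0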
Tip 2: If there are 3 ports ( e.g. 2 Gigabit and 1 100Base-T Ethernet ), e0 ( the default 100Base-T ) must be turned off.
Vfiler
The hosting filer administrator does not have CIFS or NFS access to the data contained in vfilers, except for that in vfiler0. After a storage unit is assigned to a vfiler, the hosting filer administrator loses access to that storage unit. The vfiler administrator gains access to the vfiler by rsh to the vfiler's IP address.
As the hosting filer administrator, before you create a vfiler with the /vol/vol1 volume, you can configure the /etc/exports file so that you can mount the /vol/vol1 volume. After you create the vfiler, an attempt to mount the /vol/vol1 volume would result in the "Stale NFS file handle" error message. The vfiler administrator can then edit the vfiler's /etc/exports file to export /vol/vol1, run the exportfs -a command on the vfiler, then mount /vol/vol1, if allowed.
>ipspace create vfiler1-ipspace
>ipspace assign vfiler1 e3a
>ifconfig e3a 0.0.0.0
>ipspace destroy e3a_ipspace
>ipspace list
>vfiler create vfiler2 -s vfiler2 -i 10.41.66.132 /vol/vfiler/vfiler2
>vfiler status -a
>vfiler status -r      # running
>vfiler run vfiler1 setup
>vfiler stop | start | destroy
VFM
Cache location: C:\Documents and Settings\All Users\Application Data\NuView\Storage\Server\Cache
The VFM application data folder is <C:\Documents and Settings\All Users\Application Data\NuView>, which contains the cache directory. To relocate it:
1. Take a snapshot of the application in case there is a need to return to the working state. This can be done through VFM in the Tools menu by selecting Take Application Snapshot Now. Have the user create a snapshot and save it. Also save a copy of the VFM application folder <C:\Documents and Settings\All Users\Application Data\NuView> somewhere for backup purposes.
2. Exit VFM and stop the StorageXReplicationAgent service and the StorageXServer service.
3. Create a folder on a different drive on the VFM server where the application directory should reside in the future. Use a local destination for the folder, for example D:\VFMAppData. A mapped drive does not work in this situation.
4. Create a new subdirectory called NuView in the new location, e.g. D:\VFMAppData\NuView.
5. Go to the C:\Documents and Settings\All Users\Application Data\NuView directory and copy the StorageX directory to the new location created under the NuView subdirectory. The new location should look something like this: D:\VFMAppData\NuView\StorageX
6. Open the registry with regedit.exe and find the HKEY_LOCAL_MACHINE\SOFTWARE\NuView\StorageX key. Add a new String Value here with the name AppDataDir and set the value data to the root of the new cache location, e.g. D:\VFMAppData.
7. Close regedit and start the StorageX Server and Replication Agent services.
8. Start VFM and wait as it reads through the new cache directory and loads the roots and information that were copied to the new location.
Backup media fundamentals
ndmpd should be ON. To check:
Filer> ndmpd status
Filer> ndmpd probe 0           [ session 0; can be from 0-5 ]
sysconfig -t                   ( will give some backup media information )
mt -f nrst0a status
restore -tf nrst0a             # display the file list; there can be multiple backups in a backup file
mt -f nrst0a fsf 1
storage disable adapter <port>
storage enable adapter <port>
storage show tape supported    - should show the wwn if supported
( sysconfig -a will tell the port and also shows whether the adapter card is online or offline, usually slot 10 )
/etc/log/backup                ----- log file
List all the files in the backup: restore -tf rst0a
Filer> mt -f rst0a rewind       ( rewind the tape )
Filer> mt -f rst0a fsf 6        ( move the head to file list 6 )
Filer> mt -f rst0a status       ( make sure )
Filer> restore xvbfDH 60 rst0a /vol/vol0/    ( restore )
SCSI tape diagnostics to send to the vendor ( more detailed messages ):
Filer> mt -f <device> diag 1   -------- ON
Filer> mt -f <device> diag 0   -------- OFF
( with diag 1 ON, all the messages go to /etc/messages whenever any backup job or command is executed, such as mt -f, rewind, offline, erase, status, diag, etc. )
Some issues
a. If Veritas is showing RED for the LTO tape devices, reboot the LTO and restart the Veritas services.
b. If backup is done from the Veritas software, make sure that no sessions are left behind as cifs share sessions. Go to My Computer -> Manage -> connect to the filer -> Shares -> Sessions. Administrative shares from backups are seen here sticking around, not going away even after the backup is complete, and you see a huge list here.
Filer> fcadmin online adapter 8a
Filer> fcadmin online adapter 8b
Filer> fcp show adapter
filer> storage show tape
  Tape Drive:        FPN[200300051e35353d]:0.80
  Description:       HP Ultrium 2-SCSI
  Serial Number:
  World Wide Name:
  Alias Name(s):
  Device State:
McData side
CNXNAS*> storage show switch
  Switch:   WWN[1:000:080088:020751]
  Fabric:   WWN[1:000:080088:020751]
  Name:     CNX01
  Domain:   97
  Type:     switch
  Version:  06.01.00
If the port is changed, the alias name also gets changed.
storage unalias -a
storage alias mc0 WWN[xx:xxx:xxxxxx:xxxxxx][Lx]
storage alias st0 WWN[yy:yyy:yyyyyy:yyyyyy][Ly]
and to cause the filer to create the aliases via the "source" command:
source /vol/vol0/etc/tape_alias
Tape for target — slot x: Fibre Channel Target Host Adapter 11a
  (Dual-channel, QLogic 2312 (2342) rev. 2, 64-bit, <OFFLINED BY USER/SYSTEM>)
  Firmware rev:     3.3.10
  Host Port Addr:   000000
  Cacheline size:   8
  SRAM parity:      Yes
  FC Nodename:      50:0a:09:80:83:f1:45:b6 (500a098083f145b6)
  FC Portname:      50:0a:09:81:83:f1:45:b6 (500a098183f145b6)
  Connection:       No link
  I/O base 0xde00, size 0x100
  memory mapped I/O base 0xa0400000, size 0x1000

Drives — slot x: FC Host Adapter 3a
  (Dual-channel, QLogic 2312 (2342) rev. 2, 64-bit, L-port, <OFFLINE (hard)>)
  Firmware rev:     3.3.142
  Host Loop Id:     0
  FC Node Name:     2:000:00e08b:1c780b
  Cacheline size:   16
  FC Packet size:   2048
  SRAM parity:      Yes
  External GBIC:    No
  Link Data Rate:   1 Gbit
  I/O base 0x9e00, size 0x100
  memory mapped I/O base 0xa0c00000, size 0x1000

Time
Not synchronizing, +5 min skewed ahead:
  options timed
  timed.enable on
  timed.servers ntp2.usno.navy.mil:<ip address>
  rdate <host>
Out of inodes
>maxfiles <vol>     - will display the number
OR
df -i /vol/vol0
NDMP copy from vol to vol
( the /etc/hosts.equiv file must have both filers' entries )
( best solution for data migration; snapmirror or vol copy will cause fragmentation - the filer will retain ACLs )
a) ndmpcopy source:path_to_vol destination:path_to_vol -level 0 -dpass
For data changed since the level 0:
b) ndmpcopy source:path_to_vol destination:path_to_vol -level 1 -dpass
Finally turn off cifs and nfs for the final incremental backup:
c) ndmpcopy source:path_to_vol destination:path_to_vol -level 9 -dpass
( After the level 0 is done, a level 1 ndmpcopy may be done to copy the data that has changed since the level 0 )

Data Migration
ndmpcopy /vol/vol0trad /vol/vol0
vol options /vol/vol0 root   ( this will also automatically set the aggregate option to root upon the next reboot )
Tip:
If data is wrongly copied to a vol, sometimes we see a vol folder inside vol0 and the vol cannot be deleted. When accessed via \\filer\C$ we see the vol folder and it cannot be deleted — it says folders lost or not found. In that case the folder can be deleted after renaming: renaming is possible, so rename it and then delete it.
Sync core
When the filer cannot be accessed by telnet, console, rsh, or FilerView, press the reset button at the back ( while physically connected to the console ):
Ok> sync
Ok> bye
The filer reboots. Once the filer comes up, get the core file from /etc/crash.
Unix commands
# tip hardwire               ---- direct access from unix/linux to the filer
# cat messages | grep shelf
1. Error Code 9035: An attempt to resize lun /vol/vtape/nvlun0/lun0.lun on filer 10.40.3.2 failed. Description: new size exceeds this lun's geometry.
Sol: the requested size was more than 10x. A lun cannot be increased to more than 10 times its initial size; e.g. if the initial size is 130 GB then 1300 GB is the maximum possible.
Exchange database restore to a different location: SnapDrive is used to restore the snapshot, and hence the database files, if a different location is desired.
SnapManager for Exchange error 249
249: unable to delete snapshot due to busy snapshot state. So the sequence 124 -> 134 -> 249 is created, due to overlapping of the SME backup and snapshot scheduling timings. In SME, the verification process ( the 7th or 8th process ) also fails. See kb2370.
Exchange restore
1. Up to the minute: whatever has happened since the last crash — the log files are replayed automatically, so the database is brought up to the minute. A test backup and restore will have fundamentally no effect: if a backup is done, a mail is deleted, and a restore is done instantly, the intermediate log files come into play and hence there is no effect. Basically the mail was within the database, so the system takes it as already existing and ignores it, but you will have all the latest mail.
2. Point in time: restores up to that time; all the backups after that date and time are not usable. You cannot get mails after that point; the logs are deleted.
1. Mount the previous SME snapshots, both database and log files.
2. Copy those to the recovery storage group directory ( if you try to restore directly you will get exchange error C1041739 ).
3. Copy the eseutil.exe files to that directory; the files are Eseutil.exe, Ese.dll, Exchmem.dll, Exosal.dll, Jcb.dll.
4. Run eseutil /mh priv1.edb
5. Run eseutil /p priv1.edb
6. Restore; during that time the system asks to overwrite the database files — it is the option at the bottom of the database properties; choose that option and restore.
7. The system is mounted.
When the IS is created in the log volume
When the IS is created in the log volume, if it was separate before, SME fails and reports Event ID 111 and Event ID 131.
VSS_E_PROVIDER_VETO
Error in calling VSS API : Error code = 0x80042306 Error description: VSS_E_PROVIDER_VETO ( Error Code 0x80042306) Event id 4357 also displayed
1. Check with vssadmin list providers — it should show both MS & NetApp providers.
2. Make sure that the RPC service is ON and the NetApp VSS Hardware Provider is ON. If a previous version of the provider is installed, remove it and install only what comes with SME.
3. Verify that the node name is the same:
   C:\> iscsicli sessionlist
   C:\> iscsicli
   Both should show the same node name, like iqn.2005-01.test:01
Remote verification error log ( if the verification server is different from the present server )
Transactional log verification failed. Error code: 0xC004146f. SnapManager database verification failed.
Sol: check the exact versions of these 5 files:
  Eseutil.exe 6.5.7226.0
  Ese.dll
  Jcb.dll
  Exosal.dll
  Exchmem.dll
( While a different verification server is used, the symptom message says "generated on" and "generated to", so the above 5 files need to be checked. )
Verification failure example
Local error code: 0xC004031d, Event ID 209; also Event ID 264 - job failed
Remote verification: RPC server unavailable, 0xC004146e
Event ID : 177
Sol: The local server had its preferred address set to the dedicated, directly connected cable. The local server has one nic for the filer iscsi connection and another nic for the public network. The filer had two nics: one for the dedicated iscsi connection ( 192.168.1.1 ) and the other, 10.1.8.11, for cifs and other connections. From the remote server, change the preferred ip to 10.1.8.11 ( the other ip of the filer ). Make sure that at least drives can be created from this verification server. After the preferred ip address change, the above error no longer happened.
The main problem was that RPC was not able to ping from the private network to the public network. If remote verification is done from another server, it is advisable not to make the iscsi session to the same nic where the local ( source ) server is talking. That gives an RPC error, error code 0xC004146e, with Event ID 250 and Event ID 117.
Add an /etc/hosts entry for the exchange servers' addresses on the filer, and a \windows\system32\drivers\etc\hosts entry on the exchange servers with the preferred IPs of the filers.
Make sure that the SnapDrive services use the same account & p/w information on both servers.
SME error:
Error Code: 0xC00413e3 — There are no bindings. Facility: Win32, ID no: C00706b6, Microsoft CDO for Exchange Management.
Reason: the MS Exchange services were not started and the databases were not mounted.
Unable to create snapshot. Check application log ( SnapDrive error code: 0xc00402be)
snap list -q <vol>
vol options thisvol nosnapdir off
vol options thisvol convert_ucode on
vol options thisvol create_reserved on
Event ID 51
An error was detected on device \Device\Harddisk7, and the hard disk is a NetApp lun.
1. Apply hotfix i834910
2. Apply SP1
3. Apply iscsi 2.0
4. Look for a pattern; move the database if it is overloaded
5. Check for virus issues
Filer Report
Uptime
11:19pm up 49 mins, 0 NFS ops, 0 CIFS ops, 25 HTTP ops, 0 FCP ops, 0 iSCSI ops
Network Interfaces
Name   Mtu    Network     Address     Ipkts   Ierrs   Opkts   Oerrs   Collis   Queue
ns0    1500   192.168.1   fas121      2k      0       1k      0       0        0
ns1*   1500   none        none        0       0       0       0       0        0
lo     9188   127         localhost   124     0       124     0       0        0
System Configuration ( from command: fas121> sysconfig )
NetApp Release 7.2: Mon Jul 31 14:53:25 PDT 2006
System ID: 0099907364 (fas121)
System Serial Number: 987654-32-0 (fas121)
Model Name: Simulator
Processors: 1
slot 0: NetApp Virtual SCSI Host Adapter v0
        25 Disks: 11.8GB
        2 shelves with LRC
slot 1: NetApp Virtual SCSI Host Adapter v1
slot 2: NetApp Virtual SCSI Host Adapter v2
slot 3: NetApp Virtual SCSI Host Adapter v3
slot 4: NetApp Virtual SCSI Host Adapter v4
        25 Disks: 11.8GB
        2 shelves with LRC
slot 5: NetApp Virtual SCSI Host Adapter v5
slot 6: NetApp Virtual SCSI Host Adapter v6
slot 7: NetApp Virtual SCSI Host Adapter v7
slot 8: NetApp Virtual SCSI Host Adapter v8
        4 Tapes: VT-100MB VT-100MB VT-100MB VT-100MB