
9/2/22, 4:46 PM Isilon: SyncIQ workers repeatedly restart causing replicated data to be larger than the actual data set | Dell India

Article Number: 000032506 Print

Isilon: SyncIQ workers repeatedly restart causing replicated data to be larger than the actual data set
Summary: In OneFS 8.*, the data set on the production cluster is smaller than the data replicated to the
target cluster. This issue is caused by workers being repeatedly restarted, which can grow the data on the
target cluster to twice the size of the original data set.

Article Content

Symptoms

In OneFS 8.*, when data is replicated to the target cluster, workers are restarted repeatedly. This causes the replicated data to be
larger than the actual data set.

A worker should start only once in order to keep the replicated data in sync with the production data set. If a worker restarts
repeatedly, data keeps accumulating on the target side. In the output below, the workers are restarting 2-5 times. This is
an issue with the worker pools introduced in OneFS 8.*.

Cluster-1# cat restart_workers.out | grep -o 'primary.*' | cut -d ' ' -f1-6 | sort | uniq -c | sort -n

2 primary[itsShared-CALSSync:1503969873]: Starting worker 21 on 'itsShared-CALSSync' ------> The number on the far left is the count of times the worker started.
2 primary[itsShared-CALSSync:1503969873]: Starting worker 23 on 'itsShared-CALSSync'
3 primary[itsShared-CALSSync:1503969873]: Starting worker 18 on 'itsShared-CALSSync'
4 primary[itsShared-CALSSync:1503969873]: Starting worker 19 on 'itsShared-CALSSync'
4 primary[itsShared-CALSSync:1503969873]: Starting worker 20 on 'itsShared-CALSSync'
4 primary[itsShared-CALSSync:1503969873]: Starting worker 22 on 'itsShared-CALSSync'
5 primary[itsShared-CALSSync:1503969873]: Starting worker 17 on 'itsShared-CALSSync'
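The counting pipeline above can be tried outside the cluster. The sketch below runs the same grep/cut/sort/uniq chain against a tiny sample log; the log lines, timestamps, and worker numbers are hypothetical, modeled on the output shown above:

```shell
# Build a small sample log (hypothetical lines modeled on the output above).
log=$(mktemp)
printf '%s\n' \
  "Aug 29 12:00:01 primary[itsShared-CALSSync:1503969873]: Starting worker 17 on 'itsShared-CALSSync'" \
  "Aug 29 12:00:05 primary[itsShared-CALSSync:1503969873]: Starting worker 17 on 'itsShared-CALSSync'" \
  "Aug 29 12:00:09 primary[itsShared-CALSSync:1503969873]: Starting worker 21 on 'itsShared-CALSSync'" \
  > "$log"

# Same pipeline as the article: strip everything before 'primary', keep the
# first six space-separated fields, then count and sort the duplicates.
counts=$(grep -o 'primary.*' "$log" | cut -d ' ' -f1-6 | sort | uniq -c | sort -n)
echo "$counts"

rm -f "$log"
```

Worker 17 shows a count of 2 in this sample; on a healthy job every worker line should show a count of 1.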

Cause

This issue started in OneFS 8.*, when SyncIQ moved to a worker pool mode. When jobs are restarted or stopped, SyncIQ
rebalances the workers across jobs. This causes a deviation in size between the source and target data sets.

Resolution

As a workaround, you can revert to the OneFS 7.x worker behavior by disabling worker pool mode. This change is made
by modifying the siq-conf.gc configuration file on the source cluster with the following procedure.

1. Disable all SyncIQ policies as a preparation for a change of global configuration file:

Cluster-1# isi sync policies disable all

2. Cancel any running SyncIQ jobs:

Cluster-1# isi sync jobs cancel all

3. Disable the isi_migrate service:

Cluster-1# isi services -a isi_migrate disable

4. Create a backup of the siq-conf.gc file:


Cluster-1# cd /ifs/.ifsvar/modules/tsm/config
Cluster-1# cp ./siq-conf.gc ./siq-conf.gc.bak_<YYYY>-<MM>-<DD>

where YYYY is the 4-digit year, MM is the 2-digit month and DD is the 2-digit day of the file backed up.
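As an optional convenience (not part of the original procedure), the dated suffix can be generated with date(1) instead of being typed by hand. The filename pattern below is the one described above:

```shell
# Generate the backup suffix in the <YYYY>-<MM>-<DD> form described above.
stamp=$(date +%Y-%m-%d)
backup="siq-conf.gc.bak_${stamp}"
echo "$backup"

# On the cluster you would then run, for example:
#   cp ./siq-conf.gc "./${backup}"
```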

5. Modify the siq-conf.gc file and add these entries.

Under the coordinator section add:


+coordinator.workers_per_node {token:1} = "3"
+coordinator.max_wperpolicy {token:1} = "40"
+coordinator.skip_work_host {token:1} = "1" <------ This is the real change that disables worker pools

Under the scheduler section add:


+scheduler.max_concurrent_jobs {token:1} = "5"

6. Double-check the changes against the backup:

Cluster-1# diff -y ./siq-conf.gc ./siq-conf.gc.bak_<YYYY>-<MM>-<DD>
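With the modified file as the first argument, `diff -y` marks lines that exist only in the left (modified) file with `<`, so the four added entries should be the only `<` lines in the output. The sketch below demonstrates the marker direction with two throwaway files (the file names and contents are illustrative, not the real siq-conf.gc):

```shell
# Demonstrate diff -y markers with two hypothetical files: 'new' has one
# extra line, which diff flags with '<' because it exists only on the left.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/old"
printf 'a\n+coordinator.skip_work_host {token:1} = "1"\nb\n' > "$dir/new"
out=$(diff -y "$dir/new" "$dir/old" || true)   # diff exits 1 on differences
echo "$out"
rm -rf "$dir"
```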

7. Re-enable the isi_migrate service:

Cluster-1# isi services -a isi_migrate enable

8. Re-enable all SyncIQ policies:

Cluster-1# isi sync policies enable --all

Article Properties

Affected Product
Isilon SyncIQ

Product

Isilon, PowerScale OneFS

Last Published Date


28 Jul 2022

Version
4

Article Type
Solution

