
raft: kill TODO about behavior when snapshot fails #3609


Merged
yichengq merged 1 commit into etcd-io:master from the raft-snapshot branch on Sep 30, 2015

Conversation

yichengq
Contributor

etcd is going to support incremental snapshots, and the design is to let it send out at most one snapshot at a time in the first stage. So while one snapshot is in flight, a snapshot request will return an error.

When raft fails to get a snapshot while sending MsgSnap, it prints a related log message and aborts sending the message.

/cc @xiang90 @bdarnell

for #3549
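As a rough illustration of the "at most one snapshot in flight" rule described above, an application-side snapshot sender might guard requests like the sketch below. This is hypothetical code for illustration only; the names (sender, Request, Done, errSnapshotInFlight) are not part of this PR or of etcd.

```go
// Hypothetical sketch: not part of this PR. It only illustrates the
// "at most one snapshot in flight" rule described in the PR text.
package snapshot

import (
	"errors"
	"sync"
)

var errSnapshotInFlight = errors.New("a snapshot is already in flight")

// sender tracks whether a snapshot is currently being sent to a follower.
type sender struct {
	mu       sync.Mutex
	inFlight bool
}

// Request returns an error while a previous snapshot is still in flight,
// so the caller skips sending another one instead of blocking.
func (s *sender) Request() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.inFlight {
		return errSnapshotInFlight
	}
	s.inFlight = true
	return nil
}

// Done marks the in-flight snapshot as finished so a new one may be requested.
func (s *sender) Done() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.inFlight = false
}
```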

@@ -263,7 +263,8 @@ func (r *raft) sendAppend(to uint64) {
 		m.Type = pb.MsgSnap
 		snapshot, err := r.raftLog.snapshot()
 		if err != nil {
-			panic(err) // TODO(bdarnell)
+			r.logger.Debugf("%x failed to send snapshot to %x because snapshot is unavailable (%v)", r.id, to, err)

Can we still check the error before logging? We can still panic in the other cases. But if @bdarnell thinks it is OK, I think we can just do the logging. Just trying to avoid unnecessary logic changes.

@bdarnell
Contributor

LGTM, but I'd rather use Warningf instead of Debugf. Or, if Warningf would be too noisy for you, we could introduce a new special error like ErrUnavailable which snapshot() could return to skip the snapshot without logging.

@yichengq
Contributor Author

The snapshot message may be sent every heartbeat interval in probe state. If the heartbeat interval is 0.1s, it would print 10 messages per second. So I don't want to print it when an unavailable snapshot is expected.

introduce a new special error like ErrUnavailable which snapshot() could return to skip the snapshot without logging.

Sounds good. Will do.

As for the panic, I would keep the original panic in the other cases, considering I don't know how it will be used in the future.
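For reference, a minimal self-contained sketch of the handling described here (skip quietly on the special error, keep the panic for anything else) could look like the following. The function and its parameters are placeholders, not the actual raft code.

```go
package raftsketch

import (
	"errors"
	"log"
)

// ErrSnapshotTemporarilyUnavailable mirrors the sentinel error this PR
// introduces (its exact name is still under discussion below).
var ErrSnapshotTemporarilyUnavailable = errors.New("snapshot is temporarily unavailable")

// maybeSendSnapshot is a hypothetical stand-in for the MsgSnap branch of
// sendAppend: log and skip on the special error, panic on anything else.
func maybeSendSnapshot(id, to uint64, snapshot func() ([]byte, error), send func([]byte)) {
	snap, err := snapshot()
	if err != nil {
		if err == ErrSnapshotTemporarilyUnavailable {
			log.Printf("%x failed to send snapshot to %x because snapshot is temporarily unavailable", id, to)
			return
		}
		panic(err) // unexpected error: keep the original panic behavior
	}
	send(snap)
}
```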

@yichengq
Contributor Author

Updated. PTAL

@@ -31,6 +31,8 @@ const noLimit = math.MaxUint64

 var errNoLeader = errors.New("no leader")

+var ErrTemporarilyUnavailable = errors.New("snapshot is temporarily unavailable")

Either name the variable ErrSnapshotTemporarilyUnavailable or remove the word "snapshot" from the message.


ErrSnapshotTemporarilyUnavailable seems better.

@bdarnell
Contributor

LGTM

@@ -57,6 +57,9 @@ type Storage interface {
 	// first log entry is not available).
 	FirstIndex() (uint64, error)
 	// Snapshot returns the most recent snapshot.
+	// If snapshot is temporarily unavailable, it should return ErrTemporarilyUnavailable,
+	// so raft instance could know that Storage needs some time to prepare

instance -> statemachine
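To make the documented contract concrete, here is a hypothetical, simplified Storage backend that prepares snapshots asynchronously and returns the sentinel error while none is ready. The snapshot type is reduced to a byte slice (the real interface uses raftpb.Snapshot), and the names are illustrative only.

```go
package storagesketch

import (
	"errors"
	"sync"
)

// ErrTemporarilyUnavailable mirrors the sentinel error added in this PR.
var ErrTemporarilyUnavailable = errors.New("snapshot is temporarily unavailable")

// asyncStorage is a hypothetical backend that builds snapshots in the background.
type asyncStorage struct {
	mu   sync.Mutex
	snap []byte // most recent complete snapshot; nil while one is being prepared
}

// Snapshot returns the latest snapshot, or ErrTemporarilyUnavailable while the
// backend still needs time to prepare one; the raft state machine is expected
// to retry later rather than treat this as fatal.
func (s *asyncStorage) Snapshot() ([]byte, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.snap == nil {
		return nil, ErrTemporarilyUnavailable
	}
	return s.snap, nil
}

// setSnapshot installs a newly prepared snapshot.
func (s *asyncStorage) setSnapshot(data []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.snap = data
}
```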

@xiang90
Contributor

xiang90 commented Sep 29, 2015

LGTM

yichengq added a commit that referenced this pull request Sep 30, 2015
raft: kill TODO about behavior when snapshot fails
@yichengq yichengq merged commit 533e728 into etcd-io:master Sep 30, 2015
@yichengq yichengq deleted the raft-snapshot branch September 30, 2015 02:32