Contiv network is not deleted when docker network is deleted#1029
Contiv network is not deleted when docker network is deleted#1029kahou82 merged 8 commits intocontiv:masterfrom
Conversation
Steps to reproduce: 1. Create a docker network with docker network CLI 2. Ensure docker network and contiv network are created (docker network ls and netctl network ls) 3. Remove docker network. 4. Check docker network ls, docker network is deleted 5. Check netctl networ ls, contiv network is still test. The root cause is due to the dockernetworkstate mapping is planned to delete after contiv network is delete. However, the determination of contiv network deletion is determined by the existance of the dockernetworkstate. Thefore, if the state is still there, the contiv network will be unable to delete. The fix is to delete the dockernetworkstate before proceed to contiv network deletion Signed-off-by: Kahou Lei kalei@cisco.com
|
DO NOT MERGE: I am writing unit test. |
mgmtfn/dockplugin/netDriver.go
Outdated
| // delete the dnet oper state | ||
| err = docknet.DeleteDockNetState(dnet.TenantName, dnet.NetworkName, dnet.ServiceName) | ||
| if err != nil { | ||
| log.Errorf("Couldn't delete docknet for nwID %s", networkID) |
There was a problem hiding this comment.
Log the error here to aid with debugging, please:
log.Errorf("Couldn't delete docknet for nwID %s: %s", networkID, err)|
PTAL @gkvijay ^^ |
|
build PR |
|
build PR |
|
build PR |
| log.Errorf("Couldn't delete docknet for nwID %s: %s", networkID, err.Error()) | ||
| } | ||
| log.Infof("Deleted docker network mapping for %v", networkID) | ||
| } |
There was a problem hiding this comment.
shouldn't we be failing if we get err in either of these two cases? under what conditions can docker and contiv networks be out of sync (when there is not a bug) . . . err==nil or not seems too generic, can we check for "not present" vs any other err e.g. "error talking to docker"?
| } | ||
| log.Infof("Deleted docker network mapping for %v", networkID) | ||
| } | ||
|
|
There was a problem hiding this comment.
we deleted docker network, found a contiv netowrk, but failed to delete the contiv netowrk, should we restore the docker network? (i dunno, just asking) . . L771
There was a problem hiding this comment.
There is no hook for me to do that at the moment to restore the network.
There was a problem hiding this comment.
Don't think we must secretly revert what the user does (docker network rm in this case). If contiv network deletion fails, I think an error to the user is good instead of reverting what the user did.
netmaster/docknet/docknet.go
Outdated
| log.Infof("docker v2plugin name (%s) updated to %s", netDriverName, pluginName) | ||
| netDriverName = pluginName | ||
| ipamDriverName = pluginName | ||
| func UpdatePluginName(netdriver string, ipamDriver string) { |
There was a problem hiding this comment.
Surprised go vet didn't complain about that
| log.Infof("docker v2plugin (%s) updated to %s and ipam (%s) updated to %s", | ||
| netDriverName, netdriver, ipamDriverName, ipamDriver) | ||
| netDriverName = netdriver | ||
| ipamDriverName = ipamDriver |
There was a problem hiding this comment.
this change is missing from the commit description
| if delErr != nil { | ||
| t.Fatalf("Unable to delete docker network. Error: %v", delErr) | ||
| t.Fail() | ||
| } |
There was a problem hiding this comment.
shouldn't we check for the network to be absent from both docker and netctl here?
netmaster/docknet/docknet_test.go
Outdated
| func TestUpdatePluginName(t *testing.T) { | ||
| expectNetDriver := "bridge" | ||
| expectIPAMDriver := "default" | ||
| UpdatePluginName("bridge", "default") |
| expectIPAMDriver := "default" | ||
| UpdatePluginName("bridge", "default") | ||
|
|
||
| if expectNetDriver != netDriverName { |
There was a problem hiding this comment.
how is netDriverName in scope?
There was a problem hiding this comment.
Since this Go test file is in the docknet package (see top of file), everything in the docknet package is available here.
This var is defined in docknet.go here: https://github.com/kahou82/netplugin/blob/37a3b829c00226b8af742f80e6c46889c20fd4d2/netmaster/docknet/docknet.go#L44
| t.Fail() | ||
| } | ||
|
|
||
| if expectIPAMDriver != ipamDriverName { |
There was a problem hiding this comment.
chrisplo
left a comment
There was a problem hiding this comment.
my first go code review, maybe my concerns are invalid
|
build PR |
1 similar comment
|
build PR |
|
@chrisplo can you review? It is one of the CVD blockers. |
.idea/vcs.xml
Outdated
| @@ -0,0 +1,6 @@ | |||
| <?xml version="1.0" encoding="UTF-8"?> | |||
There was a problem hiding this comment.
Please git rm this file (or the whole directory). It shouldn't be included in git. You could add it to .gitignore so it's not tracked
mgmtfn/dockplugin/netDriver_test.go
Outdated
| instInfo := core.InstanceInfo{DbURL: "etcd://127.0.0.1:2379"} | ||
| func initFakeStateDriver() { | ||
| instInfo := core.InstanceInfo{} | ||
| d, _ := utils.NewStateDriver("fakedriver", &instInfo) |
There was a problem hiding this comment.
Please check the error and assert that it is nil
(you never know!)
mgmtfn/dockplugin/netDriver_test.go
Outdated
| initFakeStateDriver() | ||
|
|
||
| // Delete Docker network by using helper | ||
| delErr := deleteNetworkHelper("non_existing_network_id") |
There was a problem hiding this comment.
Errors can just be named "err" unless you legitimately need multiple error variables in the same scope at the same time
mgmtfn/dockplugin/netDriver.go
Outdated
| // delete the dnet oper state | ||
| err = docknet.DeleteDockNetState(dnet.TenantName, dnet.NetworkName, dnet.ServiceName) | ||
| if err != nil { | ||
| log.Errorf("Couldn't delete docknet for nwID %s: %s", networkID, err.Error()) |
There was a problem hiding this comment.
Here, I would do something like:
msg := fmt.Sprintf("my %s", "error message")
log.Error(msg)
return errors.New(msg)Return as much useful info to the caller as possible
mgmtfn/dockplugin/netDriver.go
Outdated
| } | ||
| log.Infof("Deleted docker network mapping for %v", networkID) | ||
| } else { | ||
| log.Errorf("Couldn't find Docker network %s: %s", networkID, err.Error()) |
mgmtfn/dockplugin/netDriver_test.go
Outdated
| // initStateDriver initialize etcd state driver | ||
| func initFakeStateDriver() { | ||
| instInfo := core.InstanceInfo{} | ||
| d, _ := utils.NewStateDriver("fakedriver", &instInfo) |
|
build PR |
1 similar comment
|
build PR |
rchirakk
left a comment
There was a problem hiding this comment.
LGTM other than minor comments
| if err != nil { | ||
| msg := fmt.Sprintf("Could not delete docknet for nwID %s: %s", networkID, err.Error()) | ||
| log.Errorf(msg) | ||
| return errors.New(msg) |
There was a problem hiding this comment.
another way to simplify,
err := fmt.Errorf()
log.Error(err)
return err
There was a problem hiding this comment.
I am going to stick with the current implementation
| } | ||
| log.Infof("Deleted docker network mapping for %v", networkID) | ||
| } else { | ||
| msg := fmt.Sprintf("Could not find Docker network %s: %s", networkID, err.Error()) |
There was a problem hiding this comment.
fmt package takes care of calling Error()
fmt.Sprintf("xxx %s", err)
There was a problem hiding this comment.
I am going to stick with the current implementation
There was a problem hiding this comment.
i understand now that due to other bad behavior (not erroring in v2plugin mode when requesting to create the network) that we can get out of sync, I'm just looking for a ticket to block the netctl when in v2plugin mode, then we can clean up this lax behavior here
mgmtfn/dockplugin/netDriver.go
Outdated
| } else { | ||
| msg := fmt.Sprintf("Could not find Docker network %s: %s", networkID, err.Error()) | ||
| log.Errorf(msg) | ||
| return errors.New(msg) |
There was a problem hiding this comment.
don't return, log error message & proceed to delete network.
| if err != nil { | ||
| t.Fatalf("Unable to delete docker network. Error: %v", err) | ||
| t.Fail() | ||
| } |
There was a problem hiding this comment.
checking the return is non-error is good, but that didn't actually test that the network was deleted, can you check w/ docker/netctl for current status?
Take out return statement when docker network is not there.
|
build PR |
| } | ||
| log.Infof("Deleted contiv network %v", networkID) | ||
| } else { | ||
| log.Infof("Could not find contiv network %v", networkID) |
There was a problem hiding this comment.
We should still be bubbling up an error here (like bad request to user for deleting a network that didn't exist), but happy to see this as a new ticket
There was a problem hiding this comment.
At the moment, I don't think return error here will bubbling up to the docker network create command. We have to have another ticket to investigate it.
concerns addressed or discussed and understood
|
build PR |
Steps to reproduce:
The root cause is due to the dockernetworkstate mapping is planned to delete after
contiv network is delete. However, the determination of contiv network deletion is
determined by the existance of the dockernetworkstate. Thefore, if the state is still
there, the contiv network will be unable to delete.
The fix is to delete the dockernetworkstate before proceed to contiv network deletion
Also refactor docker plugin update by separating the plugin and ipam driver
Signed-off-by: Kahou Lei kalei@cisco.com