Not able to set SLAVE role on device openflow:97702827111235, reason: Was not able to propagate SLAVE role on device. Error: org.opendaylight.openflowplugin.impl.role.RoleChangeException: Can't commit a completed message.

Description

https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-3node-0cmb-1ctl-2cmp-openstack-queens-upstream-stateful-oxygen/95/odl_3/odl3_karaf.log.gz

The exception and errors below are repeated numerous times as ovs nodes are initially connected to ODL. This looks to lead to initial default flows not being programmed on the switches.

2018-10-07T21:49:21,087 | INFO | pool-83-thread-1 | SalRoleServiceImpl | 382 - org.opendaylight.openflowplugin.impl - 0.6.4 | SetRole called with input:SetRoleInput [_controllerRole=BECOMESLAVE, _node=NodeRef [_value=KeyedInstanceIdentifier{targetType=interface org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node, path=[org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes, org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node[key=NodeKey [_id=Uri [_value=openflow:97702827111235]]]]}], augmentation=[]] 2018-10-07T21:49:21,088 | INFO | pool-83-thread-1 | SalRoleServiceImpl | 382 - org.opendaylight.openflowplugin.impl - 0.6.4 | Requesting state change to BECOMESLAVE 2018-10-07T21:49:21,088 | INFO | pool-83-thread-1 | SalRoleServiceImpl | 382 - org.opendaylight.openflowplugin.impl - 0.6.4 | RoleChangeTask called on device:openflow:97702827111235 OFPRole:BECOMESLAVE 2018-10-07T21:49:21,088 | INFO | pool-83-thread-1 | RoleService | 382 - org.opendaylight.openflowplugin.impl - 0.6.4 | getGenerationIdFromDevice called for device: openflow:97702827111235 2018-10-07T21:49:21,093 | WARN | pool-83-thread-1 | OutboundQueueEntry | 391 - org.opendaylight.openflowplugin.openflowjava.openflow-protocol-impl - 0.6.4 | Can't commit a completed message. 2018-10-07T21:49:21,096 | ERROR | pool-83-thread-1 | RoleService | 382 - org.opendaylight.openflowplugin.impl - 0.6.4 | getGenerationIdFromDevice RPC error null 2018-10-07T21:49:21,099 | WARN | pool-83-thread-1 | ContextChainHolderImpl | 382 - org.opendaylight.openflowplugin.impl - 0.6.4 | Not able to set SLAVE role on device openflow:97702827111235, reason: Was not able to propagate SLAVE role on device. Error: org.opendaylight.openflowplugin.impl.role.RoleChangeException: Can't commit a completed message. 2018-10-07T21:49:21,100 | INFO | pool-83-thread-1 | ReconciliationManagerImpl | 376 - org.opendaylight.openflowplugin.applications.reconciliation-framework - 0.6.4 | Stopping reconciliation for node Uri [_value=openflow:97702827111235] 2018-10-07T21:49:21,105 | INFO | pool-83-thread-1 | ContextChainImpl | 382 - org.opendaylight.openflowplugin.impl - 0.6.4 | Closed clustering services registration for node openflow:97702827111235 2018-10-07T21:49:21,106 | INFO | pool-83-thread-1 | GuardedContextImpl | 382 - org.opendaylight.openflowplugin.impl - 0.6.4 | Terminating DeviceContextImpl[NEW] service for node openflow:97702827111235 2018-10-07T21:49:21,106 | INFO | pool-83-thread-1 | GuardedContextImpl | 382 - org.opendaylight.openflowplugin.impl - 0.6.4 | Terminating RpcContextImpl[NEW] service for node openflow:97702827111235 2018-10-07T21:49:21,107 | INFO | pool-83-thread-1 | GuardedContextImpl | 382 - org.opendaylight.openflowplugin.impl - 0.6.4 | Terminating StatisticsContextImpl[NEW] service for node openflow:97702827111235 2018-10-07T21:49:21,107 | INFO | pool-83-thread-1 | StatisticsContextImpl | 382 - org.opendaylight.openflowplugin.impl - 0.6.4 | Stopping running statistics gathering for node openflow:97702827111235 2018-10-07T21:49:21,113 | INFO | pool-83-thread-1 | GuardedContextImpl | 382 - org.opendaylight.openflowplugin.impl - 0.6.4 | Terminating RoleContextImpl[NEW] service for node openflow:97702827111235 2018-10-07T21:49:21,190 | INFO | epollEventLoopGroup-9-15 | AbstractConnectionAdapter | 391 - org.opendaylight.openflowplugin.openflowjava.openflow-protocol-impl - 0.6.4 | The channel outbound queue size:1024 2018-10-07T21:49:21,191 | INFO | epollEventLoopGroup-9-15 | ConnectionAdapterImpl | 391 - org.opendaylight.openflowplugin.openflowjava.openflow-protocol-impl - 0.6.4 | Hello received 2018-10-07T21:49:21,196 | INFO | epollEventLoopGroup-9-15 | ContextChainHolderImpl | 382 - org.opendaylight.openflowplugin.impl - 0.6.4 | Device openflow:97702827111235 connected. 2018-10-07T21:49:21,197 | INFO | epollEventLoopGroup-9-15 | ContextChainHolderImpl | 382 - org.opendaylight.openflowplugin.impl - 0.6.4 | No context chain found for device: openflow:97702827111235, creating new. 2018-10-07T21:49:21,197 | INFO | epollEventLoopGroup-9-15 | DeviceManagerImpl | 382 - org.opendaylight.openflowplugin.impl - 0.6.4 | ConnectionEvent: Device connected to controller, Device:/10.30.170.126:57622, NodeId:Uri [_value=openflow:97702827111235] 2018-10-07T21:49:21,198 | INFO | epollEventLoopGroup-9-15 | RoleContextImpl | 382 - org.opendaylight.openflowplugin.impl - 0.6.4 | Started timer for setting SLAVE role on device openflow:97702827111235 if no role will be set in 20s.

Environment

None

Attachments

1
  • 08 Oct 2018, 11:10 PM

Activity

Show:

JamO Luhrsen December 20, 2018 at 8:14 PM

Jamo Luhrsen we are debugging 2 issues here, one is the exception tracked with this jira. "Not able to set SLAVE role on device"

For this, we have raised a patch "https://git.opendaylight.org/gerrit/#/c/78199/" and wanted to run CSIT for the patch and confirm if the error is no longer seen before merging the review.

Ok, it looks like you already used our patch test to run CSIT. This is the job you triggered
when you gave that 'test-openflowplugin-cluster-netvirt' back on 11/27. I don't see this
WARN anymore in those three karaf logs 123

I put a +1 on that patch because of this. Feel free to run patch-test again if
you think it's needed since that patch has sat for almost a month now.

Another one is default flows missing, which is tracked by https://lf-opendaylight.atlassian.net/browse/OPNFLWPLUG-1056#icft=OPNFLWPLUG-1056. Even we have raised a debug patch https://git.opendaylight.org/gerrit/#/c/78730/ to get more logs, we need to run csit on this and collect some additional logs.

ok, let's let this 1039 be just for the WARN message and looks like that might
be resolved soon then. I'll go comment on 1056 now.

Thanks

Arunprakash D December 20, 2018 at 5:11 AM

we are debugging 2 issues here, one is the exception tracked with this jira. "Not able to set SLAVE role on device"

For this, we have raised a patch "https://git.opendaylight.org/gerrit/#/c/78199/" and wanted to run CSIT for the patch and confirm if the error is no longer seen before merging the review.

Another one is default flows missing, which is tracked by https://lf-opendaylight.atlassian.net/browse/OPNFLWPLUG-1056#icft=OPNFLWPLUG-1056. Even we have raised a debug patch https://git.opendaylight.org/gerrit/#/c/78730/ to get more logs, we need to run csit on this and collect some additional logs.

 

 

JamO Luhrsen December 19, 2018 at 4:56 PM
Edited

, , so the issue of default flows not being there seems to still
happen from time to time. Example here, but even
when that is not the failure in our jobs, we still see this WARN messages a lot:

org.opendaylight.openflowplugin.impl - 0.6.5 | Not able to set SLAVE role on device openflow:102142247889978, reason: Was not able to propagate SLAVE role on device. Error: org.opendaylight.openflowplugin.impl.role.RoleChangeException: Can't commit a completed message.

so, potentially the WARN message is not even related to the default flows missing.

Arunprakash D December 19, 2018 at 8:56 AM

Is the issue resolved? We wanted to ensure before merging the patch.

Sam Hague December 6, 2018 at 4:19 AM

The infra is down again so we can't stack the jobs. We need to fix that so that we can run the tests for this patch to verify the fix.

I am fine with merging the patch if it does pass the ofp tests and looks to solve the problem.

Done

Details

Assignee

Reporter

Labels

Fix versions

Priority

Created October 8, 2018 at 11:10 PM
Updated January 7, 2019 at 5:09 PM
Resolved January 7, 2019 at 5:09 PM