Switch handshaking loops indefinitely

Description

I'm running the ODL Fluorine stable release (opendaylight-0.9.0.tar.gz) and am facing this problem:

When a simulated switch (mininet) connects and disconnects very frequently, the openflowplugin enters a loop and never completes the handshake properly. This also affects the connections of some other switches.

I can reproduce the same issue in a network of real switches (Extreme/Edgecore switches).

When this problem happens, there is also a huge memory leak: the number of DeviceContextImpl instances grows indefinitely.

Steps to reproduce:

1) run opendaylight-0.9.0

2) feature:install features-openflowplugin

3) run mininet (sudo mn --topo linear,20 --switch ovsk,protocols=OpenFlow13 --mac --controller remote,port=6633,ip=127.0.0.1)

4) simulate a switch disconnection by running the command "./changectrl.sh 10000 0.1" (script in attachments)

5) wait 1-2 minutes; you should see ODL trying indefinitely to regain the connection

6) stop the script; the memory leak is now growing (you can check the number of DeviceContextImpl instances by running "jcmd <pid> GC.class_histogram | grep -e 'org.opendaylight.openflowplugin.impl.device.DeviceContextImpl$'")
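The check in step 6 can be repeated over time to confirm that the count keeps growing. A minimal sketch, assuming `jcmd` is on the PATH and that histogram rows have the usual "rank: instances bytes class-name" layout; the helper names `count_instances` and `sample` are made up for this illustration:

```python
import subprocess

DEVICE_CONTEXT = "org.opendaylight.openflowplugin.impl.device.DeviceContextImpl"


def count_instances(histogram_text, class_name=DEVICE_CONTEXT):
    """Return the instance count for an exact class name, or 0 if absent."""
    for line in histogram_text.splitlines():
        fields = line.split()
        # Data rows look like: "<rank>: <instances> <bytes> <class name> ..."
        # The exact match on fields[3] excludes inner classes such as
        # DeviceContextImpl$1, like the trailing "$" in the grep pattern.
        if len(fields) >= 4 and fields[0].endswith(":") and fields[3] == class_name:
            return int(fields[1])
    return 0


def sample(pid, class_name=DEVICE_CONTEXT):
    """Run jcmd against the given JVM pid and return the current count."""
    result = subprocess.run(
        ["jcmd", str(pid), "GC.class_histogram"],
        capture_output=True, text=True, check=True,
    )
    return count_instances(result.stdout, class_name)
```

Calling `sample(<pid>)` every few seconds after stopping the script should show a steadily increasing number if the leak is present.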

 

The karaf.log is also attached.
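The attached changectrl.sh is not reproduced on this page. As a rough stand-in, the sketch below assumes that its two arguments are an iteration count and a delay in seconds, and that a disconnection is simulated by removing and re-adding the controller on each Open vSwitch bridge. That behaviour is a guess, not the contents of the attachment, and the function names are hypothetical:

```python
import subprocess
import time


def switch_names(count):
    """Mininet's linear topology names its switches s1..sN."""
    return [f"s{i}" for i in range(1, count + 1)]


def flap_commands(bridge, controller):
    """Commands for one disconnect/reconnect cycle on a single OVS bridge."""
    return [
        ["ovs-vsctl", "del-controller", bridge],
        ["ovs-vsctl", "set-controller", bridge, controller],
    ]


def flap(iterations, delay, bridges,
         controller="tcp:127.0.0.1:6633", run=subprocess.run):
    """Repeatedly drop and restore the controller on every bridge.

    `run` is injectable so the loop can be exercised without OVS installed.
    """
    for _ in range(iterations):
        for bridge in bridges:
            for cmd in flap_commands(bridge, controller):
                run(cmd, check=True)
                time.sleep(delay)
```

Under these assumptions, `flap(10000, 0.1, switch_names(20))` would correspond to "./changectrl.sh 10000 0.1" against the topology from step 3. It needs the same privileges as ovs-vsctl, typically root.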

Environment

None

Attachments

2
  • 15 Nov 2018, 10:15 AM
  • 15 Nov 2018, 09:54 AM

Activity

Somashekhar Javalagi November 14, 2019 at 9:54 AM

Merged into the master branch, which is Magnesium.

Luis Gomez Palacios July 30, 2019 at 10:49 PM

Somashekhar Javalagi July 30, 2019 at 11:20 AM

Hi

For this issue, we need CSIT to pass before proceeding further. I am seeing some of the openflowplugin Sodium CSIT jobs failing consistently, and the only reason reported is ConnectionError: HTTPConnectionPool(host='10.30.170.90', port=8181). Have these been seen before?

I just ran CSIT on a dummy test review; the logs are below.

https://jenkins.opendaylight.org/releng/job/openflowplugin-patch-test-core-sodium/83/

 

Anil Vishnoi January 28, 2019 at 5:30 PM

Discussed a dampening mechanism for a 3-node cluster setup. Having a local dampening mechanism (connection dampening in the context of a single node) and a global dampening mechanism (connection dampening across the three-node cluster) would be a great value add.
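To make the local (single-node) case concrete, here is a minimal sketch of one way connection dampening could work: count connection attempts per device in a sliding window and refuse further attempts for a hold-down period once a threshold is exceeded. It is an illustration only, not the merged patch; the class name, parameters, and defaults are all hypothetical.

```python
import time
from collections import defaultdict, deque


class ConnectionDampener:
    """Per-device sliding-window dampener (hypothetical illustration)."""

    def __init__(self, max_attempts=5, window=10.0, hold_down=30.0,
                 clock=time.monotonic):
        self._max_attempts = max_attempts
        self._window = window          # seconds over which attempts are counted
        self._hold_down = hold_down    # seconds a flapping device is refused
        self._clock = clock
        self._attempts = defaultdict(deque)
        self._blocked_until = {}

    def allow(self, device_id):
        """Return True if a new connection from device_id may proceed."""
        now = self._clock()
        blocked_until = self._blocked_until.get(device_id)
        if blocked_until is not None and blocked_until > now:
            return False
        attempts = self._attempts[device_id]
        attempts.append(now)
        # Forget attempts that have fallen out of the window.
        while attempts and now - attempts[0] > self._window:
            attempts.popleft()
        if len(attempts) > self._max_attempts:
            self._blocked_until[device_id] = now + self._hold_down
            attempts.clear()
            return False
        return True
```

A global variant would need this state shared across the three cluster nodes, which the sketch does not attempt.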

Anil Vishnoi January 7, 2019 at 4:59 PM

If you get a chance, can you please test the latest patch for this issue?

Done

Details

Assignee

Reporter

Affects versions

Priority

Created November 15, 2018 at 10:17 AM
Updated February 6, 2025 at 2:12 PM
Resolved November 14, 2019 at 9:54 AM