Attendees
@Alex Stancu
@Bhushan Pagar
@Daya K
@Hari Krishna
@Isaac Manuel Raj
@Karthik Natarajan
@Mahesh Jethanandani
@Ravi Sankar
@Sandeep Shah
@Paul Joseph
@David Kinsey
Agenda
- Start the Recording
- Antitrust Policy
- Agenda Bashing (Roll Call, Action Items) (5 minutes)
- General Topics
- Netconf Sessions
Minutes
Netconf Sessions
- Manoj Chokka presented a discussion on NetConf Sessions
- Jeff Hartley – Conversation summary:
- Early part of the discussion covered the problem statement, and refreshers on the arguments for and against persistent-connections vs. always-disconnect.
- General consensus was formed around "minimize impact to existing production consumers of OpenDaylight Netconf."
- Jeff explained that the flow diagram here ( https://wiki.o-ran-sc.org/display/OAM/Termination+of+NetConf+Sessions ) is vastly over-simplified compared with the actual steps ( https://wiki.onap.org/display/DW/Process+to+establish+a+NetConf+session ), specifically regarding the CPU + heap cost associated with mounting/remounting devices/VNFs. ODL provides a lot of value-adds such as strict RFC compliance checking, and repeatedly performing these functions is somewhat expensive. Real-world measurements at a scale of 30K-50K of these ~small netconf node
- Cost to the DU (or any endpoint) for maintaining an idle/open netconf connection is negligible, particularly if upper-layer Netconf keepalives are disabled/infrequent.
- There was some discussion around simply monitoring for mounts w/keepalive timer set to 0 (disabled), whose TCP connections had expired, and having a cron-like "sweeper" process to dismount those. (Note: by default, they're simply left in the topology and ignored until a transaction comes to that node.)
- Lots of small discussions around what these mounts look like / how they behave in comparison to current common use-cases.
- Some discussion around whether or not Netconf CallHome could be used here (does not appear to match O-RAN RIC workflows).
- Bala walked the group through some of the keepalive code logic, and which parts are more/less complex.
- On the orchestration of this solution: if the assumption is that workflows would disconnect after every transaction, then those same workflows are intrinsically already multi-operation (the PUT for the mount, then the PUT/POST for the edit), so adding one more step for the DELETE does not add any complexity. This has the benefit of letting operators and workflow/graph designers decide locally how to spend their resources, based on their own networks/servers/capacity/operations.
- Jeff briefly covered that many of the assumptions made at the time of this requirement's creation were made without actual data on the associated costs, since no one has ever actually measured them. Over the next 3-6 months, Lumina will be performing several such Netconf scale tests, and posting the tests+results via LFN forums (likely cross-posted from ODL into ONAP and O-RAN / TBD).
- The OpenDaylight JIRA related to the proposed feature is NETCONF-638.
- Current recommendation proposed during this meeting is to table this for now, with existing use-cases/workflows/graphs simply executing the DELETE of their mount if that's the desired behavior. That is already fully functional on all versions of ODL, and does not introduce more code/testing.
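For illustration only, the multi-operation workflow discussed above (PUT for the mount, PUT/POST for the edit, optional DELETE of the mount) can be sketched as an ordered list of requests. This is a minimal sketch and not something presented in the meeting: the RESTCONF root (`http://localhost:8181/rests/data`), the RFC 8040-style path to `topology-netconf`, the node id `du-001`, and the edit path `example-module:config` are all assumptions chosen for the example, and the real paths may differ between ODL versions. The code only builds the request plan; it does not contact a controller.

```python
from urllib.parse import quote

# Assumptions (not from the meeting): RESTCONF root, port, and topology path.
BASE = "http://localhost:8181/rests/data"
TOPOLOGY = "network-topology:network-topology/topology=topology-netconf"


def node_url(node_id: str) -> str:
    """Build the URL of the topology entry for one NETCONF node."""
    return f"{BASE}/{TOPOLOGY}/node={quote(node_id, safe='')}"


def plan_workflow(node_id: str, edit_path: str, disconnect_after: bool = True):
    """Return the ordered (method, url) steps for a single transaction.

    edit_path is a hypothetical path below the device's mount point.
    """
    mount = node_url(node_id)
    steps = [
        ("PUT", mount),                                  # 1. mount the device
        ("PUT", f"{mount}/yang-ext:mount/{edit_path}"),  # 2. edit via the mount
    ]
    if disconnect_after:
        steps.append(("DELETE", mount))                  # 3. dismount if desired
    return steps


if __name__ == "__main__":
    for method, url in plan_workflow("du-001", "example-module:config"):
        print(method, url)
```

Setting `disconnect_after=False` leaves the mount in place, which reflects the point that the choice between persistent connections and disconnect-after-transaction can stay with the workflow designer.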
Action items
- Jeff Hartley to create a Jira ticket and cross-reference it with the wiki to track the changes to NetConf