Unable to remove 800 mountpoints from the datastore
Activity
Peter Šuňa December 17, 2024 at 11:19 AM
A workaround that suppresses this issue in NETCONF was merged:
https://lf-opendaylight.atlassian.net/browse/NETCONF-1432
@Robert Varga as suggested, I tried changing this to STATE.setVolatile(this, STATE_SEALED). However, this did not resolve YT-1651. The suggested fix is invoked within transaction.commit(), but the problematic part starts sooner, with transaction.delete(LogicalDatastoreType.OPERATIONAL, nodePath()).
Peter Šuňa December 10, 2024 at 4:55 PM
It looks like Yangtools can change the modification type asynchronously:
https://github.com/opendaylight/yangtools/blob/474f3f11081c1ff6f1e05354944a505e80ebddb7/data/yang-data-tree-ri/src/main/java/org/opendaylight/yangtools/yang/data/tree/impl/AutomaticLifecycleMixin.java#L91C1-L103C21
I have opened a separate Yangtools issue for this, with a detailed description:
https://lf-opendaylight.atlassian.net/browse/YANGTOOLS-1651
A workaround for NETCONF could be to switch the order of closing the delegate and the communicator.
delegate.close();
communicator.close();
This will prevent unnecessary updates to the NETCONF device before it is removed:
https://github.com/opendaylight/netconf/blob/d01b73ebe64b969345ee95c6653bb1744b500d71/apps/netconf-topology/src/main/java/org/opendaylight/netconf/topology/spi/NetconfTopologyDeviceSalFacade.java#L62C1-L65C10
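The effect of the proposed ordering can be illustrated with a minimal sketch. It is not the NETCONF code: the Delegate and Communicator classes below are hypothetical stand-ins, and the sketch assumes that closing the communicator triggers one last status update through the delegate, and that a closed delegate ignores such callbacks.

```java
import java.util.ArrayList;
import java.util.List;

final class CloseOrderSketch {
    /** Hypothetical stand-in for the delegate that writes device state to the datastore. */
    static final class Delegate implements AutoCloseable {
        private final List<String> log;
        private boolean closed;

        Delegate(List<String> log) {
            this.log = log;
        }

        /** Assumed callback fired when the session goes down. */
        void onDisconnected() {
            if (closed) {
                return; // assumption: a closed delegate no longer updates the datastore
            }
            log.add("update operational status");
        }

        @Override
        public void close() {
            closed = true;
            log.add("delete node");
        }
    }

    /** Hypothetical stand-in for the device communicator. */
    static final class Communicator implements AutoCloseable {
        private final Delegate delegate;

        Communicator(Delegate delegate) {
            this.delegate = delegate;
        }

        @Override
        public void close() {
            delegate.onDisconnected(); // assumption: closing reports a disconnect
        }
    }

    /** Closes the communicator first, then the delegate. */
    static List<String> closeCommunicatorFirst() {
        List<String> log = new ArrayList<>();
        Delegate delegate = new Delegate(log);
        Communicator communicator = new Communicator(delegate);
        communicator.close();
        delegate.close();
        return log;
    }

    /** Proposed order: closes the delegate first, then the communicator. */
    static List<String> closeDelegateFirst() {
        List<String> log = new ArrayList<>();
        Delegate delegate = new Delegate(log);
        Communicator communicator = new Communicator(delegate);
        delegate.close();
        communicator.close();
        return log;
    }
}
```

Under these assumptions, closing the delegate first leaves only the delete in the log, while the opposite order produces an extra status update right before the node is removed.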
Robert Varga December 9, 2024 at 7:46 PM
Yup, and that is weird, as nothing should be happening to ModifiedNode once we seal it, which is when we run resolveModificationType() and all that jazz.
The only thing that comes to mind is that we are dealing with memory ordering effects, i.e. a rehash of https://lf-opendaylight.atlassian.net/browse/YANGTOOLS-1537 in which case we need to promote https://github.com/opendaylight/yangtools/blob/08f0b5e8eb71c9e37f7912736c624d19f288d552/data/yang-data-tree-ri/src/main/java/org/opendaylight/yangtools/yang/data/tree/impl/InMemoryDataTreeModification.java#L333 to a setVolatile()…
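For readers unfamiliar with the pattern, here is a minimal sketch of sealing an object through a VarHandle with volatile semantics. The SealSketch class, its state field and its payload field are hypothetical and only illustrate the idea; this is not the InMemoryDataTreeModification code.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

final class SealSketch {
    private static final int STATE_OPEN = 0;
    private static final int STATE_SEALED = 1;
    private static final VarHandle STATE;

    static {
        try {
            // Bind the handle to the private 'state' field of this class.
            STATE = MethodHandles.lookup().findVarHandle(SealSketch.class, "state", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Accessed only through STATE (hypothetical field).
    private int state = STATE_OPEN;
    // Plain field, written before the volatile store in seal().
    private String payload;

    void seal(String finalPayload) {
        payload = finalPayload;
        // Volatile store: the plain write above cannot be reordered after it.
        STATE.setVolatile(this, STATE_SEALED);
    }

    String payloadIfSealed() {
        // Volatile load: a reader that observes STATE_SEALED also observes payload.
        return (int) STATE.getVolatile(this) == STATE_SEALED ? payload : null;
    }
}
```

The point of the volatile store is the happens-before edge: any thread that reads STATE_SEALED with a volatile load is guaranteed to see the writes made before sealing.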
Sangwook Ha December 8, 2024 at 1:41 AM (edited)
requireDataAfter() is called only when node.modificationType() is WRITE, like this:
case WRITE -> {
out.writeByte(WRITE);
out.writeNormalizedNode(requireDataAfter(node));
}
But the IOException says that it's UNMODIFIED:
"java.io.IOException: Candidate for (urn:opendaylight:netconf-node-topology?revision=2023-11-21)unavailable-capabilities (UNMODIFIED) does not have after-image"
How is this possible? Does that mean modificationType changed between writeNode() and requireDataAfter(), by another thread or something?
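The hypothesis above is a check-then-act race. A minimal, single-threaded sketch of how it could yield that message follows. The Node class and the interleaved hook are hypothetical and only simulate a change landing between the two reads; this is not the Yangtools implementation.

```java
import java.io.IOException;

final class CheckThenActSketch {
    enum Type { WRITE, UNMODIFIED }

    /** Hypothetical mutable node; not the Yangtools ModifiedNode. */
    static final class Node {
        volatile Type type = Type.WRITE;
        volatile String dataAfter = "payload";

        /** Simulates another thread changing the modification type. */
        void revertToUnmodified() {
            type = Type.UNMODIFIED;
            dataAfter = null;
        }
    }

    static String requireDataAfter(Node node) throws IOException {
        String data = node.dataAfter;
        if (data == null) {
            // Second read of the type: it may differ from the one the caller switched on.
            throw new IOException("Candidate (" + node.type + ") does not have after-image");
        }
        return data;
    }

    static String writeNode(Node node, Runnable interleaved) throws IOException {
        switch (node.type) { // first read of the type
            case WRITE:
                interleaved.run(); // stand-in for whatever runs concurrently
                return requireDataAfter(node);
            default:
                return "";
        }
    }

    /** Returns the written data, or the exception message if writing failed. */
    static String tryWrite(Node node, Runnable interleaved) {
        try {
            return writeNode(node, interleaved);
        } catch (IOException e) {
            return e.getMessage();
        }
    }
}
```

Calling tryWrite(node, node::revertToUnmodified) enters the WRITE branch and still fails with an UNMODIFIED message, which matches the shape of the reported error if the node really is mutated after the type check.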
Peter Šuňa November 28, 2024 at 6:04 PM
This issue appears to persist in the current Calcium branch. When using the Netopeer device, I have encountered several devices that remain unremoved. The log contains this error:
"java.io.IOException: Candidate for (urn:opendaylight:netconf-node-topology?revision=2023-11-21)unavailable-capabilities (UNMODIFIED) does not have after-image"
Other branches are not affected after encapsulating netconf-node:
https://github.com/opendaylight/netconf/commit/1f8be4f6ee9c174e2bb796a3d4fd6242dc12d04b
This issue is present only in the Calcium branch. The current Master branch does not seem to have this problem.
When around 800 or more devices are connected to ODL, removing them all at once fails. They continue to appear as connected devices, but it is not possible to remove them or retrieve data from them.
Steps to reproduce:
1) Start ODL:
feature:install odl-netconf-topology odl-restconf-all
2) Start testtool:
java -jar netconf-testtool-7.0.11-SNAPSHOT-executable.jar --device-count 1 --ssh true --md-sal true
3) Connect 800 mountpoints.
4) Verify devices are connected.
5) Remove all devices.
6) Send a GET request to network-topology and observe the devices that were not removed:
http://127.0.0.1:8181/rests/data/network-topology:network-topology/topology=topology-netconf
To reproduce steps 3-5, you can use the testBug.sh script provided in the attachments. See the ODL and testtool logs with the reported errors in the attachments.
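For step 5, a rough sketch of building the per-node DELETE requests is shown below. It is not the attached testBug.sh: the node names (device-0, device-1, ...) are hypothetical, and authentication and the actual sending of the requests are left out.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.ArrayList;
import java.util.List;

final class RemoveMountpointsSketch {
    // Topology URL taken from step 6 of the reproduction steps.
    static final String TOPOLOGY =
        "http://127.0.0.1:8181/rests/data/network-topology:network-topology/topology=topology-netconf";

    /** Builds one DELETE request per mountpoint; node ids are assumed to be device-<index>. */
    static List<HttpRequest> deleteRequests(int count) {
        List<HttpRequest> requests = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            URI node = URI.create(TOPOLOGY + "/node=device-" + i);
            requests.add(HttpRequest.newBuilder(node).DELETE().build());
        }
        return requests;
    }
}
```

The resulting list from deleteRequests(800) could then be sent with java.net.http.HttpClient, adding whatever credentials the ODL instance requires.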