Deadlock risk with Session Lock & KeepaliveTask Synchronization
Description
Environment
Activity

Peter Šuňa yesterday
Based on the task description, two threads are blocked:
NetconfDeviceCommunicator#teardown holds sessionLock and waits for KeepaliveTask#sendKeepalive to finish.
KeepaliveTask#sendKeepalive holds the KeepaliveTask lock and waits for NetconfDeviceCommunicator#teardown to release sessionLock in the NetconfDeviceCommunicator#sendRequest method.
The KeepaliveTask behavior is correct because:
KeepaliveTask#disableKeepalive should run only after KeepaliveTask#sendKeepalive has finished, and vice versa. No part of either method may execute while the other is still in progress.
On the other hand, in NetconfDeviceCommunicator, the blocked methods are:
NetconfDeviceCommunicator#sendRequest and NetconfDeviceCommunicator#tearDown
These methods can be split so that parts of one may run while the other is still in progress. The proposed solution is therefore to separate the NetconfClientSession handling from the RemoteDevice handling so that the two can run in parallel. This unblocks the NetconfDeviceCommunicator#sendRequest method.
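One possible reading of this proposal can be sketched as follows. This is a simplified model under stated assumptions, not the actual NETCONF code: the class SplitTeardownSketch, the sessionUp flag, and the latch are hypothetical, and only the names sessionLock, teardown, sendKeepalive, sendRequest and disableKeepalive are taken from the discussion above. The session-side state is updated under sessionLock, the lock is released, and only then is the device-side callback chain (ending in disableKeepalive) invoked.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model: teardown releases sessionLock before notifying the device side.
final class SplitTeardownSketch {
    private final ReentrantLock sessionLock = new ReentrantLock(); // stand-in for the communicator lock
    private final Object keepaliveTask = new Object();             // stand-in for the KeepaliveTask monitor
    private final CountDownLatch monitorHeld = new CountDownLatch(1);
    private boolean sessionUp = true;                               // guarded by sessionLock

    // Thread #1: session part under sessionLock, device part outside of it.
    private void teardown() throws InterruptedException {
        monitorHeld.await(); // force the problematic interleaving: Thread #2 already owns the monitor
        sessionLock.lock();
        try {
            sessionUp = false; // session-side cleanup only
        } finally {
            sessionLock.unlock();
        }
        synchronized (keepaliveTask) {
            // device-side part (disableKeepalive); sessionLock is no longer held here
        }
    }

    // Stand-in for sendRequest: takes sessionLock briefly and reports the session state.
    private boolean sendRequest() {
        sessionLock.lock();
        try {
            return sessionUp;
        } finally {
            sessionLock.unlock();
        }
    }

    // Thread #2: holds the KeepaliveTask monitor while sending the keepalive RPC.
    private void sendKeepalive() {
        synchronized (keepaliveTask) {
            monitorHeld.countDown();
            sendRequest(); // may see the session up or down, but never blocks indefinitely
        }
    }

    /** Returns true if both threads finished, i.e. no deadlock occurred. */
    static boolean run() {
        SplitTeardownSketch sketch = new SplitTeardownSketch();
        Thread t1 = new Thread(() -> {
            try {
                sketch.teardown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "thread-1-teardown");
        Thread t2 = new Thread(sketch::sendKeepalive, "thread-2-keepalive");
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(2000);
            t2.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t1.isAlive() && !t2.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both threads finished: " + run());
    }
}
```

In this model the two locks are never held at the same time by the teardown thread, so the circular wait cannot form. Whether the real teardown code can be reordered this way depends on what onRemoteSessionDown expects from the session state.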
If KeepaliveSalFacade tries to send a keepalive RPC while the NETCONF channel is being disconnected, a deadlock may occur in the following scenario:
Thread #1
NetconfDeviceCommunicator#onSessionDown
-> NetconfDeviceCommunicator#teardown sessionLock.lock()
-> onRemoteSessionDown()
-> KeepaliveSalFacade#onDeviceDisconnected
-> stopKeepalives()
-> KeepaliveTask#disableKeepalive synchronized KeepaliveTask
Thread #2
KeepaliveTask#sendKeepalive synchronized KeepaliveTask
-> NetconfDeviceRpc#invokeNetconf
-> NetconfDeviceDOMRpcService#invokeRpc
-> NetconfDeviceCommunicator#sendRequest sessionLock.lock()
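The lock-order inversion in the two call chains above can be reproduced in isolation. The following is a minimal sketch, not the project code: the class LockOrderInversionDemo and its latches are hypothetical, and sendRequest is modelled with a timed tryLock instead of the blocking sessionLock.lock() so that the demonstration terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the two threads and the two locks from the scenario.
final class LockOrderInversionDemo {
    private final ReentrantLock sessionLock = new ReentrantLock(); // stand-in for the communicator lock
    private final Object keepaliveTask = new Object();             // stand-in for the KeepaliveTask monitor
    private final CountDownLatch sessionLockHeld = new CountDownLatch(1);
    private final CountDownLatch monitorHeld = new CountDownLatch(1);
    private final AtomicBoolean sendRequestGotLock = new AtomicBoolean(true);

    // Thread #1: teardown takes sessionLock first, then needs the KeepaliveTask monitor.
    private void teardown() throws InterruptedException {
        sessionLock.lock();
        try {
            sessionLockHeld.countDown();
            monitorHeld.await();           // wait until Thread #2 owns the monitor
            synchronized (keepaliveTask) { // disableKeepalive: blocks until Thread #2 leaves
                // keepalive disabled
            }
        } finally {
            sessionLock.unlock();
        }
    }

    // Thread #2: sendKeepalive takes the monitor first, then needs sessionLock.
    private void sendKeepalive() throws InterruptedException {
        synchronized (keepaliveTask) {
            monitorHeld.countDown();
            sessionLockHeld.await();       // wait until Thread #1 owns sessionLock
            // The code in the scenario calls sessionLock.lock() here and would wait forever.
            boolean acquired = sessionLock.tryLock(200, TimeUnit.MILLISECONDS);
            if (acquired) {
                sessionLock.unlock();
            }
            sendRequestGotLock.set(acquired);
        }
    }

    private static Thread start(String name, ThrowingTask task) {
        Thread t = new Thread(() -> {
            try {
                task.run();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    private interface ThrowingTask {
        void run() throws InterruptedException;
    }

    /** Returns true if the modelled sendRequest managed to acquire sessionLock. */
    static boolean run() {
        LockOrderInversionDemo demo = new LockOrderInversionDemo();
        Thread t1 = start("thread-1-teardown", demo::teardown);
        Thread t2 = start("thread-2-keepalive", demo::sendKeepalive);
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return demo.sendRequestGotLock.get();
    }

    public static void main(String[] args) {
        System.out.println("sendRequest acquired sessionLock: " + run()); // prints false
    }
}
```

The timed tryLock fails because Thread #1 cannot release sessionLock until it gets the monitor, and Thread #2 does not release the monitor until its lock attempt ends. With an untimed lock() in its place, neither thread would ever make progress.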
Karaf logs:
Thread #1
Thread #2