vpnmanager: OptimisticLockFailedException seen during rally scenario
Description
Environment
Operating System: All
Platform: All
Activity
Sam Hague April 5, 2018 at 9:22 PM
Michael Vorburger March 14, 2018 at 10:28 AM
@Sam Hague @Sridhar Gaddam can this one finally be closed now, or is there anything else I can do here?
Michael Vorburger February 26, 2018 at 6:27 PM
That (above) is correct. FTR c/66618 which changed the default retries in JobCoordinator from 0 to 3 globally for everything was discussed, accepted and merged on Jan 9th.
Perhaps we could finally close this bug (with Fix Version: Oxygen), and open new ones for other places?
Sam Hague December 21, 2017 at 5:16 AM
Ok so the fact we don't see any errors in the VpnNodeListener patch means somewhere in the 1-3 retries it passed - and failures would only be logged at debug, which is not on by default, so we don't know how many tries it took, but it did pass.
Then we need to see if we want to do the same for the new error. And in this one, it says retries =0 since that is the default - and not that is got through all retries and ended at 0.
Michael Vorburger December 19, 2017 at 4:51 PMEdited
we only got a retryCount=0 - does that mean the patch did what it was supposed to and only log the final exception after the last retry? The bug is still there, but the number of logs was reduced?
Now wait a second, your new log above from JobCoordinatorImpl is re. ItmTepRemoveWorker
.. whereas originally this issue was re. VpnNodeListener
... so what is the "scope" of this bug? 😉 The change above only fixed VpnNodeListener, by adding a ,3
parameter for retries, which we proved helped.
Now you either have to do the same in ItmTepRemoveWorker, and everywhere else, or we change the default retries in JobCoordinator from 0 to 3 for EVERYTHING... I'm not sure that everyone will agree that is a good idea, but I can propose it, for discussion, in c/66618 ...
Also can you clarify that your patches here - they don't fix this bug, they just reduce the number of logs? Do I have that right?
Well, no careful, you mix up 2 things here: Those 3 retries added e.g. to VpnNodeListener, or maybe by changing the default in JobCoordinator, DO fix this bug. But then the https://lf-opendaylight.atlassian.net/browse/CONTROLLER-1802#icft=CONTROLLER-1802 improves logging to reduce where it does not make sense to log (because there are retries), that's what c/66362 does for controller (merged on Dec 13th but still appears as WARN from ConcurrentDOMDataBroker because your log above is from a Dec 12th run!) and c/66355 forJobCoordinator (finally just merged, but likely buggy; follow up in c/66634).
In a Clustered setup with 3 ODL Controllers and 3 Compute nodes, while running rally with a light load of concurrency of 2 and times set to 40, the following exception was seen in karaf.
ODL rpm used: opendaylight-6.2.0-0.1.20170913snap58.el7.noarch.rpm
2017-09-15 03:39:25,801 | WARN | CommitFutures-3 | DataStoreJobCoordinator | 319 - org.opendaylight.genius.mdsalutil-api - 0.2.2.SNAPSHOT | Job: JobEntry {key='VPNNODE-9681529007732', mainWorker=org.opendaylight.netvirt.vpnmanager.VpnNodeListener$1@7a2b1deb, rollbackWorker=null, retryCount=0, futures=[org.opendaylight.controller.cluster.databroker.ConcurrentDOMDataBroker$AsyncNotifyingSettableFuture@74ececde]} failed OptimisticLockFailedException {message=Optimistic lock failed., errorList=[RpcError [message=Optimistic lock failed., severity=ERROR, errorType=APPLICATION, tag=resource-denied, applicationTag=null, info=null, cause=org.opendaylight.yangtools.yang.data.api.schema.tree.ConflictingModificationAppliedException: Node was created by other transaction.]]} at org.opendaylight.controller.cluster.datastore.ShardDataTree.lambda$processNextPendingTransaction$0(ShardDataTree.java:751) at org.opendaylight.controller.cluster.datastore.ShardDataTree.processNextPending(ShardDataTree.java:789) at org.opendaylight.controller.cluster.datastore.ShardDataTree.processNextPendingTransaction(ShardDataTree.java:736) at org.opendaylight.controller.cluster.datastore.ShardDataTree.startCanCommit(ShardDataTree.java:819) at org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.canCommit(SimpleShardDataTreeCohort.java:90) at org.opendaylight.controller.cluster.datastore.CohortEntry.canCommit(CohortEntry.java:97) at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.handleCanCommit(ShardCommitCoordinator.java:236) at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.handleBatchedModifications(ShardCommitCoordinator.java:170) at org.opendaylight.controller.cluster.datastore.Shard.handleBatchedModificationsLocal(Shard.java:604) at org.opendaylight.controller.cluster.datastore.Shard.handleBatchedModifications(Shard.java:627) at org.opendaylight.controller.cluster.datastore.Shard.handleNonRaftCommand(Shard.java:312) at org.opendaylight.controller.cluster.raft.RaftActor.handleCommand(RaftActor.java:270) at org.opendaylight.controller.cluster.common.actor.AbstractUntypedPersistentActor.onReceiveCommand(AbstractUntypedPersistentActor.java:44) at akka.persistence.UntypedPersistentActor.onReceive(PersistentActor.scala:170) at org.opendaylight.controller.cluster.common.actor.MeteringBehavior.apply(MeteringBehavior.java:104) at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544) at akka.actor.Actor$class.aroundReceive(Actor.scala:502) at akka.persistence.UntypedPersistentActor.akka$persistence$Eventsourced$$super$aroundReceive(PersistentActor.scala:168) at akka.persistence.Eventsourced$$anon$1.stateReceive(Eventsourced.scala:727) at akka.persistence.Eventsourced$class.aroundReceive(Eventsourced.scala:183) at akka.persistence.UntypedPersistentActor.aroundReceive(PersistentActor.scala:168) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526) at akka.actor.ActorCell.invoke(ActorCell.scala:495) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257) at akka.dispatch.Mailbox.run(Mailbox.scala:224) at akka.dispatch.Mailbox.exec(Mailbox.scala:234) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: org.opendaylight.yangtools.yang.data.api.schema.tree.ConflictingModificationAppliedException: Node was created by other transaction. at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkWriteApplicable(SchemaAwareApplyOperation.java:174) at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:128) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:305) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkMergeApplicable(AbstractNodeContainerModificationStrategy.java:313) at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:131) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:305) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkMergeApplicable(AbstractNodeContainerModificationStrategy.java:313) at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:131) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:305) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkMergeApplicable(AbstractNodeContainerModificationStrategy.java:313) at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:131) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:305) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkMergeApplicable(AbstractNodeContainerModificationStrategy.java:313) at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:131) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:305) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkMergeApplicable(AbstractNodeContainerModificationStrategy.java:313) at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:131) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:305) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkMergeApplicable(AbstractNodeContainerModificationStrategy.java:313) at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:131) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:305) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkMergeApplicable(AbstractNodeContainerModificationStrategy.java:313) at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:131) at org.opendaylight.yangtools.yang.data.impl.schema.tree.StructuralContainerModificationStrategy.checkApplicable(StructuralContainerModificationStrategy.java:99) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:305) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkTouchApplicable(AbstractNodeContainerModificationStrategy.java:288) at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:125) at org.opendaylight.yangtools.yang.data.impl.schema.tree.RootModificationApplyOperation.checkApplicable(RootModificationApplyOperation.java:72) at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractDataTreeTip.validate(AbstractDataTreeTip.java:35) at org.opendaylight.controller.cluster.datastore.ShardDataTree.lambda$processNextPendingTransaction$0(ShardDataTree.java:743) ... 29 more