For some time now, my team at work has been moving away from our dependency on Oracle, and we have already migrated many of our data-storage dependencies from Oracle DB to NoSQL data stores. At this point we face one major challenge, and after a lot of head-scratching and googling I could not find a solution that we can implement quickly. So I want to describe why generating monotonically increasing sequences in a distributed environment is a real challenge. Our current system uses around 40 different Oracle DB instances (not RAC) that give us monotonically increasing sequences. We now need to replace them with a central solution that provides monotonically increasing values for a thousand different sequences and can handle 100,000 next-value requests for these sequences within 5 minutes.
While reading some Oracle-related articles I found that even in an Oracle RAC cluster, if we need high performance and enable sequence caching, each node in the cluster caches its own set of sequence values, and the value we get from SEQUENCE.Nextval depends on which node serves the request. So in this configuration Oracle RAC does not give us monotonically increasing values either; it only guarantees that the values fall within a range and that no value is ever returned more than once.
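To make the effect of per-node caching concrete, here is a minimal sketch in Python. It is not Oracle's implementation: the class `CachingNode`, the helper `make_block_allocator`, the block size of 20 and the starting value 501 are all assumptions chosen for illustration. Each node reserves a block of values and serves from it locally, so values are unique but arrive out of order when requests alternate between nodes.

```python
from itertools import count


class CachingNode:
    """One cluster node that hands out values from a locally cached block."""

    def __init__(self, name, allocate_block):
        self.name = name
        self._allocate_block = allocate_block  # shared block allocator (hypothetical)
        self._cached = iter(())                # empty until the first request

    def nextval(self):
        try:
            return next(self._cached)
        except StopIteration:
            # Cache exhausted: reserve a fresh block and serve from it.
            self._cached = iter(self._allocate_block())
            return next(self._cached)


def make_block_allocator(start, block_size):
    """Return a function that reserves consecutive, non-overlapping blocks."""
    block_starts = count(start, block_size)

    def allocate_block():
        first = next(block_starts)
        return range(first, first + block_size)

    return allocate_block


allocate = make_block_allocator(start=501, block_size=20)
node_a = CachingNode("A", allocate)
node_b = CachingNode("B", allocate)

# Requests alternate between the two nodes, as a load balancer might route them.
served = [node.nextval() for node in (node_a, node_b, node_a, node_b)]
print(served)  # [501, 521, 502, 522]
```

The third request gets 502 after the second already got 521: no duplicates, but no global order either.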
To reason about generating monotonically increasing sequences in a distributed environment, think of the cluster as a group of three people, one per node: "Ballu", "Chunky" and "Dee". Our client node, "Tyagi", keeps asking them for the next value of a sequence named "LAMBU". At the start, all three know that the current value of LAMBU is 500, and the scenario is as shown in the picture above.
Now suppose the first request comes from Tyagi and the load balancer routes it to Ballu, who increases the current value of the sequence LAMBU to 501 and returns it to Tyagi. This raises some further concerns:
Ballu needs to tell Chunky and Dee to raise their current value to 501 as well, so that they never return the same value 501 for another request from Tyagi.
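The update step can be sketched as follows, with the network delay modeled as an explicit `deliver_updates()` call so that the window between serving a value and informing the peers becomes visible. The class `NaiveNode` and its `outbox` are hypothetical names for this illustration only, and the sketch is single-threaded on purpose so the interleaving is deterministic.

```python
class NaiveNode:
    """Node that keeps its own copy of the sequence and replicates lazily."""

    def __init__(self, name, current):
        self.name = name
        self.current = current
        self.peers = []
        self.outbox = []  # updates not yet delivered to peers

    def nextval(self):
        self.current += 1
        # Replication is queued, not applied: this models network delay.
        self.outbox.append(self.current)
        return self.current

    def deliver_updates(self):
        """Simulate the delayed broadcast finally reaching the peers."""
        for value in self.outbox:
            for peer in self.peers:
                peer.current = max(peer.current, value)
        self.outbox.clear()


ballu, chunky, dee = (NaiveNode(n, 500) for n in ("Ballu", "Chunky", "Dee"))
for node in (ballu, chunky, dee):
    node.peers = [p for p in (ballu, chunky, dee) if p is not node]

first = ballu.nextval()    # Ballu serves 501 and queues the update
second = dee.nextval()     # Dee has not heard from Ballu yet
ballu.deliver_updates()    # the update arrives too late for Dee
third = chunky.nextval()   # Chunky did see the update

print(first, second, third)  # 501 501 502
```

Any request that lands on a peer before the update is delivered sees the stale value 500 and hands out 501 a second time.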
[1] There is still an issue. In a system that receives hundreds of requests per second, it is possible that Chunky or Dee has already returned 501 for another of Tyagi's requests before Ballu's update about the new sequence value reaches them.
[2] Someone may suggest keeping a shared storage system behind Ballu, Dee and Chunky. My answer is that it would make the application a single point of failure again, and it would increase latency for the requests coming from Tyagi. If every request has to sync with a back-end data store, what benefit do I really get from running Ballu, Chunky and Dee in a distributed way?
[3] Another solution I can think of is that, before returning 501 to Tyagi, Ballu tells Dee and Chunky that they should not use 501 any more. The question that arises is that Ballu then has to wait for an acknowledgement from both Chunky and Dee. There can also be a scenario where Ballu sends Chunky and Dee the update "Ballu will use 501, please don't use it", but before this update reaches Dee, Dee receives another request from Tyagi, also wants to use 501, and sends the same update to Ballu and Chunky: "Dee will use 501, please don't use it". How can we avoid such collisions? What happens to the nextValue request Tyagi sent? And as the number of nodes in the system grows, won't sending updates and collecting acknowledgements become a bottleneck?
[4] One thing seems certain to me: no node can handle a request without talking to the other nodes in the cluster while the system as a whole still returns monotonically increasing sequences. The only optimization left is to reduce this communication while still keeping the promise of monotonically increasing sequences. I still need to think a lot about this, and there seems to be no easy solution for now, but I would love it if someone could prove me wrong!
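One way to picture the collision from point [3] and the unavoidable communication from point [4] is a majority-based reservation. The sketch below is only an illustration under simplifying assumptions, not a full consensus protocol such as Paxos or Raft: it ignores node failures and lost messages, it can leave gaps in the sequence, and it costs one round of messages per value. The names `Acceptor`, `has_majority` and `nextval` are hypothetical. Each node grants a value only if it is higher than anything it has granted before, and a claim succeeds only with grants from more than half of the nodes, so two nodes can never both win the same value.

```python
class Acceptor:
    """Each node's memory of the highest value it has granted to anyone."""

    def __init__(self, name, highest):
        self.name = name
        self.highest = highest

    def reserve(self, value):
        """Grant `value` only if it is above everything granted so far.

        Returns (granted, highest_known) so a refused proposer can catch up.
        """
        if value > self.highest:
            self.highest = value
            return True, self.highest
        return False, self.highest


def has_majority(grants, cluster_size):
    return sum(grants) > cluster_size // 2


def nextval(proposer, cluster):
    """Reserve the next value with a majority, retrying higher when refused."""
    candidate = proposer.highest + 1
    while True:
        replies = [node.reserve(candidate) for node in cluster]
        if has_majority([granted for granted, _ in replies], len(cluster)):
            return candidate
        candidate = max(highest for _, highest in replies) + 1


ballu, chunky, dee = (Acceptor(n, 500) for n in ("Ballu", "Chunky", "Dee"))
cluster = [ballu, chunky, dee]

# Ballu and Dee both claim 501 at the same time; messages arrive interleaved.
ballu_grants = [ballu.reserve(501)[0]]       # Ballu accepts his own claim
dee_grants = [dee.reserve(501)[0]]           # Dee accepts Dee's own claim
ballu_grants.append(chunky.reserve(501)[0])  # Ballu's message reaches Chunky first
dee_grants.append(chunky.reserve(501)[0])    # Dee's arrives second and is refused
ballu_grants.append(dee.reserve(501)[0])     # refused: Dee already granted 501
dee_grants.append(ballu.reserve(501)[0])     # refused: Ballu already granted 501

ballu_wins = has_majority(ballu_grants, len(cluster))  # 2 of 3 grants
dee_wins = has_majority(dee_grants, len(cluster))      # 1 of 3 grants
dee_value = nextval(dee, cluster)                      # Dee retries above 501

print(ballu_wins, dee_wins, dee_value)  # True False 502
```

Ballu returns 501 to Tyagi, while Dee loses the round and serves the pending request with 502 instead. The price is exactly the one described above: every value needs acknowledgements from a majority of the nodes before it can be returned.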