Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.

Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.

Z-Platform is the new innovative powerful and complex platform to ingest data of any kind and store the data in the form of JSON documents in MongoDB and represent a sparse representation of the same in Neo4j graph database. Mahesh discusses how he tackled deadlocks and improved the performance of the system significantly. The test environment included small graphs (ranging up to 10000 relationships to very large graphs (ranging up to 39 million relationships). The average performance of the system is 3741 relationships per minute.

4.
Deadlocks in Z-Platform
• Creating relationships is one of the most time
consuming processes
• Log analysis reveals deadlocks among batch
transactions and retry-mechanism takes time
• Dependent on how nodes and relationships are
grouped together
• Batch size is dependent on the size of the JSON
block sent to the server
• Time required to build relationships and resolve
deadlocks is in the order of seconds
4

7.
Deadlocks across Transactions
• Transactions are also like separate processes
but in a single thread or multiple threads
• Deadlocks occur across transactions
– Two concurrent transactions need write locks on
the same node n1
– In two concurrent transactions, T1 has write lock
on node n1 and waiting on write lock on node n2
whereas T2 has write lock on n2 and is waiting for
write lock on node n1
– Transactions of varying sizes
7

10.
Deadlocks Detection and Avoidance
• Deadlocks Detection
– Only possible at run-time
– Recovery from deadlock is either to abort or retry
• Deadlocks Avoidance
– Reorder the operations to lower or eliminate the
likelihood of deadlocks
• Graph Clustering Algorithms: Most of them require
knowledge of entire graph
Clustering Relationships  Bipartite Graphs
10

11.
Bipartite Graphs
• Given a Graph G with Vertices V and Edges
E, then graph G is a bipartite graph such that
vertices V can be partitioned into two
independent sets V1 and V2.
V1
A
V2
A
E
D
C
C
D
E
B
B
11

12.
Creating Bipartite Graphs
• Use two colors to color each node such that
no two adjacent nodes have the same color.
1
2
A
V1
V2
A
E
D
C
C
D
E
B
B
12

14.
Algorithm to generate Graph
V1
V2
A
D
C
E
B
• Create all the nodes
• Create batches of
relationships among
the same colored
nodes
• Create batches of
relationships across
the two colors
14

15.
Algorithm in Z-Platform
• Batch of relationships R = {r1, r2, r3….. rn} :
– each r is a triplet {src, dest, props} where src and dest
are nodes and props is a set of key-value pairs
• Color the nodes based on each relationship with
two colors
• Mark the conflicting edges where both the src
and dest nodes are of the same color
• Batch these relationships together in a single
batch
• Start grouping the remaining edges such that no
two batches have any node in common
15

21.
Future Work
• Test performance over the network using Amazon EC2
servers to mimic real world setup
• Single threaded application  multi-threaded to see if
better performance
– More complex algorithm to batch relationships together
– Analyze if the complexity is worth the performance
improvement
• Vary multiple factors:
– Batch size : 1000 to 4000
– Properties (relationship descriptors) : 2 – 20
• Dispatcher Pattern to facilitate the single point distribution
of nodes and relationships to threads/Transactions
21

22.
Conclusion
• Deadlocks in general are time consuming and
difficult to detect and prevent
• Use of graph coloring to partition graph into
conflicting and non-conflicting edges
• Successful prototype tests shows significant
improvement in building relationships varying
from small number to a very large number
22