Develop and Query a Graph with SQL Server 2017 and R – Part 1

Problem

Graph analysis can be divided into two parts – Graph Rendering and Graph
Querying. There are many ready to use visualizations available in 3rd
party tools as well as frameworks like R and Python which provides ready to use
graph visualization where one can submit a dataset and the control will render the
visualization. Often the limitation with such controls is that there are limited
customization options. Such visuals are good for immediate or limited scale graph
datasets. For large scale graph applications where the graph contains millions of
nodes and edges, full control over each aspect of the graph is required in terms
of its rendering as well as graph traversal.

In this tip we will see how to create a graph with different aesthetic
customizations to represent the graphical nature of the data in a visually interpretable
manner. Generating a graph is just the first part of the graph analysis process.
The other part of the analysis is querying the graph and rendering the results of
the query in the graph visualization. In this tip we will learn how to query graph
data and render each step of the analysis in the graph visualization.

Solution

DiagrammeR is a R package that has all the necessary constructs to generate a
graph with the minutest customization on every aesthetic element of the graph.

Steps to Create a Graph Using SQL Server and R

1) First we need to ensure that SQL Server 2017, SSMS and R Server
is installed on the development machine. If you need to refer to the installation steps,
you can follow the
installation section of this R tutorial.

2) We will need to install R packages named – DiagrammeR, Magittr and
DiagrammeRsvg. You can read the instructions from this
link on how to install packages on a R server.

3) Before we start developing the actual code there are some basic elements of
a graph that we need to understand. At the very high level, there are at least two
major elements of a graph – a node and an edge. A node is the basic entity
and edge represents the relationships between these entities. Using these two basic
attributes, any kind of graph can be created or described.

4) Let's start by creating a node. We need to execute the sp_execute_external_script
stored procedure which allows external R scripts to be executed in SQL Server. In
the below code, we are creating 9 nodes using the create_node_df function. We are assigning
a type attribute to the nodes as well as labels to each node. Here nodes mean the
actual entities in a dataset. We are using hard-coded entities, but you have
the option to also read data from a SQL Server table and read that data in the R
script using InputDataSet data frame. After creating the nodes, we are using the create
graph function with the nodes as an input parameter to generate a graph. After that
we export the graph to a png image.

5) Once the above code is executed successfully, the visual would look as shown
below. This diagram does not exactly look like a graph. The missing elements are
the edges i.e. the relationship between the nodes. Before we start dealing with
the aesthetics of the nodes and edges, we first need to make the visual look like
an actual graph.

6) To add the edges to the nodes, we need to use the create_edge_df function,
which takes two arguments – From and To. The From parameter specifies the start
node and the To parameter specifies the end node, to create the edge from source to
destination. After the edges are created, we pass this as a parameter to the
graph.

7) After you execute the above code successfully, your graph should
look as shown below. If you compare it with the previous graph, you will be able
to see the arrows linking the nodes, which are the edges that we created.

8) We are now ready to start modifying the different aesthetic elements of the
node. These attributes can be categorized in the following categories.

Shape

Style

Size

Color

Fonts

Position

Direction

Labels

9) Let’s try to change the shape of the nodes. Add a parameter in the
create_node_df
function named shape = “oval” to change the shape of the nodes. Execute
the code and the graph should look as shown below.

10) The graph looks better and different now. We had used the type attribute while
creating the nodes. We can color the nodes based on the types using the fillcolor attribute.
Modify the above code and add one more parameter to the create_node_df function
as mentioned below. Execute the code after modifying and the graph would look as
shown below.

11) The text of the graph is an essential element of the graph from a usability
perspective. The fonts of the nodes can be changed using the following parameters
as mentioned below. After adding the below parameters and executing the code, the
graph would look as shown below.

fontname="Helvetica", fontsize="16", fontcolor="black"

12) As you can see in the above graph, the fonts are not fitting in the size of
the nodes. We can modify the width of the nodes using the width parameter and assigning
it a value of 1.5 as shown below. The final code for the nodes should look as shown
below after all the modifications mentioned above are done.

13)After executing the above code, the graph should look as shown
below. There are more options to modify other aesthetic elements of the node, and
you can refer to the DiagrammeR documentation.

14) Now that we have modified the visual elements of the nodes, it’s time
to modify the visual elements of edges. In the above graph, the edges look thin
and do not look bold and explicit. We can increase the width of the edges using
the penwidth parameter in the create_edge_df function as penwidth=”3.0”.
Modify the code and then execute it, and the graph would look as shown below.

15) The arrows look uni-directional and the arrow heads are not visually explicit.
We can change that using the arrowsize as well as the dir parameter which controls the
size and direction of the arrows respectively. After you make the changes, the final
code should look as shown below.

16)Execute the code and the final graph will get generated as shown
below with bi-directional arrows and larger arrow heads.

Summary

In this way, we can easily generate a graph by just using three functions. These
functions support many formatting parameters which control the visual aesthetic
of the graph elements, which we saw in the above steps. In the next part of the
tip, we will build upon this foundation of graph rendering and learn how to query
and navigate different graph elements as well as visually describe the query results
of the graph.

Next Steps

The DiagrammeR package provides full customization of graph elements. Consider
exploring all the formatting parameters and different functions supported by
this package to minutely control the look and feel of each granular detail of
the graph to make it look crisp and sharp.

Last Updated: 2018-10-05

About the author

Siddharth Mehta is an Associate Manager with Accenture in the Avanade Division focusing on Business Intelligence.

Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name
*Email
Email me updates

Signup for our newsletter
I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.

I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.