Secure your Shiny apps (against SQL injection)

Shiny takes inputs from UI elements and sends them to the server, where the application can access them as R variables. While Shiny has security measures in place, as in any typical web application, it remains the developer’s responsibility to sanitize the inputs before using them. For example, Shiny has no way to protect you if you are using an input in a SQL query such as select ... from ... where field = 'input'. Someone manipulating the websocket communication can craft a specially-formatted input that can force the database to execute a query that it is not supposed to do, termed an SQL injection. This might give an attacker access to private data or the ability to do other nefarious things, and it is a common security issue.

To see this in action, take a look at this Shiny app demo. The input is read from a dropdown menu and it is used on the server side to construct a SQL query

400: Invalid request

Even if the input is read from a list of predetermined values (such as a dropdown) it is quite easy for someone accessing the app to send arbitrary input. For example, using Chrome browser’s Developer Tools’ Network panel one can see that for each UI update the following data is sent: {"method":"update","data":{"inp_abbr":"CA"}}. You can now use the Developer Tools’ Javascript Console to send ' or 1=1 -- instead of CA, and then the SQL query constructed by the code above becomes select * from states where abbr = '' or 1=1 --'. The app will then run this query and return all the values from that table, which is very different from the original intent of the app. With these kinds of simple tricks someone could read data from other tables as well, and in certain cases even alter or delete data.

It’s important to note that this is not a deficiency of Shiny per se; it is instead a problem in any web framework – and it is the developer’s responsibility to check the inputs and act accordingly. If you are developing Shiny apps in a corporate environment while accessing data from a SQL database, you should definitely pay attention to these possibilities. Luckily there are several ways to deal with this problem, a few of which I describe below.

One way is to check the inputs and if they don’t conform with certain patterns, do not run the query and give an error to the user. You could also use Shiny’s validate function to have a more user friendly message if you prefer. It is usually safer to specify which patterns are allowed (i.e. whitelist) such as “only alphanumeric characters, dot, space and hyphen” rather then specifying those characters not allowed (i.e. blacklist) such as “single quote, backslash etc”. I wrote a very basic R package that provides functions for validating numeric, character, logical and Date values for the various Shiny UI elements. You can find it on github and use it by wrapping your inputs like this, and note that you can use regular expressions for specifying the pattern.

400: Invalid request

It is also always recommended to restrict the input as much as possible (e.g. to a set of known values as I described in the second option above).

Another approach to secure web applications is to sanitize the inputs, or in other words replace non-alphanumeric characters (like single quotes) with “escaped” ones before continuing with the query. For example, if you are using MySQL via the RMySQL package you can use the mysqlEscapeStrings function to achieve this. Unfortunately there are numerous edge cases that must be considered during implementation and this sanitizing approach is usually considered error prone. On the other hand, if you are using the excellent dplyr package you are provided basic safeguards by escaping automatically. This is primarily meant to prevent accidental misuse rather than an explicit attack, however. As Hadley says in the dplyr docs, “This is unlikely to prevent any serious attack, but should make it unlikely that you produce invalid SQL.”

Perhaps the most secure approach would be to use prepared statements, also called parameterized queries. Unfortunately the RMySQL package does not currently support this feature and the support in the other R database connectors is also somewhat in its infancy.

Regardless of how you address this issue, you should also have additional layers of security such as granting minimal privileges to the database user used by the application. For a Shiny app, this means you might provide read-only access to a few tables. In addition, you should also restrict the Shiny app to a limited set of users whenever possible – either by using Shiny Server Pro, ShinyApps.io or by proxying via a web server and using its authentication features. You should also encrypt the traffic (https) if it is not on a private network. Furthermore, other than running Shiny as an unprivileged user (which is the default), you could also run the app in an isolated environment and/or use the RAppArmor package for extra security.

Finally, while in this blog post I have focused upon SQL injection, validating the inputs is crucial in several other situations. With R being a ‘hyper-dynamic’ language, there are countless functions in base R and common packages that do eval-like things, such as when data becomes code and it is executed. While a blacklisting approach seems hopeless to me in this case, the sandboxR project has attempted to map R’s ‘unsafe’ functions – and the long blacklist therein should horrify you with how many ways things can go wrong. Therefore, you must think carefully and thoroughly check your inputs if you are developing a Shiny app (or any web application, for that matter). Be safe!

Share This Post

15 Comments

Instead of manually constructing a blacklist, you should investigate the existing ways of sanitising input. With RPostgreSQL you can use postgresqlExecStatement which gives you the functionality of prepared statements.

As I wrote in the post dplyr is doing some basic escaping, but mainly to prevent accidental misuse, not a targeted attack. As for RMySQL, dbEscapeStrings is essentially the same as the mysqlEscapeStrings function discussed in the post.

In the post I discussed the various alternatives such as checking inputs and raising errors, whitelists and blacklists, sanitizing inputs, prepared statements etc. Indeed, it seems if you are using PostgreSQL, you can use prepared statements, although the NEWS says it’s a “Initial implementation of prepared statements”. Maybe someone who has experience with RPostgreSQL can weigh in about how mature/usable that feature is.

You may also want to check out RODBCext – you can still use your MySQL backend (via RODBC) but it allows you to do parameterized SQL queries with sanitised inputs thus preventing SQL injection attacks. Also, big upside about it is it allows you to pass it a data.frame of values for a vectorised INSERT.

Nothing is 100% secure. You should have multiple layers of security as I mention in the 2nd to last paragraph in the post: “Regardless of how you address this issue, you should also have additional layers of security such as granting minimal privileges to the database user used by the application. For a Shiny app, this means…”

The “rpg” package on CRAN allows parameterized queries (eg “select * from x where y = $1”) against postgresql. No need to sanitize inputs. This simply uses what is available in libpq, so it is mature and robust.

Thanks for mentioning my “sandboxR” package, which is indeed a rather POC demo on the security threats running R inside of a web application with arbitrary user inputs. Although maintaining the list is not feasible in the long run with the ever growing number of R packages and frequent version updates, it has its use case in e.g. the Rapporter infrastructure: it can filter malicious calls as the first barrier of security with user-friendly error messages, and RAppArmor can be used to protect your HDD’s content. Unfortunately, there’s still no solution to filter network access in a smart way, but I hope that AppArmor will provide such solution in the near future: https://bugs.launchpad.net/ubuntu/+source/apparmor/+bug/796588