6 options if you want to analyze clinical data in 2017

There is no doubt that the NHS is becoming digital little by little. Anyone who has had to endure its computer systems knows that they’re locked down, fragmented, and slow. However, digital is here to stay, and it’s getting better. This change has given clinicians an opportunity that they have never had before: they have access to more data than they know what to do with, and they no longer have to spend as much time collecting it. Despite this, I still see junior doctors manually going through records and typing data into Excel spreadsheets in central London hospitals. In this day and age, it’s mildly depressing to see, especially when the consultant instructing them fancies themselves a tech whiz who is up to date with cutting-edge technology. Below are 6 options that you can pursue to get your data analysis out of the dark ages. Some take some effort; others can be put to use within a day of reading:

SQL

This isn’t really a general-purpose programming language; you won’t be making platforms and apps out of it. However, it takes very little time to learn and use. Within minutes you will be able to improve and speed up your analysis with SQL. SQL stands for Structured Query Language. All the databases I have seen on hospital systems are SQL databases. It’s a powerful approach, and the database can handle multiple requests at the same time. Even sites like Facebook and YouTube use SQL databases. If you want to get specific data from the system, the IT department will run a query written in SQL. Let’s say that you want to pull the employee ID, first name, last name, hire date, and city from the Employees table for everyone working in London. All you’d have to write is the following query:

SELECT EmployeeID, FirstName, LastName, HireDate, City FROM Employees
WHERE City = 'London'

You can then export the results to a CSV or Excel file. So why learn it? Can’t you just make a request to the IT department? Yes, you can, but the time it takes you to learn this is worth it. I’ve seen IT departments tell clinicians that it would take weeks to return the results. If you have a basic understanding of SQL, you will know this is rubbish. If you offer to come down and write the SQL query yourself, they will give you the data within hours and fob off someone else instead. You can also plan your audits: you can get really specific with your searches, which saves time in later analysis. You will know what’s possible, and you will know when someone is telling you that something possible is impossible. For a grounding in SQL, check out the following tutorial [link].
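If you want to try the query without waiting on anyone, here is a minimal sketch in Python using the built-in sqlite3 module. Everything about the data is an assumption for illustration: the in-memory database, the three made-up rows, and the helper names employees_in_city and to_csv are mine, not anything from a real hospital system. I have also added an ORDER BY so the output order is predictable.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for a hospital database: an in-memory SQLite table
# with made-up rows, using the column names from the query in the text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "EmployeeID INTEGER, FirstName TEXT, LastName TEXT, HireDate TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ada", "Jones", "2015-03-01", "London"),
        (2, "Ben", "Smith", "2016-07-15", "Leeds"),
        (3, "Cara", "Patel", "2014-11-30", "London"),
    ],
)


def employees_in_city(connection, city):
    """Run the query from the text, passing the city as a parameter."""
    cursor = connection.execute(
        "SELECT EmployeeID, FirstName, LastName, HireDate, City "
        "FROM Employees WHERE City = ? ORDER BY EmployeeID",
        (city,),
    )
    header = [column[0] for column in cursor.description]
    return header, cursor.fetchall()


def to_csv(header, rows):
    """Return the result set as CSV text that Excel can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


header, rows = employees_in_city(conn, "London")
csv_text = to_csv(header, rows)
print(csv_text)
```

Passing the city as a parameter (the ? placeholder) rather than pasting it into the SQL string is a habit worth picking up early, because it avoids quoting mistakes when the value comes from somewhere else.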

R

Again, this isn’t a language you’d build a product in. I have yet to come across anybody who has coded a software platform in R. However, like SQL, it’s very straightforward and easy to pick up. Its primary focus is data analysis. It can read CSV and Excel files, and it can also work with web data and SQL databases. You don’t have to know much about computers; just focus on the data. It’s lightweight, and within a few lines you can process data, perform a statistical test, and plot the result. It has loads of prebuilt statistical functions. It’s free to download for Mac and Windows. If you want to do more with your coding than data analysis, I wouldn’t recommend R. However, if all you want to do is analyze data for your research project or department, then go for it. You will be able to do some powerful stuff quickly with little to no technical knowledge. Want to get grounded in R? Check out the following tutorial [link].

Python

Be careful, this is my favorite language, so I’m biased. It’s a little more technical than R; however, you can do so much more with it. Many big web platforms are coded in Python, and it also has a growing number of data modules, including ones for machine learning. When the CIA files were released by WikiLeaks, it turned out that a number of the agency’s hacking tools were coded in Python [link]. Companies like Google also use Python. If you want your data analysis to run in a product, you can code your data functions in Python and import them into a web framework like Django or Flask. For instance, Instagram is coded in Python and runs on the Django framework. It’s powerful and you can scale it. It also has a place in engineering. I have used Python to pull data from hardware to analyze for my surgical robotics project, and I’ve used it to code a web app with a database and search functions. Like R, it’s free to download [link].
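To give a flavor of what a small audit looks like in Python, here is a rough sketch using only the standard library. The data is made up, and the column names ward and length_of_stay_days and the function stay_summary are hypothetical names I picked for the example; a real audit would read a CSV exported from the hospital system instead.

```python
import csv
import io
import statistics
from collections import defaultdict

# Made-up audit data. In practice this would be a CSV exported from the
# hospital system and opened with open(path, newline="").
AUDIT_CSV = """ward,length_of_stay_days
Cardiology,3
Cardiology,5
Cardiology,4
Orthopaedics,7
Orthopaedics,9
"""


def stay_summary(csv_file):
    """Group length of stay by ward and return (n, mean, median) per ward."""
    stays = defaultdict(list)
    for row in csv.DictReader(csv_file):
        stays[row["ward"]].append(float(row["length_of_stay_days"]))
    return {
        ward: (len(days), statistics.mean(days), statistics.median(days))
        for ward, days in stays.items()
    }


summary = stay_summary(io.StringIO(AUDIT_CSV))
for ward, (n, mean, median) in summary.items():
    print(f"{ward}: n={n}, mean={mean:.1f}, median={median:.1f}")
```

Because stay_summary is an ordinary function, the same code could later be imported into a Flask or Django view, which is the route from analysis to product described above.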

Matlab

This is kinda like the Apple version of coding. It’s expensive, with a basic individual license starting at £1,800. There are much more powerful languages out there, it’s locked down, and it’s not versatile. However, it has a lovely interface. You can literally drag and drop data in from files. You can then code in the simple Matlab language and run animations at the click of a button. You won’t be able to produce any products with it, and extra toolboxes can cost hundreds. You can now integrate it with Python and other languages for stuff like web apps; however, if you’re skilled enough to code a web app in Python, you might as well code the data models in Python too. Matlab is extremely easy to use, and if you want to do a specific task it can be a lifesaver. For instance, I had to propose a mathematical model for light wavelengths in a pulse oximetry unit and present the results of the model. I simply coded it in Matlab. It took me minutes to convert my math into code. All I had to do was build the mathematical model and present the results. Why make it harder? For those wondering, I’m a Mac user for the same reason: file syncing across my iMac, iPhone, and MacBook Pro; easy Python module installation with pip; and a system that keeps itself tidy, unlike Windows, which gets Windows rot. For me, the price of the hardware is worth the easy life I get from tapping into the ecosystem. However, we have to remember pricing theory. If I say that PC users are wrong, I’m only displaying that I don’t understand basic economics. If Matlab seems right for you, check it out [link]. Make sure you have plenty of disk space, though; that beautiful interface costs some hard drive space.

Octave/FreeMat

Octave and FreeMat are free knock-offs of Matlab. They still have a nice interface, but you get what you pay for: they’re not as good as Matlab.

Excel

I personally don’t like Excel, but I’ve included it here because it’s so widely used and it’s fairly cheap considering it has a graphical interface. It only keeps 15 significant digits, so it can silently round long numbers, and it has low data limits (a worksheet tops out at 1,048,576 rows). It’s designed for office workers making simple calculations and organizing basic data models. However, because it’s so easy to use, it’s been used and abused in areas where it shouldn’t be. Open a big Excel file and you’ll see it struggling. It’s not that the computer is slow; it’s that Excel isn’t designed for that. You can code basic macros and forms in Visual Basic for Applications (VBA). For a tutorial on this, check out the following [link]. Be careful, though: I find this a dangerous route. It’s very easy to start using Excel, and people spend hours doing so. They try to translate this to other coding languages and realize that they need more technical knowledge and skill to do the same thing, so they retreat and hide in Excel. They do not want to hand their project over to someone else, and as a result the spreadsheet gets slower and slower, starts to crash more often, and makes rounding errors. These projects are doomed to fail. They have an expiry date on them: at some point you either have to start deleting loads of data or start manually swapping files around, or the whole thing becomes unusable. In its defense, it’s home office software; it’s not designed to be a solution in a clinical setting. It’s for people at home sorting out their tax returns. My dislike stems from seeing too many fakers insisting on using it whilst completely ignoring suggestions for better solutions. However, if your audit is small and you don’t mind burning hours manually going through data, then this is the product for you.

Do not buy the argument that Excel being widely used is a reason to use it. Within a couple of lines, you can access data in an Excel file from Python. Saving data to an Excel-readable file from Python, R, FreeMat, Matlab, or SQL is also extremely easy. Anyone making that argument is just demonstrating that they are tech illiterate. Don’t get me wrong, if I want to plan a budget for a trip or tasks for a group project, I still reach for Excel. However, when it comes to data analysis there are plenty of free alternatives that are more powerful.

One thought on “6 options if you want to analyze clinical data in 2017”

Another common platform clinicians use for statistics is SPSS. It is often used because it is considered simple, but I dislike it for the same reason. It becomes too easy for mathematically illiterate people to click a few buttons and think they’re doing stats. If you’re choosing SPSS to get around having to think hard about your statistical analysis you shouldn’t be doing statistics. It’s interesting to note very few statisticians actually use SPSS – R generally being the language of choice in the profession.

And yes, statistics without the maths is a pet peeve of mine. Would anyone consider Shakespeare without English? His works are bound to the language. So too for stats.