MolView: an attempt to get the cloud into chemistry classrooms

MolView is a web application that helps students and teachers visualize molecular structures and view their properties. Numerous publicly available databases provide the required data, such as PubChem, ChemSpider, ChEMBL, DrugBank, the Crystallography Open Database[1], and many more.[2][3] Currently, MolView uses PubChem, RCSB, the Crystallography Open Database, the Chemical Identifier Resolver[4], and the NIST WebBook[5] to retrieve data. MolView offers a simple search interface to find small molecules, proteins and crystal structures in these databases. To visualize these structures, MolView uses JavaScript libraries built on modern web technologies such as WebGL. In the past year I have designed a new version of this application from the ground up to facilitate the implementation of new databases and tools. Along with a new architecture and user interface, this version will include internationalization, interactive instructions, advanced search tools, more import/export tools, and more presentation tools. I also intend to include more computational chemistry tools to make the analysis and processing of complex data easier and more fun.

Introduction

In recent years web technologies have come a long way. Browser plugins such as Adobe Flash and Java, once used to develop advanced applications in websites, have been replaced by built-in JavaScript APIs such as WebGL[6] and Web Workers[7]. The market for cloud computing(a) has skyrocketed, partially due to the vast increase of big data(b).[8][9] New standards for database APIs(c), such as REST and JSON(d), have emerged and are now used by almost all public APIs on the internet, including some major chemical databases like PubChem, ChEMBL and PDBe. I think these developments have opened many new opportunities in cheminformatics and bioinformatics.

History

MolView started out as a 2D to 3D structure converter where the user could draw a structural formula and view a 3D conformer generated using the Chemical Identifier Resolver(e). The ability to search by name using the Chemical Identifier Resolver was added later as an experimental feature. This turned out so well that I added integrations for PubChem, RCSB and the Crystallography Open Database. MolView has now become an example of what happens when you bring modern web browser technologies and online scientific data resources together. MolView has been available free of charge on molview.org since 1 July 2014. The user base is still rather small (>10k sessions per month) but is growing rapidly.

Use cases

You might be wondering what you can use MolView for today. The best way to find out is, of course, to visit molview.org and try it out for yourself. To get you started, here is a list of things you might want to try. Each section is also demonstrated in a YouTube video.

View 3D structure of organic molecules

You can draw organic molecules in the sketcher on the left side. By default the sketcher shows the structural formula of caffeine. Clear the sketcher by clicking the trash icon in the top-left corner. Then draw a new structure using the sketch tools. You can for example draw a benzene ring, an individual atom or a bond. When you are done drawing the structure, convert the structure to 3D using the '2D to 3D' button in the top-right corner of the sketcher. You can now view and interact with the 3D structure.

Measure distances, angles and torsion angles in small molecules

You can also search for chemical structures. Type a molecule name into the search input in the top-left corner of the window. The search input will display a list of suggestions from PubChem, the Crystallography Open Database and RCSB. When you have loaded a structure, you can enable a measurement tool via the Jmol menu. MolView uses the JavaScript variant of Jmol (JSmol) for the measurement tools.[10] When you have enabled a measurement tool, you can click atoms in the viewing window to measure distances, angles and torsion angles.

Study electron distributions in small molecules

Jmol has many powerful computational tools, and a few are currently directly accessible in MolView. You can, for example, render a Molecular Electrostatic Potential (MEP) surface of the loaded 3D molecule. If you connect a fluorine atom to a hydrogen atom and render a translucent MEP surface, you can clearly see that the fluorine atom attracts electrons much more strongly than the hydrogen atom. You can also run an energy minimization via the Jmol menu. This can be useful when the loaded structure is resolved using the Chemical Identifier Resolver. The Chemical Identifier Resolver uses CORINA, a program that splits the molecule into ensembles, looks these ensembles up in a database and assembles them back together.[11]

Study small crystal structures

You can load small crystal structures from the Crystallography Open Database via the search interface. The blue suggestions from the search input are mineral names from the Crystallography Open Database. Additionally, you can search through the entire Crystallography Open Database via the search menu. After you have loaded a crystal structure, you can render a supercell model via the Model menu.

Visualize biological macromolecules

Just like small molecules and crystal structures, macromolecules can also be loaded into MolView. MolView can retrieve biological macromolecules from RCSB. You can switch between different color schemes and protein structure representations via the Protein menu.

View spectra

You can also view certain spectra. To do this, you first have to load or draw a molecule. Then you can open the spectrum viewer via the Tools menu, where you can choose from different spectra. IR and mass spectra are fetched from the NIST Chemistry WebBook. A 1H-NMR prediction is calculated using nmrdb.org.[12] When you select a spectrum, it will be loaded into the interactive spectrum viewer where you can read out the values.

Embed interactive 3D models in websites

If you write web articles that involve chemistry, the embedding tool from MolView might be just what you need. You can get an embed code for every 3D structure in the viewer by opening the embedding dialog via the Tools menu. You can then paste this embed HTML code into your web page. In future versions you will be able to embed more content such as the spectrum viewer and the sketcher. Below is an example of an embedded caffeine molecule.

Prospects

Over the past years numerous ideas for new features have come up. I'm now writing a new version with a much higher level of modularity than the current one. This will help me integrate exciting new features. Apart from quite a number of new database integrations, I want to focus on three other subjects: visualization tools, sketch tools and tools for teachers. The next three sections explain these subjects in more detail.

Visualization tools

Currently MolView uses GLmol, JSmol and ChemDoodle Web for 3D visualization.[13][14] In the last few years more mature web viewers for molecules have been developed that will replace the current viewers (except for JSmol of course). These viewers include 3Dmol.js (organic molecules, fork of GLmol), PV (simple viewer for proteins) and NGL (advanced viewer for proteins).[15][16][17] But viewing 3D models is not the only visualization tool I intend to offer. An interactive NMR viewer like the one on nmrdb.org might find its way into MolView pretty soon, and perhaps even a DNA sequence viewer/explorer.

Sketch tools

The sketcher is the only visualization tool I've actually written myself. I created my own sketcher because it is a very critical component and I was not satisfied with the web-based, open-source sketchers that are currently available. I've added some basic features to the sketcher already and I intend to add quite a number of new features to make it even more powerful. Apart from obvious features such as functional groups, dot structures, reaction arrows and annotations, I've been thinking about a tool for automatically depicting the structural formula using a certain projection such as the Newman projection or the Fischer projection.[18]

Tools for teachers

I want teachers to use MolView, and therefore making MolView more powerful for teachers is a top priority. In the past year I received quite a few requests for new features from teachers who tried MolView. The new architecture will open the way for quite a number of these features. It will have built-in support for embedding any view, anyone will be able to create a step-by-step guide for tutorials in MolView, and it will have mature support for saving and sharing files. I'm especially excited about the step-by-step guides. These can be used to teach new users how to use MolView, but they could also be used, for example, for a tutorial that explains the difference between geometric isomers.

Get involved

You can start using MolView today on molview.org. Your feedback is very valuable! Don't hesitate to contact me if you have questions or ideas. MolView is growing faster than ever and will require an increasing amount of server resources. A Dutch hosting company called PCextreme granted me free access to their infrastructure so I can continue providing this application for free! In the future I want to add more heavy computational tools to MolView. Perhaps we can connect MolView to your computing grid to make this possible!

Footnotes

(a) Using a network of remote servers for computational tasks rather than a local server.

(b) The term 'big data' is often used to describe datasets that are too large to be processed by traditional applications or infrastructures.

(c) In this case the API (short for Application Programming Interface) is the interface offered by the database that can be used by other applications to exchange data.

(d) A format to encode data (like XML) that is based on the way JavaScript objects are encoded (JSON stands for JavaScript Object Notation), making it a suitable format for web applications.

(e) In case you want to learn more, you can have a look at this article.

Comments

You have created an impressive program. It is easy to use and has significant depth and breadth to be used in many courses. This is beneficial to students, as they won't need to learn different programs for different sub-disciplines in chemistry. And the information is accurate -- I *greatly* appreciate the use of high-level scientific data in all classes.

I am running into difficulties on the website. You have caffeine (an interesting but apt choice) as the default molecule. I noticed that the 2D and 3D structures were not oriented the same, so clicking [2D to 3D] returns 'Failed to load structure from its database'. I get the same error when I draw structures, like HF, HCN, and CH3OH. (?)

I'm glad you like my program! Converting a structural formula to 3D still works fine for me (which makes the issue more complicated, unfortunately). Could it be the case that your browser is blocking AJAX requests? (MolView talks to the PubChem API inside the browser.) Can you please try to load this page: http://molview.org/?cid=14917 (if something is wrong with the retrieval of data, this page should display the same error).

The molecules are flipped because the sketcher uses a 2D coordinate system where the origin is in the top-left corner and the y coordinates increase when going down (so the y axis direction is reversed compared to a normal Cartesian coordinate system; this convention is common in 2D computer graphics). The 3D viewer does use a normal Cartesian coordinate system, which is not yet taken into account in the program. Of course it does not make a difference in the end but it might be confusing for students. I tried some other compounds and I found this might actually be a difficult issue. It seems the 2D and 3D coordinate data from PubChem do not always use the same orientation. I will put this in my notes for things to fix in MolView 3, thanks!
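The coordinate mismatch described above can be illustrated with a minimal sketch in Python. The function name `screen_to_cartesian` is hypothetical and this is not MolView's actual code; it only shows that converting between a top-left-origin screen system and a Cartesian system amounts to negating the y coordinate.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def screen_to_cartesian(points: List[Point]) -> List[Point]:
    """Convert 2D screen coordinates (y grows downward) to Cartesian
    coordinates (y grows upward) by negating every y value.

    Hypothetical helper for illustration; the mapping is its own inverse,
    so the same function converts Cartesian coordinates back to screen ones.
    """
    return [(x, -y) for x, y in points]


# Two atoms drawn in the sketcher, the second one below the first on screen.
sketch = [(0.0, 0.0), (1.0, 2.0)]
cartesian = screen_to_cartesian(sketch)
```

Applying the function twice returns the original points, which is why the flip changes only the apparent orientation and not the structure itself.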

Following up on Roy’s question, I did some playing around with MolView. I think I just discovered something that I did not understand before. On 2D to 3D, you are switching between PubChem and Resolver automatically as needed. Am I correct on this point? More on this point based on your response.

Based on Herman’s answer, you may have stumbled on something that could help me out with a problem I’ve been having with PubChem. It could be the same problem MolView is having on your system. Would you try this experiment for me? Force MolView to go to Resolver and see if you still get the error message on 2D to 3D. You can do this by simply adding a second molecule to the sketcher.

Tried with methane and water as two separate entities. Get different error messages.
1. When I don't put the hydrogens on the atoms, it "Fails to load structure from its database." Note: Herman successfully created phenol without adding a hydrogen.
2. When I do put hydrogens on the atoms (making the complete molecules), I get the message, "Failed to load structure from sketcher. 4 Atoms near stereocenter."

When I tried phenol (with and without hydrogen), I get the "Fails to load structure from its database." error.

PubChem should automatically add hydrogens (there is not much molecule sanity checking in MolView as of now). Issue 2 is caused by the SMILES generation algorithm in the sketcher (not written by me and containing some quirks), and indicates that something is wrong with the molecule (i.e. the molecule is 'impossible'). In MolView 3 I want to replace SMILES with InChI (using a server-side program).
However, I still suspect this to be an issue with the data retrieval from PubChem. Since I never heard about this issue before, I suspect it is a very specific issue concerning your browser or browser configuration. Note for Otis: I do NOT use JSONP (a trick to get AJAX working in older browsers). What browser + version are you using? Can you try again with the latest version of Google Chrome or Firefox? (in case you are using another browser).

MolView will indeed use the Resolver if the molecule is not available in PubChem. I do not yet use PubChem for substances, so drawing two structures should force MolView to use the Resolver. However, since this seems to be an AJAX-related issue, I think the Resolver will not work either (the Resolver API is also called from within the browser).
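The fallback behaviour described above can be sketched roughly as follows, in Python. All names here (`load_3d`, `from_pubchem`, `from_resolver`) are hypothetical, and the two fetchers are passed in as plain functions rather than real network calls; the sketch also assumes that a drawing with two separate structures shows up as a SMILES string containing a '.', which is how SMILES separates disconnected components.

```python
from typing import Callable, Optional

# A fetcher takes a SMILES string and returns 3D structure data as text,
# or None when the service has no structure for it (assumed interface).
Fetcher = Callable[[str], Optional[str]]


def load_3d(smiles: str, from_pubchem: Fetcher, from_resolver: Fetcher) -> Optional[str]:
    """Try PubChem first for single structures, then fall back to the Resolver."""
    # A '.' marks disconnected components, e.g. "C.O" for methane plus water.
    # These are sent straight to the Resolver, mirroring the behaviour above.
    if "." not in smiles:
        result = from_pubchem(smiles)
        if result is not None:
            return result
    return from_resolver(smiles)


# Stand-in fetchers so the sketch runs without network access.
def fake_pubchem(smiles: str) -> Optional[str]:
    return "pubchem-3d" if smiles == "C" else None


def fake_resolver(smiles: str) -> Optional[str]:
    return "resolver-3d"


source_for_methane = load_3d("C", fake_pubchem, fake_resolver)
source_for_mixture = load_3d("C.O", fake_pubchem, fake_resolver)
```

Passing the fetchers in as arguments keeps the decision logic separate from the network layer, so the same check can be exercised when one of the services is unreachable.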

For what it’s worth, about two months ago all my AJAX PubChem calls broke. I temporarily switched to a proxy server. Finally, it seemed to me that this was related to an AJAX cross-domain issue. I wrote to Bob Hanson about this, and he was quite certain that this was not the problem because of PubChem using "Access-Control-Allow-Origin", "*".

Nevertheless, with the help of PubChem staff, I changed to JSONP, which PubChem supports. My problems disappeared. I’m sure this is all browser specific, so it would be useful to know Roy’s browser/platform. Roy, I’m not sure what the Resolver issue is, but it could be related.

To summarize: While I make direct calls to Resolver, I do not use AJAX - Resolver did not break. I make both AJAX and direct calls to PubChem - only the AJAX calls broke. JSONP fixed this.

On a related note: The US government has insisted that all .gov sites switch entirely to https. I guess it’s possible that PubChem’s change from http to https caused my original problem. There is currently some Resolver testing going on. After these tests are completed, Resolver will be changing to https.

Yes, but JSONP is a dirty trick and I don't really want to use that. MolView already uses the https protocol for PubChem and I think a proper 301 redirect should not cause issues for AJAX. However, browsers have become more strict about cross-origin requests because they can be abused. If Roy is using an old Firefox version, there is a good chance all cross-origin requests are disabled by default (if I remember correctly Firefox did that to me once too). Usually you should be able to see some kind of warning in the address bar. If you really want to know what is going wrong, just open the developer console.

I'm trying everything in the latest versions of Opera and Firefox. I am also currently building a website; one of the Drupal modules uses AJAX to upload files. It works fine in Opera (my main browser).

First, let me say this is fantastic. I will absolutely begin using it with my students next semester. Of the many things that could be done, this can be a great tool for students to practice determining polarity of molecules and bonds, then checking their own understanding. (I currently have students do a similar activity determining R and S stereochemistry on ChemDraw for iPad). I also love the spectroscopy features. I can see immediate application of this for my students to check the results of organic reactions that they are analyzing using IR, MS and NMR.

I have a question about the embedding of 3d images that you can do through the HTML code. Will that include the JMOL data if you have it showing (for example the MEP surface) or will it be the molecule model only?

Also, one small suggestion - The MolView- Layout looks confusing at first. Once I played with it, how it worked was clear but when I first looked at it, it wasn't obvious that there were 4 boxes there with different layout options.

Embedding does not retain anything from Jmol. I do want to include this in MolView 3 though. MolView 3 will also have a cleaner layout and step-by-step tutorials to get new users started. I hope to have a preview of MolView 3 this summer but I can't promise anything. This article should give you an idea of what it will look like: http://blog.molview.org/posts/2015/07/23/material-design/. Could you share your experience with MolView after applying it in your classes?

MolView is definitely a powerful tool and I like the option you included to observe the different graphs. One difficulty I have in my physical chemistry class is understanding what each graph means, especially a UV-vis absorbance graph or even a fluorescence emission spectrum. If you could include those graphs as well, students will be able to understand more about the physical properties of the molecules they draw.

I think this is a terrific idea. In fact, there are many new databases I want to support in MolView 3. The main question is not if this is possible in MolView but if there is a reliable, open dataset that can provide this data. Of course I'll also have to write a parser/viewer for that data but that's entirely possible.

I want to use this opportunity to ask for some assistance. The most important thing about MolView is the data that makes it happen. This data is ideally collected from external databases that have their own maintenance team etc. Viewing various spectra is an interesting feature in MolView that can only exist because of the spectral data. Currently this data is collected on the fly from the NIST WebBook. However, to support more spectra (real NMR, ESR, Raman), I require more data. One of the biggest spectral databases in the world, the SDBS (http://sdbs.db.aist.go.jp/sdbs/cgi-bin/cre_index.cgi), has a lot of useful data. The digital spectrum data is closed and the SDBS cannot be used by external programs. However, I recently talked (emailed) to the people at SDBS about this and they were positive about making the SDBS more open and accessible. Unfortunately they lacked the budget and manpower to make this happen in the foreseeable future. The SDBS is maintained (I believe) by AIST (kind of a Japanese NSF). Of course I have neither the authority nor the assets to get this collaboration going, but maybe some of you could help?

... both Opera and Firefox:
1. Allene without the hydrogens: "Failed to load structure from its database" (after 'updating ...' displayed for 20 seconds)
2. Allene with the hydrogens: "Failed to load structure from sketcher. This structure may contain allenes, which cannot be represented in the SMILES notation. Relevant stereo-information will be discarded." (instant response)

It is what you should expect. Otis was referring to the inconsistency between the allene computed by the Resolver (using CORINA) and from PubChem (which is likely computed using MMFF94). In this special case MMFF94 has not generated the 'right' conformation (according to QM). CORINA has a database of ensembles and the right coordinates of allene are stored in this database, therefore CORINA does return the right conformation.

I’m Sunghwan Kim, a Staff Scientist at PubChem. It’s interesting to see what people say about PubChem and its 3-D conformer models. I would like to give you some insights on PubChem 3-D conformer models, to help you understand what to expect from the PubChem 3-D resources. (More information is available in the papers published in the PubChem 3-D thematic series, which are all open-access) (http://www.jcheminf.com/series/PubChem3D).

PubChem produces a 3-D conformer model for a molecule if the molecule satisfies the following conditions:
(1) Not too large (with <=50 non-hydrogen atoms)
(2) Not too flexible (with <=15 rotatable bonds)
(3) Consists of only supported elements (H, C, N, O, F, Si, P, S, Cl, Br, and I)
(4) Contains only atom types recognized by the MMFF94s force field
(5) Has only a single covalent unit (i.e., not a salt or mixture)
(6) Has fewer than six undefined atom or bond stereo centers.

Approximately 90% of compounds in PubChem satisfy all of these conditions, so they do have a 3-D conformer model. With that said, you can figure out whether PubChem has a 3-D conformer model for a particular compound, even without accessing PubChem.
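The six conditions listed above can be written as a simple check. The sketch below is in Python; the `MoleculeSummary` type and the `expects_pubchem_3d` function are hypothetical names for illustration and not part of any PubChem API. Computing the input values (for example, whether every atom has an MMFF94s atom type) is assumed to happen elsewhere.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Elements supported for 3-D conformer generation, as listed in condition (3).
SUPPORTED_ELEMENTS = frozenset(
    {"H", "C", "N", "O", "F", "Si", "P", "S", "Cl", "Br", "I"}
)


@dataclass(frozen=True)
class MoleculeSummary:
    """Precomputed properties of one compound (hypothetical container)."""
    heavy_atoms: int               # number of non-hydrogen atoms
    rotatable_bonds: int
    elements: FrozenSet[str]       # element symbols present in the molecule
    mmff94s_typed: bool            # True if every atom has an MMFF94s atom type
    covalent_units: int            # 1 for a single molecule, >1 for salts/mixtures
    undefined_stereo_centers: int  # undefined atom plus bond stereo centers


def expects_pubchem_3d(m: MoleculeSummary) -> bool:
    """Return True if the molecule meets all six conditions listed above."""
    return (
        m.heavy_atoms <= 50                      # (1) not too large
        and m.rotatable_bonds <= 15              # (2) not too flexible
        and m.elements <= SUPPORTED_ELEMENTS     # (3) only supported elements
        and m.mmff94s_typed                      # (4) recognized atom types
        and m.covalent_units == 1                # (5) single covalent unit
        and m.undefined_stereo_centers < 6       # (6) fewer than six undefined
    )


# Caffeine (C8H10N4O2): 14 heavy atoms, no rotatable bonds, one covalent unit.
caffeine = MoleculeSummary(
    heavy_atoms=14,
    rotatable_bonds=0,
    elements=frozenset({"C", "H", "N", "O"}),
    mmff94s_typed=True,
    covalent_units=1,
    undefined_stereo_centers=0,
)
print(expects_pubchem_3d(caffeine))  # True
```

A salt such as sodium chloride would fail on two counts in this sketch: sodium is not in the supported element set, and the record has more than one covalent unit.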

The most important feature of PubChem 3-D conformers is that they are not energy-minimized, so they are not stationary points on a potential energy hypersurface. Instead, the PubChem 3-D conformer model for a chemical structure represents all possible “biologically-relevant conformations that the molecule may have”. Each conformer model has up to 500 conformers, each of which represents a certain part of the conformational space that the molecule can span.

Although each molecule has up to 500 conformers (depending on the molecular size and flexibility), most third-party programs use only the default PubChem conformer among many. While this default conformer has the lowest energy among all conformers in the conformer model, the energy is calculated using a variant of MMFF94 (not MMFF94), which excludes *coulombic* terms.

Why do we exclude *coulombic* terms? Note that the PubChem conformer model aims to reproduce a biologically relevant conformer, which means a protein-bound structure. If you run geometry optimization in vacuum for biomolecules with hydrogen bond acceptors/donors and other features that will bind to protein residues, the energy-minimized structure tends to have a globular shape to maximize intramolecular interactions (e.g., hydrogen bonding, electrostatic interactions). In a protein-bound structure, these intramolecular interactions will be replaced with intermolecular interactions with protein residues, making the molecule less globular and more elongated. To mimic this effect, the coulombic term in MMFF94 is removed during the PubChem 3-D conformer generation, artificially creating a bias toward protein-bound structures. Therefore, the default PubChem conformer has a very different meaning from what most organic chemists think.

Probably relevant to the location of hydrogen atoms in allene, PubChem conformer generation takes care of the conformation of the skeleton of a molecule (meaning non-hydrogen heavy atoms). It means that the algorithm itself does not really care about the orientation of hydrogen atoms. (In drug discovery, we are primarily interested in the “scaffold” of molecules. The location/orientation of molecules are usually considered a minor issue that can be changed by various factors such as pH, tautomeric states, the presence/absence of other ions that can form a salt with the molecule, and so on.)

In summary, a PubChem conformer should not be considered an energy minimum on the potential energy surface. Instead, it should be considered one conformation that represents a part of the biologically-relevant conformational space of a molecule.

Thanks for taking the time to bring this detailed explanation to the Newsletter. In the CCCE cheminformatics course running concurrently with this Newsletter, I tried to tickle a discussion of this point by participants with this post:

"If you query PubChem for allene sdf 3d, you will get a perfectly flat allene. I thought van’t Hoff settled this issue in 1875. Is PubChem in error here? The answer is no! And a discussion of this seemingly ridiculous answer is really a discussion of the essence of cheminformatics."

Thank you again for taking the time to bundle part of that essence into your detailed note in this forum. I use PubChem as a primary 3d coordinate resource in the CheMagic Virtual Model Kit. Over the years, I've gotten numerous notes from users about allene. My response has always been some variation of my above quote, but now I have a more complete answer. I am particularly grateful for your detailed explanation of the PubChem MMFF94 approach.

One final thanks to PubChem in general. I recently had to implement JSONP in my application’s AJAX calls to PubChem. The courteous, patient, and time consuming assistance I received from PubChem staff was exemplary!

I am aware of the issue about the 3-D conformation of allene in PubChem. As I mentioned, PubChem’s 3-D conformer ensemble model is developed to “mimic” protein-bound structures of drug-like molecules. (Here, I’m using the term “drug-like” in a very loose way.) Compounds in PubChem have 25 heavy atoms and 6 rotatable bonds on average, which makes it difficult to locate all stationary points on the potential energy surface. PubChem’s conformer generation process is “optimized” to deal with such large and flexible molecules.

On the contrary, allene is very different from average PubChem compounds. It is small enough to have only one energy minimum. This molecule is a good example that can be used in an organic chemistry class for explaining how molecular orbital theory can explain molecular structures. Obviously, we can’t use a molecule with 25 heavy atoms and 6 rotatable bonds as an example for this.

With that said, the way in which organic chemists (and quantum chemists) look at molecular structures is very different from that of the drug design/discovery field. It is then interesting to ask: how do we explain this difference to students in an organic chemistry classroom? It may not be trivial because they probably do not have enough background to understand the difference (considering that most organic chemistry courses are designed for sophomores).

Your responses on this issue have been excellent. Don't hesitate to give a thorough response like this to a sophomore level organic audience. They are often more savvy than we think!

If a cheminformatics component is going to creep into the undergraduate curriculum, yours is exactly the type of expert comment that is needed. This is why I tried to tickle this discussion in the current cheminformatics course discussion list.

My main objective in this Newsletter was to encourage play with MolView, but your related cheminformatic comments are certainly needed here as well.

I have really enjoyed exploring and playing with your program. It's quite remarkable what it can do, how easy it is to use, and how well it works. I'm not an organic chemist; the last time I taught some organic was in 1994 in a non-majors course aimed at future elementary and middle-school teachers. At the time I used a simple structure drawer (Brooks-Cole's Beaker) and a 3D visualizer (MacMolecule). How far it has come since then!! What started as a toy is now a useful tool, with capabilities I had not even heard of. I thank you profusely for all the work you have done.

Hello,
For some reason I cannot use the full features of MolView. It looks like a great tool, but I have tried multiple computers and I always get an error message that reads:
"This page (http://molview.org/) is currently offline. However, because the site uses CloudFlare's Always Online™ technology you can continue to surf a snapshot of the site" and there is no menu bar across the top of the screen. The demo molecule works though, in that I can rotate it etc. and the instructions screen pops up, but that is it.

I feel kind of lame posting this, but am I missing something? I am using PCs with Win 7 and Win XP. JMol runs just fine on two of the computers I have tried.

I've heard the issue with the menu-bar before. It turned out someone installed a utorrent browser plugin that injected some ads into the page that made the menu bar invisible. MolView should be online right now. The web-hosting I use for MolView is extremely cheap and probably not very stable. I'm hoping to fix this in MolView 3.

I think we got the problem fixed, or at least it is fixed with another list that I use for troubleshooting. It is now about 2:23 CDT, and hopefully we get a quick response. If not, I will send a follow-up in around 10 minutes, and see if they come together, or not.

If things are working we will schedule in the last article of the Newsletter, and I thank everyone for their patience.

OK, the time is 2:28 CDT and it is not looking good. I fear both of these emails will go through at the same time, when they are manually released by computer services. We have a list for the CCCE members that was also being blocked, but it was not the list that was originally labeled as SPAM. We may need to change the list's name.