Where do you perform data analysis and visualization of data produced at NERSC?

Location                                      Responses   Percent
All at NERSC                                         17      5.1%
Most at NERSC                                        40     12.0%
Half at NERSC, half elsewhere                        67     20.2%
Most elsewhere                                       94     28.3%
All elsewhere                                        95     28.6%
I don't need data analysis or visualization          19      5.7%

If your data analysis and visualization needs are not being met, please explain why. (21 respondents)

Need additional software:

NCAR Graphics is not correctly installed on Davinci. (We use a version in Mary Haley's home directory instead.) NCO is not available on Davinci's replacements.

We are using mostly GrADS software to visualize data, so we do it mostly on our own machines. However, for quick preliminary results, visualization could be very helpful to detect any errors in our model simulations.

I use vcdat on Davinci for my visualization needs. That software needs to be upgraded to a more recent version. The present version crashes with segmentation faults very frequently, which hinders my work progress.

We are using meteorological analysis tools like GrADS, so we have to take the data to a local machine to perform the analysis.

I would further like to point out that, if you install GrADS on machines, we can readily check the output before we ftp the outputs to our systems.

Most of my pre- and post-processing code is in IDL. Having access to IDL on Franklin would significantly reduce the amount of data which needs to be transferred remotely.

I can't run scripts using ptraj (from Amber) on the NERSC machines, and the queue wait times are long, so I'm not sure it would be good for running serial data analysis scripts.

An installation of 'ECCE', courtesy of Pacific Northwest National Laboratory, would be immensely useful to quickly look at NWChem output, though I don't know if it is possible to obtain a license for that software from PNNL.

Performance issues:

I mostly find that the visualization tools are too complicated and the response time is too slow. As for data analysis, I mostly use Matlab, but Matlab does not benefit much from parallel architectures and, for serial applications, my local machines are faster than NERSC's.

Connecting from Europe prevents many visualization abilities...

Having trouble loading GUI over ssh (e.g. for xmgrace)

As my datasets become too large to transfer rapidly or to fit into the memory of my desktop machine, I increasingly rely on remote visualization from NERSC. Up until recently, I did this by running VisIt on Davinci, accessed through the NoMachine desktop application. To the best of my knowledge, the server application (NX) needed for this type of access is not running on Euclid, Davinci's replacement; I have asked NERSC personnel about this and received no response.

I originally started out analyzing data produced by Franklin on Davinci using Matlab. However, Matlab was very slow loading large data files, so I now scp the files to a hard drive and examine the data locally. Despite being annoying at first, this has actually proven to be a very good way for me to work, as the transfer speed from Franklin to my remote machine is extremely fast.

The Mathematica "Notebook Evaluation" process on Euclid seems much slower than on my local desktop. An actively running Mathematica doing "Evaluation" of cells on Euclid seems to use only about 1% of CPU. I don't understand why or how to adjust that.

VisIt comments:

I use VisIt to visualize my data. Since my runs generate very large data sets, sometimes I can't visualize them because of the quota limit. Or, if we want to make a movie out of all the data, my quota is not enough. I know I can ask for a quota increase, but is it possible to remove the disk quota restriction for visualization? I would really appreciate it if you would consider this.

The tool I mostly use is VisIt, which can be run remotely with a GUI, which is nice. However, Franklin lacks the ability to use VisIt in CLI mode, which is useful for loading Python scripts into VisIt and performing a set of tasks, such as opening all data files and outputting one contour plot per data file in order to create a movie.

PDSF comments:

I installed my data analysis program, ROOT, myself because I am not a STAR group user. There is still a problem: each time I log in, I need to run "source thisroot.csh" to set the ROOT environment variables manually, and this step can't be written into my bash login script. Why can't this data analysis and visualization software be open to all users so they can choose for themselves?

The GSL library on PDSF machines would be nice.

Other comments:

It would be great if there were shared scratch space between Franklin and Davinci. As it is, I do all my analysis on Franklin.

The code we use can easily be compiled on NERSC machines; we just haven't gotten around to it yet.

Thank you. It is really great.

What additional services or information would you like to have on the NERSC web site? (26 respondents)

Navigation and functionality suggestions:

Links that don't change font size or weight when you hover over them. This is very distracting.

Better organization. Eliminating conflicting information.

NERSC website has lots of content, I guess you can arrange it better.

Drop-down menu in the left column. It is sometimes difficult to find the status link (it used to be a tab at the top) or to go to the machine web page from the status page.

The one thing that has been a minor annoyance is that there is not a link to NIM on the NERSC home page.

The most valuable things I've found on the site are pages that I only discovered from taking NERSC surveys. This points out to me that the site navigation needs improvement, since I didn't discover them by browsing or searching the site itself. Also, it is not uncommon to find out-of-date documentation on the site. This is really unfortunate, as it undermines the credibility of the rest of the documentation.

More technical documentation:

Debugging software is not always transparent to the user. Maybe provide better help with the debuggers.

More information about new systems: Carver, Euclid

More details about the latency of each cluster's network.

Suggestions for new web services:

It would be useful if the MOTD were available as an RSS feed.

I don't know if I can choose environment variables on the web site to customize my needs on the supercomputer. For example, I am not a STAR group user, but I want to use the data analysis software ROOT; it would help if I could choose this option from the NERSC web site.

For the love of god, make Mercurial and Git servers for hosting projects

Suggestions for NIM:

Use plain language rather than contractions or acronyms on the account usage. What's a "repo"? What resources are MPP and STR?

better integration with NIM

Better communication of status:

I work mostly on Franklin. I find that down time for Franklin (unscheduled maintenance) is happening quite frequently. This can be very frustrating to users when you are in the middle of editing or compiling a program. The message often says that there is no estimated time when the machine will come up again. That makes it a little difficult to plan work, especially when this happens towards the end of the week, when we are often planning to submit a big job to run over the weekend.

I would like to see some more technical information on the reasons for past unexpected outages.

Other comments:

I am grateful to everyone who contributes to the functionality, resourcefulness, friendliness, and productivity at NERSC. Thank you so much, all of you.

I am not sure at the moment

I find that the overall support and service from NERSC is very satisfying. The technical support and user information are adequate and useful. Though satisfied with the service, I would like to suggest improvements to the existing structure. I would like to see either an increase in wall time or a decrease in queue time for most of the machines.

Overall very good.

Best in the DOE! Great facility.

I'd like to be able to use gedit on Franklin. I think it's installed, but there's something wrong with my x-term. I think the problem is on my end, as "xclock" does not work, either. (But it does work for non-NERSC machines.)

As you can tell from all of my "I don't use this" answers, I'm just a lowly grad student doing runs on behalf of MILC.

Allowed length of job is short (24-48 hours or so). I hope that users can make requests for jobs that can't be done in two pieces. Occasionally, one large phonon spectrum calculation can take something like 200 hours on 64-256 processors.

When I have a problem, it takes too long for someone to answer my question. For example, if I am stumped and have to wait 3 hours for an e-mail answer, I lose 3 hours of work. It would be much faster for me to directly contact the expert. That avoids e-mail tag.

Hi, pgplot is a simple plot package, useful for a first look at code results. There has been talk of installing it as a public library for a few years, but it hasn't happened yet (to my knowledge). It will be very useful to several heavy users of Franklin. Thanks.