Connecting

You can access Jupyter by pointing any web browser to
https://jupyter.sesync.org/ and logging in with your SESYNC username and
password. If you forgot your username or password, please go to
https://pwm.sesync.org/. If your SESYNC credentials do not give you access,
please email cyberhelp@sesync.org and ask to enable this resource for your
account or whole team.

Running JupyterLab

Once you log in, you will see a JupyterLab interface. To learn how to use all
the features of the JupyterLab IDE, check out the complete
documentation.

Python Packages

User-contributed Python packages can be installed on JupyterLab by using the
“pip” utility from the Terminal. To avoid a permission error, please use the
“--user” flag. User jsmith would install a package called “mypackage”, for
instance, by opening a Terminal and entering:

jsmith@juplab00:~$ pip install --user mypackage

If you receive an error stating that the package cannot be installed, there is a
chance that some underlying system library is not installed. Please email the
error message to cyberhelp@sesync.org, and explain which package you need to
install.
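To confirm from Python that a package installed with the “--user” flag is visible, a minimal sketch along these lines may help. The helper name describe_package is hypothetical, and “mypackage” is only the placeholder name used above; the sketch relies solely on the standard library.

```python
import importlib.util
import site


def describe_package(name):
    """Return (is_importable, location) for a top-level package name."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return False, None
    return True, spec.origin


# Directory where "pip install --user" places packages for this interpreter.
print("User site-packages:", site.getusersitepackages())

# "json" ships with Python; "mypackage" is a placeholder, so substitute
# the name of the package you actually installed.
for pkg in ("json", "mypackage"):
    found, location = describe_package(pkg)
    print(pkg, "->", location if found else "not installed")
```

If the package is reported as not installed right after a successful install, restarting the notebook kernel or Python session is usually worth trying before emailing for help.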

Where should I save stuff?

“Stuff” usually belongs in one of three places:

Research home directory (“~/” or equivalently “/research-home/USERNAME/”)

Research data directory (“/nfs/PROJECTNAME-data”)

Version controlled project (e.g. on GitLab and cloned into “~/”)

Research Home Directory

When you first open JupyterLab, you will be working in your home directory, which
is located at “/research-home/USERNAME” or equivalently “~/”. This is a private
directory, and only you have access to the files in it. We strongly recommend
that you save source code in your home directory. This protects against
multiple group members attempting to update a project file at the same time. If
you need to share code between project members, please use a version control
service such as GitLab or GitHub.

Research Data Directory

If you have requested it, your group will have a data directory available. Your
research data directory appears on https://files.sesync.org/ as
PROJECTNAME-data, where PROJECTNAME is the short name assigned for your project
by SESYNC, and is accessed from https://jupyter.sesync.org at the path
“/nfs/PROJECTNAME-data”. You can add to this directory either by saving output
from Python to folders there, or by using one of the options for uploading
described in Quick Start: Research Data Directory. You should store
all shared data here. Examples of data types that should be placed here include
CSV files, Landsat imagery, and HDF5 data files: anything for which it makes
sense to keep only a single copy, with access shared by your project members.
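Before reading or writing shared data, it can be useful to check from Python that the data directory is visible. The following is a minimal sketch; the helper name list_data_dir is hypothetical, and PROJECTNAME must be replaced with your project's short name.

```python
from pathlib import Path


def list_data_dir(data_dir):
    """Return sorted entry names in data_dir, or None if it is not a directory."""
    path = Path(data_dir)
    if not path.is_dir():
        return None
    return sorted(entry.name for entry in path.iterdir())


# Replace PROJECTNAME with the short name assigned to your project.
entries = list_data_dir("/nfs/PROJECTNAME-data")
if entries is None:
    print("Data directory not found; check the project name.")
else:
    print(len(entries), "entries; first few:", entries[:10])
```

Returning None for a missing directory, rather than raising an error, keeps the check easy to use at the top of a script.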

Dealing with Data

Because all project members work from the same code, there are three options
for working with data, depending on its size. If your data is quite small (e.g.
a CSV with a few hundred rows, also known as “small-batch artisanal data”), you
can include it in your project and push it to your remote repository, so that
everyone gets a copy when they clone. Larger datasets should be in your Research
Data Directory so that everyone is able to work off one shared copy of the data.
Very large datasets may need to be loaded into a relational database management
system (RDBMS); SESYNC provides both MySQL and PostgreSQL servers for this
purpose.

Let’s assume that J. Smith (with USERNAME “jsmith”) is part of the “Trees and
Urban Heat Island Mitigation” working group. When J. Smith logs in to
https://files.sesync.org, the directory “cooltrees-data” will indicate that the
PROJECTNAME mentioned above is cooltrees. Scripts run in JupyterLab can read
from and write to this directory at “/nfs/cooltrees-data”. For example, a script
saved as “~/example_script.py” could include:

import os
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Parameters and arguments
out_dir = '/nfs/cooltrees-data'

# Generate example data
x = np.arange(0., 10., 0.4)
y = x**2

# Plot the data and save the figure to the research data directory
plt.plot(x, y, 'bs')
plt.xlabel('x')
plt.ylabel('y')
out_plot_filename = os.path.join(out_dir, 'x_y_plot.png')
plt.savefig(out_plot_filename)

# Write out test data to the research data directory
dataset = pd.DataFrame({'x': x, 'y': y})
## print(dataset)
out_filename = os.path.join(out_dir, 'dataset.csv')
## write out to csv
dataset.to_csv(out_filename)
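
To read the saved CSV back in a later session, one option is sketched below using only the standard library. The helper name read_xy is hypothetical. So that the sketch runs anywhere, it writes a small stand-in file with the same column layout to a temporary directory; on the SESYNC server the path would instead be “/nfs/cooltrees-data/dataset.csv”.

```python
import csv
import os
import tempfile


def read_xy(csv_path):
    """Read the 'x' and 'y' columns of a CSV file as two lists of floats."""
    xs, ys = [], []
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            xs.append(float(row["x"]))
            ys.append(float(row["y"]))
    return xs, ys


# Stand-in for dataset.csv: an unnamed index column followed by x and y,
# which is the layout pandas writes by default with DataFrame.to_csv.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "dataset.csv")
    with open(sample_path, "w", newline="") as handle:
        handle.write(",x,y\n0,0.0,0.0\n1,0.4,0.16\n2,0.8,0.64\n")
    xs, ys = read_xy(sample_path)
    print(xs, ys)
```

With pandas available, pd.read_csv(path, index_col=0) would load the same file into a DataFrame.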