IAG: alphacrucis

connect

As with any other Linux machine, you can connect to alphacrucis with either of the two commands:

$ ssh username@alphacrucis.iag.usp.br
$ ssh username@10.180.0.63

For security reasons, the machine does not accept connections from outside IAG. Therefore, if you are e.g. at home, you
have to connect to gina.iag.usp.br first and then execute one of the commands above.

Before you start working on alphacrucis, you should check your ~/.bashrc file to set the path of the desired Fortran compiler.
If, at the time of applying for an account, you indicated that you would be using Fortran, your ~/.bashrc should have been filled with entries for all the available Fortran compilers; all you have to do is uncomment the lines corresponding
to the desired compiler and comment out the lines corresponding to the rest of the compilers.

usage

In general, on big clusters like alphacrucis, you should not simply run/execute programs. All time-consuming commands should be "submitted" to the cluster system:
they ask for permission to run and wait for the cluster resources to be freed before your job starts. This is called
queuing, i.e. you submit a request to use some resources, and your request is run when the system is ready.

You may perform some tests before continuing. For this purpose, create a file machines that lists the nodes to be used by MPI. For example, the following machines file defines the execution of a program on 4 nodes and 96 cores, since each node contains 24 cores (processors).

r1i0n0
r1i0n1
r1i0n2
r1i0n3

For a run on more nodes, you simply add more lines/nodes to the above file:

r1i0n0
r1i0n1
r1i0n2
...
r1i0n15

This uses an IRU (building block of nodes) of 16 nodes. We have 4 IRUs in tower 1 and 2 IRUs in tower 2. In the next IRU,
the nodes start at

r1i1n0
...
r1i1n15
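
Instead of typing node names by hand, such a machines file can be generated with a short shell loop (a sketch; adjust the range to the IRU and nodes you were allocated):

```shell
# List the 16 nodes of the first IRU of tower 1 (r1i0n0 .. r1i0n15)
for n in $(seq 0 15); do
    echo "r1i0n$n"
done > machines
```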

queuing jobs

You can run the parallel version simply as:

$ mpirun -np 8 ./hdustparv2.02.bc input = hdust_bestar2.02.inp

But even for a test, it is recommended that you queue the job. In particular, you should use the queuing system
Torque/Maui to submit your jobs. In that case you do not need the machines file mentioned above; instead, you define the number of nodes and processors you need directly in the submission script. A sample
submission script is runs/hdust/sample.job (see below). You submit the job with the command:

$ qsub sample.job

In other systems (e.g. the UWO cluster) the submission occurs with the command:

$ sqsub -q mpi -n 64 -r 10h -o outfile

With qstat you can see the job ID in the first column, which will be something like [6-digit number].alphacrucis, unless defined otherwise in the #PBS directives at the beginning of the submission script.

$ qstat # see status of your jobs
$ showq # see all running jobs in the cluster
$ psall # see the CPUs running for you right now (IAG alias)

You can kill this job with the command:

$ qdel [job ID]

You can see the status of the cluster in ganglia. You can get help from the following wikis: Gina, emu2009 (invalid link), LAi (Laboratório de Astroinformática).

submission script

A request is submitted via a submission script (usually with the extension .job), such as the following:
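
A minimal sketch of such a script (an illustration only: the job name, resource request, and run command are assumptions to adapt; compare with runs/hdust/sample.job):

```shell
#!/bin/bash
#PBS -N hdust_test                  # job name
#PBS -l nodes=4,walltime=01:00:00   # resources: 4 nodes (96 cores) for 1 hour
#PBS -j oe                          # join stdout and stderr into one output file
cd $PBS_O_WORKDIR                   # run from the directory where qsub was given
mpirun -np 96 ./hdustparv2.02.bc input = hdust_bestar2.02.inp
```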

The lines starting with #PBS are directives to the cluster (see relevant wiki).

You can define the working directory; if omitted, the default is the directory where you are when running qsub. The working directory is where qsub searches for the executables and input files.

#PBS -d /sto/home/despo/distribution_current/runs/hdust/

resources requested

The most important is the directive for the cluster resources requested.

#PBS -l nodes=128,walltime=36:00:00

With the above line, we request the use of 128 nodes for 36 hours. This matters because your job will be killed after 36 hours:
if your program has not completed by then, it will be interrupted. In general, the more nodes, the less time needed,
and you should request a longer time than you expect (otherwise, if the execution needs longer than you had thought,
you will have to resubmit with the time revised to a longer period).
On the other hand, if you request many nodes for a long time, your job may wait longer before it finally starts,
because the requested resources are high and will be tied up for a period when other people might need
them too.

A rough estimate of the time to request can be obtained as follows. Say we request Nc processors, and that the program runs for a time tp on your PC using Np threads,
for a case similar to the one you want to run on the cluster. Assuming that the processors of the cluster and the processor
in your PC have similar capabilities,
the time to request should be of the order tc = tp Np / Nc, provided that the main part of the code can be executed in parallel (e.g. as is the case when you calculate the trajectories
of photons: the calculation of each trajectory can be executed in parallel with the calculation of the trajectory of every
other photon, if the trajectories do not affect one another).
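
As a worked example of this estimate (the numbers are illustrative; the requested time is the PC time scaled by the ratio of PC threads to cluster cores):

```shell
# tp = 36 h on the PC with Np = 4 threads; requesting Nc = 96 cluster cores:
tp=36; Np=4; Nc=96
awk -v tp="$tp" -v np="$Np" -v nc="$Nc" \
    'BEGIN { printf "tc = %.2f hours\n", tp * np / nc }'
```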

You can override these settings for a specific project by running the same
commands without the --global option. The changes will be recorded in .git/config, where .git is a directory located in the project's folder.

With

$ man git-config

you access a manual with all the options.

terminology

We modify our code, add/delete files, change the subdirectory structure, etc. From time to time (usually after we are sure
that the changes we made work all right), we want to register those changes: then we make a commit. Finally, we push those registered changes to the main repository. Every time we push, we actually send all the commits we made since we last pushed.
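
The cycle described above, as a self-contained demonstration in a throwaway directory (the file names and messages are illustrative; the -c flags merely supply an identity in case git is unconfigured):

```shell
cd "$(mktemp -d)"
git init --bare origin.git             # stand-in for the main repository
git clone origin.git work && cd work   # our working copy
echo "some change" > notes.txt         # modify the project
git add .                              # stage the changes we made
git -c user.name=Demo -c user.email=demo@example.com \
    commit -m "register the changes"   # commit: register them locally
git push origin HEAD                   # push: send all commits since the last push
```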

In cases where we have two different versions, or branches, of a project, we often want to combine the two into one. Then we say we
merge.

A case in which we might need to merge is the following: Person A makes some changes in a code, person B makes some other
changes. So now we want everything merged together in a single code.

In some cases there may be conflicts, e.g. if a specific piece of one file was modified by both people. Git cannot possibly know which change to select and put forward. In those cases there will be a warning message, and the person
who attempts to merge the two versions has to resolve the conflict at that exact point of the code.

In some other cases our project will not work well, but we will not be able to know this until we run the executable. Those
cases are far trickier, but they are also the cases that programmers very often encounter even when working alone on a code:
you make a change and then everything goes wrong! Git cannot possibly foresee this kind of problem, so after every merge you should attempt a run and check it out.

Branches are the different routes an initial version of a project might take. Whenever you create a branch, it is identical to the
parent branch (meaning the branch that was in use at the time the new branch was created): whatever exists in the parent branch
will be present in the new branch. We use this when we want to start from a point where our code works in a stable
way and we want to test or experiment with something, without harming the parent version.

general information

To check the commit history of the project, including the commit ID, the name of the person who made each commit, the date, and the commit message, type git log. This will print a list of the commit history; by pressing "space" you can scroll back to the beginning of the history. Here
we show an example of the last three commits:

If you want to see the changes between my last commit and Daniel's commit (see notes on commit ID), you have to type:

git diff 1710b030a2..cb9e1f9b

When the changes are made on text files, it's much easier to compare, whereas when the changes are made on binary files (such
as PDFs), there's not much to understand (at least for me). That's why git is mainly used for program source code.

Other commands which provide information:

$ git log --stat   # shows the commit history (lines changed in each file)
$ git log foo.txt  # shows the commit history of file "foo.txt"
$ git status       # shows whether there are changes to be committed
$ git diff         # see what there is to be committed (differences from the last commit)

creation of a repository

in bitbucket

See here for how to replicate and process a project at this new repository.

Please, DO NOT do anything in bitbucket other than view. DO NOT upload things into or download things from bitbucket. In general, DO NOT use the visual interface
of bitbucket; use only the command line. When our repository here is ready, we will certainly need to do everything from the
command line, so you should learn the real commands and NOT bitbucket's interface.

What you expect to do by using the upload button of bitbucket, you should do with add+commit+push. Whatever you expect to accomplish by using the download button of bitbucket, you should do with git pull. It will also be quicker (ignoring the fact that you will have to enter your password; the password can be saved to get over this delay).
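
One way to save the password (an assumption: this applies to HTTPS remotes; SSH keys are a separate mechanism):

```shell
# Keep the credentials in memory for a while (about 15 minutes by default):
git config --global credential.helper cache
# Or store them permanently, in plain text (less secure):
# git config --global credential.helper store
```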

commits

commit ID

The full ID of a commit is a 40-character string. Since git makes sure that the first few characters of the full ID form a unique sequence among the commits of the same repository,
you can refer to each commit by the first few characters of its ID (usually 6 characters are enough).

selection of files

After you make some changes in any of the files within the project directory, you can give the following commands:

$ git add .
$ git commit .

The dot (.) at the end of both commands means add+commit all changes found in the current directory. If you don't want to add+commit
the changes in all files, but only the changes in file foo.txt, the above commands have to be given as:

$ git add foo.txt
$ git commit foo.txt

comments

Let's explore the command:

$ git commit -m "first commit" .

If you omit the "-m message" option, the commit command will prompt for a message. Instead of the short
form allowed on the command line, you can now give a short message of up to 50 characters, then leave an empty line,
and finally write a more detailed description of the changes incorporated in this commit (as in the first commit message of
the sample shown in § general information).

branches

branching

Imagine that you have a 1D code and are about to make changes in order to add a 2D option.
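
This scenario can be sketched with a throwaway demo repository (a sketch only; the branch names onedim/twodim follow the text, and the Makefile contents are illustrative):

```shell
cd "$(mktemp -d)" && git init demo && cd demo
git checkout -b onedim                 # the stable 1D branch
echo "FFLAGS = -O2" > Makefile
git add . && git -c user.name=Demo -c user.email=demo@example.com \
    commit -m "1D version"
git checkout -b twodim                 # new branch, identical to onedim at creation
echo "FFLAGS = -O2 -DTWODIM" > Makefile
git add . && git -c user.name=Demo -c user.email=demo@example.com \
    commit -m "add 2D option to Makefile"
git checkout onedim                    # back on onedim...
cat Makefile                           # ...the 2D change to Makefile is not here
```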

you will see that none of the changes you made in Makefile can be found there.
To add the branch you created in your local directory to the original repository:

$ git push --set-upstream origin twodim

Your current branch is twodim. You can merge onedim into it by

$ git merge onedim

If there is any conflict in merging the two branches, it will return a relevant message.
You can see the conflicts by:

$ git diff

Remember that the same command can be given before any commit, in order to see what there is to be committed, i.e. the differences
from the last commit (in the current branch; see § general information).

If you want to delete a branch:

$ git branch -d onedim # refuses to delete the branch if it has unmerged commits
$ git branch -D onedim # forces the deletion, even if the branch has unmerged commits

If you get stuck with the conflicts and want to start over:

$ git reset --hard HEAD # in case you have not yet commited the merge
$ git reset --hard ORIG_HEAD # in case you have already commited the merge

You can check the differences between two branches with

$ git diff onedim..twodim

versions

Let's say we have a big project that has been under development for years, and there is more than one version of it, say version
1 and version 2, located in directories Dv1 and Dv2, respectively. I will create a short shell script (no need for a long one) showing how to create a git repository that includes
the two versions (if you copy it, make sure to correct the directory names; it is recommended that you type the commands one
by one, though). If you set Dv1 and Dv2 as shown in the first two lines, the rest of the lines may be copied as they are. You should check the path names, and
you might also want to change the path/name of the repository directory (TARGET).

Each directory may be accessed both locally and remotely, since we'll use the command rsync, which can contact remote systems via a remote shell program (e.g. ssh) or by contacting an rsync daemon directly over TCP. We will suppose that Dv1 is on alphacrucis, and that Dv2 is in my home directory:
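
A sketch of such a script, with illustrative paths (assumptions to adapt; Dv1 could equally be a remote user@host:/path spec, since rsync accepts both, and the -c flags merely supply an identity in case git is unconfigured):

```shell
Dv1=$HOME/hdust_v1            # copy of version 1 (illustrative path)
Dv2=$HOME/hdust_v2            # copy of version 2 (illustrative path)
TARGET=$HOME/hdust_repo       # where the repository will live
mkdir -p "$TARGET"
rsync -avz "$Dv1" "$TARGET"   # bring version 1 into the repository directory
cd "$TARGET"
git init                      # turn the directory into a git repository
git add .
git -c user.name=You -c user.email=you@example.com \
    commit -m "version 1"     # the first commit: version 1
```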

It's your first commit and there are no changes to files, therefore no more information is printed out. You might also see something as
simple as this:

[master 1fd89e9] version 1

Which form of message you will see depends on the configuration of git.
In any case, the first of these lines contains the ID of the commit; the full ID is a 40-character string, of which only the
first part (1fd89e9) is reported in the short version. But we don't want to have to remember this strange number, even if it is as short as 7
characters. We can tag this version of the code with the following command:

$ git tag -a "HDUSTv1" 1fd89e9c014ea69982ee2f941d55066c3d58b32f

so that now we can refer to it just as HDUSTv1. Although tagging each version already makes it possible to turn back to it at any time, for various reasons it is good
to also make a branch for version 1.

$ git branch version1

As we are still at the beginning, we are on the master branch, and with the last command we created another branch called version1.

Now we have to enter version 2 in the repository.

$ rsync -avz --delete --exclude '*.git*' $Dv2 $TARGET

Take care not to add slashes at the end of the directory names (this has to do with how the rsync command works).
This command should substitute everything in the current directory with the contents of our local copy of version 2, deleting
everything from the old version that does not exist in the new version (except for the *.git* files; otherwise it would delete all previous information about the repository, i.e. it would forget all about version 1,
including the previous logs).

replication

evaluation of methods

There are two ways to copy/replicate a git repository. The easiest and simplest is "cloning"; "adding" is mentioned just for completeness. I am not aware of any real
differences between the two ways, but cloning is simpler for the following (minor) reasons:

In "adding", you have to create and enter the new directory, while in "cloning" the directory is created at the time of "cloning".

In "cloning", defaults are automatically assumed and you do not have to configure anything (unless you want to change the
defaults), while in "adding" you have to spend some time configuring (although not long) after you have added the
repository.
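
A self-contained demonstration of "cloning" (the paths are illustrative; a bare repository stands in for the one being replicated):

```shell
origin="$(mktemp -d)/demo-origin.git"
git init --bare "$origin"        # stand-in for the repository to replicate
cd "$(mktemp -d)"
git clone "$origin" myproject    # the directory is created at the time of cloning
git -C myproject remote -v       # the remote is already configured; nothing to set up
```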

XSLT

XSLT is a declarative programming language. The output is often XML or HTML, but in general XSLT can be used to produce arbitrary
output from any XML source; e.g., it can be used to restructure an XML document to conform to another schema.

To process an XML file with an XSLT stylesheet and output an HTML file, one has to install saxonb and type:
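
A hedged example, assuming the Debian/Ubuntu packaging where Saxon-B provides the saxonb-xslt command (the file names are illustrative; on other systems Saxon may instead be invoked via java -jar):

```shell
saxonb-xslt -s:input.xml -xsl:style.xsl -o:output.html
```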

Data given in a form by a user can be validated against an XML schema. During this procedure it is possible to disable
some of the data contained in the XML file, to output values calculated from the form data, etc.

XSD

To validate an XML document against an XSD schema:

xmllint --schema styles/list.xsd --postvalid --noout videos.xml

In a sequence you can mix elements and groups.

To include an external schema with declarations, include the line

<include schemaLocation='general.xsd'/>

According to the MSDN Data Development Center, "the max occurs of all the particles in the particles of an all group must be 0 or 1", so it cannot be given the value "unbounded". Therefore, all has to be turned into a sequence if you want to attach the maxOccurs='unbounded' attribute to an inner element.

In case you need an environment with the properties of a sequence, but you don't need the inner stuff to really be in a specific order, then you have to do it indirectly (a makeshift solution!), as
explained in the ZVON tutorials.

the parent element being:

allowable elements

element:  annotation, simpleType, complexType, unique, key, keyref
sequence: annotation, element, group, choice, sequence, any

element attribute: what it means

default: this value is implied when the element is not included in the
XML file, so that the validation with the XSD does not fail

XPath

The XML Path language (XPath) allows the developer to select specific
parts of an XML document by addressing paths in the element tree.

XPath was designed specifically to be used in combination with
XSLT and XPointer (?). Lately, XForms also makes use of XPath.

XForms

XForms (-> JS, AJAX) can be viewed via:

native browser implementation (in firefox)

browser plug-ins (in IE)

JS implementations

Pure server-side implementations have been developed in order to use XForms without the need for browser support:

They translate XForms into a plain HTML file, and keep all the logic in the server side.

They are relatively simple to implement.

Each user action requires full client/server interaction.

XQuery

XQuery is to XML what SQL (Structured Query Language) is to databases.

XQuery cannot modify or delete data from an XML file, and neither can it add new data.

In SQL Server 2005, the XML DML (Data Modification Language) can be used; it provides the following functions: delete, insert, replace value of.

mamoj

v0

species.in.save is the standard species.in file. We included SiO*, so the elemental abundances of Si and O are slightly different from the standard ones. If the line
for SiO* is removed, the elemental abundances become the standard ones.

chemistry.in.save is the standard chemistry file, after CO photodissociation was crudely included, as well as three SiO* reactions (actually
copied from the corresponding reactions for H2O*).

input_mhd.in.save is the standard file used for the steady-state runs. After the steady-state run, you have to copy species.out to species.in. The final temperature and final drift speed of the steady-state run are written on the first line of species.out. You have to set these values in input_mhd.in and, after changing the shock type from S to C, you can run the program.
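
The restart sequence described above can be sketched as follows (a sketch only; the edit to input_mhd.in is still done by hand):

```shell
./mamoj                     # steady-state (S-type shock) run
cp species.out species.in   # feed the final abundances back in
head -n 1 species.out       # the final temperature and drift speed are on this line
# copy those two values into input_mhd.in and change the shock type from S to C
./mamoj                     # C-type shock run
```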

wind.in.save includes the parameters for the standard run at 1e-6 solar masses per year and anchor point at 1 AU.

The executable is run with the command:

./mamoj

Changes that (I believe) have to be made in the code: you will indeed need to make some changes in the source, and one that
definitely needs to be done is in subroutine GET_MTRCS (wind.f90). In particular, changes have to be made to the parameters that are declared at the beginning and include SQRT's.