Ph.D. Qualifications

In order to start a Ph.D. in a field, you need a set of skills in
that field. Merely satisfying the departmental requirements for
computer science doesn't automatically give you those skills. This
page tries to give a brief summary of a basic set of skills that are
important for working as a Ph.D. candidate in image understanding and
pattern recognition. Some of these skills are core, and some are optional.

It's hard to list all the skills explicitly, so on this page, I try to
indicate skills by referencing standard textbooks. That doesn't mean
that you need to have the textbooks memorized, but you should have a
reasonable understanding of what's in those books. You should probably
have implemented the most important of the algorithms in those books
yourself at some point (usually in Python), and you should feel
comfortable implementing and applying most of the techniques if you
need to.

You may need a combination of courses and self-study to acquire these
skills. For self-study and review, the books below are a starting
point, but you may find it useful to refer to other books as well.

Computer Science Skills

Our research lab is in computer science and involves algorithms, mathematics, complexity analysis, and related topics. You should be familiar with a basic set of skills in computer science. If you don't know these algorithms or topics yet, it would be a good idea to read up on them. These subjects are useful for your research, and the department also expects familiarity with them.

Programming Skills

Generally speaking, we use the following languages and tools (in
decreasing order of importance). Keep in mind that our primary goal is
to do research in pattern recognition and image understanding, not
software development. So, it's important to learn tools that make
developing new pattern recognition and image understanding methods easy.

Importance

Language

Description

Web Resources

1

Python

Python is the default programming language for prototyping and writing
applications in IUPR. Python is used for scripting, numerical
algorithms, GUI development, and other applications. Every Ph.D.
student in the group should be fluent in writing Python code, and have
a good working knowledge of the NumPy, Matplotlib, sparselib, and PIL
extensions. You need to be able to write efficient array code in
NumPy. Please also learn and follow the PEP 8 coding conventions.
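
As a rough illustration of what "efficient array code" means here, the sketch below computes pairwise squared Euclidean distances between two point sets twice: once with explicit Python loops, and once with NumPy broadcasting. The function names and the choice of task are my own assumptions for illustration, not code from the group. The point is only that the vectorized version avoids Python-level loops, at the cost of a temporary (n, m, d) array.

```python
import numpy as np


def pairwise_sq_dists_loop(a, b):
    """Squared Euclidean distances using explicit Python loops (slow)."""
    out = np.empty((len(a), len(b)))
    for i in range(len(a)):
        for j in range(len(b)):
            diff = a[i] - b[j]
            out[i, j] = np.dot(diff, diff)
    return out


def pairwise_sq_dists_vectorized(a, b):
    """Same result using broadcasting, with no Python-level loops."""
    # a[:, None, :] has shape (n, 1, d); b[None, :, :] has shape (1, m, d).
    # Their difference broadcasts to shape (n, m, d).
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff * diff, axis=-1)


# Small check that both versions agree on random data.
rng = np.random.default_rng(0)
a = rng.normal(size=(5, 3))
b = rng.normal(size=(4, 3))
assert np.allclose(pairwise_sq_dists_loop(a, b),
                   pairwise_sq_dists_vectorized(a, b))
```

For large inputs the temporary array can become the bottleneck, so it is worth knowing both styles and measuring before committing to one.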

2

R

R is widely used in statistics and machine learning. Functionally, it
is mostly a subset of Python. There are a lot of toolboxes that
implement algorithms that we use for control experiments. R is also
good for plotting, and it should be your first choice when making
graphs and figures for papers or your thesis. Every Ph.D. student
should have a basic knowledge of R and be able to use algorithms,
toolboxes, and plotting functions implemented in R by others. It is
generally not necessary to be fluent in writing new code in R, but if
you find it convenient, it is OK to use R for your research code.

3

C++

For some of our projects, we use C++. If you are working on a project
requiring C++, then you need to be fluent in it. If your project does
not require C++, you should avoid using it: C++ software development
takes unnecessarily long, and C++ discourages the kind of visualization
and experimentation that is important in pattern recognition. If you
need to use C++, you need to learn and follow our C++ coding
conventions (these actually simplify your life because we don't use a
lot of the more arcane features of C++). If you do write C++ code,
please make it available as a Python module as well (e.g., using SWIG).

4

Fortran 95/2003

Fortran usage is optional and we do not use it much right now. However,
if you find it convenient, it is a choice you can make. Fortran 95 and
2003 are very easy to use for numerical computations and basically work
like a very fast, compiled Matlab. There are lots of libraries, it's
easy to parallelize, it works well with C, and you get high-performance
code with minimum fuss. Fortran code is easy to call from Python. You
should know at least one of C++ or Fortran in order to be able to write
fast numerical code when needed.

5

Matlab / Octave

Matlab functionality is a subset of Python's and R's, but there are
many packages that implement functionality that we need to benchmark
against. If your project involves machine learning, you should probably
at least be able to run existing packages in Matlab and know the basics
of the syntax. Please avoid writing new code in Matlab unless a project
specifically requires it and there is no other choice.

6

Java

We avoid developing software in Java because Java doesn't have good
numerical or image processing support and because Java doesn't
integrate well with other languages. However, for some projects, it's
unavoidable. If you need to deal with Java software, Jython (a Python
implementation in pure Java) can make your life a lot easier.

Additional Skills

You should know any of these books that relate to your specific thesis
topic. For example, if you work on 3D book surface modeling, you
should probably know what's in Forsyth and Ponce's, Marr's, and Horn's
books.