Note:
CS 466 and
BIOE 498 are being held at the same time
and in the same location. They will have the same assignments
and exams, and grading will be done identically.

Brief description:
Algorithmic approaches in bioinformatics: (i) biological problems that can be solved computationally (e.g., genome assembly, sequence alignment,
and phylogeny estimation); (ii) algorithmic techniques with wide applicability in solving these problems (e.g., dynamic programming and probabilistic methods); (iii) practical issues in translating the basic algorithmic ideas into accurate and efficient tools that biologists may use.
The course has one midterm, homeworks,
and a final project (but no final exam).
Undergraduates will receive 3 undergraduate credits; graduate students can
enroll for 3 or 4 graduate credits.

Graduate students:
If you are enrolling for 4 units,
you will have four extra
homework assignments and you will
present a paper in class.

Teaching philosophy:
Although much of the course is taught through lectures,
some of it involves assigned reading, and the material
that is covered in class is likely to
go beyond both the posted lectures and the assigned
reading.
In fact, since the discussion in class may result in
material being presented that is not in the two textbooks
for the course, it is important to attend lectures.
Much of this
material is aimed at teaching students about
new research, and enabling them to do research. Therefore,
reading and presenting research papers is a fundamental part of
the learning process.
Overall, the important point is that
active participation is necessary for this course.
In practice, this means you
are expected to ask questions and to
be present at all class meetings. If you
need to miss a class, please make sure to meet
with me or with someone else in the class to go over
what happened that day.

Course prerequisites: CS 225 (and all its
prerequisites) or its equivalent.
No background in biology is required.
If you did not take CS 225 and its prerequisites at UIUC,
you will need to get permission from me to stay in the course.
This will include doing an extra homework assignment (due January 21)
and meeting with me one-on-one
after the homework is submitted to review the material.

Syllabus: Algorithms for DNA sequence comparison, including
pairwise and multiple sequence alignment, evolutionary tree construction, and genome assembly.
The design and analysis of algorithms, using techniques from
computer science including divide-and-conquer, dynamic programming,
and
recursion. Applications of these methods to solve biological problems.

Course materials:
Assigned reading should be done before coming to class,
and will come from
the scientific
literature and from two textbooks for
the course. One
of those textbooks,
Computational Phylogenetics,
can be downloaded from my website at that link.
The other textbook must be purchased,
but it is not strictly necessary for the class.
With the exception of some material on genome assembly
and database search,
all the material I will teach will be in the
Computational Phylogenetics textbook.
My lecture presentations
will be posted at this website.
Textbooks for the class:

Attendance policy:
Attendance in the class is strongly
advised, since some of the material and discussion
in class does not appear in the posted lectures.
You are responsible for all material discussed in
class, whether or not you attend; if you
miss class for some reason, please make sure
to talk with me or one of the students in the
class to find out what you missed.
Participation in the course is also part of the
grade; hence, missing class can reduce this
component of your grade.

Grading:

Homeworks: 35% of the course grade (worst homework dropped).
Homeworks must be submitted in PDF format through Moodle.
These will be a combination of pen-and-paper calculations,
proofs,
programming assignments (in the language
of your choice), and writing assignments.
There will also be weekly reading assignments.
The academic integrity code is applied to the
homework assignments, as follows.
You are not allowed to consult with anyone
outside of the class (other than the instructor and TA)
about homework problems. However,
you are encouraged to work with other students currently
enrolled in this
course on the
homework.
If you work with
someone else, observe the following
guidelines: (1) indicate on the homework who you worked with,
(2)
do not look at the other homework solutions when you write
your own solutions,
(3) do not share your solutions with anyone, and
(4)
write your homework solutions
entirely on your own, using your own language.

Unless otherwise specified, homeworks will be due on Tuesdays by 1PM, and
unless otherwise specified, all homework problems are
from the Computational Phylogenetics textbook, at the end of
each chapter.

Homeworks submitted less than 24 hours after
the deadline receive 80% of the grade.
Homeworks submitted more than 24 hours but
less than 48 hours after the deadline receive 60% of the grade.
No homeworks will be accepted more than 48 hours after the
deadline.
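
As an illustration only, the late policy above can be sketched as a small Python function. The function name and the handling of the exact 24-hour and 48-hour boundaries are assumptions made for this sketch (the policy as written does not say which band those instants fall into); the instructor's actual grading is what counts.

```python
def late_credit_fraction(hours_late: float) -> float:
    """Fraction of the earned grade kept for a homework submitted
    `hours_late` hours after the deadline (hypothetical helper)."""
    if hours_late <= 0:
        return 1.0   # on time: full credit
    if hours_late < 24:
        return 0.8   # late, but less than 24 hours: 80%
    if hours_late < 48:
        return 0.6   # 24 to 48 hours late: 60% (exactly 24h placed here by assumption)
    return 0.0       # 48 hours or more: not accepted (exactly 48h placed here by assumption)


# Example: a homework that would have earned 90 points, submitted 30 hours late.
adjusted = 90 * late_credit_fraction(30)
```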

Homework regrade policy:
To request a "regrade" of a homework problem, submit the request for
a specific problem to me by email, within
one week of receiving the grade on the problem (this
covers all but the first two homeworks, for which
you can request a regrade by Feb 9). I will then give
you a new problem covering the
same material. That
resubmission should be submitted through Moodle as part of the
following week's homework.
You can only do this for three problems (not three homeworks).

Extra credit policy: if you are not enrolled
for four graduate credits, you are welcome to do
the extra homeworks that are assigned for the
graduate students who are taking four graduate credits;
a grade of 80% or better on any such homework can then be used
to replace any prior homework grade.
The eligible homeworks are HW 5, HW 6,
and the second part of HW 8 (from Chapter 8).

Students enrolled for 4 graduate units: you have
extra assignments; see the homework list.

Midterm: 35% (March 31).
In-class exam, closed book, no calculators.

Class participation (including class
presentations):
10%.

Final project (due May 3): 20%.
Your final project
needs to be specifically related to content
from this course, focusing mainly on algorithmic aspects of

phylogeny estimation

multiple sequence alignment

genome assembly

You are also welcome to examine
applications of sequence alignment and/or phylogeny
estimation to problems in biology.
Because addressing NP-hard problems is fundamental to
bioinformatic analysis, you are also welcome to pick
some NP-hard problem (e.g., travelling salesman) and
do your final project about methods that attempt to
solve the problem.
Your proposal for a course project must be approved in
advance by the instructor.
If you do a research
project, it can be joint with another student in the class.
Note that you will present your final project
proposal in class, and use the Q&A to help
refine your project plan.
The simplest final project is
a survey of the literature
on computational methods or models for
some biological problem. More ambitious
projects involve doing research. For example,
you can analyze a biological dataset using
at least two different methods and discuss the
differences that you
see. Or
you can do a performance study of existing leading methods on
some datasets (biological or simulated), to evaluate
their accuracy and computational requirements.
Or you can develop a new method for some important problem,
and compare it to a prior method.
The more ambitious the project,
the more important it is to start early (and to
consult with me to make sure it's feasible).
All final projects require prior approval, and project
proposals are due no later than April 3 (earlier is much better).
The final project is due May 3 (last class day),
and should be written up in
proper format (as though it were to be published), with
references to the scientific literature. The written document
should be 6-10 pages long (or more if you wish), including
the references and any figures you wish to include.
The grading of the final project is 50% on writing and 50% on
content.
Many students in my classes have done course projects that
have led to journal papers!
I will list some specific final projects that you can do.
If you wish to do something else, you will need
to consult with me to get it approved well in advance.
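
For a rough sense of how the grading weights listed above combine (homeworks 35% with the worst one dropped, midterm 35%, class participation 10%, final project 20%), here is a minimal Python sketch. It assumes every score is on a 0-100 scale and that the remaining homeworks count equally; neither assumption is stated in this syllabus, and the function name is hypothetical.

```python
def course_grade(homeworks, midterm, participation, project):
    """Weighted course grade on a 0-100 scale (illustrative sketch only)."""
    if not homeworks:
        raise ValueError("at least one homework score is required")
    scores = sorted(homeworks)
    # Drop the single worst homework, provided more than one was recorded.
    kept = scores[1:] if len(scores) > 1 else scores
    hw_average = sum(kept) / len(kept)  # equal weighting is an assumption
    return (0.35 * hw_average
            + 0.35 * midterm
            + 0.10 * participation
            + 0.20 * project)


# Example: three homeworks (the 50 is dropped), then midterm, participation, project.
grade = course_grade([100, 50, 90], midterm=80, participation=100, project=90)
```

Regrades, late penalties, and extra-credit replacements would change the individual homework scores before they reach a calculation like this one.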

Academic integrity:
You are expected to abide by the university
academic integrity standards, which means (among other things)
that you should never copy anyone else's homework nor let anyone
copy your homework.
This is particularly important for your final project, especially
if you refer to the scientific literature in your project.
You must also never plagiarize, which
means (among other things) that any text that you
copy from another document must be properly
attributed (with quotation marks around the
copied material, and citation to the document from which you
have copied the material). Even paraphrasing can count
as plagiarism.
All violations of academic integrity standards will be
reported to the appropriate university offices. Serious
violations will result in a failing grade for the course.
Please see this page for a brief discussion of this issue, and
the real
academic integrity page.
The academic integrity code is applied to the
homework assignments, as follows.
You are encouraged to work with other students currently
enrolled in this
course on the
homework; if you do, follow these guidelines.
First, indicate on the homework who you worked with.
Second, do not look at the other homework solutions when you write
your own solutions; this includes not looking at
someone else's write-up of a critique of some literature.
Third, and more generally, you must write your homework solutions
entirely on your own, using your own language. Please do not
under any circumstances copy homework solutions from anyone
else, or let anyone copy from you.
Similarly, the academic integrity code is applied
to your final project by the expectation that you
will not copy text from any paper, and you will
give
appropriate credit for all material that you use
from prior publications, websites, etc.