PLPTH 890
Introduction to Genomic Bioinformatics
Spring 2007

Instructor

Clare Nelson
Associate professor
Department of Plant Pathology
4022B Throckmorton
jcn@ksu.edu
2-1359

Class times

Lecture: TU 10:00 - 11:15 in Throckmorton 4031 (Willis Conference Room)
Lab: U 12:30 - 3:20 in Throckmorton 1302B (Agronomy computer lab; not used this semester)

Course summary


This course, offered in spring of odd-numbered years, is oriented to graduate-level students in the biological sciences who seek an introduction to the principles and practice of computational analysis of genomic data. Students from computer-science and other analytical backgrounds are welcome, but should be aware that much of the material is targeted to the level of analytical aptitude and training of biology students. For this reason the course focuses less on developing tools than on using them appropriately and communicating intelligently with bioinformatics specialists.


What bioinformatics is and isn't


It's usually said that bioinformatics is the computational analysis of genomic data. But if you scan the journal Bioinformatics, you'll see very few biological results. What you'll see is a lot of work on computational methods and tools, built on foundations of mathematics and statistics. Few students in biological courses of study want this, so this course might better be called something like "Introduction to bioinformatics tools." Whatever the name, its aim is to help you become a biologist better equipped to use genomic data and, for a few students, to show the beauty and power of deeper approaches.


Programming is not required in this course.

Working with a programmer is!

In the first three years the course was offered, it featured a required series of Perl lectures, labs, and assignments. Programming is a basic means of expression in bioinformatics (in my opinion, in most of the sciences), and a course purporting to cover bioinformatics cannot fail to expose students to Perl. However, as a rule, about half the students can handle this material, while the other half are baffled by it and sometimes even drop out to avoid it. In fact you can do lots of bioinformatics without writing a line of code -- but you can be much more effective if you know how, or have a colleague who does. In this course, at least one of those conditions will be met.



Two-track instruction plan


All students will attend the twice-weekly lectures. In addition, a second set of four optional sessions on Perl programming will be arranged, beginning in the first month. Problem sets will contain exercises requiring Perl knowledge. Students will be assigned partners, at least one of whom will be able to tackle the programming exercises. These partners may not be the same throughout the semester. Exercises will overlap; some will need to be done jointly and some independently, but not all will be required. As a sample, an exercise set might contain
  • Required problems (50 - 80% of credit)
  • Elective problems (40 - 70% of credit)

with both groups containing problems requiring some or no programming.

I will expect independent work where indicated, and will encourage collaborative work elsewhere.


Prerequisites


Most basically: you must be comfortable with using computers for composing documents and WWW browsing.

Beyond this, a familiarity with the biochemistry and genetics of nucleic acids and proteins is assumed. Most bioinformatics texts devote a chapter or two to reviewing these basics. This isn't a biology course and won't teach you the subject from scratch, although we will try not to use jargon unnecessarily. If ORFs, UTRs, ESTs, STSs, cDNA, mRNA, RFLPs, BACs, YACs, cosmids, promoters, mitochondria, ADH, ATP, poly-A tails, 5' ends, hybridization, Southern blots, contigs, denaturing, retrotransposons, LTRs, PCR, homology, paralogy, physical maps, and microarrays (to list just a few) are new to you, you risk getting lost. For a molecular biologist these things are core knowledge for bioinformatics work. So if most of these terms are mysteries to you, do consider taking a couple of biology courses first. Completion of Biology 450 (Modern Genetics) at KSU, its equivalent in Animal Science, or the equivalent at another university, should be enough. If you just need a refresher, have a look at this molecular-biology tutorial, this view of computational molecular biology, and this introduction to DNA structure!


Topics


Topics are listed in order from more computational to more biological.

  • building WWW pages
  • using Unix-flavored operating systems
  • pairwise and multiple sequence alignment
  • phylogenetic tree construction
  • DNA and protein pattern searching, gene finding
  • comparative genomics
  • analysis of microarray data
  • inferring gene regulatory networks
  • interface of genetics and genomics (eQTLs)


Textbooks




Textbooks are not required, but I don't recommend that you rely only on the lectures to learn bioinformatics.
Below are listed some books and my impressions of them.


Bioinformatics: Sequence and Genome Analysis, 2nd ed.
D. W. Mount; Cold Spring Harbor Laboratory Press, 2004
The author's motive is to explain the algorithms that underlie sequence alignment and database searching; reconstruction of phylogenies, genes, and RNA and protein structures; and genome analysis, with coverage of microarrays and pathways. Local bookstores have the book in stock, or it costs about $80 plus shipping if ordered from CSHL. I recommend that students acquire this widely adopted book. It covers a wide range of bioinformatics topics more comprehensively than other texts, if sometimes not as lucidly as might be wished.
Bioinformatics and Molecular Evolution
P. Higgs, T. Attwood; Blackwell, 2005
I'm impressed with this book too; I find the explanations clearer and the end-of-chapter problems more interesting. But it's more phylogenetics-centered than Mount, and there are several areas, like gene prediction, genome assembly, and physical mapping, that are touched on only very lightly.
Bioinformatics and Functional Genomics
J. Pevsner; Wiley, 2003
I haven't examined it yet.
Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, 3rd edition
Baxevanis & Ouellette, eds.; Wiley Interscience, 2004
I think there's a new edition.

Fundamental Concepts of Bioinformatics
D. E. Krane, M. L. Raymer; Benjamin Cummings, 2002.

This is a shorter book than Mount's but contains solidly useful coverage of the key concepts. I found the chapters on phylogenetics and protein structure analysis especially informative. As a textbook it's designed for an undergraduate-level course -- we don't have one at KSU yet.
Discovering Genomics, Proteomics, & Bioinformatics
A. M. Campbell, L. J. Heyer; Benjamin Cummings, 2002
This book is unlike others in touching only lightly on algorithms (I don't think it even mentions the sequence-alignment problem) and focusing on practical discovery with existing tools and databases, with principal attention to human genetics and diseases. I would class this too as best adapted to an undergraduate course for future medical or molecular-biological professionals. This is in no way to slight its rich content (Chapters 7 and 8, for example, give a 50+ page coverage of genomic circuitry and its dissection, probably the foundation stone of post-genomic research), but the students in our 890 class are likely to be a mix of 1) plant genetics researchers and 2) crossover students from CS and other computing-heavy fields interested in computational approaches.
Developing Bioinformatics Computer Skills
Gibas & Jambeck; O'Reilly, 2001
Cheaper but less comprehensive.
Microarray Bioinformatics
D. Stekel; Cambridge University Press, 2003



Perl

There are plenty of these.
Learning Perl, 3rd edition
Schwartz & Phoenix; O'Reilly
Good starter; another is Programming Perl
Bioinformatics, Biocomputing and Perl
M. Moorhouse, P. Barry; Wiley, 2004
Haven't seen it.
Perl Programming for Biologists
D. C. Jamison; Wiley, 2003
Haven't seen it.
Beginning Perl for Bioinformatics
J. Tisdall, B. Waliszewski; O'Reilly, 2001
I've only glanced at this, but it appears to be a manual of practical Perl programming, and well worth a look. The followup book, which introduces BioPerl in Ch. 9 (free for online reading), is...
Mastering Perl for Bioinformatics, 1st ed.
J. Tisdall; O'Reilly, 2003

Genomic Perl: from Bioinformatics Basics to Working Code
R. A. Dwyer; Cambridge University Press, 2002
Written by a computer scientist and too advanced for beginners.


Computational biology

The following books go much deeper than we do in the course.
Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids
Durbin, Eddy, Krogh, and Mitchison; Cambridge University Press, 1998
Explains hidden-Markov models (HMMs).

Bioinformatics: the Machine Learning Approach
P. Baldi and S. Brunak; MIT Press, 1998

Relates Bayesian, information-theoretic, likelihood-maximizing, and energy-minimizing measures and applies them to many problems, including HMMs, phylogenetic trees, and microarrays.

Also worth a look: the texts by Waterman and by Setubal and Meidanis.


Exercises/projects


Exercises and problem sets will be assigned each week. A research project will be required and may be developed from your own research data or interests.


More organizational details




WWW resources


Here is CN's current bookmarks list, a source of ideas, tutorials, readings, and exercises for the course.


Journals


A highly regarded printed journal is Bioinformatics. CN receives this in print and online and will lend issues or help students obtain articles. Many more journals, including online-only resources like BMC Bioinformatics, appear in this list.


Former-student work


Demonstration of alignment by dynamic programming
Animated tutorial on profile construction


About the animation


The animated GIF at top was made with a RasMol script that rotates a myoglobin molecule through 360° in 10° increments about each of the three axes in turn, occasionally changing the drawing parameters and saving an image to disk after each rotation step. The 108 image files (36 steps for each of 3 axes) were then assembled with Animagic's GIF Animator into the single GIF image file that is invoked from this WWW page. Much more elaborate movies are possible, showing biochemically interesting aspects of the molecule being displayed. An operation that a few years ago required a graduate-level scientist and a high-end workstation is now an hour's work for a bright ten-year-old on a desktop computer...
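For the curious, the rotation scheme described above can be sketched in a few lines of Python that write out a RasMol script. This is only an illustration, not the script actually used for the animation: the input file name myoglobin.pdb and the frame file names are made up, the changes of drawing parameters are left out, and the sketch assumes RasMol's "load", "rotate <axis> <degrees>", and "write gif <file>" script commands.

```python
# Sketch: generate a RasMol script that spins a molecule about x, y, and z
# in 10-degree steps and saves one GIF frame after every step.
# File names are hypothetical; the RasMol commands are assumed to be
# "load", "rotate <axis> <degrees>", and "write gif <file>".

AXES = ("x", "y", "z")
STEP_DEGREES = 10


def build_rasmol_script(pdb_file="myoglobin.pdb", step=STEP_DEGREES):
    """Return (script_lines, frame_count) for a full turn about each axis."""
    steps_per_axis = 360 // step          # 36 steps of 10 degrees each
    lines = [f"load {pdb_file}"]
    frame = 0
    for axis in AXES:
        for _ in range(steps_per_axis):
            frame += 1
            lines.append(f"rotate {axis} {step}")
            lines.append(f"write gif frame{frame:03d}.gif")
    return lines, frame


script_lines, frame_count = build_rasmol_script()
print(frame_count)                        # number of frames to assemble
print("\n".join(script_lines[:5]))        # first few script commands
```

Saved to a file, the generated lines could be run from within RasMol, and the resulting frames handed to any GIF-assembly tool.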
 