The Literature: Books and journals
[Advanced topics]
Linguistics

The ACL-MIT Series in Natural Language Processing contains a number of
books, mostly on advanced topics.

The Center for the Study of Language and Information (CSLI) at Stanford publishes an extensive
collection of books on advanced topics, many of them covering linguistics and computational
linguistics (or at least the theory thereof).
See their catalogue and the CSLI homepage.
Word morphology
This deals with aspects of word formation such as plurals, hyphenated forms, and various
affixes. Affixes are quite common in biological terminology, so it is useful to understand this field.
Examples of common prefixes and suffixes are: ortho-, poly-, micro-,
-ase, -some, -cin, -ine, -globin, -genic, and many more. Simple "stemming" chops off
suffixes, typically to turn plurals into singulars, but more careful manipulations
can be done.
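
As a rough illustration of the simple "stemming" described above, the following Python sketch strips a few plural suffixes. The rule list, the protected endings, and the function name naive_stem are all hypothetical choices made for this example; it is not a standard stemming algorithm and will mishandle many real words.

```python
# Hypothetical suffix rules for illustration only, tried in order
# (longer suffixes first so "classes" is not handled by the bare "s" rule).
RULES = [
    ("sses", "ss"),  # classes -> class
    ("ies", "y"),    # antibodies -> antibody
    ("s", ""),       # kinases -> kinase
]

# Assumed list of endings that usually do not mark a plural,
# e.g. "virus", "analysis", "class".
PROTECTED_ENDINGS = ("ss", "us", "is")


def naive_stem(word):
    """Lower-case the word and strip at most one plural suffix."""
    lower = word.lower()
    if lower.endswith(PROTECTED_ENDINGS):
        return lower
    for suffix, replacement in RULES:
        stem = lower[: len(lower) - len(suffix)]
        # Keep at least three characters of stem so short words survive.
        if lower.endswith(suffix) and len(stem) >= 3:
            return stem + replacement
    return lower


if __name__ == "__main__":
    for w in ["kinases", "antibodies", "classes", "virus", "gas"]:
        print(w, "->", naive_stem(w))
```

A more careful treatment would also look at affixes such as -ase or -genic rather than only plurals, which is where the morphology literature listed here becomes useful.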

Sproat, Richard. 1992.
Morphology and Computation. Cambridge:
MIT Press. ISBN 0262193140 313 pp. 49 illus.
$50.00/£34.50 (cloth). This is an excellent book on this topic.
The books below all require some competence in mathematics, particularly discrete math,
probability, statistics, and some information theory.
Statistical NLP

Foundations of Statistical Natural Language Processing
by Christopher D. Manning and Hinrich Schütze
Cambridge: MIT Press. 1999. ISBN 0262133601 620 pp. $64.95/£44.95 (cloth).
See the book's web page for more details about the book, sample chapters, courses
around the country using it, etc.

Elements of Information Theory by Thomas M.
Cover and Joy A. Thomas, published by John Wiley, 1991.
The authors have their own
home page for the book. It is truly an excellent book on the topics underlying
much of statistical natural language processing.
Finite-State Methods

Finite-State Language Processing
by Emmanuel Roche and Yves Schabes (eds.)
MIT Press, 1997. ISBN 0262181827 464 pp. $60.00/£41.50 (cloth).
Finite-state methods, corresponding roughly to the regular expressions
used in Perl, the Java ORO tools, etc., are not as powerful as some
parsing techniques, but they are faster and more efficient than the
more general methods, both in theory and in practice. By carefully
crafting them and combining various finite-state analyses, practical
and fast systems can be built.
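
To make the idea of combining several finite-state analyses concrete, here is a minimal Python sketch that runs two regular expressions over the same text and collects the labelled matches. The pattern names, the patterns themselves, and the function tag_tokens are assumptions made for this example, not taken from the book. Note also that Python's re module is a backtracking engine, so the sketch shows the pattern language rather than the speed of a true finite-state implementation.

```python
import re

# Hypothetical patterns for illustration; each uses only regular
# constructs (character classes, repetition, word boundaries).
PATTERNS = {
    # Words ending in "ase", as many enzyme names do. This over-matches
    # ordinary words such as "base".
    "enzyme_like": re.compile(r"\b[a-z]+ase\b", re.IGNORECASE),
    # Words starting with the prefix "poly".
    "poly_prefixed": re.compile(r"\bpoly[a-z]+\b", re.IGNORECASE),
}


def tag_tokens(text):
    """Return (label, matched_text) pairs, grouped by pattern in definition order."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(0)))
    return hits


if __name__ == "__main__":
    sample = "The polymerase binds a polypeptide near the lactase site."
    for label, token in tag_tokens(sample):
        print(label, token)
```

Each pattern is a separate, simple analysis. A practical system would layer many such passes and resolve cases where they disagree or overlap, as with "polymerase" here, which matches both patterns.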
Statistical Language Learning
Statistical Language Learning
by Eugene Charniak. MIT Press, 1996. ISBN 0262531410 192 pp. 80 illus. $17.95/£12.50 (paper).
A short book, full of interesting topics.