STANFORD LINGUIST 138/238     -     SYMBSYS 138
Introduction to Computer Speech and Language Processing 
Autumn 2004

Course Information

Time: Tu, Th 3:15-4:45, Classroom 380-380X
Professor: Dan Jurafsky. Office Hours: 113 Margaret Jacks, Wed 2:30-4:00
TA: Neal Snider. Office Hours: TBA

This new course is an introductory overview of computer speech and language processing and computational linguistics. We will cover spoken language dialog systems, speech recognition and synthesis, web-based question answering, web search, finite-state methods, parsing and grammars, computational semantics, and discourse processing. The focus of this class is on writing scripts that use available online implementations of these applications, rather than on implementing the applications themselves. The class thus serves as a natural introduction to Stanford's rich offerings in computer language processing. Students are encouraged to continue with the classes listed in the NLP, Speech and Dialog Processing, and Computational Linguistics Course List.

Course Requirements:

Readings

To be handed out.

Schedule

WK
DATE
TOPIC
HOMEWORK DUE TODAY
1
Sep 28
Overview of Computer Speech and Language Processing, Regular Expressions
lecture 1 slides: ppt
J+M Chapter 1

1
Sep 30
Finite Automata
lecture 2 slides: ppt
J+M Chapter 2
J. Weizenbaum. ELIZA - A Computer Program for the Study of Natural Language Communication between Man and Machine. CACM, Vol. 9, No. 1, 1966
2
Oct 5
Speech Dialogue Systems
lecture 3 slides: ppt
J+M New Draft of Chapter 19
HW 1: ELIZA
2
Oct 7
Speech Dialogue Systems (II)
lecture 4 slides: ppt
Continue reading J+M new draft of Chapter 19
3
Oct 12
Part of Speech Tagging and Intro to Probabilistic Modeling
lecture 5 slides: ppt
J+M New Draft of Chapter 8
HW 2: Dialogue
3
Oct 14
Part of Speech Tagging (II): Guest Lecture by Neal Snider
lecture 6 slides: ppt
4
Oct 19
Text-to-Speech Synthesis (I): Overview, Text Normalization, Grapheme-to-Phoneme
lecture 7 notes: html

Read sections 1, 2, 3, and 4 from Alan Black's lecture notes on TTS and Festival.
You can also look at the Festival manual. You don't have to read the whole thing through, but you should skim it so you know where things are in the manual.
OPTIONAL ADVANCED READING: The scripting language for Festival is Scheme. For those who don't know Scheme, here's an Introduction to Scheme for C Programmers, from Caltech.
OPTIONAL ADVANCED READING: For those who get really excited by Scheme and want to know more, here's the homepage for the text Structure and Interpretation of Computer Programs, by Abelson, Sussman, and Sussman.
OPTIONAL ADVANCED READING: For more on text normalization, you can read sections 5 and 6.1.1 from Alan Black's lecture notes on TTS and Festival.
HW 3: POS
4
Oct 21
Text-to-Speech Synthesis (II): Prosody, Duration, Diphones, Unit Selection
5
Oct 26
Automatic Speech Recognition (I)
lecture 9 slides: ppt
J+M Chapter 7
HW 4: TTS
5
Oct 28
Automatic Speech Recognition (II)
lecture 10 slides: ppt
Pages 191-206 in J+M Chapter 6
6
Nov 2
Machine Translation (I)
lecture 11 slides: ppt
J+M Chapter 21
6
Nov 4
Machine Translation (II)
lecture 12 slides: ppt
HW 5: ASR
7
Nov 9
Grammars and Parsing (I)
lecture 13 slides: ppt
J+M Chapter 9
7
Nov 11
Grammars and Parsing (II)
lecture 14 slides: ppt
J+M Chapter 10
HW 6: MT
8
Nov 16
No Class; Dan in Switzerland
Note that although there is no class today, there is lots of reading for Thursday! So make sure you look at it early!
8
Nov 18
Jim Martin Guest Lecture: Web-based Question Answering
lecture 15 slides: ppt
J+M Chapter 17, read pages 646-658
Daniel M. Bikel, Richard Schwartz and Ralph M. Weischedel. 1999. An Algorithm that Learns What's in a Name. Machine Learning Journal Special Issue on Natural Language Learning.
D. Moldovan, S. Harabagiu, M. Pasca, R. Mihalcea, R. Goodrum, R. Girju, and V. Rus. 1999. LASSO: A tool for surfing the answer net. In Proceedings of the Eighth Text Retrieval Conference (TREC-8), 1999.
E. Brill, S. Dumais and M. Banko. 2002. An analysis of the AskMSR question-answering system. Proceedings of EMNLP 2002.
9
Nov 23
WordNet, FrameNet, and Computational Lexical Semantics
lecture 16 slides: ppt
9
Nov 25
THANKSGIVING HOLIDAY
10
Nov 30
Anaphora Resolution
lecture 17 slides: ppt
10
Dec 2
TBA
10
Dec 3
Friday due-date for homework: 10:00am
HW 7: QA

URL: http://www.stanford.edu/class/linguist238/