Indian Institute of Technology Bombay

Course outline:
Sound : Biology of Speech Processing; Place and Manner of Articulation; Word Boundary Detection; Argmax based computations; HMM and Speech Recognition.

Words and Word Forms : Morphology fundamentals; Morphological Diversity of Indian Languages; Morphology Paradigms; Finite State Machine Based Morphology; Automatic Morphology Learning; Shallow Parsing; Named Entities; Maximum Entropy Models; Random Fields.

Structures : Theories of Parsing; Parsing Algorithms; Robust and Scalable Parsing on Noisy Text such as Web Documents; Hybrid of Rule Based and Probabilistic Parsing; Scope Ambiguity and Attachment Ambiguity Resolution.

Meaning : Lexical Knowledge Networks, WordNet Theory; Indian Language Wordnets and Multilingual Dictionaries; Semantic Roles; Word Sense Disambiguation (WSD); WSD and Multilinguality; Metaphors; Coreferences.

Web 2.0 Applications : Sentiment Analysis; Text Entailment; Robust and Scalable Machine Translation; Question Answering in a Multilingual Setting; Cross Lingual Information Retrieval (CLIR).

Lecture topics:

  1. Introduction
  2. Machine Learning and NLP
  3. ArgMax Computation
  4. WSD : WordNet
  5. WordNet; Application in Query Expansion
  6. Wiktionary; semantic relatedness
  7. Measures of WordNet Similarity
  8. Resnik's work on WordNet Similarity
  9. Parsing Algorithms
  10. Evidence for Deeper Structure; Top Down Parsing Algorithms
  11. Noun Structure; Top Down Parsing Algorithms
  12. Non-noun Structure and Parsing Algorithms
  13. Probabilistic parsing; sequence labeling, PCFG
  14. Probabilistic parsing: Training issues
  15. Arguments and Adjuncts
  16. Probabilistic parsing; inside-outside probabilities
  17. Speech : Phonetics
  18. HMM
  19. Morphology
  20. Graphical Models for Sequence Labeling in NLP
  21. Phonetics
  22. Consonants (place and manner of articulation) and Vowels
  23. Forward Backward probability; Viterbi Algorithm
  24. Phonology
  25. Sentiment Analysis and Opinions on the Web
  26. Machine Translation and MT Tools - GIZA++ and Moses
  27. Text Entailment
  28. POS Tagging
  29. Phonology; ASR, Speech Synthesis
  30. HMM and Viterbi
  31. Precision, Recall, F-score, MAP
  32. Semantic Relations; UNL; Towards Dependency Parsing
  33. Universal Networking Language
  34. Semantic Role Extraction
  35. Baum Welch Algorithm; HMM training
