Natural Language Processing (CS 703) - Semester 7 BTech CSE at VIT

Overview

Why it matters

ChatGPT, Google Translate, Siri, and Alexa are all built on NLP. Understanding tokenization, transformers, BERT, and GPT is essential for building conversational AI, chatbots, search engines, and text analytics.

Placement relevance

Relevant roles include NLP engineer positions at Google, Microsoft, and OpenAI, chatbot developers, and search ranking teams. NLP specialists are quoted at ₹35-70 LPA, and demand has grown sharply since the ChatGPT boom.

Prerequisites for

Conversational AI · Chatbot Development · Machine Translation · Text Analytics · Voice Assistants · LLM Fine-tuning

Recommended books

Speech and Language Processing by Jurafsky and Martin · Natural Language Processing with Python by Steven Bird, Ewan Klein, and Edward Loper · Natural Language Processing in Action by Hobson Lane, Cole Howard, and Hannes Hapke · Transformers for Natural Language Processing by Denis Rothman

Curriculum — 4 Units

Unit 1 · 7 Topics

Text Processing & Basics

⚡ Key Formulae

TF-IDF: TF-IDF(t,d) = TF(t,d) × log(N/DF(t)), where N is the number of documents and DF(t) is the number of documents containing term t

Word2Vec: CBOW (context→word) vs Skip-gram (word→context)
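To make the CBOW vs Skip-gram distinction above concrete, here is a minimal sketch of how the training examples differ for the two objectives. It only builds the (input, target) pairs and does not train any embeddings; the function names, the `window` parameter, and the sample sentence are illustrative assumptions, not part of the syllabus.

```python
def skipgram_pairs(tokens, window=2):
    """Skip-gram: each (center, context) pair predicts one context word from the center word."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """CBOW: each (context_words, center) example predicts the center word from its context."""
    examples = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        examples.append((context, center))
    return examples


# Illustrative sentence (an assumption, not course data)
tokens = "the cat sat on the mat".split()
print(skipgram_pairs(tokens, window=1)[:4])
print(cbow_examples(tokens, window=1)[:3])
```

Skip-gram yields one example per (center, neighbour) pair, while CBOW yields one example per position with all neighbours grouped as the input.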

Tokenization

Stemming & Lemmatization

Stop Words Removal

Bag of Words (BoW)

TF-IDF

Word Embeddings (Word2Vec, GloVe)

N-grams

Unit 2 · 7 Topics

Language Models & Sequence Processing

⚡ Key Formulae

Language Model: P(w₁, w₂, ..., wₙ) = ∏ᵢ P(wᵢ | w₁, ..., wᵢ₋₁)

Attention: Context vector = weighted sum of encoder states
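The chain-rule factorization above can be sketched with a bigram model, which approximates each factor P(wᵢ | w₁, ..., wᵢ₋₁) by P(wᵢ | wᵢ₋₁) estimated from counts. This is a minimal sketch under stated assumptions: the three-sentence corpus, the `<s>` and `</s>` boundary markers, and the function names are illustrative, and no smoothing is applied, so unseen bigrams get probability 0.

```python
from collections import Counter


def train_bigram_counts(sentences):
    """Count histories and bigrams, with <s> and </s> as sentence boundary markers."""
    history_counts, bigram_counts = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return history_counts, bigram_counts


def sentence_probability(sentence, history_counts, bigram_counts):
    """Multiply the bigram estimates P(word | prev) = count(prev, word) / count(prev)."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if history_counts[prev] == 0:
            return 0.0  # unseen history; this sketch does no smoothing
        prob *= bigram_counts[(prev, word)] / history_counts[prev]
    return prob


# Toy corpus (an assumption for illustration only)
corpus = ["i like nlp", "i like transformers", "we like nlp"]
history_counts, bigram_counts = train_bigram_counts(corpus)
print(sentence_probability("i like nlp", history_counts, bigram_counts))
```

For "i like nlp" the product is (2/3) × (2/2) × (2/3) × (2/2) = 4/9, one factor per word plus the end-of-sentence marker.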

N-gram Language Models

RNN for NLP

LSTM for Text

Sequence-to-Sequence Models

Encoder-Decoder Architecture

Attention Mechanism

Beam Search

Unit 3 · 7 Topics

Transformers & Modern NLP

⚡ Key Formulae

Self-Attention: Attention(Q,K,V) = softmax(QKᵀ/√dₖ)V

BERT: Masked Language Modeling + Next Sentence Prediction
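The scaled dot-product attention formula above can be traced step by step in plain Python. This is a minimal single-head sketch, assuming Q, K, and V are given directly as lists of row vectors; in a real Transformer they come from learned linear projections of the input, which are omitted here. The toy matrix `X` and the function names are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V for lists of row vectors."""
    d_k = len(K[0])
    outputs, all_weights = [], []
    for q in Q:
        # One score per key: dot(q, k) / sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Output row = attention-weighted sum of the value vectors
        out = [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy input: 3 tokens with d_k = 2; here Q = K = V = X (no learned projections)
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = scaled_dot_product_attention(X, X, X)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row is one token's attention distribution over all tokens and sums to 1. Multi-head attention runs several such computations in parallel on different projections and concatenates the results.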

Self-Attention Mechanism

Transformer Architecture

BERT (Bidirectional Encoder)

GPT (Generative Pre-trained Transformer)

Fine-tuning Pre-trained Models

Transfer Learning in NLP

Hugging Face Transformers

Unit 4 · 7 Topics

NLP Applications

⚡ Key Formulae

NER: Sequence tagging with BIO tags (Begin, Inside, Outside)

Sentiment: Classify text as Positive, Negative, or Neutral
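As an illustration of the BIO scheme above, the sketch below groups per-token tags back into entity spans. The tag names (`B-PER`, `I-ORG`, and so on), the example sentence, and the function name are assumptions chosen for illustration; a real tagger would predict the tags, which are hard-coded here.

```python
def bio_to_spans(tokens, tags):
    """Group BIO tags into (entity_type, entity_text) spans."""
    spans, current_type, current_tokens = [], None, []

    def flush():
        # Close the entity currently being built, if any
        nonlocal current_type, current_tokens
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity
            flush()
    flush()
    return spans


# Hand-labelled example (an assumption for illustration only)
tokens = ["Sundar", "Pichai", "works", "at", "Google", "in", "California"]
tags = ["B-PER", "I-PER", "O", "O", "B-ORG", "O", "B-LOC"]
print(bio_to_spans(tokens, tags))
```

In this sketch a stray I- tag with no matching B- tag before it is dropped rather than repaired; other conventions treat it as the start of a new entity.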

Sentiment Analysis

Named Entity Recognition (NER)

Machine Translation

Text Summarization

Question Answering

Chatbots & Dialogue Systems

Topic Modeling (LDA)

Previous Year Questions

Unit 1 · 2023 · End Semester · 10 marks

Calculate TF-IDF scores for the word 'machine' in 3 documents. Given: Doc1: 'machine learning', Doc2: 'machine intelligence', Doc3: 'deep learning'. Show all steps.
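One way to check the steps for this question is the short sketch below. It assumes TF is the term count divided by the document length and that the logarithm is the natural log; the question does not fix either choice, so state the convention you use in the exam. The variable names are illustrative.

```python
import math

# Documents exactly as given in the question
docs = {
    "Doc1": "machine learning",
    "Doc2": "machine intelligence",
    "Doc3": "deep learning",
}
term = "machine"

tokenized = {name: text.split() for name, text in docs.items()}
N = len(tokenized)                                            # 3 documents
df = sum(1 for words in tokenized.values() if term in words)  # in Doc1 and Doc2 -> 2
idf = math.log(N / df)                                        # ln(3/2), assumed natural log

tfidf = {}
for name, words in tokenized.items():
    tf = words.count(term) / len(words)   # assumed TF: count / document length
    tfidf[name] = tf * idf
    print(f"{name}: TF = {tf:.2f}, IDF = {idf:.4f}, TF-IDF = {tfidf[name]:.4f}")
```

Under these assumptions, Doc1 and Doc2 each score 0.5 × ln(1.5) ≈ 0.2027 and Doc3 scores 0 because it does not contain 'machine'. With a base-10 log the IDF would instead be about 0.1761, giving about 0.0880 for Doc1 and Doc2.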

Unit 3 · 2023 · End Semester · 8 marks

Explain Transformer architecture with self-attention mechanism. How does multi-head attention work? What are the advantages over RNN/LSTM?

Unit 4 · 2022 · End Semester · 6 marks

Design a sentiment analysis pipeline for tweets. Mention preprocessing steps, feature extraction (TF-IDF or embeddings), and classification algorithm.
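The preprocessing stage of such a pipeline might look like the sketch below. It covers only tweet cleaning and tokenization; feature extraction and the classifier are left out. The stop-word list, the regular expressions, the function name, and the sample tweet are all illustrative assumptions rather than a prescribed answer.

```python
import re

# Small illustrative stop-word list; a real pipeline would use a fuller one.
# Negations such as "not" are deliberately kept because they carry sentiment.
STOP_WORDS = {"a", "an", "the", "is", "it", "to", "and", "of", "in", "on", "this"}


def preprocess_tweet(text):
    """Lowercase, strip URLs and mentions, keep hashtag words, tokenize, drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"@\w+", " ", text)           # remove @mentions
    text = text.replace("#", " ")               # keep the hashtag word, drop the symbol
    tokens = re.findall(r"[a-z']+", text)       # keep alphabetic tokens
    return [t for t in tokens if t not in STOP_WORDS]


# Made-up sample tweet
print(preprocess_tweet("@VIT Loving the new #NLP course!! https://example.com"))
```

The resulting token lists would then feed the feature extraction step (TF-IDF or embeddings) named in the question.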

Exam Strategy

TF-IDF calculations

Practice TF-IDF computations on 2-3 short documents. Show the term frequency, the document frequency, and the final TF-IDF score as separate steps. This is a common exam question.

Transformers are key

Revise the attention mechanism, multi-head attention, and the BERT vs GPT comparison. Be ready to draw the Transformer architecture diagram and explain positional encoding.

Real applications

For sentiment analysis, NER, and chatbots, explain the system with a pipeline diagram: Preprocessing → Feature extraction → Model → Output. Give a concrete example for each.
