
CS 7643 / CS7643 Quiz 4 (Latest Update 2025 / 2026) Deep Learning | Questions & Answers | Grade A | 100% Correct - Georgia Tech

Question:

Teacher Forcing

Answer:

  • next input to the model is not the predicted value, but the actual value from the training data (see the sketch after this list)
  • allows the model to train effectively even if a mistake was made
  • if used instead of hidden-to-hidden recurrence, can allow for parallelization, but the model becomes less powerful
  • emerges from maximum likelihood estimation (MLE)
  • issues may arise if the network is later going to be used in "closed-loop" mode, where output is fed back as input
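Below is a minimal PyTorch-style sketch of teacher forcing in an RNN decoder training step. The module names (embed, rnn_cell, out_proj), the toy dimensions, and the loop structure are illustrative assumptions, not from the quiz:

import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 1000, 64, 128
embed = nn.Embedding(vocab_size, embed_dim)
rnn_cell = nn.GRUCell(embed_dim, hidden_dim)
out_proj = nn.Linear(hidden_dim, vocab_size)
loss_fn = nn.CrossEntropyLoss()

def train_step(targets):                      # targets: (T,) ground-truth token ids
    h = torch.zeros(1, hidden_dim)
    loss = torch.zeros(())
    inp = targets[:1]                         # first ground-truth token as input
    for t in range(1, len(targets)):
        h = rnn_cell(embed(inp), h)
        logits = out_proj(h)
        loss = loss + loss_fn(logits, targets[t:t+1])
        inp = targets[t:t+1]                  # teacher forcing: feed the TRUE token,
                                              # not logits.argmax(), as the next input
    return loss / (len(targets) - 1)

In closed-loop inference the teacher-forced line becomes inp = logits.argmax(dim=-1), which is exactly the train/test mismatch the answer above warns about.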

Question:

Skip-Gram Model: Loss/Objective Function

Answer:

Loss: for each position t, we try to predict the context words within a fixed window of size m given the center word w_(t)

  • multiply these probabilities over all positions and window offsets to get a likelihood
  • L(theta) = prod_(t=1..T) prod_(-m <= j <= m, j != 0) P(w_(t+j) | w_(t) ; theta)

- Objective function (average negative log-likelihood): J(theta) = -(1/T) log L(theta)
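As a concrete sketch, the objective can be assembled from a token sequence like this (pure Python; log_prob stands in for the softmax probability defined in the next question and is an assumed callable):

def skipgram_objective(tokens, m, log_prob):
    # J(theta) = -(1/T) * sum over positions t and offsets j of log P(w_(t+j) | w_(t))
    T = len(tokens)
    total = 0.0
    for t in range(T):
        for j in range(-m, m + 1):
            if j == 0 or not (0 <= t + j < T):
                continue                                  # skip the center word and corpus edges
            total += log_prob(tokens[t + j], tokens[t])   # log P(context | center)
    return -total / T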

Question:

Skip-Gram Model: Calculate P(w_(t+j) | w_(t) ; theta)

Answer:

- Two vectors for each word:

  • u_w when w is the center word
  • v_o when o is a context word
  • uses the inner product u_w . v_o to measure how likely center word w is to appear with context word o
  • P(w_(t+j) | w_(t)) = exp(u_(w_t) . v_(w_(t+j))) / sum over o' of exp(u_(w_t) . v_(o')), i.e. a softmax over the vocabulary
  • params to optimize are thus the u and v vectors (see the sketch below)
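A minimal numpy sketch of that softmax, assuming a center-vector matrix U and a context-vector matrix V indexed by word id (the names and toy sizes are assumptions):

import numpy as np

vocab_size, dim = 1000, 50
rng = np.random.default_rng(0)
U = rng.normal(size=(vocab_size, dim))        # u_w: center-word vectors
V = rng.normal(size=(vocab_size, dim))        # v_o: context-word vectors

def p_context_given_center(o, w):
    # P(o | w) = exp(u_w . v_o) / sum over o' of exp(u_w . v_o')
    scores = V @ U[w]                         # inner product with every context vector
    scores -= scores.max()                    # for numerical stability
    e = np.exp(scores)
    return e[o] / e.sum()

Note that the denominator touches every word in the vocabulary, which is exactly the "expensive to compute" disadvantage raised in the next question.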

Question:

Skip-Gram Model: Main Disadvantage

Answer:

  • Expensive to compute: the softmax denominator sums over the entire vocabulary
  • Can solve this via hierarchical softmax
  • Can solve this via negative sampling

Question:

Skip-Gram Model: Negative Sampling

Answer:

  • for each positive (w, c) pair, sample k negative pairs (w, c')
  • maximize the probability that the true outside word appears, minimize the probability that a random word appears
  • choose a sampling distribution that makes less frequent words more likely to be drawn (see the sketch below)
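A minimal numpy sketch of the per-pair negative-sampling objective and the sampling distribution (the function names are assumptions; the 3/4 exponent is the standard word2vec choice for boosting rarer words):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def neg_sampling_loss(u_w, v_c, v_negs):
    # -log sigma(u_w . v_c) - sum_k log sigma(-u_w . v_c'_k)
    # u_w: center vector; v_c: true context vector; v_negs: (k, dim) sampled negatives
    pos = np.log(sigmoid(u_w @ v_c))               # pull the true (w, c) pair together
    neg = np.log(sigmoid(-(v_negs @ u_w))).sum()   # push the k sampled pairs apart
    return -(pos + neg)

def sample_negatives(counts, k, rng=np.random.default_rng(0)):
    # counts: (vocab,) array of word frequencies;
    # raising the unigram distribution to the 3/4 power boosts less frequent words
    p = counts.astype(float) ** 0.75
    p /= p.sum()
    return rng.choice(len(counts), size=k, p=p)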

Question:

Word Embeddings as a graph

Answer:

  • each word is a node with edge connections to its context words (see the sketch below)
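A tiny sketch of that graph as an adjacency structure built from a token sequence (pure Python; the representation is an assumption):

from collections import defaultdict

def build_word_graph(tokens, window=2):
    # each word is a node; connect it to every word inside its context window
    graph = defaultdict(set)
    for t, w in enumerate(tokens):
        for j in range(-window, window + 1):
            if j != 0 and 0 <= t + j < len(tokens):
                graph[w].add(tokens[t + j])
    return graph

# e.g. build_word_graph("the quick brown fox".split())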
