Something I was wondering recently: what's the difference between a QA model and a model like GPT? The short answer is that QA models focus specifically on the question-answering task, whereas an open-ended generalist like GPT can also do things like:

  • Generate text
  • Hold conversations
  • Translate languages
  • Summarize content
  • Write stories, code, emails, poems, etc.

Differences

| Feature | Traditional QA Model (like BERT + SQuAD) | ChatGPT |
|---|---|---|
| Primary Purpose | Extract answer spans from context | General-purpose language generation |
| Input Requirement | Needs context + question | Can answer with or without context |
| Memory | Stateless (answers one question at a time) | Conversational memory (keeps chat history) |
| Output Type | Usually short answer spans | Long-form, flexible, human-like responses |
| Training | Fine-tuned on QA datasets | Trained on massive text + fine-tuned with RLHF (human feedback) |
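
To make "extract answer spans from context" a bit more concrete, here is a rough sketch of the span-selection step an extractive QA model typically ends with. It assumes the model has already produced one start score and one end score per context token; the scores below are made up for illustration, and `best_span` and `extract_answer` are hypothetical helper names, not part of any library.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 10) -> Tuple[int, int]:
    """Pick token indices (s, e) that maximize start_scores[s] + end_scores[e],
    subject to s <= e and a span of at most max_len tokens."""
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        # The end token may not come before the start token.
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


def extract_answer(context_tokens: List[str], start_scores: List[float],
                   end_scores: List[float]) -> str:
    """Return the answer as a literal slice of the context."""
    s, e = best_span(start_scores, end_scores)
    return " ".join(context_tokens[s:e + 1])


# Context the model is allowed to answer from.
context_tokens = "BERT was released by Google in 2018".split()

# Made-up per-token scores for the question "Who released BERT?".
# A real model would compute these from the question and the context.
start_scores = [0.1, 0.0, 0.2, 0.3, 4.0, 0.1, 0.5]
end_scores = [0.0, 0.1, 0.1, 0.2, 3.5, 0.3, 1.0]

answer = extract_answer(context_tokens, start_scores, end_scores)
print(answer)  # Google
```

The point is that the answer is always a slice of the supplied context, which is why this kind of model needs context plus a question and returns short spans. A generative model like ChatGPT instead produces its answer token by token, so it is not limited to text that appears in the input.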

One thing to note: an LLM can be used as a QA model, but it isn't just a QA model.
