AI Glossary
Every AI term explained clearly, with real-world use cases from developers who build production AI apps every day. No academic jargon.
50 terms defined
Fundamentals
Large Language Model (LLM)
A neural network trained on massive text datasets that can generate, understand, and reason about human language.
Transformer
The neural network architecture behind virtually all modern LLMs, using self-attention mechanisms to process sequences in parallel.
Tokenization
The process of breaking text into smaller units (tokens) that an AI model can process, typically subwords or word pieces.
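Production models learn a subword vocabulary with algorithms like BPE or SentencePiece; the sketch below only illustrates the idea, using a tiny hand-written vocabulary and greedy longest-match splitting:

```python
# Toy greedy longest-match subword tokenizer. Illustration only --
# real tokenizers use a learned vocabulary of tens of thousands of pieces.
VOCAB = {"un", "happi", "ness", "token", "ize", "r"}

def tokenize(word: str) -> list[str]:
    """Split a word into the longest vocabulary pieces, left to right."""
    tokens, i = [], 0
    while i < len(word):
        # Try the longest possible piece first.
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # Unknown character: emit it as its own token.
            tokens.append(word[i])
            i += 1
    return tokens

print(tokenize("unhappiness"))  # ['un', 'happi', 'ness']
```

This is why token counts rarely match word counts: one word often splits into several pieces, and billing and context limits are measured in those pieces.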
Context Window
The maximum amount of text (measured in tokens) that an AI model can process in a single request, including both input and output.
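A common chore this creates: trimming old conversation history so a request still fits. A minimal sketch (word count stands in for real token counting, which should use the provider's tokenizer):

```python
def trim_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Drop the oldest messages until the conversation fits the window.
    Word count approximates token count here for illustration."""
    kept: list[str] = []
    total = 0
    for msg in reversed(messages):  # keep the most recent messages first
        cost = len(msg.split())
        if total + cost > max_tokens:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))

history = ["hi there", "how can I help", "summarize this long report please"]
trimmed = trim_to_window(history, max_tokens=9)
print(trimmed)  # the two most recent messages fit; "hi there" is dropped
```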
Hallucination
When an AI model generates information that sounds plausible but is factually incorrect, fabricated, or unsupported by its training data or the provided context.
Multimodal AI
AI models that can process and generate multiple types of data: text, images, audio, video, and code.
Temperature
A parameter that controls the randomness of AI model outputs, with lower values producing more deterministic responses and higher values producing more creative ones.
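Under the hood, temperature rescales the model's raw scores before they become probabilities. The function below is a standard temperature-scaled softmax, shown as an illustration rather than any provider's exact implementation:

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw model scores (logits) into probabilities.
    Lower temperature sharpens the distribution; higher flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.2)  # near-deterministic
hot = softmax_with_temperature(logits, 2.0)   # closer to uniform
print(max(cold), max(hot))  # the top token dominates at low temperature
```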
Attention Mechanism
A neural network component that allows models to dynamically focus on the most relevant parts of the input when generating each token of output.
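The core computation is simple enough to sketch in a few lines: score the query against every key, softmax the scores into weights, and return the weighted sum of values. This single-query, pure-Python version omits the batching, multiple heads, and learned projections of a real transformer:

```python
import math

def attention(query: list[float], keys: list[list[float]],
              values: list[list[float]]) -> list[float]:
    """Scaled dot-product attention for one query vector."""
    d = len(query)
    # Score the query against each key (scaled by sqrt of dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    # Softmax the scores into attention weights.
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Weighted sum of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

# A query aligned with the first key pulls the output toward the first value.
out = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]],
                [[10.0, 0.0], [0.0, 10.0]])
print(out)
```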
Techniques
RAG (Retrieval-Augmented Generation)
A technique that enhances AI responses by retrieving relevant information from a knowledge base before generating an answer.
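The flow is: embed the question, rank documents by similarity, and stuff the best matches into the prompt. The sketch below fakes embeddings with bag-of-words counts so it runs anywhere; a real pipeline would use an embedding model and a vector database:

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' -- stand-in for a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_rag_prompt(question: str, documents: list[str], top_k: int = 2) -> str:
    """Retrieve the most relevant documents, then prepend them to the prompt."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = ["refund policy allows returns within 30 days",
        "our office is in Berlin",
        "shipping takes 5 days"]
prompt = build_rag_prompt("What is the refund policy?", docs, top_k=1)
print(prompt)
```

The generated prompt grounds the model's answer in retrieved facts, which is the main defense against hallucination in knowledge-heavy apps.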
Embeddings
Numerical vector representations of text that capture semantic meaning, enabling similarity search and clustering.
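Similarity between embeddings is usually measured with cosine similarity. The vectors below are hypothetical 3-dimensional examples; real embedding models output hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Similarity between two embedding vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up embeddings: related concepts point in similar directions.
cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.2, 0.05]
invoice = [0.0, 0.1, 0.95]

sim_close = cosine_similarity(cat, kitten)
sim_far = cosine_similarity(cat, invoice)
print(sim_close > sim_far)  # True: "cat" is closer to "kitten" than to "invoice"
```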
Fine-Tuning
The process of further training a pre-trained AI model on your specific data to improve performance on domain-specific tasks.
Prompt Engineering
The practice of crafting effective instructions for AI models to produce desired outputs consistently.
Function Calling (Tool Use)
An AI capability where the model can decide to invoke external functions or APIs based on the conversation context.
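In practice the model returns a structured request naming a tool and its arguments, and your code executes it. The dispatcher below is a simplified sketch with a simulated model response; real SDKs (OpenAI, Anthropic) define their own request and response shapes:

```python
import json

# Hypothetical tool: a real one would call a weather API.
def get_weather(city: str) -> str:
    return f"18°C and sunny in {city}"

TOOLS = {"get_weather": get_weather}

def handle_model_response(response: str) -> str:
    """If the model asked to call a tool, look it up and run it."""
    call = json.loads(response)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Simulated model output requesting a tool call:
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
print(handle_model_response(model_output))  # 18°C and sunny in Paris
```

The tool result is then sent back to the model so it can compose a final natural-language answer.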
AI Agent
An AI system that can autonomously plan, reason, use tools, and take actions to accomplish goals with minimal human intervention.
Chain of Thought (CoT)
A prompting technique that improves AI reasoning by instructing the model to break down complex problems into intermediate steps before giving a final answer.
Few-Shot Learning
A prompting technique where you provide a small number of input-output examples in the prompt to teach the model the desired behavior.
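Few-shot prompting is mostly string assembly. A minimal sketch using a made-up sentiment task:

```python
def few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Prepend labelled examples so the model infers the task pattern."""
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    # The final block is left unanswered for the model to complete.
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)

examples = [
    ("Loved every minute of it", "positive"),
    ("Total waste of money", "negative"),
]
prompt = few_shot_prompt(examples, "Surprisingly good")
print(prompt)
```

Two or three well-chosen examples often beat a long paragraph of instructions, especially for formatting and classification tasks.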
Zero-Shot Learning
The ability of an AI model to perform a task based solely on instructions, without any training examples provided in the prompt.
Transfer Learning
A machine learning technique where a model trained on one task is adapted to perform a different but related task, reducing the data and compute needed.
Reinforcement Learning from Human Feedback (RLHF)
A training technique that aligns AI model behavior with human preferences by using human feedback to reward desired outputs and penalize undesired ones.
Semantic Search
A search approach that finds results based on meaning rather than exact keyword matches, using embeddings to understand the intent behind queries.
Structured Output / JSON Mode
A feature that forces AI models to return responses in a specific format like JSON, producing outputs that can be reliably parsed and validated for programmatic use.
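Even with JSON mode enabled, defensive code still parses and validates the response. A minimal sketch (real APIs additionally accept a schema parameter that enforces structure server-side):

```python
import json

def parse_model_json(raw: str, required: set[str]) -> dict:
    """Parse a model's JSON response and check that required keys exist."""
    data = json.loads(raw)
    missing = required - data.keys()
    if missing:
        raise ValueError(f"model omitted required fields: {missing}")
    return data

# Simulated model response in JSON mode:
raw = '{"title": "Q3 report", "priority": 2}'
task = parse_model_json(raw, {"title", "priority"})
print(task["priority"])  # 2
```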
Streaming
A method of receiving AI model output token-by-token in real time as it is generated, rather than waiting for the complete response.
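From the client's perspective a stream is just an iterator of chunks. The generator below simulates one; real SDKs expose a similar iterator over server-sent events:

```python
from typing import Iterator

def fake_stream(text: str) -> Iterator[str]:
    """Yield output one token at a time, like a streaming API response.
    Simulated here by splitting on whitespace."""
    for token in text.split():
        yield token + " "

# The UI can render each chunk immediately instead of waiting.
chunks = []
for chunk in fake_stream("Streaming keeps interfaces responsive"):
    chunks.append(chunk)  # e.g. append to the screen as it arrives
print("".join(chunks))
```

Streaming does not make generation faster overall, but it slashes perceived latency: users see the first words in milliseconds.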
Batch Processing
Processing multiple AI requests together as a group, typically at lower cost and higher throughput than real-time individual requests.
Infrastructure
MCP (Model Context Protocol)
An open standard introduced by Anthropic that provides a universal way for AI models to connect to external data sources and tools.
Vector Database
A specialized database optimized for storing and searching high-dimensional vector embeddings, enabling semantic similarity search.
Inference
The process of running a trained AI model to generate predictions or outputs from new inputs, as opposed to training the model.
API Gateway
A server that acts as a single entry point for AI API requests, handling routing, rate limiting, authentication, and load balancing across multiple AI providers.
Model Serving
The infrastructure and process of hosting a trained AI model and exposing it as an API endpoint for real-time or batch inference.
Edge AI / On-Device AI
Running AI models directly on user devices (phones, laptops, IoT) rather than sending data to cloud servers for processing.
GPU / TPU
Specialized processors designed for the parallel mathematical operations that AI models require for training and inference.
Quantization
A technique that reduces AI model size and memory requirements by using lower-precision numbers to represent model weights, trading a small accuracy loss for major efficiency gains.
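The simplest variant maps each float weight onto an 8-bit integer with a shared scale factor. A toy sketch of symmetric int8 quantization (real libraries quantize per-channel or per-block and handle outliers more carefully):

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map float weights onto int8 range [-127, 127] with one scale factor."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate floats from the quantized integers."""
    return [v * scale for v in q]

weights = [0.31, -0.84, 0.02, 0.55]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, round(max_error, 4))  # 4 bytes instead of 16, tiny rounding error
```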
Distillation
A technique where a smaller "student" model is trained to replicate the behavior of a larger "teacher" model, achieving comparable quality at lower cost.
Latency
The time delay between sending a request to an AI model and receiving the response, critical for real-time user-facing applications.
Token Limits / Rate Limiting
Restrictions imposed by AI API providers on the number of tokens processed or requests made within a given time period.
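Client-side, a token bucket is a common way to stay under a provider's limit: it refills steadily and each request spends from it. A minimal sketch (parameter values are illustrative, not any provider's actual limits):

```python
import time

class TokenBucket:
    """Simple client-side rate limiter: refill `rate` tokens per second,
    allow a request only when enough tokens are available."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill based on time elapsed, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(rate=2, capacity=2)  # roughly 2 requests/second
a, b, c = bucket.allow(), bucket.allow(), bucket.allow()
print(a, b, c)  # the third back-to-back request is rejected
```

Production clients usually pair this with retry-after handling and exponential backoff when the provider returns a 429 anyway.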
Models
GPT
OpenAI's family of generative pre-trained transformer models, among the most widely adopted LLMs for commercial AI applications.
Claude
Anthropic's family of AI models known for long context windows, strong reasoning, and instruction-following capabilities.
Gemini
Google's multimodal AI model family optimized for text, image, audio, and video understanding and generation.
Llama
Meta's family of open-weight large language models that can be downloaded, modified, and self-hosted without per-token API fees.
Open-Source AI
AI models whose weights and architecture are publicly available, allowing anyone to inspect, modify, run, and build upon them.
Applications
Computer Vision
The field of AI that enables machines to interpret and understand visual information from images and video.
Natural Language Processing (NLP)
The branch of AI focused on enabling computers to understand, interpret, and generate human language in useful ways.
Text-to-Speech (TTS)
AI technology that converts written text into natural-sounding spoken audio, enabling voice interfaces and audio content generation.
Speech-to-Text (STT)
AI technology that converts spoken audio into written text, enabling voice input, transcription, and voice-controlled interfaces.
Image Generation
AI models that create new images from text descriptions (prompts), enabling automated visual content creation.
Video Generation
AI models that create video content from text prompts or images, an emerging capability for automated video production.
Sentiment Analysis
An NLP technique that determines the emotional tone of text, classifying it as positive, negative, neutral, or more granular emotions.
Named Entity Recognition (NER)
An NLP technique that identifies and classifies named entities in text, such as people, organizations, locations, dates, and monetary values.
Autonomous Agents
AI systems that can independently plan, execute multi-step tasks, use tools, and adapt their approach based on results, with minimal human oversight.
Chatbot
An AI application that conducts conversations with users through text or voice, handling questions, tasks, and interactions in natural language.
Copilot
An AI assistant integrated into a user's workflow that provides real-time suggestions, completions, and assistance alongside the user's work.
Want to put these concepts into practice?
AI 4U Labs ships production AI apps using these technologies every week. Let's build yours.
Start a Project