AI Glossary

Every AI term explained clearly, with real-world use cases from developers who build production AI apps every day. No academic jargon.

50 terms defined

Fundamentals

Techniques

RAG (Retrieval-Augmented Generation)

A technique that enhances AI responses by retrieving relevant information from a knowledge base before generating an answer.
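
As a rough illustration of the retrieve-then-generate flow, the sketch below picks the most relevant document and places it in a prompt. It is a toy under stated assumptions: retrieval uses simple word overlap as a stand-in for embedding search, the knowledge base is invented, and no model is actually called.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return up to k documents that share the most words with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]

def build_prompt(query: str, documents: list[str]) -> str:
    """Put the retrieved documents ahead of the question in one prompt."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Invented knowledge base, for illustration only.
KNOWLEDGE_BASE = [
    "Refunds are issued within 14 days of purchase.",
    "Support is available Monday to Friday.",
    "Shipping takes 3 to 5 business days.",
]

prompt = build_prompt("How long do refunds take?", KNOWLEDGE_BASE)
print(prompt)
```

In a real system the prompt would then be sent to a language model, and the word-overlap scoring would usually be replaced by embedding similarity.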

Embeddings

Numerical vector representations of text that capture semantic meaning, enabling similarity search and clustering.
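
To make "similarity" concrete, here is a minimal sketch of cosine similarity, one common way to compare two embeddings. The three-dimensional vectors are hand-made assumptions; real embeddings come from a model and typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; closer to 1.0 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made vectors, for illustration only.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.0],
    "invoice": [0.0, 0.1, 0.9],
}

print(cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["kitten"]))
print(cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["invoice"]))
```

The first score is much higher than the second, which is the property similarity search relies on.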

Fine-Tuning

The process of further training a pre-trained AI model on your specific data to improve performance on domain-specific tasks.

Prompt Engineering

The practice of crafting effective instructions for AI models to produce desired outputs consistently.

Function Calling (Tool Use)

An AI capability where the model can decide to invoke external functions or APIs based on the conversation context.
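
The application side of function calling might look roughly like the sketch below: the model returns a request naming a function and its arguments, and your code runs it. The `get_weather` tool and the shape of `model_tool_call` are assumptions for illustration; each provider defines its own request format.

```python
import json

def get_weather(city: str) -> str:
    # Hypothetical tool; a real one would call a weather API.
    return f"Sunny in {city}"

# Registry of functions the model is allowed to request.
TOOLS = {"get_weather": get_weather}

def run_tool_call(tool_call: dict) -> str:
    """Look up the requested function and call it with the model's arguments."""
    name = tool_call["name"]
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    arguments = json.loads(tool_call["arguments"])
    return TOOLS[name](**arguments)

# Assumed shape of a model's tool request; real provider formats differ.
model_tool_call = {"name": "get_weather", "arguments": '{"city": "Lisbon"}'}
print(run_tool_call(model_tool_call))
```

The result is normally sent back to the model so it can use it in its final answer.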

AI Agent

An AI system that can autonomously plan, reason, use tools, and take actions to accomplish goals with minimal human intervention.

Chain of Thought (CoT)

A prompting technique that improves AI reasoning by instructing the model to break down complex problems into intermediate steps before giving a final answer.

Few-Shot Learning

A prompting technique where you provide a small number of input-output examples in the prompt to teach the model the desired behavior.
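
A few-shot prompt can be assembled with plain string formatting, as in this minimal sketch. The task wording, the example reviews, and the function name are all invented for illustration.

```python
def build_few_shot_prompt(examples: list[tuple[str, str]], new_input: str) -> str:
    """Place labeled examples before the new input so the model can copy the pattern."""
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {new_input}")
    lines.append("Sentiment:")
    return "\n".join(lines)

# Invented examples, for illustration only.
EXAMPLES = [
    ("Loved it, works perfectly.", "positive"),
    ("Broke after two days.", "negative"),
]

prompt = build_few_shot_prompt(EXAMPLES, "Fast shipping and great quality.")
print(prompt)
```

Ending the prompt on an open "Sentiment:" line nudges the model to answer in the same format as the examples.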

Zero-Shot Learning

The ability of an AI model to perform a task from instructions alone, with no worked examples provided in the prompt.

Transfer Learning

A machine learning technique where a model trained on one task is adapted to perform a different but related task, reducing the data and compute needed.

Reinforcement Learning from Human Feedback (RLHF)

A training technique that aligns AI model behavior with human preferences by using human feedback to reward desired outputs and penalize undesired ones.

Semantic Search

A search approach that finds results based on meaning rather than exact keyword matches, using embeddings to understand the intent behind queries.

Structured Output / JSON Mode

A feature that constrains AI models to return responses in a specific format such as JSON, often matched against a schema, so that outputs can be parsed and validated reliably in code.
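
Even with a structured output feature turned on, it is usually worth validating what comes back. The sketch below parses a response and checks it against a small hand-written schema; `EXPECTED_FIELDS` and the sample response are assumptions, not any provider's format.

```python
import json

# Hypothetical schema: field name -> required Python type.
EXPECTED_FIELDS = {"name": str, "age": int}

def parse_structured(raw: str) -> dict:
    """Parse a model response as JSON and check the expected fields and types."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in data:
            raise ValueError(f"Missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"Wrong type for field: {field}")
    return data

print(parse_structured('{"name": "Ada", "age": 36}'))
```

Production code would more likely use a schema library for this check; the hand-rolled version just keeps the example dependency-free.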

Streaming

A method of receiving AI model output token-by-token in real time as it is generated, rather than waiting for the complete response.
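
The consumer side of streaming can be sketched with a Python generator. Here `fake_model_stream` is a stand-in for a real streaming API and yields words rather than true model tokens; both are simplifying assumptions.

```python
from typing import Iterator

def fake_model_stream(text: str) -> Iterator[str]:
    """Stand-in for a streaming API: yields one word-sized chunk at a time."""
    for word in text.split(" "):
        yield word + " "

def consume(stream: Iterator[str]) -> str:
    """Display each chunk as it arrives and return the assembled text."""
    parts = []
    for chunk in stream:
        print(chunk, end="", flush=True)  # show partial output immediately
        parts.append(chunk)
    print()
    return "".join(parts).strip()

full_text = consume(fake_model_stream("Streaming shows partial output early"))
```

The user sees the first words as soon as they are produced, while the application still ends up with the complete response.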

Batch Processing

Processing multiple AI requests together as a group, typically at lower cost and higher throughput than real-time individual requests.
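
Grouping requests is the first step of any batch workflow. This minimal sketch only splits a list of prompts into fixed-size batches; submitting them to a provider's batch endpoint is left out, and the prompts are invented.

```python
from typing import Iterator

def batched(items: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Invented prompts, for illustration only.
requests = [f"Summarize document {i}" for i in range(1, 6)]

for batch in batched(requests, batch_size=2):
    # A real pipeline would submit each batch to a batch endpoint here.
    print(len(batch), batch)
```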

Infrastructure

MCP (Model Context Protocol)

An open standard introduced by Anthropic that gives AI models a common way to connect to external data sources and tools.

Vector Database

A specialized database optimized for storing and searching high-dimensional vector embeddings, enabling semantic similarity search.
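
The core operations of a vector database, storing vectors and finding the nearest ones, can be sketched in memory. `TinyVectorStore` is a hypothetical brute-force stand-in: real vector databases add approximate-nearest-neighbor indexes, persistence, and filtering, and the two-dimensional vectors here are hand-made.

```python
import math

class TinyVectorStore:
    """Brute-force in-memory store that compares the query against every vector."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, item_id: str, vector: list[float]) -> None:
        self._items.append((item_id, vector))

    def search(self, query: list[float], k: int = 1) -> list[str]:
        """Return the ids of the k stored vectors most similar to the query."""
        def similarity(vector: list[float]) -> float:
            dot = sum(q * v for q, v in zip(query, vector))
            norms = math.sqrt(sum(q * q for q in query)) * math.sqrt(sum(v * v for v in vector))
            return dot / norms

        ranked = sorted(self._items, key=lambda item: similarity(item[1]), reverse=True)
        return [item_id for item_id, _ in ranked[:k]]

store = TinyVectorStore()
store.add("doc-pets", [0.9, 0.1])       # hand-made vectors, for illustration only
store.add("doc-billing", [0.1, 0.9])
print(store.search([0.8, 0.2], k=1))
```

Scanning every vector works for a handful of items; specialized indexes exist because this approach becomes slow at millions of vectors.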

Inference

The process of running a trained AI model to generate predictions or outputs from new inputs, as opposed to training the model.

API Gateway

A server that acts as a single entry point for AI API requests, handling routing, rate limiting, authentication, and load balancing across multiple AI providers.

Model Serving

The infrastructure and process of hosting a trained AI model and exposing it as an API endpoint for real-time or batch inference.

Edge AI / On-Device AI

Running AI models directly on user devices (phones, laptops, IoT) rather than sending data to cloud servers for processing.

GPU / TPU

Specialized processors designed for the parallel mathematical operations that AI models require for training and inference.

Quantization

A technique that reduces AI model size and memory requirements by storing model weights as lower-precision numbers (for example, 8-bit integers instead of 32-bit floats), typically trading a small accuracy loss for lower memory use and faster inference.
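
The precision trade-off can be seen in a toy example. The sketch below applies symmetric 8-bit quantization, one simple scheme among several, to a short list of made-up weights and measures how far the restored values drift from the originals.

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric quantization: map floats to integers in [-127, 127] plus one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]

# Made-up weights, for illustration only.
weights = [0.12, -0.5, 0.33, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

print(quantized)
print(max(abs(w - r) for w, r in zip(weights, restored)))  # worst-case rounding error
```

Each weight now fits in one byte instead of four, and the rounding error per weight is bounded by half the scale.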

Distillation

A technique where a smaller "student" model is trained to replicate the behavior of a larger "teacher" model, aiming for comparable quality at lower cost.
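
One common way to express "replicate the behavior" is a loss that compares the student's output distribution with the teacher's. The sketch below computes a KL divergence between temperature-softened distributions; the logits and the temperature of 2.0 are arbitrary assumptions, and real training would minimize this value with an optimizer, often alongside a standard label loss.

```python
import math

def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Turn logits into probabilities; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits: list[float], student_logits: list[float],
                      temperature: float = 2.0) -> float:
    """KL divergence from the teacher's softened distribution to the student's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Made-up logits, for illustration only.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 4.0, 1.0]

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

The student whose predictions resemble the teacher's gets the lower loss, which is the signal training pushes toward.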

Latency

The time delay between sending a request to an AI model and receiving the response, critical for real-time user-facing applications.
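
Latency is straightforward to measure by timing the call. In this minimal sketch, `fake_model` is a stand-in that sleeps for 50 ms to simulate network and generation time; swap in a real client call to measure an actual request.

```python
import time

def timed_call(func, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

def fake_model(prompt: str) -> str:
    time.sleep(0.05)  # simulate network and generation time
    return prompt.upper()

result, elapsed = timed_call(fake_model, "hello")
print(f"{elapsed * 1000:.1f} ms")
```

For user-facing apps it is usually more informative to collect many such measurements and look at percentiles rather than a single call.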

Token Limits / Rate Limiting

Restrictions imposed by AI API providers on the number of tokens processed or requests made within a given time period.
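
A token bucket is one widely used way to enforce, or to mirror on the client side, a limit of this kind. The sketch below is an illustration only: the rate and capacity are arbitrary, and it is not a description of how any particular provider implements its limits.

```python
import time

class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the limiter can be tested without waiting
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=1.0, capacity=2)
print([bucket.allow() for _ in range(3)])
```

Checking `allow()` before each request lets a client slow itself down instead of hitting the provider's limit and receiving errors.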

Models

Applications

Computer Vision

The field of AI that enables machines to interpret and understand visual information from images and video.

Natural Language Processing (NLP)

The branch of AI focused on enabling computers to understand, interpret, and generate human language in useful ways.

Text-to-Speech (TTS)

AI technology that converts written text into natural-sounding spoken audio, enabling voice interfaces and audio content generation.

Speech-to-Text (STT)

AI technology that converts spoken audio into written text, enabling voice input, transcription, and voice-controlled interfaces.

Image Generation

AI models that create new images from text descriptions (prompts), enabling automated visual content creation.

Video Generation

AI models that create video content from text prompts or images, an emerging capability for automated video production.

Sentiment Analysis

An NLP technique that determines the emotional tone of text, classifying it as positive, negative, neutral, or more granular emotions.

Named Entity Recognition (NER)

An NLP technique that identifies and classifies named entities in text, such as people, organizations, locations, dates, and monetary values.

Autonomous Agents

AI systems that can independently plan, execute multi-step tasks, use tools, and adapt their approach based on results, with minimal human oversight.

Chatbot

An AI application that conducts conversations with users through text or voice, handling questions, tasks, and interactions in natural language.

Copilot

An AI assistant integrated into a user's workflow that provides real-time suggestions, completions, and assistance alongside the user's work.

Want to put these concepts into practice?

AI 4U Labs ships production AI apps using these technologies every week. Let's build yours.
