When building applications in Python, one of the most important and most often overlooked tasks is data validation. Without proper validation, our systems are exposed to runtime errors, inconsistent data, and security vulnerabilities. This is exactly where Pydantic comes in as an indispensable tool for any Python developer.

Pydantic is a library that lets you define how data should be structured using native Python types, offering automatic validation, serialization, and documentation of data models. Since its creation, it has become the de facto standard for data validation in frameworks like FastAPI, and its adoption continues to grow in the Python community.

🎯 Why Use Pydantic?

Before diving into the technical details, it's essential to understand why Pydantic has become so popular in the Python ecosystem. Imagine the common situation where you receive data from an external API, a web form, or any other source external to your system. Traditionally, you would have to manually write multiple checks and validations to ensure the data is in the expected format.

With Pydantic, you simply define the expected structure using Python classes with type hints, and the library does all the heavy lifting automatically. According to the official Pydantic documentation, the library "automatically validates data against the defined types and constraints, generating clear and helpful errors when data is invalid."

Beyond automatic validation, Pydantic offers automatic serialization and deserialization, converting Python objects to and from JSON, dictionaries, and other formats. This eliminates a significant amount of boilerplate code that we traditionally need to write.

📦 Installation and Setup

Installing Pydantic is simple. It is published on PyPI (the validation core of Pydantic v2 is written in Rust and ships as prebuilt wheels for common platforms), so you can install it via pip:

pip install pydantic

For projects using advanced features like environment configuration validation, you can install the pydantic-settings extension:

pip install pydantic-settings

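The examples in this guide use the Pydantic v2 API (model_dump, field_validator, and so on). The minimum supported Python version depends on the Pydantic release: several 2.x releases supported Python 3.8, while newer ones require a later version, so check the release notes for the version you install. To check the installed version, you can run: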

import pydantic
print(pydantic.__version__)

Real Python offers an excellent introductory tutorial on Pydantic that complements this guide perfectly.

🏗️ Creating Your First Model

The foundation of Pydantic is models, defined through classes that inherit from BaseModel. Let's create a simple example to understand how it works:

from pydantic import BaseModel

class User(BaseModel):
    name: str
    email: str
    age: int

# Creating an instance
user = User(name="John Doe", email="john.doe@example.com", age=30)
print(user)

Notice that we don't need to specify types in quotes or use special syntax. Pydantic uses Python's native type hints to define the data structure.

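When validation fails, Pydantic raises a ValidationError with clear and descriptive messages. Keep in mind that plain str and int annotations check only the type, so "invalid-email" and a negative age would both be accepted by the User model above. A value of the wrong type, however, is rejected:

from pydantic import ValidationError

try:
    invalid_user = User(name="John", email="invalid-email", age="thirty")
except ValidationError as e:
    print(e)

This code reports that age is not a valid integer. Checking the email format or forbidding negative ages requires the constrained types and Field rules covered in the next sections.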
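When you need the failures in structured form (for example, to build an API error response), ValidationError exposes an errors() method that returns one dictionary per problem. Here is a minimal sketch; collect_errors is a hypothetical helper name used only for illustration:

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    email: str
    age: int


def collect_errors(payload: dict) -> list[tuple[str, str]]:
    """Return (field, error type) pairs, or an empty list if the payload is valid."""
    try:
        User.model_validate(payload)
    except ValidationError as exc:
        # Each entry has a "loc" tuple (where the error is) and a "type" code.
        return [(str(err["loc"][0]), err["type"]) for err in exc.errors()]
    return []


# "email" is missing and "age" cannot be parsed as an integer.
problems = collect_errors({"name": "John", "age": "thirty"})
print(problems)
```

Each entry also carries a human-readable "msg" key, which is what appears when you print the exception.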

🔧 Advanced Field Types

Pydantic supports a wide variety of types beyond the basics. Let's explore some of the most useful ones:

1. Optional Fields with Default Values

from pydantic import BaseModel
from typing import Optional

class Product(BaseModel):
    name: str
    price: float
    description: Optional[str] = None
    in_stock: bool = True

In this example, description is an optional field (can be None), and in_stock has a default value of True.
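To see the defaults in action, here is a short sketch that creates a Product with only the required fields. The product data is made up for illustration:

```python
from typing import Optional

from pydantic import BaseModel


class Product(BaseModel):
    name: str
    price: float
    description: Optional[str] = None
    in_stock: bool = True


# Only the required fields are passed; the others fall back to their defaults.
basic = Product(name="Mouse", price=25)

print(basic.description)       # default value
print(basic.in_stock)          # default value
print(basic.price)             # the integer 25 is converted to a float
print(basic.model_fields_set)  # names of the fields that were explicitly provided
```

The model_fields_set attribute is handy when you need to tell "not provided" apart from "explicitly set to the default".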

2. Validation with Field

Pydantic allows further customization of validation using the Field function:

from pydantic import BaseModel, Field
from typing import Annotated

class Registration(BaseModel):
    username: Annotated[str, Field(min_length=3, max_length=20)]
    password: Annotated[str, Field(min_length=8)]
    email: str
    age: int = Field(ge=18, le=120)

Here, we define specific validations: username must be between 3 and 20 characters, password must be at least 8 characters, and age must be between 18 and 120 years.
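The sketch below shows what happens when several constraints are violated at once: Pydantic reports every failing field in a single ValidationError rather than stopping at the first one. The input values are invented for the example:

```python
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError


class Registration(BaseModel):
    username: Annotated[str, Field(min_length=3, max_length=20)]
    password: Annotated[str, Field(min_length=8)]
    email: str
    age: int = Field(ge=18, le=120)


failed_fields: list[str] = []
try:
    # username is too short, password is too short, and age is below 18.
    Registration(username="ab", password="short", email="ab@example.com", age=17)
except ValidationError as exc:
    failed_fields = sorted(str(err["loc"][0]) for err in exc.errors())

print(failed_fields)
```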

3. Email and URL Types

Pydantic includes special types for validating specific formats:

from pydantic import BaseModel, EmailStr, HttpUrl

class Contact(BaseModel):
    email: EmailStr
    website: HttpUrl
    phone: str

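EmailStr automatically validates that the string is in valid email format, and HttpUrl validates URLs. Note that EmailStr depends on the optional email-validator package, which you can install with pip install "pydantic[email]".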
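As a quick illustration of format validation, the following sketch uses only HttpUrl, so it runs without any optional dependency. The Link model and the is_valid_link helper are hypothetical names for this example:

```python
from pydantic import BaseModel, HttpUrl, ValidationError


class Link(BaseModel):
    website: HttpUrl


def is_valid_link(value: str) -> bool:
    """Return True if the value is accepted as an http or https URL."""
    try:
        Link(website=value)
    except ValidationError:
        return False
    return True


ok = Link(website="https://example.com/docs")
# The validated value is a URL object exposing its parsed components.
print(ok.website.scheme, ok.website.host)

print(is_valid_link("not-a-url"))          # rejected: not a URL at all
print(is_valid_link("ftp://example.com"))  # rejected: scheme is not http or https
```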

4. Lists and Generic Types

from pydantic import BaseModel
from typing import List, Dict

class Order(BaseModel):
    items: List[str]
    prices: List[float]
    attributes: Dict[str, str]
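With container types, every element is validated against the inner type, and the error location tells you exactly which element failed. A small sketch with invented order data:

```python
from typing import Dict, List

from pydantic import BaseModel, ValidationError


class Order(BaseModel):
    items: List[str]
    prices: List[float]
    attributes: Dict[str, str]


# In the default (lax) mode, numeric strings and integers are converted to float.
order = Order(items=["pen", "ink"], prices=["1.50", 3], attributes={"color": "blue"})
print(order.prices)

bad_location = None
try:
    Order(items=["pen"], prices=["cheap"], attributes={})
except ValidationError as exc:
    # The location is a tuple: field name, then the index inside the list.
    bad_location = exc.errors()[0]["loc"]

print(bad_location)
```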

🔄 Custom Validators

Sometimes the built-in validations aren't enough for your specific needs. Pydantic allows you to create custom validators using the @field_validator decorator:

from pydantic import BaseModel, field_validator
import re

class AdvancedUser(BaseModel):
    name: str
    cpf: str
    password: str

    @field_validator('cpf')
    @classmethod
    def validate_cpf(cls, v):
        # Remove non-numeric characters
        cpf = re.sub(r'\D', '', v)
        if len(cpf) != 11:
            raise ValueError('CPF must have 11 digits')
        return cpf

    @field_validator('password')
    @classmethod
    def validate_password(cls, v):
        if len(v) < 8:
            raise ValueError('Password must be at least 8 characters')
        if not re.search(r'[A-Z]', v):
            raise ValueError('Password must contain at least one uppercase letter')
        return v

This example shows how to create validators for CPF and password with business-specific rules.
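A field validator can both reject a value and normalize it, because whatever it returns is what gets stored on the model. The sketch below repeats a trimmed-down version of the CPF validator to show both effects; the sample CPF values are made up, and the validator checks only the digit count, not the CPF check digits:

```python
import re

from pydantic import BaseModel, ValidationError, field_validator


class AdvancedUser(BaseModel):
    name: str
    cpf: str

    @field_validator("cpf")
    @classmethod
    def validate_cpf(cls, v: str) -> str:
        # Keep digits only, then check the length.
        digits = re.sub(r"\D", "", v)
        if len(digits) != 11:
            raise ValueError("CPF must have 11 digits")
        return digits


user = AdvancedUser(name="Ana", cpf="123.456.789-09")
print(user.cpf)  # stored without punctuation

message = ""
try:
    AdvancedUser(name="Ana", cpf="123")
except ValidationError as exc:
    # The ValueError text is included in the reported message.
    message = exc.errors()[0]["msg"]

print(message)
```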

Whole Model Validators

Besides field validators, you can create validators that work with the entire model:

from pydantic import BaseModel, model_validator
from datetime import date

class Booking(BaseModel):
    start_date: date
    end_date: date
    guests: int

    @model_validator(mode='after')
    def validate_period(self):
        if self.end_date <= self.start_date:
            raise ValueError('End date must be after start date')
        if self.guests < 1:
            raise ValueError('There must be at least 1 guest')
        return self
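Here is a short usage sketch for a model-level validator. Because the rule involves two fields, the resulting error is attached to the model as a whole, which shows up as an empty location tuple. The dates are invented, and the class is a reduced version of the Booking model above:

```python
from datetime import date

from pydantic import BaseModel, ValidationError, model_validator


class Booking(BaseModel):
    start_date: date
    end_date: date
    guests: int

    @model_validator(mode="after")
    def validate_period(self):
        # Runs after all fields have been validated individually.
        if self.end_date <= self.start_date:
            raise ValueError("End date must be after start date")
        return self


# ISO-formatted strings are converted to date objects.
booking = Booking(start_date="2024-07-01", end_date="2024-07-05", guests=2)
nights = (booking.end_date - booking.start_date).days
print(nights)

error_location = None
try:
    Booking(start_date="2024-07-05", end_date="2024-07-01", guests=2)
except ValidationError as exc:
    error_location = exc.errors()[0]["loc"]

print(error_location)
```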

📋 Models with Inheritance

Pydantic supports model inheritance, allowing you to create organized data hierarchies:

from pydantic import BaseModel
from typing import Optional

class Person(BaseModel):
    name: str
    email: str

class Employee(Person):
    position: str
    salary: float
    department: Optional[str] = None

class Manager(Employee):
    team_members: list[str] = []
    bonus: float = 0.0

Employee inherits all fields from Person and adds its own specific fields.
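The following sketch instantiates the deepest class in the hierarchy to show that inherited fields come first and that each instance gets its own copy of a mutable default such as the empty list. The names and values are invented:

```python
from typing import Optional

from pydantic import BaseModel


class Person(BaseModel):
    name: str
    email: str


class Employee(Person):
    position: str
    salary: float
    department: Optional[str] = None


class Manager(Employee):
    team_members: list[str] = []
    bonus: float = 0.0


first = Manager(name="Ana", email="ana@example.com", position="Lead", salary=9000)
second = Manager(name="Bob", email="bob@example.com", position="Lead", salary=9500)

# Fields are listed parent-first, in definition order.
print(list(Manager.model_fields))

# Pydantic copies the default list per instance, so this does not leak into `second`.
first.team_members.append("Bea")
print(second.team_members)
```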

🔄 Serialization and Deserialization

One of Pydantic's most powerful features is the ability to convert models to different formats:

from pydantic import BaseModel

class Product(BaseModel):
    name: str
    price: float
    category: str

# Create instance
product = Product(name="Laptop", price=3500.00, category="Electronics")

# Convert to dictionary
dictionary = product.model_dump()
print(dictionary)
# {'name': 'Laptop', 'price': 3500.0, 'category': 'Electronics'}

# Convert to JSON
json_str = product.model_dump_json()
print(json_str)
# {"name":"Laptop","price":3500.0,"category":"Electronics"}

# Create instance from dictionary
product2 = Product.model_validate(dictionary)
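The same round trip works for JSON text with model_validate_json, which parses and validates in one step. A minimal sketch using the same Product model:

```python
from pydantic import BaseModel


class Product(BaseModel):
    name: str
    price: float
    category: str


product = Product(name="Laptop", price=3500.00, category="Electronics")

# Serialize to a JSON string, then rebuild a validated model from it.
json_str = product.model_dump_json()
restored = Product.model_validate_json(json_str)

# Models of the same class with equal field values compare equal.
print(restored == product)
```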

Pydantic also allows excluding fields, including only specific fields, or renaming fields during serialization, using the exclude, include, and by_alias parameters of model_dump:

# Exclude sensitive fields
safe = product.model_dump(exclude={'price'})

# Include only specific fields
summary = product.model_dump(include={'name', 'category'})

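# Use aliases for serialization (Field must be imported as well)
from pydantic import Field

class ProductAlias(BaseModel):
    product_name: str = Field(alias='name')
    unit_price: float = Field(alias='price')

    # Accept both the alias and the field name as input
    model_config = {'populate_by_name': True}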
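To make the alias behavior concrete, this sketch builds the same ProductAlias model from alias names and from field names, then dumps it both ways. By default, model_dump uses the field names; pass by_alias=True to get the aliases:

```python
from pydantic import BaseModel, ConfigDict, Field


class ProductAlias(BaseModel):
    model_config = ConfigDict(populate_by_name=True)

    product_name: str = Field(alias="name")
    unit_price: float = Field(alias="price")


# Input may use the aliases...
from_alias = ProductAlias(name="Laptop", price=3500)
# ...or, thanks to populate_by_name, the field names.
from_field = ProductAlias(product_name="Laptop", unit_price=3500)

print(from_alias.model_dump())               # keys are the field names
print(from_alias.model_dump(by_alias=True))  # keys are the aliases
```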

⚙️ Model Configuration

Each Pydantic model can have specific configurations through the model_config attribute, usually written as a ConfigDict:

from pydantic import BaseModel, ConfigDict

class ConfigUser(BaseModel):
    name: str
    email: str
    password: str

    model_config = ConfigDict(
        str_to_lower=True,  # Convert strings to lowercase
        str_strip_whitespace=True,  # Remove whitespace
        frozen=True,  # Make the model immutable
        extra='forbid'  # Forbid extra fields
    )

Some of the most useful configurations include:

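  • str_to_lower: Automatically converts all string fields to lowercase, including fields such as password in the example above, so enable it only where that is safe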
  • str_strip_whitespace: Removes whitespace from the beginning and end
  • frozen: Makes the model immutable after creation
  • extra: Controls behavior for extra fields ('allow', 'forbid', 'ignore')
  • populate_by_name: Allows population by field name or alias

🌐 Integration with FastAPI

FastAPI is one of the most popular modern Python web frameworks, and its integration with Pydantic is absolutely seamless. In fact, FastAPI uses Pydantic as the foundation for data validation in requests:

from fastapi import FastAPI
from pydantic import BaseModel, EmailStr

app = FastAPI()

class UserCreate(BaseModel):
    username: str
    email: EmailStr
    password: str

@app.post("/users")
async def create_user(user: UserCreate):
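    # Note: this echoes the whole payload, password included; real code should return a model without it
    return {"message": "User created", "data": user}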

The official FastAPI documentation demonstrates how this integration drastically simplifies creating robust APIs.

When you define your Pydantic models as endpoint parameters, FastAPI automatically:

  • Validates incoming data
  • Generates automatic documentation
  • Converts data to the correct type
  • Returns clear errors if validation fails
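To get a feel for this flow without running a web server, it can be approximated with Pydantic alone. This is only a rough sketch and not FastAPI's internal code path: handle_request and UserResponse are hypothetical names, the status codes simply mirror common HTTP conventions (FastAPI answers validation failures with 422), and email is a plain str here to avoid the optional email-validator dependency.

```python
from pydantic import BaseModel, ValidationError


class UserCreate(BaseModel):
    username: str
    email: str
    password: str


class UserResponse(BaseModel):
    # Response model without the password field.
    username: str
    email: str


def handle_request(raw_body: str) -> tuple[int, dict]:
    """Validate a JSON body and return a (status code, response body) pair."""
    try:
        user = UserCreate.model_validate_json(raw_body)
    except ValidationError as exc:
        return 422, {"detail": exc.errors(include_url=False)}
    response = UserResponse(username=user.username, email=user.email)
    return 201, {"message": "User created", "data": response.model_dump()}


ok_status, ok_body = handle_request(
    '{"username": "ana", "email": "ana@example.com", "password": "s3cret-pass"}'
)
bad_status, bad_body = handle_request('{"username": "ana"}')
print(ok_status, ok_body)
print(bad_status, [err["loc"] for err in bad_body["detail"]])
```

Using a separate response model keeps sensitive input fields such as password out of what the API sends back.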

💾 Pydantic Settings

For applications that need environment configurations, pydantic-settings is the ideal solution:

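from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    app_name: str = "My Application"
    debug: bool = False
    database_url: str  # required: read from DATABASE_URL or the .env file
    secret_key: str    # required: read from SECRET_KEY or the .env file

    # Pydantic v2 style configuration (replaces the older inner Config class)
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
    )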

# Usage
settings = Settings()
print(settings.database_url)

The pydantic-settings documentation details all available options.

🚀 Best Practices

To use Pydantic effectively in your projects, consider these best practices:

1. Organize your models in separate modules
For large projects, create dedicated files for your models. This makes maintenance and reuse easier.

2. Use descriptive names
Names like UserCreate, UserResponse, ProductUpdate are much clearer than just User or Product.

3. Document your models
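Use docstrings to explain the purpose of each model and field. The class docstring becomes the model's description in the generated JSON schema. Field docstrings are picked up only when use_attribute_docstrings=True is set in the model configuration (available in recent Pydantic v2 releases); otherwise, use Field(description=...):

from pydantic import BaseModel

class Order(BaseModel):
    """Represents a purchase order."""
    items: list[str]
    """List of order items."""
    total: float
    """Total order value in reais."""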

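As a sketch of the Field(description=...) approach, the following model carries its documentation into the JSON schema, which is what tools such as FastAPI's generated docs read:

```python
from pydantic import BaseModel, Field


class Order(BaseModel):
    """Represents a purchase order."""

    items: list[str] = Field(description="List of order items.")
    total: float = Field(description="Total order value in reais.")


schema = Order.model_json_schema()

# The class docstring and the field descriptions both end up in the schema.
print(schema["description"])
print(schema["properties"]["items"]["description"])
print(schema["properties"]["total"]["description"])
```
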
4. Validate only what's necessary
Don't overdo validations. Validate enough to ensure integrity, but leave flexibility for changes.

5. Use Custom Types
Create your own types for domain-specific validations:

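from pydantic import AfterValidator
from typing import Annotated

# validate_cpf_formatted and validate_cnpj_formatted are your own functions
# (not part of Pydantic): each takes a str and returns it, or raises ValueError.
CPF = Annotated[str, AfterValidator(validate_cpf_formatted)]
CNPJ = Annotated[str, AfterValidator(validate_cnpj_formatted)]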
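Here is one possible way to fill in such a helper and reuse the resulting type across models. The body of validate_cpf_formatted is a hypothetical implementation: it only checks the 000.000.000-00 layout and does not verify the CPF check digits. The Customer model is likewise invented for the example:

```python
import re
from typing import Annotated

from pydantic import AfterValidator, BaseModel, ValidationError


def validate_cpf_formatted(value: str) -> str:
    """Accept only strings laid out as 000.000.000-00 (format check only)."""
    if not re.fullmatch(r"\d{3}\.\d{3}\.\d{3}-\d{2}", value):
        raise ValueError("CPF must look like 000.000.000-00")
    return value


# A reusable type: a str that must also pass the function above.
CPF = Annotated[str, AfterValidator(validate_cpf_formatted)]


class Customer(BaseModel):
    name: str
    cpf: CPF


customer = Customer(name="Ana", cpf="123.456.789-09")
print(customer.cpf)

rejected = False
try:
    Customer(name="Ana", cpf="12345678909")  # digits only, wrong layout
except ValidationError:
    rejected = True

print(rejected)
```

Defining the rule once as a type means every model that declares a CPF field gets the same validation without repeating a decorator.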

📚 Conclusion

Pydantic has fundamentally transformed the way we handle data validation in Python. Its intuitive syntax based on type hints, robust automatic validation, and serialization capabilities make it an indispensable tool for developers seeking more secure and maintainable code.

If you're working with FastAPI, building APIs, processing external data, or any application that receives untrusted data, Pydantic should be your first choice. The learning curve is smooth for those who already know Python type hints, and the benefits in terms of code quality and bug reduction are immediate.

To continue learning, explore the official Pydantic documentation, which contains advanced examples and specific use cases. You can also check out additional resources like the Podcast __init__ tutorial on Pydantic and FastAPI.

Mastering Pydantic is an investment that will quickly pay off in more robust projects with fewer bugs related to invalid data.