PostgreSQL is one of the most advanced and reliable relational databases in the world. When combined with Python, it forms an unbeatable duo for applications demanding robustness, scalability, and performance. Whether you are building a web application, a data analysis system, or a SaaS platform, mastering the integration between Python and PostgreSQL is an essential skill.
In this complete guide, you will learn everything from installation and configuration to advanced connection, querying, and optimization techniques with PostgreSQL in Python. We will cover three of the main integration libraries: psycopg2, SQLAlchemy, and asyncpg, each with its ideal use cases.
To start, it is important to understand why PostgreSQL stands out among so many database options. Created in 1996 at the University of California, Berkeley, PostgreSQL is an open-source relational database that offers features comparable to commercial solutions like Oracle and SQL Server. Its extensible architecture, JSON support, advanced indexing, and ACID compliance make it the preferred choice for critical applications. The official PostgreSQL about page details its rich history and technical differentiators.
Why Use PostgreSQL with Python?
Python is the most popular language for data science, web development, and automation. PostgreSQL, in turn, is the most loved relational database by developers according to annual Stack Overflow surveys. This combination offers significant advantages:
Reliability and ACID Compliance: PostgreSQL ensures your transactions are atomic, consistent, isolated, and durable. This means your data remains intact even in failure scenarios. While simpler databases like SQLite are ideal for prototypes, as explained in our SQLite with Python guide, PostgreSQL is the right choice for systems requiring high concurrency and large data volumes.
Advanced Data Types: PostgreSQL supports arrays, JSON/JSONB, date ranges, geometric types, and even custom types. The official PostgreSQL data type documentation provides a complete overview of the possibilities.
Extensibility: You can create functions in multiple languages (PL/pgSQL, PL/Python, PL/v8), install extensions like PostGIS (geolocation), and build custom operators. The PostgreSQL extensions guide shows how to supercharge your database.
Performance: With native support for B-tree, Hash, GiST, GIN, SP-GiST, and BRIN indexes, plus table partitioning and query parallelism, PostgreSQL delivers exceptional performance even on terabyte-scale datasets. The official PostgreSQL Tutorial teaches how to optimize queries in practice.
Installation and Setup
Before writing Python code, we need to ensure PostgreSQL is installed and configured correctly. The process varies by operating system.
Installing PostgreSQL
Windows: Download the official PostgreSQL installer from postgresql.org/download. The installer includes pgAdmin, a complete graphical interface for database management.
Linux (Ubuntu/Debian): Use the package manager:
sudo apt update
sudo apt install postgresql postgresql-contrib
macOS: The simplest way is via Homebrew:
brew install postgresql@16
After installation, start the service and create a database for your projects:
# Linux/macOS
sudo systemctl start postgresql
createdb my_database
Installing Python Libraries
Now let us install the libraries we will use to connect Python to PostgreSQL:
pip install psycopg2-binary sqlalchemy asyncpg
psycopg2-binary is the pre-compiled version of psycopg2, ideal for development. In production, it is recommended to use psycopg2 compiled from source for better performance. The official psycopg2 documentation explains these differences in detail.
Connecting to PostgreSQL with Python
Let us explore three different approaches to connecting Python to PostgreSQL, each suited to a specific scenario.
1. Connection with psycopg2
psycopg2 is the most mature and widely used PostgreSQL adapter in the Python ecosystem. It offers fine-grained control over the connection, native support for PostgreSQL types, and excellent performance.
import psycopg2
from psycopg2.extras import RealDictCursor
Connection configuration
conn = psycopg2.connect(
host="localhost",
port=5432,
database="my_database",
user="postgres",
password="your_password"
)
Creating a cursor to execute queries
cursor = conn.cursor(cursor_factory=RealDictCursor)
Executing a query
cursor.execute("SELECT version();")
print(cursor.fetchone())
Closing the connection
cursor.close()
conn.close()
Notice we used RealDictCursor so results return as Python dictionaries, making it easier to access fields by name. The psycopg2 usage guide demonstrates all available cursor options.
2. Connection with SQLAlchemy
SQLAlchemy is the most powerful ORM (Object-Relational Mapper) in Python. It abstracts SQL, allowing you to work with Python objects instead of raw SQL queries. For a deeper dive, check our complete SQLAlchemy guide.
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker
Connection string
DATABASE_URL = "postgresql://postgres:your_password@localhost:5432/my_database"
Creating the engine
engine = create_engine(DATABASE_URL, echo=True)
Creating a session
Session = sessionmaker(bind=engine)
session = Session()
Executing a query with raw SQL
result = session.execute(text("SELECT version()"))
print(result.fetchone())
session.close()
The SQLAlchemy connection string follows the pattern dialect://user:password@host:port/database. For PostgreSQL, the dialect is postgresql and the default driver is psycopg2. You can also use postgresql+asyncpg:// for asynchronous connections. The official SQLAlchemy engine documentation provides complete configuration examples.
3. Connection with asyncpg
For applications requiring high concurrency, such as FastAPI APIs or websocket servers, asyncpg is the ideal choice. It is a purely asynchronous PostgreSQL driver built on asyncio, offering the highest performance among all options.
import asyncio
import asyncpg
async def connect_and_query():
conn = await asyncpg.connect(
user='postgres',
password='your_password',
database='my_database',
host='localhost',
port=5432
)
version = await conn.fetchval("SELECT version()")
print(f"PostgreSQL version: {version}")
await conn.close()
asyncio.run(connect_and_query())
asyncpg is significantly faster than psycopg2 in high-concurrency scenarios because it does not block the Python event loop. The official asyncpg documentation includes detailed benchmarks and advanced usage examples.
CRUD Operations with PostgreSQL and Python
Let us implement the four fundamental database operations — Create, Read, Update, Delete — using psycopg2.
CREATE — Inserting Data
import psycopg2
conn = psycopg2.connect("dbname=my_database user=postgres password=your_password")
cursor = conn.cursor()
Creating a table
cursor.execute("""
CREATE TABLE IF NOT EXISTS users (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
age INTEGER,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
""")
Inserting data with parameters (prevents SQL injection)
cursor.execute(
"INSERT INTO users (name, email, age) VALUES (%s, %s, %s)",
("Maria Silva", "[email protected]", 28)
)
Inserting multiple records
users = [
("John Santos", "[email protected]", 35),
("Ana Oliveira", "[email protected]", 42),
("Carlos Lima", "[email protected]", 31)
]
cursor.executemany(
"INSERT INTO users (name, email, age) VALUES (%s, %s, %s)",
users
)
conn.commit()
cursor.close()
conn.close()
Notice the use of %s as placeholders. psycopg2 automatically escapes the values, preventing SQL injection attacks. The executemany method optimizes multiple record insertion by sending them all in a single command.
READ — Querying Data
conn = psycopg2.connect("dbname=my_database user=postgres password=your_password")
cursor = conn.cursor(cursor_factory=RealDictCursor)
Simple query
cursor.execute("SELECT * FROM users WHERE age > %s", (30,))
users = cursor.fetchall()
for user in users:
print(f"{user['name']} - {user['age']} years old")
Query with JOIN
cursor.execute("""
SELECT u.name, COUNT(o.id) as total_orders
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id
HAVING COUNT(o.id) > 0
ORDER BY total_orders DESC
""")
results = cursor.fetchmany(10) # Top 10
cursor.close()
conn.close()
The methods fetchone(), fetchmany(n), and fetchall() provide granular control over the number of returned results. For queries returning millions of rows, fetchmany with a server-side cursor prevents memory overload.
UPDATE — Updating Records
conn = psycopg2.connect("dbname=my_database user=postgres password=your_password")
cursor = conn.cursor()
cursor.execute(
"UPDATE users SET age = %s WHERE email = %s",
(29, "[email protected]")
)
print(f"Updated records: {cursor.rowcount}")
conn.commit()
cursor.close()
conn.close()
DELETE — Removing Records
conn = psycopg2.connect("dbname=my_database user=postgres password=your_password")
cursor = conn.cursor()
cursor.execute("DELETE FROM users WHERE age < %s", (18,))
print(f"Deleted records: {cursor.rowcount}")
conn.commit()
cursor.close()
conn.close()
Connection Management with Connection Pooling
In real applications, opening and closing connections per request is inefficient. Connection pooling maintains a set of reusable connections, dramatically improving performance.
Pooling with psycopg2
from psycopg2 import pool
connection_pool = pool.SimpleConnectionPool(
minconn=2,
maxconn=10,
host="localhost",
port=5432,
database="my_database",
user="postgres",
password="your_password"
)
Acquiring a connection from the pool
conn = connection_pool.getconn()
cursor = conn.cursor()
cursor.execute("SELECT count(*) FROM users")
print(cursor.fetchone())
cursor.close()
Returning the connection to the pool
connection_pool.putconn(conn)
Closing all connections when the application ends
connection_pool.closeall()
ThreadedConnectionPool is the thread-safe version of the pool, ideal for multi-threaded web applications.
Pooling with SQLAlchemy
from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool
engine = create_engine(
"postgresql://postgres:your_password@localhost:5432/my_database",
poolclass=QueuePool,
pool_size=10,
max_overflow=20,
pool_pre_ping=True
)
The pool is managed automatically by the engine
with engine.connect() as conn:
result = conn.execute(text("SELECT count(*) FROM users"))
print(result.fetchone())
The pool_pre_ping=True parameter checks whether the connection is still alive before using it, preventing errors with expired connections.
Transactions and Isolation
Transactions are fundamental for ensuring data integrity. PostgreSQL offers different isolation levels: Read Committed (default), Repeatable Read, and Serializable.
import psycopg2
from psycopg2.extensions import ISOLATION_LEVEL_REPEATABLE_READ
conn = psycopg2.connect("dbname=my_database user=postgres password=your_password")
conn.set_isolation_level(ISOLATION_LEVEL_REPEATABLE_READ)
try:
cursor = conn.cursor()
# Starts a transaction
cursor.execute("BEGIN")
cursor.execute(
"UPDATE accounts SET balance = balance - %s WHERE id = %s",
(500.00, 1)
)
cursor.execute(
"UPDATE accounts SET balance = balance + %s WHERE id = %s",
(500.00, 2)
)
# Commits the transaction
conn.commit()
print("Transfer completed successfully!")
except Exception as e:
conn.rollback()
print(f"Transaction error: {e}")
finally:
cursor.close()
conn.close()
Using with conn as a context manager can also be used for automatic commits and rollbacks, simplifying the code.
Working with JSON in PostgreSQL
One of PostgreSQL's most powerful features is its native JSON support. You can store, query, and index JSON documents directly in the relational database.
import json
import psycopg2
conn = psycopg2.connect("dbname=my_database user=postgres password=your_password")
cursor = conn.cursor()
Creating table with JSONB field
cursor.execute("""
CREATE TABLE IF NOT EXISTS products (
id SERIAL PRIMARY KEY,
name VARCHAR(200),
metadata JSONB
)
""")
Inserting JSON
product_meta = {
"category": "Electronics",
"tags": ["smartphone", "android", "5g"],
"specs": {
"screen": "6.5 inches",
"ram": "8GB",
"storage": "256GB"
},
"rating": 4.7
}
cursor.execute(
"INSERT INTO products (name, metadata) VALUES (%s, %s)",
("Smartphone X Pro", json.dumps(product_meta))
)
Querying JSON fields
cursor.execute("""
SELECT name, metadata->>'category' as category
FROM products
WHERE metadata @> '{"tags": ["android"]}'
""")
for row in cursor.fetchall():
print(row)
conn.commit()
cursor.close()
conn.close()
The @> operator checks whether the JSON document contains a specific subset of keys and values. PostgreSQL also allows creating GIN indexes on JSONB columns for extremely fast queries.
Prepared Statements and Optimization
For queries executed repeatedly, prepared statements offer significant performance gains by preventing PostgreSQL from recompiling the execution plan every time.
import psycopg2
conn = psycopg2.connect("dbname=my_database user=postgres password=your_password")
cursor = conn.cursor()
Preparing the query
cursor.execute("PREPARE find_user (int) AS SELECT * FROM users WHERE id = $1")
Executing with different parameters
cursor.execute("EXECUTE find_user (%s)", (1,))
print(cursor.fetchone())
cursor.execute("EXECUTE find_user (%s)", (2,))
print(cursor.fetchone())
Deallocating the prepared statement
cursor.execute("DEALLOCATE find_user")
cursor.close()
conn.close()
For those who prefer working with pure Python, the standard library sqlite3 module offers a similar experience, but without PostgreSQL's advanced features. Refer to the official sqlite3 tutorial to understand the differences and when to use each.
Schema Migrations with Alembic
In real projects, the database schema evolves over time. Alembic, created by the same author of SQLAlchemy, is the most popular tool for managing database migrations in Python.
pip install alembic
alembic init alembic
# alembic/env.py - Configure database URL
from sqlalchemy import engine_from_config
from alembic import context
config = context.config
config.set_main_option("sqlalchemy.url", "postgresql://postgres:your_password@localhost:5432/my_database")
Creating a migration
Command: alembic revision --autogenerate -m "create_users_table"
Applying migrations
Command: alembic upgrade head
Alembic automatically generates migration SQL code based on changes to your SQLAlchemy models, making the process safe and reproducible.
Security Best Practices
When working with PostgreSQL in Python, some security practices are essential:
Never concatenate SQL strings: Always use placeholders (%s in psycopg2, :param in SQLAlchemy) to prevent SQL injection. Even seemingly harmless queries can be exploited by attackers.
Manage passwords with environment variables: Use .env files or services like Vault to store credentials. Never hardcode passwords in source code.
Use SSL for remote connections: PostgreSQL supports SSL/TLS connections. In psycopg2, simply add sslmode='require' to the connection parameters.
Principle of least privilege: Create database users with strictly necessary permissions. A web application does not need permissions to create tables, for example.
For more information on PostgreSQL security, Real Python publishes up-to-date guides on best practices with SQL databases in Python.
Conclusion
The integration between Python and PostgreSQL offers a powerful combination for any type of application. In this guide, you learned:
- How to install and configure PostgreSQL and the necessary Python libraries
- Three ways to connect Python to PostgreSQL: psycopg2, SQLAlchemy, and asyncpg
- Complete CRUD operations with practical examples
- Efficient connection management with connection pools
- Using transactions to ensure data integrity
- Advanced features like JSONB, prepared statements, and migrations
- Essential security best practices
With this knowledge, you are ready to build robust and scalable applications using Python and PostgreSQL. To continue your studies, we recommend exploring the complete SQLAlchemy guide and experimenting with the combination of PostgreSQL with web frameworks like FastAPI and Django, which offer native integration with this exceptional database.