This is the first in a series of three blog posts covering the new AI functionality coming in pgAdmin 4. In this post, I'll walk through how to configure the LLM integration and introduce the AI-powered analysis reports; in the second, I'll cover the AI Chat agent in the query tool; and in the third, I'll explore the AI Insights feature for EXPLAIN plan analysis.

Anyone who manages PostgreSQL databases in a professional capacity knows that keeping on top of security, performance, and schema design is an ongoing endeavour. You might have a checklist of things to review, or perhaps you rely on experience and intuition to spot potential issues, but it is all too easy for something to slip through the cracks, especially as databases grow in complexity. We've been thinking about how AI could help with this, and I'm pleased to introduce a suite of AI-powered features in pgAdmin 4 that bring large language model analysis directly into the tool you already use every day.

Configuring the LLM Integration

Before any of the AI features can be used, you'll need to configure an LLM provider. pgAdmin supports four providers out of the box, giving you flexibility to choose between cloud-hosted models and locally-running alternatives:

  • Anthropic (Claude models)

  • OpenAI (GPT models)

  • Ollama (locally-hosted open-source models)

  • Docker Model Runner (built into Docker Desktop 4.40 and later)

Server Configuration

At the server level, there is a master switch in config.py (or, more typically, config_local.py) that controls whether AI features are available at all:

# Master switch to enable/disable LLM features entirely.
LLM_ENABLED = True

When LLM_ENABLED is set to False, all AI functionality is hidden from users and cannot be enabled through preferences. This gives administrators full control over whether AI features are permitted in their environment, which is particularly important in organisations with strict data governance policies.

Below the master switch, you'll find default configuration for each provider:

# Default LLM Provider
# Valid values: 'anthropic', 'openai', 'ollama', 'docker', or '' (disabled)
DEFAULT_LLM_PROVIDER = ''

# Anthropic Configuration
ANTHROPIC_API_KEY_FILE = '~/.anthropic-api-key'
ANTHROPIC_API_MODEL = ''

# OpenAI Configuration
OPENAI_API_KEY_FILE = '~/.openai-api-key'
OPENAI_API_MODEL = ''

# Ollama Configuration
OLLAMA_API_URL = ''
OLLAMA_API_MODEL = ''

# Docker Model Runner Configuration
DOCKER_API_URL = ''
DOCKER_API_MODEL = ''

# Maximum tool call iterations for AI conversations
MAX_LLM_TOOL_ITERATIONS = 20

For the cloud providers (Anthropic and OpenAI), API keys are read from files on disk rather than being stored directly in the configuration, which is a deliberate security choice. The key file should contain nothing but the API key itself, with no additional whitespace or formatting. For Ollama and Docker Model Runner, you simply provide the API URL for the local service (typically http://localhost:11434 for Ollama and http://localhost:12434 for Docker).
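Conceptually, reading a key from such a file is a one-liner: expand the `~`, read the contents, and strip any trailing newline a text editor may have added. The helper below is an illustrative sketch, not pgAdmin's actual implementation:

```python
from pathlib import Path

def read_api_key(key_file: str) -> str:
    """Read an API key from a file, expanding '~' and stripping
    any surrounding whitespace or trailing newline."""
    return Path(key_file).expanduser().read_text().strip()
```

Keeping the key out of config_local.py means the configuration file can be checked into version control or shared without leaking credentials.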

User Preferences

Whilst the server configuration sets the defaults and boundaries, individual users can customise their AI settings through the Preferences dialog under the 'AI' section. The preferences are organised into categories:

AI Configuration contains the general settings:

  • Default Provider: Users can select their preferred provider from a dropdown, or choose 'None (Disabled)' to turn off AI features for their account. This setting only takes effect if LLM_ENABLED is True in the server configuration.

  • Max Tool Iterations: Controls how many tool call rounds the AI is allowed to perform during a single conversation, with a default of 20. Higher values allow more complex analyses but consume more resources.
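To make the iteration limit concrete, here is a rough sketch of the kind of loop it bounds. The function, message shapes, and tool-call format are hypothetical stand-ins, not pgAdmin's internals:

```python
def run_conversation(llm, tools, prompt, max_iterations=20):
    """Drive an LLM conversation, executing requested tool calls until
    the model produces a final answer or the iteration cap is reached."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_iterations):
        reply = llm(messages)                # one round trip to the model
        call = reply.get("tool_call")
        if call is None:
            return reply["content"]          # final answer; we're done
        name, args = call
        result = tools[name](**args)         # run the tool the model asked for
        messages.append({"role": "tool", "name": name, "content": result})
    raise RuntimeError("Max tool iterations reached without a final answer")
```

Each round trip lets the model request more data (run a query, inspect a setting) before answering, which is why a higher cap permits deeper analyses at the cost of more API calls.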

Each provider has its own category with provider-specific settings:

  • Anthropic: API Key File path and Model selection

  • OpenAI: API Key File path and Model selection

  • Ollama: API URL and Model selection

  • Docker Model Runner: API URL and Model selection

One particularly nice touch is that the model selection dropdowns are populated dynamically. When you configure an API key or URL and click the refresh button, pgAdmin queries the provider's API to fetch the list of available models. For Ollama, it even shows the model sizes so you can see at a glance how much disk space each model is using. The model selectors also support typing in custom model names, so you're not limited to whatever the API returns; if you know the exact model identifier you want to use, you can simply type it in.
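For illustration, Ollama exposes its installed models via a `/api/tags` endpoint that reports each model's name and size in bytes. A minimal sketch of fetching and formatting that list might look like the following (this is a hedged example, not pgAdmin's actual code):

```python
import json
import urllib.request

def parse_models(payload: dict) -> list[tuple[str, float]]:
    """Turn Ollama's /api/tags payload into (name, size in GB) pairs."""
    return [(m["name"], round(m["size"] / 1e9, 1))
            for m in payload.get("models", [])]

def list_ollama_models(api_url: str) -> list[tuple[str, float]]:
    """Query a running Ollama instance (e.g. http://localhost:11434)."""
    with urllib.request.urlopen(f"{api_url}/api/tags") as resp:
        return parse_models(json.load(resp))
```

The cloud providers offer equivalent model-listing endpoints, which is what makes the refresh button work uniformly across all four backends.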

AI Analysis Reports


With the LLM configured, you gain access to three types of AI-powered analysis reports that can be generated from the browser tree context menu. Simply right-click on a server, database, or schema and select the appropriate report from the 'AI Analysis' submenu.

Security Reports

The security report examines your PostgreSQL configuration from a security perspective, covering a comprehensive range of areas:

  • Authentication Configuration: Password policies, SSL/TLS settings, authentication methods, and connection security

  • Access Control and Roles: Superuser accounts, privileged roles, login roles without password expiry, and role privilege assignments

  • Network Security: Listen addresses, connection limits, and pg_hba.conf rules

  • Encryption and SSL: SSL/TLS configuration, password encryption methods, and data-at-rest encryption settings

  • Object Permissions: Schema, table, and function access control lists, default privileges, and ownership (at database scope)

  • Row-Level Security: RLS policies, RLS-enabled tables, and policy coverage analysis

  • Security Definer Functions: Functions running with elevated privileges and their permission settings

  • Audit and Logging: Connection logging, statement logging, error logging, and audit trail configuration

  • Extensions: Installed extensions and their security implications

Security reports can be generated at the server level (covering server-wide configuration such as authentication and network settings), the database level (adding object permissions and RLS analysis), or the schema level (focusing on a specific schema's security posture).

Performance Reports

The performance report analyses your server and database configuration for potential optimisation opportunities:

  • Memory Configuration: shared_buffers, work_mem, effective_cache_size, maintenance_work_mem, and related settings

  • Checkpoint and WAL: Checkpoint settings, WAL configuration, and background writer statistics

  • Autovacuum Configuration: Autovacuum settings, tables needing vacuum, and dead tuple accumulation

  • Query Planner Settings: Cost parameters, statistics targets, JIT compilation, and planner optimisation settings

  • Parallelism and Workers: Parallel query configuration and worker process settings

  • Connection Management: Maximum connections, reserved connections, timeouts, and current connection status

  • Cache Efficiency: Buffer cache hit ratios, database-level cache statistics, and table-level I/O patterns

  • Index Analysis: Index utilisation, unused indexes, tables that might benefit from additional indexes, and index size analysis

  • Query Performance: Slowest queries and most frequent queries (when pg_stat_statements is available)

  • Replication Status: Replication lag, standby status, and WAL sender statistics

Performance reports are available at both the server and database levels, with database-level reports including additional detail on index usage and cache efficiency for that specific database.

Schema Design Reports

The design review report examines your database schema for structural quality and best practices:

  • Table Structure: Table definitions, column counts, sizes, ownership, and documentation coverage

  • Primary Key Analysis: Primary key design and tables lacking primary keys

  • Referential Integrity: Foreign key relationships, orphan references, and relationship coverage

  • Index Strategy: Index definitions, duplicate indexes, index types, and coverage analysis

  • Constraints: Check constraints, unique constraints, and data validation coverage

  • Normalisation Analysis: Repeated column patterns, potential denormalisation issues, and data redundancy

  • Naming Conventions: Table and column naming patterns, consistency analysis, and naming standard compliance

  • Data Type Review: Data type usage patterns, type consistency, and type appropriateness

Design reports are available at the database and schema levels, allowing you to review either an entire database's schema design or focus on a specific schema.

How the Reports Work

Under the hood, the report generation follows a sophisticated multi-stage pipeline that keeps each LLM interaction within manageable token limits whilst still producing comprehensive output:

  • Planning: The LLM first reviews the available analysis sections and the database context (server version, table count, available extensions, and so on), then selects which sections are most relevant to analyse. This means the report is tailored to your specific environment rather than running every possible check regardless of applicability.

  • Data Gathering: For each selected section, pgAdmin executes a set of SQL queries against the database to collect the relevant configuration data, statistics, and metadata.

  • Section Analysis: Each section's data is sent to the LLM independently for analysis. The LLM classifies findings by severity (Critical, Warning, Advisory, or Good) and provides specific, actionable recommendations, including SQL commands where relevant.

  • Synthesis: Finally, the individual section analyses are combined into a cohesive report with an executive summary, a critical issues section aggregating the most important findings, the detailed section analyses, and a prioritised list of recommendations.

As the pipeline works through these stages, the UI shows real-time progress updates: the current stage name (Planning Analysis, Gathering Data, Analysing Sections, Creating Report), a description of what's being processed (for example, 'Analysing Memory Configuration...'), and a progress bar showing how many sections have been completed out of the total. Once all four stages are finished, the completed report is rendered in the panel in one go. Each report can also be downloaded as a Markdown file for archiving or sharing with colleagues.
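Stripped to its essentials, the four-stage pipeline can be sketched like this. The `llm`, `run_sql`, and `progress` interfaces are hypothetical stand-ins for pgAdmin's internals, shown only to illustrate the flow:

```python
def generate_report(llm, run_sql, sections, context, progress):
    """Illustrative four-stage pipeline: plan, gather, analyse, synthesise."""
    # 1. Planning: let the model pick the sections relevant to this environment.
    selected = llm.plan(sections, context)
    analyses = []
    for i, section in enumerate(selected, start=1):
        progress(f"Analysing {section['name']}...", i, len(selected))
        # 2. Data gathering: run the section's SQL to collect metadata and stats.
        data = [run_sql(q) for q in section["queries"]]
        # 3. Section analysis: each section is analysed independently, which
        #    keeps every LLM call within manageable token limits.
        analyses.append(llm.analyse(section, data))
    # 4. Synthesis: combine the per-section analyses into one report.
    return llm.synthesise(analyses)
```

Analysing sections independently is what allows the progress bar to advance per section, and it means a single oversized section cannot blow the context window for the whole report.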

The reports are designed to be genuinely useful rather than generic. Because the LLM receives actual data from your database (configuration settings, role definitions, table statistics, and index information), its analysis is grounded in reality. A security report will flag your specific pg_hba.conf rules that might be overly permissive, a performance report will identify your specific tables that are missing useful indexes, and a design report will point out your specific naming inconsistencies.

A Note on Privacy and Data

It is worth noting that when using cloud-hosted LLM providers (Anthropic or OpenAI), the database metadata and configuration data gathered for reports is sent to those providers' APIs. No actual table data is sent for the reports (only metadata, configuration settings, and statistics), but administrators should be aware of this and ensure it aligns with their organisation's data handling policies. For environments where sending any data externally is not acceptable, the Ollama and Docker Model Runner options allow you to run models entirely locally.

Getting Started

If you'd like to try the AI features, the quickest way to get started is to configure an API key for either Anthropic or OpenAI, set the default provider in Preferences, and then right-click on a server in the browser tree to generate your first report. If you prefer to keep everything local, installing Ollama and pulling a model such as llama3.2 is straightforward, and Docker Desktop users on version 4.40 or later can enable the built-in model runner without any additional setup.

In the next post, I'll cover the AI Chat agent in the query tool, which brings natural language to SQL translation directly into your workflow, along with database-aware conversational assistance. Stay tuned.