Too many tables are bad for you

© Laurenz Albe 2026 (see here for more background)
Table of Contents
- The problems that the customer observed
- Typical causes for high memory usage in PostgreSQL
- How too many tables can cause memory problems
- An example of too many tables
- Mitigating the memory problem caused by too many tables
- How having too many tables can cause long-running statements
- Trying to mitigate long-running metadata queries
- Summing up the consequences of too many tables
Recently, I helped a customer investigate database problems. It turned out that these problems could be traced back to too many tables in the database. Since this may come as a surprise to many users, I thought it worth the while to write about it.
The problems that the customer observed
There were two problems that sounded like they might or might not be related to each other:
- the Linux out-of-memory (OOM) killer occasionally crashed PostgreSQL
- long-running queries hogged the CPU and delayed the application
The first step in investigating OOM problems is always to disable memory overcommit. I originally suspected other software running on the machine to cause the memory shortage, but memory context dumps showed that PostgreSQL was at fault.
Typical causes for high memory usage in PostgreSQL
There are several typical causes for high memory usage in PostgreSQL:
- shared buffers, a fixed size shared memory page cache that PostgreSQL allocates at server start
- private memory that _backend processes_ use to process SQL statements — see the parameter `work_mem`
- _dynamic shared memory areas_ that PostgreSQL uses to exchange data between parallel query workers and the backend process
- large binary data, JSON values or PostGIS geometries that an SQL statement has to _de-TOAST_ and process
However, it turned out that in this case, it was something else that hogged the memory.
How too many tables can cause memory problems
After disabling memory overcommit, we got memory context dumps in the log file, as well as log entries from autovacuum workers that failed to fork because there was too little memory. Initially, the memory context dumps did not show anything interesting: they were from victims of memory starvation rather than from the culprits. That is also why I originally suspected causes external to PostgreSQL. But then we got some memory context dumps that looked as follows:
PgSQL
1
2
3
4
5
6
7
8
9
10 TopMemoryContext: 355016 total in 9 blocks; 203824 free (326 chunks); 151192 used
TopTransactionContext: 8192 total in 1 blocks; 7736 free (1 chunks); 456 used
Record information cache: 8192 total in 1 blocks; 1576 free (0 chunks); 6616 used
...
CacheMemoryContext: 457826384 total in 15437 blocks; 468640 free (9 chunks); 457357744 used
relation rules: 32768 total in 7 blocks; 3880 free (0 chunks); 28888 used: raster_overviews
index info: 2048 total in 2 blocks; 528 free (1 chunks); 1520 used: pg_toast_2619_index
index info: 2048 total in 2 blocks; 824 free (0 chunks); 1224 used: spatial_ref_sys_pkey
index info: 2048 total in 2 blocks; 448 free (1 chunks); 1600 used: pg_toast_18156029_index
...
The only part that showed substantial memory usage was the `CacheMemoryContext`, which `src/backend/utils/mmgr/README` describes as
> CacheMemoryContext --- permanent storage for relcache, catcache, and > > related modules. This will never be reset or deleted, either, so it's > > not truly necessary to distinguish it from TopMemoryContext. But it > > seems worthwhile to maintain the distinction for debugging purposes.
The _catcache_ is a cache for metadata table (catalog) entries, and the _relcache_ caches the description of _relations_ (tables, views, indexes and everything else that PostgreSQL stores in `pg_class`). These caches don't have a fixed size or a size limit. Rather, they track all the metadata that the database connection ever used. The assumption is that the amount of metadata and the number of relations in a PostgreSQL database is limited, so these caches will never grow too large.
So how could this backend process accrue almost half a GB of RAM in `CacheMemoryContext`? The answer is that the database contained tens of thousands of tables, and as a long-lived database connection touched more and more of them, the cache grew larger and larger. Now a single database connection that hogs half a GB of RAM will probably not bring a modern machine down, but if you have many of these long-lived connections, that can eventually eat up all your memory.
An example of too many tables
It is easy to verify the problem with a little experiment. If you want to try that experiment, I recommend that you create a dedicated database for it, so that a simple `DROP DATABASE` will get rid of all the tables again. Here is a statement to create 20000 tables with 20 columns each:
PgSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38 DO
$$DECLARE
i integer;
BEGIN
FOR i IN 1..20000 LOOP
EXECUTE
format(
E'CREATE TABLE %I (\n'
' id bigint GENERATED ALWAYS AS IDENTITY\n'
' PRIMARY KEY,\n'
' col2 text,\n'
' col3 integer,\n'
' col4 timestamp,\n'
' col5 boolean,\n'
' col6 double precision,\n'
' col7 boolean,\n'
' col8 varchar(42),\n'
' col9 numeric,\n'
' col10 xml,\n'
' col11 bytea,\n'
' col12 bit varying(8),\n'
' col13 inet,\n'
' col14 date,\n'
' col15 bigint,\n'
' col16 smallint,\n'
' col17 uuid,\n'
' col18 timestamp with time zone,\n'
' col19 real,\n'
' col20 jsonb\n'
')',
'tab_'||i
);
/* avoid running out of shared memory in the lock table */
IF i%100=0 THEN
COMMIT;
END IF;
END LOOP;
END;$$;
After that, let's look at the size of the `CacheMemoryContext`:
PgSQL
1
2
3
4
5
6
7 SELECT pg_size_pretty(total_bytes)
FROM pg_get_backend_memory_contexts()
WHERE name='CacheMemoryContext';
pg_size_pretty
----------------
4128 kB
The size of the caches is moderate. Now let's use all these tables by `SELECT`ing from them:
PgSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16 DO
$$DECLARE
i integer;
BEGIN
FOR i IN 1..20000 LOOP
EXECUTE
format(
'TABLE %I',
'tab_'||i
);
/* again, to prevent exhausting the lock table */
IF i%100=0 THEN
COMMIT;
END IF;
END LOOP;
END;$$;
The SQL statement `TABLE` is a shortcut for `SELECT * FROM` a database table. After that, the `CacheMemoryContext` has grown:
PgSQL
1
2
3
4
5
6
7 SELECT pg_size_pretty(total_bytes)
FROM pg_get_backend_memory_contexts()
WHERE name='CacheMemoryContext';
pg_size_pretty
----------------
221 MB
Mitigating the memory problem caused by too many tables
The best solution is not to have too many tables to begin with. If a redesign like that is out of reach, there are other ways to at least mitigate the problem:
- Use a moderately sized connection pool. Limiting the number of database connections will automatically limit the overall memory usage.
- If some of your database sessions are idle for longer time periods, you could set `idle_session_timeout` (introduced in PostgreSQL v14) to have PostgreSQL terminate them regularly, before they can accrue a large `CacheMemoryContext`. This is only an option if your application or your connection pool handle dropped database connections gracefully.
How having too many tables can cause long-running statements
The statement that kept running for a long time in the customer's database was a simple query of `raster_overviews`, a metadata view of PostGIS. Now one would not expect metadata queries to take a lot of processing time, but that is because metadata tables are normally of moderate size. The view in question used the catalog tables `pg_class`, `pg_attribute`, `pg_type`, `pg_namespace` and `pg_constraint`. Let's look at the size of these tables in our test database:
PgSQL
1
2
3
4
5
6
7
8
9
10
11 SELECT relname,reltuples
FROM pg_class
WHERE relname IN('pg_class','pg_attribute','pg_type','pg_namespace','pg_constraint');
relname|reltuples
---------------+-----------
pg_type|40621
pg_attribute|943161
pg_class|100415
pg_constraint|40194
pg_namespace|4
If these counts surprise you, remember that the tables we created actually created many more relations with their attributes, as well as other objects:
- primary key indexes
- TOAST tables
- TOAST indexes
- sequences for the identity column
- composite types for each table
- array data types for each of the composite types
Trying to mitigate long-running metadata queries
There is little you can do to speed up the metadata queries that PostgreSQL itself performs. But I figured out that it was possible to improve `raster_overviews` by simplifying the view definition. That was a step in the right direction; unfortunately it didn't prove to be enough to speed up the query in question.
Really, the only good way to deal with this problem is to reduce the number of tables and other database objects for a single database.
Summing up the consequences of too many tables
Perhaps somewhat surprisingly, having too many tables in a single database can cause memory and performance problems. I gave you some background information, shared some diagnostic tools and pointed out approaches to reduce the pain. Perhaps some day PostgreSQL can have a smarter way of dealing with large metadata caches. However, I can understand that there are more interesting things to do than optimizing a use case that you could call an _ab_ use case.
Note that while this article dealt with the special case of too many tables, you can also expect to experience difficulties if you have too many schemas in a single database or too many databases or too many roles in a single cluster. Try to keep the size of PostgreSQL metadata tables moderate!