Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
The pitfalls of partitioning Postgres yourself (hatchet.run)
43 points by abelanger 3 hours ago | hide | past | favorite | 5 comments




Ha, what a coincidence. Just today I was reading a three year old Stackoverflow discussion about this [1].

It prompted Laurenz to submit the documentation patch that is cited in the article. In the discussion of the patch itself, people seem to conclude that it's a good improvement to the docs, but that the behaviour itself is a bit of a footgun. [2]

[1]: https://stackoverflow.com/questions/73951604/autovacuum-and-...

[2]: https://www.postgresql.org/message-id/Y8cQJIMFAe7QT73/%40mom...


I mentioned this towards the bottom of the post, but to reiterate: we're extremely grateful to Laurenz for helping us out here, and his post on this is more than worth checking out: https://www.cybertec-postgresql.com/en/partitioned-table-sta...

(plus an interesting discussion in the comments of that post on how the query planner chose a certain row estimate in the specific case that Laurenz shared!)

The other thing I'll add is that we still haven't figured out:

1. An optimal ANALYZE schedule here on parent partitions; we're opting to over-analyze than under-analyze at the moment, because it seems like our query distribution might change quite often.

2. Whether double-partitioned tables (we have some tables partitioned by time series first, and an enum value second) need analyze on the intermediate tables, or whether the top-level parent and bottom-level child tables are enough. So far just the top-level and leaf tables seem good enough.


I'd consider myself pretty familiar with postgres partitioning, and even worked with systems that emulated partitioning through complex dynamic SQL through stored procs before it was supported natively.

But TIL, I didn't realize you could do multiple levels of partitioning in modern postgres, found this old blog post that touches on it https://joaodlf.com/postgresql-10-partitions-of-partitions.h...

Something that stresses me is the number of partitions - we have some weekly partitions that have a long retention period, and whilst it hasn't become a problem yet, it feels like a ticking time bomb as the years go on.

Would a multi level partitioning scheme of say year -> week be a feasible way to side step the issues of growing partition counts?


They didn’t say why they didn’t use the built-in partitioning system.

They are using the built-in partitioning. They just ran into one of the "you gotta know this" pitfalls.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: