Yes, RDS seems to really hold PG back on AWS, especially with all the interesting PG extensions getting released now (e.g. pg_lake). It is a shame I can't move to other PG vendors, because it is a pain in the ass to get all the privacy and legal docs in order.
I’ve been using this since early this year and it’s been great. It was what convinced me to just stick to Postgres rather than using a dedicated vector db.
Only working with 100m or so vectors, but for that it does the job.
The biggest selling point of using Postgres over Qdrant or whatever is that you can put all the data in the same db. That lets you use joins and CTEs, foreign keys and other constraints, get lower latency, effectively get rid of n+1 query cases, and ensure data integrity.
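To make the "same db" point concrete, here is a minimal sketch of a nearest-neighbour lookup that joins in relational data in one query instead of issuing one follow-up query per hit. It uses an in-memory SQLite database with a hand-rolled `l2_distance` function as a stand-in so that it runs anywhere. It is not Postgres or pgvector, and the `authors`/`docs` tables and all names are made up for illustration.

```python
import json
import math
import sqlite3


def l2_distance(a_json: str, b_json: str) -> float:
    """Euclidean distance between two JSON-encoded vectors."""
    a, b = json.loads(a_json), json.loads(b_json)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# In-memory SQLite stand-in; with Postgres the vectors would live in a
# vector column instead of JSON text (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.create_function("l2_distance", 2, l2_distance)
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE docs (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors(id),
        title TEXT NOT NULL,
        embedding TEXT NOT NULL
    );
""")
conn.executemany("INSERT INTO authors VALUES (?, ?)", [(1, "ada"), (2, "bob")])
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?, ?)",
    [
        (1, 1, "intro", json.dumps([0.0, 0.0])),
        (2, 2, "joins", json.dumps([1.0, 1.0])),
        (3, 1, "ctes", json.dumps([5.0, 5.0])),
    ],
)


def nearest_with_authors(query_vec, k):
    """Return the k nearest docs together with their author in ONE query,
    rather than k extra author lookups (the n+1 pattern)."""
    rows = conn.execute(
        """
        SELECT d.title, a.name
        FROM docs AS d
        JOIN authors AS a ON a.id = d.author_id
        ORDER BY l2_distance(d.embedding, ?)
        LIMIT ?
        """,
        (json.dumps(query_vec), k),
    )
    return rows.fetchall()


def orphan_insert_rejected():
    """The foreign key stops an embedding row that points at no author."""
    try:
        conn.execute("INSERT INTO docs VALUES (4, 99, 'orphan', '[0, 0]')")
    except sqlite3.IntegrityError:
        return True
    return False
```

With a separate vector store, the join would typically become a second round trip keyed on the returned ids, and the foreign-key check would have to be enforced in application code.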
I generally agree that one database instance is ideal, but there are other reasons why Postgres everywhere is advantageous, even across multiple instances:
- Expertise: it's just SQL for the most part
- Ecosystem: same ORM, same connection pooler
- Portability: all major clouds have managed Postgres
I'd gladly take multiple Postgres instances even if I lose cross-database joins.
Yep. If performance becomes a concern but we still want to exploit joins etc., it's easy to set up replicas and "shard" read-only use cases across them.
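A rough sketch of what that routing could look like on the application side, assuming writes always go to the primary and each read-only use case is pinned to one replica. The DSNs, the use-case names, and the `dsn_for` helper are all hypothetical, and this ignores replication lag entirely.

```python
import zlib

# Hypothetical connection strings; replace with real endpoints.
PRIMARY_DSN = "postgresql://primary.internal/app"
REPLICA_DSNS = [
    "postgresql://replica-0.internal/app",
    "postgresql://replica-1.internal/app",
]


def dsn_for(use_case: str, read_only: bool) -> str:
    """Pick a connection string for a named use case.

    Writes (and everything, if no replicas are configured) go to the
    primary. Each read-only use case is pinned to one replica via a
    stable hash of its name, so the same workload always hits the same
    replica and keeps that replica's cache warm.
    """
    if not read_only or not REPLICA_DSNS:
        return PRIMARY_DSN
    index = zlib.crc32(use_case.encode("utf-8")) % len(REPLICA_DSNS)
    return REPLICA_DSNS[index]
```

Usage would be something like `dsn_for("vector-search", read_only=True)` when opening a pool for the search path. Pinning by use case rather than round-robin is just one choice; it trades even load for cache locality.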
https://www.tigerdata.com/blog/pgvector-vs-pinecone
https://github.com/timescale/pgvectorscale/issues/113