Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow CREATE/DROP INDEX without CONCURRENTLY #81

Open
provokateurin opened this issue Jan 7, 2024 · 12 comments
Open

Allow CREATE/DROP INDEX without CONCURRENTLY #81

provokateurin opened this issue Jan 7, 2024 · 12 comments

Comments

@provokateurin
Copy link

provokateurin commented Jan 7, 2024

It is not always desirable to create/drop indexes concurrently, so it should be possible to switch it off.

@bplunkett-stripe
Copy link
Collaborator

Hi provokateurin!

In what scenarios would you like to not build an index concurrently? I realize it does consume more CPU than a normal index build. Is that the primary use case?

@provokateurin
Copy link
Author

In the use-case we want to run all migrations in a transaction, so the index can't be built concurrently.
The other problem is that concurrent indexes are not really safe. You have come back later and check if the index was actually built.

cc @GeertJohan

@bplunkett-stripe
Copy link
Collaborator

I see!

I'd argue transactional DDL isn't super critical because you can always roll it back (with exceptions of deletes) via plan(oldSchema).

Concurrent index builds can't silently fail, so you'll know if a concurrent index build failed. And to get your database out of a partially migrated state, you can do plan(newSchema) again, or even plan(oldSchema). pg-schema-diff will identify invalid indexes from concurrent index build failures and REINDEX them.

@mortenson
Copy link

mortenson commented Jan 8, 2024

If this gets changed please make non-concurrency opt-in - concurrent index builds are really important/preferred in production to me. During development and for a small number of rows I can see how the multiple transactions are not great, though.

@provokateurin
Copy link
Author

provokateurin commented Jan 8, 2024

pg-schema-diff will identify invalid indexes from concurrent index build failures and REINDEX them.

We are only using pg-schema-diff for generating the migration statements, so we are not able to use the automatic reindexing.

If this gets changed please make non-concurrency opt-in

Agreed, it should be an option and not remove the existing functionality.

@bplunkett-stripe
Copy link
Collaborator

bplunkett-stripe commented Jan 8, 2024

We are only using pg-schema-diff for generating the migration statements, so we are not able to use the automatic reindexing.

As in, you're persisting your migration to your codebase?

If this gets changed please make non-concurrency opt-in - concurrent index builds are really important/preferred in production to me. During development and for a small number of rows I can see how the multiple transactions are not great, though.

Our philosophy is to make migration as online-as-possible, so if we did add this option in, it absolutely would be disabled by default.

I think I'm leaning towards supporting this in the future, but right now, I want to focus on the core of the library:

  • Making more migrations online
  • Adding support for more SQL objects and schemas

What do you folks think?

@provokateurin
Copy link
Author

As in, you're persisting your migration to your codebase?

Yes, the are saved as a file. The process of generating the migration and applying the migration are not the same. The migration is generated while the changes are developed and they are only applied when a new version is deployed.

@bplunkett-stripe
Copy link
Collaborator

Yes, the are saved as a file. The process of generating the migration and applying the migration are not the same. The migration is generated while the changes are developed and they are only applied when a new version is deployed.

Any thoughts on changing this process such that the migration is generated and applied when the version is deployed? This also prevents a proliferation of {x}.sql files in y our repo

@provokateurin
Copy link
Author

But we want to store all migrations as sql files, so this is exactly our use-case. I don't think there is much we want to change about it.
Anyway, this is getting a bit off topic. Let's just add the option to disable concurrent indexes and we are done :)

@bplunkett-stripe
Copy link
Collaborator

bplunkett-stripe commented Jan 11, 2024

Totally understand! Forgive my questions! I'm just trying to understand your use case and poke its edges to see how pg-schema-diff can slot into it.

I think, to rephrase the ask a bit, is generate DDL that can be run entirely in one transaction (as opposed to just getting rid of concurrent index builds). If we do implement this type of behavior, it will be disabled by default.

@provokateurin
Copy link
Author

Sorry, I didn't want to come off as rude if that was the case!

Yes, we want to be able to generate DDL that can be run in a transaction. So far we only found the concurrent indexes to be a problem, but maybe more still show up later.

@bplunkett-stripe
Copy link
Collaborator

No worries! I suspect you folks might run into issues with the "online [check|not null] constraint" addition and "online index replacement" (assuming no concurrently). I haven't really tested if that works transactionally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants