-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Autoreload reliability improvements #14500
base: main
Are you sure you want to change the base?
Autoreload reliability improvements #14500
Conversation
…ions/deduperreload/deduperreload_patching.py
cc @jasongrout Very very cool stuff! |
Thanks, that is great ! Do you want to wrote a bit about this in the what's new ? https://github.com/ipython/ipython/tree/main/docs/source/whatsnew (and feel free to mention this is internship at databrick and link to any relavant inofrmation and your profile and those of people who helped you.). I don't have much time to review, and this is big, but if you work with someone like @jasongrout and they +1 then I'm happy to get than in. I might do a release tomorrow, I'm not sure this will make the cut, but otherwise at the release at end of september. |
Hi Matthias, I was Jay's intern mentor for this project, for which we were fortunate to have advising from @jasongrout (will make sure to secure his +1 at some point :P ). I can help out with shepherding this PR thru as well as adding the relevant release notes to the IPython docs. Internally we're also discussing a potential blog post that goes into a few more details -- if that ends up happening, I'll be sure to link it as well. Regardless, definitely no rush on the release -- I'll follow up here around the end of September. |
No worries. If you tell me Jason's was involved and think it's ok, I trust you. My guess is a coordinated blog post/release would be the best, so I can make a release out of the usual schedule if that helps you. I'm currently travelling but should have some time again at the end of next week. |
Hi. One potential fix is an option to ignore anything from /site-packages/ directory if "/site-packages/" in fname:
self.source_by_modname[new_modname] = ""
continue |
Hi @phihung, thanks a ton for the report. I'll take a look to see what's going on and I agree that excluding /site-packages/ from deduperreload as a workaround may be the right move |
@jaysarva opened jaysarva#1 to address @phihung's above comment |
@@ -233,7 +238,7 @@ def filename_and_mtime(self, module): | |||
|
|||
return py_filename, pymtime | |||
|
|||
def check(self, check_all=False, do_reload=True): | |||
def check(self, check_all=False, do_reload=True, use_deduper_reload=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
I'd like to see if there is a way to turn off use_duper_reload at runtime; just in case it breaks something maybe have the default as in instance varaible that cn be change with a parameter to the magic.
I'm og with true being the default, but scare we'd break someone without options to turn it off without changing version
@@ -285,6 +297,8 @@ def check(self, check_all=False, do_reload=True): | |||
file=sys.stderr, | |||
) | |||
self.failed[py_filename] = pymtime | |||
if use_deduper_reload: | |||
self.deduper_reloader.update_sources() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. (I put ok in there for myself to know how far I've read).
DefinitionAst = (ast.FunctionDef, ast.AsyncFunctionDef) | ||
|
||
|
||
def get_module_file_name(module: ModuleType | str) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a note (not a request to change). Those |
, or list[]
fails on older version of Python when using runtime types extraction of type (https://docs.python.org/3/library/typing.html#typing.get_type_hints). Som in some case from __future__ import annotations
is not sufficient to use the new annotation syntax.
""" | ||
|
||
qualified_name: tuple[str, ...] | ||
abstract_syntax_tree: ast.AST |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
|
||
def __init__(self) -> None: | ||
self.children: dict[str, AutoreloadTree] = {} | ||
self.defs_to_reload: dict[str, ast.AST] = {} | ||
self.defs_to_delete: set[str] = set() | ||
self.new_nested_classes: dict[str, ast.AST] = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think usually you want to the type annotation at class level for proper type checking in subclasses I believe; it used to be that mypy was not able to infer type annotation of attribute defined in init. (or emit a AutoreloadTree.children
attribute might not be set/ None does not have method XXX).
I don't think that's a big problem here, but something to keep in mind depending on what
Detects what functions/methods can be reloaded by recursively comparing the old/new AST of module-level classes, module-level classes' methods, recursing through nested classes' methods. | ||
If other changes are made, original autoreload algorithm is called directly. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe wrap at 80 chars wide.
Returns `true` if we can run our targeted autoreload algorithm safely. | ||
Returns `false` if we should instead use IPython's original autoreload implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returns `true` if we can run our targeted autoreload algorithm safely. | |
Returns `false` if we should instead use IPython's original autoreload implementation. | |
Returns | |
------- | |
`True` if we can run our targeted autoreload algorithm safely. | |
`False` if we should instead use IPython's original autoreload implementation. |
We try to follow numpydoc standard when we can.
Returns `true` on success. Returns `false` on failure. | ||
Currently, only returns `true` as we do not block on failure to build this graph. | ||
""" | ||
return self._gather_dependents(new_ast.body) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your patience I'm slowly reading through this. I will do a release soon, and might not have the time to fully review. I've dropped some comments for myself when I come back to it.
denylist site-packages and dist-packages from deduperreload
Introducing Deduperreload, an autoreload implementation that does not produce duplicate versions of objects upon reload. This work was done as part of a Summer 2024 internship at Databricks. We hope the community can also benefit from this!
The goal of this autoreload implementation is to allow for any modification to a function/method and have those changes be implemented directly without having to refresh the repl.
Here’s our pipeline:
Note that this functionality only takes effect for CPython.
What types of changes are supported by deduperreload?
Modifications to any function/method are supported. If some other change occurs (e.g. modification to a class member variable, or some module-level variable), we fall back to superreload.
Furthermore, if the AST is not modified (in a meaningful way), reloading will not occur, even if the file has been modified (e.g. adding a new line, etc.)
How does deduperreload work?
Instead of calling
reload()
to generate a new copy of the entire module (and everything in it), we intelligently find functions/methods that have been modified and just update the attributes of these existing function/method object. We also support adding/removing new functions/methods in a similar manner.Why is dedupereload important?
With the original implementation of autoreload, function objects get duplicated due to directly calling the
reload()
function. This can cause many problems, especially with decorators, enums, etc. as it is possible for both the old function object and the new reloaded function object to exist at the same time.Fixes #14395