-
Notifications
You must be signed in to change notification settings - Fork 7.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ForEach-Object -Parallel situationally drops pipeline input #12801
Comments
Yeah. It tries to marshal back to the original pipeline execution thread, but since a pipeline is already running it just runs in the current thread while pretending it's the pipeline thread. As a result a whole bunch of state corruption and other weird side effects are possible. |
You can see it by doing this: class Foo {
static [object] Echo($val) {
return [PSCustomObject]@{
ThreadId = [Threading.Thread]::CurrentThread.ManagedThreadId
RunspaceId = [runspace]::DefaultRunspace.Id
}
}
}
$cls = [Foo]
while ($true) {
1..10 | ForEach-Object -Parallel {
Start-Sleep -ms 100
($using:cls)::echo($_)
}
} You'll get a whole bunch of different thread ID's with the same runspace. Also occasionally some other evidence of state corruption like "Global scope cannot be removed" errors (note this is on Windows too, I'm guessing your Windows machine just has hardware that makes the race conditions less frequently hit). |
That's a long winded way of saying it's basically a dupe of #4003. Though a significantly more likely to happen in the wild repro. |
A possible "solution" is to try to block PowerShell classes from being carried over. Sort of like how script blocks currently work. If #4003 was fixed, this would just dead lock. Personally I'd prefer if runspace affinity was just removed from classes but I understand it's next to impossible to know the impact of a change like that. Something should probably be done though, many folks are going to turn to classes as a way to "get around" the script block problem. It might even appear to work for awhile. |
Thanks for the detailed analysis, @SeeminglyScience. If removing runspace affinity is too risky, perhaps another option is to simply recreate classes in the target runspace, similar to what is already being discussed for script blocks - #12378 - and in line with making the planned opt-in for transferring runspace state comprehensive - #12240. |
I mean doing that would already be effectively removing runspace affinity anyway -- you'd just be binding their affinity to the new runspace. It would be largely indistinguishable from always having the code execute in whichever thread it's called and just looking for the default runspace for that thread. More or less about the same there, I'd guess. |
Just straight up recreating it is a little sketchy though because you'd lose state from static properties. Property accessors don't (currently) invoke any PowerShell so they aren't subject to the same problems necessarily. Not that you should use them for state sharing in all your threads, but more that it would be confusing if it "appeared" to be gone randomly. Also recreating the classes would create a lot of confusion around type identity. |
While disallowing cross-thread class use is definitely the safest option, it also takes away potentially useful functionality. Given the "less solid" nature of PowerShell classes my guess is that loss of type identity would be less of an issue - it's a case where documenting the issue may suffice (just as users will have to understand that script blocks that are recreated in a different thread no longer form closures around the original runspace's variables, if any); the Good point about static properties, however - that sounds more problematic.
Has anyone ever looked into this? |
NOTE: The examples below should never be purposefully relied on, they are incredibly dangerous and can corrupt state in unexpected ways and/or cause general instability.
It's probably accidently relied on. Apparently a class method converted to a delegate can run on any thread, even one without a runspace. I was having trouble coming up with a scenario where this would appear useful because I assumed you needed a thread with a default runspace other than the one it was originally bound to. Knowing this, it makes a lot of .NET interop appear to work as expected. class Test {
static [psobject] GetInfo() {
return [PSCustomObject]@{
ThreadId = [Threading.Thread]::CurrentThread.ManagedThreadId
RunspaceId = [runspace]::DefaultRunspace.Id
}
}
}
[Threading.Tasks.Task[psobject]]::
Run([Func[psobject]][Test]::GetInfo).
GetAwaiter().
GetResult()
[Test]::GetInfo() Gives some variant of:
Here's something that really surprised me, if you don't block until completion, it'll actually properly marshal it to the right thread: class Test {
static [psobject] GetInfo() {
return [PSCustomObject]@{
ThreadId = [Threading.Thread]::CurrentThread.ManagedThreadId
RunspaceId = [runspace]::DefaultRunspace.Id
}
}
}
$task = [Threading.Tasks.Task[psobject]]::Run([Func[psobject]][Test]::GetInfo)
while (-not $task.AsyncWaitHandle.WaitOne(200)) { }
$task.GetAwaiter().GetResult()
[Test]::GetInfo()
So the answer is yes, it probably accidently makes a lot of complicated .NET interop work where it would otherwise throw. Any API where you pass a delegate and that delegate gets called on another thread. |
Personally I think these behaviors would be the most ideal for method invocations. If called on:
Though 2 and 3 would be breaking changes, I think they're worth it considering how dangerous they can be. cc @rjmholt @PaulHigin @daxian-dbw if any of y'all want to weigh in. |
With #3, is it more desirable to throw there, or take the time to spin up a runspace to execute in anyway, so as to "make" it work without corrupting the state of another runspace? |
I've thought about that a lot but there's a lot of questions
As much as the error has been frustrating for me in the past, it's also been pretty helpful in informing me that I basically just shouldn't use that API. It's a hard call. I'd like the option, but I'm also not super sure how you'd go about surfacing that option. Or if it'd be worth the development time. |
Another consequence of this is that if you call a second method from the first one, you get a dead lock. class Test {
static [void] FirstMethod() {
[Console]::WriteLine('First method starting')
[Test]::SecondMethod()
[Console]::WriteLine('First method finishing')
}
static [void] SecondMethod() {
[Console]::WriteLine('Second method starting')
[Console]::WriteLine('Second method finishing')
}
}
[System.Threading.Tasks.Task]::Run([action][Test]::FirstMethod).GetAwaiter().GetResult() The first method isn't actually running on the pipeline thread, so the second method call also tries to use PowerShell eventing to retake the pipeline thread. The first method will never complete, because it's waiting on the second method, which is waiting on the first method. I accidently ran into this trying to use the |
Well, just ran into this today with trying to reference a class instance created in the parent script. |
PowerShell classes are not thread safe. This is true whether they are run in |
@SeeminglyScience Thanks for taking the time to look at #17957 . As someone who isn't a developer, but uses PS a lot, classes have been great. Their introduction helped me structure my work a lot better. I have two questions. It does not seem that the documentation pages for "about_Classes" or "about_ThreadJob" mention anywhere that PS classes are not thread safe. I'm not sure if this does in fact appear anywhere in the PS documentation? Secondly, is there any chance that this might change in future? Or will it be a case of PowerShell not really being the correct tool for the job, and people who might need to use classes with threading moving away from PS, to complied C# code or A.N.Other solution? Cheers again. |
More precisely, it's affinity to session state. A runspace could have multiple session states. I think a cleaner and safer solution is to introduce an attribute to explicitly opt out of session state affinity for a powershell class. And for such a class definition, its instances object and static method calls simply ignores the session state affinity, which means they will run in the current default runspace, with the current session state in that runspace.
Of course, once declaring the attribute to the class, the class implementation should not depend on any states of a particular session state, because there is no guarantee the dependencies would be available at runtime. I will work on this idea a bit more to see if there are any potential blockers. At the meanwhile, there is a way today to create a powershell class instance without any session state affinity (no such workaround for static members though). See #3651 (comment) for the details. Note that, we officially support this "workaround": #3871 |
Note: I can only reproduce this reliably on macOS 10.15.4.
I see it occasionally on Ubuntu 18.04, and can never reproduce it on Windows.
It may be connected to using
Start-Sleep
inside the script block; not sure if the fact that a custom class is used across thread boundaries is relevant [update: it is]Steps to reproduce
Expected behavior
The loop should run indefinitely, because the test should alway succeed.
Actual behavior
The loop eventually - after a varying number of iterations - breaks due to intermittent test failures, stemming from a missing output; e.g.:
That is, not all input objects were echoed.
Environment data
The text was updated successfully, but these errors were encountered: