ForEach-Object -Parallel situationally drops pipeline input #12801

mklement0 · 2020-05-26T22:14:44Z

Note: I can only reproduce this reliably on macOS 10.15.4.
I see it occasionally on Ubuntu 18.04, and can never reproduce it on Windows.

It may be connected to using Start-Sleep inside the script block; not sure if the fact that a custom class is used across thread boundaries is relevant [update: it is]

Steps to reproduce

class Foo { static [object] Echo($val) { return $val } }
$cls = [Foo]
while ($true) { 
  $res = 1..10 | ForEach-Object -Parallel {
    Start-Sleep -ms 100; ($using:cls)::echo($_) 
  }  |  Sort-Object
  Compare-Object $res (1..10) | Should -BeNullOrEmpty
  Write-Host -NoNewline . 
}

Expected behavior

The loop should run indefinitely, because the test should alway succeed.

Actual behavior

The loop eventually - after a varying number of iterations - breaks due to intermittent test failures, stemming from a missing output; e.g.:

Expected $null or empty, but got @(@{InputObject=5; SideIndicator==>})

That is, not all input objects were echoed.

Environment data

PowerShell Core 7.1.0-preview.3  on macOS 10.15.4

The text was updated successfully, but these errors were encountered:

SeeminglyScience · 2020-05-27T00:05:23Z

not sure if the fact that a custom class is used across thread boundaries is relevant.

Yeah. It tries to marshal back to the original pipeline execution thread, but since a pipeline is already running it just runs in the current thread while pretending it's the pipeline thread. As a result a whole bunch of state corruption and other weird side effects are possible.

SeeminglyScience · 2020-05-27T00:10:31Z

You can see it by doing this:

class Foo {
    static [object] Echo($val) {
        return [PSCustomObject]@{
            ThreadId = [Threading.Thread]::CurrentThread.ManagedThreadId
            RunspaceId = [runspace]::DefaultRunspace.Id
        }
    }
}

$cls = [Foo]
while ($true) { 
    1..10 | ForEach-Object -Parallel {
        Start-Sleep -ms 100
        ($using:cls)::echo($_) 
    }
}

You'll get a whole bunch of different thread ID's with the same runspace. Also occasionally some other evidence of state corruption like "Global scope cannot be removed" errors (note this is on Windows too, I'm guessing your Windows machine just has hardware that makes the race conditions less frequently hit).

SeeminglyScience · 2020-05-27T00:12:44Z

That's a long winded way of saying it's basically a dupe of #4003. Though a significantly more likely to happen in the wild repro.

SeeminglyScience · 2020-05-27T13:01:12Z

/cc @PaulHigin @daxian-dbw

A possible "solution" is to try to block PowerShell classes from being carried over. Sort of like how script blocks currently work.

If #4003 was fixed, this would just dead lock. Personally I'd prefer if runspace affinity was just removed from classes but I understand it's next to impossible to know the impact of a change like that. Something should probably be done though, many folks are going to turn to classes as a way to "get around" the script block problem. It might even appear to work for awhile.

mklement0 · 2020-05-27T13:19:54Z

Thanks for the detailed analysis, @SeeminglyScience.

If removing runspace affinity is too risky, perhaps another option is to simply recreate classes in the target runspace, similar to what is already being discussed for script blocks - #12378 - and in line with making the planned opt-in for transferring runspace state comprehensive - #12240.

vexx32 · 2020-05-27T13:25:26Z

I mean doing that would already be effectively removing runspace affinity anyway -- you'd just be binding their affinity to the new runspace. It would be largely indistinguishable from always having the code execute in whichever thread it's called and just looking for the default runspace for that thread. More or less about the same there, I'd guess.

SeeminglyScience · 2020-05-27T13:51:50Z

Just straight up recreating it is a little sketchy though because you'd lose state from static properties. Property accessors don't (currently) invoke any PowerShell so they aren't subject to the same problems necessarily. Not that you should use them for state sharing in all your threads, but more that it would be confusing if it "appeared" to be gone randomly. Also recreating the classes would create a lot of confusion around type identity.

mklement0 · 2020-05-28T13:25:28Z

While disallowing cross-thread class use is definitely the safest option, it also takes away potentially useful functionality.

Given the "less solid" nature of PowerShell classes my guess is that loss of type identity would be less of an issue - it's a case where documenting the issue may suffice (just as users will have to understand that script blocks that are recreated in a different thread no longer form closures around the original runspace's variables, if any); the .FullName properties will be identical, but -is and -as won't work (perhaps a little white lie could fix that)?

Good point about static properties, however - that sounds more problematic.
But couldn't we special-case this to copy any static property values after recreating (just like script blocks will be special-cased, as discussed in #12378 (comment))?

I'd prefer if runspace affinity was just removed from classes but I understand it's next to impossible to know the impact of a change like that.

Has anyone ever looked into this?
Isn't the current behavior never useful, or are there legitimate scenarios where existing code relies on it?

SeeminglyScience · 2020-06-05T13:38:47Z

NOTE: The examples below should never be purposefully relied on, they are incredibly dangerous and can corrupt state in unexpected ways and/or cause general instability.

Isn't the current behavior never useful, or are there legitimate scenarios where existing code relies on it?

It's probably accidently relied on. Apparently a class method converted to a delegate can run on any thread, even one without a runspace. I was having trouble coming up with a scenario where this would appear useful because I assumed you needed a thread with a default runspace other than the one it was originally bound to. Knowing this, it makes a lot of .NET interop appear to work as expected.

class Test {
    static [psobject] GetInfo() {
        return [PSCustomObject]@{
            ThreadId = [Threading.Thread]::CurrentThread.ManagedThreadId
            RunspaceId = [runspace]::DefaultRunspace.Id
        }
    }
}

[Threading.Tasks.Task[psobject]]::
    Run([Func[psobject]][Test]::GetInfo).
    GetAwaiter().
    GetResult()

[Test]::GetInfo()

Gives some variant of:

ThreadId RunspaceId
-------- ----------
      19          1
      15          1

Here's something that really surprised me, if you don't block until completion, it'll actually properly marshal it to the right thread:

class Test {
    static [psobject] GetInfo() {
        return [PSCustomObject]@{
            ThreadId = [Threading.Thread]::CurrentThread.ManagedThreadId
            RunspaceId = [runspace]::DefaultRunspace.Id
        }
    }
}

$task = [Threading.Tasks.Task[psobject]]::Run([Func[psobject]][Test]::GetInfo)
while (-not $task.AsyncWaitHandle.WaitOne(200)) { }
$task.GetAwaiter().GetResult()

[Test]::GetInfo()

ThreadId RunspaceId
-------- ----------
      15          1
      15          1

So the answer is yes, it probably accidently makes a lot of complicated .NET interop work where it would otherwise throw. Any API where you pass a delegate and that delegate gets called on another thread.

SeeminglyScience · 2020-06-05T14:03:13Z

Personally I think these behaviors would be the most ideal for method invocations.

If called on:

The origin runspace thread - run with SessionState affinity like normal
A thread with a default runspace but one where the class or instance was not defined - run in the current thread's runspace without SessionState affinity. Maybe with TopLevelSessionState affinity
A thread with no default runspace - throw ScriptBlockDelegateInvokedFromWrongThread like ScriptBlock.Invoke currently does

Though 2 and 3 would be breaking changes, I think they're worth it considering how dangerous they can be.

cc @rjmholt @PaulHigin @daxian-dbw if any of y'all want to weigh in.

vexx32 · 2020-06-05T14:05:09Z

With #3, is it more desirable to throw there, or take the time to spin up a runspace to execute in anyway, so as to "make" it work without corrupting the state of another runspace?

SeeminglyScience · 2020-06-05T14:16:30Z

With #3, is it more desirable to throw there, or take the time to spin up a runspace to execute in anyway, so as to "make" it work without corrupting the state of another runspace?

I've thought about that a lot but there's a lot of questions

What kind of state should it have? InitialSessionState.CreateDefault()?
Is it better to do that transparently? There's a potential for a significant amount of extra overhead in a way that isn't super visible to the user.
If you throw instead, you know exactly what happened. If you create a new runspace, it's next to impossible for even the above average user to know their delegate was invoked in a different thread.
When would the runspace be disposed? Directly after invocation?

As much as the error has been frustrating for me in the past, it's also been pretty helpful in informing me that I basically just shouldn't use that API. It's a hard call. I'd like the option, but I'm also not super sure how you'd go about surfacing that option. Or if it'd be worth the development time.

SeeminglyScience · 2020-06-15T13:39:37Z

Another consequence of this is that if you call a second method from the first one, you get a dead lock.

class Test {
    static [void] FirstMethod() {
        [Console]::WriteLine('First method starting')
        [Test]::SecondMethod()
        [Console]::WriteLine('First method finishing')
    }

    static [void] SecondMethod() {
        [Console]::WriteLine('Second method starting')
        [Console]::WriteLine('Second method finishing')
    }
}

[System.Threading.Tasks.Task]::Run([action][Test]::FirstMethod).GetAwaiter().GetResult()

The first method isn't actually running on the pipeline thread, so the second method call also tries to use PowerShell eventing to retake the pipeline thread. The first method will never complete, because it's waiting on the second method, which is waiting on the first method.

I accidently ran into this trying to use the PSDataCollection<>.DataAdding event with PowerShell class methods as event handlers.

chrisbues · 2021-11-02T02:07:33Z

Well, just ran into this today with trying to reference a class instance created in the parent script.

PaulHigin · 2021-11-02T16:08:02Z

PowerShell classes are not thread safe. This is true whether they are run in ForEach-Object -Parallel or any of the PowerShell threading APIs. It would be worth looking into whether this can be detected through a using reference.

CanineBarbie · 2022-09-16T14:13:34Z

@SeeminglyScience Thanks for taking the time to look at #17957 .

As someone who isn't a developer, but uses PS a lot, classes have been great. Their introduction helped me structure my work a lot better.

I have two questions. It does not seem that the documentation pages for "about_Classes" or "about_ThreadJob" mention anywhere that PS classes are not thread safe. I'm not sure if this does in fact appear anywhere in the PS documentation?

Secondly, is there any chance that this might change in future? Or will it be a case of PowerShell not really being the correct tool for the job, and people who might need to use classes with threading moving away from PS, to complied C# code or A.N.Other solution?

Cheers again.

daxian-dbw · 2022-09-16T18:41:59Z

I'd prefer if runspace affinity was just removed from classes but I understand it's next to impossible to know the impact of a change like that.

Has anyone ever looked into this?
Isn't the current behavior never useful, or are there legitimate scenarios where existing code relies on it?

More precisely, it's affinity to session state. A runspace could have multiple session states.
The current behavior is useful for classes or class instances exposed from modules. The implementation of static/instance methods of a powershell class may depend on states within the module scope, such as functions and variables. The affinity to session state makes sure those states is accessible when calling the static/instance methods of the powershell class.

I think a cleaner and safer solution is to introduce an attribute to explicitly opt out of session state affinity for a powershell class. And for such a class definition, its instances object and static method calls simply ignores the session state affinity, which means they will run in the current default runspace, with the current session state in that runspace.

[OptOutSessionStateAffinity()]
class Foo
{
    static [object] Echo($val) { return $val } }
}

Of course, once declaring the attribute to the class, the class implementation should not depend on any states of a particular session state, because there is no guarantee the dependencies would be available at runtime.

I will work on this idea a bit more to see if there are any potential blockers.

At the meanwhile, there is a way today to create a powershell class instance without any session state affinity (no such workaround for static members though). See #3651 (comment) for the details.

Note that, we officially support this "workaround": #3871

mklement0 added the Issue-Question ideally support can be provided via other mechanisms, but sometimes folks do open an issue to get a label May 26, 2020

iSazonov added the WG-Cmdlets-Core cmdlets in the Microsoft.PowerShell.Core module label Jun 1, 2020

SeeminglyScience mentioned this issue Jul 15, 2020

Custom classes and thread safe variables #13179

Closed

SeeminglyScience mentioned this issue Sep 6, 2020

ForEach-Object -Parallel Pipeline and other breakdowns #13588

Closed

SeeminglyScience mentioned this issue Aug 1, 2022

ForEach-Object -Parallel | Execution hangs when method in class calls other method in class #17806

Closed

5 tasks

SeeminglyScience mentioned this issue Sep 1, 2022

Unexpected behaviour calling nested methods in PS class passed to other thread #17957

Closed

5 tasks

daxian-dbw mentioned this issue Sep 20, 2022

Make PowerShell class not affiliate with Runspace when declaring the NoRunspaceAffinity attribute #18138

Merged

13 tasks

ghost added the In-PR Indicates that a PR is out for the issue label Sep 20, 2022

daxian-dbw closed this as completed in #18138 Sep 22, 2022

ghost added Resolution-Fixed The issue is fixed. and removed In-PR Indicates that a PR is out for the issue labels Sep 22, 2022

SeeminglyScience mentioned this issue Oct 10, 2022

Thread safe class method defined in Powershell hangs inside a Lazy inside a ConcurrentDictionary #18248

Closed

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ForEach-Object -Parallel situationally drops pipeline input #12801

ForEach-Object -Parallel situationally drops pipeline input #12801

mklement0 commented May 26, 2020 •

edited

Loading

SeeminglyScience commented May 27, 2020

SeeminglyScience commented May 27, 2020 •

edited

Loading

SeeminglyScience commented May 27, 2020 •

edited

Loading

SeeminglyScience commented May 27, 2020

mklement0 commented May 27, 2020

vexx32 commented May 27, 2020

SeeminglyScience commented May 27, 2020

mklement0 commented May 28, 2020

SeeminglyScience commented Jun 5, 2020 •

edited

Loading

SeeminglyScience commented Jun 5, 2020 •

edited

Loading

vexx32 commented Jun 5, 2020

SeeminglyScience commented Jun 5, 2020 •

edited

Loading

SeeminglyScience commented Jun 15, 2020

chrisbues commented Nov 2, 2021

PaulHigin commented Nov 2, 2021

CanineBarbie commented Sep 16, 2022

daxian-dbw commented Sep 16, 2022

ForEach-Object -Parallel situationally drops pipeline input #12801

ForEach-Object -Parallel situationally drops pipeline input #12801

Comments

mklement0 commented May 26, 2020 • edited Loading

Steps to reproduce

Expected behavior

Actual behavior

Environment data

SeeminglyScience commented May 27, 2020

SeeminglyScience commented May 27, 2020 • edited Loading

SeeminglyScience commented May 27, 2020 • edited Loading

SeeminglyScience commented May 27, 2020

mklement0 commented May 27, 2020

vexx32 commented May 27, 2020

SeeminglyScience commented May 27, 2020

mklement0 commented May 28, 2020

SeeminglyScience commented Jun 5, 2020 • edited Loading

SeeminglyScience commented Jun 5, 2020 • edited Loading

vexx32 commented Jun 5, 2020

SeeminglyScience commented Jun 5, 2020 • edited Loading

SeeminglyScience commented Jun 15, 2020

chrisbues commented Nov 2, 2021

PaulHigin commented Nov 2, 2021

CanineBarbie commented Sep 16, 2022

daxian-dbw commented Sep 16, 2022

mklement0 commented May 26, 2020 •

edited

Loading

SeeminglyScience commented May 27, 2020 •

edited

Loading

SeeminglyScience commented May 27, 2020 •

edited

Loading

SeeminglyScience commented Jun 5, 2020 •

edited

Loading

SeeminglyScience commented Jun 5, 2020 •

edited

Loading

SeeminglyScience commented Jun 5, 2020 •

edited

Loading