Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add progress to Copy-Item #18735

Merged
merged 15 commits into from
Jan 25, 2023
Merged

Conversation

SteveL-MSFT
Copy link
Member

@SteveL-MSFT SteveL-MSFT commented Dec 6, 2022

PR Summary

To show useful progress, this change requires collecting total files/bytes information first. The initial collection of total files and bytes is on a separate thread so no impact to start of copying. This should address the primary concern from the community.

For the PowerShell repo (about 705 MB and > 9000 files), it took 16.5s with this change, and 15.7s without this change. This was not a scientific measurement.

Replaces #18728

Screen-Recording-2022-12-07-at-5

There's also a small fix to progress display where 0 percent complete was showing a full bar.

PR Context

Fix #18637
Fix #18850

PR Checklist

@ghost ghost assigned iSazonov Dec 6, 2022
@SteveL-MSFT SteveL-MSFT changed the title Add progress to Copy-Item for local files WIP: Add progress to Copy-Item for local files Dec 6, 2022
@SteveL-MSFT SteveL-MSFT added WG-Interactive-Console the console experience WG-Cmdlets general cmdlet issues labels Dec 6, 2022
@SteveL-MSFT
Copy link
Member Author

The main community feedback appears to want to opt into progress. However, the current design pattern for PowerShell is to control this via $ProgressPreference. Perhaps the Cmdlets WG should discuss this.

@kilasuit
Copy link
Collaborator

kilasuit commented Dec 7, 2022

This to me looks better & if we could in future on per cmdlets change progress preferences like Error action etc this would be a nicer opt out mechanism which I'll raise another issue to discuss this idea shortly

@iSazonov
Copy link
Collaborator

iSazonov commented Dec 7, 2022

My comment #18728 (comment) is still actual.
I believe it makes no sense to pre-calculate the total size and number of files. (How about implementing Measure-Items?) These values could have been useful in the complete message, but we remove it immediately. Also the design is not address "freezing" on large files. This can be achieved by simple asynchronous copying, as well as the initial delay in the display, which is already implemented in some MSFT modules.

@SteveL-MSFT
Copy link
Member Author

I've moved the calculation of the totals to a new thread which removes the concern about impacting the start of copying.

@SteveL-MSFT SteveL-MSFT changed the title WIP: Add progress to Copy-Item for local files Add progress to Copy-Item for local files Dec 7, 2022
@SteveL-MSFT
Copy link
Member Author

The @PowerShell/wg-powershell-cmdlets discussed this and agrees on the current design which is to asynchronously collect the total files/bytes information at the start of copying. However, estimating time remaining has never proven to be accurate and may raise more concerns so we recommend not including the time remaning.

@JustinGrote
Copy link
Contributor

JustinGrote commented Dec 7, 2022

I'm good with it, and I think progress should be default, its convention to expect some commands to emit progress and some not, I can't imagine there's a lot of breaking changes where someone would be explicitly capturing the progress stream and sudden introduction of copy-item results into that stream would break their flow, it would have to be very very niche. I should note I am one of those theoretically "niche" people as the Pester VSCode extension breaks output into streams, but I just silently discard progress as its not useful so this wouldn't break me for instance.

@bergmeister
Copy link
Contributor

bergmeister commented Dec 7, 2022

This change could benefit from the utils function DisplayHumanReadableFileSize that I wrote for my Invoke-Webrequest progress bar to calculate file size unit (KB, MB, GB, etc) so that it's not hard coded to MB, see my PR #14611

@kilasuit
Copy link
Collaborator

kilasuit commented Dec 7, 2022

The @PowerShell/wg-powershell-cmdlets discussed this and agrees on the current design which is to asynchronously collect the total files/bytes information at the start of copying. However, estimating time remaining has never proven to be accurate and may raise more concerns so we recommend not including the time remaning.

@SteveL-MSFT - Re the last point I'd say that in line with other tooling this gives only an estimation and is how we'd expect it based on all the potential reasons why it may not finish as quick as we'd like, which inc CPU, network, disk, or memory contentions that are out of most users control. This at least on windows can be overriding by what priority we give the process running this task, which comes with its own set of risks but is one that allows additional flexibility going forward. Was that aspect considered? Because if not it may be good to review again.

@kilasuit
Copy link
Collaborator

kilasuit commented Dec 7, 2022

@bergmeister it already should benefit from it I think thou @SteveL-MSFT or others may correct me if I am wrong here

@JustinGrote
Copy link
Contributor

The @PowerShell/wg-powershell-cmdlets discussed this and agrees on the current design which is to asynchronously collect the total files/bytes information at the start of copying. However, estimating time remaining has never proven to be accurate and may raise more concerns so we recommend not including the time remaning.

@SteveL-MSFT - Re the last point I'd say that in line with other tooling this gives only an estimation and is how we'd expect it based on all the potential reasons why it may not finish as quick as we'd like, which inc CPU, network, disk, or memory contentions that are out of most users control. This at least on windows can be overriding by what priority we give the process running this task, which comes with its own set of risks but is one that allows additional flexibility going forward. Was that aspect considered? Because if not it may be good to review again.

File progress estimations are notoriously difficult to do, moreso on spinning rust than SSDs, because little files take exponentially more time per GB than big files due to the overhead of transfer setup/teardown. Estimates can be based on how many files left, how much throughput, or as advanced as "bucketing" certain file sizes into groups and taking a weighted average of the gb/sec transfer time of those files. Any way you slice it the estimate can be thrown off very easily, so while I personally don't see any harm to it, someone who doesn't understand why the estimates vary so wildly or are inaccurate may not be so forgiving.

@SteveL-MSFT SteveL-MSFT changed the title Add progress to Copy-Item for local files WIP: Add progress to Copy-Item for local files Dec 8, 2022
@SteveL-MSFT
Copy link
Member Author

SteveL-MSFT commented Dec 8, 2022

Took care of most of the feedback.

@SteveL-MSFT
Copy link
Member Author

@kilasuit I believe the latest update addresses your concerns. See the updated gif.

@bergmeister Updated using your humanizer code.

@SteveL-MSFT SteveL-MSFT changed the title WIP: Add progress to Copy-Item for local files Add progress to Copy-Item Dec 8, 2022
@kilasuit
Copy link
Collaborator

kilasuit commented Dec 9, 2022

@SteveL-MSFT I'd still like to see an estimated amount of time just like in file explorer copy or just like most other copying utilities out there. though perhaps this could be added via an additional parameter instead of being the default?

@bergmeister
Copy link
Contributor

@SteveL-MSFT I'd still like to see an estimated amount of time just like in file explorer copy or just like most other copying utilities out there. though perhaps this could be added via an additional parameter instead of being the default?

Are you sure 🤔 😅
Screenshot_20221209_194401_Chrome

@kilasuit
Copy link
Collaborator

kilasuit commented Dec 10, 2022

As funny as that is @bergmeister I am sure

Especially if you look at things like estimated download times in any other software (which is essentially the same thing, just a different way of thinking of it)

The reason being is that you can by seeing that more easily work out, when you might be able to come back to that series of tasks and start the next one, which for many in interactive scenarios is what they need to be able to manage all of their many spinning plates.

@kilasuit
Copy link
Collaborator

kilasuit commented Jan 24, 2023

@iSazonov feedback noted. For the latter, we do have existing $ProgressPreference that should be used in automation and we also have the new -ProgressAction as well.

We need something magic for interactive scenarios in which all interactive features must be deactivated. We have NonInteractive command line parameter - can we use it for this purpose? For example, if the script is run by Task Scheduler.

I agree that we should look to expand how non-interactive works & this should be discussed in an new issue @iSazonov do you want to raise it?

@pull-request-quantifier-deprecated

This PR has 59 quantified lines of changes. In general, a change size of upto 200 lines is ideal for the best PR experience!


Quantification details

Label      : Small
Size       : +57 -2
Percentile : 23.6%

Total files changed: 4

Change summary by file extension:
.cs : +51 -2
.resx : +6 -0

Change counts above are quantified counts, based on the PullRequestQuantifier customizations.

Why proper sizing of changes matters

Optimal pull request sizes drive a better predictable PR flow as they strike a
balance between between PR complexity and PR review overhead. PRs within the
optimal size (typical small, or medium sized PRs) mean:

  • Fast and predictable releases to production:
    • Optimal size changes are more likely to be reviewed faster with fewer
      iterations.
    • Similarity in low PR complexity drives similar review times.
  • Review quality is likely higher as complexity is lower:
    • Bugs are more likely to be detected.
    • Code inconsistencies are more likely to be detected.
  • Knowledge sharing is improved within the participants:
    • Small portions can be assimilated better.
  • Better engineering practices are exercised:
    • Solving big problems by dividing them in well contained, smaller problems.
    • Exercising separation of concerns within the code changes.

What can I do to optimize my changes

  • Use the PullRequestQuantifier to quantify your PR accurately
    • Create a context profile for your repo using the context generator
    • Exclude files that are not necessary to be reviewed or do not increase the review complexity. Example: Autogenerated code, docs, project IDE setting files, binaries, etc. Check out the Excluded section from your prquantifier.yaml context profile.
    • Understand your typical change complexity, drive towards the desired complexity by adjusting the label mapping in your prquantifier.yaml context profile.
    • Only use the labels that matter to you, see context specification to customize your prquantifier.yaml context profile.
  • Change your engineering behaviors
    • For PRs that fall outside of the desired spectrum, review the details and check if:
      • Your PR could be split in smaller, self-contained PRs instead
      • Your PR only solves one particular issue. (For example, don't refactor and code new features in the same PR).

How to interpret the change counts in git diff output

  • One line was added: +1 -0
  • One line was deleted: +0 -1
  • One line was modified: +1 -1 (git diff doesn't know about modified, it will
    interpret that line like one addition plus one deletion)
  • Change percentiles: Change characteristics (addition, deletion, modification)
    of this PR in relation to all other PRs within the repository.


Was this comment helpful? 👍  :ok_hand:  :thumbsdown: (Email)
Customize PullRequestQuantifier for this repository.

@SteveL-MSFT
Copy link
Member Author

@iSazonov I would suggest we take this PR so we can get user feedback. The risk of adding additional fields (as there's already existing ones in FileSystemProvider) is minimal. You have valid concerns about non-interactive scenarios and should be discussed in a new issue. Note that if you use pwsh -noninteractive then prompt is disabled. However, progress specifically doesn't block noninteractive use and some users may want to see progress. Progress can already be suppressed via $ProgressPreference.

@iSazonov iSazonov merged commit 1c27a0c into PowerShell:master Jan 25, 2023
@iSazonov
Copy link
Collaborator

@iSazonov perhaps you should open a new issue where -noninteractive results in different defaults for the preference variables? It would be a breaking change, however.

@iSazonov feedback noted. For the latter, we do have existing $ProgressPreference that should be used in automation and we also have the new -ProgressAction as well.

We need something magic for interactive scenarios in which all interactive features must be deactivated. We have NonInteractive command line parameter - can we use it for this purpose? For example, if the script is run by Task Scheduler.

I agree that we should look to expand how non-interactive works & this should be discussed in an new issue @iSazonov do you want to raise it?

Since ProgressBar is considered unimportant and unimportant, based on previous experience I have no desire to fall into lifeless discussions and especially unanswered PRs.

@iSazonov iSazonov added the CL-General Indicates that a PR should be marked as a general cmdlet change in the Change Log label Jan 25, 2023
@SteveL-MSFT SteveL-MSFT deleted the copy-item-progress-2 branch January 25, 2023 14:28
@ghost
Copy link

ghost commented Mar 14, 2023

🎉v7.4.0-preview.2 has been released which incorporates this pull request.:tada:

Handy links:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CL-General Indicates that a PR should be marked as a general cmdlet change in the Change Log Small WG-Cmdlets general cmdlet issues WG-Interactive-Console the console experience
Projects
None yet
8 participants