Invoke-WebRequest and Invoke-RestMethod do not decode content in accordance with BOM/Content-Type #11547
@he852100 Please add info about your PowerShell version. Can you repro with the latest PowerShell Core build?
PSVersion 7.0.0-daily.20200110
PSEdition Core
GitCommitId 7.0.0-daily.20200110
OS Linux 3.10.0-1062.9.1.el7.x86_64 …
Platform Unix
PSCompatibleVersions {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion 2.3
SerializationVersion 1.1.0.1
WSManStackVersion 3.0
sh> Invoke-WebRequest 'https://pscoretestdata.blob.core.windows.net/v7-0-0-daily-20200110/powershell-7.0.0-daily.20200110-linux-arm64.tar.gz' -O ~/powershell.tar.gz -Resume
StatusCode : 416
StatusDescription : RequestedRangeNotSatisfiable
Content : <?xml version="1.0" encoding="utf-8"?><Error><Code>InvalidRange</Code><Message>The rang
e specified is invalid for the current size of the resource.
RequestId:e8b88225-401e-0127-7cdc-c866f8000000
PS /root> $a.headers.GetEnumerator()
Key Value
--- -----
Server {Windows-Azure-Blob/1.0, Microsoft-HTTPAPI/2.0}
x-ms-request-id {322455bd-301e-008d-77e3-c8f642000000}
x-ms-version {2014-02-14}
Date {Sun, 12 Jan 2020 00:56:33 GMT}
Content-Length {249}
Content-Type {application/xml}
Content-Range {bytes */46486387}
PowerShell obeys the standard by assuming ISO-8859-1, but unfortunately the site is using UTF-8.
@iSazonov It appears that PowerShell does not recognize the UTF-8 BOM.
@he852100 I guess it comes from .NET Core.
That comes from PS5 and older. If the website is saying, I'm
Note: I don't know what the intended behavior is, but here is what seems to be happening:
Because the response doesn't indicate a character encoding (the Content-Type header is application/xml, with no charset), PowerShell assumes ISO-8859-1.
Because it blindly assumes ISO-8859-1, the UTF-8 BOM is read as data, and the payload is therefore not recognized as XML, which falls back to a(n incorrectly decoded) string instead of returning a parsed XML document.
Note that the current RFC, RFC 7231, no longer mandates an overall default and instead defers to the default encoding of the given media type. Given that HTML5 now also defaults to UTF-8 and given that RFC 2616 is obsolete, we should consider changing the decoding logic in both Invoke-WebRequest and Invoke-RestMethod.
Currently we have many workarounds. I guess they come from PS 5.0.
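The failure mode described above can be reproduced outside PowerShell. The following is a minimal Python sketch, not the cmdlets' actual code path: the sample body is a hypothetical, shortened stand-in for the Azure error response, and `parses_as_xml` is a helper introduced only for this illustration. It shows that decoding a BOM-prefixed UTF-8 payload as ISO-8859-1 turns the three BOM bytes into visible characters, which then prevents the text from parsing as XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample body: UTF-8 BOM followed by a short XML error document.
body = (
    b"\xef\xbb\xbf"
    b'<?xml version="1.0" encoding="utf-8"?>'
    b"<Error><Code>InvalidRange</Code></Error>"
)

# Decoding as ISO-8859-1 maps each BOM byte to a visible character.
wrong = body.decode("iso-8859-1")

# The "utf-8-sig" codec decodes UTF-8 and drops a leading BOM if present.
right = body.decode("utf-8-sig")


def parses_as_xml(text):
    """Return True if the text is well-formed XML (illustrative helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(repr(wrong[:8]))       # BOM bytes show up as data before "<?xml"
print(repr(right[:8]))       # starts directly with "<?xml"
print(parses_as_xml(wrong))  # garbage before the XML declaration
print(parses_as_xml(right))
```

The stray characters in front of the XML declaration are enough to make a strict XML parser reject the payload, which matches the fallback to a plain string reported in this issue.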
That's promising, @iSazonov, but it looks like the referenced method gives precedence to the charset specified in the Content-Type header over a BOM.
This is the reverse of how XML data is supposed to be handled according to RFC 7303 (leaving the additional need to respect an encoding in the XML declaration aside), and, arguably, for all textual media types, according to section "5. Security Considerations" of RFC 6657.
A BOM is an instance of in-band information, whereas the charset in the header is out-of-band information. Therefore, the method you link to wouldn't solve the problem described in #12861, for instance.
It looks like a .NET bug. You could open a new issue in the .NET Runtime repo. In general, I guess we could simplify the PowerShell code if we followed the .NET API.
@PowerShell/wg-powershell-cmdlets reviewed this. We agree that the BOM should take precedence and, where it makes sense, the web cmdlets should have the same behavior as curl. We're explicitly not making any statement about implementation.
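To make the agreed precedence concrete, here is a hedged Python sketch of BOM-first decoding. It is not the PowerShell or .NET implementation, and the working group deliberately left implementation open. The names `decode_body` and `charset_from_content_type` are hypothetical, and the UTF-8 fallback for responses with neither a BOM nor a charset is an assumption taken from the suggestion earlier in this thread, not a settled decision.

```python
import codecs
from email.message import Message

# Longer signatures first: the UTF-32 LE BOM begins with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def charset_from_content_type(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def decode_body(raw, content_type, default="utf-8"):
    """Decode response bytes: BOM first, then header charset, then default.

    The default of UTF-8 is an assumption made for this sketch.
    """
    # 1. In-band information: a BOM wins over anything the header says.
    for bom, encoding in _BOMS:
        if raw.startswith(bom):
            return raw[len(bom):].decode(encoding)
    # 2. Out-of-band information: the charset parameter, if any.
    # 3. Otherwise fall back to the assumed default.
    encoding = charset_from_content_type(content_type) or default
    return raw.decode(encoding)


# A BOM-prefixed body decodes cleanly even though the header names no charset.
print(decode_body(b"\xef\xbb\xbf<Error/>", "application/xml"))
```

Checking the BOM before the header means a mislabeled or unlabeled response, such as the application/xml reply in this issue, still decodes correctly. A fuller version would also need to consider the encoding named in an XML declaration, which this sketch leaves out.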
The BOM is not recognized, so the processed content is garbled.
Example
Expected
Results
Reading the saved file seems to work without problems.
curl