Talk:Unicode input: Difference between revisions

From Wikipedia, the free encyclopedia
Undid revision 978908051 by Peter M. Brown (talk) It is impossible for mod 256 to produce a number > 255, I assume this is vandalism
Tags: Undo Reverted
Undid revision 979070271 by Spitzak (talk) See Modular arithmetic#Examples. The numbers on both sides of the ≡ symbol can be greater than the modulus.
Tags: Undo Reverted

Revision as of 17:30, 18 September 2020

Windows EnableNumKeypad clarification

Can someone please add a note about how, when using the Windows hexadecimal entry method involving EnableNumKeypad and Alt + <+>, one enters the hexadecimal digits A through F, which are not on the numeric keypad? —Largo Plazo (talk) 13:09, 9 December 2009 (UTC)[reply]

I assume you mean the EnableHexNumpad statement? I'm sorry I cannot answer your question. Since the original reference source is down and I cannot reproduce the effect on my version of Windows (Windows 7) to verify its accuracy, I'm putting a dubious stamp on this particular section. --oKtosiTe talk 17:21, 4 December 2010 (UTC)[reply]
I've used this in Windows Vista (32- and 64-Bit versions) for a long time, so I can tell you that it does work. I just got and tried it in Windows 7 to no effect, but now that I've tried it again after several reboots and shut downs, it does seem to work. I'm guessing that a simple reboot is all that's required to make the registry change take effect.
Hexadecimal codes involving letters are entered using the standard letter keys. It's very inconvenient, but the functionality is there.
It works on Windows 7, but you do have to reboot after setting the registry key. I've updated the article and removed the "dubious" flag. —Preceding unsigned comment added by 213.246.131.69 (talk) 10:46, 6 January 2011 (UTC)[reply]
So, my keyboard--Windows setup has a state: Numpad does either decimal or hexadecimal (icw A-F keys) interpretation. Note that when I type "ALT + 92", this could be 92-hex. (By the way; there must be extra NumPad keyboard, with USB connection, that has & does all 16 hexes?) -DePiep (talk) 21:02, 6 January 2012 (UTC)[reply]

5-Digit codes

FYI... on the Mac, it appears you are limited to only characters in the Basic Multilingual Plane. I've not been able to find any information about inputting 5-digit codes for the supplementary planes. The Unicode Hex Input method works only with 4-digit codes.

I've added an explanation on how to do this on Mac OS. However I cannot find an authoritative source. Donlibes (talk) 03:46, 5 January 2012 (UTC)[reply]
In linux the same. In Windows???--Wickey-nl (talk) 15:18, 10 April 2011 (UTC)[reply]
On Windows (at least on Windows 7), Alt-x works on 4, 5 and 6 digit codepoints (i.e. any Unicode character). BabelStone (talk) 22:59, 10 April 2011 (UTC)[reply]
Did you install extra fonts? After doing so, I could use 5-digit codes on linux. Firefox seems to recognize the system fonts.
The quivira-font has quite a lot of characters.--Wickey-nl (talk) 20:02, 15 April 2011 (UTC)[reply]
Maybe it works on Windows 7. On Vista, however, you can definitely not enter 5- or 6-digit codepoints in this way.
DIBA--193.138.91.175 (talk) 12:01, 15 December 2011 (UTC)[reply]
Are you certain? I've just tested with WordPad on Windows XP, and using alt-x I was able to convert 1000, 10000, 20000 and 10FFFF to the corresponding Unicode characters (of course, without the appropriate fonts, they may appear as square boxes, but I verified that the codes really had been converted correctly). BabelStone (talk) 12:17, 15 December 2011 (UTC)[reply]
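As a rough illustration of the 4-, 5- and 6-digit cases tested above, the following Python sketch turns a hexadecimal code point string into a character and reports whether it lies outside the Basic Multilingual Plane. The helper name `describe_code_point` is made up for this example, and the sketch says nothing about how Alt-x itself is implemented in any Windows application.

```python
def describe_code_point(hex_digits: str) -> dict:
    """Summarise a hexadecimal code point string such as '10000'."""
    value = int(hex_digits, 16)
    if not 0 <= value <= 0x10FFFF:
        raise ValueError(f"{hex_digits} is outside the Unicode range")
    char = chr(value)
    # UTF-16 uses one 16-bit unit inside the BMP and a surrogate pair above it.
    utf16_units = len(char.encode("utf-16-be")) // 2
    return {
        "code_point": f"U+{value:04X}",
        "supplementary": value > 0xFFFF,
        "utf16_units": utf16_units,
    }


for digits in ("1000", "10000", "20000", "10FFFF"):
    print(describe_code_point(digits))
```

Only the first of the four test values is in the BMP; the other three need a surrogate pair in UTF-16, which is one plausible reason why an input method might handle them differently.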

Unicode.org

I notice that http://www.unicode.org/ (specifically http://www.unicode.org/Public/6.0.0/charts/CodeCharts.pdf -- warning: 75Mb file) is not referenced in either the Unicode or Alt-code pages and instead private sites are referenced. Does anyone know why this decision was made? — Preceding unsigned comment added by 24.77.26.31 (talk) 19:50, 2 December 2011 (UTC)[reply]

Because people like to promote their own site or their favourite site? The Unicode page does link to the official code charts (which is better to link to than http://www.unicode.org/Public/6.0.0/charts/CodeCharts.pdf as it will always reflect the latest version of Unicode, whereas the 75MB pdf will be out of date next spring). Personally I would remove links to all the private sites, and only link to the official Unicode code charts, as the private sites tend not to keep up to date with new versions of Unicode, but I got reverted when I tried to prune the external links on the Unicode page. BabelStone (talk) 12:26, 15 December 2011 (UTC)[reply]
I agree on using the Unicode links Babel's way, but I disagree on deleting other links. E.g. [1] has extra options, such as text search (single word in character names), and a full list (of, say, general category: Symbol, Other). Being out of date in the future is a minor point in the tradeoff (esp when going from 6.0.0 to 6.0.1 ;-) ). -DePiep (talk) 18:16, 15 December 2011 (UTC)[reply]

Request for clarification concerning hexadecimal code input in Microsoft Windows using the Alt key

The left Alt key works for entering Unicode characters, can't say anything about a right Alt key, as my keyboard doesn't have one. The AltGr key doesn't work for entering Unicode characters. I hope this clarification can be considered sufficient, and therefore remove the request for clarification from the article.K1812 (talk) 19:58, 18 August 2014 (UTC)[reply]

Concerning Unicode input in Microsoft Windows and request for citation

Concerning the request for citation: the Windows 8.1 registry initially doesn't have the value EnableHexNumpad, so if you want to enter Unicode characters the way that's described in this article, you need to edit the registry, add the string type value EnableHexNumpad, and assign the value data 1 to it. While editing, I erroneously removed your request for citation. If you don't consider the above explanation to be sufficient, please add the request for citation again. K1812 (talk) 20:24, 18 August 2014 (UTC)[reply]

Request for citation concerning Microsoft Windows versions

As I rewrote the paragraph, I accidentally deleted the request for citation. I have used the described method on Vista and Windows 8.1. Others have used it on Windows 7. I couldn't get it to work on Windows 95. I suppose the reason might have been that Win 95 initially might not have supported Unicode at all. There is some sort of Unicode add-on for Windows 95, but at the time, I couldn't even download it from Microsoft. Please add your request for citation again if you want more sources. K1812 (talk) 20:51, 18 August 2014 (UTC)[reply]

Edit by Loginnigol of 23 September 2016

@ Loginnigol: excuse me, but you removed important information from the article and made the instructions woolly instead. Instead of leaving the instruction that the user should add a value to a registry key in the article, you instruct the user to add a line to the registry. A line in the registry can mean another key or another value. It's important to distinguish between keys and values when editing the registry. Failing to do so can and will produce a mess. I have now restored the instruction that a value, and not a key, should be added by the user. --K1812 (talk) 06:44, 24 September 2016 (UTC)[reply]

What do we do about RFC 1345?

I have moved the "Character Mnemonics" section here from the "Unicode input" article. Although the section (here demoted to a subsection) has passing reference to Unicode 1.0, lumped together with "many other character sets", it doesn't bear much relation to Unicode specifically but rather to RFC 1345. The RFC 1345 character mnemonic for the Greek letter Λ, for example, is L*, which corresponds to nothing in Unicode. (The code point is U+039B and the HTML character entity name is "Lambda".)

The section does seem to be good encyclopedic stuff, but I don't have the background to create a new article around it or to know of existing articles that can incorporate it.

I have deleted the last sentence of the preceding section, "Unicode input#In platform-independent applications", which read:

The capability of Vim to create custom mnemonics, as described below, which could be employed on an ad-hoc basis, requires the decimal code point.

Please: someone with the relevant knowledge incorporate the material in Mainspace appropriately. Peter Brown (talk) 22:08, 29 November 2018 (UTC)[reply]

=== Character mnemonics ===

RFC 1345 defines a large number (1,893) of suggested mnemonics for code points in Unicode 1.0 (as well as characters in ISO 2DIS 10646 and many other character sets in use at the time of publication). Although the document does not restrict the length of a mnemonic (for example, "10000R" for U+2182), most (1,338) of the mnemonics are two characters long, and most (416) of the remainder are three characters long. While never complete, and targeting obsolescent set definitions, the mnemonics themselves can still be used.

  • Vim allows mnemonics entry (confusingly called "digraphs" by Vim developers) in insert mode (the regular mode for typing text) with Ctrl+K followed by a two-keystroke RFC 1345 mnemonic; or, in addition, if the digraph option is set, by entering the first character followed by a backspace followed by the second character. Custom mnemonics can also be defined for arbitrary code points. (For example, "dig Gr 9881" associates "Gr" with U+2699 GEAR.)
  • GNU Emacs allows mnemonics entry by switching to rfc1345 input mode (by default Ctrl+u Ctrl+\).
  • GNU Screen allows mnemonics entry with (by default) Ctrl+A Ctrl+V.
  • Zsh allows mnemonics entry using the insert-composed-char widget.

RFC 1345 predates the introduction of the Euro sign (€, U+20AC), but the above applications included it as the mnemonic "Eu".
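Since the custom Vim mnemonic in the first bullet takes a decimal code point, a small conversion helps when starting from the usual U+XXXX notation. The Python sketch below is only illustrative: the function name `vim_digraph_command` is invented here, and the output is just the text of the command quoted in the bullet, not something that talks to Vim.

```python
def vim_digraph_command(mnemonic: str, code_point: str) -> str:
    """Build the text of a Vim 'dig' command from a mnemonic and 'U+XXXX'."""
    if len(mnemonic) != 2:
        raise ValueError("a Vim digraph mnemonic has exactly two characters")
    # Strip the conventional 'U+' prefix and read the rest as hexadecimal.
    decimal = int(code_point.upper().replace("U+", ""), 16)
    return f"dig {mnemonic} {decimal}"


print(vim_digraph_command("Gr", "U+2699"))  # dig Gr 9881
```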

→Section moved by Peter Brown (talk) 22:08, 29 November 2018 (UTC)[reply]

I have added an abbreviated version of the Vim discussion (first bullet above) to the Unicode input#Decimal input subsection. Peter Brown (talk) 19:44, 30 November 2018 (UTC)[reply]

Here, I have reverted another editor's deletion of the section "Selection from a screen". According to the policy WP:BURDEN, however,

The burden to demonstrate verifiability lies with the editor who adds or restores material, and is satisfied by providing an inline citation to a reliable source that directly supports the contribution. (Emphasis added)

Though the section admittedly lacks the required citations, this is a burden I am unwilling to assume. I am strictly a Windows user, unfamiliar with macOS, Linux and BabelMap. Further, I never use selection from a screen in my own work. I have written an AutoHotkey script to handle em dashes and a few other characters; for anything else, I happily use Hexadecimal input techniques. I am not about to undertake a major research project into approaches that I have no intention of ever using.

So, should I self-revert, leaving "Unicode input" without the section "Selection from a screen", a section that has been part of the article since its creation in 2008? That's not acceptable either. Such selection is a technique for Unicode input, popular enough that several developers have created applets to support it. The lead paragraph lists it as an alternative. Without this section, the article would be seriously deficient.

Ideas? Will any of you, who do use the selection techniques or at least are curious about them, undertake to provide suitable citations? Or must I self-revert? In the latter case, I should probably propose that the entire article be deleted since, without the section "Selection from a screen", it fails to accomplish its purpose. Is there another approach?

Peter Brown (talk) 17:12, 3 December 2018 (UTC)[reply]

I restored the info with proper sources. TimTempleton (talk) (cont) 19:21, 5 December 2018 (UTC)[reply]

The .notdef box

We have used U+10FFFF in the hope that it is not used anywhere and thus will force display of a tofu block. But that codepoint is "private use area" and someone somewhere will use it eventually. Can anyone think of a better solution? Or just cross that bridge when we come to it? --John Maynard Friedman (talk) 09:20, 18 June 2020 (UTC)[reply]

I’d suggest using a non-character, e.g. the first one U+FDD0 “﷐”.
Further we’d better stop mixing up glyphs and food items except for real emoji. BTW why not call it (a slice of) pie? At least that has a dough crust around it. Tofu is actually filled, not empty, and while a .notdef box is white on white paper, there is still the black border left to account for. — Hnvnc (talk) 11:24, 18 June 2020 (UTC)[reply]
I think you've got a bento box in mind (though that starts full and ends empty and may have contained tofu :-) Thank you for changing the section title, I can't believe I wrote that, having challenged it as jargon only yesterday.
Yes. I support your solution. --John Maynard Friedman (talk) 12:53, 18 June 2020 (UTC)[reply]
U+10FFFF is in a PUA block, but it is in fact a non-character (like all characters ending FFFE or FFFF), so it should not occur in any conformant font. In fact it is less likely to be (mis)used than FDD0, so I think leaving it as U+10FFFF is best. BabelStone (talk) 13:57, 18 June 2020 (UTC)[reply]
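The rule mentioned above (the block U+FDD0 to U+FDEF, plus every code point ending in FFFE or FFFF) is easy to check mechanically. The following Python sketch uses an invented helper name, `is_noncharacter`, and is meant only to make the rule concrete.

```python
def is_noncharacter(code_point: int) -> bool:
    """Return True if code_point is one of the Unicode noncharacters."""
    if not 0 <= code_point <= 0x10FFFF:
        return False
    in_fdd0_block = 0xFDD0 <= code_point <= 0xFDEF
    # The last two code points of every plane end in FFFE or FFFF.
    ends_fffe_or_ffff = (code_point & 0xFFFE) == 0xFFFE
    return in_fdd0_block or ends_fffe_or_ffff


for cp in (0xFDD0, 0xFFFF, 0x10FFFD, 0x10FFFF):
    print(f"U+{cp:04X}: noncharacter = {is_noncharacter(cp)}")
```

Under this rule U+10FFFF and U+FDD0 are both noncharacters, while U+10FFFD is an ordinary private-use code point.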

It is a bit more complicated

Looking at Quotation mark#Unicode code point table on my Android phone using Chrome, for U+2E42 Double low reversed-9 etc, a simple empty box is displayed, but at U+1F676 Sans-serif heavy etc I see a box crossed with diagonal lines. So we haven't quite solved the issue, because it seems that there are actually two issues. I suspect that we may need both Hnvnc's solution and BabelStone's solution?--John Maynard Friedman (talk) 12:00, 20 June 2020 (UTC)[reply]

Curiouser and curiouser: Hnvnc's box is displayed on Android with two diagonal lines, not an empty box. --John Maynard Friedman (talk) 12:13, 20 June 2020 (UTC)[reply]

Would it be acceptable to use U+25AF WHITE VERTICAL RECTANGLE as a simulacrum? --John Maynard Friedman (talk) 12:27, 20 June 2020 (UTC)[reply]

No I tried that, it looks too different from the error indicator.Spitzak (talk) 18:18, 20 June 2020 (UTC)[reply]
Yes, I know, too tall and too narrow. But we don't have to reproduce it exactly, we can say "similar to ". It is enough that we convey the idea, IMO. --John Maynard Friedman (talk) 19:48, 20 June 2020 (UTC)[reply]

U+2E42: ⹂ U+1F676: 🙶 U+10FFFF: 􏿿 U+25af: ▯ U+2c00: Ⰰ U+FFFF: &#xffff; U+10FFFD: 􏿽 Spitzak (talk) 20:58, 20 June 2020 (UTC)[reply]

On mobile, I see valid characters for 2E42, 25AF, 2C00. All others render as box with diagonals except U+ffff which remained as &#xffff;. --John Maynard Friedman (talk) 22:41, 20 June 2020 (UTC)[reply]
As for why two different .notdef glyphs[1] may show up in the same application, I think it depends on what font the renderer got stuck with when giving up. — Hnvnc (talk) 11:54, 21 June 2020 (UTC)[reply]
FWIW, I have the same version of Chrome on both platforms (Android and Chrome OS). --John Maynard Friedman (talk) 13:38, 21 June 2020 (UTC)[reply]

Firefox

Using Firefox 77.0 on Win 10 and Spitzak's test line, I see valid characters for 2E42, 25AF, 2C00. All others render as a box with the hex squeezed in (two rows of three hex digits) except U+ffff which remained as &#xffff;. And the glyph displayed for U+25AF is short and fat, almost identical to the empty box shown by Chrome. --John Maynard Friedman (talk) 13:38, 21 June 2020 (UTC)[reply]

References

  1. ^ "Pet peeve: empty .notdef character". TypeDrawers. 2018-05-07. Retrieved 2020-06-21.

Decimal input (Windows)

This section is misleading. It implies that Alt+0nnn produces the Unicode code point at nnn₁₀. This is not true. The leading 0 only instructs the OS to choose the glyph from the currently loaded Windows code page. (If the 0 is omitted, it uses the OEM code page.) By coincidence, for users with US or UK keyboard mapping, there may be sufficient overlap with low-value Unicode for their purposes, but it is certainly not a generic Unicode input method. I suspect it encourages the misapprehension that the word "Unicode" means "Latin characters not available as standard on my keyboard".

I propose to delete this material unless someone can come up with a convincing reason to keep it. --John Maynard Friedman (talk) 09:08, 12 September 2020 (UTC)[reply]

Oppose:
Using Random.org, I picked eight 4-digit decimal numbers at random and converted them to hexadecimal. Using Wikibooks:Unicode/Character reference, I then looked each of them up to determine what character, if any, had that number as a code point. Next, using Wordpad, I tried Alt+nnnn on each of the eight.
On two of the eight, the character was undefined according to Wikibooks. For both of them, Wordpad produced a ☐. One other, U+1BD7, is a "Batak letter northern ta"; Wikibooks could not produce a glyph but only ᯗ and Wordpad yielded a ⍰. For all of the others, the character that Wordpad called up matched that from Wikibooks.
I emphasize that the numbers were chosen randomly. While there may be a few exceptions, it appears that whenever Alt+nnnn yields a character in Wordpad other than ☐ or ⍰, the character is the one associated with it by Unicode. That's a lot of numbers. It certainly suggests that using the Alt code with a character's decimal code point is a pretty reliable way of producing that character.
Yes, Unicode input § Decimal input could use some improvement. The statement that
Microsoft Windows can input at least some Unicode code points using decimal typed on the numeric keypad by using Alt codes
is correct, though an understatement; Windows can input most code points that actually correspond to printable characters that way, at least with code points up to decimal 9999. It is necessary to input at least four digits, so a leading zero is needed for numbers less than 1000. The technique also doesn't work for Unicode control characters such as characters with decimal codes 0–31 or 128–159.
Peter Brown (talk) 18:29, 12 September 2020 (UTC)[reply]
Then it needs to be rewritten to state clearly that codepage 1252 creates invalid (to Unicode) binary values for characters that Microsoft has reassigned to the range 0080–009F and this makes documents that use them incomprehensible to other platforms.
  • For example, dagger and double-dagger, † and ‡, have the Unicode code points 2020₁₆ and 2021₁₆ (8224₁₀ and 8225₁₀) but CP1252 assigns them to 86₁₆ and 87₁₆ (134₁₀ and 135₁₀). Thus if a Windows user enters Alt+0134, a dagger symbol will be displayed and printed on their Windows machine but the file thus created will be intelligible only to another user with Windows and CP1252. The reality is that the user has not created a Unicode code point: indeed what they have encoded is not a valid character at all because it lies in the x80 to x9F 'reserved for control-codes' block.
  • But maybe not many people use the dagger symbol, so how about the euro symbol, €? Its Unicode code point is 20AC₁₆ (8364₁₀) but Windows CP1252 assigns it to 80₁₆ (128₁₀). And perhaps your nicely formatted press release also uses curly quotes? If your publicist uses a Mac or your typesetter uses a *nix system, then you just look illiterate or incompetent or both.
It also needs to say that it can't deliver characters with numbers above 255₁₀ (FF₁₆). So no Eastern European haceks or macrons, overdots, underdots, comma-below, let alone Greek or Cyrillic. (And the explanation needs to be written without confusing the numeracy-challenged with incomprehensible talk of modulo 255.)
It also needs to say that if you are in Japan or China or India or Russia and so have an entirely different Windows code-page default, then your Alt+0nnn will produce something completely different. --John Maynard Friedman (talk) 22:28, 12 September 2020 (UTC)[reply]
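The difference between a CP1252 byte value and a Unicode code point, as described in the bullets above, can be made concrete with Python's built-in codecs. This is a minimal sketch; the helper name `cp1252_to_unicode` is invented for the example, and it only shows the mapping between the two numberings, not what any particular Windows application writes to a file.

```python
def cp1252_to_unicode(byte_value: int) -> int:
    """Return the Unicode code point that CP1252 assigns to a single byte."""
    return ord(bytes([byte_value]).decode("cp1252"))


for byte_value in (0x80, 0x86, 0x87):
    code_point = cp1252_to_unicode(byte_value)
    print(
        f"CP1252 byte {byte_value:#04x} ({byte_value}) is "
        f"U+{code_point:04X} ({code_point}) {chr(code_point)}"
    )

# Read as a Unicode code point directly, 0x86 is a C1 control, not a dagger.
print(repr(chr(0x86)))
```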
Unicode input § Decimal input is indeed misleading, but not in the way suggested. It is not necessary that the decimal code point start with a zero; rather, as I noted in my previous post, "It is necessary to input at least four digits, so a leading zero is needed for numbers less than 1000." It is also necessary that code points less than 100 start with two leading zeros. The section is easily corrected to state the requirement correctly. No mention of CP1252 is necessary or even useful.
Unicode input is only concerned with methods to input characters given their Unicode code points. The dagger has a decimal code point of 8224, so a technique recommended by the article, when corrected, will be to enter Alt+8224. This works and, so far as I know, is independent of the code page. Yes, there is another technique, one relying on CP1252, but that in no way invalidates the technique, properly stated. Agreed, the user following the CP1252 procedure has not "created a Unicode code point" — code points are numbers, according to the Unicode standard and numbers are not created entities. Does U+0086 not encode a valid character? It's not a printable character, but it does lie within the subject matter of the Wikipedia Unicode control characters article, so there's certainly a case to be made for its being a character, specifically one designating "Start of Selected Area".
"How about the Euro Symbol €?", you ask. Same point: properly updated, Unicode input § Decimal input will tell us, correctly, that it can be produced by Alt+8364. Curly quotes? Alt+8216 through Alt+8223. Also macrons, such as the combining macron Alt+0772, which does have a leading zero. Greek and Cyrillic, such as α Alt+0945 and Д Alt+1044. And Japanese characters, like U+FA30, requiring five decimal digits: Alt+64048.
Peter Brown (talk) 02:19, 13 September 2020 (UTC)[reply]
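The rule stated above (use the decimal code point, padded with leading zeros to at least four digits) can be sketched in Python as follows. The function name `alt_code` is hypothetical, and the sketch only formats the keystroke; whether a given application honours it is exactly what this thread is debating.

```python
def alt_code(char: str) -> str:
    """Format the decimal Alt code for one character, padded to four digits."""
    if len(char) != 1:
        raise ValueError("expected exactly one character")
    return f"Alt+{ord(char):04d}"


for char in "†€αДπ":
    print(char, alt_code(char))
```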

Decimal input (Windows) Part 2

I bow to your more extensive knowledge and trust that you will clarify the article accordingly.
You say that the reference to CP1252 is not needed. So why is it that a user with Japanese layout gets something other than £ after typing Alt+0163? Does that not disprove your rule? 163₁₀ is certainly the correct Unicode value for the code point but Windows is delivering something from the 163rd slot in its Japanese code page which is definitely not £.--John Maynard Friedman (talk) 16:33, 13 September 2020 (UTC)[reply]
I've updated the article; please take a look at it. My claim is limited to Microsoft Word and Wordpad; it also works on LibreOffice Writer but not for Notepad, Chrome, or Firefox. What application is your Japanese friend using? Peter Brown (talk) 20:17, 13 September 2020 (UTC)[reply]
Said Japanese friend here. As discussed here I am indeed trying to produce £ in a plain-text context, such as Notepad, a text input box, or this Wiki editing area. When my 'keyboard' is set to Japanese (be it 'Japanese keyboard' or Microsoft IME - or indeed Chinese pinyin for that matter), Alt+0163 does not work (it produces 」), and if I change to the Thai Kedmanee keyboard I get ฃ. If there were a 4- or even 5-digit code that worked (at one stage I had hopes for Alt+6556), that would be great, but what I currently see is that unless I switch the keyboard layout to e.g. UK or US and then use Alt+0163 (or Shift+3 in the UK keyboard), there is no simple way to input this Unicode character into such a text area. Ozaru (talk) 18:20, 14 September 2020 (UTC)[reply]
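The results reported above are consistent with the number 163 being read as a single byte in whichever code page is active. The Python sketch below assumes that interpretation and uses the codec names `cp1252`, `cp932` and `cp874` as stand-ins for the Western European, Japanese and Thai Windows code pages; the helper `byte_in_code_page` is made up for this illustration.

```python
def byte_in_code_page(byte_value: int, code_page: str) -> str:
    """Decode one byte under the named code page and return the character."""
    return bytes([byte_value]).decode(code_page)


for code_page in ("cp1252", "cp932", "cp874"):
    char = byte_in_code_page(163, code_page)
    print(f"{code_page}: byte 163 is U+{ord(char):04X} {char}")
```

Only under CP1252 does byte 163 coincide with the Unicode code point U+00A3 for £; the other two code pages map it to a halfwidth corner bracket and a Thai letter respectively.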
Of the applications you list, you're right: they provide no simple way to produce a £, at least none I know of. Of course, entering &pound; in the Wiki edit box will produce a £ in the resolved text, but that's not what you're after. Peter Brown (talk) 19:25, 14 September 2020 (UTC)[reply]
@Ozaru: Have you considered using a script language? I have an AutoHotkey script that runs by default; I use it for em dashes among many other things. The AutoHotkey script to make Ctrl+F produce a £ would be just ^f::£. Peter Brown (talk) 00:37, 15 September 2020 (UTC)[reply]
There are plenty of workarounds (e.g. Windows+Space to switch to UK/US, Alt+0163 then Windows+⇧ Shift+Space to switch back; or phonetically entering ぽんど into the IME and hitting Space one or more times to select the right symbol, or Autohotkey as you say). The issue is more that despite the best intentions of moving from 8-bit SBCS to 16-bit DBCS and standardizing with Unicode while computers themselves become 32 and 64-bit... it still seems impossible to break free from the 8-bit codepage legacy, which I find incredible. It's amazing (not to say inconvenient) that even now, VBA Editor doesn't support Unicode, Excel can't save Unicode CSV files, and basic Windows 10 dialogs etc. don't have a simple, in-built way for Unicode input. So much for I18N. Ozaru (talk) 05:55, 15 September 2020 (UTC)[reply]
Couldn't put it better myself (I didn't!). As already noted, the £ glyph is just a random example, the issue is widespread. Which takes me back to my first challenge to the section. It is worse than misleading while it remains unqualified. --John Maynard Friedman (talk) 16:05, 15 September 2020 (UTC)[reply]
"... while it remains unqualified." Sorry, what is "it"? My revised wording begins, "Some programs running in Microsoft Windows, including Word and Wordpad ...". Isn't that sufficient qualification? I don't see a need, here, to mention that whether one can produce £, Ð, etc. on Notepad or VBA depends on the code page in effect. Peter Brown (talk) 19:05, 15 September 2020 (UTC)[reply]
"it" = "the text". The text that says that this method works when the real story is "it depends". Setting ever tighter parameters so that we can continue to say that it works is being "economical with the truth". We need to say that the method doesn't work reliably for keyboard settings outside the Americas, Western Europe, Southern Africa, A&NZ and (former) Western European colonies. IMO. --John Maynard Friedman (talk) 20:07, 15 September 2020 (UTC)[reply]
I think it can be stated this way:
In some cases Microsoft extended the Altcode inputs so that Unicode code points could be typed as decimal numbers.
For the numbers 0–255 the user had to type a leading zero (so that the "ANSI" code page was used) and also the ANSI code page had to be set to something that matched the first 256 characters of Unicode for all useful characters (CP1252).
For numbers greater than 255 there were numerous different results, depending on the software being used and the version of Windows:
  • The number had to be prefixed with a zero to work.
  • At least 4 digits had to be typed (i.e. a leading zero for n ≤ 999) to work.
  • The numbers did not work at all (usually producing the character for n modulo 256).
  • Numbers greater than 65535 might not work even if smaller numbers do.

Spitzak (talk) 20:58, 15 September 2020 (UTC)[reply]

Re Spitzak's four bullet points:
  • In Wordpad, Alt+960 and Alt+0960 both produce a π, which is the correct Unicode character. The high-order zero doesn't matter.
  • Same counterexample. Alt+960 works just fine.
  • 960 ≡ 448 modulo 256, but in Word and Wordpad Alt+448 and Alt+0448 both produce, not π, but the dental click ǀ. Modulo 256 has nothing to do with it.
  • Numbers greater than 65535 might not work? I've produced two cases of numbers that big that do work (one here and one in the article). Why is Spitzak so suspicious of the others?
I agree with John Maynard Friedman, above, that we should not confuse "the numeracy-challenged with incomprehensible talk of modulo 255," assuming that he really means 256. Spitzak evidently disagrees, as he has introduced such considerations into the article. However, Unicode input is, or should be, entirely concerned with Unicode input, with ways to produce characters when one knows their code points. Modulo 256, applicable to Notepad, outgoing Gmails, etc. could be discussed in the Alt code article, but it is not relevant here, because
  • discussion is limited to Word and Wordpad as well as similar programs like LibreOffice Writer, and
  • for Unicode input purposes, the only point of knowing about equivalence modulo 256 (if it worked in Word etc.) is that, if one thought the number 666 accursed, one could produce the character ʚ using 154 or 410.
Peter Brown (talk) 01:47, 17 September 2020 (UTC)[reply]
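The arithmetic behind the modulo 256 remarks in this thread can be checked directly. The following Python sketch defines a hypothetical helper, `congruent`, and confirms that two numbers can be congruent modulo 256 even when both exceed 255, while still naming different Unicode characters.

```python
def congruent(a: int, b: int, modulus: int = 256) -> bool:
    """Return True if a and b leave the same remainder modulo modulus."""
    return a % modulus == b % modulus


print(960 % 256, 448 % 256)            # both remainders are 192
print(congruent(960, 448))             # True
print(chr(960), chr(448), chr(192))    # three different characters
print(666 % 256, congruent(666, 410))  # 154 True
```

Congruence is a relation between numbers, not a reduction to the range 0 to 255, which is why 960 ≡ 448 and 960 ≡ 192 (mod 256) are both true statements.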