I really appreciate how constructive the discussion on this forum is, and thank you for letting me be part of it; I’m trying to respect it even if I write too many words!
A big improvement in Hopscotch’s string handling would be to ensure that operations on strings are in terms of code points, rather than code units. For the symbols that are available in the emoji keyboard in HS, I think that would maintain the abstraction of “strings” as sequences of “characters” that can be typed into a string and extracted from a string individually.
Here’s what I mean by “character in … at” returning only the characters that I entered. When I look at a keyboard, I see one symbol per key, whether that’s an “A” or a smiley face emoji. I hit a key once, and one symbol shows up. A string is a sequence of those symbols, and the length of the string should be the number of symbols. If I type three symbols and save them as a string in the variable “Label”, then “character in (Label) at (0)” should be the first symbol I typed, “… at (1)” the 2nd symbol, and “… at (2)” the 3rd. I shouldn’t have to care about the difference between simple letters and emojis.
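To make the difference concrete, here is a rough sketch in Python (not Hopscotch code). It assumes the string is stored as UTF-16 code units, which is my guess at what is happening under the hood; `utf16_units` and `label` are names I made up for illustration.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as 4-digit hex strings."""
    raw = text.encode("utf-16-le")
    # Each code unit is 2 bytes, stored low byte first; reverse to read it as hex.
    return [raw[i:i + 2][::-1].hex() for i in range(0, len(raw), 2)]

# Three symbols typed on the keyboard: "A", a smiley face (U+1F600), "B".
label = "A\U0001F600B"

# Counting by code units: the smiley takes two slots, so "length" is 4
# and index 1 holds only half of the emoji.
units = utf16_units(label)
print(len(units), units)

# Counting by code points (what Python strings do): length is 3
# and index 1 is the whole smiley.
print(len(label), label[1] == "\U0001F600")
```

With code-unit indexing, “at (1)” lands on the first half of the smiley and “at (2)” on the second half, so “B” ends up at index 3 instead of 2. With code-point indexing, the three indexes match the three symbols I typed.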
There may be a regex for detecting multi-code-unit characters, but regex is its own steep learning curve.
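For what it’s worth, here is one way such a regex could look, sketched in Python rather than in anything Hopscotch offers. Python’s `re` works on code points, so the pattern simply matches every code point above U+FFFF, which are the ones that need two UTF-16 code units. In a language that indexes by code units, the pattern would instead look for a high surrogate followed by a low surrogate. `MULTI_UNIT` and `multi_unit_chars` are hypothetical names.

```python
import re

# Code points from U+10000 to U+10FFFF need two UTF-16 code units each.
MULTI_UNIT = re.compile("[\U00010000-\U0010FFFF]")

def multi_unit_chars(text):
    """Return every character in text that needs two UTF-16 code units."""
    return MULTI_UNIT.findall(text)

# "A", smiley (U+1F600), "B", heavy black heart (U+2764, a single code unit).
found = multi_unit_chars("A\U0001F600B\u2764")
print(len(found))
```

Only the smiley is reported here, because the heart at U+2764 fits in one code unit. That is exactly the kind of detail I wouldn’t expect a Hopscotch user to have to know.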
There is another layer of Unicode complexity, which I’m not expecting to be part of Hopscotch’s string handling anytime soon. Sometimes the thing you’d look at and call a single symbol is more complicated, like a letter with a special accent. The letter and the accent might each be one code point, so now the symbol (a “grapheme cluster”) is composed of multiple code points, which in turn need multiple code units. So even more sophisticated string parsing could be in terms of grapheme clusters rather than code points. But since this steps outside what you can easily access from the HS keyboard, it seems more complicated than necessary.
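A small Python sketch of the accent case, again just for illustration. It only glues combining marks onto the character before them; real grapheme cluster rules cover many more cases, so treat `naive_clusters` as a hypothetical, simplified helper and not a full implementation.

```python
import unicodedata

def naive_clusters(text):
    """Group each combining mark with the character it follows (simplified)."""
    clusters = []
    for ch in text:
        # combining() returns a nonzero class for combining marks such as U+0301.
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

# "cafe" followed by a combining acute accent (U+0301): looks like 4 symbols.
word = "cafe\u0301"

print(len(word))                  # counted in code points
print(len(naive_clusters(word)))  # counted in (naive) clusters
```

Here the code-point count is 5 while the cluster count is 4, so even code-point indexing would split the accented letter into a plain “e” and a separate accent.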
Sorry for the essay. I’m old.