Input Method Editor (IME) programming (3)

zhaozj, 2021-02-16

Composition String

The composition string is the current text in the composition window. This is the text that the IME converts to final characters. Each composition string consists of one or more clauses, where a clause is the smallest combination of characters that the IME can convert to a final character. To get and set the composition string, call the ImmGetCompositionString and ImmSetCompositionString functions.

As the user enters text in the composition window, the IME tracks the status of the composition string. This status includes attribute information, clause information, typing information, and cursor position. You can retrieve the composition status by using the ImmGetCompositionString function.

In the attribute information array, all characters of one clause must have the same attribute. The attribute information is an array of 8-bit values that specifies the status of characters in the composition string. There is one value for each byte in the string, including one value each for the lead and second bytes of any double-byte character in the string. For each value in the array, bits 0 through 3 can be one of the following values.

Value: Meaning

ATTR_INPUT: Character being entered by the user. The IME has yet to convert it.

ATTR_INPUT_ERROR: Character is an error character and cannot be converted by the IME. For example, some consonants cannot be put together.

ATTR_TARGET_CONVERTED: Character converted by the IME. The user has selected this character and the IME has converted it.

ATTR_CONVERTED: A converted character. The IME has already converted this character.

ATTR_TARGET_NOTCONVERTED: Character being converted. The user has selected this character but the IME has not yet converted it.

ATTR_FIXEDCONVERTED: Characters that will not be converted. The IME will not convert these characters anymore.

All other values are reserved. In Japanese, any unconverted character having the ATTR_INPUT attribute is a Hiragana, Katakana, or alphanumeric character. In Korean, this character is a Hangul character that the IME has not yet converted. In Traditional and Simplified Chinese, each IME may restrict its characters to a particular range.
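The attribute values above can be decoded with a few small helpers. The following is a minimal sketch, not part of the IMM API: the function names attr_name, attr_is_target, and first_target_offset are hypothetical, and the numeric constants are repeated from the Windows SDK header imm.h only so that the code builds without it. On Windows, include imm.h and drop the local definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Attribute values as defined in the Windows SDK header imm.h. They are
   repeated here only so the sketch builds without that header. */
#ifndef ATTR_INPUT
#define ATTR_INPUT               0x00
#define ATTR_TARGET_CONVERTED    0x01
#define ATTR_CONVERTED           0x02
#define ATTR_TARGET_NOTCONVERTED 0x03
#define ATTR_INPUT_ERROR         0x04
#define ATTR_FIXEDCONVERTED      0x05
#endif

/* Hypothetical helper: name of the attribute held in bits 0 through 3. */
const char *attr_name(unsigned char attr)
{
    switch (attr & 0x0F) {
    case ATTR_INPUT:               return "ATTR_INPUT";
    case ATTR_TARGET_CONVERTED:    return "ATTR_TARGET_CONVERTED";
    case ATTR_CONVERTED:           return "ATTR_CONVERTED";
    case ATTR_TARGET_NOTCONVERTED: return "ATTR_TARGET_NOTCONVERTED";
    case ATTR_INPUT_ERROR:         return "ATTR_INPUT_ERROR";
    case ATTR_FIXEDCONVERTED:      return "ATTR_FIXEDCONVERTED";
    default:                       return "reserved";
    }
}

/* Hypothetical helper: nonzero when the user has selected this character,
   whether or not the IME has converted it yet. */
int attr_is_target(unsigned char attr)
{
    unsigned char a = attr & 0x0F;
    return a == ATTR_TARGET_CONVERTED || a == ATTR_TARGET_NOTCONVERTED;
}

/* Hypothetical helper: offset of the first selected entry in the attribute
   array, or -1 if nothing is selected. */
long first_target_offset(const unsigned char *attrs, size_t count)
{
    size_t i;
    assert(attrs != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (attr_is_target(attrs[i]))
            return (long)i;
    }
    return -1;
}
```

An application that draws its own composition window could use a check like attr_is_target to decide which characters to highlight as the current selection.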

The clause information is an array of 32-bit values that specifies the positions of the clauses in the composition string. There is one value for each clause and a final value that specifies the length of the full string. Each value in the array specifies the offset, in bytes, from the beginning of the string to the clause. The first value is always 0 because the first clause always starts at the beginning of the string. For example, if a string has two clauses, the clause information has three values: the first value is 0, the second value is the offset of the second clause, and the third value is the length of the string. For Unicode, the position of a clause is counted in Unicode characters, and the length of the string is its size in Unicode characters.
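The clause layout described above can be turned into per-clause start and length pairs. This is a sketch under stated assumptions: ClauseSpan and split_clauses are hypothetical names, and the array is assumed to have already been copied out of the IME (for example by a call to ImmGetCompositionString, with the number of values derived from the returned byte count).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one clause: where it starts and how long it is.
   Units are bytes for ANSI strings and characters for Unicode strings. */
typedef struct {
    uint32_t start;
    uint32_t length;
} ClauseSpan;

/* Hypothetical helper: split a clause information array into spans.
   clause_info holds value_count 32-bit values: one offset per clause plus
   a final value giving the length of the whole string.
   Writes at most max_out spans to out and returns the number of clauses,
   or -1 if the array does not follow the layout described in the text. */
int split_clauses(const uint32_t *clause_info, size_t value_count,
                  ClauseSpan *out, size_t max_out)
{
    size_t i, clause_count;

    assert(out != NULL || max_out == 0);

    /* At least one clause plus the final length; first offset must be 0. */
    if (clause_info == NULL || value_count < 2 || clause_info[0] != 0)
        return -1;

    clause_count = value_count - 1;
    for (i = 0; i < clause_count; i++) {
        /* Offsets must not go backwards. */
        if (clause_info[i + 1] < clause_info[i])
            return -1;
        if (i < max_out) {
            out[i].start  = clause_info[i];
            out[i].length = clause_info[i + 1] - clause_info[i];
        }
    }
    return (int)clause_count;
}
```

For the two-clause example in the text, an array of { 0, 4, 10 } yields one clause starting at 0 with length 4 and one starting at 4 with length 6.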

The cursor position is a value indicating the position of the cursor relative to the characters in the composition string. The value is the offset, in bytes, from the beginning of the string. If this value is 0, the cursor is immediately before the first character in the string. If the value is equal to the length of the string, the cursor is immediately after the last character. If -1, the cursor is not present. For Unicode, both position and length are measured in Unicode characters.
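The cursor rules above reduce to a handful of comparisons. The sketch below is illustrative only: CursorPlace and classify_cursor are hypothetical names, and the cursor value and string length are assumed to be in the same unit (bytes for ANSI, characters for Unicode) however they were obtained.

```c
#include <assert.h>

/* Hypothetical classification of a cursor position value. */
typedef enum {
    CURSOR_ABSENT,    /* value is -1: no cursor is present            */
    CURSOR_AT_START,  /* value is 0: before the first character       */
    CURSOR_INSIDE,    /* strictly between the first and last position */
    CURSOR_AT_END,    /* value equals the length: after the last char */
    CURSOR_INVALID    /* outside the range described in the text      */
} CursorPlace;

/* Hypothetical helper: interpret a cursor position against the length of
   the composition string. For an empty string, 0 is reported as
   CURSOR_AT_START. */
CursorPlace classify_cursor(long cursor, long length)
{
    assert(length >= 0);

    if (cursor == -1)
        return CURSOR_ABSENT;
    if (cursor < 0 || cursor > length)
        return CURSOR_INVALID;
    if (cursor == 0)
        return CURSOR_AT_START;
    if (cursor == length)
        return CURSOR_AT_END;
    return CURSOR_INSIDE;
}
```

Treating out-of-range values as a separate case is a design choice of this sketch; the text itself only defines the meaning of -1, 0, and the string length.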

The typing information is a null-terminated character string that represents the characters entered at the keyboard.

You can set the composition string or elements of the composition status by using the ImmSetCompositionString function. To ensure that the composition window updates its appearance based on these changes, the function allows you to send a notification message to the window. Applications that set a combination of composition status elements typically set the notification parameter to FALSE for all but the last call to this function, so that only one notification message is generated for the composition window.

Finally, the edit control supports two messages for changing the IME's handling of composition strings. For more information, see EM_GETIMESTATUS and EM_SETIMESTATUS. For more information on the edit control, see Edit Controls.

When reposting, please cite the original address: https://www.9cbs.com/read-22062.html
