C40 Encodation in Data Matrix
Several keepautomation.com customers have asked about C40 encodation in Data Matrix, so this article explains C40 encodation and its rules in detail.

### C40 Encodation

Data Matrix has six data encodation schemes, and C40 is one of them. The C40 encodation scheme is optimized for encoding upper-case alphabetic and numeric characters. However, it can also encode other characters by using a shift character in conjunction with the data character.
The C40 characters are divided into four subsets. The first, called the basic set, contains the three special shift characters, the space character, and the ASCII characters A-Z and 0-9; each of these is assigned a single C40 value. A character from one of the other three sets is encoded as one of the three shift characters followed by a second C40 value.
To encode the data, each data character is first converted into a single C40 value or a pair of C40 values. The complete string of C40 values is then divided into groups of three, and each triplet (C1, C2, C3) is encoded as a 16-bit value using the formula (1600 * C1) + (40 * C2) + C3 + 1. Each 16-bit value is then split into two codewords: the most significant 8 bits and the least significant 8 bits.
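The conversion and packing steps above can be sketched in Python as follows. This is a minimal illustration rather than a complete encoder: it handles only basic-set characters, and the function names `c40_basic_value`, `pack_triplet` and `encode_c40_basic` are hypothetical. The numeric assignments (space = 3, digits = 4-13, A-Z = 14-39, with 0-2 reserved for the shift characters) are taken from the basic-set table in ISO/IEC 16022 and are an assumption added here, since this article does not list them. The input length is assumed to be a multiple of three.

```python
def c40_basic_value(ch):
    """Return the C40 value of a basic-set character (space, 0-9, A-Z).

    Values 0, 1 and 2 are reserved for Shift 1, Shift 2 and Shift 3
    (assumption: assignments follow the ISO/IEC 16022 basic-set table).
    """
    if ch == " ":
        return 3
    if "0" <= ch <= "9":
        return ord(ch) - ord("0") + 4
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 14
    raise ValueError(f"{ch!r} is not in the C40 basic set")


def pack_triplet(c1, c2, c3):
    """Pack three C40 values into two codewords, most significant byte first."""
    value = 1600 * c1 + 40 * c2 + c3 + 1
    return value >> 8, value & 0xFF


def encode_c40_basic(text):
    """Encode basic-set text whose length is a multiple of three."""
    values = [c40_basic_value(ch) for ch in text]
    if len(values) % 3 != 0:
        raise ValueError("this sketch only handles complete triplets")
    codewords = []
    for i in range(0, len(values), 3):
        codewords.extend(pack_triplet(*values[i:i + 3]))
    return codewords


print(encode_c40_basic("AIM"))  # [91, 11]
```

For "AIM" the C40 values are 14, 22 and 26, so the 16-bit value is 1600 * 14 + 40 * 22 + 26 + 1 = 23307, which splits into the codewords 91 and 11.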

### C40 Encodation Rules in Data Matrix

Each pair of codewords represents a 16-bit value, with the first codeword holding the most significant 8 bits. Three C40 values (C1, C2 and C3) are encoded with the formula (1600 * C1) + (40 * C2) + C3 + 1.
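Reading the formula in the other direction shows how a decoder might recover the three C40 values from a codeword pair. The following is a rough sketch only; the function name `unpack_codeword_pair` is hypothetical, and it assumes the two codewords really are a C40-encoded pair rather than, for example, an unlatch codeword.

```python
def unpack_codeword_pair(cw1, cw2):
    """Recover (C1, C2, C3) from two codewords, most significant byte first."""
    # Rebuild the 16-bit value and undo the "+ 1" from the encoding formula.
    value = ((cw1 << 8) | cw2) - 1
    c1, rest = divmod(value, 1600)
    c2, c3 = divmod(rest, 40)
    return c1, c2, c3


print(unpack_codeword_pair(91, 11))  # (14, 22, 26)
```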
When only one or two symbol characters remain in the symbol before the start of the error correction codewords, the following rules apply:
• Condition: two symbol characters remain and three C40 values (C1, C2 and C3) remain to be encoded (which may include both data and shift characters)

Action: encode the three C40 values in the last two symbol characters; a final unlatch codeword is not required.
• Condition: two symbol characters remain and two C40 values are left to be encoded (the first C40 value may be a shift or data character, but the second must represent a data character)

Action: encode the two remaining C40 values, followed by a pad C40 value of 0 (Shift 1), in the last two symbol characters. Again, a final unlatch codeword is not required.
• Condition: two symbol characters remain and only one C40 value (data character) is left to be encoded

Action: the first symbol character is encoded as an unlatch character and the last symbol character is encoded with the data character using the ASCII encodation scheme.
• Condition: one symbol character remains and one C40 value (data character) remains to be encoded

Action: the last symbol character is encoded with the data character using the ASCII encodation scheme; the unlatch character is not encoded but is assumed before the last symbol character.
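The four end-of-symbol cases above can be summarized in a short Python sketch. It is illustrative only: the function name `encode_c40_tail` and its parameters are hypothetical, and two details are assumptions taken from ISO/IEC 16022 rather than from this article, namely that the C40 unlatch codeword is 254 and that the ASCII encodation scheme encodes a character as its ASCII value plus 1.

```python
UNLATCH = 254  # assumption: C40 unlatch codeword value per ISO/IEC 16022


def pack_triplet(c1, c2, c3):
    """Pack three C40 values into two codewords, most significant byte first."""
    value = 1600 * c1 + 40 * c2 + c3 + 1
    return value >> 8, value & 0xFF


def encode_c40_tail(free_symbol_chars, c40_values, data_char=""):
    """Encode the final C40 values according to the four end-of-symbol rules.

    free_symbol_chars: symbol characters left before error correction (1 or 2)
    c40_values:        C40 values still to be encoded (1 to 3 of them)
    data_char:         the last data character, needed only when a single
                       C40 value remains and ASCII encodation is used
    """
    n = len(c40_values)
    if free_symbol_chars == 2 and n == 3:
        # Rule 1: a full triplet fits exactly; no unlatch is needed.
        return list(pack_triplet(*c40_values))
    if free_symbol_chars == 2 and n == 2:
        # Rule 2: pad with C40 value 0 (Shift 1); no unlatch is needed.
        return list(pack_triplet(c40_values[0], c40_values[1], 0))
    if free_symbol_chars == 2 and n == 1:
        # Rule 3: explicit unlatch, then the character in ASCII encodation.
        return [UNLATCH, ord(data_char) + 1]
    if free_symbol_chars == 1 and n == 1:
        # Rule 4: unlatch is implied; only the ASCII codeword is written.
        return [ord(data_char) + 1]
    raise ValueError("combination not covered by the end-of-symbol rules")


print(encode_c40_tail(2, [14, 22]))     # "AI" plus pad -> [90, 241]
print(encode_c40_tail(2, [14], "A"))    # [254, 66]
print(encode_c40_tail(1, [14], "A"))    # [66]
```

Any other combination of remaining symbol characters and C40 values falls outside these four rules, so the sketch raises an error instead of guessing.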