Reference
Building Packets
Handling multibyte values and converting numerical values to text strings and bytes.
Multibyte values are values larger than can be stored in a single byte.
The following are commonly used multibyte data types in .NET:
short: 16-bit (2 byte) signed integer, range of -32,768 to 32,767
ushort: 16-bit (2 byte) unsigned integer, range of 0 to 65,535
int: 32-bit (4 byte) signed integer, range of -2,147,483,648 to 2,147,483,647
uint: 32-bit (4 byte) unsigned integer, range of 0 to 4,294,967,295
Using multibyte numbers on a single software platform or device rarely poses problems, as the operating system and software provide all the storage and retrieval functions for us.
However, when communicating with other devices, networks, operating systems or file formats, the problem of endianness arises. Endianness describes the order in which the bytes of a multibyte value are stored in memory, arranged in a protocol packet or transmitted on a network.
In the list of multibyte data types above, there are several ways that each type could order its bytes:
a 2 byte value 0x0102 could be stored as { 0x01, 0x02 } or { 0x02, 0x01 }
a 4 byte value 0x01020304 could be stored as { 0x01, 0x02, 0x03, 0x04 } or { 0x04, 0x03, 0x02, 0x01 }
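The two orderings of a 2 byte value can be sketched with plain shift and mask operations, which behave the same regardless of the host's own byte order. The class and method names here are hypothetical and used for illustration only:

```csharp
using System;

public static class ByteOrderDemo
{
    // Most-significant byte first: 0x0102 -> { 0x01, 0x02 }
    public static byte[] ToBigEndian(ushort value)
    {
        return new byte[] { (byte)(value >> 8), (byte)(value & 0xFF) };
    }

    // Least-significant byte first: 0x0102 -> { 0x02, 0x01 }
    public static byte[] ToLittleEndian(ushort value)
    {
        return new byte[] { (byte)(value & 0xFF), (byte)(value >> 8) };
    }
}
```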
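There are two main flavors of endianness:
little-endian: the least-significant byte comes first, so 0x0102 is stored as { 0x02, 0x01 }
big-endian: the most-significant byte comes first, so 0x0102 is stored as { 0x01, 0x02 }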
Whilst the history of endianness in computing is outside the scope of this topic, it is nonetheless an issue that developers will face at some point when implementing binary-encoded protocols. Endianness does not normally affect protocols that use text encoding.
So how do we know which order to expect, and where?
Microsoft's Windows OS on x86 processor hardware uses little-endian format,
Apple's Mac OS on the PowerPC processor hardware uses big-endian format, as does the Motorola processor platform (e.g. 68000),
Ethernet protocols based on TCP/IP and UDP/IP commonly use big-endian format,
Your device's protocol documentation should state the byte order it uses.
Trivia
Big-endian format also matches the order in which the number is written, i.e. the 'big end' of the number appears first.
The literary reader may recognize the term big-endian: it originates from the novel Gulliver’s Travels and the Lilliputians' spat with the Big-endians.
Endianness also affects multibyte character encoding systems, such as Unicode.
To ensure that the correct order of bytes is sent or interpreted when received, it is necessary to know the endianness of the sender and the receiver. The driver can then handle any byte swapping or reordering necessary before sending data packets.
Generally, 'host order' refers to the byte order of the system your software is running on (i.e. the host), and 'network order' refers to the byte order of the protocol or device you are communicating with. If both ends use the same endianness, no conversion is needed.
In .NET you can check the static System.BitConverter.IsLittleEndian field at run-time to determine whether the processor and OS platform combination is little- or big-endian. On x86 systems it is ordinarily safe to assume little-endian.
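As a minimal sketch of that run-time check, the helper below gets the bytes of an int in host order and reverses them only when the host and the protocol disagree. The class name, method name and the wireIsLittleEndian parameter are assumptions made for this example; BitConverter.GetBytes and Array.Reverse are standard .NET calls:

```csharp
using System;

public static class HostOrderSketch
{
    // Returns the bytes of 'value' in the byte order the protocol expects.
    // 'wireIsLittleEndian' is a hypothetical parameter describing the protocol.
    public static byte[] ToWireOrder(int value, bool wireIsLittleEndian)
    {
        // GetBytes returns the bytes in the host's own byte order.
        byte[] bytes = BitConverter.GetBytes(value);

        // Reverse only when host order and wire order differ.
        if (BitConverter.IsLittleEndian != wireIsLittleEndian)
        {
            Array.Reverse(bytes);
        }
        return bytes;
    }
}
```

Because the comparison is made at run-time, the same code should produce identical output on little- and big-endian hosts.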
Your device's protocol documentation may not explicitly refer to its 'endianness' so here are a few other terms to look out for:
Endianness | Also known as...
Little-endian | Least-significant-byte first, LSB first, Intel format
Big-endian | Most-significant-byte first, MSB first, network byte order
The Swap Bytes to Network Order pattern illustrates how to convert multibyte values between host and network byte-order:
int val = 0x12345678;
int swapped = SwapBytes( val );
The resulting value of swapped will be 0x78563412.
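The pattern's own implementation of SwapBytes is not reproduced in this topic. As a rough sketch of what such a helper might look like, assuming a plain reversal of the four bytes of a 32-bit value (the class name ByteSwapSketch is hypothetical, and the actual pattern may differ):

```csharp
using System;

public static class ByteSwapSketch
{
    // Reverses the order of the four bytes in a 32-bit value.
    public static int SwapBytes(int value)
    {
        unchecked
        {
            // Work on an unsigned copy so right shifts do not sign-extend.
            uint v = (uint)value;
            uint swapped = (v >> 24)
                         | ((v >> 8) & 0x0000FF00u)
                         | ((v << 8) & 0x00FF0000u)
                         | (v << 24);
            return (int)swapped;
        }
    }
}
```

Swapping twice returns the original value. Note that this sketch always swaps; .NET also provides System.Net.IPAddress.HostToNetworkOrder, which swaps only when the host is little-endian.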
The two previous topics, Serial Data and Unidirectional Drivers, discuss translation between strings and bytes, but what about other data types and values?
An int variable, for example, uses 4 bytes of storage space. Whilst we can use the design pattern above to swap its internal bytes around, how do we extract those bytes to append to the packet we want to send?
The Value to Byte Array pattern illustrates how to extract the bytes from an integer value:
int val = 0x12345678;
byte[] bytes = ValueToByteArray( val );
The resulting array of bytes will contain (if byte swapping is included):
{ 0x12, 0x34, 0x56, 0x78 }
Once converted, the resulting bytes can be inserted into the command data.
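The pattern's own ValueToByteArray is not shown in this topic. One possible sketch, assuming the protocol wants the most-significant byte first, extracts each byte with shifts and masks and then places the result in a packet. The class name, the BuildPacket helper and the one-byte command header are hypothetical and chosen only for illustration:

```csharp
using System;

public static class PacketSketch
{
    // Extracts the four bytes of 'value', most-significant byte first.
    public static byte[] ValueToByteArray(int value)
    {
        return new byte[]
        {
            (byte)((value >> 24) & 0xFF),
            (byte)((value >> 16) & 0xFF),
            (byte)((value >> 8) & 0xFF),
            (byte)(value & 0xFF)
        };
    }

    // Hypothetical packet layout: one command byte followed by a 4 byte value.
    public static byte[] BuildPacket(byte command, int value)
    {
        byte[] valueBytes = ValueToByteArray(value);
        byte[] packet = new byte[1 + valueBytes.Length];
        packet[0] = command;
        Array.Copy(valueBytes, 0, packet, 1, valueBytes.Length);
        return packet;
    }
}
```

Shifting and masking gives the same byte order on any host, so no separate swap step is needed in this variant.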
For text encoded protocols, numbers are typically represented in their ASCII form; so how do we convert numbers to text? The text format may not always be decimal: some protocols take the esoteric approach of encoding values as hexadecimal ASCII strings. Yeah, weird.
The Building and Formatting Strings pattern illustrates how to convert numeric values to strings using String.Format.
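As a brief sketch of the kind of conversion involved, the standard format specifiers D (decimal) and X (hexadecimal) can be used with String.Format. The class name, the method names and the "VOL" command text are hypothetical, invented for this example rather than taken from any real protocol:

```csharp
using System;
using System.Globalization;
using System.Text;

public static class NumberTextSketch
{
    // Decimal text, zero-padded to at least 3 digits: 7 -> "007".
    public static string ToDecimalText(int value)
    {
        return string.Format(CultureInfo.InvariantCulture, "{0:D3}", value);
    }

    // Upper-case hexadecimal text, padded to 4 digits: 255 -> "00FF".
    public static string ToHexText(ushort value)
    {
        return string.Format(CultureInfo.InvariantCulture, "{0:X4}", value);
    }

    // Builds a hypothetical text command such as "VOL 007\r"
    // and encodes it as ASCII bytes ready to send.
    public static byte[] BuildVolumeCommand(int level)
    {
        string text = string.Format(CultureInfo.InvariantCulture, "VOL {0:D3}\r", level);
        return Encoding.ASCII.GetBytes(text);
    }
}
```

Passing CultureInfo.InvariantCulture keeps the output independent of the regional settings of the machine running the driver.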