Journey...: Big and Little Endian

Basic Memory Concepts

In order to understand the concept of big and little endian, you need to understand memory. Fortunately, we only need a very high level abstraction for memory. You don't need to know all the little details of how memory works. All you need to know is that memory is one large array. But one large array containing what? The array contains bytes.

In computer organization, people don't use the term "index" to refer to the array locations. Instead, we use the term "address". "Address" and "index" mean the same thing, so if you're getting confused, just think of "address" as "index".

Each address stores one element of the memory "array". Each element is typically one byte. There are some memory configurations where each address stores something besides a byte. For example, you might store a nybble or a bit. However, those are exceedingly rare, so for now, we make the broad assumption that all memory addresses store bytes.

I will sometimes say that memory is byte-addressable. This is just a fancy way of saying that each address stores one byte. If I say memory is nybble-addressable, that means each memory address stores one nybble.

Storing Words in Memory

We've defined a word to mean 32 bits. This is the same as 4 bytes. Integers, single-precision floating point numbers, and MIPS instructions are all 32 bits long. How can we store these values into memory? After all, each memory address can store a single byte, not 4 bytes.

The answer is simple. We split the 32 bit quantity into 4 bytes. For example, suppose we have a 32 bit quantity, written in hexadecimal as 90AB12CD. Since each hex digit is 4 bits, we need 8 hex digits to represent the 32 bit value. So, the 4 bytes are 90, AB, 12, CD, where each byte requires 2 hex digits.

Big Endian

In big endian, you store the most significant byte in the smallest address.
Here's how it would look:

Address   1000  1001  1002  1003
Value     90    AB    12    CD

Little Endian

In little endian, you store the least significant byte in the smallest address. Here's how it would look:

Address   1000  1001  1002  1003
Value     CD    12    AB    90

Which Way Makes Sense?

Different ISAs use different endianness. For example, DEC machines and IBM PCs are little endian, while Motorola and Sun processors are big endian. MIPS processors let you select either a big or little endian configuration. While one way may seem more natural to you (most people think big endian is more natural), there is justification for either one.

Why is endianness so important? Suppose you are storing int values to a file, then you send the file to a machine which uses the opposite endianness and read in the values. You'll run into problems because of endianness. You'll read in values with their bytes reversed, and they won't make sense. For example, the value 90AB12CD would be read back as CD12AB90.

Endianness is also a big issue when sending numbers over the network. Again, if you send a value from a machine of one endianness to a machine of the opposite endianness, you'll have problems. This is even worse over the network, because you might not be able to determine the endianness of the machine that sent you the data.

The solution is to send 4 byte quantities using network byte order, which by convention is big endian. If your machine has the same endianness as network byte order, then great, no change is needed. If not, then you must reverse the bytes.
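
To make the byte layouts and the network byte order idea concrete, here is a minimal Python sketch. It takes the 32 bit example value 90AB12CD, splits it into 4 bytes in both orders, shows what happens when bytes written in one order are read in the other, and then packs the value in network byte order (big endian) using the standard struct module. The names WORD and show, and the starting address 1000, are only illustrative choices for this example.

```python
import struct
import sys

WORD = 0x90AB12CD  # the 32 bit example value used in the text

# Split the word into 4 bytes, most significant byte first (big endian).
big = WORD.to_bytes(4, byteorder="big")
# Split the word into 4 bytes, least significant byte first (little endian).
little = WORD.to_bytes(4, byteorder="little")


def show(label, data, base=1000):
    """Print each byte next to the address it would occupy (illustrative helper)."""
    cells = "  ".join(f"{base + i}:{b:02X}" for i, b in enumerate(data))
    print(f"{label:<15}{cells}")


show("big endian", big)
show("little endian", little)

# Bytes written little endian but interpreted as big endian give a reversed value.
misread = int.from_bytes(little, byteorder="big")
print(f"misread value: 0x{misread:08X}")

# "!" asks struct for network byte order (big endian); "I" is an unsigned 32 bit int.
wire = struct.pack("!I", WORD)
(received,) = struct.unpack("!I", wire)
print(f"host is {sys.byteorder} endian, wire bytes are {wire.hex().upper()}")
print(f"received value: 0x{received:08X}")
```

Because both the sender and the receiver use the "!" format, the value survives the round trip regardless of the endianness of either machine. On a big endian host the pack step is effectively a no-op, and on a little endian host it reverses the bytes.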


Friday, April 23, 2010
