Skip to Content

Lecture 2: Binary Representation

Bits review

  • 1 byte = 8 bits

Converting between binary and decimal

16210=101000102162_{10} = 10100010_{2}

16210=127+026+125+024+023+022+121+020=128+32+2=162\begin{aligned} 162_{10} &= 1 \cdot 2^7 + 0 \cdot 2^6 + 1 \cdot 2^5 + 0 \cdot 2^4 + 0 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2^1 + 0 \cdot 2^0 \\ &= 128 + 32 + 2 \\ &= 162 \end{aligned}

Example Data representations

C Data TypeSize (bytes 32bit)Size (bytes 64bit)x86-64
char111
short222
int444
long488
float444
double888
long double--10/16
pointer488

Same size if declared as unsigned

Encoding Integers (w bits)

Unsigned Integers

B2U(X)=i=0w1xi2iB2U(X)= \sum_{i=0}^{w-1} x_i \cdot 2^i

Example:

B2U(01101)=024+123+122+021+120=0+8+4+0+1=13\begin{aligned} B2U(01101) &= 0 \cdot 2^4 + 1 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 \\ &= 0 + 8 + 4 + 0 + 1 \\ &= 13 \end{aligned} B2U(11101)=124+123+122+021+120=16+8+4+0+1=29\begin{aligned} B2U(11101) &= 1 \cdot 2^4 + 1 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 \\ &= 16 + 8 + 4 + 0 + 1 \\ &= 29 \end{aligned}

Two’s Complement

B2T(X)=xw12w1+i=0w2xi2iB2T(X)= -x_{w-1} \cdot 2^{w-1} + \sum_{i=0}^{w-2} x_i \cdot 2^i

Example:

B2T(01101)=024+123+122+021+120=0+8+4+0+1=13\begin{aligned} B2T(01101) &= 0 \cdot 2^4 + 1 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 \\ &= 0 + 8 + 4 + 0 + 1 \\ &= 13 \end{aligned} B2T(11101)=124+123+122+021+120=16+8+4+0+1=3\begin{aligned} B2T(11101) &= -1 \cdot 2^4 + 1 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 \\ &= -16 + 8 + 4 + 0 + 1 \\ &= -3 \end{aligned}

Sign Bit

  • For 2’s complement, most significant bit is the sign bit
    • 0 for positive
    • 1 for negative

Numeric Ranges

  • Assume you have a integer type that is 4 bits long

    • Unsigned: 0 to 15
      • 0b1111=150b1111=15
    • 2’s Complement: -8 to 7
      • 0b1000=80b1000=-8
  • Unsigned Values:

    • UMin=0=B2U(0000)UMin=0=B2U(000\ldots 0)
    • UMax=2w1=B2U(1111)UMax=2^w-1=B2U(111\ldots 1)
  • 2’s Complement Values

    • TMin=2w1=B2T(1000)TMin=-2^{w-1}=B2T(100\ldots 0)
    • TMax=2w11=B2T(0111)TMax=2^{w-1}-1=B2T(011\ldots 1)
    • Other interesting values 1=B2T(1111)-1=B2T(111\ldots 1)

Values for different word sizez

W=8W=16W=32W=64
TMin-128-32768-2147483648-9223372036854775808
TMax1273276721474836479223372036854775807
UMin0000
UMax25565535429496729518446744073709551615

C Operators for bitwise operations

Boolean algebra

And

&01
000
101

Example:

int a = 0b1010; int b = 0b1100; int c = a & b; // should return 0b1000

Or

|01
001
111

Example:

int a = 0b1010; int b = 0b1100; int c = a | b; // should return 0b1110

Xor

^01
001
110

Example:

int a = 0b1010; int b = 0b1100; int c = a ^ b; // should return 0b0110

Not

~01
10

Example:

int a = 0b1010; int c = ~a; // should return 0b0101

Imagine as set operations

  • & is intersection
  • | is union
  • ^ is exclusive or (symmetric difference)
  • ~ is complement

Logic operators on C

  • && is and
  • || is or
  • ! is not

Interesting properties:

  • View 0 as false and any non-zero value as true
  • Always returns 0 or 1
  • Early termination: if the first operand is 0, the second operand is not evaluated

Example:

int a = 0x69; int b = 0x55; int c = a && b; // should return 0x01 int d = a || b; // should return 0x01 (the program will not check b at all since a is non-zero by early termination) int e = !a; // should return 0x00 int f = !!a; // should return 0x01 int g = !0; // should return 0x01 int *p = NULL; bool should_access = p && *p; // (avoid null pointer access, returns 0 if p is NULL, otherwise returns true if *p is non-zero)

Using bit masks

// goal: compute val mod x and x is a power of 2 unsigned int val = 137; // some value to take mod of unsigned int x = 16; // x is a power of 2 unsigned int mask = x - 1; // mask is a bit mask that is all 1s except for the least significant bit unsigned int mod = val & mask; // mod is the result of val mod x

Shift operations

  • << is left shift
    • Shift bit-vector x left y positions
  • >> is right shift
    • Shift bit-vector x right y positions
    • Logical shift: fill with 0s
    • Arithmetic shift: fill with the sign bit
  • Undefined behavior: shift by a number greater than or equal to the word size

Example:

x0b01100010
x<<30b00010000
Logical shift x>>20b00011000
Arithmetic shift x>>20b00011000

For negative numbers:

x0b10100010
x<<30b00010000
Logical shift x>>20b00101000
Arithmetic shift x>>20b11101000

Pop count function

  • How do you implement a pop count function in a 4-byte memory?

Trivial way:

# define MASK 0x1; int pop_count(unsigned int x) { // does not work for negative numbers int count = 0; while (x!=0) { if (x & MASK) count++; x >>= 1; } return count; }

Casting Between Signed and Unsigned Integers in C

Constants

  • By default, constants are signed
  • To make a constant unsigned, add the U suffix
unsigned int a = 0x1234U;

Casting

  • Explicitly cast to a different type
int tx,ty; unsigned int ux,uy; tx = (int) ux; uy = (unsigned) ty;
  • Implicit casting also occurs via assignments and procedure calls
tx = ux; pop_count(tx); // popcount is a built-in function that returns the number of 1s in the binary representation of x (unsigned int)

When should I use unsigned integers?

  • Don’t use just because the number are non-negative
    • Easy to make mistakes
unsigned i; for (i = cnt-2; i < 0; i++) { // do something }

If cnt=1 then i will be -1 and the loop will not terminate in short time. LOL.

  • Can be very subtle
#define DELTA sizeof(int) // sizeof(int) returns unsigned int x = 0; for (int i = CNT; i-DELTA >=0; i-=DELTA) { // do something }

The expression i-DELTA >= 0 will be evaluated as unsigned and will not terminate.

Code Security Example

You can access the kernel memory region holding non user-accessible data. if you give negative index to the array, it will access the kernel memory region by interpreting the negative index as an unsigned integer.

Change int size

Extension

  • When operating with types of different widths, C automatically perform extension
  • Converting from smaller to larger type is always safe
    • Given w-bit integer x,
    • Convert x to w+k bit integer with the same value
  • Two different types of extension
    • zero extension: use for unsigned (similar to logical shift)
    • sign extension: use for signed (similar to arithmetic shift)

Truncation

  • Task:
    • Given w-bit integer x,
    • Convert x to k bit integer with the same value
  • Rule:
    • Drop high-order w-k bits
  • Effect:
    • can change the value of x
    • unsigned: mathematical mode on x
    • signed: reinterprets the bit (add -2^k to the value)

Code puzzle

what is the output of the following code?

unsigned short y=0xFFFF; int x = y; printf("%x", x); /* print the value of x as a hexadecimal number */

The output is 0x0000FFFF it will try to preserve the value of y by sign extending the value of y to x.

Last updated on