Computer Systems · Data Representation

CS3 — Floating-Point Representation

📅 Thu 11 Jun 2026 · P3+P4 (double)
~120 minutes
Learning intentions
Success criteria
Warm up — recap from CS2
Answer from memory · check when done
W1
Convert −37 to 8-bit two's complement.
W2
Convert 11110110 to denary. (Treat as 8-bit two's complement.)
W3
What is the minimum value that can be stored in 8-bit two's complement?
W4
Convert −1 to 8-bit two's complement.
W5
What is the maximum value that can be stored in 6-bit two's complement?

Key vocabulary

Floating-point
A system for representing numbers with a fractional component, using a mantissa and an exponent
Mantissa
The significant digits of the number, stored as a binary fraction in two's complement
Exponent
A power of 2 stored in two's complement that scales the mantissa to produce the final value
Binary point
The equivalent of a decimal point in binary — bits to the right represent fractions (½, ¼, ⅛…)
Normalised form
The standard mantissa format: positive starts 0.1…, negative starts 1.0…, ensuring maximum precision
Precision
The number of significant binary digits available — determined by the mantissa bit width

Why integers are not enough

Everything we have stored so far — positive integers in CS1, negative integers in CS2 — has been a whole number. But real-world data is full of values that cannot be expressed as integers:

None of these fit into any integer type — they require a fractional component. We could scale by 100 and pretend everything is in pence, but this breaks for very large or very small scientific values. We need a fundamentally different representation.

Floating-point solves this using the same idea as scientific notation: separate the significant digits from the scale. Just as 6.5 × 10³ = 6500 in denary, floating-point stores a mantissa and an exponent independently in binary.

The floating-point structure

Every floating-point number is stored as:

value = mantissa × 2^exponent

In our lessons we use a simplified 8-bit format: 5 bits for the mantissa + 3 bits for the exponent. Real systems use far more bits (IEEE 754 double precision uses 64 bits in total) and encode the sign and exponent differently, but the underlying idea of a mantissa scaled by a power of 2 is the same.

Mantissa (5 bits)                        Exponent (3 bits)
M₄         M₃    M₂    M₁    M₀          E₂         E₁    E₀
sign bit   2⁻¹   2⁻²   2⁻³   2⁻⁴         sign bit   2¹    2⁰
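As a quick illustration of the mantissa × 2^exponent idea, Python's standard `math.frexp` splits a number into exactly these two parts. This is only a sketch using Python's own 64-bit floats, not the 8-bit classroom format; the helper name `decompose` is made up for this example.

```python
import math

def decompose(x: float) -> tuple[float, int]:
    """Split x into (mantissa, exponent) so that x == mantissa * 2**exponent.

    For positive x, math.frexp returns a mantissa between 0.5 and 1,
    which is the same 0.1... shape used for normalised positive numbers.
    """
    return math.frexp(x)

for x in (6.5, 0.375, 3.25):
    mantissa, exponent = decompose(x)
    print(f"{x} = {mantissa} × 2^{exponent}")
```

The three values are the ones used in the worked examples, so the mantissas and exponents printed here can be compared with the hand calculations.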

Binary fractions — place values

Just as denary fractions use tenths, hundredths, thousandths…, binary fractions use halves, quarters, eighths…. The binary point divides whole-number place values (left) from fractional place values (right):

Sign   .   Bit 1   Bit 2   Bit 3   Bit 4
sign   .   2⁻¹     2⁻²     2⁻³     2⁻⁴
±      .   0.5     0.25    0.125   0.0625

To read a mantissa: sign bit 0 = positive, sign bit 1 = negative (two's complement). Then add up all fractional columns with a 1, treating the sign bit as −1 if set.

Example: mantissa 01101 → sign=0, then 0.5 + 0.25 + 0 + 0.0625 = 0.8125

Example: mantissa 10011 → sign=1, then −1 + 0 + 0 + 0.125 + 0.0625 = −0.8125
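The reading rule above can be sketched in Python. This is a minimal sketch, assuming the mantissa is given as a string of 0s and 1s with the sign bit first; the function name `mantissa_value` is made up for this example.

```python
def mantissa_value(bits: str) -> float:
    """Value of a two's complement binary fraction such as '01101'."""
    # The sign bit is worth -1 when set, 0 otherwise.
    value = -1.0 if bits[0] == "1" else 0.0
    # The remaining bits are worth 1/2, 1/4, 1/8, ... from left to right.
    for position, bit in enumerate(bits[1:], start=1):
        if bit == "1":
            value += 2.0 ** -position
    return value

print(mantissa_value("01101"))  # 0.8125
print(mantissa_value("10011"))  # -0.8125
```

Both printed values match the two mantissa examples worked by hand above.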

Normalised form

Many values can be expressed in floating-point in more than one way. For example, 0.375 could be represented as 0.1100 × 2⁻¹, or as 0.0110 × 2⁰, or 0.0011 × 2¹. This ambiguity causes problems: comparisons fail, precision is wasted, and hardware gets complicated.

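Normalised form defines exactly one valid representation for every value: a positive mantissa starts 0.1… and a negative mantissa starts 1.0…, so the first two bits of a normalised mantissa are always different.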

This ensures the mantissa is as large as possible in magnitude (≥ 0.5 for positive, ≤ −0.5 for negative), using every available bit for precision. If a mantissa is not in normalised form, shift it left and subtract 1 from the exponent until it is.
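The shift-left rule can be sketched as a short Python routine. It assumes the mantissa is a bit string with the sign bit first, and it does not check that the final exponent still fits in 3 bits; the names `is_normalised` and `normalise` are made up for this example.

```python
def is_normalised(mantissa: str) -> bool:
    """Normalised mantissas start 01 (positive) or 10 (negative)."""
    return mantissa[0] != mantissa[1]

def normalise(mantissa: str, exponent: int) -> tuple[str, int]:
    """Shift left and subtract 1 from the exponent until normalised."""
    if "1" not in mantissa:
        raise ValueError("a mantissa of all zeros cannot be normalised")
    while not is_normalised(mantissa):
        # Shifting left: drop the repeated leading bit, append a 0 on the right.
        mantissa = mantissa[1:] + "0"
        exponent -= 1
    return mantissa, exponent

print(normalise("00110", 0))  # ('01100', -1)
print(normalise("00011", 1))  # ('01100', -1)
```

Both inputs are un-normalised ways of writing 0.375, and both end up as the same single pattern, which is the point of normalised form.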

Converting denary to floating-point (positive numbers)

  1. Convert the denary number to binary (whole part + fractional part)
  2. Normalise: shift the digits until the mantissa is in the form 0.1…, counting the places shifted. Each place the digits shift right adds 1 to the exponent; each place they shift left subtracts 1
  3. Store the normalised mantissa in two's complement, padding with zeros on the right to fill all mantissa bits; store the exponent in two's complement
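The three steps for a positive number can be sketched in Python as follows. This is a sketch under stated assumptions, not a definitive implementation: it handles the 5+3 format only, it rejects values that need more than 4 fractional bits rather than rounding them, and the names `to_twos_complement` and `encode_positive` are made up for this example.

```python
def to_twos_complement(n: int, width: int) -> str:
    """Write the integer n as a two's complement bit string of the given width."""
    if not -(1 << (width - 1)) <= n < (1 << (width - 1)):
        raise ValueError(f"{n} does not fit in {width} bits")
    return format(n & ((1 << width) - 1), f"0{width}b")

def encode_positive(value: float) -> str:
    """Encode a positive denary value as 'mantissa exponent' in the 5+3 format."""
    if value <= 0:
        raise ValueError("this sketch handles positive values only")
    exponent = 0
    # Step 2: scale the value into the range 0.5 <= value < 1 (the 0.1... form).
    while value >= 1:      # digits shift right, exponent goes up
        value /= 2
        exponent += 1
    while value < 0.5:     # digits shift left, exponent goes down
        value *= 2
        exponent -= 1
    # Steps 1 and 3: read off four fractional bits by repeated doubling.
    bits = "0"             # sign bit for a positive mantissa
    for _ in range(4):
        value *= 2
        if value >= 1:
            bits += "1"
            value -= 1
        else:
            bits += "0"
    if value != 0:
        raise ValueError("value needs more than 4 fractional bits")
    return f"{bits} {to_twos_complement(exponent, 3)}"

print(encode_positive(0.375))  # 01100 111
print(encode_positive(3.25))   # 01101 010
```

The code normalises first and converts to binary second, which is the reverse of the hand method, but the resulting bit pattern is the same.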

Converting denary to floating-point (negative numbers)

  1. Convert the positive version to normalised floating-point (steps 1–3 above)
  2. Find the two's complement of the positive mantissa: invert all bits, then add 1
  3. Keep the exponent the same — the exponent is always stored for the normalised magnitude
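The invert-and-add-1 step can be sketched in Python like this. It assumes the positive mantissa is given as a bit string; the name `negate_mantissa` is made up for this example.

```python
def negate_mantissa(bits: str) -> str:
    """Two's complement of a mantissa bit string: invert all bits, then add 1."""
    width = len(bits)
    mask = (1 << width) - 1
    inverted = int(bits, 2) ^ mask      # flip every bit
    negated = (inverted + 1) & mask     # add 1, keeping only `width` bits
    return format(negated, f"0{width}b")

# +6.5 has mantissa 01101 and exponent 011, so -6.5 keeps the same exponent.
print(negate_mantissa("01101") + " 011")  # 10011 011
```

One caveat about this sketch: when the positive mantissa is exactly 01000, the result is 11000, which does not start 1.0… and would need a further normalising shift.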

Worked examples

Example 1 — Represent 0.375 in normalised 8-bit floating-point (5+3)
1
Convert 0.375 to binary fraction:
0.375 × 2 = 0.75 → bit 0
0.75 × 2 = 1.5 → bit 1
0.5 × 2 = 1.0 → bit 1
So 0.375 = 0.011₂
2
Normalise to 0.1… form — shift left once (multiply mantissa by 2, subtract 1 from exponent):
0.011 → 0.110, exponent = −1
Pad to 4 fractional bits: 0.1100
3
Store mantissa and exponent:
Mantissa = 01100 (sign bit 0, fractional bits 1100)
Exponent = −1 → 3-bit two's complement = 111
Result: 01100 111
Verify: mantissa 01100 = 0.75, exponent 111 = −1, 0.75 × 2⁻¹ = 0.375
Normalised? Sign=0, next bit=1 → ✓
Example 2 — Represent −6.5 in normalised 8-bit floating-point (5+3)
1
Convert +6.5 to binary:
6 = 110₂, 0.5 = .1₂, so 6.5 = 110.1₂
2
Normalise the magnitude to 0.1… form — shift right 3 places:
110.1 → 0.1101, exponent = 3
Positive mantissa bits: 01101
3
The number is negative — find two's complement of the mantissa:
Invert 01101 → 10010, add 1 → 10011
Exponent = 3 → 3-bit two's complement = 011
Result: 10011 011
Verify: mantissa 10011 = −1 + 0.125 + 0.0625 = −0.8125, exponent 011 = 3, −0.8125 × 8 = −6.5
Normalised? Sign=1, next bit=0 → ✓
Example 3 — Convert 01101 010 to denary
1
Extract components:
Mantissa bits = 01101  ·  Exponent bits = 010
2
Calculate mantissa value:
Sign bit = 0 → positive
Fractional bits .1101: 0.5 + 0.25 + 0 + 0.0625 = 0.8125
3
Calculate exponent value:
010 in two's complement = 2 (MSB = 0, so positive)
Value = 0.8125 × 2² = 0.8125 × 4 = 3.25
Normalised? Sign=0, next bit=1 → ✓
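The decoding steps in Example 3 can be checked with a short Python sketch. It assumes the pattern is written as mantissa bits, a space, then exponent bits, as in the examples; the names `twos_complement_int` and `decode` are made up for this example.

```python
def twos_complement_int(bits: str) -> int:
    """Read a bit string as a two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 1 << len(bits)   # a set sign bit carries a negative weight
    return value

def decode(pattern: str) -> float:
    """Convert a pattern such as '01101 010' to its denary value."""
    mantissa_bits, exponent_bits = pattern.split()
    # Dividing by 2^(number of fractional bits) puts the binary point
    # straight after the sign bit.
    mantissa = twos_complement_int(mantissa_bits) / 2 ** (len(mantissa_bits) - 1)
    exponent = twos_complement_int(exponent_bits)
    return mantissa * 2.0 ** exponent

print(decode("01100 111"))  # 0.375
print(decode("10011 011"))  # -6.5
print(decode("01101 010"))  # 3.25
```

The three patterns are the results of Examples 1, 2 and 3, so each printed value should match the denary number that example started from or arrived at.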
Now you try
Convert the floating-point bit pattern 10110 001 to denary. Show all steps, then verify normalisation.
⚠️ Common mistakes — examiner feedback
📝 Exam tip

Floating-point is the most commonly failed topic in Higher Computing. Pupils who lose marks almost always do so for one reason: they rush. This is a completely mechanical process — there is no trick and no insight required beyond the method. Every mark is available if you write every step.

Expect these question forms:


Task Set A — Higher core
All questions use the 8-bit format: 5-bit mantissa + 3-bit exponent, both in two's complement.
B1
Represent 0.625 in normalised floating-point. Give the full 8-bit pattern (mantissa then exponent, space between).
B2
Represent 3.5 in normalised floating-point. Give the full 8-bit pattern.
B3
Represent −3.5 in normalised floating-point. Give the full 8-bit pattern.
B4
Convert 01100 011 to denary.
B5
Convert 10110 010 to denary.
B6
Which of these 5-bit mantissa patterns represents a normalised positive number?
B7
Explain what normalised floating-point means and why it is used.
B8
Convert 01110 001 to denary.
B9 — past paper style (2 marks)
Represent −0.625 in normalised 8-bit floating-point (5+3). Show all working.
B10 — past paper style (2 marks)
Explain why the same bit pattern can represent completely different denary values depending on whether it is treated as an unsigned integer or as a floating-point number.
✅ Higher checkpoint — B7 (normalised form explanation) and B9 (full negative conversion) are the highest-value question types in this topic. Confident on both = exam-ready.


Task Set B — Extension · Beyond the specification
C1
In most programming languages, 0.1 + 0.1 + 0.1 == 0.3 evaluates to False. Using your knowledge of binary fractions, explain why.
C2
Store the value 1.5 in 8-bit floating-point (5+3). Then explain what changes in the bit pattern when you multiply by 2, and give the new bit pattern. Why is multiplication by powers of 2 so efficient in floating-point?
C3
IEEE 754 double precision uses 64 bits: 1 sign bit, 11 exponent bits, 52 mantissa bits. Compare the range and precision this offers against our simplified 8-bit format (3-bit exponent, 4 fractional mantissa bits). What practical consequence does the larger exponent have?
📁 File this in OneNote under:
Higher Computing Science → Computer Systems → CS3
📌 Teacher notes (not for pupils)

Timing (120 min double):
5 min — warm up (CS2 recap), circulate
5 min — key vocabulary together
10 min — why integers aren't enough (discuss: how would you store 36.8°C?)
5 min — binary fraction place values (do a few together: what is 0.101₂?)
10 min — normalised form: show what it means visually with the format diagram
15 min — Examples 1 and 2 worked on board together, Example 3 and Now You Try independently
5 min — common mistakes ("has anyone made this one just now?")
25 min — tasks
5 min — cold call review on B4/B5 (conversion back, fastest to check)

Watch for: pupils who normalise in the wrong direction (shifting right instead of left, flipping the exponent sign); pupils who forget the two's complement step for negative numbers (just writing a 1 in the sign bit); and pupils who misread binary fractions (0.1₂ ≠ 0.1₁₀).

Whiteboard tip: draw the format diagram on the board before the lesson. Keep it visible throughout. Pupils lose track of which bits are which under exam conditions.

C1 is worth a brief mention even for pupils who don't attempt it — the 0.1 + 0.1 + 0.1 ≠ 0.3 result visibly surprises most pupils and motivates why floating-point precision matters.
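If a live demonstration of C1 is wanted, something like the following could be run on the board. It is a sketch that assumes standard Python, where floats are IEEE 754 doubles; `Decimal(0.1)` is used only to display the exact value that is actually stored for 0.1.

```python
from decimal import Decimal

total = 0.1 + 0.1 + 0.1
print(total == 0.3)          # False

# 0.1 has no exact binary fraction, so the stored value is only close to 0.1.
print(Decimal(0.1))

# Values built from halves and quarters are stored exactly, so this works.
print(0.5 + 0.25 == 0.75)    # True
```

The contrast between the first and last comparisons links back to binary place values: 0.5 and 0.25 are columns in the table, 0.1 is not.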