CMaps
Looking at the cmap of “crazyones”:
pdftk crazyones.pdf output crazyones-uncomp.pdf uncompress
The uncompressed file contains this CMap:
begincmap
/CMapName /T1Encoding-UTF16 def
/CMapType 2 def
/CIDSystemInfo <<
/Registry (Adobe)
/Ordering (UCS)
/Supplement 0
>> def
1 begincodespacerange
<00> <FF>
endcodespacerange
1 beginbfchar
<1B> <FB00>
endbfchar
endcmap
CMapName currentdict /CMap defineresource pop
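To make the structure of this CMap concrete, here is a minimal Python sketch that pulls the bfchar entries out of the text above. The names `CMAP_TEXT` and `parse_bfchar` are made up for this illustration; this is not how pypdf parses CMaps, and it ignores bfrange sections and comments.

```python
import re

# Excerpt of the CMap shown above.
CMAP_TEXT = """\
1 begincodespacerange
<00> <FF>
endcodespacerange
1 beginbfchar
<1B> <FB00>
endbfchar
"""


def parse_bfchar(cmap_text: str) -> dict[int, str]:
    """Return {character code: Unicode string} for every bfchar entry."""
    mapping = {}
    # Each bfchar section sits between "beginbfchar" and "endbfchar".
    for section in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        # Each entry is a pair of hex strings: <source code> <destination>.
        for src, dst in re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", section):
            # The destination is UTF-16BE and may hold more than one code unit.
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping


bfchar = parse_bfchar(CMAP_TEXT)
print(bfchar)
```

The result has a single entry: code 27 (hex 1B) maps to U+FB00.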
codespacerange
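A codespacerange does not map anything to Unicode. It declares which byte sequences are valid character codes and how many bytes each code has. Here, <00> <FF> means that every code is a single byte between 00 and FF (dec: 0 to 255).
The mapping to Unicode comes from the bfchar section:
1 beginbfchar
<1B> <FB00>
That means that the code 1B (hex for 27) maps to the Unicode character U+FB00, the ligature ff (two lowercase f’s). The destination is written in UTF-16BE.
A bfchar entry maps exactly one code. It does not define a starting point or an offset for neighboring codes. Mapping consecutive codes, for example 1B ➜ FB00 and 1C ➜ FB01, takes either one bfchar entry per code or a single bfrange entry.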
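As a rough illustration of the two CMap operators as the PDF specification defines them: a codespacerange only says which codes are valid, while an offset-style mapping over consecutive codes is what a bfrange entry expresses. The helpers `in_codespace` and `expand_bfrange` are hypothetical names for this sketch, and the bfrange entry `<1B> <1C> <FB00>` is an assumed example, not taken from the file.

```python
def in_codespace(code: bytes, low: bytes, high: bytes) -> bool:
    """True if code has the range's byte length and every byte is within bounds."""
    return len(code) == len(low) and all(
        lo <= c <= hi for c, lo, hi in zip(code, low, high)
    )


def expand_bfrange(first: int, last: int, dst_start: int) -> dict[int, str]:
    """Expand a bfrange entry <first> <last> <dst_start> into single mappings.

    Simplified: assumes the destination is one code point in the Basic
    Multilingual Plane and that the range does not overflow it.
    """
    return {code: chr(dst_start + (code - first)) for code in range(first, last + 1)}


# <00> <FF> accepts any single byte, but no two-byte code.
one_byte_ok = in_codespace(b"\x1b", b"\x00", b"\xff")
two_bytes_ok = in_codespace(b"\x00\x1b", b"\x00", b"\xff")

# Assumed entry "<1B> <1C> <FB00>": 1B -> U+FB00, 1C -> U+FB01.
ligatures = expand_bfrange(0x1B, 0x1C, 0xFB00)
print(one_byte_ok, two_bytes_ok, ligatures)
```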
Within the text stream, there is
(The)-342(mis\034ts.)
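\034 is an octal escape for the decimal value 28 (hex 1C), one code above 1B. In the T1 (Cork) font encoding that the CMap name T1Encoding-UTF16 points to, this slot holds the ligature fi (U+FB01), so the string reads “misfits.”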
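The decoding of such a string can be sketched in Python as follows. `unescape_octal` is a hypothetical helper that handles only octal escapes (not `\n`, `\(` or the other escapes PDF allows), and the entry for 0x1C in `to_unicode` is an assumption based on the T1 encoding rather than something read from the file.

```python
import re


def unescape_octal(literal: str) -> bytes:
    r"""Turn PDF octal escapes such as \034 into single bytes."""
    replaced = re.sub(
        r"\\([0-7]{1,3})",
        # Keep only the low 8 bits, so every escape yields exactly one byte.
        lambda m: chr(int(m.group(1), 8) & 0xFF),
        literal,
    )
    return replaced.encode("latin-1")


raw = unescape_octal(r"mis\034ts.")

# 0x1B is the bfchar entry from the CMap; 0x1C -> U+FB01 is assumed here.
to_unicode = {0x1B: "\ufb00", 0x1C: "\ufb01"}

# Codes without a mapping are passed through unchanged in this sketch.
text = "".join(to_unicode.get(byte, chr(byte)) for byte in raw)
print(raw, text)
```

The fourth byte of `raw` is 28, and `text` contains the single ligature character U+FB01 where the plain letters "fi" would otherwise be.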