Chapter 3, Data Representation

Goals

After finishing this chapter you should be able to

distinguish between digital and analog information
explain data compression and calculate compression ratios
explain binary formats for negative and floating-point numbers
describe the characteristics of the ASCII and Unicode character sets
perform various types of text compression
explain the nature of sound and its representation
explain how RGB values define a color
distinguish between raster and vector graphics
explain temporal and spatial video compression

Data and Computers

Computers, without data, are useless
We need efficient ways to represent and organize this data
Types of data
- Numbers, letters (text)
- Sound (audio)
- Images, still and moving (graphics, video)
MULTIMEDIA - several different media types
- Usually refers to text, sound, and graphics
- Possibly delivered on CD-ROM (or DVD)
Data Compression - Reducing the amount of space needed to store a piece of data
- Save disk space
- Rarely a problem for text files, but audio and video files become a problem
- Personal MP3 Players.
  - Download data
  - Sizes 64K to 40 GB
  - Size of device plus compression determine how much music you can store.
- Compression Ratio - size after compression divided by size before compression
  - If a file was originally 200K and compressed to 100K the compression ratio is 100/200 = .5
  - If a file was originally 200K and compressed to 40K the compression ratio is 40/200 = 2/10 = .2
- If a compression technique preserves all of the information, the technique is called lossless; if data is lost, it is called lossy
  - Lossless compression is preferred.
  - Sometimes a lossy compression technique saves quite a bit of space,
  - but use it only when the loss of data doesn't hurt.
    - The sample images are from the WebMuseum, Paris
    - The same image is saved at several different quality settings
    - You trade quality for size (space)
      in every part of data storage/transmission
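
The difference between lossless and lossy compression can be seen in a small Python sketch. The choice of zlib as the lossless method, the sample text, and the list of 8-bit samples are all assumptions made for illustration; they do not come from the images discussed in these notes.

```python
import zlib

# Lossless: zlib stands in here for any lossless technique (an assumption).
text = b"the rain in spain stays mainly in the plain " * 50
packed = zlib.compress(text)
restored = zlib.decompress(packed)
lossless_ratio = len(packed) / len(text)   # size after / size before

# Lossy: keep only the top 4 bits of each 8-bit sample (hypothetical data).
samples = [200, 201, 199, 57, 58, 60, 130, 131]
reduced = [s >> 4 for s in samples]        # 4 bits per sample instead of 8
approx = [r << 4 for r in reduced]         # best reconstruction available

print(restored == text)    # True: every byte comes back
print(lossless_ratio)      # well below 1 for this repetitive text
print(approx == samples)   # False: the low 4 bits are gone for good
```

The lossy version stores half as many bits per sample, but the original values can only be approximated afterwards. Whether that is acceptable depends on whether the loss hurts.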

Analog vs Digital

Stored as a position - analog
Stored as a number - digital
Continuous vs discrete
From www.rolex.com
From www.kmart.com
Which is better?
Digitize - convert into digital data.
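
As a rough sketch of what digitizing means, the Python below samples a continuous signal at a few moments in time and rounds each value to one of a fixed number of integer levels. The function name `digitize`, the sine wave, the sample rate, and the number of levels are all assumed for illustration.

```python
import math

def digitize(signal, duration, sample_rate, levels):
    """Sample a continuous signal with values in [-1, 1] and
    round each sample to an integer level in 0..levels-1."""
    count = int(duration * sample_rate)
    samples = []
    for i in range(count):
        t = i / sample_rate                  # discrete moments in time
        value = signal(t)                    # continuous (analog) value
        level = round((value + 1) / 2 * (levels - 1))
        samples.append(level)                # discrete (digital) value
    return samples

def wave(t):
    """One cycle per second of a sine wave (hypothetical analog input)."""
    return math.sin(2 * math.pi * t)

digital = digitize(wave, duration=1.0, sample_rate=8, levels=16)
print(digital)
```

The continuous wave becomes a short list of whole numbers. More samples per second and more levels give a closer copy but take more space.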

Why Binary?

Easier to detect on/off
Electronic signal stabilization

Bits Required for Representation

One bit (0, 1): two things (2¹)
Two bits (00, 01, 10, 11): four things (2²)
Three bits (000, 001, 010, 011, 100, 101, 110, 111): eight things (2³)
On the other hand, how many bits do we need to represent n things?

5? (more than 4, less than 8) 3 bits
12? (more than 8, less than 16) 4 bits
ceiling - the smallest integer greater than or equal to a number
ceil(4) = 4
ceil(4.2) = 5
ceil(4.99) = 5
lg is the log base two of a number
if a = 2^x then lg(a) = x
lg(8) = 3
lg(16) = 4
lg(10) ≈ 3.32
You need ceil(lg(n)) bits to represent n things.
Easy/Hard method: find the smallest power of two that is at least n
Hard/Easy method: ceil(lg(n))
Why do we care?
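
Both ways of counting bits can be tried out in a short Python sketch. The function names are made up for this example, and the logarithm version relies on floating-point arithmetic, so treat it as an illustration for modest values of n and not a definitive implementation.

```python
import math
from itertools import product

def bit_patterns(k):
    """All distinct patterns of k bits; there are 2**k of them."""
    return ["".join(bits) for bits in product("01", repeat=k)]

def bits_needed_power_of_two(n):
    """Double the capacity until it reaches the smallest power of two >= n."""
    bits = 0
    capacity = 1              # zero bits can represent exactly one thing
    while capacity < n:
        capacity *= 2
        bits += 1
    return bits

def bits_needed_log(n):
    """ceil(lg(n)), computed with floating-point log base two."""
    return math.ceil(math.log2(n))

print(bit_patterns(2))
for n in (5, 8, 12, 16):
    print(n, bits_needed_power_of_two(n), bits_needed_log(n))
```

The two functions agree on the examples above: 5 things need 3 bits and 12 things need 4 bits. An exact power of two such as 8 needs exactly lg(8) = 3 bits.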

Image (quality setting)	Size	Compression Ratio
Original	346K
95	190K	.55
75	85K	.25
50	50K	.14
25	26K	.075
10	12K	.035
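
The ratios in the table follow from the definition given earlier (size after compression divided by size before compression). A minimal Python sketch, using the sizes from the table and a helper name chosen for this example:

```python
def compression_ratio(original_size, compressed_size):
    """Size after compression divided by size before compression."""
    return compressed_size / original_size

ORIGINAL_KB = 346                                        # original image size
sizes_kb = {95: 190, 75: 85, 50: 50, 25: 26, 10: 12}     # quality -> size

for quality, size in sizes_kb.items():
    ratio = compression_ratio(ORIGINAL_KB, size)
    print(quality, f"{size}K", f"{ratio:.3f}")
```

A smaller ratio means a smaller file: at quality 10 the image takes only a few percent of its original space.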