Lab 2: Information Theory

This lab is part of Assignment 1, and a continuation of Lab 1. You are to work on this with the same partner as for Lab 1, and to complete the activities below before arriving in lab.

As always, your code must comply with the CS 190 Style Specifications, and work on the ECF machines. The marking scheme for Assignment 1 is available here.

Part One: Run-Length Encoding

Run-Length Encoding is a form of lossless data compression. Each time a letter is repeated, you record how many repetitions there are, and replace the run of repeated letters with that count followed by the letter.

Examples:
WWWW -> 4W
WWWWW -> 5W
WWWWWAAA -> 5W3A
WWWWWAAAWB -> 5W3A1W1B
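
To make the run-counting idea concrete, here is a minimal sketch of one way to walk a string with * notation. The helper name rle_to_buffer and its buffer parameters are hypothetical; it writes into a buffer rather than printing, so it is not the encode method asked for below.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not the required encode()): writes the run-length
 * encoding of `in` into `out`, which has room for `out_size` bytes.
 * Returns the number of characters written, or -1 if `out` is too small. */
int rle_to_buffer(const char *in, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0) {
        return -1;
    }
    *out = '\0';

    while (*in != '\0') {
        char current = *in;   /* letter that starts this run */
        int run = 0;          /* length of the run so far */

        /* Advance while the same letter repeats. */
        while (*in == current) {
            run++;
            in++;
        }

        /* Append "<count><letter>" and check that it fit. */
        int n = snprintf(out + used, out_size - used, "%d%c", run, current);
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1;
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

Calling rle_to_buffer("WWWWWAAAWB", buf, sizeof buf) with a sufficiently large buf leaves "5W3A1W1B" in buf.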

TODO: write a program run_length_encoding.c, which uses scanf to take in a string of at most 50 characters. You can assume the input string contains no whitespace. Using this string, write a method, void encode(char* string), which prints the run-length encoding of the string. In this method, you must use * notation for arrays instead of [ ] notation.
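
When reading a string of at most 50 characters, the buffer needs one extra byte for the terminating '\0', and the format string should carry a width so that longer input cannot overflow it. The sketch below illustrates this with sscanf so that it can be tried without typing input; scanf accepts the same format string. The names MAX_INPUT and read_token are assumptions made for this illustration.

```c
#include <stdio.h>

#define MAX_INPUT 50   /* assumed limit, taken from the task description */

/* Hypothetical helper: copies one whitespace-free token of at most
 * MAX_INPUT characters from `src` into `buffer`.
 * Pre:  buffer has room for MAX_INPUT + 1 chars.
 * Post: returns 1 and buffer holds a '\0'-terminated token, or returns 0
 *       if no token could be read. */
int read_token(const char *src, char *buffer)
{
    /* The width 50 matches MAX_INPUT; %s stops at whitespace. */
    return sscanf(src, "%50s", buffer) == 1;
}
```

With scanf, the equivalent call would be scanf("%50s", buffer) on a buffer declared as char buffer[MAX_INPUT + 1].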

Example test cases:
WWWW -> 4W
WWWWWAAAWB -> 5W3A1W1B
WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWW -> 12W1B12W3B22W

Ensure that your output matches the test cases above.

Part Two: Polynomials

TODO: For this part, use scanf to take in the coefficients of a polynomial, Ax^4 + Bx^3 + Cx^2 + Dx + E, where all coefficients are non-negative integers. There should be a space between each term in the polynomial. For this section, you may assume the input for the polynomial will never evaluate to 0.

Create a file, numbers.c. In the file, create a method, int evaluate_poly(int* coeffs, int x), which takes in the array of coefficients [E, D, C, B, A], and a value x. Return the value of the evaluated polynomial.

Test cases:
0x^4 + 0x^3 + 0x^2 + 0x + 1 and x = 1 -> 1
0x^4 + 0x^3 + 1x^2 + 2x + 3 and x = 2 -> 11
5x^4 + 4x^3 + 3x^2 + 2x + 1 and x = 3 -> 547
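
One possible way to evaluate the polynomial is Horner's rule, which avoids computing powers of x separately. This is only a sketch of that technique, not a required approach. The name horner_eval and the constant NUM_COEFFS are hypothetical, and it assumes the result fits in an int.

```c
#define NUM_COEFFS 5   /* assumed: coefficients stored as [E, D, C, B, A] */

/* Hypothetical helper illustrating Horner's rule.
 * Pre:  coeffs points to NUM_COEFFS ints in the order [E, D, C, B, A].
 * Post: returns A*x^4 + B*x^3 + C*x^2 + D*x + E; coeffs is unchanged. */
int horner_eval(const int *coeffs, int x)
{
    int result = 0;

    /* Start from the highest-degree coefficient (A) and work down to E. */
    for (int i = NUM_COEFFS - 1; i >= 0; i--) {
        result = result * x + *(coeffs + i);
    }
    return result;
}
```

For the coefficients [1, 2, 3, 4, 5] and x = 3, the loop computes 5, 19, 60, 182, 547, matching the last test case above.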

TODO: Next, write a method, int get_first_digit(int x), which takes in an integer, x, and returns its first (leftmost) digit.

Test cases:
1 -> 1
12 -> 1
54712 -> 5
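
A common way to find the leftmost digit is to keep dividing by 10 until a single digit remains. The sketch below assumes x is non-negative, which holds here because all coefficients are non-negative; the name leading_digit is hypothetical.

```c
/* Hypothetical helper.
 * Pre:  x >= 0.
 * Post: returns the leftmost decimal digit of x. */
int leading_digit(int x)
{
    /* Each integer division by 10 drops the rightmost digit. */
    while (x >= 10) {
        x /= 10;
    }
    return x;
}
```

For 54712, the loop passes through 5471, 547, 54, and stops at 5.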

TODO: create a method, void values_poly(int* coeffs, int* values), to generate the result of a given polynomial as x ranges from 1 to 1000, and save the results to the array values. For example, for the polynomial x^3, the values would be 1, 8, 27, 64, 125, 216, ..., 994011992, 997002999, 1000000000. Call the method evaluate_poly as a helper method.

Since we are modifying values, a parameter given to our function, it is important to note the pre/post conditions for this method! Remember to include pre/post conditions for all your methods.
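
As an illustration of what such a comment might look like, here is a sketch of a function that fills an array and documents what it expects and what it changes. The names fill_values, eval_sketch, NUM_VALUES and NUM_COEFFS are hypothetical, the comment layout is only one possibility (follow the CS 190 Style Specifications for the required form), and the sketch assumes every result fits in an int.

```c
#define NUM_COEFFS 5      /* assumed: coefficients stored as [E, D, C, B, A] */
#define NUM_VALUES 1000   /* assumed: x ranges from 1 to NUM_VALUES */

/* Stand-in for evaluate_poly so that this sketch is self-contained. */
static int eval_sketch(const int *coeffs, int x)
{
    int result = 0;
    for (int i = NUM_COEFFS - 1; i >= 0; i--) {
        result = result * x + *(coeffs + i);
    }
    return result;
}

/* Pre:  coeffs points to NUM_COEFFS ints in the order [E, D, C, B, A];
 *       values points to writable storage for NUM_VALUES ints.
 * Post: for 0 <= i < NUM_VALUES, *(values + i) holds the polynomial
 *       evaluated at x = i + 1; coeffs is unchanged. */
void fill_values(const int *coeffs, int *values)
{
    for (int x = 1; x <= NUM_VALUES; x++) {
        *(values + (x - 1)) = eval_sketch(coeffs, x);
    }
}
```

The postcondition states explicitly that values is overwritten, which is the side effect a caller most needs to know about.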

TODO: create a method void first_values_poly(int* values) to extract the first digits of the sequence generated by values_poly -- so for x^3: 1, 8, 2, 6, 1, 2, ... 9, 9, 1. The extracted values should be saved back into the array values. You will want to use get_first_digit as a helper method.

TODO: Create a function void value_frequencies(int* values) that will take an array of first digits as an argument and compute their frequencies. You should create an array, freqs[9], where freqs[0] is the number of times 1 has been the first digit, freqs[1] is the number of times that 2 has been the first digit, etc. (Assume you will never be given a polynomial which always evaluates to 0.) Once the frequencies are computed, calculate, for each digit, the percentage of values that had it as their first digit.

In this method, print out the frequencies using the same formatting as in the test cases below. (Don't panic if your frequencies add up to 99% -- some roundoff error may happen in your calculations.)

Test cases:
0x^4 + 1x^3 + 0x^2 + 0x + 0
1: 22.500000%
2: 15.900000%
3: 12.400000%
4: 10.600000%
5: 9.400000%
6: 8.300000%
7: 7.400000%
8: 7.100000%
9: 6.300000%
0x^4 + 1x^3 + 2x^2 + 3x + 4
1: 22.600000%
2: 15.800000%
3: 12.700000%
4: 10.500000%
5: 9.200000%
6: 8.400000%
7: 7.500000%
8: 6.800000%
9: 6.400000%
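
The counting and percentage steps can be sketched as below. This is a simplified illustration on an array of any length, not the required value_frequencies; the names tally_digits, percent_of, print_percentages and NUM_DIGITS are hypothetical. The printf format "%d: %f%%" yields six decimal places followed by a percent sign, as in the test cases above.

```c
#include <stdio.h>

#define NUM_DIGITS 9   /* possible first digits are 1 through 9 */

/* Pre:  digits points to n ints; freqs points to NUM_DIGITS writable ints.
 * Post: *(freqs + d - 1) holds how many entries of digits equal d;
 *       entries outside 1..9 are ignored; digits is unchanged. */
void tally_digits(const int *digits, int n, int *freqs)
{
    for (int d = 0; d < NUM_DIGITS; d++) {
        *(freqs + d) = 0;
    }
    for (int i = 0; i < n; i++) {
        int digit = *(digits + i);
        if (digit >= 1 && digit <= NUM_DIGITS) {
            (*(freqs + digit - 1))++;
        }
    }
}

/* Returns count as a percentage of total (0.0 if total is not positive).
 * Multiplying by 100.0 first forces floating-point division. */
double percent_of(int count, int total)
{
    return total > 0 ? 100.0 * count / total : 0.0;
}

/* Prints one "digit: percentage%" line per digit. */
void print_percentages(const int *freqs, int total)
{
    for (int d = 0; d < NUM_DIGITS; d++) {
        printf("%d: %f%%\n", d + 1, percent_of(*(freqs + d), total));
    }
}
```

For the first six first digits of x^3 (1, 8, 2, 6, 1, 2), tally_digits counts 1 twice, 2 twice, and 6 and 8 once each.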

Look up Benford's Law -- do the distributions we have seen follow this pattern? Why or why not? How can you tell? Document this in your report.
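
For reference when making the comparison, Benford's Law is commonly stated as the following probability that the first digit of a value is d:

```latex
P(d) = \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d \in \{1, 2, \dots, 9\}
```

This gives about 30.1% for d = 1, falling to about 4.6% for d = 9.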