Problem 1 - Decrypting Government Data
Your job is to summarize this gov data about oil consumption
- The format of the file is rather bizarre - note that each line has data for two months, in two different years! (Plus I had to hand edit the file to make it parseable)
- Fortunately, Python is great for untangling and manipulating data.
- Write a generator that reads from the given URL over the network, and produces a summary line for a year’s data on each ‘next’ call
- remember that a urllib.request response yields ‘bytes’ objects, not strings, so each line has to be decoded before it is parsed
- The generator should read the lines of the oil2.txt file in a lazy fashion - it should only read 13 lines for every two years of output. Note that a loop can contain any number of ‘yield’ statements.
- Ignore the monthly data, just extract the yearly info
- Drop the month column
- In addition to the ‘oil’ generator function, my solution had a separate helper function, ‘def makeCSVLine(year, data):’
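The lazy-reading requirement above could be sketched roughly as follows. This is only a minimal sketch, not the course solution: the helper names decoded_lines and blocks_of are hypothetical, the body of makeCSVLine is a guess at what such a helper might do, and the parsing of the yearly figures inside each 13-line block is left out because it depends on the file layout. In real use the stream would be the object returned by urllib.request.urlopen(url), which iterates over lines as bytes; an io.BytesIO stands in for it here so the sketch runs without a network.

```python
import io
from itertools import islice


def decoded_lines(stream, encoding="utf-8"):
    """Lazily yield text lines from a stream that produces bytes lines."""
    for raw in stream:
        yield raw.decode(encoding).rstrip("\r\n")


def blocks_of(lines, size=13):
    """Yield lists of `size` consecutive lines, pulling only what is needed."""
    it = iter(lines)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def makeCSVLine(year, data):
    """Assumed behavior: join a year and its fields into one CSV line."""
    return ",".join([str(year)] + [str(x) for x in data])


# Stand-in for urllib.request.urlopen(url): 26 placeholder lines of bytes.
fake_response = io.BytesIO(b"".join(b"line %d\n" % i for i in range(26)))
gen = blocks_of(decoded_lines(fake_response))
first = next(gen)  # consumes only the first 13 lines of the stream
```

Because both helpers are generators, nothing is read from the stream until ‘next’ is called, and each call pulls exactly one block of 13 lines.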
Here are the first two years of data, 2014 and 2013
Year,Quantity,QuantityChange,Unknown,Unknown2,Price,PriceChange
now that we have something that looks like a CSV file, we can do all kinds of things
Input:
with open('/tmp/oil.csv', 'w') as f:
Output:
Year Quantity QuantityChange Unknown Unknown2 Price PriceChange
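As one example of what a CSV file makes easy, the standard csv module can read the rows back as dictionaries keyed by the header. This is a minimal sketch, assuming the header shown above; the two data rows are made-up placeholders, not real figures from the data set, and an in-memory string stands in for /tmp/oil.csv.

```python
import csv
import io

# Placeholder content: the header is from the problem, the numbers are invented.
sample = io.StringIO(
    "Year,Quantity,QuantityChange,Unknown,Unknown2,Price,PriceChange\n"
    "2014,100,-5,0,0,90.5,-1.5\n"
    "2013,105,3,0,0,92.0,2.0\n"
)

# DictReader uses the first line as field names; every value arrives as a string.
rows = list(csv.DictReader(sample))
quantity_by_year = {int(r["Year"]): float(r["Quantity"]) for r in rows}
print(quantity_by_year)
```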
Problem 2
- suppose we want to convert between C (Celsius) and F (Fahrenheit), using the equation 9C = 5(F - 32)
- could write functions ‘c2f’ and ‘f2c’
- do all computation in floating point for this problem
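Solving 9C = 5(F - 32) for each variable gives F = 9C/5 + 32 and C = 5(F - 32)/9. One possible sketch of the two functions, using float literals so the arithmetic stays in floating point:

```python
def c2f(c):
    """Celsius to Fahrenheit, from 9C = 5(F - 32) solved for F."""
    return 9.0 * c / 5.0 + 32.0


def f2c(f):
    """Fahrenheit to Celsius, from 9C = 5(F - 32) solved for C."""
    return 5.0 * (f - 32.0) / 9.0


print(c2f(100), f2c(32))
```

A quick sanity check is that -40 is the one temperature where both scales agree, so c2f(-40) and f2c(-40) should both come out as -40.0.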
Problem 3 - Hamlet
Python is very popular in ‘digital humanities’
MIT has the complete works of Shakespeare in a simple HTML format
You will do a simple analysis of Hamlet by reading the HTML file, one line at a time (the usual iteration scheme), and doing pattern matching
The goal is to return a list containing linecnt (the number of lines read), the total number of ‘speeches’ (look at the file format), and a dict showing the number of ‘speeches’ each character gives
Your program should read directly from the URL given, but you may want to download a copy to examine the structure of the file.
remember that a urllib.request response yields ‘bytes’ objects, not strings
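The line-at-a-time pattern matching described above might be sketched like this. It is a sketch under a stated assumption, not the intended solution: it assumes each speech starts with a heading line shaped like <A NAME=speech1><b>BERNARDO</b></a>, which should be checked against the real file, and the function name analyze is hypothetical. The function takes any iterable of bytes lines, so a urllib.request.urlopen response could be passed in directly; a short hand-written list is used here instead.

```python
import re
from collections import Counter

# Assumed shape of a speech heading line: <A NAME=speech1><b>BERNARDO</b></a>
SPEECH = re.compile(r"<a name=speech\d+><b>(.*?)</b>", re.IGNORECASE)


def analyze(lines):
    """Return [linecnt, total speeches, {speaker: speeches}] for bytes lines."""
    linecnt = 0
    speakers = Counter()
    for raw in lines:
        linecnt += 1
        # Lines arrive as bytes, so decode before matching.
        match = SPEECH.search(raw.decode("utf-8", errors="replace"))
        if match:
            speakers[match.group(1).strip()] += 1
    return [linecnt, sum(speakers.values()), dict(speakers)]


# Hand-written lines in the assumed format, standing in for the real file.
sample = [
    b"<A NAME=speech1><b>BERNARDO</b></a>\n",
    b"<A NAME=1.1.1>Who's there?</A><br>\n",
    b"<A NAME=speech2><b>FRANCISCO</b></a>\n",
    b"<A NAME=1.1.2>Nay, answer me: stand, and unfold yourself.</A><br>\n",
    b"<A NAME=speech3><b>BERNARDO</b></a>\n",
]
result = analyze(sample)
```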
here’s a short sample of the file
Problem 5
define the ‘__mul__’ method for polydict
Input:
[pd1, pd2, pd3, pd1 * pd2, pd1 * pd3, pd2 * pd3]
Output:
[+ 3 * X ** 2 + 2 * X + 1,
+ 5 * X ** 2 + 10 * X,
3 * X ** 2 + 2 * X + 1,
5 * X ** 2 + 10 * X,
5 * X ** 2 + 10 * X ** -1,
15 * X ** 4 + 40 * X ** 3 + 25 * X ** 2 + 10 * X,
15 * X ** 4 + 10 * X ** 3 + 5 * X ** 2 + 30 * X + 20 + 10 * X ** -1,
25 * X ** 4 + 50 * X ** 3 + 50 * X + 100]
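Polynomial multiplication pairs every term of one operand with every term of the other, adding exponents and multiplying coefficients. A minimal sketch of that idea, assuming polydict is a dict subclass mapping exponent to coefficient (the class definition is not shown here, so the class below is only a hypothetical stand-in, and it omits the printing code that produces the ‘X **’ form):

```python
class polydict(dict):
    """Hypothetical stand-in: maps exponent -> coefficient."""

    def __mul__(self, other):
        product = polydict()
        for e1, c1 in self.items():
            for e2, c2 in other.items():
                # Exponents add, coefficients multiply; like terms accumulate.
                product[e1 + e2] = product.get(e1 + e2, 0) + c1 * c2
        # Drop terms whose coefficients cancelled to zero.
        return polydict({e: c for e, c in product.items() if c != 0})


pd1 = polydict({2: 3, 1: 2, 0: 1})  # 3 * X ** 2 + 2 * X + 1
pd2 = polydict({2: 5, 1: 10})       # 5 * X ** 2 + 10 * X
pd3 = polydict({2: 5, -1: 10})      # 5 * X ** 2 + 10 * X ** -1
```

With these definitions, pd1 * pd2 holds the coefficients 15, 40, 25 and 10 for exponents 4, 3, 2 and 1. Negative exponents need no special handling, since the dict keys are simply added.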