Python Assignment Help: COMSW3101-Introduction-To-Python

Problem 1 - Decrypting Government Data

Your job is to summarize this government data about oil consumption.

  • The format of the file is rather bizarre: note that each line has data for two months, in two different years! (Plus, I had to hand-edit the file to make it parseable.)
  • Fortunately, Python is great for untangling and manipulating data.
  • Write a generator that reads from the given URL over the network and produces a summary line for one year’s data on each ‘next’ call.
  • Remember that urllib.request returns bytes objects, not strings.
  • The generator should read the lines of the oil2.txt file lazily: it should read only 13 lines for every two years of output. Note that a loop can contain any number of ‘yield’ statements.
  • Ignore the monthly data; just extract the yearly info.
  • Drop the month column.
  • In addition to the ‘oil’ generator function, my solution had a separate helper function, ‘def makeCSVLine(year, data):’.
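The lazy, block-at-a-time reading described above can be sketched as follows. This is only an illustration under stated assumptions, not the instructor's solution: the layout of oil2.txt is not shown here, so the sketch assumes (hypothetically) that the 13th line of each block is the yearly total, laid out as a label, the first year and its six values, then the second year and its six values, all separated by whitespace. The name `oil_from_lines` is also made up; it takes any iterable of bytes lines so it can be tried without a network connection.

```python
from itertools import islice

HEADER = 'Year,Quantity,QuantityChange,Unknown,Unknown2,Price,PriceChange'

def makeCSVLine(year, data):
    """Join a year and its data fields into one comma-separated line."""
    return ','.join([str(year)] + [str(field) for field in data])

def oil_from_lines(lines, block_size=13):
    """Lazily yield a header, then two CSV lines per 13-line block.

    ASSUMPTION (hypothetical layout): the last line of each block is the
    yearly total: label, yearA, 6 values, yearB, 6 values.
    """
    lines = iter(lines)
    yield HEADER
    while True:
        # islice pulls at most block_size lines, so nothing is read ahead.
        block = list(islice(lines, block_size))
        if len(block) < block_size:
            return
        # The source yields bytes, so decode before splitting.
        fields = block[-1].decode('utf-8').split()
        yield makeCSVLine(fields[1], fields[2:8])    # first year
        yield makeCSVLine(fields[8], fields[9:15])   # second year
```

With the real file, the same generator could probably be fed the object returned by `urllib.request.urlopen(url)`, since that response iterates over lines as bytes; the field slicing would need to be adjusted to the actual file layout.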

Here are the first two years of data, 2014 and 2013:

Year,Quantity,QuantityChange,Unknown,Unknown2,Price,PriceChange
2014,2700903,-112867,246409332,-26397845,91.23,-5.72
2013,2813770,-283638,272807177,-40367786,96.95,-4.15
2012,3097408,-224509,313174963,-18407090,101.11,1.29
2011,3321917,-55160,331582053,79421544,99.82,25.15
2010,3377077,62290,252160509,63448733,74.67,17.74
2009,3314787,-275841,188711776,-153200712,56.93,-38.29
2008,3590628,-99940,341912488,104700835,95.22,30.95
2007,3690568,-43658,237211653,20584322,64.28,6.26
2006,3734226,-20445,216627331,40871990,58.01,11.20
2005,3754671,-66308,175755341,44012676,46.81,12.33
2004,3820979,144974,131742665,32575492,34.48,7.50
2003,3676005,257983,99167173,21883842,26.98,4.37
2002,3418022,-53045,77283331,2990437,22.61,1.21
2001,3471067,71827,74292894,-15583539,21.40,-5.04
2000,3399240,171148,89876433,38986812,26.44,10.68
1999,3228092,-14620,50889621,13637399,15.76,4.28
1998,3242712,173281,37252222,-16973685,11.49,-6.18
1997,3069431,175785,54225907,-704950,17.67,-1.32
1996,2893646,126333,54930857,11181204,18.98,3.17

Now that we have something that looks like a CSV file, we can do all kinds of things.

Input:

with open('/tmp/oil.csv', 'w') as f:
    for l in oil(url):
        f.write(l + '\n')

o = oil(url)
ls = list(o)
s = '\n'.join(ls)
import pandas as pd
import io
# we will cover StringIO next week - kind of an 'in-memory' file
df = pd.read_csv(io.StringIO(s))
df

Output:

    Year  Quantity  QuantityChange    Unknown    Unknown2   Price  PriceChange
0   2014   2700903         -112867  246409332   -26397845   91.23        -5.72
1   2013   2813770         -283638  272807177   -40367786   96.95        -4.15
2   2012   3097408         -224509  313174963   -18407090  101.11         1.29
3   2011   3321917          -55160  331582053    79421544   99.82        25.15
4   2010   3377077           62290  252160509    63448733   74.67        17.74
5   2009   3314787         -275841  188711776  -153200712   56.93       -38.29
6   2008   3590628          -99940  341912488   104700835   95.22        30.95
7   2007   3690568          -43658  237211653    20584322   64.28         6.26
8   2006   3734226          -20445  216627331    40871990   58.01        11.20
9   2005   3754671          -66308  175755341    44012676   46.81        12.33
10  2004   3820979          144974  131742665    32575492   34.48         7.50
11  2003   3676005          257983   99167173    21883842   26.98         4.37
12  2002   3418022          -53045   77283331     2990437   22.61         1.21
13  2001   3471067           71827   74292894   -15583539   21.40        -5.04
14  2000   3399240          171148   89876433    38986812   26.44        10.68
15  1999   3228092          -14620   50889621    13637399   15.76         4.28
16  1998   3242712          173281   37252222   -16973685   11.49        -6.18
17  1997   3069431          175785   54225907     -704950   17.67        -1.32
18  1996   2893646          126333   54930857    11181204   18.98         3.17
19  1995   2767313           63116   43749653     5270236   15.81         1.58
20  1994   2704197          160822   38479417       10041   14.23        -0.90
21  1993   2543375          248805   38469376      -83679   15.13        -1.68

Problem 2

  • Suppose we want to convert between C (Celsius) and F (Fahrenheit), using the equation 9C = 5(F - 32).
  • We could write functions ‘c2f’ and ‘f2c’.
  • Do all computation in floating point for this problem.
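As a minimal sketch of the two functions named above, solving 9C = 5(F - 32) for each variable gives F = 9C/5 + 32 and C = 5(F - 32)/9. The explicit `float()` conversions are one possible way to honor the floating-point requirement; the problem statement does not prescribe how.

```python
def c2f(c):
    """Celsius to Fahrenheit: solve 9C = 5(F - 32) for F."""
    return 9.0 * float(c) / 5.0 + 32.0

def f2c(f):
    """Fahrenheit to Celsius: solve 9C = 5(F - 32) for C."""
    return 5.0 * (float(f) - 32.0) / 9.0
```

For example, `c2f(100)` gives 212.0 and `f2c(32)` gives 0.0; the two scales meet at -40.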

Problem 3 - Hamlet

Python is very popular in ‘digital humanities’.

MIT has the complete works of Shakespeare in a simple HTML format.

You will do a simple analysis of Hamlet by reading the HTML file one line at a time (the usual iteration scheme) and doing pattern matching.

The goal is to return a list containing the line count (linecnt), the total number of ‘speeches’ (look at the file format), and a dict showing the number of ‘speeches’ each character gives.

Your program should read directly from the URL given, but you may want to download a copy to examine the structure of the file.
Remember that urllib.request returns bytes objects, not strings.
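The line-by-line pattern matching could look roughly like the sketch below. The regular expression is an assumption about the file: it supposes that each speech begins with a line such as `<A NAME=speech1><b>BERNARDO</b></a>`, which should be checked against a downloaded copy. The function name `count_speeches` is hypothetical, and it accepts any iterable of bytes lines so it can be tried on a small sample.

```python
import re
from collections import Counter

# ASSUMPTION: a speech starts on a line like
#   <A NAME=speech1><b>BERNARDO</b></a>
# Verify this against the real file before relying on it.
SPEECH = re.compile(r'<a name="?speech\d+"?><b>(.+?)</b>', re.IGNORECASE)

def count_speeches(lines):
    """Return [linecnt, total speeches, {character: speech count}]."""
    linecnt = 0
    speakers = Counter()
    for raw in lines:  # each raw line is a bytes object
        linecnt += 1
        match = SPEECH.search(raw.decode('utf-8', errors='replace'))
        if match:
            speakers[match.group(1)] += 1
    return [linecnt, sum(speakers.values()), dict(speakers)]
```

To run it against the live page, the response from `urllib.request.urlopen(url)` can be passed in directly, since it yields one bytes line per iteration.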

Here’s a short sample of the file:

Problem 5

Define the __mul__ method for polydict.
Input:

[pd1, pd2, pd3, pd1 * pd2, pd1 * pd3, pd2 * pd3]

Output:

[+ 3 * X ** 2 + 2 * X + 1,
+ 5 * X ** 2 + 10 * x,
3 * X ** 2 + 2 * X + 1,
5 * X ** 2 + 10 * X,
5 * X ** 2 + 10 * X ** -1,
15 * X ** 4 + 40 * X ** 3 + 25 * X ** 2 + 10 * X,
15 * X ** 4 + 10 * X ** 3 + 5 * X ** 2 + 30 * X + 20 * X ** -1,
25 * X ** 4 + 50 * X ** 3 + 50 * X + 100]
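One way to approach `__mul__` is sketched below. The polydict class itself is not shown in this section, so the sketch uses a hypothetical stand-in, `PolyDict`, that stores a polynomial as a dict mapping each exponent to its coefficient in an attribute called `terms` (also an assumption). Multiplication pairs every term of one polynomial with every term of the other, adding exponents and multiplying coefficients.

```python
class PolyDict:
    """Hypothetical stand-in: a polynomial stored as {exponent: coefficient}."""

    def __init__(self, terms):
        # Drop zero coefficients so equal polynomials have equal dicts.
        self.terms = {exp: coef for exp, coef in terms.items() if coef != 0}

    def __mul__(self, other):
        product = {}
        for exp1, coef1 in self.terms.items():
            for exp2, coef2 in other.terms.items():
                exp = exp1 + exp2  # X**a * X**b == X**(a + b)
                product[exp] = product.get(exp, 0) + coef1 * coef2
        return PolyDict(product)


pd1 = PolyDict({2: 3, 1: 2, 0: 1})   # 3 * X ** 2 + 2 * X + 1
pd2 = PolyDict({2: 5, 1: 10})        # 5 * X ** 2 + 10 * X
pd3 = PolyDict({2: 5, -1: 10})       # 5 * X ** 2 + 10 * X ** -1
```

Because exponents are just dict keys, negative exponents such as the `X ** -1` term need no special handling.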