ST 445 -- SPRING 2008
HOMEWORK #6
DUE MONDAY, 07 APRIL 2008

EXERCISE

In the apr08 directory are three files containing the NC prison population for each of the years 2005, 2006, and 2007. Your task is to read in these files, put them together to get the whole series from 01 Jan 2005 through 31 Dec 2007, and find some information. (Among the issues are goofy format, missing values (more below), and ordering of obs by year, month, day.)

The files are

   prisn05.csv
   prisn06.csv
   prisn07.csv

which, as you may surmise, are comma-delimited files. The creator of these files (actually a web-app) committed a crime by leaving some empty cells in moving from a spreadsheet to a .csv file. You have learned about reading comma-delimited files using the dlm=',' option in the infile statement; here the trick is a different option called dsd, which

   1) sets the delimiter to a comma,
   2) treats two consecutive delimiters as a missing value, and
   3) removes quotation marks from values (common for character values in spreadsheets).

To check your numbers, the corresponding .html files are also included in the apr08 directory.

Other information to be found:

   a) maximum population (and when)
   b) minimum population (and when)
   c) biggest one-day increase in population (and when)
   d) biggest one-day decrease in population (and when)

And, lastly, plot the data. (In looking at these data several years ago, it appeared as if about a thousand prisoners took a Christmas vacation.)

A function that may be useful is mdy(month,day,year), which gives the SAS date for given values of month, day, and year. If no such day occurs, such as 31 June 1984, then the function returns a missing value.
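
As a rough illustration of how dsd and mdy behave, here is a minimal sketch using a few made-up inline records. The data set name (demo), the variable names, the values, and the record layout are all assumptions for the example; they are not the layout of the prisn files, which you will need to work out for yourself.

```sas
/* Illustration only: made-up inline records, NOT the layout of the prisn files. */
data demo;
   infile datalines dsd;            /* dsd: comma delimiter, ",," read as missing, quotes removed */
   input month day year pop note :$20.;
   date = mdy(month, day, year);    /* missing when no such day exists */
   format date date9.;
   datalines;
1,1,2005,100,"first day"
1,2,2005,,"count left empty"
6,31,2005,102,"no such day"
;
run;

proc print data=demo;
run;
```

In the second record the two consecutive commas should leave pop missing, and in the third record mdy should return a missing date (with a note in the log), since June has only 30 days.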