ST445 Class Exercise #4
June 07, 2012

The course web site contains three files with the NC prison population for each of the years 2005, 2006, and 2007. Your task is to read in these files, put them together to get the whole series from 01 Jan 2005 through 31 Dec 2007, and find some information. (Among the issues are a goofy format, missing values (more below), and the ordering of observations by year, month, and day.)

The files are

prisn05.csv
prisn06.csv
prisn07.csv

which, as you may surmise, are comma-delimited files. The creator of these files (actually a web-app) committed a crime by leaving some empty cells in moving from a spreadsheet to a .csv file. You have learned about reading comma-delimited files using the dlm=',' option in the infile statement; here the key is the dsd option, which
1) treats two consecutive delimiters as a missing value,
2) doesn't look inside quoted character values for embedded commas, and
3) removes quotation marks from character values (common for character values in spreadsheets).

To check your numbers, the corresponding .html files are also included on the web site.

a) Read each file, creating a dataset for each, with variables population, year, month, and day.
b) Sort by year, month, and day.
c) Put the datasets together with set.
d) Create a date variable, using the SAS function mdy(month,day,year), which gives the SAS date for given values of month, day, and year. (Note: If no such day occurs, such as 31 June 1984, then the function returns a missing value. Delete days that do not exist.)
e) Create a variable that holds the difference in the population from the previous day.
f) Find some useful information: maximum and minimum population, and the biggest one-day increase and decrease in population.
g) (optional) Can you determine when those events occurred? (In looking at these data several years ago, it appeared as if about a thousand prisoners took a Christmas vacation.)

Submit your output and log and answer any question
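
The mechanics of dsd, mdy, and the day-to-day difference can be tried out on a few inline records before touching the real files. The following is only a sketch under stated assumptions: the column order (month, day, population, year), the dataset names toy and series, the variable name change, and all the numbers are made up for illustration and are not taken from prisn05.csv, whose actual layout should be inspected first.

```sas
/* Toy data only: the layout and values are hypothetical, not from the course files. */
data toy;
   /* dsd: comma is the default delimiter, and two consecutive commas give a missing value */
   infile datalines dsd truncover;
   input month day population year;
datalines;
3,2,35990,2005
2,28,36025,2005
2,29,,2005
3,1,36040,2005
2,27,36010,2005
;
run;

/* Put the observations in calendar order */
proc sort data=toy;
   by year month day;
run;

data series;
   set toy;
   /* mdy returns missing for a day that does not exist (here 29 Feb 2005) */
   date = mdy(month, day, year);
   if date = . then delete;
   /* dif(x) is x minus the value of x from the previous time dif was executed; */
   /* because deleted rows never reach this line, it skips the nonexistent day  */
   change = dif(population);
   format date date9.;
run;

/* Smallest and largest population and one-day change */
proc means data=series min max;
   var population change;
run;

/* Listing the rows shows on which dates the extremes occur */
proc print data=series;
   var date population change;
run;
```

Expect a note in the log about an invalid argument to MDY for the nonexistent day; that is the missing value the exercise asks you to delete. For the real files, the infile statement would point at the .csv file instead of datalines, and options such as firstobs= may be needed if the file starts with header rows; whether it does is an assumption to check against the file itself.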