cd "C:\"

*1. Aggregate and merge the county-level data with the state-level data.
use "http://www.unm.edu/~ckbutler/stats/classData/CountyCensusData.dta", clear
rename state stnumber
*stnumber is constant within a state, so its mean is just the state's number.
collapse (mean) stnumber ///
    (sum) census2000pop popestimate2000 popestimate2001 popestimate2002 ///
    popestimate2003 popestimate2004 popestimate2005 popestimate2006 popestimate2007, ///
    by(stname)
sort stnumber
save "AggregatedCountyCensusData.dta", replace

use "http://www.unm.edu/~ckbutler/stats/classData/StateDemographics2008.dta", clear
*Build a numeric state identifier that matches stnumber in the county data.
gen stnumber=.
replace stnumber=1 if state=="Alabama"
replace stnumber=2 if state=="Alaska"
replace stnumber=4 if state=="Arizona"
replace stnumber=5 if state=="Arkansas"
replace stnumber=6 if state=="California"
replace stnumber=8 if state=="Colorado"
replace stnumber=9 if state=="Connecticut"
replace stnumber=10 if state=="Delaware"
replace stnumber=11 if state=="District of Columbia"
replace stnumber=12 if state=="Florida"
replace stnumber=13 if state=="Georgia"
replace stnumber=15 if state=="Hawaii"
replace stnumber=16 if state=="Idaho"
replace stnumber=17 if state=="Illinois"
replace stnumber=18 if state=="Indiana"
replace stnumber=19 if state=="Iowa"
replace stnumber=20 if state=="Kansas"
replace stnumber=21 if state=="Kentucky"
replace stnumber=22 if state=="Louisiana"
replace stnumber=23 if state=="Maine"
replace stnumber=24 if state=="Maryland"
replace stnumber=25 if state=="Massachusetts"
replace stnumber=26 if state=="Michigan"
replace stnumber=27 if state=="Minnesota"
replace stnumber=28 if state=="Mississippi"
replace stnumber=29 if state=="Missouri"
replace stnumber=30 if state=="Montana"
replace stnumber=31 if state=="Nebraska"
replace stnumber=32 if state=="Nevada"
replace stnumber=33 if state=="New Hampshire"
replace stnumber=34 if state=="New Jersey"
replace stnumber=35 if state=="New Mexico"
replace stnumber=36 if state=="New York"
replace stnumber=37 if state=="North Carolina"
replace stnumber=38 if state=="North Dakota"
replace stnumber=39 if state=="Ohio"
replace stnumber=40 if state=="Oklahoma"
replace stnumber=41 if state=="Oregon"
replace stnumber=42 if state=="Pennsylvania"
replace stnumber=44 if state=="Rhode Island"
replace stnumber=45 if state=="South Carolina"
replace stnumber=46 if state=="South Dakota"
replace stnumber=47 if state=="Tennessee"
replace stnumber=48 if state=="Texas"
replace stnumber=49 if state=="Utah"
replace stnumber=50 if state=="Vermont"
replace stnumber=51 if state=="Virginia"
replace stnumber=53 if state=="Washington"
replace stnumber=54 if state=="West Virginia"
replace stnumber=55 if state=="Wisconsin"
replace stnumber=56 if state=="Wyoming"
sort stnumber
save "StateDemographics2008_newMaster.dta", replace
merge stnumber using "AggregatedCountyCensusData.dta"

*2. Check whether population2004 and population2007 from StateDemographics2008.dta and popestimate2004 and popestimate2007 from the aggregated CountyCensusData.dta, respectively, are the same.
corr population2004 popestimate2004
corr population2007 popestimate2007

*3. What is the correlation between a state's electoral votes and its population in the year 2000?
corr electoralvotes census2000pop

*4. What is the correlation between a state's electoral votes and its estimated population in the year 2006?
corr electoralvotes popestimate2006

*5. Generate a new variable that is a state's estimated population in 2004 divided by its electoral votes. Summarize the new variable and the two variables that went into its construction.
gen ratio2004=popestimate2004/electoralvotes
sum ratio2004 popestimate2004 electoralvotes

*6. Calculate the national ratio of population to each electoral vote for the year 2004. (You can use the “display” command.)
*Here are two ways to do this.
*The first recognizes that a national total = number of observations * the mean. (Yes, the 51 cancels itself out, but you can check the subvalues.)
display (51*5748853)/(51*10.54902)
*The second method uses 'extensions to generate' and then a simple generate command to arrive at the same number.
egen totalElectoralVotes=total(electoralvotes)
egen totalNationalPopulation2004=total(popestimate2004)
gen nationalRatio2004=totalNationalPopulation2004/totalElectoralVotes
*nationalRatio2004 is the same in every row, so display shows its value in the first observation.
display nationalRatio2004

*7. Do a t-test of whether the variable in question 5 equals the value calculated in question 6. (“ttest var=value”)
*Depending on which method you used in question 6:
ttest ratio2004=544965.6
ttest ratio2004=nationalRatio2004
*The first is a one-sample test against a typed-in constant; the second compares ratio2004 with a variable that holds the same constant in every row.
*The output differs somewhat, but the interpretation of the p-values is the same: the average state ratio is different from the national one.

*8. Produce a scatterplot of a state's electoral votes (on the y-axis) and its estimated population in the year 2004.
scatter electoralvotes popestimate2004, name(graph1)
*This graph is a no-brainer: bigger states (by population) have more electoral votes.
*A more telling graph is the following, in which each state's ratio is plotted against its population, with a horizontal line at the national ratio of population to electoral votes.
*It shows clearly that larger states have more population per electoral vote than smaller states.
scatter ratio2004 popestimate2004, yline(544965.6) ylabel(0 167753 544966 660436) mlabel(state) name(graph2)
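
*Optional sketch, not part of the original answers: the hand-typed means in question 6 and the hand-typed 544965.6 in question 7 could instead come from stored results.
*This assumes the merged data from question 1 and ratio2004 from question 5 are still in memory.
*The names natPop2004, natEV, and natRatio are made up for this illustration.
quietly summarize popestimate2004
scalar natPop2004 = r(sum)
quietly summarize electoralvotes
scalar natEV = r(sum)
*National population per electoral vote, computed from the two totals.
display scalar(natPop2004)/scalar(natEV)
*Pass the computed value to the one-sample t-test instead of retyping it.
local natRatio = scalar(natPop2004)/scalar(natEV)
ttest ratio2004 == `natRatio'
*Taking the value from r(sum) avoids copying a rounded number from the Results window.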