We began this past week with a fast-paced introduction to statistics, which covered how to collect statistical data (probability or non-probability sampling), a summary of descriptive and inferential statistics, and definitions of the null hypothesis (that there is no relationship between variables), p (the probability of seeing a result at least as extreme as ours if the null hypothesis were true), alpha (the threshold that p must fall below for us to reject the null hypothesis), r (the correlation between two variables), beta (the standardized regression coefficient, which describes a variable's effect once the other variables are controlled for), and r² (the share of variation in the dependent variable that the independent variables explain). The descriptive statistics were relatively familiar to me from earlier math experience, but my only exposure to inferential statistics had been posts by Nate Silver on FiveThirtyEight, a political blog that relies heavily on statistics to interpret polling data.
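To make r and r² a little more concrete, here is a minimal Python sketch of how they could be computed by hand. It is not part of the lab (we used SPSS), the function name `pearson_r` is my own, and the numbers are made up purely for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative values, NOT data from the class spreadsheet.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r ** 2  # with a single predictor, r² is simply r squared

print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

With these toy numbers r comes out at roughly 0.77, so about 60% of the variation in `y` is accounted for by `x`.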
The lab for this week (posted here) was largely an exercise in statistical analysis using SPSS. As a class, we added data on independent variables that might affect our environmental variables from last week to the shared spreadsheet. In lab teams, we then picked three of these independent variables to analyze against per capita carbon dioxide emissions. My team chose infant mortality (deaths per 1000 live births), capital formation (as a percent of GDP), and urban population (% of total). After importing the data into SPSS, we collected descriptive statistics and ran a difference of means test, a zero-order correlation, and a linear regression between our independent variables and the dependent variable, per capita carbon dioxide emissions. We found a statistically and substantively significant relationship between infant mortality and per capita carbon dioxide, which persisted across all three of our tests; urban population and carbon dioxide were significantly correlated, but this relationship disappeared in the linear regression. We interpreted the apparent causal power of infant mortality in the linear regression as merely reflecting economic and societal development, which would be the true driver of per capita carbon dioxide emissions underlying this statistical relationship. Still, it is interesting that our model suggested 44% of the variation in per capita carbon emissions was explained by infant mortality, a result that shows how misleading statistics can be if not thoughtfully interpreted.
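The way a zero-order correlation can vanish once another variable is controlled for is easier to see with a toy example. The Python sketch below is only an illustration under my own assumptions: the variable names `driver`, `proxy`, and `outcome` are hypothetical, the numbers are invented rather than taken from our spreadsheet, and the two-predictor regression is solved by hand instead of in SPSS.

```python
import math

def centered(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dot(a, b):
    """Sum of element-wise products."""
    return sum(p * q for p, q in zip(a, b))

# Invented numbers: the outcome depends only on `driver`,
# while `proxy` merely moves together with `driver`.
driver = [1, 2, 3, 4, 5, 6]
proxy = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5]
outcome = [2 * d for d in driver]

d1, d2, dy = centered(driver), centered(proxy), centered(outcome)

# Zero-order correlation between proxy and outcome (no controls)
r_proxy = dot(d2, dy) / math.sqrt(dot(d2, d2) * dot(dy, dy))

# Regression of outcome on both predictors, solving the
# 2x2 normal equations with Cramer's rule
s11, s22, s12 = dot(d1, d1), dot(d2, d2), dot(d1, d2)
s1y, s2y = dot(d1, dy), dot(d2, dy)
det = s11 * s22 - s12 ** 2
b_driver = (s1y * s22 - s2y * s12) / det
b_proxy = (s2y * s11 - s1y * s12) / det

print(f"zero-order r(proxy, outcome) = {r_proxy:.3f}")
print(f"regression slopes: driver = {b_driver:.3f}, proxy = {b_proxy:.3f}")
```

Here `proxy` correlates strongly with `outcome` on its own, yet its regression slope drops to zero once `driver` is in the model. Real data are noisier, so a controlled coefficient would usually shrink toward insignificance rather than hit zero exactly.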
Lastly, we have begun the process of deciding on a concentration. I am almost certainly focusing on urban and transportation planning, with a nascent, narrower focus on the political economy of streets and transportation. I will be situating this issue in the United States, but I am still undecided on a more specific context.