This week’s lab built upon the driving variable we compiled data on in last week’s lab. Last week, we did literature research on the GEF index on biodiversity, got a better idea of the background behind how the data are measured, and observed some general trends between biodiversity, country size, and forest cover. This week was much more data-heavy: we followed a very detailed procedure that brought us through cleaning up data in a spreadsheet, importing it into SPSS, and running several inferential statistics and regression analysis tests. This whole process was vaguely familiar to me since it was similar to the data analysis we performed in Bio 141, but it was still a long process trying to relearn it. Navigating through Excel and SPSS wasn’t too hard, but interpreting the results that SPSS generated was more challenging. The tables that SPSS produced were all headed with what seemed like a coded language of abbreviations like “sig,” “f,” and “t.” After working with my group and consulting Jim, we extracted the important information: the “r” and “p” values, which represent substantive and statistical significance, respectively. In order to have statistical significance, the “p” value must be below 0.05, and in order to have substantive significance, the “r” value must be above 0.2.
But what does this all mean? That is exactly the question my group had after about two and a half hours of crunching numbers. As it turns out, substantive significance describes how strongly the two variables in a data set are correlated with each other, while statistical significance concerns the probability of getting a result at least that strong by chance alone. In other words, substantive significance answers whether there is a meaningful relationship in our data, and statistical significance answers whether that relationship can be trusted to hold on a larger scale. None of our data had statistical significance, but the relationship between the percentage of a population living in an urban environment and the biodiversity score the country received had substantive significance.
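One way to make the two ideas concrete is a short Python sketch. It is not what SPSS did for us: SPSS reports a p value from a t-based test, while this sketch swaps in a simple permutation test (shuffle one variable many times and count how often the shuffled correlation is at least as strong as the real one). The function names, the use of the absolute value of “r,” and the numbers in the example are all assumptions made up for illustration and are not our lab data; only the cutoffs of 0.05 for “p” and 0.2 for “r” come from the lab.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, trials=5000, seed=0):
    """Estimate how often chance alone gives a correlation this strong.

    Assumption: a permutation test stands in for the t-based test
    that SPSS uses, so the two p values will not match exactly.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return hits / trials


def classify(r, p, r_cutoff=0.2, p_cutoff=0.05):
    """Apply the lab's cutoffs; taking abs(r) is an assumption here."""
    return {"substantive": abs(r) > r_cutoff, "statistical": p < p_cutoff}


# Hypothetical, made-up numbers for six imaginary countries.
urban_pct = [20, 35, 50, 65, 80, 90]
biodiversity = [12.0, 3.5, 9.0, 1.0, 6.0, 2.5]

r = pearson_r(urban_pct, biodiversity)
p = permutation_p_value(urban_pct, biodiversity)
print(f"r = {r:.2f}, p = {p:.3f}, {classify(r, p)}")
```

With a small sample, a fairly large “r” can still come with a “p” above 0.05, which is the same pattern of substantive but not statistical significance described above.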
Although it’s always a little bit disappointing not to find statistical significance in a study, it’s actually an equally interesting outcome to consider. We chose each environmental factor (percentage urban population, education spending, and population density) because we thought there might be a connection. We might talk a lot about certain things being a problem (to use a classic environmental theory, “population is the problem”), but when we actually attach a data set to the debate, we can assess quantitatively whether or not it is actually a problem. Doing these statistical analyses adds concrete information to discussions that can easily get blown out of proportion, and they are therefore a very important part of environmental studies.