3. Data Exploration
Table 1. Sample data table. Site ID contains information about sampling site, year and month. DOC and SUVA are water chemistry values used in this project. The rest of the columns are environmental descriptions about catchment area, human land use, surficial geology, and wildfire.
Table 1 shows a portion of the data used for this project. The sampling units are individual water samples, each with a unique site ID, water chemistry measurements, and landscape information on wildfire, land use, and surficial geology. The response variable I examine is DOC concentration, a continuous variable. The predictor variables describe each catchment: the percent of catchment area burned by wildfire, the percent of each land use type (disturbed, wetland, forest, settlement, roads, water, cropland, and other), and the percent of each surficial geology type (colluvial deposits, glaciolacustrine deposits, ice-thrust moraine, moraine, organic deposits, and pre-glacial fluvial deposits). All predictor variables are continuous and were derived using the Tabulate Intersection tool in ArcGIS Pro. The wildfire perimeter data were retrieved from Parkland County, Alberta; the land use and surficial geology data were retrieved from the Alberta Open Government website.
I classified catchments as fire-disturbed or fire-undisturbed based on the presence of wildfire within each catchment. Of the fifty-five catchments, thirty-eight were not impacted by the wildfire and seventeen were; the burned proportions of those seventeen are displayed in Figure 2. Rather than dividing the fire-disturbed catchments into several levels of fire impact, I treat them as a single group, separate from the fire-undisturbed catchments, to maintain sample size for the subsequent statistical analysis.
Figure 2. Stacked bar chart for the 17 catchments that were disturbed by the wildfire. Bars in red represent the proportion burned of each catchment area.
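The presence-based classification described above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow actually used: the catchment IDs and burned percentages are invented, and the threshold (any burned area greater than zero counts as fire-disturbed) is my reading of "presence of wildfire".

```python
# Hypothetical records: (catchment ID, percent of catchment area burned).
# IDs and percentages are invented for illustration only.
catchments = [("C01", 0.0), ("C02", 12.5), ("C03", 0.0), ("C04", 87.3)]


def classify(percent_burned):
    """Label a catchment by the presence of any burned area (assumed rule)."""
    return "fire-disturbed" if percent_burned > 0 else "fire-undisturbed"


# Map each catchment to its group, then count catchments per group.
groups = {cid: classify(pct) for cid, pct in catchments}
counts = {}
for label in groups.values():
    counts[label] = counts.get(label, 0) + 1

print(groups)
print(counts)
```

Keeping a single binary label, rather than binning by percent burned, mirrors the choice above to preserve sample size in each group.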
Figure 3. Changes in DOC concentrations (mg/L) and SUVA (L/mg·m) in fire-disturbed and fire-undisturbed catchments before and after the wildfire.
Figure 3 presents changes in DOC concentrations (mg/L) and SUVA (L/mg·m) in fire-disturbed and fire-undisturbed catchments before and after the wildfire. Thinner lines represent individual values for a specific catchment in a specific sampling campaign; thicker lines represent the averages calculated from those individual values. The gap along the x axis indicates a period without samples. Average DOC concentrations are higher in the fire-disturbed catchments than in the fire-undisturbed catchments, whereas average SUVA values appear similar in the two groups. In general, mean DOC concentrations and SUVA appear lower before the fire than after it.
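The averages drawn as thicker lines can be thought of as group means per sampling campaign. The sketch below shows one way to compute them, using only the Python standard library. The campaign labels and DOC values are invented, and `group_means` is a hypothetical helper, not a function from the original analysis.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples: (group, campaign label, DOC in mg/L). Values invented.
samples = [
    ("fire-disturbed", "campaign_1", 20.0),
    ("fire-disturbed", "campaign_1", 24.0),
    ("fire-disturbed", "campaign_2", 30.0),
    ("fire-undisturbed", "campaign_1", 15.0),
    ("fire-undisturbed", "campaign_2", 18.0),
    ("fire-undisturbed", "campaign_2", 22.0),
]


def group_means(rows):
    """Average the measured value within each (group, campaign) pair."""
    buckets = defaultdict(list)
    for group, campaign, value in rows:
        buckets[(group, campaign)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


means = group_means(samples)
for key in sorted(means):
    print(key, means[key])
```

The same helper would apply to SUVA by passing rows that carry SUVA values in the third position.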
I calculated the mean DOC and SUVA values and visualized their distributions using histograms. Each histogram appeared skewed, so I experimented with natural-log, log10, and square-root transformations to better meet the normality assumption for ANOVA, although a transformation may not be strictly necessary. Figure 4 displays the distributions of the original data and the log10-transformed data.
Figure 4. Example histograms of averaged DOC concentrations and SUVA before and after log10 transformation. The two displayed examples are from fire-undisturbed catchments before the wildfire.
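As a rough numerical companion to the histograms, the effect of each candidate transformation can be compared through a skewness statistic. The sketch below is an illustration under stated assumptions: the values are invented right-skewed numbers standing in for averaged DOC concentrations, and `skewness` is a hypothetical helper implementing the simple moment-based (population) definition, which may differ from the bias-corrected estimate reported by statistical packages.

```python
import math
from statistics import mean, pstdev

# Hypothetical right-skewed values standing in for averaged DOC (mg/L).
values = [2.0, 3.0, 4.0, 5.0, 8.0, 15.0, 40.0]


def skewness(data):
    """Moment-based skewness: the mean of the cubed z-scores."""
    m = mean(data)
    s = pstdev(data)
    return mean([((x - m) / s) ** 3 for x in data])


# Candidate transformations named in the text; all require positive values.
transforms = {
    "raw": lambda x: x,
    "log": math.log,
    "log10": math.log10,
    "sqrt": math.sqrt,
}

skew = {name: skewness([f(x) for x in values]) for name, f in transforms.items()}
for name, value in skew.items():
    print(f"{name:>5}: skewness = {value:.3f}")
```

Because log10(x) is the natural log divided by a constant, the two log transformations give the same skewness and the same histogram shape up to axis scaling; only the square-root transformation differs in how strongly it pulls in the right tail.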