Title: | Accessing Public Data from the ABMI |
---|---|
Description: | Public data from the Alberta Biodiversity Monitoring Institute. |
Authors: | Peter Solymos [aut, cre], Joan Fang [aut], ABMI [cph] |
Maintainer: | Peter Solymos <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.0.1 |
Built: | 2024-11-01 11:15:31 UTC |
Source: | https://github.com/ABbiodiversity/abmidata |
Function names are prefixed with ad_, referring to ABMI data:
ad_get_table_names lists the names of tables available for download,
ad_get_table_header lists table headers,
ad_get_table_size gives the size of the table,
ad_get_table_data gets a chunk of a table,
ad_get_table gets the entire table.
ad_get_table_names()
ad_get_table_header(table)
ad_get_table_data(table, site = NULL, year = NULL, skip = 0L, take = 1000L)
ad_get_table_size(table)
ad_get_table(table, site = NULL, year = NULL)
table |
Title of a given table, e.g. |
site |
Site name (returns sites whose name contains the given value). |
year |
Year (returns records from the given year). |
skip |
Row to start from (default 0). |
take |
Total rows to return (default 1000). |
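The skip and take arguments describe offset-based paging. The following is a minimal sketch of how a whole table could be assembled from chunks; it is not the package's implementation of ad_get_table. To stay runnable offline, it uses a hypothetical local stand-in (fetch_chunk over fake_table) in place of ad_get_table_data, and the helper name get_all is likewise an assumption.

```r
# Hypothetical stand-in for a remote table (not real ABMI data).
fake_table <- data.frame(id = 1:2500, value = sqrt(1:2500))

# Hypothetical stand-in for ad_get_table_data(): returns `take` rows
# starting after the first `skip` rows, or zero rows past the end.
fetch_chunk <- function(skip = 0L, take = 1000L) {
  n <- nrow(fake_table)
  if (skip >= n) return(fake_table[0, , drop = FALSE])
  fake_table[seq(skip + 1L, min(skip + take, n)), , drop = FALSE]
}

# Request successive chunks until a short or empty chunk signals the end.
get_all <- function(fetch, take = 1000L) {
  chunks <- list()
  skip <- 0L
  repeat {
    chunk <- fetch(skip = skip, take = take)
    if (nrow(chunk) == 0L) break
    chunks[[length(chunks) + 1L]] <- chunk
    if (nrow(chunk) < take) break
    skip <- skip + take
  }
  do.call(rbind, chunks)
}

full <- get_all(fetch_chunk)
nrow(full)
```

With the real package, the same loop shape would presumably pass ad_get_table_data together with a table name, and ad_get_table_size could be used to check the row count beforehand.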
ad_get_table_names returns a named character vector with table names and descriptions.
ad_get_table_header returns a character vector with header names.
ad_get_table_size returns a single numeric value with the size of the table.
ad_get_table_data returns a data frame.
ad_get_table returns a data frame.
See ad_convert_na() for dealing with missing value indicators.
## Not run:
ad_get_table_names()
ad_get_table_header("T01A")
ad_get_table_data("T01A", take=10)
ad_get_table_size("T01A")
str(ad_get_table("T01A", year=2010))
## End(Not run)
ABMI-specific missing value indicators complicate data processing because numeric variables that contain them are treated as character. These helpers handle the indicators. There are 4 kinds of missing value indicators in ABMI data tables:
ad_convert_na(x)
ad_process_na(x)
x |
A vector. |
VNA: Variable Not Applicable. Some ABMI data is collected in a nested manner. For example, Tree Species is a parent variable. This variable has a number of child variables that are used to describe the parent variable in more detail (e.g., condition, DBH, decay stage). When the parent variable is recorded as None, child variables no longer apply and are recorded as VNA. VNA is also used when the protocol calls for a modified sampling procedure based on site conditions (e.g., surface substrate protocol variant for hydric site conditions). The use of VNA implies that users of the data should not expect that any data could be present.
DNC: Did Not Collect. DNC is used to describe variables that should have been collected but were not. There are a number of reasons that data might not have been collected (e.g. staff oversight, equipment failure, safety concerns, environmental conditions, or time constraints). Regardless of the reason data was not collected, if under ideal conditions it should have been, the record in the data entry file reads DNC. The use of DNC implies that users should expect the data to be present - though it is not.
PNA: Protocol Not Available. The ABMI's protocols were, and continue to be, implemented in a staged manner. As a result, the collection of many variables began in years subsequent to the start of the prototype or operational phases, or was discontinued after a few years of trial. When a variable was not collected because the protocol had yet to be implemented by the ABMI (or was discontinued by the ABMI), the data entry record reads PNA. This is a global constraint on the data (e.g., if a protocol was not implemented until 2006, previous years cannot have this variable). PNA is used to describe the lack of data collection for entire years.
SNI: Species Not Identified. In various fields related to species identification, SNI is used to indicate that the organism was not identified to the species level. Possible reasons that identification to the species level was not possible include an insufficient or deficient sample and a lack of field time.
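As a rough illustration of why these indicators force a column to character, and of the kind of conversion the helpers perform, here is a minimal base R sketch. It is not the package's implementation; the example vector and the names codes, code_counts, and x_num are made up for illustration.

```r
# Made-up example column: numbers mixed with ABMI indicator codes,
# so R stores the whole vector as character.
x <- c("180", "VNA", "45", "DNC", "DNC", "PNA", "270")
codes <- c("VNA", "DNC", "PNA", "SNI")

# Count how often each of the four indicators occurs (zero counts kept).
code_counts <- table(factor(x[x %in% codes], levels = codes))

# Replace every indicator with NA, then convert to numeric.
x_num <- x
x_num[x_num %in% codes] <- NA
x_num <- as.numeric(x_num)

code_counts
x_num
```

Replacing the codes before calling as.numeric avoids the coercion warning that would otherwise be raised for non-numeric strings.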
ad_convert_na returns a numeric vector with ABMI's special missing value indicators set to NA.
ad_process_na returns a data frame with ABMI's special missing value indicators as their own indicator columns (1 or 0) and the value column containing the numeric output from ad_convert_na(x).
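To make the described return shape concrete, the sketch below builds a data frame with a numeric value column plus one 0/1 column per indicator in base R. This is an assumption-laden illustration, not the package's code: the column names and their order in the real ad_process_na output may differ, and the input vector is made up.

```r
# Made-up input mixing numeric strings and ABMI indicator codes.
x <- c("12.5", "VNA", "340", "DNC", "PNA", "7", "SNI")
codes <- c("VNA", "DNC", "PNA", "SNI")

# Numeric values, with NA wherever an indicator code was recorded.
is_code <- x %in% codes
value <- rep(NA_real_, length(x))
value[!is_code] <- as.numeric(x[!is_code])

# One 0/1 column per indicator (column names here are assumptions).
indicators <- sapply(codes, function(code) as.integer(x == code))

result <- data.frame(value = value, indicators)
result
```

Keeping the indicators as separate columns preserves the reason a value is missing, which a plain NA would discard.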
## Not run:
z <- ad_get_table("T01A", year=2010)
x <- z[["Aspect (degrees)"]][1:100]
x
ad_convert_na(x)
summary(ad_process_na(x))
## End(Not run)