rli_database_read
All the R code needed to access the database can be found in the code repository. You can get a copy of the repository on your machine either by using Git or by downloading the whole repo as a zip file. If you're using the zip file, just download it and unzip it somewhere on your local hard drive.
root_folder refers either to the root of the unzipped folder or to the root of the cloned Git repository.
If you are using RStudio for code editing, it's a good idea to set up a new project by clicking "Project" > "Create Project" > "Existing Directory" and selecting root_folder as the project folder. RStudio sets your current working directory to root_folder, which is important because many scripts rely on relative paths within the folder.
If you're using something other than RStudio, you must set the working directory manually before running any of the scripts. This can be done with
setwd(ROOT_FOLDER)
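Because the scripts rely on relative paths, it can help to confirm that the folder really is the repository root before calling setwd. The following is only a sketch: is_root_folder is a hypothetical helper (not part of the repository), and it assumes the two files mentioned in this guide, R/postgresql.R and scripts/rli_database_read.R, are present in the root.

```r
# Hypothetical helper (not part of the repository): returns TRUE when `root`
# contains the two files this guide refers to.
is_root_folder <- function(root) {
  expected <- file.path(root, c("R/postgresql.R", "scripts/rli_database_read.R"))
  all(file.exists(expected))
}

# Example usage (replace the placeholder path with your own location):
# root_folder <- "path/to/root_folder"
# if (is_root_folder(root_folder)) setwd(root_folder)
```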
Open the file scripts/rli_database_read.R. If you're running the script for the first time, it's a good idea to execute the code line by line.
First off, let's execute a set of helper functions related to accessing the database.
source("R/postgresql.R")
Next, we have to install the packages needed by the database connection, if they are not already installed. More specifically, this function installs the packages pgUtils and RPostgreSQL.
install.deps()
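If you want to see beforehand which of these packages are missing from your library, a small check like the one below may be useful. It is a sketch only: missing_packages is a hypothetical helper, and it makes no claim about how install.deps itself works.

```r
# Hypothetical helper: returns the names in `pkgs` that cannot be loaded
# from the current library paths.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Example usage with the two packages named in this guide:
# missing_packages(c("pgUtils", "RPostgreSQL"))
```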
If you are interested in learning more about the RPostgreSQL package and functionalities therein, take a look at their examples.
To access the database, you need to give the script the right credentials. These credentials are stored in config.R, but for security reasons the file is not distributed with the rest of the code. Instead, ask Joona to email the file to you and place it in the folder called R.
Once you have config.R in place, you are ready to establish a connection. Execute the following line:
con <- connect.rli("R/config.R")
If you didn't get any error statements, you now have an open connection to the database.
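If the connection does fail, the two most likely causes are a missing config.R and wrong credentials. The wrapper below is a minimal sketch for telling these apart. safe_connect is a hypothetical name, and the only assumption about connect.rli is the one visible above: it takes the path to the config file as its single argument.

```r
# Hypothetical wrapper: checks that the config file exists, then calls the
# supplied connection function and rewraps any error with a clearer message.
safe_connect <- function(connect_fun, config_path) {
  if (!file.exists(config_path)) {
    stop("Config file not found: ", config_path, call. = FALSE)
  }
  tryCatch(
    connect_fun(config_path),
    error = function(e) {
      stop("Could not connect to the database: ", conditionMessage(e),
           call. = FALSE)
    }
  )
}

# Example usage:
# con <- safe_connect(connect.rli, "R/config.R")
```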
Next up, let's fetch some data from the database.
habitat <- fetch.rli.data(con, table="habitat")
habitat.threat <- fetch.rli.data(con, table="habitat_threat")
imp <- fetch.rli.data(con, table="implementation")
prog.targets <- fetch.rli.data(con, table="programme_targets")
progs <- fetch.rli.data(con, table="programmes")
threat <- fetch.rli.data(con, table="threat")
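The six calls above follow the same pattern, so they could also be written as a loop that returns a named list of data frames. This is just a sketch: fetch_tables is a hypothetical helper, and it assumes only that the fetch function accepts a connection and a table argument, as in the calls above.

```r
# Hypothetical helper: fetches each table in `tables` with `fetch_fun` and
# returns the results as a list named after the tables.
fetch_tables <- function(fetch_fun, con, tables) {
  out <- lapply(tables, function(tbl) fetch_fun(con, table = tbl))
  names(out) <- tables
  out
}

# Example usage:
# rli <- fetch_tables(fetch.rli.data, con,
#                     c("habitat", "habitat_threat", "implementation",
#                       "programme_targets", "programmes", "threat"))
# str(rli$habitat)
```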
For descriptions of the tables and their origins, see here.
That's it! You now have all the above-mentioned tables as ordinary data frames in R, amenable to everything you are willing to throw at them.
The following line demonstrates how to combine programme codes with programme descriptions
imp <- merge(imp, progs)
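Called without a by argument, merge joins on all columns the two data frames have in common. The toy example below illustrates the join with an explicit key. The data are made up, and the assumption that ProgrammeID is the shared column is inferred from the summary code in this guide, not confirmed against the database.

```r
# Made-up data for illustration only; column names follow this guide.
imp_toy <- data.frame(
  ProgrammeID = c(1, 1, 2),
  Euros       = c(100, 50, 200),
  Hectares    = c(10, 5, 20)
)
progs_toy <- data.frame(
  ProgrammeID = c(1, 2),
  LongDesc    = c("Programme A", "Programme B")
)

# Join on the (assumed) shared key; every row of imp_toy gets its description.
merged_toy <- merge(imp_toy, progs_toy, by = "ProgrammeID")
```

Naming the key explicitly guards against accidental joins on other columns that happen to share a name; adding all.x = TRUE would keep rows of the first table that have no match in the second.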
For calculating the per-programme totals of euros spent and hectares acquired, package plyr is your friend. If you don't have it installed yet, install it now and load it
install.packages("plyr")
library(plyr)
Do the actual summarizing using function ddply
imp.by.prog <- ddply(imp, c("ProgrammeID", "LongDesc"), summarise, total_euros = sum(Euros),
total_ha = sum(Hectares))
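If you would rather not depend on plyr, the same per-programme totals can be computed with base R's aggregate. The sketch below swaps in that alternative and uses made-up data; only the column names are taken from the ddply call above.

```r
# Made-up data for illustration only; column names follow the ddply call.
imp_toy <- data.frame(
  ProgrammeID = c(1, 1, 2),
  LongDesc    = c("Programme A", "Programme A", "Programme B"),
  Euros       = c(100, 50, 200),
  Hectares    = c(10, 5, 20)
)

# Sum Euros and Hectares within each ProgrammeID / LongDesc combination.
imp_by_prog_toy <- aggregate(cbind(Euros, Hectares) ~ ProgrammeID + LongDesc,
                             data = imp_toy, FUN = sum)

# Rename the summed columns to match the names used with ddply.
names(imp_by_prog_toy)[names(imp_by_prog_toy) == "Euros"] <- "total_euros"
names(imp_by_prog_toy)[names(imp_by_prog_toy) == "Hectares"] <- "total_ha"
```

Note that the formula interface of aggregate drops rows with missing values by default, whereas sum inside ddply returns NA for a group containing a missing value unless na.rm = TRUE is given.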