Work on week 9
robjhyndman committed Apr 25, 2024
1 parent 3462828 commit fae23d8
Showing 12 changed files with 400 additions and 4 deletions.
4 changes: 4 additions & 0 deletions header.tex
@@ -1,3 +1,7 @@
% Unicode
\usepackage{pmboxdraw}


\makeatletter
\def\input@path{{..}}
\makeatother
Binary file added week9/images/change.png
Binary file added week9/images/decisions.png
Binary file added week9/images/downstream.png
Binary file added week9/images/logo.png
Binary file added week9/images/pipeline_graph.png
File renamed without changes
Binary file added week9/images/workflow.png
222 changes: 218 additions & 4 deletions week9/slides.qmd
@@ -33,8 +33,9 @@ source(here::here("course_info.R"))

# Assignments

## Assignments 3 and 4
## Assignments

* Assignment 2 feedback
* Assignment 3 due 10 May
* Assignment 4 due 24 May

@@ -51,7 +52,7 @@ source(here::here("course_info.R"))
* Heavier reliance on pandoc Lua filters
* Uses pandoc templates for extensions

\centerline{\includegraphics[width = 10cm]{qmd.png}}
\centerline{\includegraphics[width = 10cm]{images/qmd.png}}

## Choose your engine

@@ -105,6 +106,8 @@ mtcars |>
```
````

Reference the figure using `@fig-chunklabel`.

## Chunk options

* Quarto consistently uses hyphenated options (`fig-width` rather than `fig.width`)
@@ -121,16 +124,227 @@ mtcars |>
\fontsize{13}{17}\sf

* Quarto extensions modify and extend functionality.
* They are stored locally, in the `_extensions` folder alongside the qmd document.
* See <https://quarto.org/docs/extensions/> for a list.
* Templates are extensions used to define new output formats.
* Journal templates at\newline <https://quarto.org/docs/extensions/listing-journals.html>
* Monash templates at\newline <https://robjhyndman.com/hyndsight/quarto_templates.html>

## quarto on the command line
\fontsize{14}{14.5}\sf\vspace*{-0.4cm}

* `quarto render` to render a quarto or R Markdown document.
* `quarto preview` to preview a quarto or R Markdown document.
* `quarto add <gh-org>/<gh-repo>` to add an extension from a GitHub repository.
* `quarto update <gh-org>/<gh-repo>` to update an extension.
* `quarto remove <gh-org>/<gh-repo>` to remove an extension.
* `quarto list extensions` to list installed extensions.
* `quarto use template <gh-org>/<gh-repo>` to use an existing repo as a starter template.

## Add a custom format

From the CLI:\qquad `quarto add numbats/monash-quarto-memo`\pause

New folder/files added

```{verbatim}
├── _extensions
│   └── numbats
│       └── memo
│           └── ...
```

\pause

Update YAML

```{verbatim}
---
title: "My new file using the `memo-pdf` format"
format: memo-pdf
---
```

## Exercise

* Create a quarto document using an html format
* Add a code chunk to generate a figure with a figure caption.
* Set up a new project.
* Create a quarto document using an html format.
* Add a code chunk to generate a figure with a caption.
* Reference the figure in the text using `@fig-chunklabel`.
* Add the monash memo extension and generate a pdf output.

# targets

## targets: reproducible computation at scale

\placefig{0.5}{1.8}{width=5cm}{images/logo.png}

\begin{textblock}{15}(0.5,8.5)
\textcolor{gray}{\footnotesize Some images from https://wlandau.github.io/targets-tutorial}
\end{textblock}

\begin{textblock}{10}(6, 2)
\begin{itemize}
\item Supports a clean, modular, function-oriented programming style.
\item Learns how your pipeline fits together.
\item Runs only the necessary computation.
\item Abstracts files as R objects.
\item Similar to Makefiles, but with R functions.
\end{itemize}
\end{textblock}

## Interconnected tasks

\only<1>{\placefig{0.5}{2}{width=13cm}{images/workflow.png}}
\only<2>{\placefig{0.5}{2}{width=13cm}{images/change.png}}
\only<3>{\placefig{0.5}{2}{width=13cm}{images/downstream.png}}

## Dilemma: short runtimes or reproducible results?

\fullheight{images/decisions.png}

## Let a pipeline tool do the work

\fullwidth{images/pipeline_graph.png}\vspace*{-0.15cm}

* Save time while ensuring computational reproducibility.
* Automatically skip tasks that are already up to date.

## Typical project structure

```{verbatim}
_targets.R # Required top-level configuration file.
R/
└── functions.R
data/
└── my_data.csv
```

### _targets.R
\vspace*{-0.26cm}

```{r}
#| eval: false
library(targets)
tar_source() # source all files in R folder
tar_option_set(packages = c("tidyverse", "fable"))
list(
  tar_target(my_file, "data/my_data.csv", format = "file"),
  tar_target(my_data, read_csv(my_file)),
  tar_target(my_model, model_function(my_data))
)
```
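
The pipeline above calls `model_function()`, which `tar_source()` would pick up from the `R/` folder. As a minimal sketch of what `R/functions.R` might contain, the block below defines a hypothetical version that fits a linear regression. The column names `y` and `x`, the use of `lm()`, and the toy data are all assumptions for illustration, not the contents of the real file.

```r
# R/functions.R (hypothetical contents, for illustration only)

# Fit a simple linear regression.
# The column names `y` and `x` are assumptions about the data.
model_function <- function(data) {
  lm(y ~ x, data = data)
}

# Check the function on made-up data before using it in the pipeline.
toy <- data.frame(x = 1:10, y = 2 * (1:10) + 1)
fit <- model_function(toy)
coef(fit)
```

Keeping each step as a plain function like this means it can be tested in the console without running the pipeline.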

## Generate `_targets.R` in working directory

```{r}
#| eval: false
library(targets)
tar_script()
```


## Useful targets commands

* `tar_make()` to run the pipeline.
* `tar_make(starts_with("fig"))` to run only targets starting with "fig".
* `tar_read(object)` to read a target.
* `tar_load(object)` to load a target.
* `tar_load_everything()` to load all targets.
* `tar_manifest()` to list all targets.
* `tar_visnetwork()` to visualize the pipeline.
* `tar_destroy()` to remove all targets.
* `tar_outdated()` to list outdated targets.
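
A few of the commands above can be tried safely in a throwaway directory. The sketch below assumes the `targets` package is installed. The two targets `x` and `total` are hypothetical, and `tar_dir()` is used so that nothing is written to the current project.

```r
library(targets)

# tar_dir() runs the code in a temporary directory and returns its value.
result <- tar_dir({
  # Write a small hypothetical _targets.R containing two targets.
  tar_script({
    list(
      tar_target(x, seq_len(5)),
      tar_target(total, sum(x))
    )
  })
  tar_make()                   # build both targets
  total_value <- tar_read(total) # read a stored value back into R
  outdated <- tar_outdated()     # names of targets that still need to run
  list(total = total_value, outdated = outdated)
})
result
```

Running `tar_make()` a second time in the same directory would skip both targets, because nothing they depend on has changed.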

## Debugging

Set errored targets to return `NULL` so the pipeline continues.

```{r}
#| eval: false
tar_option_set(error = "null")
```

\pause

See error messages for all targets.

```{r}
#| eval: false
tar_meta(fields = error, complete_only = TRUE)
```

\pause

See warning messages for all targets.

```{r}
#| eval: false
tar_meta(fields = warnings, complete_only = TRUE)
```
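
The following sketch puts these options together in a throwaway directory, assuming the `targets` package is installed. The targets `bad` and `good` are hypothetical: `bad` always fails, and with `error = "null"` the pipeline should still go on to build `good`.

```r
library(targets)

# Run in a temporary directory so the demonstration leaves no files behind.
errors <- tar_dir({
  # Hypothetical pipeline: `bad` always fails, `good` does not depend on it.
  tar_script({
    tar_option_set(error = "null")
    list(
      tar_target(bad, stop("something went wrong")),
      tar_target(good, 1 + 1)
    )
  })
  tar_make()
  # Keep only the rows of the metadata that recorded an error message.
  tar_meta(fields = error, complete_only = TRUE)
})
errors
```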

## Debugging
\fontsize{14}{15.5}\sf

* Try loading all available targets: `tar_load_everything()`. Then run the command of the errored target in the console.

* Pause the pipeline with `browser()`

* Use the debug option: `tar_option_set(debug = "target_name")`

* Save the workspaces:

- `tar_option_set(workspace_on_error = TRUE)`
- `tar_workspaces()`
- `tar_workspace(target_name)`


## Random numbers

* Each target runs with its own seed based on its name and the global seed from `tar_option_set(seed = ???)`
* So running only some targets, or running them in a different order, will not change the results.
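
The idea can be illustrated in base R. This is only a sketch of the principle: the helper `target_seed()` is a hypothetical stand-in and is not the algorithm that `targets` actually uses to derive seeds. The target names `sim_a` and `sim_b` and the global seed 2024 are also made up.

```r
# Hypothetical stand-in: derive a seed from a target name and a global seed.
# This is NOT how targets computes its seeds; it only mimics the idea.
target_seed <- function(name, global_seed) {
  codes <- utf8ToInt(name)
  (sum(codes * seq_along(codes)) + global_seed) %% .Machine$integer.max
}

# Simulate a "target" that draws random numbers under its own seed.
draw <- function(name, global_seed = 2024) {
  set.seed(target_seed(name, global_seed))
  rnorm(3)
}

# Run the two "targets" in both orders.
a_then_b <- list(a = draw("sim_a"), b = draw("sim_b"))
b_then_a <- list(b = draw("sim_b"), a = draw("sim_a"))

# Each target gives the same values regardless of run order.
identical(a_then_b$a, b_then_a$a)
identical(a_then_b$b, b_then_a$b)
```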

## Folder structure

```{verbatim}
├── .git/
├── .Rprofile
├── .Renviron
├── renv/
├── index.Rmd
├── _targets/
├── _targets.R
├── _targets.yaml
├── R/
├──── functions_data.R
├──── functions_analysis.R
├──── functions_visualization.R
├── data/
└──── input_data.csv
```

## targets with quarto

```{r}
#| eval: false
library(targets)
library(tarchetypes) # <1>
tar_source() # source all files in R folder
tar_option_set(packages = c("tidyverse", "fable"))
list(
  tar_target(my_file, "data/my_data.csv", format = "file"),
  tar_target(my_data, read_csv(my_file)),
  tar_target(my_model, model_function(my_data)),
  tar_quarto(report, "file.qmd", extra_files = "references.bib") # <2>
)
```

1. Load `tarchetypes` package for quarto support.
2. Add a quarto target.

## Exercise

* Add a targets workflow to your quarto document.
* Create a visualization of the pipeline network using `tar_visnetwork()`.

## Assignment 4
9 changes: 9 additions & 0 deletions week9/targets_example/_targets.R
@@ -0,0 +1,9 @@
library(targets)
source("functions.R")
tar_option_set(packages = c("readr", "dplyr", "ggplot2"))
list(
  tar_target(file, "data.csv", format = "file"),
  tar_target(data, get_data(file)),
  tar_target(model, fit_model(data)),
  tar_target(plot, plot_model(model, data))
)