This vignette is the canonical end-to-end walkthrough: a colleague
hands you an R package and asks “is this CRAN-ready?”. The goal is to
surface every CRAN-blocking issue with the smallest possible
number of R CMD check runs, then apply the safe
automatic fixes.
For audits that have their own pipeline (full CRAN environment,
file-system snapshots), see the companion vignette
vignette("pre-submission-gates", package = "checkhelper").
library(checkhelper)
pkg <- "/path/to/the/package"
# 1. Run R CMD check ONCE and reuse it everywhere it's needed.
chk <- rcmdcheck::rcmdcheck(pkg, args = "--as-cran")
# 2. Static audits (no extra check needed).
audit_tags(pkg) # exported funs without @return / internals without @noRd
audit_ascii(pkg) # non-ASCII characters in R/, tests/, vignettes/, man/, DESCRIPTION, NAMESPACE
audit_dataset_doc(pkg) # datasets in data/ without a roxygen block
audit_citation(pkg) # old-style personList() / citEntry() in inst/CITATION
audit_dontrun(pkg) # \dontrun{} blocks in man/*.Rd
audit_description(pkg) # unquoted package names in DESCRIPTION's Description field
# 3. Audits that need the check output - pass `chk` to skip a 2nd run.
audit_globals(pkg, checks = chk)
# 4. Apply the safe fixes.
fix_globals(pkg, checks = chk, write = TRUE)
# Preview before applying: fix_ascii() returns invisibly, so capture
# it to see which files would change.
preview <- fix_ascii(pkg, dry_run = TRUE)
preview[preview$changed, ]
fix_ascii(pkg, dry_run = FALSE) # then apply
fix_dataset_doc("my_data", pkg = pkg,
                description = "Description of my_data",
                source = "Internal") # one call per undocumented dataset

audit_globals() and fix_globals() parse the
notes field of an rcmdcheck::rcmdcheck()
result to extract the
no visible binding for global variable and
no visible global function definition notes. By default
each call runs its own check, which is slow on a real package.
Both functions accept a checks = argument. When
supplied, they skip the rcmdcheck() call and parse the
existing object. This lets you run the check once and
reuse the result for the whole audit.
chk <- rcmdcheck::rcmdcheck(pkg, args = "--as-cran")
audit_globals(pkg, checks = chk)
fix_globals(pkg, checks = chk, write = TRUE)

The other audits do not need a check at all:
| Audit | Needs R CMD check? | Notes |
|---|---|---|
| audit_tags() | no | static via roxygen2 |
| audit_ascii() | no | line-by-line via stringi::stri_enc_isascii() |
| audit_dataset_doc() | no | inspects data/ and R/ |
| audit_citation() | no | static parse of inst/CITATION |
| audit_description() | no | tokenises DESCRIPTION's Description |
| audit_dontrun() | no | line-by-line scan of man/*.Rd |
| audit_globals() | yes (reusable) | accepts checks = |
| audit_userspace() | yes (own pipeline) | takes file-system snapshots, separate |
| audit_check() | yes | this is the check, with CRAN env |
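To make the note parsing described above more concrete, here is a minimal base R sketch. It is not checkhelper's implementation: `extract_flagged()` is a hypothetical helper, and `note` is a hand-written stand-in for one element of the `notes` field of an `rcmdcheck::rcmdcheck()` result.

```r
# Hypothetical stand-in for one element of chk$notes. The real text comes
# from R CMD check; curly quotes are written as \u escapes to stay ASCII.
note <- paste(
  "checking R code for possible problems ... NOTE",
  "my_fun: no visible binding for global variable \u2018x\u2019",
  "my_fun: no visible global function definition for \u2018median\u2019",
  "other_fun: no visible binding for global variable \u2018y\u2019",
  sep = "\n"
)

# Illustrative helper (an assumption, not part of checkhelper).
extract_flagged <- function(notes, phrase) {
  lines <- unlist(strsplit(notes, "\n", fixed = TRUE))
  # Normalise curly quotes so one ASCII pattern covers both quote styles.
  lines <- gsub("\u2018|\u2019", "'", lines)
  # Keep only the lines that carry the requested note phrase.
  hits <- grep(phrase, lines, fixed = TRUE, value = TRUE)
  # Pull out the quoted name that follows the phrase, deduplicated.
  unique(sub(paste0("^.*", phrase, " '([^']+)'.*$"), "\\1", hits))
}

vars <- extract_flagged(note, "no visible binding for global variable")
funs <- extract_flagged(note, "no visible global function definition for")
```

Because the helper only reads character data, the same check object can be parsed as many times as needed without triggering another R CMD check run.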
Globals (no visible binding)
audit_globals() returns a 3-element list of names CRAN
flagged:
- globalVariables: undeclared variables that need a utils::globalVariables() declaration.
- functions: external functions that need an @importFrom line.
- operators: NSE tokens, data.table / rlang pronouns (:=, .SD, .N, .data, !!, …) that also need an @importFrom rather than a globalVariables() entry.

fix_globals(write = TRUE) writes the
globalVariables set into R/globals.R (merging
with whatever names that file already declares - the freshly detected
names are added on top of the existing ones, deduplicated). The
operators section is printed on stdout so you wire each one into a
roxygen @importFrom block by hand:
audit_globals(pkg, checks = chk)
fix_globals(pkg, checks = chk, write = TRUE)

When a token is exported by more than one candidate package
(e.g. := is exported by both data.table and rlang), every
candidate is listed and you pick one consciously - no silent
guessing.
Without write = TRUE, fix_globals() only
prints both blocks to copy-paste.
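The merge-and-deduplicate step for R/globals.R can be pictured with a short base R sketch. The names in `existing` and `detected` are hypothetical, the file is written to a temporary directory instead of a real package, and the exact layout fix_globals() produces may differ.

```r
# Hypothetical inputs: names already declared in R/globals.R and names
# freshly flagged by the check.
existing <- c("x", "value")
detected <- c("value", "y", "x")

# union() keeps the existing order, appends new names, and drops duplicates.
merged <- union(existing, detected)

# Build a single utils::globalVariables() declaration from the merged set.
decl <- sprintf(
  "utils::globalVariables(c(%s))",
  paste0('"', merged, '"', collapse = ", ")
)

# Write to a temporary file so the sketch never touches a real package.
globals_file <- file.path(tempdir(), "globals.R")
writeLines(decl, globals_file)
```

Keeping the existing names first means a re-run only appends to the declaration, which keeps diffs in version control small.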
audit_tags() flags exported functions without
@return and documented internals without
@noRd. Read-only - no automatic fix because adding accurate
@return text needs a human:
audit_tags(pkg)

audit_ascii() walks R/,
tests/, vignettes/, man/,
DESCRIPTION and NAMESPACE line-by-line and
reports every line containing non-ASCII characters (columns:
file, line, text,
n_tokens). fix_ascii() then rewrites them -
using the parser AST so each token is rewritten per its context: string
literals become \uXXXX escapes, comments and roxygen get
Latin-ASCII transliteration. It dry-runs by
default:
audit_ascii(pkg)
# Always preview which files would change. fix_ascii() returns
# invisibly - capture the result to inspect per-file detail
# (path, changed, n_tokens, n_chars).
preview <- fix_ascii(pkg, dry_run = TRUE)
preview[preview$changed, ]
# Apply when you've reviewed the proposed rewrite.
fix_ascii(pkg, dry_run = FALSE)

Identifiers with non-ASCII characters are refused by default (renaming would be a breaking change).
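As a rough illustration of the two ideas above (a line-by-line non-ASCII report, and `\uXXXX` escapes for string content), here is a base R sketch. It is deliberately simpler than fix_ascii(): it works on raw text rather than the parser AST, `escape_non_ascii()` is a hypothetical helper, and the sample lines are made up.

```r
# Hypothetical source lines; non-ASCII characters are written as escapes
# so this example itself stays ASCII.
src <- c(
  "x <- 1",
  "label <- \"caf\u00e9\"",
  "# na\u00efve approach"
)

# A line is ASCII when every code point is below 128.
is_ascii <- vapply(
  src,
  function(l) all(utf8ToInt(l) < 128L),
  logical(1),
  USE.NAMES = FALSE
)
report <- data.frame(line = which(!is_ascii), text = src[!is_ascii])

# Replace each non-ASCII code point with a \uXXXX escape.
# Only suitable for code points up to U+FFFF; larger ones need \U.
escape_non_ascii <- function(s) {
  cp <- utf8ToInt(s)
  out <- ifelse(
    cp < 128L,
    vapply(cp, intToUtf8, character(1)),
    sprintf("\\u%04x", cp)
  )
  paste(out, collapse = "")
}

escaped <- escape_non_ascii("caf\u00e9")
```

A text-level rewrite like this cannot tell a string literal from a comment, which is the reason a token-aware approach is preferable for real packages.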
audit_dataset_doc() lists every data/*.rda
without a matching roxygen block under R/.
fix_dataset_doc() writes a documentation skeleton (one call
per dataset, takes the dataset name):
audit_dataset_doc(pkg)
fix_dataset_doc("my_data",
                pkg = pkg,
                description = "Description of my_data",
                source = "Internal")

The skeleton is editable: you fill in the description / source /
column-by-column comments by hand, then re-run
devtools::document().
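For readers who have not documented a dataset with roxygen2 before, the sketch below builds a conventional dataset block in base R. `dataset_skeleton()` and `my_data` are hypothetical, and the text fix_dataset_doc() actually writes may be laid out differently.

```r
# Hypothetical dataset standing in for data/my_data.rda.
my_data <- data.frame(x = 1:3, y = c("a", "b", "c"))

# Illustrative helper (an assumption, not part of checkhelper): returns the
# lines of a roxygen block followed by the quoted dataset name.
dataset_skeleton <- function(name, data, description, source) {
  # One \item{column}{class} entry per column, to be edited by hand.
  col_class <- vapply(data, function(col) class(col)[1], character(1))
  items <- sprintf("#'   \\item{%s}{%s}", names(data), col_class)
  c(
    sprintf("#' %s", name),
    "#'",
    sprintf("#' %s", description),
    "#'",
    sprintf(
      "#' @format A data frame with %d rows and %d variables:",
      nrow(data), ncol(data)
    ),
    "#' \\describe{",
    items,
    "#' }",
    sprintf("#' @source %s", source),
    sprintf("\"%s\"", name)
  )
}

skel <- dataset_skeleton("my_data", my_data, "Description of my_data", "Internal")
writeLines(skel)
```

The trailing quoted name is what roxygen2 attaches the block to, so it has to match the object stored in data/.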
inst/CITATION
audit_citation() parses inst/CITATION
statically (no eval()) and surfaces every call to
personList(), as.personList() or
citEntry() that CRAN rejects on submission with
Package CITATION file contains call(s) to old-style .... It
returns a tibble with call, line and a
one-line suggestion for the modern equivalent
(c() on person() objects;
bibentry() instead of citEntry()):
audit_citation(pkg)

Read-only - rewriting a CITATION file usually needs editorial
judgment, so there is no automated fix_citation().
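A static scan of this kind can be sketched with base R's parser, which tokenises the file without evaluating it. The CITATION content below is hypothetical, and this is only one possible way to find the calls, not necessarily how audit_citation() does it.

```r
# Hypothetical inst/CITATION content, one element per line.
citation_src <- c(
  'citHeader("To cite mypkg:")',
  'citEntry(entry = "Manual", title = "mypkg",',
  '         author = personList(as.person("A. Author")),',
  '         year = "2024")'
)

# Parse without evaluating; keep.source = TRUE retains token positions.
exprs <- parse(text = citation_src, keep.source = TRUE)
tokens <- utils::getParseData(exprs)

# Keep only function-call tokens that match the old-style names.
old_style <- c("personList", "as.personList", "citEntry")
is_hit <- tokens$token == "SYMBOL_FUNCTION_CALL" & tokens$text %in% old_style
hits <- tokens[is_hit, c("text", "line1")]
```

Working on tokens rather than plain text avoids false positives from comments or strings that merely mention these function names.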
Description
CRAN incoming pretest emits
Package names should be quoted in the Description field
when a package name (or any software name) appears in the
Description field of DESCRIPTION without
surrounding single quotes.
audit_description() reads the Description
field, tokenises it, and surfaces every word that matches an installed
package name yet is not wrapped in single quotes. The package’s own name
is intentionally skipped, and so are compound forms like
dplyr-style or httr2-based (a hyphen on either
side disqualifies the token from being a standalone package reference).
Returns a tibble with word, position and
suggestion:
audit_description(pkg)

Read-only - the fix is editorial (decide whether each hit is a real package reference or a coincidental word, then wrap with single quotes).
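The quoting and hyphen rules above can be tried out on a single string with base R. This is a simplified sketch: `known_pkgs` is a hypothetical fixed vector standing in for the set of installed packages, and the tokeniser is an assumption rather than the one audit_description() uses.

```r
# Hypothetical Description text and package list.
description <- "Wraps 'dplyr' verbs for tidyr workflows, with httr2-based helpers."
known_pkgs <- c("dplyr", "tidyr", "httr2")

# Tokenise into candidate names: start with a letter, end with a letter
# or digit, dots allowed in between.
m <- gregexpr("[A-Za-z][A-Za-z0-9.]*[A-Za-z0-9]", description)
words <- regmatches(description, m)[[1]]
starts <- as.integer(m[[1]])
ends <- starts + attr(m[[1]], "match.length") - 1L

# Look at the character immediately before and after each token.
before <- substring(description, starts - 1L, starts - 1L)
after <- substring(description, ends + 1L, ends + 1L)

quoted <- before == "'" & after == "'"
hyphenated <- before == "-" | after == "-"

# Flag package names that are neither quoted nor part of a compound form.
flagged <- words[words %in% known_pkgs & !quoted & !hyphenated]
```

Here `dplyr` is skipped because it is already quoted and `httr2` because it is followed by a hyphen, which leaves `tidyr` as the only hit.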
\dontrun{} blocks in examples
CRAN policy is that \dontrun{} should only wrap example
code that genuinely cannot be executed (missing API key, missing system
dependency, side effect on the user’s filespace). Otherwise prefer
\donttest{}, which still gets exercised by
R CMD check --run-donttest but is skipped by default.
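One quick way to review which wrapper each example uses is to scan the Rd source for openers. The base R sketch below does this on hypothetical Rd lines and skips lines commented out with `%`. It is an illustration only, not the package's scanner.

```r
# Hypothetical content of a man/*.Rd examples section.
rd_lines <- c(
  "\\examples{",
  "\\dontrun{",
  "  fetch_remote_data()",
  "}",
  "% \\dontrun{ commented-out mention",
  "\\donttest{",
  "  slow_but_runnable()",
  "}",
  "}"
)

# Rd comments start with %, so those lines are excluded from the scan.
is_comment <- grepl("^\\s*%", rd_lines)

# fixed = TRUE matches the opener literally, backslash and brace included.
dontrun_at <- which(!is_comment & grepl("\\dontrun{", rd_lines, fixed = TRUE))
donttest_at <- which(!is_comment & grepl("\\donttest{", rd_lines, fixed = TRUE))
```

Each line number in `dontrun_at` is a candidate to review: either the code genuinely cannot run, or it could move to `\donttest{}`.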
audit_dontrun() walks man/*.Rd line-by-line
and surfaces every \dontrun{} opener (commented-out
% \dontrun{ mentions are ignored), with the source Rd file,
the documented topic, the line number and a one-line suggestion.
Read-only - the call is your review checklist:
audit_dontrun(pkg)

create_example_pkg() builds a fake package that
deliberately trips each audit. The two with_* flags below
activate the non-ASCII and undocumented-dataset fixtures so every audit
has something to surface:
pkg <- create_example_pkg(with_nonascii = TRUE,
                          with_undocumented_data = TRUE)
chk <- rcmdcheck::rcmdcheck(pkg, args = "--as-cran")
audit_tags(pkg) # @return / @noRd issues
audit_ascii(pkg) # accents in comments / strings
audit_dataset_doc(pkg) # data/demo_dataset.rda has no doc
audit_citation(pkg) # old-style personList() / citEntry()
audit_dontrun(pkg) # \dontrun{} blocks in examples
audit_description(pkg) # unquoted package names in Description
audit_globals(pkg, checks = chk)
fix_globals(pkg, checks = chk, write = TRUE)
fix_ascii(pkg, dry_run = FALSE)
fix_dataset_doc("demo_dataset", pkg = pkg,
                description = "A small demo dataset",
                source = "Generated by create_example_pkg()")

After applying the fixes, re-run the check (the package state has
changed, so a new rcmdcheck() is needed) and confirm 0 errors / 0
warnings / 0 notes.
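To confirm a clean result programmatically rather than by eye, one option is to test the lengths of the `errors`, `warnings` and `notes` fields of the check result. In this sketch `is_clean()` is a hypothetical helper, and plain lists stand in for real `rcmdcheck::rcmdcheck()` objects so the example runs without a package at hand.

```r
# Illustrative helper (an assumption, not part of checkhelper or rcmdcheck).
is_clean <- function(chk) {
  length(chk$errors) == 0L &&
    length(chk$warnings) == 0L &&
    length(chk$notes) == 0L
}

# Stand-ins with the same three character fields as a check result.
clean <- list(errors = character(), warnings = character(), notes = character())
noisy <- list(
  errors = character(),
  warnings = character(),
  notes = "checking R code for possible problems ... NOTE"
)

is_clean(clean)
is_clean(noisy)
```

With a real result, `is_clean(rcmdcheck::rcmdcheck(pkg, args = "--as-cran"))` would be the equivalent call after the fixes have been applied.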
When the dev-time audits above are clean, run the heavier gates that
have their own pipeline and cannot reuse
chk:
- audit_check(): R CMD check with the full CRAN incoming environment.
- audit_userspace(): checks that tests / examples / vignettes leave no files behind.

Both are documented in
vignette("pre-submission-gates", package = "checkhelper").