Rewrites the file in place (unless dry_run = TRUE). Files that contain no non-ASCII characters are skipped without rewriting, so their modification times stay untouched.

asciify_file(
  path,
  strategy = c("auto", "escape", "translit", "report"),
  identifiers = c("error", "warn", "skip"),
  dry_run = FALSE
)

Arguments

path

path to a file. Files with suffix .R, .r, .Rmd, or .qmd are handled as R source. .Rnw (Sweave) and any other suffix are scanned read-only; Sweave's <<>>= ... @ chunk syntax is not handled by the rewriter.

strategy

one of:

  • "auto" (default): per-token policy. String literals get \\uXXXX escapes (so they remain semantically equivalent and CRAN-safe); comments and roxygen blocks, where escapes would not be interpreted, get Latin-ASCII transliteration, which drops diacritics (an accented e becomes a plain e).

  • "escape": force \\uXXXX escape on every non-identifier token.

  • "translit": force ASCII transliteration on every non-identifier token.

  • "report": rewrite nothing, just return the input unchanged. Useful in conjunction with find_nonascii_tokens() for a dry run.
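As a rough illustration of the two rewrite modes, here is a base-R sketch (not the package's implementation) of what "escape" and "translit" do to a single token:

```r
x <- "caf\u00e9"   # a string literal containing one non-ASCII character

# "escape": replace each non-ASCII character with its \uXXXX escape,
# so the string is identical when the escaped source is re-parsed.
escape_ascii <- function(s) {
  chars <- strsplit(s, "")[[1]]
  out <- vapply(chars, function(ch) {
    cp <- utf8ToInt(ch)
    if (cp > 127L) sprintf("\\u%04x", cp) else ch
  }, character(1))
  paste(out, collapse = "")
}

# "translit": drop diacritics via iconv() transliteration.
translit_ascii <- function(s) iconv(s, from = "UTF-8", to = "ASCII//TRANSLIT")

escape_ascii(x)    # "caf\\u00e9" as printed by R (the backslash is literal)
translit_ascii(x)  # typically "cafe"; exact output depends on the platform's iconv
```

Note that the escaped form round-trips exactly, while transliteration is lossy by design, which is why "auto" reserves it for comments.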

identifiers

what to do when a non-ASCII identifier (variable, function name, formal argument, slot, ...) is found:

  • "error" (default): stop. Renaming an identifier changes the API surface and is not safe to automate.

  • "warn": warn and leave the token unchanged.

  • "skip": silently leave the token unchanged.
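For intuition, non-ASCII identifiers can be located from R's own parse data. This is a hypothetical sketch using utils::getParseData(), not necessarily how the package does it:

```r
# Parse a snippet containing an accented identifier and list the
# SYMBOL tokens that contain non-ASCII characters.
code <- "r\u00e9sultat <- 1\nok <- r\u00e9sultat + 1"
pd <- utils::getParseData(parse(text = code, keep.source = TRUE))
idents <- unique(pd$text[pd$token %in% c("SYMBOL", "SYMBOL_FUNCTION_CALL")])
idents[grepl("[^\\x01-\\x7f]", idents, perl = TRUE)]   # the accented identifier
```

Because every use site of such an identifier would have to be renamed consistently (including by code outside the file), the default is to stop rather than rewrite.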

dry_run

logical. If TRUE, report what would change but do not write the file. Default FALSE.

Value

invisibly, a list with:

  • path: path of the file

  • changed: TRUE if the file was rewritten (or would have been, with dry_run = TRUE)

  • n_chars: number of non-ASCII characters in the file

  • n_tokens: number of distinct source locations to rewrite (an accented string literal counts once regardless of how many non-ASCII characters it contains)

  • text: rewritten content if changed, else the original
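A hedged usage sketch. The asciify_file() call is shown commented out because it needs this package; the runnable part illustrates how n_chars relates to the raw character count (the file path and result names follow the description above):

```r
# One line of R source with four non-ASCII characters:
# the string literal holds two, the comment two more.
txt <- 'msg <- "na\u00efve caf\u00e9"  # r\u00e9sum\u00e9'

## res <- asciify_file(path, dry_run = TRUE)
## res$changed    # TRUE: there is something to rewrite
## res$n_chars    # individual non-ASCII characters, counted as below
## res$n_tokens   # distinct source locations (literal, comment, ...)

sum(utf8ToInt(txt) > 127L)   # 4
```

With dry_run = TRUE the file on disk is untouched, so comparing res$text against the original content is a safe way to preview the rewrite.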