This is a replacement for dplyr::na_if(). It is useful if you want to convert annoying values to NA. Unlike dplyr::na_if(), this function allows you to specify multiple values to be replaced with NA at the same time.
na_if_in() replaces values that match its arguments with NA.
na_if_not() replaces values that do not match its arguments with NA.
na_if_in(x, ...)
na_if_not(x, ...)
x: Vector to modify.
...: Values to replace with NA, specified as either:
  An object, vector of objects, or list of objects.
  A function (including a purrr-style lambda function) that returns a logical vector of the same length as x. See section "Formulas" for more details.
A modified version of x with selected values replaced with NA.
These functions accept one-sided formulas that can evaluate to logical vectors of the same length as x. The input is represented in these conditional statements as ".". Valid formulas take the form ~ . < 0. See examples.
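To make the role of "." concrete, the following is a minimal base-R sketch of how a one-sided formula can be turned into a logical vector. It is not the package's implementation: eval_formula() is a hypothetical helper introduced only for illustration, and it simply evaluates the right-hand side of the formula with "." bound to the input.

```r
# Hypothetical helper (an assumption, not part of the package):
# evaluate a one-sided formula against x, with "." standing for x.
eval_formula <- function(x, f) {
  stopifnot(inherits(f, "formula"), length(f) == 2L)
  # f[[2]] is the right-hand side of a one-sided formula
  flag <- eval(f[[2L]], list(. = x), environment(f))
  # The result must be a logical vector of the same length as x
  stopifnot(is.logical(flag), length(flag) == length(x))
  flag
}

x <- c(-2, 0, 3, 150)
flag <- eval_formula(x, ~ . < 0 | . > 120)
flag
#> [1]  TRUE FALSE FALSE  TRUE

# Replacing the flagged positions by hand mimics what na_if_in() returns
x[flag] <- NA
x
#> [1] NA  0  3 NA
```

A formula that returns anything other than a logical vector of the same length as x fails the second check, which mirrors the requirement stated above.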
dplyr::na_if() to replace a single value with NA.
dplyr::coalesce() to replace missing values with a specified value.
tidyr::replace_na() to replace NA with a value.
dplyr::recode() and dplyr::case_when() to more generally replace values.
x <- sample(c(1:5, 99))
# We can replace 99...
# ... explicitly
na_if_in(x, 99)
#> [1] 4 NA 1 3 5 2
# ... by specifying values to keep
na_if_not(x, 1:5)
#> [1] 4 NA 1 3 5 2
# ... or by using a formula
na_if_in(x, ~ . > 5)
#> [1] 4 NA 1 3 5 2
messy_string <- c("abc", "", "def", "NA", "ghi", 42, "jkl", "NULL", "mno")
# We can replace unwanted values...
# ... one at a time
clean_string <- na_if_in(messy_string, "")
clean_string <- na_if_in(clean_string, "NA")
clean_string <- na_if_in(clean_string, 42)
clean_string <- na_if_in(clean_string, "NULL")
clean_string
#> [1] "abc" NA "def" NA "ghi" NA "jkl" NA "mno"
# ... or all at once
na_if_in(messy_string, "", "NA", "NULL", 1:100)
#> [1] "abc" NA "def" NA "ghi" NA "jkl" NA "mno"
na_if_in(messy_string, c("", "NA", "NULL", 1:100))
#> [1] "abc" NA "def" NA "ghi" NA "jkl" NA "mno"
na_if_in(messy_string, list("", "NA", "NULL", 1:100))
#> [1] "abc" NA "def" NA "ghi" NA "jkl" NA "mno"
# ... or using a clever formula
grepl("[a-z]{3,}", messy_string)
#> [1] TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE
na_if_not(messy_string, ~ grepl("[a-z]{3,}", .))
#> [1] "abc" NA "def" NA "ghi" NA "jkl" NA "mno"
# na_if_in() is particularly useful inside dplyr::mutate()
library(dplyr)
#>
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#>
#> filter, lag
#> The following objects are masked from ‘package:base’:
#>
#> intersect, setdiff, setequal, union
faux_census %>%
  mutate(
    state = na_if_in(state, "Canada"),
    age = na_if_in(age, ~ . < 18, ~ . > 120)
  )
#> # A tibble: 20 × 6
#>    state gender                         age race                  income relig…¹
#>    <chr> <chr>                        <dbl> <chr>                  <dbl> <chr>
#>  1 CA    female                          80 Native American       2.8 e4 Christ…
#>  2 NY    Woman                           89 Latino                1.49e5 Spirit…
#>  3 CA    Female                          48 White                 4.79e5 Cathol…
#>  4 TX    Male                            63 latinx                8.5 e4 christ…
#>  5 PA    Male                            47 asian                 4.19e4 Baptist
#>  6 TX    Gender is a social construct    57 Race is a social con… 1.00e7 Religi…
#>  7 NA    Male                            49 white                 1.49e5 method…
#>  8 TX    Female                          50 White                 9.88e4 Luther…
#>  9 NY    f                               NA white                 9.07e4 Agnost…
#> 10 WA    F                               33 White                 4.50e4 Jewish
#> 11 TX    Male                            30 White                 1.27e5 none
#> 12 OH    Non-binary                      42 Caucasian             2.16e4 Roman …
#> 13 NC    Female                          22 African American      7.42e4 atheist
#> 14 LA    Male                            NA White                 6.1 e4 Christ…
#> 15 LA    Female                          28 Black                 2   e4 Not re…
#> 16 CA    male                            34 Asian American        7.74e4 Christ…
#> 17 TN    M                               64 white                 1.00e7 Nothing
#> 18 FL    Female                          68 white                 4.71e4 None
#> 19 OH    Male                            39 black                 2.38e4 baptist
#> 20 NH    male                            73 Hispanic              3.32e4 Christ…
#> # … with abbreviated variable name ¹religion
# This function handles vector values differently than dplyr,
# and returns a different result with vector replacement values:
na_if_in(1:5, 5:1)
#> [1] NA NA NA NA NA
dplyr::na_if(1:5, 5:1)
#> [1] 1 2 NA 4 5
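
One way to read the difference in the last example is as set membership versus elementwise comparison. The base-R sketch below assumes that na_if_in() matches the way %in% does, which is consistent with the output above but is not a statement about the package's internals; dplyr::na_if() compares x and y position by position.

```r
x <- 1:5
y <- 5:1

# Set membership: is each element of x found anywhere in y?
in_mask <- x %in% y
in_mask
#> [1] TRUE TRUE TRUE TRUE TRUE

# Elementwise equality: is x[i] equal to y[i]?
eq_mask <- x == y
eq_mask
#> [1] FALSE FALSE  TRUE FALSE FALSE

# Applying each mask by hand reproduces the two results above
replace(x, in_mask, NA)
#> [1] NA NA NA NA NA
replace(x, eq_mask, NA)
#> [1]  1  2 NA  4  5
```

Under this reading, passing a vector to na_if_in() means "any of these values", while passing one to dplyr::na_if() means "this value at this position".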