This set of functions is an alternative to the bootBW() function. It
attempts to make the blocked weighted bootstrap algorithm more efficient
through vectorisation and parallelisation. The function syntax has been
kept consistent with bootBW() for ease of transition. A more in-depth
discussion of the efficiencies gained from this alternative is available here.
Usage
boot_bw(
  x,
  w,
  statistic,
  params,
  outputColumns = params,
  replicates = 400,
  strata = NULL,
  parallel = FALSE,
  cores = parallelly::availableCores(omit = 1)
)

boot_bw_parallel(
  x,
  w,
  statistic,
  params,
  outputColumns = params,
  replicates = 400,
  strata = NULL,
  cores = parallelly::availableCores(omit = 1)
)

boot_bw_sequential(
  x,
  w,
  statistic,
  params,
  outputColumns = params,
  replicates = 400,
  strata = NULL
)

boot_bw_weight(w)

boot_bw_sample_clusters(x, w, index = FALSE)

boot_bw_sample_within_clusters(cluster_df)
Arguments
- x
A data.frame() with primary sampling unit (PSU) in variable named psu and at least one other variable containing data for estimation.
- w
A data.frame() with primary sampling unit (PSU) in variable named psu and survey weights (i.e. PSU population) in variable named pop.
- statistic
An estimator function operating on variables in x containing data for estimation. The functions bootClassic() and bootPROBIT() are examples.
- params
Parameters specified as names of columns in x that are to be passed to the function specified in statistic.
- outputColumns
Names to be used for columns in the output data.frame(). Defaults to the names specified in params.
- replicates
Number of bootstrap replicates to be performed. Default is 400.
- strata
A character value for the name of the variable in x providing information on how x is grouped, such that resampling is performed for each group. Defaults to NULL, in which case there is no grouping and resampling is performed on the full data.
- parallel
Logical. Should resampling be done in parallel? Defaults to FALSE.
- cores
The number of computer cores to use, or the number of child processes to be run simultaneously. Defaults to one less than the number of cores available on the current machine.
- index
Logical. Should index values be returned rather than a list of data.frame()s? Defaults to FALSE.
- cluster_df
A list of data.frame()s for selected clusters.
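As a rough illustration of the input shapes described above, the following base R sketch builds toy x and w data frames and checks that they carry the required psu and pop variables. The column name anc1 is borrowed from the Examples section, and all values are invented for illustration; the checks are not part of the package.

```r
# Toy survey data: one row per respondent, PSU identifier in `psu`,
# plus one indicator column (`anc1`, values invented) for estimation.
x <- data.frame(
  psu  = c(1, 1, 1, 2, 2, 3, 3, 3),
  anc1 = c(1, 0, 1, 0, 0, 1, 1, 0)
)

# One row per PSU, with the PSU population in `pop` (values invented).
w <- data.frame(
  psu = c(1, 2, 3),
  pop = c(120, 300, 180)
)

# Illustrative consistency checks (not package functions):
# required columns are present and every PSU in x has a row in w.
required_ok <- all(c("psu", "anc1") %in% names(x)) &&
  all(c("psu", "pop") %in% names(w))
psu_ok <- all(unique(x$psu) %in% w$psu)
```

With inputs of this shape, a call would follow the signature shown under Usage, for example with params = "anc1".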
Value
For boot_bw(), a data.frame() with number of columns equal to the length of
outputColumns; number of rows equal to the number of replicates; and names of
variables equal to the values of outputColumns. Note that the output printed
under Examples is a list of class "boot_bw" in which this data.frame() appears
as the boot_data element. For boot_bw_weight(), a data.frame() based on w with
two additional variables for weight and cumWeight. For
boot_bw_sample_clusters(), either a vector of integers corresponding to the
primary sampling unit (psu) identifier of the selected clusters (when
index = TRUE) or a list of data.frame()s corresponding to the data for the
selected clusters (when index = FALSE). For boot_bw_sample_within_clusters(),
a matrix similar in structure to x of resampled data from each selected
cluster.
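To make the helper outputs more concrete, here is a minimal base R sketch of one blocked weighted bootstrap replicate. It assumes that weight is each PSU's share of the total population and that cumWeight is its running sum, and it uses sample() to draw clusters with probability proportional to weight. These are assumptions for illustration; the package's internals may differ, and this sketch returns a data.frame rather than a matrix.

```r
set.seed(1)

# PSU populations (values invented for illustration).
w <- data.frame(psu = 1:4, pop = c(100, 300, 400, 200))

# Assumed definitions: weight as population share, cumWeight as cumulative share.
w$weight    <- w$pop / sum(w$pop)
w$cumWeight <- cumsum(w$weight)

# Toy survey data with 3, 2, 4 and 3 respondents in PSUs 1 to 4.
x <- data.frame(
  psu  = rep(1:4, times = c(3, 2, 4, 3)),
  anc1 = c(1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1)
)

# Step 1: draw as many clusters as there are PSUs, with replacement,
# with probability proportional to weight.
selected_psu <- sample(w$psu, size = nrow(w), replace = TRUE, prob = w$weight)

# Step 2: collect the data for each selected cluster as a list of data frames.
cluster_list <- lapply(selected_psu, function(p) x[x$psu == p, , drop = FALSE])

# Step 3: resample rows with replacement within each selected cluster,
# then stack the results into one resampled dataset.
resampled <- do.call(
  rbind,
  lapply(cluster_list, function(d) {
    d[sample.int(nrow(d), replace = TRUE), , drop = FALSE]
  })
)
```

Applying an estimator to resampled and repeating the three steps many times would give one row of estimates per replicate.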
Examples
boot_bw(
  x = indicatorsHH, w = villageData, statistic = bootClassic,
  params = "anc1", replicates = 9, parallel = TRUE
)
#>
#> ── Resampling in parallel ──
#>
#> ℹ Setting up 3 parallel operations
#> ✔ Setting up 3 parallel operations [302ms]
#>
#> ℹ Resampling with 9 replicates in parallel
#> ✔ Resampling with 9 replicates in parallel [767ms]
#>
#> ℹ Tidying up resampling outputs
#> ✔ Tidying up resampling outputs [22ms]
#>
#> ℹ Closing 3 parallel operations
#> ✔ Closing 3 parallel operations [15ms]
#>
#> $params
#> [1] "anc1"
#>
#> $replicates
#> [1] 9
#>
#> $strata
#> NULL
#>
#> $boot_data
#> anc1
#> 1 0.2363465
#> 2 0.2167766
#> 3 0.1907895
#> 4 0.2003863
#> 5 0.2114475
#> 6 0.2924805
#> 7 0.1790763
#> 8 0.2573599
#> 9 0.2370479
#>
#> attr(,"class")
#> [1] "boot_bw" "list"
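One common way to summarise bootstrap replicates like these is a percentile interval. The sketch below copies the nine anc1 values printed above into a data.frame and takes the median with the 2.5th and 97.5th percentiles using base R's quantile(). The labels estimate, lcl and ucl are hypothetical, and with only nine replicates the interval is purely illustrative.

```r
# Replicate estimates copied from the boot_data element printed above.
boot_data <- data.frame(
  anc1 = c(0.2363465, 0.2167766, 0.1907895, 0.2003863, 0.2114475,
           0.2924805, 0.1790763, 0.2573599, 0.2370479)
)

# Median as the point estimate, with a 95% percentile interval.
est <- quantile(boot_data$anc1, probs = c(0.5, 0.025, 0.975))

# Hypothetical labels for readability.
names(est) <- c("estimate", "lcl", "ucl")
est
```

In practice a larger number of replicates, such as the default of 400, would be needed for a stable interval.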