This package adds resampling methods for the {mlr3} package framework suited for spatial, temporal and spatiotemporal data. These methods can help to reduce the influence of autocorrelation on performance estimates when performing cross-validation. While this article gives a rather technical introduction to the package, a more applied approach can be found in the mlr3book section on “Spatiotemporal Analysis”.
After loading the package via library("mlr3spatiotempcv"), the spatiotemporal resampling methods and example tasks provided by {mlr3spatiotempcv} are available to the user alongside the default {mlr3} resampling methods and tasks.
To make use of spatial resampling methods, an {mlr3} task that is aware of its spatial characteristics needs to be created. Two child classes exist in {mlr3spatiotempcv} for this purpose:
TaskClassifST
TaskRegrST
To create one of these, one can either pass an sf object as the “backend” directly:
# create 'sf' object
# the ecuador coordinates are projected (UTM zone 17S, EPSG:32717), not lon/lat
data_sf = sf::st_as_sf(ecuador, coords = c("x", "y"), crs = 32717)
# create mlr3 task
task = TaskClassifST$new("ecuador_sf",
  backend = data_sf, target = "slides", positive = "TRUE"
)
or use a plain data.frame. In this case, the constructor of TaskClassifST needs a few more arguments:
data = mlr3::as_data_backend(ecuador)
task = TaskClassifST$new("ecuador",
  backend = data, target = "slides",
  positive = "TRUE", extra_args = list(coordinate_names = c("x", "y"))
)
This task can now be used like a normal {mlr3} task in any modeling scenario. See the mlr3book section on “Spatiotemporal Analysis” for how to apply a spatiotemporal resampling method to such a task.
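As a minimal sketch of that workflow, the following uses the bundled tsk("ecuador") task rather than a hand-built one, and assumes the rpart package is installed for lrn("classif.rpart"). The choice of spcv_coords, five folds, the seed, and classification error as the measure are illustrative assumptions, not recommendations taken from the package.

```r
library(mlr3)
library(mlr3spatiotempcv)

# example task shipped with the package (spatial, classif)
task = tsk("ecuador")

# spatial CV that partitions observations by clustering their coordinates
resampling = rsmp("spcv_coords", folds = 5)

# the fold assignment involves random cluster starts, so fix the seed
set.seed(42)
resampling$instantiate(task)

# row ids of the first train/test split; the two sets do not overlap
train_ids = resampling$train_set(1)
test_ids = resampling$test_set(1)
length(intersect(train_ids, test_ids))

# fit and evaluate a learner on every fold, then average the error
learner = lrn("classif.rpart")
rr = resample(task, learner, resampling)
rr$aggregate(msr("classif.ce"))
```

Swapping "spcv_coords" for another key only changes how the folds are built; the instantiate, resample, and aggregate steps stay the same.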
In {mlr3}, dictionaries provide an overview of the available methods. The following sections show which dictionaries gain new entries when {mlr3spatiotempcv} is loaded.
Additional task types:
TaskClassifST
TaskRegrST
mlr_reflections$task_types
#>       type          package          task        learner        prediction
#> 1: classif             mlr3   TaskClassif LearnerClassif PredictionClassif
#> 2: classif mlr3spatiotempcv TaskClassifST LearnerClassif PredictionClassif
#> 3:    regr             mlr3      TaskRegr    LearnerRegr    PredictionRegr
#> 4:    regr mlr3spatiotempcv    TaskRegrST    LearnerRegr    PredictionRegr
#>           measure
#> 1: MeasureClassif
#> 2: MeasureClassif
#> 3:    MeasureRegr
#> 4:    MeasureRegr
Additional column roles:
coordinates
mlr_reflections$task_col_roles
#> $regr
#> [1] "feature" "target"  "name"    "order"   "stratum" "group"   "weight" 
#> [8] "uri"    
#> 
#> $classif
#> [1] "feature" "target"  "name"    "order"   "stratum" "group"   "weight" 
#> [8] "uri"    
#> 
#> $classif_st
#> [1] "feature"     "target"      "name"        "order"       "stratum"    
#> [6] "group"       "weight"      "uri"         "coordinates"
#> 
#> $regr_st
#> [1] "feature"     "target"      "name"        "order"       "stratum"    
#> [6] "group"       "weight"      "uri"         "coordinates"
Additional resampling methods:
spcv_block
spcv_buffer
spcv_coords
spcv_env
sptcv_cluto
sptcv_cstf
and their respective repeated versions (except for spcv_buffer, which has none).
as.data.table(mlr_resamplings)
#>                      key                                  params iters
#>  1:            bootstrap                           repeats,ratio    30
#>  2:               custom                                             0
#>  3:                   cv                                   folds    10
#>  4:              holdout                                   ratio     1
#>  5:             insample                                             1
#>  6:                  loo                                            NA
#>  7:          repeated_cv                           repeats,folds   100
#>  8:  repeated_spcv_block folds,repeats,rows,cols,range,selection    10
#>  9: repeated_spcv_coords                           folds,repeats    10
#> 10:    repeated_spcv_env                  folds,repeats,features    10
#> 11: repeated_sptcv_cluto                           folds,repeats    10
#> 12:  repeated_sptcv_cstf                           folds,repeats    10
#> 13:           spcv_block         folds,rows,cols,range,selection    10
#> 14:          spcv_buffer               theRange,spDataType,addBG     0
#> 15:          spcv_coords                                   folds    10
#> 16:             spcv_env                          folds,features    10
#> 17:          sptcv_cluto                                   folds    10
#> 18:           sptcv_cstf                                   folds    10
#> 19:          subsampling                           repeats,ratio    30
Additional example tasks:
tsk("ecuador") (spatial, classif)
tsk("cookfarm") (spatiotemp, regr)
The following table lists all methods implemented in {mlr3spatiotempcv}, together with their upstream R packages and scientific references.
| Method (name in literature) | Package | Reference | mlr3 Sugar |
|---|---|---|---|
| Spatial Buffering | blockCV | Valavi et al. (2018) | rsmp("spcv_buffer") |
| Spatial Blocking | blockCV | Valavi et al. (2018) | rsmp("spcv_block") |
| Spatial CV | sperrorest | Brenning (2012) | rsmp("spcv_coords") |
| Environmental Blocking | blockCV | Valavi et al. (2018) | rsmp("spcv_env") |
| Leave-Location-and-Time-Out | CAST | Meyer et al. (2018) | rsmp("sptcv_cstf") |
| Spatiotemporal Clustering | skmeans | Zhao and Karypis (2002) | rsmp("sptcv_cluto") |
| Repeated Spatial Blocking | blockCV | Valavi et al. (2018) | rsmp("repeated_spcv_block") |
| Repeated Spatial CV | sperrorest | Brenning (2012) | rsmp("repeated_spcv_coords") |
| Repeated Env Blocking | blockCV | Valavi et al. (2018) | rsmp("repeated_spcv_env") |
| Repeated Leave-Location-and-Time-Out | CAST | Meyer et al. (2018) | rsmp("repeated_sptcv_cstf") |
| Repeated Spatiotemporal Clustering | skmeans | Zhao and Karypis (2002) | rsmp("repeated_sptcv_cluto") |
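
The keys in the last column are passed to rsmp() together with the parameters listed in the resampling dictionary. The sketch below illustrates this for one repeated method; the values folds = 4 and repeats = 2, the seed, and the variable names are arbitrary choices made for illustration only.

```r
library(mlr3)
library(mlr3spatiotempcv)

task = tsk("ecuador")

# repeated spatial CV: 2 repetitions of a 4-fold coordinate-based partitioning
rsmp_rep = rsmp("repeated_spcv_coords", folds = 4, repeats = 2)

# the partitioning uses random cluster starts, so fix the seed
set.seed(42)
rsmp_rep$instantiate(task)

# total number of train/test splits (folds times repeats)
rsmp_rep$iters

# number of test observations in each split
test_sizes = vapply(
  seq_len(rsmp_rep$iters),
  function(i) length(rsmp_rep$test_set(i)),
  integer(1)
)
test_sizes
```

Each repetition partitions the whole task, so the test set sizes add up to the number of repetitions times the number of rows in the task.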