Based on `POSIXct` columns of the data, a set of date-related features is computed and added to the feature set of the output task. If no `POSIXct` column is found, the original task is returned unaltered. This functionality is based on the `add_datepart()` and `add_cyclic_datepart()` functions from the `fastai` library. To operate on only particular `POSIXct` columns, use the `affect_columns` parameter inherited from `PipeOpTaskPreprocSimple`.

If `cyclic = TRUE`, cyclic features are computed for the features `"month"`, `"week_of_year"`, `"day_of_year"`, `"day_of_month"`, `"day_of_week"`, `"hour"`, `"minute"` and `"second"`. This means that for each feature `x`, two additional features are computed, namely the sine and cosine transformation of `2 * pi * x / max_x`. Here `max_x` is the largest possible value the feature could take on `+ 1`, assuming the lowest possible value is 0; e.g., for hours from 0 to 23, this is 24. This respects the cyclical nature of features such as seconds: second 21 and second 22 are one second apart, but so are second 59 and second 0 of the next minute.
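
As a rough illustration of the sine/cosine transformation, the following base R sketch encodes a few values of the second feature and compares distances in the resulting two-dimensional space. The helpers `cyclic_encode()` and `step_dist()` are hypothetical names chosen for this example and are not part of `mlr3pipelines`; the internal implementation of the PipeOp may differ.

```r
# Hypothetical helper (not part of mlr3pipelines): sine/cosine encoding of a
# feature x that ranges from 0 to max_x - 1.
cyclic_encode = function(x, max_x) {
  angle = 2 * pi * x / max_x
  cbind(sin = sin(angle), cos = cos(angle))
}

# Seconds range from 0 to 59, so max_x is 60.
enc = cyclic_encode(c(21, 22, 59, 0), max_x = 60)

# Euclidean distance between two encoded values (rows i and j of enc).
step_dist = function(i, j) sqrt(sum((enc[i, ] - enc[j, ])^2))

step_dist(1, 2)  # second 21 vs. second 22
step_dist(3, 4)  # second 59 vs. second 0 of the next minute
```

Both calls return the same distance, which is what a plain numeric encoding of the second would not provide across the minute boundary.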

`R6Class` object inheriting from `PipeOpTaskPreprocSimple`/`PipeOpTaskPreproc`/`PipeOp`.

PipeOpDateFeatures$new(id = "datefeatures", param_vals = list())

`id` :: `character(1)`
Identifier of resulting object, default `"datefeatures"`.

`param_vals` :: named `list`
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default `list()`.

Input and output channels are inherited from `PipeOpTaskPreprocSimple`.

The output is the input `Task` with date-related features computed and added to the feature set of the output task and the `POSIXct` columns of the data removed from the feature set (depending on the value of `keep_date_var`).

The `$state` is a named `list` with the `$state` elements inherited from `PipeOpTaskPreprocSimple`.

The parameters are the parameters inherited from `PipeOpTaskPreprocSimple`, as well as:

`keep_date_var` :: `logical(1)`
Should the `POSIXct` columns be kept as features? Default FALSE.

`cyclic` :: `logical(1)`
Should cyclic features be computed? See Internals. Default FALSE.

`year` :: `logical(1)`
Should the year be extracted as a feature? Default TRUE.

`month` :: `logical(1)`
Should the month be extracted as a feature? Default TRUE.

`week_of_year` :: `logical(1)`
Should the week of the year be extracted as a feature? Default TRUE.

`day_of_year` :: `logical(1)`
Should the day of the year be extracted as a feature? Default TRUE.

`day_of_month` :: `logical(1)`
Should the day of the month be extracted as a feature? Default TRUE.

`day_of_week` :: `logical(1)`
Should the day of the week be extracted as a feature? Default TRUE.

`hour` :: `logical(1)`
Should the hour be extracted as a feature? Default TRUE.

`minute` :: `logical(1)`
Should the minute be extracted as a feature? Default TRUE.

`second` :: `logical(1)`
Should the second be extracted as a feature? Default TRUE.

`is_day` :: `logical(1)`
Should a feature be extracted indicating whether it is day time (06:00am - 08:00pm)? Default TRUE.

The cyclic feature transformation always assumes that values start at 0, so features whose values start at 1 (e.g. day of the month) are shifted before the sine/cosine transform.
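
A minimal sketch of such a shift, using the month (1 to 12) as a 1-based feature with a fixed period of 12. The variable names are illustrative only, and the PipeOp's internal handling (for instance of features whose period varies, such as the number of days in a month) may differ.

```r
month = 1:12

# Shift from 1..12 to 0..11 so that the first value maps to angle 0.
angle = 2 * pi * (month - 1) / 12

month_sin = sin(angle)
month_cos = cos(angle)

# January sits at angle 0, i.e. (sin, cos) = (0, 1).
c(month_sin[1], month_cos[1])
```

Without the shift, the first month would not land on angle 0, although the spacing between consecutive months would be unchanged.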

Only methods inherited from `PipeOpTaskPreprocSimple`/`PipeOpTaskPreproc`/`PipeOp`.

Only fields inherited from `PipeOpTaskPreproc`/`PipeOp`.

https://mlr3book.mlr-org.com/list-pipeops.html

Other PipeOps: `PipeOpEnsemble`, `PipeOpImpute`, `PipeOpTargetTrafo`, `PipeOpTaskPreprocSimple`, `PipeOpTaskPreproc`, `PipeOp`, `mlr_pipeops_boxcox`, `mlr_pipeops_branch`, `mlr_pipeops_chunk`, `mlr_pipeops_classbalancing`, `mlr_pipeops_classifavg`, `mlr_pipeops_classweights`, `mlr_pipeops_colapply`, `mlr_pipeops_collapsefactors`, `mlr_pipeops_colroles`, `mlr_pipeops_copy`, `mlr_pipeops_encodeimpact`, `mlr_pipeops_encodelmer`, `mlr_pipeops_encode`, `mlr_pipeops_featureunion`, `mlr_pipeops_filter`, `mlr_pipeops_fixfactors`, `mlr_pipeops_histbin`, `mlr_pipeops_ica`, `mlr_pipeops_imputeconstant`, `mlr_pipeops_imputehist`, `mlr_pipeops_imputelearner`, `mlr_pipeops_imputemean`, `mlr_pipeops_imputemedian`, `mlr_pipeops_imputemode`, `mlr_pipeops_imputeoor`, `mlr_pipeops_imputesample`, `mlr_pipeops_kernelpca`, `mlr_pipeops_learner`, `mlr_pipeops_missind`, `mlr_pipeops_modelmatrix`, `mlr_pipeops_multiplicityexply`, `mlr_pipeops_multiplicityimply`, `mlr_pipeops_mutate`, `mlr_pipeops_nmf`, `mlr_pipeops_nop`, `mlr_pipeops_ovrsplit`, `mlr_pipeops_ovrunite`, `mlr_pipeops_pca`, `mlr_pipeops_proxy`, `mlr_pipeops_quantilebin`, `mlr_pipeops_randomprojection`, `mlr_pipeops_randomresponse`, `mlr_pipeops_regravg`, `mlr_pipeops_removeconstants`, `mlr_pipeops_renamecolumns`, `mlr_pipeops_replicate`, `mlr_pipeops_scalemaxabs`, `mlr_pipeops_scalerange`, `mlr_pipeops_scale`, `mlr_pipeops_select`, `mlr_pipeops_smote`, `mlr_pipeops_spatialsign`, `mlr_pipeops_subsample`, `mlr_pipeops_targetinvert`, `mlr_pipeops_targetmutate`, `mlr_pipeops_targettrafoscalerange`, `mlr_pipeops_textvectorizer`, `mlr_pipeops_threshold`, `mlr_pipeops_tunethreshold`, `mlr_pipeops_unbranch`, `mlr_pipeops_updatetarget`, `mlr_pipeops_vtreat`, `mlr_pipeops_yeojohnson`, `mlr_pipeops`

library("mlr3")
library("mlr3pipelines")  # provides po()

# Add a POSIXct column with hourly timestamps sampled from February 2020
dat = iris
set.seed(1)
dat$date = sample(seq(as.POSIXct("2020-02-01"), to = as.POSIXct("2020-02-29"), by = "hour"),
  size = 150L)
task = TaskClassif$new("iris_date", backend = dat, target = "Species")

# Extract date features, skipping cyclic, minute and second features
pop = po("datefeatures", param_vals = list(cyclic = FALSE, minute = FALSE, second = FALSE))
pop$train(list(task))
#> $output
#> <TaskClassif:iris_date> (150 x 13)
#> * Target: Species
#> * Properties: multiclass
#> * Features (12):
#>   - dbl (11): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width,
#>     date.day_of_month, date.day_of_week, date.day_of_year, date.hour,
#>     date.month, date.week_of_year, date.year
#>   - lgl (1): date.is_day
#>
pop$state
#> $dt_columns
#> [1] "date"
#>
#> $affected_cols
#> [1] "Petal.Length" "Petal.Width"  "Sepal.Length" "Sepal.Width"  "date"
#>
#> $intasklayout
#>              id    type
#> 1: Petal.Length numeric
#> 2:  Petal.Width numeric
#> 3: Sepal.Length numeric
#> 4:  Sepal.Width numeric
#> 5:         date POSIXct
#>
#> $outtasklayout
#>                    id    type
#>  1:      Petal.Length numeric
#>  2:       Petal.Width numeric
#>  3:      Sepal.Length numeric
#>  4:       Sepal.Width numeric
#>  5: date.day_of_month numeric
#>  6:  date.day_of_week numeric
#>  7:  date.day_of_year numeric
#>  8:         date.hour numeric
#>  9:       date.is_day logical
#> 10:        date.month numeric
#> 11: date.week_of_year numeric
#> 12:         date.year numeric
#>
#> $outtaskshell
#> Empty data.table (0 rows and 13 cols): Species,Petal.Length,Petal.Width,Sepal.Length,Sepal.Width,date.year...
#>