

Leibniz Institute for Prevention Research and Epidemiology – BIPS, and University of Bremen, and University of Copenhagen
Ludwig-Maximilians-Universität München, and Munich Center for Machine Learning (MCML)

In Chapter 7 we looked at simple sequential pipelines that can be built using the Graph class and a few PipeOp objects. In this chapter, we will take this further and look at non-sequential pipelines that can perform more complex operations. We will then look at tuning pipelines by combining methods in mlr3tuning and mlr3pipelines and will consider some concrete examples using multi-fidelity tuning (Section 5.3) and feature selection (Chapter 6).

We saw the power of the %>>%-operator in Chapter 7 to assemble graphs from combinations of multiple PipeOps and Learners. Given individual PipeOps or Learners, the %>>%-operator arranges these objects into a linear Graph with each PipeOp acting in sequence. The biggest advantage of this method is that it creates a very simple, sequential Graph. However, by using the gunion() function, we can instead combine multiple PipeOps, Graphs, or a mixture of both, into a parallel Graph.

In the following example, we create a Graph that centers its inputs (po("scale")) and then copies the centered data to two parallel streams: one replaces the data with columns that indicate whether data is missing (po("missind")), and the other imputes missing data using the median (po("imputemedian")), which we will return to in Section 9.3. The outputs of both streams are then combined into a single dataset using po("featureunion").
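
The parallel Graph described above (centering, then missing-value indicators and median imputation side by side, then a feature union) can be sketched as follows. This is a minimal sketch that assumes the mlr3 and mlr3pipelines packages are installed. The copy step is written out explicitly with po("copy", outnum = 2) for clarity, and the "pima" task is used only as an illustrative dataset with missing values; neither choice is prescribed by the text.

```r
library(mlr3)
library(mlr3pipelines)

# Center the inputs without rescaling them
# (assumption: "centers" means center = TRUE, scale = FALSE).
po_center = po("scale", center = TRUE, scale = FALSE)

# Two parallel streams combined with gunion():
# one produces missing-value indicator columns, the other imputes the median.
streams = gunion(list(
  po("missind"),
  po("imputemedian")
))

# Assemble the Graph: center -> copy to two streams -> parallel streams -> union.
# po("copy", outnum = 2) makes the duplication of the data explicit.
graph = po_center %>>%
  po("copy", outnum = 2) %>>%
  streams %>>%
  po("featureunion")

# Train the Graph on an illustrative task that contains missing values.
task = tsk("pima")
result = graph$train(task)[[1]]

# Inspect the combined features: imputed originals plus indicator columns.
print(result$feature_names)
print(result$missings())
```

Calling graph$plot() draws the structure, which helps to confirm that both streams receive the centered data and that their outputs meet in the feature union.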
