FlatMap Operator
The Flat-Map Operator allows an input from the source to be split into multiple entries in the sink. Because it is a map operator, split inputs can be mapped too. However, unlike a map operator, it creates a one to many mapping. Its a powerful tool that could be used to digest arrays, objects, or other nested data. In this example, we will write an dataflow that splits a string in half with FlatMap.
Prerequisites
This guide uses local
Fluvio cluster. If you need to install it, please follow the instructions at here.
Transformation
Below is an example of a transform function using flat-map. The function takes the string and splits in half. Both halves are inserted into the sink.
transforms:
- operator: flat-map
run: |
fn halfword(input: String) -> Result<Vec<String>> {
let mut ret: Vec<String> = Vec::new();
let mid = input.len() / 2;
ret.push(format!("first half: {}",&input[..mid]));
ret.push(format!("second half: {}",&input[mid..]));
Ok(ret)
}
In the example function, the return type is a vector of strings. The split string is re-encoded. In general, the transform function should return a vector of the sink's type.
Running the Example
Copy and paste following config and save it as dataflow.yaml
.
# dataflow.yaml
apiVersion: 0.5.0
meta:
name: flat-map-example
version: 0.1.0
namespace: examples
config:
converter: raw
topics:
sentences:
schema:
value:
type: string
halfword:
schema:
value:
type: string
services:
flat-map-service:
sources:
- type: topic
id: sentences
transforms:
- operator: flat-map
run: |
fn halfword(input: String) -> Result<Vec<String>> {
let mut ret: Vec<String> = Vec::new();
let mid = input.len() / 2;
ret.push(format!("{}",&input[..mid]));
ret.push(format!("{}",&input[mid..]));
Ok(ret)
}
sinks:
- type: topic
id: halfword
To run example:
$ sdf run
Produce sentences to in sentence
topic:
$ echo "0123456789" | fluvio produce sentences
Consume topic halfword
to retrieve the result in another terminal:
$ fluvio consume halfword -Bd
first half: 01234
second half: 56789
Here two strings are produced from the input.
Cleanup
Exit sdf
terminal and clean-up. The --force
flag removes the topics:
$ sdf clean --force
Conclusion
We just covered another basic operator in SDF, the Flat-Map Operator. The Flat-Map is a powerful operator. As a matter of fact, the Flat-Map operator can be used inplace of other operators.