I went through the Data pipelines course, which was great. I am now wondering: is there an existing package you would recommend that offers similar functionality, and possibly more?
I know about sklearn.pipeline… but it does not seem to be quite the same thing.
Since I am not a data engineer myself, I'd rather rely on an existing tool than develop one myself. Among the things I'd love to be able to do are, for instance:
- handle more complex functions with multiple inputs and outputs
- plot the graph of tasks in the pipeline, displaying the function name, inputs, and outputs of each task
- trace tasks across different modules within a package (I assume this is not straightforward with the Pipeline class from the course)
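
To make the first two points concrete, here is a rough sketch of the kind of behaviour I have in mind. It uses only the standard library (`graphlib`, Python 3.9+), and every name in it (`task`, `tasks`, `run`, `describe`, `split`, `compare`) is hypothetical, made up for illustration. It is not the Pipeline class from the course, and not how any existing package does it.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: task name -> (function, input names, output names)
tasks = {}

def task(inputs, outputs):
    """Register a function together with the names of its inputs and outputs."""
    def decorator(func):
        tasks[func.__name__] = (func, tuple(inputs), tuple(outputs))
        return func
    return decorator

@task(inputs=["raw"], outputs=["train", "test"])
def split(raw):
    # One input, two outputs
    half = len(raw) // 2
    return raw[:half], raw[half:]

@task(inputs=["train", "test"], outputs=["report"])
def compare(train, test):
    # Two inputs, one output
    return {"train_mean": sum(train) / len(train),
            "test_mean": sum(test) / len(test)}

def run(initial):
    """Run every registered task in dependency order; return all named values."""
    values = dict(initial)
    # Map each output name to the task that produces it
    producer = {out: name for name, (_, _, outs) in tasks.items() for out in outs}
    # A task depends on the tasks producing its inputs (if any task does)
    graph = {name: {producer[i] for i in ins if i in producer}
             for name, (_, ins, _) in tasks.items()}
    for name in TopologicalSorter(graph).static_order():
        func, ins, outs = tasks[name]
        result = func(*(values[i] for i in ins))
        if len(outs) == 1:
            result = (result,)  # normalise single outputs to a tuple
        values.update(zip(outs, result))
    return values

def describe():
    """Return one text line per task: inputs -> function name -> outputs."""
    return [f"{', '.join(ins)} -> {name} -> {', '.join(outs)}"
            for name, (_, ins, outs) in tasks.items()]
```

Calling `run({"raw": [1, 2, 3, 4]})` would execute `split` before `compare`, and `describe()` gives a plain-text stand-in for the graphical plot I am after. I would much prefer a maintained package that already does this properly.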
Thanks for any answers.