rain.core package#

Submodules#

rain.core.base module#

Copyright (C) 2023 Università degli Studi di Camerino and Sigma S.p.A. Authors: Alessandro Antinori, Rosario Capparuccia, Riccardo Coltrinari, Flavio Corradini, Marco Piangerelli, Barbara Re, Marco Scarpetta

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.

class rain.core.base.ComputationalNode(node_id: str)[source]#

Bases: SimpleNode, InputMixin, OutputMixin

Class representing a computational node, having both input and output attributes.

Parameters:

node_id (str) – The unique identifier of the node.

abstract execute()[source]#

Expose the main functionality: depending on the node, the computation is done using a specific Python library and its function/s.

has_attribute(attribute: str) bool[source]#

Tell if the node has the given attribute

Parameters:

attribute (str) – The name of the parameter to check.

Returns:

True if the node has the given parameter, False otherwise.

Return type:

bool
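The pattern described above, an abstract execute() that each concrete node fills in, can be sketched without importing Rain. The classes below are stand-ins, not Rain's own code: `SketchNode`, `Doubler`, and the `value` and `result` attributes are hypothetical names chosen for the example.

```python
from abc import ABC, abstractmethod


class SketchNode(ABC):
    """Hypothetical stand-in for a node base class (not Rain's SimpleNode)."""

    def __init__(self, node_id: str):
        self.node_id = node_id

    @abstractmethod
    def execute(self):
        """Concrete nodes put their computation here."""

    def has_attribute(self, attribute: str) -> bool:
        # An attribute counts as present if it was set on the instance.
        return attribute in vars(self)


class Doubler(SketchNode):
    """Hypothetical computational node: reads `value`, writes `result`."""

    def __init__(self, node_id: str):
        super().__init__(node_id)
        self.value = None   # input attribute
        self.result = None  # output attribute

    def execute(self):
        self.result = self.value * 2


node = Doubler("double_1")
node.value = 21
node.execute()
```

Because execute() is abstract in the stand-in base class, instantiating `SketchNode` directly raises a TypeError; only subclasses that implement it can be created.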

class rain.core.base.DataFlow(dataflow_id: str, executor: ~typing.Any = <rain.core.execution.LocalExecutor object>)[source]#

Bases: object

Class representing a Dataflow in Rain, containing nodes and edges.

Parameters:
  • dataflow_id (str) – The unique identifier of the dataflow

  • executor (Any, default LocalExecutor) – The executor used to run the Dataflow

add_edge(edge: MultiEdge)[source]#

Method used to add an edge to the Dataflow.

Parameters:

edge (MultiEdge) – The edge that should be added to the Dataflow.

add_edges(edges: List[MultiEdge])[source]#

Method used to add a list of edges to the Dataflow.

Parameters:

edges (List[MultiEdge]) – The list of edges that should be added to the Dataflow.

add_node(node) bool[source]#

Add a node to the dataflow. If a node with the same node id exists then an exception will be raised.

Parameters:

node (SimpleNode) – The node to add.

Returns:

True if the node has been correctly added.

Return type:

bool

Raises:

DuplicatedNodeId – If a node with the same node id already exists.
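A minimal sketch of the duplicate-id check described above, assuming nodes are stored in a dictionary keyed by node id. `SketchDataFlow` and its two-argument `add_node` are hypothetical and only illustrate the behaviour; Rain's DataFlow may store nodes differently.

```python
class DuplicatedNodeId(Exception):
    """Local stand-in for the exception of the same name."""


class SketchDataFlow:
    """Hypothetical container keyed by node id (not Rain's DataFlow)."""

    def __init__(self, dataflow_id: str):
        self.dataflow_id = dataflow_id
        self._nodes = {}

    def add_node(self, node_id: str, node: object) -> bool:
        # Refuse to overwrite an existing node with the same id.
        if node_id in self._nodes:
            raise DuplicatedNodeId(f"Node '{node_id}' already exists.")
        self._nodes[node_id] = node
        return True


flow = SketchDataFlow("df1")
first = flow.add_node("load", object())
try:
    flow.add_node("load", object())
    duplicate_rejected = False
except DuplicatedNodeId:
    duplicate_rejected = True
```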

add_nodes(nodes) bool[source]#

Add a list of nodes to the dataflow. If a node with the same node id already exists, an exception is raised.

Parameters:

nodes (list of SimpleNode) – The nodes to add.

Returns:

True if the nodes have been correctly added.

Return type:

bool

Raises:

DuplicatedNodeId – If a node with the same node id already exists.

execute()[source]#

Execute all the nodes contained in the Dataflow, provided that it contains no cycles.

get_edge(source: SimpleNode, destination: SimpleNode) MultiEdge[source]#

Method used to get the edge with the specified source and destination nodes.

Parameters:
  • source (SimpleNode) – The source node of the edge.

  • destination (SimpleNode) – The destination node of the edge.

Returns:

The required edge with the specific source and destination node.

Return type:

MultiEdge

get_execution_ordered_nodes()[source]#

Returns a list of SimpleNode in topologically sorted order.

Returns:

The list of ordered nodes to be executed.

Return type:

List[SimpleNode]
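To illustrate what a topologically sorted order means here, the sketch below uses the standard-library `graphlib` module (Python 3.9+). This is not necessarily how Rain computes the order; the node names and the dependency mapping are made up for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical dataflow: each key runs after all the nodes in its set.
predecessors = {
    "load": set(),
    "clean": {"load"},
    "train": {"clean"},
    "report": {"train", "clean"},
}

# static_order() yields every node after all of its predecessors.
order = list(TopologicalSorter(predecessors).static_order())
```

Every node appears in `order` only after the nodes it depends on, which is the property an executor needs to run a dataflow step by step.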

get_ingoing_edges(destination) List[MultiEdge][source]#

Method used to get all the ingoing edges of the specified destination node.

Parameters:

destination (SimpleNode) – The destination node of the edges.

Returns:

The required ingoing edges with the specific destination node.

Return type:

List[MultiEdge]

get_node(node_id: str)[source]#

Method used to return the SimpleNode given its id.

Parameters:

node_id (str) – The id of the node to return.

Returns:

The SimpleNode with the given id.

Return type:

SimpleNode

get_outgoing_edges(source: SimpleNode) List[MultiEdge][source]#

Method used to get all the outgoing edges of the specified source node.

Parameters:

source (SimpleNode) – The source node of the edges.

Returns:

The required outgoing edges with the specific source node.

Return type:

List[MultiEdge]

has_node(node: SimpleNode)[source]#

Tell if the Dataflow contains the given SimpleNode

Parameters:

node (SimpleNode) – The SimpleNode to check

Returns:

True if the Dataflow contains the given node, False otherwise.

Return type:

bool

is_acyclic()[source]#

Returns True if the Dataflow is a directed acyclic graph (DAG) or False if not.
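One common way to check that a directed graph is acyclic is Kahn's algorithm, sketched below. It is offered as an illustration of the check, not as Rain's implementation; the free function `is_acyclic` and the edge representation as (source, destination) pairs are assumptions for the example.

```python
from collections import deque


def is_acyclic(nodes, edges):
    """Return True if the directed graph (nodes, edges) has no cycle."""
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1

    # Start from nodes with no incoming edges and peel the graph layer by layer.
    ready = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while ready:
        node = ready.popleft()
        visited += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Nodes on a cycle never reach in-degree zero, so they are never visited.
    return visited == len(nodes)


dag = is_acyclic(["a", "b", "c"], [("a", "b"), ("b", "c")])
loop = is_acyclic(["a", "b", "c"], [("a", "b"), ("b", "c"), ("c", "a")])
```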

class rain.core.base.EdgeContentSpecifier(node: SimpleNode, nodes_attributes: Union[str, List])[source]#

Bases: object

It works as an attribute specifier for the nodes that are used within a MultiEdge.

Parameters:
  • node (SimpleNode) – The node that contains the chosen attributes

  • nodes_attributes (Union[str, List]) – The chosen attributes of the node, they can either be the input or the output of the node.

class rain.core.base.InputMixin[source]#

Bases: object

Mixin that marks a SimpleNode as an input node, so that the right output variables are set.

get_output_value(output_name: str) Any[source]#

Given the name of an output attribute return the corresponding value.

Parameters:

output_name (str) – The name of the output attribute.

Returns:

The value of the given attribute.

Return type:

Any

class rain.core.base.InputNode(node_id: str)[source]#

Bases: SimpleNode, InputMixin

Class representing an input node.

Parameters:

node_id (str) – The unique identifier of the node.

abstract execute()[source]#

Expose the main functionality: depending on the node, the computation is done using a specific Python library and its function/s.

has_attribute(attribute: str) bool[source]#

Tell if the node has the given attribute

Parameters:

attribute (str) – The name of the parameter to check.

Returns:

True if the node has the given parameter, False otherwise.

Return type:

bool

class rain.core.base.LibTag(value)[source]#

Bases: Enum

Enumeration representing the library which the SimpleNode refers to.

BASE = 'Base'#
MONGODB = 'PyMongo'#
OTHER = 'Other'#
PANDAS = 'Pandas'#
PYSAD = 'PySad'#
SKLEARN = 'Scikit-Learn'#
SPARK = 'PySpark'#
TPOT = 'TPOT'#
class rain.core.base.Meta(clsname, bases, dct)[source]#

Bases: type

Metaclass used by a SimpleNode to manage the inheritance of the attributes. In particular, it updates the variables related to the inputs, outputs and methods, so that the attributes of parent classes are not lost by child classes.
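The idea of a metaclass that keeps parent attributes visible in child classes can be sketched as follows. This is a simplified illustration, not Rain's Meta: `MergingMeta` and the `_inputs` class attribute are hypothetical names, and only a single dictionary is merged.

```python
class MergingMeta(type):
    """Hypothetical metaclass: merge a dict class attribute across bases."""

    def __new__(mcs, clsname, bases, dct):
        merged = {}
        # Collect the parents' declarations first so children can override.
        for base in bases:
            merged.update(getattr(base, "_inputs", {}))
        merged.update(dct.get("_inputs", {}))
        dct["_inputs"] = merged
        return super().__new__(mcs, clsname, bases, dct)


class Parent(metaclass=MergingMeta):
    _inputs = {"dataset": None}


class Child(Parent):
    # Without the metaclass this would shadow Parent._inputs entirely.
    _inputs = {"target": None}
```

With plain class attributes, `Child._inputs` would contain only `target`; the metaclass merges the parent's entries in at class-creation time.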

class rain.core.base.MultiEdge(source: EdgeContentSpecifier, destination: EdgeContentSpecifier)[source]#

Bases: object

Represents an edge of the dataflow.

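Parameters:
  • source (EdgeContentSpecifier) – The source node and its chosen attributes.

  • destination (EdgeContentSpecifier) – The destination node and its chosen attributes.
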
class rain.core.base.OutputMixin[source]#

Bases: object

Mixin that marks a SimpleNode as an output node, so that the right input variables are set.

set_input_value(input_name: str, input_value: Any)[source]#

Given the name of an input attribute, set its value.

Parameters:
  • input_name (str) – The name of the input attribute.

  • input_value (Any) – The value to set for the given attribute.

class rain.core.base.OutputNode(node_id: str)[source]#

Bases: SimpleNode, OutputMixin

Class representing an output node.

Parameters:

node_id (str) – The unique identifier of the node.

abstract execute()[source]#

Expose the main functionality: depending on the node, the computation is done using a specific Python library and its function/s.

has_attribute(attribute: str) bool[source]#

Tell if the node has the given attribute

Parameters:

attribute (str) – The name of the parameter to check.

Returns:

True if the node has the given parameter, False otherwise.

Return type:

bool

class rain.core.base.SimpleNode(node_id: str)[source]#

Bases: object

Base class of each node in Rain.

Parameters:

node_id (str) – The unique identifier of the node

abstract execute()[source]#

Expose the main functionality: depending on the node, the computation is done using a specific Python library and its function/s.

abstract has_attribute(attribute: str) bool[source]#

Tell if the node has the given attribute

Parameters:

attribute (str) – The name of the parameter to check.

Returns:

True if the node has the given parameter, False otherwise.

Return type:

bool

class rain.core.base.Tags(library: LibTag, type: TypeTag)[source]#

Bases: object

DataClass that acts as a tag for a SimpleNode: it stores the library and the type of the node

Notes

library: LibTag

The library used by the node.

type: TypeTag

The type of the SimpleNode

library: LibTag#
type: TypeTag#
class rain.core.base.TypeTag(value)[source]#

Bases: Enum

Enumeration representing the type of the SimpleNode according to its functionality.

CLASSIFIER = 'Classifier'#
CLUSTERER = 'Clusterer'#
CUSTOM = 'Custom'#
ESTIMATOR = 'Estimator'#
INPUT = 'Input'#
METRICS = 'Metrics'#
OTHER = 'Other'#
OUTPUT = 'Output'#
PREDICTOR = 'Predictor'#
REGRESSOR = 'Regressor'#
TRAINER = 'Trainer'#
TRANSFORMER = 'Transformer'#
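
The Tags dataclass pairs one LibTag with one TypeTag. The sketch below redefines a subset of both enumerations locally, with the member values listed above, so that it runs without Rain installed; it shows construction of a tag and lookup of an enumeration member by its value.

```python
from dataclasses import dataclass
from enum import Enum


class LibTag(Enum):  # local stand-in, subset of the documented members
    PANDAS = "Pandas"
    SKLEARN = "Scikit-Learn"


class TypeTag(Enum):  # local stand-in, subset of the documented members
    TRANSFORMER = "Transformer"
    CLASSIFIER = "Classifier"


@dataclass
class Tags:
    library: LibTag
    type: TypeTag


tags = Tags(library=LibTag.PANDAS, type=TypeTag.TRANSFORMER)
same_member = LibTag("Pandas")  # Enum lookup by value returns the member
```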

rain.core.exception module#

exception rain.core.exception.CyclicDataFlowException(dataflow_id: str)[source]#

Bases: Exception

exception rain.core.exception.DuplicatedNodeId(msg: str)[source]#

Bases: Exception

exception rain.core.exception.EdgeConnectionError(msg: str)[source]#

Bases: Exception

exception rain.core.exception.EstimatorNotFoundException(msg)[source]#

Bases: Exception

exception rain.core.exception.InputNotFoundException(msg)[source]#

Bases: Exception

exception rain.core.exception.PandasSequenceException(msg)[source]#

Bases: Exception

exception rain.core.exception.ParametersException(msg)[source]#

Bases: ValueError
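
Since ParametersException derives from ValueError while the other exceptions derive directly from Exception, a handler for ValueError also catches ParametersException. The sketch below shows this with local stand-ins that have the same base classes as documented above; the `classify` helper is hypothetical.

```python
class DuplicatedNodeId(Exception):
    """Stand-in with the documented base class."""


class ParametersException(ValueError):
    """Stand-in with the documented base class."""


def classify(exc: Exception) -> str:
    # The first matching handler wins, so ValueError subclasses stop here.
    try:
        raise exc
    except ValueError:
        return "value-error"
    except Exception:
        return "other"


param_result = classify(ParametersException("bad parameter"))
dup_result = classify(DuplicatedNodeId("node already present"))
```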

rain.core.execution module#

class rain.core.execution.LocalExecutor(*args, **kwargs)[source]#

Bases: object

A local executor: the execution is performed on the machine that runs the Dataflow.

execute(dataflow)[source]#

Method that executes the given Dataflow in a precise order. At each step it propagates the results to the following nodes by checking the edges.

Parameters:

dataflow (DataFlow) – The dataflow that has to be executed.
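
The execute-then-propagate loop described above might look roughly like the sketch below. It is a simplified illustration under stated assumptions, not LocalExecutor's code: `Const`, `AddOne`, and `run` are hypothetical, and an edge is modelled as a plain tuple (source, source attribute, destination, destination attribute).

```python
class Const:
    """Hypothetical input node whose output is fixed at construction."""

    def __init__(self, value):
        self.out = value

    def execute(self):
        pass  # nothing to compute


class AddOne:
    """Hypothetical computational node: out = inp + 1."""

    def __init__(self):
        self.inp = None
        self.out = None

    def execute(self):
        self.out = self.inp + 1


def run(ordered_nodes, edges):
    """Execute nodes in order; after each one, copy outputs along its edges."""
    for node in ordered_nodes:
        node.execute()
        for src, src_attr, dst, dst_attr in edges:
            if src is node:
                setattr(dst, dst_attr, getattr(src, src_attr))


a, b, c = Const(1), AddOne(), AddOne()
run([a, b, c], [(a, "out", b, "inp"), (b, "out", c, "inp")])
```

The node list must already be in execution order, so that every node's inputs have been filled in by its predecessors before it runs.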

class rain.core.execution.Singleton[source]#

Bases: type

Singleton class to represent all the possible executors available in Rain
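
A metaclass-based singleton is commonly written as below. This is a generic sketch of the pattern, not necessarily identical to Rain's Singleton; `SketchExecutor` is a hypothetical class used to demonstrate it.

```python
class Singleton(type):
    """Stand-in metaclass: each class using it gets one shared instance."""

    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Create the instance on the first call, then always reuse it.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class SketchExecutor(metaclass=Singleton):
    pass


first = SketchExecutor()
second = SketchExecutor()
```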

rain.core.parameter module#

class rain.core.parameter.KeyValueParameter(name: str, p_type: type, value: Optional[Any] = None, is_mandatory: bool = False)[source]#

Bases: SimpleParameter

A KeyValue Parameter contains information about parameters that can be used during the transformation.

Parameters:
  • name (str) – The name of this parameter.

  • p_type (type) – The type of this parameter.

  • value (Any, default None) – The value of this parameter.

  • is_mandatory (bool, default False) – True if the parameter is mandatory, False otherwise.

property name: str#

Returns the variable containing the name of the parameter.

property type: type#

Returns the variable containing the type of the parameter.

property value: Any#

Returns the variable containing the value of the parameter.

class rain.core.parameter.Parameters(**kwargs)[source]#

Bases: object

Parameters handles all the parameters within a SimpleNode.

It can add one or several parameters, group parameters together, retrieve them, and return a dictionary representation of the parameters that is convenient to pass to library functions as kwargs.

add_all_parameters(**kwargs)[source]#

Add one or more parameters to the collection.

Parameters:

kwargs (dict) – Of the form {param_name: parameter}. Each key will be set as the attribute name.

add_group(group_name: str, keys: list)[source]#

Adds a group name to some parameters.

Parameters:
  • group_name (str) – Name of the group.

  • keys (list[str]) – Used to specify the parameters to include in the group. Each string must correspond to the attribute name of the parameter.

add_parameter(parameter_name: str, parameter)[source]#

Add a parameter to the collection.

Parameters:
  • parameter_name (str) – Name of the parameter, can be used later to reference it as an attribute.

  • parameter (SimpleParameter) – The parameter to add.

get_all()[source]#

Gets all the parameters.

Return type:

list[SimpleParameter]

get_all_from_group(group_name: str)[source]#

Gets all the parameters contained in a group.

Parameters:

group_name (str) – Name of the group.

Return type:

list[SimpleParameter]

get_dict()[source]#

Gets all the KeyValueParameters as a dictionary, in order to simplify passing parameters to library functions.

Returns:

dict of the form {param_lib_name: param_value} where the key is the name of the parameter as required by the library.

Return type:

dict[str, Any]

get_dict_from_group(group_name: str)[source]#

Gets all the KeyValueParameters contained in a group as a dictionary, in order to simplify passing parameters to library functions.

Returns:

dict of the form {param_lib_name: param_value} where the key is the name of the parameter as required by the library.

Return type:

dict[str, Any]
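
The grouping and dictionary conversion described above can be sketched with a small stand-in collection. `KV` and `SketchParameters` are hypothetical and much simpler than Rain's classes; the parameter names (`filepath_or_buffer`, `sep`, `columns`) are sample values only.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class KV:
    """Hypothetical stand-in for a key-value parameter."""

    name: str  # name expected by the library function
    value: Any = None
    group_name: Optional[str] = None


class SketchParameters:
    """Hypothetical collection; parameters are stored under attribute names."""

    def __init__(self, **kwargs):
        self._params = dict(kwargs)

    def add_group(self, group_name, keys):
        for key in keys:
            self._params[key].group_name = group_name

    def get_dict(self):
        return {p.name: p.value for p in self._params.values()}

    def get_dict_from_group(self, group_name):
        return {
            p.name: p.value
            for p in self._params.values()
            if p.group_name == group_name
        }


params = SketchParameters(
    path=KV("filepath_or_buffer", "data.csv"),
    delim=KV("sep", ";"),
    cols=KV("columns", ["a", "b"]),
)
params.add_group("read", ["path", "delim"])
read_kwargs = params.get_dict_from_group("read")
```

The resulting `read_kwargs` dictionary is keyed by library parameter names, so it could be unpacked with `**read_kwargs` into a function that accepts those keyword arguments.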

group_all(group_name: str)[source]#

Adds a group name to all the parameters.

Parameters:

group_name (str) – Name of the group.

class rain.core.parameter.SimpleHyperParameter(is_mandatory: bool = False)[source]#

Bases: SimpleParameter

A KeyValue Parameter contains information about parameters that can be used during the transformation.

Parameters:

is_mandatory (bool, default False) – True if the parameter is mandatory, False otherwise.

property is_mandatory#

Returns the variable that specifies whether the parameter is mandatory.

class rain.core.parameter.SimpleParameter(is_mandatory: bool = False, group_name: Optional[str] = None)[source]#

Bases: object

Base class that represents a Parameter for a given node.

Parameters:
  • is_mandatory (bool, default False) – True if the parameter is mandatory, False otherwise.

  • group_name (str, default None) – Name of the group of this parameter, used to pass it to the right function.

property is_mandatory#

Returns the variable that specifies whether the parameter is mandatory.

Module contents#
