API
clustergraph.clustergraph module
- class clustergraph.clustergraph.ClusterGraph(X, clusters, metric_clusters='centroids', metric_points=scipy.spatial.distance.euclidean, parameters_metric_points={}, type_pruning='conn', algo='bf', weight='weight', knn_g=None, weight_knn_g='weight', k_compo=2, dist_weight=True)
Bases:
GraphPreprocess, GraphPruning
ClusterGraph Class
A class representing a graph of clusters. Each node represents a cluster, and each edge carries the distance between two clusters, computed either from the distances between their points or from the distance between their centroids.
- X
The input data points.
- Type:
ndarray, shape (n_samples, n_features)
- clusters
A list where each element is an array representing the points in a cluster.
- Type:
list of arrays
- metric_clusters
The method used to calculate the distance between clusters. Default is “centroids”.
- Type:
str, optional
- metric_points
The distance metric used to calculate distances between points. Default is Euclidean distance.
- Type:
callable, optional
- parameters_metric_points
Additional parameters for the point distance metric.
- Type:
dict, optional
- type_pruning
The type of pruning to apply: “conn” for connectivity pruning or “md” for metric distortion pruning. Default is “conn”.
- Type:
str, optional
- algo
The algorithm used for graph pruning: “bf” for brute force (best but slowest) or “ps” for path simplification (faster). Default is “bf”.
- Type:
str, optional
- weight
The edge weight attribute name in the graph. Default is “weight”.
- Type:
str, optional
- knn_g
The precomputed k-NN graph. Default is None, meaning the k-NN graph will be computed.
- Type:
networkx.Graph or None, optional
- weight_knn_g
The edge weight attribute for the k-NN graph. Default is “weight”.
- Type:
str, optional
- k_compo
The number of edges added to merge disconnected components after metric distortion pruning. Default is 2.
- Type:
int, optional
- dist_weight
Whether to apply distance-based weighting for pruning. Default is True.
- Type:
bool, optional
- Graph
The graph representing the clusters and their distances.
- Type:
networkx.Graph
- is_knn_computed
Flag indicating whether the k-NN graph has been computed.
- Type:
int
- original_graph
The original, unpruned graph.
- Type:
networkx.Graph
Initializes the ClusterGraph with the given parameters.
- Parameters:
X (ndarray, shape (n_samples, n_features)) – The input data points.
clusters (list of arrays) – A list where each element is an array representing the points in a cluster.
metric_clusters (str, optional) – The method used to calculate the distance between clusters. Default is “centroids”.
metric_points (callable, optional) – The distance metric used to calculate distances between points. Default is Euclidean distance.
parameters_metric_points (dict, optional) – Additional parameters for the point distance metric.
type_pruning (str, optional) – The type of pruning to apply: “conn” for connectivity pruning or “md” for metric distortion pruning. Default is “conn”.
algo (str, optional) – The algorithm used for graph pruning: “bf” for brute force (best but slowest) or “ps” for path simplification (faster). Default is “bf”.
weight (str, optional) – The edge weight attribute name in the graph. Default is “weight”.
knn_g (networkx.Graph or None, optional) – The precomputed k-NN graph. Default is None, meaning the k-NN graph will be computed.
weight_knn_g (str, optional) – The edge weight attribute for the k-NN graph. Default is “weight”.
k_compo (int, optional) – The number of edges added to merge disconnected components after metric distortion pruning. Default is 2.
dist_weight (bool, optional) – Whether to apply distance-based weighting for pruning. Default is True.
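The centroid-based cluster distance that the class description mentions can be illustrated with a small, self-contained sketch. It is not the library's implementation: the helper names `centroid` and `centroid_cluster_edges` are hypothetical, and the sketch assumes each cluster is given as a list of row indices into X and that the point metric is Euclidean.

```python
import math
from itertools import combinations


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def centroid_cluster_edges(X, clusters):
    """Return {(i, j): distance} for every pair of clusters (i < j).

    Assumption: each cluster is a list of row indices into X.
    """
    centroids = [centroid([X[idx] for idx in c]) for c in clusters]
    return {
        (i, j): math.dist(centroids[i], centroids[j])
        for i, j in combinations(range(len(clusters)), 2)
    }


# Toy data: five 2-D points grouped into three clusters.
X = [[0.0, 0.0], [0.0, 2.0], [3.0, 0.0], [3.0, 2.0], [6.0, 1.0]]
clusters = [[0, 1], [2, 3], [4]]
edges = centroid_cluster_edges(X, clusters)
```

Each key of `edges` is a pair of cluster indices and each value is the weight that an edge between those two cluster nodes would carry under these assumptions.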
- get_graph()
Retrieves the pruned or original graph.
- Returns:
The ClusterGraph, either pruned or unpruned depending on the state of the graph.
- Return type:
networkx.Graph
Notes
If the graph is pruned, the pruned edges will be removed and the pruned graph will be returned.
- prune_distortion(knn_g=10, nb_edge_pruned=-1, score=False, algo='bf', weight_knn_g='weight', k_compo=2, dist_weight=True)
Performs graph pruning based on distortion score.
- Parameters:
knn_g (int or networkx.Graph, optional) – The number of nearest neighbors or a precomputed k-NN graph. Default is 10.
nb_edge_pruned (int, optional) – The number of edges to prune. Default is -1 (no pruning limit).
score (bool, optional) – Whether to compute and return the pruning score. Default is False.
algo (str, optional) – The pruning algorithm. Default is “bf”.
weight_knn_g (str, optional) – The edge weight attribute for the k-NN graph. Default is “weight”.
k_compo (int, optional) – The number of edges added to merge disconnected components after pruning. Default is 2.
dist_weight (bool, optional) – Whether to apply distance-based weighting for pruning. Default is True.
- Returns:
The pruned graph after distortion pruning.
- Return type:
networkx.Graph
clustergraph.GraphPruning module
- class clustergraph.GraphPruning.GraphPruning(graph=None, type_pruning='conn', algo='bf', weight='weight', knn_g=None, weight_knn_g='weight', k_compo=2, dist_weight=True)
Bases:
object
A class to perform pruning of a graph using different strategies like connectivity pruning or metric distortion pruning.
The class allows for the pruning of edges based on connectivity preservation or metric distortion. It also supports merging disconnected components and pruning edges in the merged graph to reduce noise.
Initializes the GraphPruning object with the provided graph and pruning strategy.
- Parameters:
graph (networkx.Graph, optional) – The graph to prune. Default is None.
type_pruning (str, optional) – The type of pruning to apply. Options are: “conn” for connectivity pruning (default), “md” for metric distortion pruning.
algo (str, optional) – The algorithm to use for pruning edges. Options are: “bf” for brute-force (best but slowest), “ps” for path simplification (faster). Default is “bf”.
weight (str, optional) – The key in the graph to use as the edge weight. Default is “weight”.
knn_g (networkx.Graph, optional) – A k-nearest neighbors graph used for metric distortion pruning. Default is None.
weight_knn_g (str, optional) – The key for the edge weight in the knn graph. Default is “weight”.
k_compo (int, optional) – The number of edges to add to merge disconnected components after metric distortion pruning. Default is 2.
dist_weight (bool, optional) – If True, edge weights are used when calculating metric distortion. Default is True.
- get_merged_graph(key, nb_edges)
Retrieves the merged graph after pruning edges.
- Parameters:
key (str) – The key indicating which pruning strategy was used (“md_bf”, “md_ps”, or “conn_merged”).
nb_edges (int) – The number of edges to prune. If -1, all edges will be pruned.
- Returns:
The merged and pruned graph.
- Return type:
networkx.Graph
- merge_graph(nb_edges=-1, k_compo=2, score=False)
Merges disconnected components in the graph and prunes edges among the merged components.
- Parameters:
nb_edges (int, optional) – The number of edges to prune from the merged graph. If -1, all edges will be pruned. Default is -1.
k_compo (int, optional) – The number of edges to add to merge disconnected components after pruning. Default is 2.
score (bool, optional) – If True, the method returns the score evolution (connectivity). Default is False.
- Returns:
networkx.Graph – The merged and pruned graph.
list of float, optional – A list of float values representing the evolution of the score if score is True.
- merge_graph_draft(pruned_gg=None, nb_edges=-1)
Merges disconnected components in the graph and prunes edges among the merged components.
- Parameters:
pruned_gg (networkx.Graph, optional) – The graph to merge and prune. If None, the pruned graph is used. Default is None.
nb_edges (int, optional) – The maximum number of edges to prune from the merged graph. If -1, all edges will be pruned. Default is -1.
- Returns:
The merged and pruned graph.
- Return type:
networkx.Graph
- prune(graph=None, nb_edge_pruned=-1, score=False)
Prunes edges from the graph using the selected pruning strategy.
- Parameters:
graph (networkx.Graph, optional) – The graph to prune. If None, the graph provided at initialization is used. Default is None.
nb_edge_pruned (int, optional) – The maximum number of edges to prune. If -1, all possible edges will be pruned. Default is -1.
score (bool, optional) – If True, the method returns the score evolution (connectivity or metric distortion). Default is False.
- Returns:
networkx.Graph – The pruned graph.
list of float, optional – A list of float values representing the evolution of the score if score is True.
- prune_conn(nb_edge_pruned=-1, score=False, algo='bf', weight='weight')
Performs connectivity pruning to retain the most important edges in the graph.
- Parameters:
nb_edge_pruned (int, optional) – The number of edges to prune. If -1, all possible edges will be pruned. Default is -1.
score (bool, optional) – If True, the method returns the score evolution (connectivity). Default is False.
algo (str, optional) – The pruning algorithm to use. Options are “bf” for brute-force and “ps” for path simplification. Default is “bf”.
weight (str, optional) – The key for edge weight in the graph. Default is “weight”.
- Returns:
networkx.Graph – The pruned graph.
list of float, optional – A list of float values representing the evolution of the score if score is True.
- prune_distortion_pr(knn_g, nb_edge_pruned=-1, score=False, algo='bf', weight_knn_g='weight', k_compo=2, dist_weight=True, is_knn_computed=-1)
Performs metric distortion pruning using the k-nearest neighbors graph.
- Parameters:
knn_g (networkx.Graph) – The k-nearest neighbors graph to use for metric distortion pruning.
nb_edge_pruned (int, optional) – The number of edges to prune. If -1, all possible edges will be pruned. Default is -1.
score (bool, optional) – If True, the method returns the score evolution (metric distortion). Default is False.
algo (str, optional) – The pruning algorithm to use. Options are “bf” for brute-force and “ps” for path simplification. Default is “bf”.
weight_knn_g (str, optional) – The key for edge weight in the k-nearest neighbors graph. Default is “weight”.
k_compo (int, optional) – The number of edges to add to merge disconnected components after pruning. Default is 2.
dist_weight (bool, optional) – If True, edge weights are used when calculating metric distortion. Default is True.
is_knn_computed (int, optional) – The identifier for the computed k-nearest neighbors graph.
- Returns:
networkx.Graph – The pruned graph.
list of float, optional – A list of float values representing the evolution of the score if score is True.
clustergraph.ConnectivityPruning module
- class clustergraph.ConnectivityPruning.ConnectivityPruning(algo='bf', weight='weight')
Bases:
object
A class to prune edges from a graph using different algorithms while preserving connectivity.
This class implements edge pruning strategies based on connectivity preservation. The algorithms aim to prune edges in a manner that minimally impacts the overall connectivity of the graph.
Reference:
Zhou, F., Mahler, S., Toivonen, H.: Simplification of Networks by Edge Pruning. In: Berthold, M.R. (ed.) Bisociative Knowledge Discovery: An Introduction to Concept, Algorithms, Tools, And Applications, pp. 179–198. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-31830-6_13
Initializes the ConnectivityPruning object with the specified algorithm and edge weight key.
- Parameters:
algo (str, optional) – The algorithm used to prune edges. Options are “bf” for brute force (slower but better) and “ps” for path simplification (faster but less accurate). Default is “bf”.
weight (str, optional) – The key under which the weight of edges is stored in the graph. Default is “weight”.
- BF_edge_choice(g, nb_edges=-1, score=False)
Prunes edges from the graph using the Brute Force algorithm based on connectivity.
- Parameters:
g (networkx.Graph) – The input graph from which edges are to be pruned.
nb_edges (int, optional) – The number of edges to prune. If -1, all edges will be considered for pruning. Default is -1.
score (bool, optional) – If True, returns the connectivity score evolution after each pruning step. Default is False.
- Returns:
networkx.Graph – The pruned graph.
list – A list of removed edges in the form of tuples (u, v).
list, optional – If score=True, a list of float values representing the connectivity score evolution.
- PS_edge_choice(g, nb_edges, score=False)
Prunes edges from the graph using the Path Simplification algorithm.
- Parameters:
g (networkx.Graph) – The input graph from which edges are to be pruned.
nb_edges (int) – The number of edges to prune.
score (bool, optional) – If True, returns the evolution of the evaluation criteria after each pruning step. Default is False.
- Returns:
networkx.Graph – The pruned graph.
list – A list of removed edges in the form of tuples (u, v).
list, optional – If score=True, a list of float values representing the evaluation criteria after each pruning step.
- connectivity_graph(graph)
Computes the global connectivity of a given graph.
The global connectivity is calculated by summing the inverse shortest paths between all pairs of nodes.
- Parameters:
graph (networkx.Graph) – The input graph for which the global connectivity is computed.
- Returns:
The global connectivity score of the graph.
- Return type:
float
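The connectivity score described above (the sum of inverse shortest-path lengths over all node pairs) and a brute-force pruning loop built on it can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: the adjacency-dict graph representation, the helper names `shortest_path_lengths`, `global_connectivity`, and `brute_force_prune`, the choice to count each unordered pair once, and the rule of removing the edge whose removal leaves the highest score are all assumptions.

```python
import heapq


def shortest_path_lengths(adj, source):
    """Dijkstra: weighted shortest-path length from source to each reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def global_connectivity(adj):
    """Sum of 1 / shortest-path length over unordered node pairs.

    Assumption: unreachable pairs contribute 0.
    """
    nodes = sorted(adj)
    total = 0.0
    for i, u in enumerate(nodes):
        dist = shortest_path_lengths(adj, u)
        for v in nodes[i + 1:]:
            if v in dist and dist[v] > 0:
                total += 1.0 / dist[v]
    return total


def brute_force_prune(adj, nb_edges):
    """Remove nb_edges edges one at a time, each time dropping the edge
    whose removal leaves the highest global connectivity (assumed rule)."""
    pruned = {u: dict(nbrs) for u, nbrs in adj.items()}  # work on a copy
    removed = []
    for _ in range(nb_edges):
        candidates = sorted({tuple(sorted((u, v))) for u in pruned for v in pruned[u]})
        if not candidates:
            break
        best_edge, best_score = None, -1.0
        for u, v in candidates:
            w = pruned[u].pop(v)  # tentatively remove the edge
            pruned[v].pop(u)
            s = global_connectivity(pruned)
            if s > best_score:
                best_edge, best_score = (u, v), s
            pruned[u][v] = w  # restore it before trying the next one
            pruned[v][u] = w
        u, v = best_edge
        del pruned[u][v]
        del pruned[v][u]
        removed.append(best_edge)
    return pruned, removed


# Triangle where the direct a-c edge is longer than the a-b-c detour.
adj = {
    "a": {"b": 1.0, "c": 4.0},
    "b": {"a": 1.0, "c": 2.0},
    "c": {"a": 4.0, "b": 2.0},
}
score = global_connectivity(adj)
pruned_adj, removed = brute_force_prune(adj, 1)
```

In this toy graph the a-c edge is redundant, because the path through b is shorter, so removing it leaves the score unchanged and it is the first edge pruned.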
clustergraph.Metric_distortion_class module
- class clustergraph.Metric_distortion_class.Metric_distortion(graph, knn_g, weight_knn_g='weight', k_compo=2, dist_weight=True, algo='bf')
Bases:
object
Initializes the Metric_distortion object for pruning a graph based on metric distortion.
- Parameters:
graph (networkx.Graph) – Graph to prune.
knn_g (networkx.Graph) – The k-nearest neighbors graph from which the intrinsic distance between points of the dataset is retrieved. The dataset should be the same as the one used for computing the “graph”.
weight_knn_g (str, optional) – Key/label under which the weight of edges is stored in the “knn_g” graph. The weight corresponds to the distance between two nodes (default is ‘weight’).
k_compo (int, optional) – Number of edges that will be added to each disconnected component to merge them after the metric distortion pruning process. The edges added are those that connect disconnected components and are the shortest (default is 2).
dist_weight (bool, optional) – If True, the distortion is computed with weight on edges. If False, the distortion is computed without weights (default is True).
algo ({'bf', 'ps'}, optional) – Choice of the algorithm used to prune edges. ‘bf’ refers to the brute force algorithm (slowest) and ‘ps’ to the quickest algorithm. (default is ‘bf’).
- associate_cluster_one_compo(cluster)
For a given cluster, finds the dominating connected component in the k-nearest neighbors graph.
- Parameters:
cluster (list of int) – List of indices representing the cluster.
- Returns:
The index of the dominating connected component in the k-nearest neighbors graph, and the number of points that belong to this component.
- Return type:
int, int
- associate_clusters_compo()
Associates each node with a connected component in the k-nearest neighbors graph based on the most represented component within the cluster.
- Returns:
A dictionary with connected component indices as keys and a list of clusters that belong to each component.
- Return type:
dict
- conn_prune_merged_graph(pruned_gg, nb_edges_pruned=None, k_compo=None)
After merging components, prunes a specified number of edges to obtain a less noisy graph.
- Parameters:
pruned_gg (networkx.Graph) – The graph after merging components.
nb_edges_pruned (int, optional) – The maximum number of edges to prune, by default None (prunes as many edges as possible).
k_compo (int, optional) – The number of nearest neighbors between components, by default None.
- Returns:
networkx.Graph – The pruned and merged graph.
list – The list of removed edges.
list – The list of connectivity values after each pruning iteration.
- connectivity_graph(graph)
Computes the global connectivity of a graph based on the inverse shortest path distances between all pairs of nodes.
- Parameters:
graph (networkx.Graph) – The graph for which connectivity is computed.
- Returns:
The global connectivity value of the graph.
- Return type:
float
- create_knn_graph_merge_compo_CG(distance_matrix, k)
Creates a k-nearest neighbors graph from the provided distance matrix.
- Parameters:
distance_matrix (numpy.ndarray) – The distance matrix for which the k-nearest neighbors graph is to be created.
k (int) – The number of neighbors in the k-nearest neighbors graph.
- Returns:
The k-nearest neighbors graph created from the distance matrix.
- Return type:
networkx.Graph
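Building a k-nearest neighbors graph from a distance matrix, as this method describes, can be sketched as follows. The function name `knn_graph_from_distance_matrix` and the edge-dictionary output are hypothetical choices for illustration; the library returns a networkx.Graph and may break ties or symmetrize differently.

```python
def knn_graph_from_distance_matrix(distance_matrix, k):
    """Return {(i, j): distance} with i < j, linking each node to its k nearest others."""
    n = len(distance_matrix)
    edges = {}
    for i in range(n):
        # Sort the other nodes by distance to i and keep the k closest.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: distance_matrix[i][j],
        )
        for j in others[:k]:
            # Store each undirected edge once, under its ordered key.
            edges[(min(i, j), max(i, j))] = distance_matrix[i][j]
    return edges


# Symmetric toy distance matrix for four nodes.
D = [
    [0.0, 1.0, 5.0, 9.0],
    [1.0, 0.0, 2.0, 8.0],
    [5.0, 2.0, 0.0, 3.0],
    [9.0, 8.0, 3.0, 0.0],
]
knn_edges = knn_graph_from_distance_matrix(D, k=1)
```

With k=1 the result is a chain 0-1-2-3, since each node's single nearest neighbor is an adjacent node in that chain.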
- distortion_graph_no_weight(graph, intrinsic_graph)
Computes the distortion between a graph and its intrinsic version without taking cluster sizes into account.
- Parameters:
graph (networkx.Graph) – The graph for which the distortion is computed.
intrinsic_graph (networkx.Graph) – The intrinsic graph containing the edges’ intrinsic distances between clusters.
- Returns:
The distortion between the graph and its intrinsic counterpart. Returns 0 if no edges could be evaluated.
- Return type:
float
- distortion_graph_weight(graph, intrinsic_graph)
Computes the distortion between a graph and its intrinsic version while considering the sizes of clusters.
- Parameters:
graph (networkx.Graph) – The graph for which the distortion is computed.
intrinsic_graph (networkx.Graph) – The intrinsic graph containing the edges’ intrinsic distances between clusters.
- Returns:
The weighted distortion between the graph and its intrinsic counterpart. Returns 0 if no edges could be evaluated.
- Return type:
float
- get_distance_matrix_ccompo(pruned_graph)
Returns a distance matrix between all nodes in the graph, where the distance is the real distance for nodes in different components and the maximum distance for nodes in the same component.
- Parameters:
pruned_graph (networkx.Graph) – The pruned graph to compute the distance matrix for.
- Returns:
A 2D distance matrix between all nodes in the graph.
- Return type:
numpy.ndarray
- greedy_pruning(alpha=0.5, nb_edges=-1, weight='distortion')
Prunes edges with distortion lower than a given threshold alpha, prioritizing edges with higher distortion.
- Parameters:
alpha (float, optional) – The threshold for edge distortion. Edges with distortion smaller than alpha will be pruned, by default 0.5.
nb_edges (int, optional) – The number of edges to prune. If negative, all edges are considered, by default -1.
weight (str, optional) – The attribute name used to store distortion on edges, by default “distortion”.
- Returns:
The pruned graph.
- Return type:
networkx.Graph
- intr_two_clusters(c1, c2)
Computes the average intrinsic distance between two clusters, ignoring disconnected points.
- Parameters:
c1 (list of int) – List of indices representing the first cluster.
c2 (list of int) – List of indices representing the second cluster.
- Returns:
The intrinsic distance between the two clusters.
- Return type:
float
- intr_two_points(i, j)
Computes the intrinsic distance between two data points, represented by their indices, using the shortest path.
- Parameters:
i (int) – Index of the first data point.
j (int) – Index of the second data point.
- Returns:
The shortest path (intrinsic distance) between the two points, or -1 if the points are not in the same connected component.
- Return type:
float
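The way intr_two_clusters averages point-to-point intrinsic distances while ignoring disconnected pairs can be sketched as below. This is a minimal illustration: the standalone function signature, the `lookup` helper, the precomputed `paths` table, and the choice to return -1 when no pair is connected are assumptions, not the library's behavior.

```python
def intr_two_clusters(c1, c2, intr_two_points):
    """Average intrinsic distance over all point pairs, skipping pairs flagged -1."""
    values = []
    for i in c1:
        for j in c2:
            d = intr_two_points(i, j)
            if d != -1:  # -1 marks points in different connected components
                values.append(d)
    # Assumption: report -1 when no pair of points is connected.
    return sum(values) / len(values) if values else -1


# Hypothetical precomputed shortest-path lengths in a k-NN graph.
paths = {(0, 2): 2.0, (0, 3): 4.0, (1, 2): 3.0}


def lookup(i, j):
    """Stand-in for intr_two_points: table lookup, -1 if no path is recorded."""
    return paths.get((i, j), paths.get((j, i), -1))


avg = intr_two_clusters([0, 1], [2, 3], lookup)
```

Here the pair (1, 3) has no recorded path, so the average is taken over the three connected pairs only.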
- intrin_dist_cg()
Computes the intrinsic distance between clusters and adds them to the graph.
This method creates the intrinsic graph in which the distance between nodes is the average shortest path between points. It removes edges between disconnected components and updates the graph with the intrinsic distances between connected clusters.
- Return type:
None
- merge_components(pruned_gg)
Merges disconnected components in the graph by adding edges between the nearest neighbors from different components.
- Parameters:
pruned_gg (networkx.Graph) – The pruned graph with disconnected components.
- Returns:
networkx.Graph – The graph with added edges between components.
list – The list of edges added to the graph to connect components.
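One way to read the merging step is: for each disconnected component, add its k_compo shortest edges to nodes in other components. The sketch below follows that reading. The function name `edges_to_merge`, the `component_of` mapping, and the `dist` callable are hypothetical, and the library's actual selection rule may differ.

```python
from itertools import combinations


def edges_to_merge(component_of, dist, k_compo=2):
    """For every component, pick its k_compo shortest edges to other components."""
    nodes = sorted(component_of)
    # All node pairs that cross a component boundary, shortest first.
    crossing = sorted(
        (dist(u, v), u, v)
        for u, v in combinations(nodes, 2)
        if component_of[u] != component_of[v]
    )
    added = set()
    for comp in sorted(set(component_of.values())):
        touching = [
            (d, u, v)
            for d, u, v in crossing
            if comp in (component_of[u], component_of[v])
        ]
        added.update((u, v) for _, u, v in touching[:k_compo])
    return sorted(added)


# Toy example: nodes on a line, grouped into three components.
position = {0: 0.0, 1: 1.0, 2: 5.0, 3: 6.0, 4: 20.0}
component_of = {0: "A", 1: "A", 2: "B", 3: "B", 4: "C"}
new_edges = edges_to_merge(
    component_of, lambda u, v: abs(position[u] - position[v]), k_compo=1
)
```

With k_compo=1, components A and B both select the edge (1, 2) and component C selects (3, 4), which is enough to connect all three components.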
- plt_conn_prune_computed()
Plots the evolution of connectivity depending on the number of edges pruned.
- Returns:
Displays the plot of connectivity vs. number of pruned edges.
- Return type:
None
- plt_md_prune_computed(save=None)
Plots the evolution of metric distortion depending on the number of edges pruned.
- Parameters:
save (str, optional) – If provided, the plot will be saved as a PDF with the specified filename, by default None.
- prune_edges_BF(graph, nb_edges_pruned=-1, md_plot=True)
Iteratively prunes edges by selecting the edge that minimizes metric distortion at each iteration.
- Parameters:
graph (networkx.Graph) – The graph to prune.
nb_edges_pruned (int, optional) – The maximum number of edges to prune. If negative, all edges will be pruned, by default -1.
md_plot (bool, optional) – If True, the method will return a list with the metric distortion at each iteration, by default True.
- Returns:
networkx.Graph – The pruned graph.
list – The list of pruned edges, or the list of metric distortions per iteration if md_plot is True.
- prune_edges_PS(g, nb_edges_pruned=None, md_plot=True)
Prunes edges iteratively by selecting the edge with the highest metric distortion at each iteration.
- Parameters:
g (networkx.Graph) – The graph to prune.
nb_edges_pruned (int, optional) – The maximum number of edges to prune. If None, all edges are considered, by default None.
md_plot (bool, optional) – If True, the method will return a list with the metric distortion at each iteration, by default True.
- Returns:
networkx.Graph – The pruned graph.
list – The list of pruned edges, or the list of metric distortions per iteration if md_plot is True.
- remove_edges(graph)
Removes edges connecting nodes that belong to different disconnected components in the k-nearest neighbors graph.
- Parameters:
graph (networkx.Graph) – The graph from which edges between disconnected components should be removed.
- Returns:
A list of removed edges with their data, and the updated graph.
- Return type:
list, networkx.Graph
- set_distortion_edges(graph, weight='distortion')
Computes the distortion for each edge in the graph as the ratio of edge weight to intrinsic distance.
- Parameters:
graph (networkx.Graph) – The graph on which distortion is computed.
weight (str, optional) – The label under which distortion is stored in the graph, by default ‘distortion’.
- Returns:
The graph with distortion values set on each edge.
- Return type:
networkx.Graph
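The per-edge distortion described above, the ratio of an edge's weight to the corresponding intrinsic distance, can be sketched with plain dictionaries. The dictionary-based graph representation and the decision to skip edges with a missing or zero intrinsic distance are assumptions made for this illustration.

```python
def set_distortion_edges(edges, intrinsic, weight="distortion"):
    """Store weight / intrinsic distance on each edge under the given key.

    edges:     {(u, v): {"weight": float, ...}}
    intrinsic: {(u, v): float}
    """
    for (u, v), data in edges.items():
        intr = intrinsic.get((u, v))
        if intr:  # assumption: skip missing or zero intrinsic distances
            data[weight] = data["weight"] / intr
    return edges


edges = {(0, 1): {"weight": 2.0}, (1, 2): {"weight": 3.0}, (2, 3): {"weight": 1.0}}
intrinsic = {(0, 1): 4.0, (1, 2): 3.0}
edges = set_distortion_edges(edges, intrinsic)
```

An edge whose weight equals its intrinsic distance gets a distortion of 1.0, while an edge that is shorter than the intrinsic distance gets a value below 1.0.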
clustergraph.EdgeStrategy module
- class clustergraph.EdgeStrategy.EdgeStrategy(graph, palette=None, weight='weight', variable=None, norm_weight='id', type_coloring='label', color_labels=None, coloring_strategy_var='lin')
Bases:
object
Class for managing and preprocessing edge attributes in a graph, including edge colors and weights based on various strategies.
- Parameters:
graph (networkx.Graph) – Graph to preprocess. Its edges will be colored and normalized.
palette (Colormap, optional) – Colormap used to color edges. Default is None, which uses a predefined colormap.
weight (str, optional) – Key in the graph under which the size/weight of edges is stored. Default is “weight”.
variable (str, optional) – Key giving access to the continuous variable used to color edges. Default is None.
norm_weight (str, optional) – Method used to normalize the weight of edges. Options are “log”, “lin”, “exp”, “id”, “max”. Default is “id”.
type_coloring (str, optional) – Defines whether edge coloring is based on “label” or “variable”. Default is “label”.
color_labels (list, dict or numpy array, optional) – Labels of each edge for coloring. Default is None.
coloring_strategy_var (str, optional) – Strategy for coloring based on the “variable” key. Options are “log”, “lin”, “exp”. Default is “lin”.
- Raises:
ValueError – If an invalid option is provided for norm_weight, type_coloring, or coloring_strategy_var.
- dictLabelToHexa()
Converts labels in the form of a dictionary to hexadecimal colors.
- Returns:
A dictionary mapping labels to their corresponding hexadecimal color.
- Return type:
dict
- edgeColHexa_dictEdgeHexa()
Creates a dictionary mapping edges to their corresponding hexadecimal color.
- Return type:
None
- edgeColHexa_dictLabHexa()
Creates a dictionary mapping edges to their corresponding hexadecimal color, using labels.
- Return type:
None
- edgeColHexa_listHexa()
Creates a dictionary mapping edges to their corresponding hexadecimal color, based on a list of colors.
- Return type:
None
- fit_edges()
Launches the methods to set the weight (size) and colors of edges.
- Return type:
None
- getDictLabelHexaIdentity()
Creates a dictionary where each label is mapped to its own hexadecimal color.
- Return type:
None
- get_color_edge_unique(e)
Sets a unique color for each edge based on its label.
- Parameters:
e (tuple) – The edge for which the color should be set.
- Return type:
None
- get_color_var_exp(val)
Transforms a value into a hexadecimal color using exponential normalization.
- Parameters:
val (float) – The variable value of an edge.
- Returns:
The hexadecimal color corresponding to the variable value.
- Return type:
str
- get_color_var_lin(val)
Transforms a value into a hexadecimal color using linear normalization.
- Parameters:
val (float) – The variable value of an edge.
- Returns:
The hexadecimal color corresponding to the variable value.
- Return type:
str
- get_color_var_log(val)
Transforms a value into a hexadecimal color using logarithmic normalization.
- Parameters:
val (float) – The variable value of an edge.
- Returns:
The hexadecimal color corresponding to the variable value.
- Return type:
str
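The get_color_var_* methods map a variable value to a hexadecimal color after normalizing it. The linear case can be sketched without any plotting library, as below. The function name `value_to_hex` and the two-color interpolation between `low` and `high` are assumptions that stand in for the colormap (palette) the class actually uses.

```python
def value_to_hex(val, vmin, vmax, low=(0, 0, 255), high=(255, 0, 0)):
    """Linearly normalize val to [0, 1], then interpolate between two RGB colors."""
    t = (val - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))  # clamp values outside [vmin, vmax]
    rgb = [round(a + t * (b - a)) for a, b in zip(low, high)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)


color_min = value_to_hex(0.0, 0.0, 10.0)
color_max = value_to_hex(10.0, 0.0, 10.0)
color_over = value_to_hex(25.0, 0.0, 10.0)
```

The logarithmic and exponential variants would differ only in how `t` is computed before the color lookup.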
- get_labels()
Sets the color labels for the edges. If no labels are provided, assigns black as the default color.
- Return type:
None
- get_labels_into_hexa()
Transforms the color labels into hexadecimal color codes and stores them in EdgeHexa.
- Return type:
None
- get_mini_maxi()
Returns the minimum and maximum weight of the graph’s edges.
- Returns:
Maximum and minimum weights of edges in the graph.
- Return type:
tuple of float
- get_set_val_var_edge(e)
Stores the variable’s value for an edge under the key “data_variable” and returns the value.
- Parameters:
e (tuple) – The edge for which to store the variable value.
- Returns:
The variable’s value for the edge.
- Return type:
float
- get_val_var_edge_graph(e)
Retrieves the value of the variable for a given edge stored in the graph.
- Parameters:
e (tuple) – The edge for which to retrieve the variable value.
- Returns:
The variable’s value for the edge.
- Return type:
float
- identity_weight(weight, mini_weight, maxi_weight)
Returns the given weight without any normalization.
- Parameters:
weight (float) – Weight to normalize.
mini_weight (float) – Minimum weight in the graph.
maxi_weight (float) – Maximum weight in the graph.
- Returns:
The weight as is.
- Return type:
float
- listLabelToHexa()
Converts labels in the form of a list to hexadecimal colors.
- Returns:
A list of hexadecimal colors corresponding to each label.
- Return type:
list
- normalize_exp_min_max(weight, mini_weight, maxi_weight)
Applies an exponential normalization of a given weight.
- Parameters:
weight (float) – Weight to normalize.
mini_weight (float) – Minimum weight in the graph.
maxi_weight (float) – Maximum weight in the graph.
- Returns:
The normalized weight.
- Return type:
float
- normalize_lin_min_max(weight, mini_weight, maxi_weight)
Applies a linear normalization of a given weight.
- Parameters:
weight (float) – Weight to normalize.
mini_weight (float) – Minimum weight in the graph.
maxi_weight (float) – Maximum weight in the graph.
- Returns:
The normalized weight.
- Return type:
float
- normalize_log_min_max(weight, mini_weight, maxi_weight)
Applies a logarithmic normalization of a given weight.
- Parameters:
weight (float) – Weight to normalize.
mini_weight (float) – Minimum weight in the graph.
maxi_weight (float) – Maximum weight in the graph.
- Returns:
The normalized weight.
- Return type:
float
- normalize_max(weight, mini_weight, maxi_weight)
Applies a maximum normalization of a given weight.
- Parameters:
weight (float) – Weight to normalize.
mini_weight (float) – Minimum weight in the graph.
maxi_weight (float) – Maximum weight in the graph.
- Returns:
The normalized weight.
- Return type:
float
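The four normalization methods above share one signature but their exact formulas are not given here. The sketch below shows one plausible reading of each. The formulas and the function names `lin_norm`, `log_norm`, `exp_norm`, and `max_norm` are assumptions, so the library's results may differ.

```python
import math


def lin_norm(weight, mini_weight, maxi_weight):
    """Assumed linear min-max scaling onto [0, 1]."""
    return (weight - mini_weight) / (maxi_weight - mini_weight)


def log_norm(weight, mini_weight, maxi_weight):
    """Assumed min-max scaling of log(weight); needs strictly positive weights."""
    lo, hi = math.log(mini_weight), math.log(maxi_weight)
    return (math.log(weight) - lo) / (hi - lo)


def exp_norm(weight, mini_weight, maxi_weight):
    """Assumed min-max scaling of exp(weight)."""
    lo, hi = math.exp(mini_weight), math.exp(maxi_weight)
    return (math.exp(weight) - lo) / (hi - lo)


def max_norm(weight, mini_weight, maxi_weight):
    """Assumed division by the maximum weight; mini_weight is unused."""
    return weight / maxi_weight


half_lin = lin_norm(5.0, 0.0, 10.0)
half_log = log_norm(10.0, 1.0, 100.0)
half_max = max_norm(5.0, 0.0, 10.0)
```

Under these assumptions, the logarithmic variant spreads out small weights and the exponential variant spreads out large ones, relative to the linear scaling.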
- set_color_edges_labels()
Sets the color of each edge based on its label.
- Return type:
None
- set_color_edges_variable()
Sets the color of each edge based on the value of a continuous variable.
- Return type:
None
- set_min_max_mean_var()
Sets the minimum and maximum values of the edge variable and stores the variable’s value in each edge.
- Return type:
None
- set_weight_edges()
Sets the normalized weight of each edge under the key “weight_plot” in the graph.
- Return type:
None
clustergraph.NodeStrategy module
- class clustergraph.NodeStrategy.NodeStrategy(graph, size_strategy='lin', type_coloring='label', palette=None, color_labels=None, X=None, variable=None, choiceLabel='max', coloring_strategy_var='lin', MIN_SIZE_NODE=0.1)
Bases:
object
Initialize the NodeStrategy class for preprocessing graph nodes with colors and sizes.
- Parameters:
graph (networkx.Graph) – Graph to preprocess, including setting colors and sizes for nodes.
size_strategy (str, optional) – Defines the formula for normalizing node sizes. Options are “lin”, “log”, “exp”, or “id”. Default is “lin”.
type_coloring (str, optional) – Defines the coloring method for nodes. Options are “label” or “variable”. Default is “label”.
palette (matplotlib.colors.ListedColormap, optional) – The colormap used to color nodes. Default is None.
color_labels (list, dict, or numpy array, optional) – Labels or colors for each node, either as a list or dictionary. Default is None.
X (numpy ndarray, optional) – Dataset used to compute variable-based coloring. Default is None.
variable (str or int, optional) – Feature used for variable-based coloring. Can be column name (str) or index (int).
choiceLabel (str, optional) – If “max” or “min” is chosen, the label is selected based on the most or least frequent label. Default is “max”.
coloring_strategy_var (str, optional) – Defines how the color changes based on variable values. Options are “log”, “lin”, or “exp”. Default is “lin”.
MIN_SIZE_NODE (float, optional) – Minimum size of nodes in the plot. Default is 0.1.
- Raises:
ValueError – If invalid options are provided for size_strategy, type_coloring, or coloring_strategy_var.
- expo_size(size, mini_size, maxi_size)
Exponentially normalize the size of a node.
- Parameters:
size (int) – The size/number of points covered by a node.
mini_size (int) – Minimum size of a node.
maxi_size (int) – Maximum size of a node.
- Returns:
The exponentially normalized size of the node.
- Return type:
float
- fit_nodes()
Set the size and color of nodes based on the chosen strategies.
This method updates the size and color of all nodes in the graph.
- get_color_node_points_covered(n)
Assign a color to a node based on the number of points it covers.
- Parameters:
n (int) – The node for which to assign a color.
- get_color_node_unique(n)
Assign a color to a node when there is a unique label for each node.
- Parameters:
n (int) – The node for which to assign a color.
- get_color_var_exp(n)
Assign a color to a node based on the exponential normalization of its variable value.
- Parameters:
n (int) – The node for which to assign a color.
- get_color_var_lin(n)
Assign a color to a node based on the linear normalization of its variable value.
- Parameters:
n (int) – The node for which to assign a color.
- get_color_var_log(n)
Assign a color to a node based on the logarithm of its variable value.
- Parameters:
n (int) – The node for which to assign a color.
- get_labels()
Set the color labels for nodes.
If no labels are provided, each node is assigned a unique color.
- get_labels_into_hexa()
Convert the node labels into hexadecimal color values.
This method uses the matplotlib colormap to assign colors to labels.
- get_mini_maxi()
Calculate the maximum and minimum size (number of points covered) of nodes in the graph.
- Returns:
The maximum and minimum sizes of nodes in the graph.
- Return type:
int, int
- get_val_var_node_Xnum(n)
Retrieve the variable value of a node from the numeric dataset.
- Parameters:
n (int) – The node for which to retrieve the variable value.
- Returns:
The value of the variable for the node.
- Return type:
float
- get_val_var_node_Xpand(n)
Retrieve the variable value of a node from the expanded dataset.
- Parameters:
n (int) – The node for which to retrieve the variable value.
- Returns:
The value of the variable for the node.
- Return type:
float
- get_val_var_node_graph(n)
Retrieve the variable value of a node from the graph.
- Parameters:
n (int) – The node for which to retrieve the variable value.
- Returns:
The value of the variable for the node.
- Return type:
float
- id_size(size, mini_size, maxi_size)
Return the same size for every node.
- Parameters:
size (int) – The size/number of points covered by a node.
mini_size (int) – Minimum size of a node.
maxi_size (int) – Maximum size of a node.
- Returns:
Always returns 1 for every node.
- Return type:
float
- linear_size(size, mini_size, maxi_size)
Linearly normalize the size of a node.
- Parameters:
size (int) – The size/number of points covered by a node.
mini_size (int) – Minimum size of a node.
maxi_size (int) – Maximum size of a node.
- Returns:
The linearly normalized size of the node.
- Return type:
float
- log_size(size, mini_size, maxi_size)
Logarithmically normalize the size of a node.
- Parameters:
size (int) – The size/number of points covered by a node.
mini_size (int) – Minimum size of a node.
maxi_size (int) – Maximum size of a node.
- Returns:
The logarithmically normalized size of the node.
- Return type:
float
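The four size strategies (“lin”, “log”, “exp”, “id”) differ only in the curve that maps a raw node size to a plot size. The sketch below shows one plausible set of formulas; they are assumptions for illustration, not the formulas used by clustergraph, and the function names are hypothetical.

```python
import math


def linear_scale(size, mini_size, maxi_size):
    """Min-max scaling to [0, 1] (assumed formula)."""
    if maxi_size == mini_size:
        return 1.0
    return (size - mini_size) / (maxi_size - mini_size)


def log_scale(size, mini_size, maxi_size):
    """Min-max scaling applied to log(size) (assumed formula, sizes must be > 0)."""
    return linear_scale(math.log(size), math.log(mini_size), math.log(maxi_size))


def exp_scale(size, mini_size, maxi_size):
    """Exponential curve equal to 0 at mini_size and 1 at maxi_size (assumed formula)."""
    t = linear_scale(size, mini_size, maxi_size)
    return (math.exp(t) - 1) / (math.e - 1)


def identity_scale(size, mini_size, maxi_size):
    """Same size for every node, as documented for id_size."""
    return 1.0


# A node covering 100 points when node sizes range from 10 to 1000
lin_value = linear_scale(100, 10, 1000)
log_value = log_scale(100, 10, 1000)
exp_value = exp_scale(100, 10, 1000)
```

Under these assumed formulas, the logarithmic curve enlarges small nodes relative to the linear one, while the exponential curve shrinks them, which is why the choice matters when cluster sizes span several orders of magnitude.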
- set_color_nodes_labels()
Set the color of each node based on its label.
This method assigns colors to nodes using the color_labels attribute.
- set_color_nodes_variable()
Set the color of each node based on the variable’s value.
This method assigns colors to nodes using the specified variable-based coloring strategy.
- set_size_nodes()
Set the size of each node in the plot.
This method assigns the appropriate size to each node based on the number of points covered.
clustergraph.c_GraphPreprocess module
- class clustergraph.c_GraphPreprocess.GraphPreprocess(graph=None, nodeStrat=None, edgeStrat=None)
Bases:
object
A class for preprocessing a graph by assigning colors and sizes to nodes and edges based on different strategies.
This class allows for customization of node and edge appearance based on various strategies such as size, coloring, and normalization.
Initializes the GraphPreprocess object with the provided graph and preprocessing strategies.
- Parameters:
graph (networkx.Graph, optional) – The graph to preprocess. If None, the graph will not be set initially. Default is None.
nodeStrat (NodeStrategy, optional) – The strategy for node preprocessing. If None, a new strategy will be created. Default is None.
edgeStrat (EdgeStrategy, optional) – The strategy for edge preprocessing. If None, a new strategy will be created. Default is None.
- color_graph(node_size_strategy='log', node_type_coloring='label', node_palette=None, node_color_labels=None, node_X=None, node_variable=None, node_choiceLabel='max', node_coloring_strategy_var='lin', MIN_SIZE_NODE=0.1, edge_palette=None, edge_weight='weight', edge_variable=None, edge_norm_weight='id', edge_type_coloring='label', edge_color_labels=None, edge_coloring_strategy_var='lin')
Applies both node and edge preprocessing strategies. Any parameter that is not supplied keeps its default value.
This method internally calls fit_nodes() and fit_edges() with the specified parameters to preprocess the graph.
- Parameters:
node_size_strategy (str, optional) – Defines the formula used to normalize the size of nodes. Options are “lin”, “log”, “exp”, or “id”. Default is “log”.
node_type_coloring (str, optional) – Defines how to color nodes. Options are “label” (color by label) or “variable” (color based on a feature). Default is “label”.
node_palette (matplotlib.colors.ListedColormap, optional) – The colormap for nodes. If None, the default colormap will be used. Default is None.
node_color_labels (list, dict or numpy array, optional) – Object for retrieving colors of nodes. If a list or numpy array is given, it should have the same length as the number of nodes. Default is None.
node_X (numpy.ndarray, optional) – The dataset to use when coloring nodes by a variable. Default is None.
node_variable (str or int, optional) – The feature (column index or name) to use for coloring nodes. Default is None.
node_choiceLabel (str, optional) – Defines how to choose the label when node_type_coloring is “label”. Options are “max” (most represented label) or “min” (least represented label). Default is “max”.
node_coloring_strategy_var (str, optional) – Defines how the color will change based on the variable value when node_type_coloring is “variable”. Options are “log”, “lin”, or “exp”. Default is “lin”.
MIN_SIZE_NODE (float, optional) – The minimum size of nodes in the plot. Default is 0.1.
edge_palette (matplotlib.colors.ListedColormap, optional) – The colormap for edges. Default is None.
edge_weight (str, optional) – The key in the graph for edge weights. Default is “weight”.
edge_variable (str, optional) – The key in the graph for the variable used to color edges. Default is None.
edge_norm_weight (str, optional) – The method for normalizing edge sizes. Default is “id”, which does not normalize.
edge_type_coloring (str, optional) – Defines how to color edges. Options are “label” (color by label) or “variable” (color based on a feature). Default is “label”.
edge_color_labels (list, dict or numpy array, optional) – Object for retrieving edge color labels. Default is None.
edge_coloring_strategy_var (str, optional) – Defines how the color will change based on the edge variable value when edge_type_coloring is “variable”. Options are “log”, “lin”, or “exp”. Default is “lin”.
- fit_edges(edge_palette=None, edge_weight='weight', edge_variable=None, edge_norm_weight='id', edge_type_coloring='label', edge_color_labels=None, edge_coloring_strategy_var='lin')
Preprocesses the edges of the graph based on the specified strategy.
This method fits the edge preprocessing strategy and applies it to the graph’s edges.
- Parameters:
edge_palette (matplotlib.colors.ListedColormap, optional) – The colormap for edges. Default is None.
edge_weight (str, optional) – The key in the graph for edge weights. Default is “weight”.
edge_variable (str, optional) – The key in the graph for the variable used to color edges. Default is None.
edge_norm_weight (str, optional) – The method for normalizing edge sizes. Default is “id”, which does not normalize.
edge_type_coloring (str, optional) – Defines how to color edges. Options are “label” (color by label) or “variable” (color based on a feature). Default is “label”.
edge_color_labels (list, dict or numpy array, optional) – Object for retrieving edge color labels. Default is None.
edge_coloring_strategy_var (str, optional) – Defines how the color will change based on the edge variable value when edge_type_coloring is “variable”. Options are “log”, “lin”, or “exp”. Default is “lin”.
- fit_nodes(node_size_strategy='log', node_type_coloring='label', node_palette=None, node_color_labels=None, node_X=None, node_variable=None, node_choiceLabel='max', node_coloring_strategy_var='lin', MIN_SIZE_NODE=0.1)
Preprocesses the nodes of the graph based on the specified strategy.
This method fits the node preprocessing strategy and applies it to the graph’s nodes.
- Parameters:
node_size_strategy (str, optional) – Defines the formula used to normalize the size of nodes. Options are “lin”, “log”, “exp”, or “id”. Default is “log”.
node_type_coloring (str, optional) – Defines how to color nodes. Options are “label” (color by label) or “variable” (color based on a feature). Default is “label”.
node_palette (matplotlib.colors.ListedColormap, optional) – The colormap for nodes. Default is None.
node_color_labels (list, dict or numpy array, optional) – Object for retrieving colors of nodes. Default is None.
node_X (numpy.ndarray, optional) – The dataset to use when coloring nodes by a variable. Default is None.
node_variable (str or int, optional) – The feature (column index or name) to use for coloring nodes. Default is None.
node_choiceLabel (str, optional) – Defines how to choose the label when node_type_coloring is “label”. Default is “max”.
node_coloring_strategy_var (str, optional) – Defines how the color will change based on the variable value when node_type_coloring is “variable”. Default is “lin”.
MIN_SIZE_NODE (float, optional) – The minimum size of nodes in the plot. Default is 0.1.
- get_graph_prepro()
Returns the preprocessed graph.
- Returns:
The preprocessed graph. Call fit_nodes(), fit_edges(), or color_graph() first so that the node and edge attributes have been set.
- Return type:
networkx.Graph
clustergraph.distances module
- clustergraph.distances.EMD_for_two_clusters(X_1, X_2, distance_points, normalize=True)
Computes the Earth Mover’s Distance (EMD) between two clusters of points.
This function computes the optimal transport (Earth Mover’s Distance) between two sets of points, X_1 and X_2, using the distance_points function for measuring the distance between points. Optionally, the distance can be normalized by the number of distances computed.
- Parameters:
X_1 (numpy.ndarray) – A dataset (array of points) representing the first cluster.
X_2 (numpy.ndarray) – A dataset (array of points) representing the second cluster.
distance_points (callable) – A function or object that calculates the distance between two points.
normalize (bool, optional) – If True, the computed distance will be normalized by the number of distances evaluated. If False, no normalization is performed. Default is True.
- Returns:
The Earth Mover’s Distance (EMD) between the two clusters X_1 and X_2. The result is normalized if normalize is True, otherwise it returns the raw distance.
- Return type:
float
- clustergraph.distances.average_dist(X_1, X_2, distance_points)
Computes the average distance between all pairs of points from two point sets.
- Parameters:
X_1 (numpy.ndarray) – A dataset (array of points) representing the first cluster.
X_2 (numpy.ndarray) – A dataset (array of points) representing the second cluster.
distance_points (callable) – A function that calculates the distance between two points.
- Returns:
The average distance between all pairs of points from X_1 and X_2, computed using the provided distance_points function.
- Return type:
float
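As an illustration of the definition above, here is a minimal pure-Python sketch of the average pairwise distance. The function name average_pairwise_distance is hypothetical, and plain tuples stand in for numpy arrays; the library's own implementation may be vectorized differently.

```python
import math
from itertools import product


def average_pairwise_distance(X_1, X_2, distance_points):
    """Mean of distance_points(a, b) over every pair with a in X_1 and b in X_2."""
    pairs = list(product(X_1, X_2))
    return sum(distance_points(a, b) for a, b in pairs) / len(pairs)


cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(3.0, 0.0), (3.0, 2.0)]
# math.dist is the Euclidean distance between two points
avg = average_pairwise_distance(cluster_a, cluster_b, math.dist)
```

The cost grows with the product of the two cluster sizes, since every cross-cluster pair is evaluated once.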
- clustergraph.distances.centroid_dist(X_1, X_2, distance_points)
Computes the distance between the centroids of two point sets.
- Parameters:
X_1 (numpy.ndarray) – A dataset (array of points) representing the first cluster.
X_2 (numpy.ndarray) – A dataset (array of points) representing the second cluster.
distance_points (callable) – A function that calculates the distance between two points.
- Returns:
The distance between the centroids of the two clusters, calculated using the provided distance_points function.
- Return type:
float
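The centroid distance can be sketched as follows, assuming the centroid is the coordinate-wise mean of the points. The names centroid and centroid_distance are hypothetical, and tuples stand in for numpy arrays.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty collection of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def centroid_distance(X_1, X_2, distance_points):
    """Distance between the two centroids, measured with distance_points."""
    return distance_points(centroid(X_1), centroid(X_2))


cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(3.0, 0.0), (3.0, 2.0)]
d = centroid_distance(cluster_a, cluster_b, math.dist)
```

Only one point-to-point distance is evaluated per pair of clusters, which makes this the cheapest of the cluster metrics listed here.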
- clustergraph.distances.max_dist(X_1, X_2, distance_points)
Computes the maximum distance between any pair of points from two point sets.
- Parameters:
X_1 (numpy.ndarray) – A dataset (array of points) representing the first cluster.
X_2 (numpy.ndarray) – A dataset (array of points) representing the second cluster.
distance_points (callable) – A function that calculates the distance between two points.
- Returns:
The maximum distance between any pair of points from X_1 and X_2, computed using the provided distance_points function.
- Return type:
float
- clustergraph.distances.min_dist(X_1, X_2, distance_points)
Computes the minimum distance between any pair of points from two point sets.
- Parameters:
X_1 (numpy.ndarray) – A dataset (array of points) representing the first cluster.
X_2 (numpy.ndarray) – A dataset (array of points) representing the second cluster.
distance_points (callable) – A function that calculates the distance between two points.
- Returns:
The minimum distance between any pair of points from X_1 and X_2, computed using the provided distance_points function.
- Return type:
float
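The minimum and maximum distances come from the same set of cross-cluster pairs, so a single sketch covers both. The function name extreme_pairwise_distances is hypothetical and the code is an illustration, not the library's implementation.

```python
import math
from itertools import product


def extreme_pairwise_distances(X_1, X_2, distance_points):
    """Return (minimum, maximum) of distance_points over all cross-cluster pairs."""
    dists = [distance_points(a, b) for a, b in product(X_1, X_2)]
    return min(dists), max(dists)


cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(3.0, 0.0), (3.0, 2.0)]
closest, farthest = extreme_pairwise_distances(cluster_a, cluster_b, math.dist)
```

These two quantities correspond to the single-linkage and complete-linkage dissimilarities used in hierarchical clustering.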
clustergraph.plot_graph module
- clustergraph.plot_graph.draw_graph(graph, nb_edges=None, edge_variable='weight_plot', draw_edge_labels=True, scale_nodes=True, size_nodes=1000, random_state=42, precision=2, ax=None, **kwargs)
Plot a graph with specified nodes and edges, with optional sorting of edges.
- Parameters:
graph (networkx.Graph) – The graph to be displayed.
nb_edges (int, optional) – The number of edges to display. If specified, only the first nb_edges edges (sorted by weight) are shown.
edge_variable (str, optional) – The edge attribute to be used for edge labels. Defaults to ‘weight_plot’.
draw_edge_labels (bool, optional) – If True, edge labels are drawn. Defaults to True.
scale_nodes (bool, optional) – If True, node sizes are scaled based on the ‘size_plot’ attribute. Defaults to True.
size_nodes (int, optional) – The baseline size of nodes. Larger values make nodes bigger. Defaults to 1000.
random_state (int or None, optional) – Random seed for the node layout. Defaults to 42.
precision (int, optional) – Number of decimal places for edge labels. Defaults to 2.
ax (matplotlib.axes.Axes, optional) – The axes to draw the graph on. If None, the current axes are used.
**kwargs (keyword arguments) – Additional arguments passed to networkx.draw_networkx.
- clustergraph.plot_graph.draw_graph_pie(graph, nb_edges=None, edge_variable='weight_plot', draw_edge_labels=True, scale_nodes=True, size_nodes=0.05, random_state=42, ax=None, **kwargs)
Draw a graph with pie charts at each node representing its attributes.
- Parameters:
graph (networkx.Graph) – The graph to be drawn.
nb_edges (int, optional) – The number of edges to display. If specified, the graph is truncated to include only the smallest nb_edges edges.
edge_variable (str, optional) – The edge attribute used for edge labels. Defaults to ‘weight_plot’.
draw_edge_labels (bool, optional) – If True, edge labels are drawn. Defaults to True.
scale_nodes (bool, optional) – If True, node sizes are scaled according to the attribute ‘size_plot’. Defaults to True.
size_nodes (float, optional) – The baseline size of nodes if scale_nodes is False. Defaults to 0.05.
random_state (int, optional) – The random state for the node positioning. Defaults to 42.
ax (matplotlib.axes.Axes, optional) – The axes on which to draw the graph. If None, the current axes are used.
**kwargs (keyword arguments) – Additional arguments passed to networkx.draw_networkx.
- clustergraph.plot_graph.plot_slider_graph(g, reverse=False, random_state=None, weight='weight', weight_shown='weight_plot', max_node_size=800, min_node_size=100)
Display an interactive graph with a slider to control the number of displayed edges.
- Parameters:
g (networkx.Graph) – The graph to be displayed.
reverse (bool, optional) – If True, edges are sorted from longest to shortest. Otherwise, they are sorted from shortest to longest.
random_state (int or None, optional) – Random seed for node positioning. Defaults to None.
weight (str, optional) – The edge attribute used for sorting the edges. Defaults to ‘weight’.
weight_shown (str, optional) – The edge attribute used for displaying edge labels. Defaults to ‘weight_plot’.
max_node_size (int, optional) – The maximum size of nodes in the plot. Defaults to 800.
min_node_size (int, optional) – The minimum size of nodes in the plot. Defaults to 100.
- Returns:
The slider widget used to control the number of displayed edges.
- Return type:
matplotlib.widgets.Slider
clustergraph.subsampling module
- class clustergraph.subsampling.Subsampling(clusters, variable_clusters='points_covered', perc=0.5, seed=None)
Bases:
object
Class that performs subsampling on clusters of data, either from a graph or provided as input.
- Parameters:
clusters (networkx.Graph or list of lists) – The clusters to subsample. If a graph is provided, clusters are extracted from the graph.
variable_clusters (str, optional) – The node attribute to be used to extract clusters from the graph. Default is ‘points_covered’.
perc (float, optional) – The fraction of each cluster to retain after subsampling, expressed as a number between 0 and 1. Default is 0.5.
seed (int or None, optional) – The random seed for reproducibility. Default is None.
- perc
The percentage of the clusters to subsample.
- Type:
float
- clusters
The clusters to subsample, either from the graph or provided directly.
- Type:
list of lists
- subsampled_clusters
The result of subsampling the clusters.
- Type:
ndarray
- dict_old_new_indices
A dictionary mapping old indices to new indices after subsampling.
- Type:
dict
- dict_new_old_indices
A dictionary mapping new indices to old indices.
- Type:
dict
- X_restricted
The restricted dataset after subsampling.
- Type:
ndarray
Initializes the Subsampling object.
- Raises:
ValueError – If the percentage (perc) is not between 0 and 1.
- data_transformation(X)
Transforms the dataset X by selecting only the rows corresponding to the subsampled clusters.
The method creates two dictionaries to map between old indices (from the original dataset) and new indices (from the restricted dataset), and stores the restricted dataset in self.X_restricted.
- Parameters:
X (ndarray) – The original dataset from which data will be selected based on subsampled clusters.
- Returns:
This method modifies the object in place and stores the transformed dataset in self.X_restricted.
- Return type:
None
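The index bookkeeping described for data_transformation can be sketched as below. This is an illustration under assumptions: the helper restrict_dataset is hypothetical, it returns its results instead of storing them on the object, and it orders the retained rows by ascending original index, which the library may not do.

```python
def restrict_dataset(X, subsampled_clusters):
    """Keep only rows referenced by a subsampled cluster and build index maps."""
    # Every original row index that survives in at least one cluster
    kept = sorted({i for cluster in subsampled_clusters for i in cluster})
    old_to_new = {old: new for new, old in enumerate(kept)}
    new_to_old = {new: old for old, new in old_to_new.items()}
    X_restricted = [X[old] for old in kept]
    return X_restricted, old_to_new, new_to_old


X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
X_restricted, old_to_new, new_to_old = restrict_dataset(X, [[4, 1], [1, 3]])
```

The two dictionaries make it possible to translate cluster members into rows of the restricted dataset and back to rows of the original one.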
- get_clusters_from_graph(g_clusters, variable_clusters)
Extracts the clusters from a graph based on a node attribute.
- Parameters:
g_clusters (networkx.Graph) – The graph from which clusters will be extracted.
variable_clusters (str) – The node attribute to use for extracting clusters.
- Returns:
A numpy array containing the clusters from the graph.
- Return type:
ndarray
- subsampling_clusters()
Subsamples the clusters based on the specified percentage.
This method creates a subsampled version of each cluster, where a percentage of the original elements are randomly selected without replacement.
- Returns:
A numpy array containing the subsampled clusters.
- Return type:
ndarray
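A minimal sketch of per-cluster subsampling without replacement follows. The function name subsample_clusters is hypothetical; rounding the retained count down and accepting the endpoints 0 and 1 are assumptions that may not match the library.

```python
import random


def subsample_clusters(clusters, perc=0.5, seed=None):
    """Randomly keep a fraction perc of each cluster, without replacement."""
    if not 0 <= perc <= 1:
        raise ValueError("perc must be between 0 and 1")
    rng = random.Random(seed)  # a local generator keeps the result reproducible
    # Assumption: the number of retained elements is rounded down
    return [rng.sample(list(c), int(len(c) * perc)) for c in clusters]


clusters = [list(range(10)), [10, 11, 12, 13]]
sub = subsample_clusters(clusters, perc=0.5, seed=0)
```

Fixing the seed gives the same subsample on every run, which helps when comparing graphs built from the reduced data.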
clustergraph.utils module
- clustergraph.utils.get_clusters_from_BM(bm)
From a BallMapper object, returns a list of clusters, where each cluster is a list of indices corresponding to the points covered.
- Parameters:
bm (BallMapper) – A BallMapper object which contains information about the clusters.
- Returns:
A list of clusters, where each element is a list of indices corresponding to the points covered by that cluster.
- Return type:
list
- clustergraph.utils.get_clusters_from_Mapper(graph)
From a Mapper object, returns a list of clusters, where each cluster is a list of indices corresponding to the points covered.
- Parameters:
graph (dict) – A Mapper object which contains the cluster information.
- Returns:
A list of clusters, where each element is a list of indices corresponding to the points covered by that cluster.
- Return type:
list
- clustergraph.utils.get_clusters_from_scikit(prediction, return_mapping=False)
From a list of predictions, returns a list of clusters with each cluster being a list of indices.
- Parameters:
prediction (list or numpy.ndarray) – Cluster labels. At each index there is a label corresponding to the cluster of the data point.
return_mapping (bool, optional) – If True, returns a dictionary mapping each cluster label to its index. Default is False.
- Returns:
list – A list of clusters, where each element is a numpy array containing the indices of the data points in that cluster.
dict, optional – If return_mapping is True, a dictionary mapping each cluster label to an index.
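The conversion from a label vector to a list of clusters can be sketched as below. The name clusters_from_labels is hypothetical; this version returns plain lists instead of numpy arrays and numbers the clusters in order of first appearance, whereas the library may order them differently.

```python
def clusters_from_labels(prediction, return_mapping=False):
    """Group point indices by cluster label."""
    mapping = {}   # cluster label -> position of that cluster in the output list
    clusters = []
    for index, label in enumerate(prediction):
        if label not in mapping:
            mapping[label] = len(clusters)
            clusters.append([])
        clusters[mapping[label]].append(index)
    return (clusters, mapping) if return_mapping else clusters


labels = [1, 0, 1, 2, 0]  # e.g. the output of a clustering model's fit_predict
clusters, mapping = clusters_from_labels(labels, return_mapping=True)
```

The mapping is useful when the original labels need to be recovered from a node position in the resulting cluster list.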
- clustergraph.utils.get_corresponding_edges(vertices, edges)
Returns the edges that correspond to a given set of vertices.
- Parameters:
vertices (list) – A list of vertices.
edges (list) – A list of edges, where each edge is represented as [vertex_1, vertex_2, value].
- Returns:
A list of edges where both vertices are in the given list of vertices.
- Return type:
list
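As an illustration, the filter described above can be written in a few lines. The name edges_within is hypothetical, and the sketch assumes that the original edge order is preserved.

```python
def edges_within(vertices, edges):
    """Keep the edges [vertex_1, vertex_2, value] whose endpoints are both in vertices."""
    keep = set(vertices)  # set membership tests are O(1)
    return [e for e in edges if e[0] in keep and e[1] in keep]


edges = [[0, 1, 0.5], [1, 2, 1.2], [2, 3, 0.7]]
sub_edges = edges_within([0, 1, 2], edges)
```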
- clustergraph.utils.get_sorted_edges(graph, variable_length='label')
Returns the edges of the graph sorted by the specified edge attribute.
- Parameters:
graph (networkx.Graph) – A NetworkX graph object.
variable_length (str, optional) – The attribute used for sorting the edges. Default is “label”.
- Returns:
A list of edges sorted by the specified attribute.
- Return type:
list
- clustergraph.utils.get_values(list_key_value)
Extracts the values from a list of key-value pairs.
- Parameters:
list_key_value (list) – A list of key-value pairs, where each element is a list [key, value].
- Returns:
A list of values extracted from the input list of key-value pairs.
- Return type:
list
- Raises:
ValueError – If the input list is empty.
- clustergraph.utils.insert_sorted_list(liste, element_to_insert)
Inserts an element into an already sorted list, ordered by ‘value’ (the third entry of each element).
- Parameters:
liste (list) – A list of elements, each represented by a list [key_1, key_2, value]. The list is already sorted based on the ‘value’.
element_to_insert (list) – A list [key_1, key_2, value] that we want to insert in the list while maintaining the order based on ‘value’.
- Returns:
The ordered list with the new element inserted.
- Return type:
list
- Raises:
ValueError – If element_to_insert contains fewer than 3 elements.
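One way to perform such an insertion is a binary search on the ‘value’ entries, sketched below. The name insert_by_value is hypothetical; this version returns a new list and places ties after existing equal values, both of which are assumptions about behavior the documentation does not specify.

```python
import bisect


def insert_by_value(sorted_edges, element):
    """Insert [key_1, key_2, value] into a list already sorted by value."""
    if len(element) < 3:
        raise ValueError("element must look like [key_1, key_2, value]")
    values = [e[2] for e in sorted_edges]
    # bisect_right finds the first position whose value is strictly greater
    position = bisect.bisect_right(values, element[2])
    return sorted_edges[:position] + [element] + sorted_edges[position:]


edges = [[0, 1, 0.5], [2, 3, 0.7], [1, 2, 1.2]]
result = insert_by_value(edges, [0, 3, 0.9])
```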
- clustergraph.utils.max_size_node_graph(graph, variable, nodes=None)
Returns the maximum size of a node based on a given attribute in a graph.
- Parameters:
graph (networkx.Graph) – A NetworkX graph object.
variable (str) – The attribute of the node that is used to determine the size.
nodes (list, optional) – A list of nodes to check. If None, all nodes in the graph are checked.
- Returns:
The maximum size of the node based on the given attribute.
- Return type:
int
- clustergraph.utils.replace_in_array(list_1, list_2, arr, val)
Replaces the values in a numpy array at the positions specified by list_1 and list_2 (and their symmetric positions) with the given value.
- Parameters:
list_1 (list or numpy.ndarray) – The rows in which we want to change the value.
list_2 (list or numpy.ndarray) – The columns in which we want to change the value.
arr (numpy.ndarray) – The numpy array to modify.
val (float or int) – The value to place at the specified positions.
- Returns:
The modified numpy array.
- Return type:
numpy.ndarray