Data Analysis & Visualization
This page documents the data analysis and visualization functions in the package.
Visualization Utilities
- pyBiodatafuse.analyzer.utils.plot_pie_chart(template_df: DataFrame, fig_size: tuple = (10, 10)) → matplotlib.pyplot
Plot a pie chart.
- Parameters:
template_df – A dataframe with two columns: “label” and “value”
fig_size – A tuple with the size of the figure
- Returns:
A pie chart
- pyBiodatafuse.analyzer.utils.plot_hbarplot_chart(template_df: DataFrame, x_label: str = 'Label', y_label: str = 'Value', fig_size: tuple = (10, 10)) → matplotlib.pyplot
Plot a horizontal bar plot.
- Parameters:
template_df – A dataframe with two columns: “label” and “value”
x_label – The x-axis label
y_label – The y-axis label
fig_size – A tuple with the size of the figure
- Returns:
A bar plot
- pyBiodatafuse.analyzer.utils.plotly_pie_chart(template_df: DataFrame, fig_size: tuple = (10, 10)) → plotly.express
Plot a pie chart using Plotly.
- Parameters:
template_df – A dataframe with two columns: “label” and “value”
fig_size – A tuple with the size of the figure
- Returns:
A plotly pie chart
- pyBiodatafuse.analyzer.utils.plotly_barplot_chart(template_df: DataFrame, x_label: str = 'Label', y_label: str = 'Value', fig_size: tuple = (10, 10)) → plotly.express
Plot a bar plot using Plotly.
- Parameters:
template_df – A dataframe with two columns: “label” and “value”
x_label – The x-axis label
y_label – The y-axis label
fig_size – A tuple with the size of the figure
- Returns:
A Plotly bar plot
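All four plotting helpers above take the same two-column template_df. The following is a minimal sketch of building such a frame and passing it on. The helper names make_template_df and draw_charts, the node-type labels, and the counts are illustrative assumptions, not part of the package. The plotting calls use only the signatures documented above.

```python
import pandas as pd


def make_template_df(counts):
    """Turn a {label: value} mapping into a frame with "label" and "value" columns."""
    return pd.DataFrame(
        {"label": list(counts.keys()), "value": list(counts.values())}
    )


def draw_charts(template_df):
    """Call two of the documented helpers (hypothetical wrapper for illustration)."""
    # Imported lazily so the frame-building part runs without pyBiodatafuse installed.
    from pyBiodatafuse.analyzer.utils import plot_pie_chart, plotly_barplot_chart

    pie = plot_pie_chart(template_df, fig_size=(6, 6))
    bar = plotly_barplot_chart(
        template_df, x_label="Node type", y_label="Count", fig_size=(8, 4)
    )
    return pie, bar


# Placeholder numbers, chosen only to show the expected shape of the input.
template_df = make_template_df({"Gene": 12, "Disease": 5, "Compound": 3})
print(template_df)
```

With pyBiodatafuse installed, draw_charts(template_df) would hand the same frame to the Matplotlib and Plotly variants.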
Literature Explorer
- pyBiodatafuse.analyzer.explorer.literature.get_wikidata_gene_literature(bridgedb_df: DataFrame) → Dict[str, Set[str]]
Get PubMed articles linked to a gene or its encoded protein.
- Parameters:
bridgedb_df – BridgeDb output for creating the list of gene ids to query
- Returns:
A dictionary with the NCBI gene ID as key and the set of PMIDs as value.
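Since the return value is a plain dictionary of sets, it can be summarized with standard Python. The sketch below uses a hand-written placeholder dictionary in the documented shape. The keys and PMIDs are made-up stand-ins, not real identifiers, and no query is run.

```python
from collections import defaultdict

# Placeholder result shaped like Dict[str, Set[str]]; values are NOT real IDs.
gene_to_pmids = {
    "GENE_ID_A": {"PMID_1", "PMID_2", "PMID_3"},
    "GENE_ID_B": {"PMID_2"},
    "GENE_ID_C": set(),
}

# Number of linked articles per gene.
article_counts = {gene: len(pmids) for gene, pmids in gene_to_pmids.items()}

# Invert the mapping to find articles linked to more than one gene.
pmid_to_genes = defaultdict(set)
for gene, pmids in gene_to_pmids.items():
    for pmid in pmids:
        pmid_to_genes[pmid].add(gene)

shared_articles = {
    pmid: genes for pmid, genes in pmid_to_genes.items() if len(genes) > 1
}

print(article_counts)
print(shared_articles)
```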
Patent Explorer
- pyBiodatafuse.analyzer.explorer.patent.get_patent_from_pubchem(bridgedb_df: DataFrame) → dict
Get patent data summary from PubChem compounds.
The output is the following: {CID: [“US: X”, “EP: X”, “WO: X”, “Others: X”]}
- Parameters:
bridgedb_df – A dataframe with the BridgeDb or PubChem harmonized output
- Returns:
A dictionary with the PubChem Compound ID as key and the patent counts as value
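The "OFFICE: X" strings in the output are easier to work with once split into numbers. The sketch below assumes that X is an integer count, which the description suggests but does not state. parse_patent_counts is a hypothetical helper, and the CID key and counts are placeholders.

```python
def parse_patent_counts(entries):
    """Split strings such as "US: 3" into an {office: count} mapping."""
    counts = {}
    for entry in entries:
        office, _, number = entry.partition(":")
        # Assumption: the part after the colon is an integer count.
        counts[office.strip()] = int(number.strip())
    return counts


# Placeholder output in the documented shape; the key and numbers are made up.
patent_summary = {"CID_PLACEHOLDER": ["US: 3", "EP: 1", "WO: 0", "Others: 2"]}

parsed = {cid: parse_patent_counts(items) for cid, items in patent_summary.items()}
totals = {cid: sum(counts.values()) for cid, counts in parsed.items()}

print(parsed)
print(totals)
```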
Graph Summary
- class pyBiodatafuse.analyzer.summarize.BioGraph(graph=None, graph_path=None, graph_format='pickle', disease_df=None)
Bases: MultiDiGraph
BioGraph class to analyze the graph.
Initialize the BioGraph class.
- Parameters:
graph – networkx graph object
graph_path – path to the graph file
graph_format – format of the graph file
disease_df – disease dataframe to build the graph
- Raises:
ValueError – if graph_format is not ‘pickle’ or ‘gml’
- count_nodes_by_type(plot: bool = False, interactive: bool = False) → DataFrame | None
Count the different node types in the graph.
- count_edge_by_type(plot: bool = False, interactive: bool = False) → DataFrame | None
Count the different edge types in the graph.
- count_nodes_by_data_source(plot: bool = False) → DataFrame | None
Get the count of nodes by data source.
- count_edge_by_data_source(plot: bool = False) → DataFrame | None
Get the count of edges by data source.
- get_subgraph(node_types: list)
Get subgraph of the graph.
- Parameters:
node_types – list of node types
- Raises:
AssertionError – if a requested node type is not in the graph
- Returns:
subgraph with the given node types
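To picture what counting by node type and extracting a subgraph involve, here is a minimal sketch in plain networkx on a toy MultiDiGraph. It is not the BioGraph implementation. The "node_type" attribute name, the node and edge labels, and both helper functions are assumptions made for illustration, and the counting helper returns a Counter rather than the DataFrame the documented methods return.

```python
from collections import Counter

import networkx as nx

# Toy graph; the attribute name "node_type" and all labels are assumptions.
graph = nx.MultiDiGraph()
graph.add_node("gene_1", node_type="Gene")
graph.add_node("disease_1", node_type="Disease")
graph.add_node("compound_1", node_type="Compound")
graph.add_edge("gene_1", "disease_1", label="associated_with")
graph.add_edge("compound_1", "gene_1", label="targets")


def count_by_node_type(g, attr="node_type"):
    """Count nodes per value of the given node attribute."""
    return Counter(data.get(attr) for _, data in g.nodes(data=True))


def subgraph_by_node_type(g, node_types, attr="node_type"):
    """Keep only nodes whose type is requested; fail on unknown types."""
    present = {data.get(attr) for _, data in g.nodes(data=True)}
    missing = set(node_types) - present
    if missing:
        raise AssertionError(f"Node types not in the graph: {sorted(missing)}")
    keep = [n for n, data in g.nodes(data=True) if data.get(attr) in node_types]
    # copy() returns an independent graph instead of a read-only view.
    return g.subgraph(keep).copy()


type_counts = count_by_node_type(graph)
sub = subgraph_by_node_type(graph, ["Gene", "Disease"])
print(type_counts)
print(sorted(sub.nodes), sub.number_of_edges())
```

Only edges whose two endpoints both survive the filter remain in the result, so the compound-to-gene edge is dropped here.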