PyGraphistry: Explore Relationships

PyGraphistry is a Python visual graph analytics library to extract, transform, and load big graphs into Graphistry's visual graph analytics platform. It is typically used by data scientists, developers, and operational analysts on problems like visually mapping the behavior of devices and users.

The Python client makes it easy to go from your existing data to a Graphistry server. Through strong notebook support, data scientists can quickly go from data to accelerated visual explorations, and developers can quickly prototype stunning solutions with their users.

Graphistry supports unusually large graphs for interactive visualization. The client's custom WebGL rendering engine renders up to 8MM nodes and edges at a time, and most older client GPUs smoothly support somewhere between 100K and 1MM elements. The serverside GPU analytics engine supports even bigger graphs.

Interactive Demo
Graph Gallery
Installation
Tutorial
API Reference

Demo of Friendship Communities on Facebook

Click to open interactive version! (For server-backed interactive analytics, use an API key)

Source data: SNAP

PyGraphistry is...

Fast & Gorgeous: Cluster, filter, and inspect large amounts of data at interactive speed. We layout graphs with a descendant of the gorgeous ForceAtlas2 layout algorithm introduced in Gephi. Our data explorer connects to Graphistry's GPU cluster to layout and render hundreds of thousand of nodes+edges in your browser at unparalleled speeds.
Notebook Friendly: PyGraphistry plays well with interactive notebooks like Juypter, Zeppelin, and Databricks: Process, visualize, and drill into with graphs directly within your notebooks.
Great for Events, CSVs, and more: Not sure if your data is graph-friendly? PyGraphistry's hypergraph transform helps turn any sample data like CSVs, SQL results, and event data into a graph for pattern analysis:
```
rows = pandas.read_csv('transactions.csv')[:1000]
graphistry.hypergraph(rows)['graph'].plot()
```

Batteries Included: PyGraphistry works out-of-the-box with popular data science and graph analytics libraries. It is also very easy to turn arbitrary data into insightful graphs:

Pandas

edges = pd.read_csv('facebook_combined.txt', sep=' ', names=['src', 'dst'])
graphistry.bind(source='src', destination='dst').plot(edges)

table_rows = pd.read_csv('honeypot.csv')
graphistry.hypergraph(table_rows, ['attackerIP', 'victimIP', 'victimPort', 'vulnName'])['graph'].plot()

graphistry.hypergraph(table_rows, ['attackerIP', 'victimIP', 'victimPort', 'vulnName'], 
    direct=True, 
    opts={'EDGES': {
        'attackerIP': ['victimIP', 'victimPort', 'vulnName'], 
        'victimIP': ['victimPort', 'vulnName'],
        'victimPort': ['vulnName']
}})['graph'].plot()

Neo4j (notebook demo)

NEO4J_CREDS = {'uri': 'bolt://my.site.ngo:7687', 'auth': ('neo4j', 'mypwd')}
graphistry.register(bolt=NEO4J_CREDS)
graphistry.cypher("MATCH (a)-[p:PAYMENT]->(b) WHERE p.USD > 7000 AND p.USD < 10000 RETURN a, p, b").plot()

graphistry.cypher("CALL db.schema()").plot()

or

from neo4j import GraphDatabase, Driver
graphistry.register(bolt=GraphDatabase.driver(**NEO4J_CREDS))
graphistry.cypher("MATCH (a)-[p:PAYMENT]->(b) WHERE p.USD > 7000 AND p.USD < 10000 RETURN a, p, b").plot()

TigerGaph (notebook demo)

g = graphistry.tigergraph(protocol='https', ...)
g2 = g.gsql("...", {'edges': '@@eList'})
g2.plot()
print('# edges', len(g2._edges))
g.endpoint('my_fn', {'arg': 'val'}, {'edges': '@@eList'}).plot()

IGraph

graph = igraph.read('facebook_combined.txt', format='edgelist', directed=False)
graphistry.bind(source='src', destination='dst').plot(graph)

NetworkX (notebook demo)

graph = networkx.read_edgelist('facebook_combined.txt')
graphistry.bind(source='src', destination='dst', node='nodeid').plot(graph)

HyperNetX (notebook demo)

hg.hypernetx_to_graphistry_nodes(H).plot()
hg.hypernetx_to_graphistry_bipartite(H.dual()).plot()

Splunk (notebook demo)

df = splunkToPandas("index=netflow bytes > 100000 | head 100000", {})    
graphistry.bind(source='src_ip', destination='dest_ip').plot(df)

Gallery

Twitter Botnet	Edit Wars on Wikipedia Source: SNAP	100,000 Bitcoin Transactions
Port Scan Attack	Protein Interactions Source: BioGRID	Programming Languages Source: Socio-PLT project

Installation

We recommend four options for installing PyGraphistry:

Graphistry AMI: One-click launch with Graphistry, PyGraphistry, and Jupyter notebooks preinstalled and ready to go out-of-the-box
pip install graphistry: If you already have Jupyter Notebook installed or are using a system like Google Colab, install the PyGraphistry pip package. (Requires a Graphistry server.)
Docker: For quickly trying PyGraphistry when you do not have Jupyter Notebook installed and find doing so difficult, use our complete Docker image. (Requires a Graphistry server.)

Option 1: New users - Graphistry AWS Server with Preinstalled PyGraphistry client

For new users who have AWS accounts, simply launch the self-serve Graphistry AMI.

It provides several benefits for getting started:

PyGraphistry is preinstalled
Jupyter notebooks is preinstalled
Starter examples of using with different files, databases, and Nvidia RAPIDS are provided
Preconfigured backend server: Nvidia drivers, nvidia-docker, Graphistry server, etc.
Running in your private AWS means you can safely explore private data there

The server gracefully stops/starts: Control AWS spending by simply stopping the server when not using it.

Option 2: PyGraphistry pip package

Install PyGraphistry into your own Python app or data science notebook environment such as Jupyter and Google Colab. Requires a Graphistry server such as the self-serve Graphistry AMI

Install PyGraphistry with Python's pip package manager:

Pandas only (recommended): pip install graphistry
- neo4j: pip install "graphistry[bolt]"
- IGraph, NetworkX, Neo4j: pip install "graphistry[all]"

The latter two can be skipped if you already have the third-party Python packages at the appropriate versions installed.

Jupyter Notebook Integration

API Key

An API key gives each visualization access to your Graphistry GPU server. Set your key after the import graphistry statement and you are good to go:

import graphistry
graphistry.register(key='Your key')

Optionally, for convenience, you may set your API key in your system environment and thereby skip the register step in all your notebooks. In your .profile or .bash_profile, add the following and reload your environment:

export GRAPHISTRY_API_KEY="Your key"

Tutorial: Les Misérables

Let's visualize relationships between the characters in Les Misérables. For this example, we'll choose Pandas to wrangle data and IGraph to run a community detection algorithm. You can view the Jupyter notebook containing this example.

Our dataset is a CSV file that looks like this:

source	target	value
Cravatte	Myriel	1
Valjean	Mme.Magloire	3
Valjean	Mlle.Baptistine	3

Source and target are character names, and the value column counts the number of time they meet. Parsing is a one-liner with Pandas:

import pandas
links = pandas.read_csv('./lesmiserables.csv')

Quick Visualization

If you already have graph-like data, use this step. Otherwise, try the Hypergraph Transform

PyGraphistry can plot graphs directly from Pandas dataframes, IGraph graphs, or NetworkX graphs. Calling plot uploads the data to our visualization servers and return an URL to an embeddable webpage containing the visualization.

To define the graph, we bind source and destination to the columns indicating the start and end nodes of each edges:

import graphistry
graphistry.register(key='YOUR_API_KEY_HERE')

plotter = graphistry.bind(source="source", destination="target")
plotter.plot(links)

You should see a beautiful graph like this one:

Adding Labels

Let's add labels to edges in order to show how many times each pair of characters met. We create a new column called label in edge table links that contains the text of the label and we bind edge_label to it.

links["label"] = links.value.map(lambda v: "#Meetings: %d" % v)
plotter = plotter.bind(edge_label="label")
plotter.plot(links)

Controlling Node Title, Size, Color, and Location

Let's size nodes based on their PageRank score and color them using their community. IGraph already has these algorithms implemented for us. If IGraph is not already installed, fetch it with pip install python-igraph. Warning: pip install igraph will install the wrong package!

We start by converting our edge dateframe into an IGraph. The plotter can do the conversion for us using the source and destination bindings. Then we create two new node attributes (pagerank & community).

ig = plotter.pandas2igraph(links)
ig.vs['pagerank'] = ig.pagerank()
ig.vs['community'] = ig.community_infomap().membership

plotter.bind(point_color='community', point_size='pagerank').plot(ig)

To control the location, add x and y columns to the node tables (see demos). You may also want to bind point_title.

Next Steps

If you don't have an API key to a Graphistry server, one-click launch Graphistry in AWS
Check out the analyst and developer introductions, or try your own CSV
Explore the demos folder for your favorite file format, database, API, or kind of analysis

References

Graphistry UI Guide
Full Python (including IPython/Juypter) API documentation.
Within a notebook, you can always run help(graphistry), help(graphistry.hypergraph), etc.
Additional Graphistry API docs, including the predefined color palette values (color brewer)

Name		Name	Last commit message	Last commit date
Latest commit History 481 Commits
demos		demos
docs		docs
graphistry		graphistry
.gitattributes		.gitattributes
.gitignore		.gitignore
.travis.yml		.travis.yml
DEVELOP.md		DEVELOP.md
LICENSE.txt		LICENSE.txt
MANIFEST.in		MANIFEST.in
README.md		README.md
pipupload.sh		pipupload.sh
setup.cfg		setup.cfg
setup.py		setup.py
versioneer.py		versioneer.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PyGraphistry: Explore Relationships

Demo of Friendship Communities on Facebook

PyGraphistry is...

Gallery

Installation

Option 1: New users - Graphistry AWS Server with Preinstalled PyGraphistry client

Option 2: PyGraphistry pip package

Jupyter Notebook Integration

API Key

Tutorial: Les Misérables

Quick Visualization

Adding Labels

Controlling Node Title, Size, Color, and Location

Next Steps

References

About

Releases

Packages

Languages

License

mewbak/pygraphistry

Folders and files

Latest commit

History

Repository files navigation

PyGraphistry: Explore Relationships

Demo of Friendship Communities on Facebook

PyGraphistry is...

Gallery

Installation

Option 1: New users - Graphistry AWS Server with Preinstalled PyGraphistry client

Option 2: PyGraphistry pip package

Jupyter Notebook Integration

API Key

Tutorial: Les Misérables

Quick Visualization

Adding Labels

Controlling Node Title, Size, Color, and Location

Next Steps

References

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages