All notable changes to PyGraphistry are documented in this file. The PyGraphistry client and other Graphistry components are tracked in the main Graphistry major release history documentation.
The changelog format is based on Keep a Changelog. This project adheres to Semantic Versioning, and all PyGraphistry-specific breaking changes are explicitly noted here.
- AI: Easy import of featurization kwargs for `g.umap(**kwargs)` and `g.featurize(**kwargs)`
- AI: `g.get_features_by_cols` returns featurized submatrix with `col_part` in their columns
- AI: `g.conditional_graph` and `g.conditional_probs` assessing conditional probs and graph
- AI Demos folder: OSINT, CYBER demos
- AI: Full text & semantic search (`g.search(..)` and `g.search_graph(..).plot()`)
- AI: Featurization: support for dataframe columns that are list of lists -> multilabel targets set using `g.featurize(y=['list_of_lists_column'], multilabel=True,...)`
- AI: `g.embed(..)` code for fast knowledge graph embedding (2-layer RGCN) and its usage for link scoring and prediction
- AI: Exposes public methods `g.predict_links(..)` and `g.predict_links_all()`
- AI: automatic naming of graphistry objects during `g.search_graph(query)` -> `g._name = query`
- AI: RGCN demos - Infosec Jupyterthon 2022, SSH anomaly detection
- GIB: Add missing import during group-in-a-box cudf layout of 0-degree nodes
- Tests: SSO login tests catch more unexpected exns
- Personal keys: `register(personal_key_id=..., personal_key_secret=...)`
- SSO: `register()` (no user/pass), `register(idp_name=...)` (org-specific IDP)
- Type errors
- AI: `umap(engine='cuml')` now supports older RAPIDS versions via knn fallback for edge creation. Also: `"umap_learn"`, defaults to `"auto"`
- `prune_self_edges()` to drop any edges where the source and destination are the same
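
The self-edge pruning described above can be illustrated with a minimal, standard-library-only sketch. It is not the PyGraphistry implementation: the function below works on a plain list of dicts, and the column names `src` and `dst` are assumptions chosen for the example.

```python
def prune_self_edges_sketch(edges, source="src", destination="dst"):
    """Return only the edges whose source and destination differ.

    `edges` is a list of dicts; `source` and `destination` name the
    keys holding the endpoints (hypothetical defaults for this sketch).
    """
    return [e for e in edges if e[source] != e[destination]]


edges = [
    {"src": "a", "dst": "b"},
    {"src": "b", "dst": "b"},  # self-edge, should be dropped
    {"src": "c", "dst": "a"},
]
pruned = prune_self_edges_sketch(edges)
```

In this sketch `pruned` keeps the `a -> b` and `c -> a` edges and drops `b -> b`.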
- Infra: Updated github actions versions and Ubuntu environment for publishing
- AI: full text & semantic search (`g.search(..)` and `g.search_graph(..).plot()`)
- Featurization: support for dataframe columns that are list of lists -> multilabel targets set using `g.featurize(y=['list_of_lists_column'], multilabel=True,...)`
Only supports single-column data targets
- Infra: Updated github actions
- `encode_axis()` now correctly sets axis
- work around mypy mistyping operator & on pandas series
- Speed up `g.umap()` >100x by using cuML UMAP engine
- Drop official support for Python 3.6 - its LTS security support stopped 9mo ago
- neo4j: v5 support - backwards-compatible changing derefs from id to element_id
- umap: Optional `engine` parameter (default `cuml`) for `UMAP()`
- ipynb: UMAP purpose, functionality and parameter details, with general UMAP notebook planned in future (features folder)
- has_umap: removed as no longer necessary
- neo4j: v5 support (experimental)
- Infra: suppress igraph pandas FutureWarnings
- Infra: Remove heavy AI dependencies from `pip install graphistry[dev]`
- igraph: Optional `use_vids` parameter (default `False`) for `to_igraph()` and its callers (`layout_igraph`, `compute_graph`)
- igraph: add `coreness` and `harmonic_centrality` to `compute_igraph`
- igraph: CI errors around igraph
- igraph: Tolerate deprecation warning of `clustering`
- Docs: Typos and updates - thanks @gadde5300 + @szhorvat !
- Speed up `import graphistry` 10X+ by lazily importing AI dependencies. Use of `pygraphistry[ai]` features will still trigger slow upstream dependency initialization times upon first use.
- Docs: Update Labs references to Hub
- `group_in_a_box_layout()`: Remove verbose output
- `group_in_a_box_layout()`: Remove synthesized edge weight
- Types: Switch `materialize_nodes` engine param to explicitly using `Engine` typing (no change to untyped user code)
- `g.keep_nodes(List or Series)`
- `g.group_in_a_box_layout(...)`: Both CPU (pandas/igraph) and GPU (cudf/cugraph) versions, and various partitioning/layout/styling settings
- Internal clientside Brewer palettes helper for categorical point coloring
- Infra: CI early fail on deeper lint
- Infra: Move Python 3.6 from core to minimal tests due to sklearn 1.0 incompatibility
- lint
- suppress known dgl bindings test type bug
- `_table_to_arrow()` for `cudf`: Updated for RAPIDS 2022.02+ to handle deprecation of `cudf.DataFrame.hash_columns()` in favor of new `cudf.DataFrame.hash_values()`
- `materialize_nodes()`: Supports `cudf`, materializing a `cudf.DataFrame` nodes table when `._edges` is an instance of `cudf.DataFrame`
- `to_cugraph()`, `from_cugraph()`, `compute_cugraph()`, `layout_cugraph()`
- docs: cugraph demo notebook
- Infra: Update GPU test env settings
- `materialize_nodes`: Return regular index
- `hypergraph()` in dask handles failing metadata type inference
- tests: gpu env tweaks
- tests: umap logging was throwing warnings
- `g.transform()`
- `g.transform_umap()`
- `g.scale()`
- Memoization on UMAP and Featurize calls
- Adds **kwargs and propagates them through to different function calls (featurize, umap, scale, etc)
- Final deprecation of `register(api=2)` protobuf/vgraph mode - also works around need for protobuf test upgrades
- `register(..., org_name='my_org')`: Optionally upload into an organization
- `g.privacy(mode='organization')`: Optionally limit sharing to within your organization
- docs: `org_name` in `README.md` and sharing tutorial
- `compute_igraph()`
- `layout_igraph()`
- `scene_settings()`
- `from_igraph` uses `g._node` instead of `'name'` in more cases
- `g.from_igraph(ig)` will use IDs (ex: strings) for src/dst values instead of igraph indexes
Major version bump due to breaking igraph change
- igraph handlers: `graphistry.from_igraph`, `g.from_igraph`, `g.to_igraph`
- docs: README.md examples of using new igraph methods
- Deprecation warnings in old igraph methods: `g.graph(ig)`, `igraph2pandas`, `pandas2igraph`
- Internal igraph handlers upgraded to use new igraph methods
- `network2igraph` and `igraph2pandas` renamed output node ID column to `_n_implicit` (`constants.NODE`)
- Expose symbols for `.chain()` predicates as top-level: previous `ast` export was incorrect
Major version bump due to large dependency increases for kitchen-sink installs and overall sizeable new feature
- Use buildkit with pip install caching for test dockerfiles
- Graph AI branch: Autoencoding via dirty_cat and sentence_transformers (`g.featurize()`)
- Graph AI branch: UMAP via umap_learn (`g.umap()`)
- Graph AI branch: GNNs via DGL (`g.build_dgl_graph()`)
- `g.reset_caches()` to clear upload and compute caches (last 100)
- Central `setup_logger()`
- Official Python 3.10 support
- Logging: Refactor to `setup_logger(__name__)`
- hypergraph: use default logger instead of DEBUG
- `g.collapse(node='root_id', column='some_col', attribute='some_val')`
- Avoid runtime import exn when on GPU-less systems with cudf/dask_cudf installed
- Docs: `readme.md` digest of compute methods
- Docs: `get_degree()` -> `get_degrees()` (graphistry#330)
- Upload memoization handles column renames (graphistry#326)
- `g.edges()` now takes an optional 4th named parameter `edge` ID
  Code that looks like `g.edges(some_fn, None, None, some_arg)` should now be like `g.edges(some_fn, None, None, None, some_arg)`
- Similar new optional `edge` ID parameter in `g.bind()`
- `g.hop()` now takes optional `return_as_wave_front=False`, primarily for internal use by `chain()`
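
The positional-argument shift caused by the new `edge` parameter in `g.edges()` can be seen in a small stand-alone sketch. The two functions below are hypothetical stand-ins, not the real PyGraphistry signatures. They only model a fourth named parameter being inserted ahead of the extra positional arguments.

```python
def edges_before(edges, source=None, destination=None, *args):
    """Stand-in for the old call shape: extras start at position 4."""
    return {"source": source, "destination": destination,
            "edge": None, "extra_args": args}


def edges_after(edges, source=None, destination=None, edge=None, *args):
    """Stand-in for the new call shape: position 4 is now `edge`."""
    return {"source": source, "destination": destination,
            "edge": edge, "extra_args": args}


def some_fn(g, some_arg):
    """Placeholder callable; never invoked in this sketch."""
    return []


old_style = edges_before(some_fn, None, None, "some_arg")
# Unmigrated call against the new shape: "some_arg" is consumed as `edge`.
unmigrated = edges_after(some_fn, None, None, "some_arg")
# Migrated call: an explicit None fills the new `edge` slot.
migrated = edges_after(some_fn, None, None, None, "some_arg")
```

In the unmigrated call the extra argument silently lands in the `edge` slot, which is why callers passing extra positional arguments need the additional `None`.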
- `g.chain([...])` with `graphistry.ast.{n, e_forward, e_reverse, e_undirected}`
- Node dictionary-based filtering: `g.filter_nodes_by_dict({"some": "value", "another": 2})`
- Edge dictionary-based filtering: `g.filter_edges_by_dict({"some": "value", "another": 2})`
- Hops support edge filtering: `g.hop(hops=2, edge_match={"type": "transaction"})`
- Hops support pre-node filtering: `g.hop(hops=2, source_node_match={"type": "account"})`
- Hops support post-node filtering: `g.hop(hops=2, destination_node_match={"type": "wallet"})`
- Hops defaults to full graph if no initial nodes specified: `g.hop(hops=2, edge_match={"type": "transaction"})`
- Horizontal and radial axis using `.encode_axis(rows=[...])`
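
The dictionary-based filtering and edge-matched hops listed above can be approximated in plain Python to show the semantics. This is a rough sketch under stated assumptions, not the library's dataframe-based implementation: rows are dicts, traversal is forward-only, the edge keys `src` and `dst` are made up for the example, and the result is a set of node IDs instead of a graph.

```python
def filter_by_dict(rows, filter_dict):
    """Keep rows where every key in `filter_dict` equals the row's value."""
    return [r for r in rows
            if all(r.get(k) == v for k, v in filter_dict.items())]


def hop_sketch(start_nodes, edges, hops=1, edge_match=None):
    """Walk up to `hops` forward steps along edges passing `edge_match`."""
    usable = filter_by_dict(edges, edge_match or {})
    seen = set(start_nodes)
    frontier = set(start_nodes)
    for _ in range(hops):
        reached = {e["dst"] for e in usable if e["src"] in frontier} - seen
        if not reached:
            break  # nothing new to visit
        seen |= reached
        frontier = reached
    return seen


edges = [
    {"src": "a", "dst": "b", "type": "transaction"},
    {"src": "b", "dst": "c", "type": "transaction"},
    {"src": "c", "dst": "d", "type": "login"},
]
transactions = filter_by_dict(edges, {"type": "transaction"})
two_hops = hop_sketch({"a"}, edges, hops=2, edge_match={"type": "transaction"})
three_hops = hop_sketch({"a"}, edges, hops=3, edge_match={"type": "transaction"})
```

Here the `login` edge is excluded by `edge_match`, so node `d` is never reached regardless of the hop count.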
- Docs: Work around sphinx-doc/sphinx#10291
- Better implementation of `.tree_layout(...)` using Sugiyama; good for small/medium DAGs
- Layout rotation method `.rotate(degree)`
- Compute method `.hops(nodes, hops, to_fixed_point, direction)`
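
The layout rotation mentioned above is, at its core, a standard 2D rotation of node positions. The sketch below shows the arithmetic only. It assumes positions are `(x, y)` tuples rotated counter-clockwise about the origin, which may differ from the convention the library's `.rotate(degree)` uses.

```python
import math


def rotate_positions(positions, degree):
    """Rotate (x, y) pairs counter-clockwise about the origin by `degree`."""
    theta = math.radians(degree)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [(x * cos_t - y * sin_t, x * sin_t + y * cos_t)
            for x, y in positions]


rotated = rotate_positions([(1.0, 0.0), (0.0, 2.0)], 90)
```

Under these assumptions, a 90 degree rotation sends `(1, 0)` to approximately `(0, 1)` and `(0, 2)` to approximately `(-2, 0)`, up to floating-point error.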
- Infra: `test-cpu-local-minimum.sh` accepts params
- Docs: Point color encodings
- Unpin Networkx
- Docs: Removed deprecated `api=1`, `api=2` registration calls (#280 by @pradkrish)
- Docs: Fixed bug in honeypot nb (#279 by @pradkrish)
- Tests: Networkx test version sniffing
- Docs: Sharing control demos/more_examples/graphistry_features/sharing_tutorial.ipynb
- Feature: global `graphistry.privacy()` and compositional `Plotter.privacy()`
- Docs: How to use `privacy()`
- Docs: Start removing deprecated 1.0 API docs
- Fix: NetworkX 2.5+ support - accept minor version tags
- Fix: igraph `.plot()` arrow coercion syntax error (graphistry#257)
- Fix: Lint duplicate import warning
- CI: Treat lint warnings as CI failures
- Infra: Add CI stage that installs and tests with minimal core deps (graphistry#254)
- Fix: Core tests pass with minimal install dependencies (graphistry#253, graphistry#254)
- Feature: Compute methods `materialize_nodes`, `get_degrees`, `drop_nodes`, `get_topological_levels`
- Feature: Layout methods `tree_layout`, `layout_settings`
- Docs: New compute and layout methods
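
To make the degree and topological-level compute methods above concrete, here is a small stand-alone sketch of the underlying ideas on a plain edge list. It is not the library's dataframe implementation. The output key names (`degree_in`, `degree_out`, `degree`) and the choice to assign level 0 to nodes without incoming edges are assumptions for illustration.

```python
from collections import defaultdict


def degrees_sketch(edges):
    """Count in-, out-, and total degree per node from (src, dst) pairs."""
    counts = defaultdict(lambda: {"degree_in": 0, "degree_out": 0})
    for src, dst in edges:
        counts[src]["degree_out"] += 1
        counts[dst]["degree_in"] += 1
    return {n: {**c, "degree": c["degree_in"] + c["degree_out"]}
            for n, c in counts.items()}


def topological_levels_sketch(edges):
    """Assign each DAG node a level by peeling off zero in-degree nodes."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    children = defaultdict(list)
    for src, dst in edges:
        indegree[dst] += 1
        children[src].append(dst)
    levels, depth = {}, 0
    frontier = [n for n in nodes if indegree[n] == 0]
    while frontier:
        next_frontier = []
        for n in frontier:
            levels[n] = depth
            for child in children[n]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_frontier.append(child)
        frontier, depth = next_frontier, depth + 1
    return levels  # nodes on a cycle never reach in-degree 0 and are omitted


dag = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
degrees = degrees_sketch(dag)
levels = topological_levels_sketch(dag)
```

For this diamond-shaped DAG the sketch places `a` at level 0, `b` and `c` at level 1, and `d` at level 2.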
- Feature: `g.fetch_edges()` for neptune/gremlin edge attributes
- Fix: `g.fetch_nodes()` for neptune/gremlin node attributes
- Docs: Updated demos/for_analysis.ipynb to `api=3`
- Fix: Gremlin (Neptune) connector deduplicates nodes/edges
- Feature: Gremlin connector (GraphSONSerializersV2d0)
- Feature: Cosmos connector
- Feature: Neptune connector
- Feature: Chained composition operators: `g.pipe((lambda g, a1, ...: g2), a1, ...)`, `g.edges((lambda g, a1, ...: df), None, None, a1, ...)`, `g.nodes((lambda g, a1, ...: df), None, a1, ...)`
- Feature: plotter::infer_labels: Guess node label names when not set, instead of defaulting to node_id. Runs during plots.
- Infra: Jupyter notebook: `cd docker && docker-compose build jupyter && docker-compose up jupyter`
- Docs: Neptune, Cosmos, chained composition
- Refactor: Split out PlotterBase, interface Plottable
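
The chained composition operators listed above follow a common pattern: a method accepts a callable, passes the current object plus any extra arguments to it, and uses the result. The toy class below sketches that pattern only. `TinyGraph` is hypothetical and much simpler than the real Plotter, and the edge tuples stand in for dataframes.

```python
class TinyGraph:
    """Hypothetical immutable-style graph holder used only for this sketch."""

    def __init__(self, edge_rows=None, source=None, destination=None):
        self.edge_rows = edge_rows or []
        self.source = source
        self.destination = destination

    def pipe(self, fn, *args, **kwargs):
        """Call `fn(self, *args, **kwargs)` and return whatever it returns."""
        return fn(self, *args, **kwargs)

    def edges(self, edges, source=None, destination=None, *args):
        """Bind edges; if `edges` is callable, compute them from self first."""
        if callable(edges):
            edges = edges(self, *args)
        return TinyGraph(edges, source, destination)


g = TinyGraph([("a", "b"), ("b", "c"), ("c", "a")])

# Extra positional arguments after the two binding slots go to the callable.
g2 = g.edges(lambda g, keep: [e for e in g.edge_rows if e[0] in keep],
             None, None, {"a", "b"})

# pipe() lets arbitrary functions join a method chain.
edge_count_plus = g2.pipe(lambda g, bonus: len(g.edge_rows) + bonus, 10)
```

The benefit of this shape is that a transformation depending on the current graph can stay inside a single method chain instead of requiring an intermediate variable.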
- Fix: Plotter has `hypergraph()`
- Docs: security.md
- Hypergraphs - detect and handle mismatching types across partitions
- Infra: Speedup testing containers via incrementalization and docker settings
- Infra: Update testing container base builds
- Feature: Hypergraphs in dask, dask_cudf modes. Mixed nan support. (graphistry#225)
- Feature: Dask/dask_cuda frames can be passed in, which will be .computed(), memoized, and converted to arrow (graphistry#225)
- Infra: Test env var controls - WITH_LINT=1, WITH_TYPECHECK=1, WITH_BUILD=1 (graphistry#225)
- Docs: Inline hypergraph examples (graphistry#225)
- CI: Disable seccomp during test (docker perf) (graphistry#225)
- Feature: cudf mode for hypergraph (graphistry#224)
- Feature: pandas mode for hypergraph uses all-vectorized operations (graphistry#224)
- Infra: Engine class for picking dataframe engine - pandas/cudf/dask/dask_cudf (graphistry#224)
- CI: mypy type checking (graphistry#222)
- CI: GPU test harness (graphistry#223)
- Hypergraph: Uses new pandas/cudf implementations (graphistry#224)
- Infra: Issue templates for bugs and feature requests
- Docs: Overhaul Sphinx docs - Update, clean all warnings, add to CI, reject commits that fail
- Docs: Setup.py (pypi) now includes full README.md
- Docs: Added ARCHITECTURE, CONTRIBUTE, and expanded DEVELOP
- Garden: DRY for CI + local dev via shared bin/ scripts
- Docker: Downgrade local dev 3.7 -> 3.6 to more quickly catch minimum version errors
- CI: Now tests building docs (fail on warnings), pypi wheels distro, and neo4j connector
- Changes in setup.py extras_require: 'all' installs more
- Docs: ARCHITECTURE.md and CONTRIBUTE.md
- Quieted memoization fail warning
- CI: Removed TravisCI in favor of GHA
- CD: GHA now handles PyPI publish on tag push
- Docs: Readme install clarifies Python 3.6+
- Docs: Update DEVELOP.md dev flow
- Friendlier error message for calling .cypher(...) without setting BOLT auth/driver (graphistry#204)
- CI: Run containerized neo4j connector tests
- Infrastructure: Set Python 3.9 support metadata
- Memoization: When memoize hashes throw exceptions, emit warning and fallback to unmemoized (b7a25c74e)
- Friendlier error message for api=1, 2 server non-json responses (graphistry#187)
- CI: Moved to GitHub Actions for CI + optional manual publish
- CI: Added Python 3.9 to test matrix
- Infrastructure: Upgraded Versioneer to 0.19
- Infrastructure: Fewer warnings and enforce flake8 CI checks
- None known; many small changes to fix warnings so version bump out of caution
- File API: Enable via `.plot(as_files=True)`. By default, auto-skips file re-uploads (disable via `.plot(memoize=False)`) for tables with same hash as those already uploaded in the same session. Use with `.register(api=3)` clients on Graphistry `2.34`+ servers. More details at (graphistry#195).
- Dev: More docs and logging as part of graphistry#195
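
The hash-based skipping of re-uploads described above, together with the fallback to an unmemoized upload when hashing fails, can be sketched as follows. This is an illustrative model, not the File API client: the `UploaderSketch` class, its fake file IDs, and the `repr`-based hashing are all assumptions made for the example.

```python
import hashlib
import warnings


class UploaderSketch:
    """Hypothetical session-scoped uploader that memoizes by content hash."""

    def __init__(self):
        self.file_ids_by_hash = {}
        self.upload_count = 0

    def _hash(self, rows):
        # Assumption: hashing the repr is enough for this toy example.
        return hashlib.sha256(repr(rows).encode("utf-8")).hexdigest()

    def _do_upload(self, rows):
        # Stand-in for a network upload; returns a fake file ID.
        self.upload_count += 1
        return f"file-{self.upload_count}"

    def upload(self, rows, memoize=True):
        if not memoize:
            return self._do_upload(rows)
        try:
            key = self._hash(rows)
        except Exception as exc:
            # Hashing failed: warn and fall back to a plain upload.
            warnings.warn(f"memoization skipped: {exc}")
            return self._do_upload(rows)
        if key not in self.file_ids_by_hash:
            self.file_ids_by_hash[key] = self._do_upload(rows)
        return self.file_ids_by_hash[key]


uploader = UploaderSketch()
first = uploader.upload([("a", "b")])
second = uploader.upload([("a", "b")])               # same hash, reused
forced = uploader.upload([("a", "b")], memoize=False)  # always uploads
```

In this model the second call reuses the first file ID, while `memoize=False` forces a fresh upload.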
- Auth service account docs in README.md (12.2.2020)
- Examples for icons, badges, and new node/edge bindings
- graph-app-kit links
- Slack link
- Python test matrix: Removed 3.9
- Propagate misformatted etl1/2 server errors
- Warnings: Standardizing on Python's warnings.warn
- Neo4j: Improve handling of empty query results (graphistry#178)
- Icons: Add new as_text, blend_mode, border, and style options (Graphistry 2.32+)
- Badges: Add new badge encodings (Graphistry 2.32+)
- Python 3.8, 3.9 in test matrix
- New binding shortcuts `g.nodes(df, col)` and `g.nodes(df, src_col, dst_col)`
- Python 2.7: Removed future (Python 2.7 has already been EOL so not breaking)
- Redid ipython detection
- Imports: Refactoring for more expected style
- Testing: Fixed most warnings in preparation for treating them as errors
- Testing: Integration tests against self-contained neo4j instance
- Chainable methods `.addStyle()` and `.style()` in `api=3` for controlling foreground, background, logo, and page metadata. Requires Graphistry 2.31.10+ 08eddb8
- Chainable methods `.encode_[point|edge]_[color|icon|size]()` for more powerful complex encodings, and underlying generic handler `__encode()`. Requires Graphistry 2.31.10+ f370ca8
- More usage examples in README.md
- Split `ArrowLoader::*encoding*` methods to `*binding*` vs. `*encoding*` ones to more precisely reflect the protocol. Not considered breaking as an internal method.
- Neo4j 4 temporal and spatial type support - #172
- CHANGELOG.md
- Removed deprecated docker test harness in favor of `docker/` - #172