Start with HeatGeo
The Heat-Geodesic embedding preserves the heat-geodesic dissimilarity, defined as \[ d_t(x_i,x_j) = \bigg[ -4t \log (\mathbf{H}_t)_{ij} - \sigma 4 t \log(\mathbf{V})_{ij} \bigg] ^{1/2}, \] where \(\mathbf{H}_t\) is a heat kernel on a graph with diffusion time \(t\), \(\mathbf{V}\) is a volume regularization term, and \(\sigma\) weights that term. This dissimilarity is inspired by Varadhan’s formula, which relates the heat kernel to the geodesic distance on a manifold. For more details on the heat-geodesic dissimilarity, read our preprint, A Heat Diffusion Perspective on Geodesic Preserving Dimensionality Reduction.
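To make the formula concrete, here is a minimal NumPy sketch of the dissimilarity on a toy graph. It is an illustration only, not the package's implementation. It assumes the heat kernel is \(\exp(-t\mathbf{L})\) for the combinatorial Laplacian \(\mathbf{L} = \mathbf{D} - \mathbf{A}\), and it sets \(\sigma = 0\) because the construction of \(\mathbf{V}\) is not described here. The function names `heat_kernel` and `heat_geodesic_dissimilarity` are hypothetical and are not part of heatgeo.

```python
import numpy as np


def heat_kernel(adjacency, t):
    """Heat kernel exp(-t L), assuming the combinatorial Laplacian L = D - A."""
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency
    # L is symmetric, so exp(-t L) = U diag(exp(-t * eigvals)) U^T.
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    return (eigvecs * np.exp(-t * eigvals)) @ eigvecs.T


def heat_geodesic_dissimilarity(H, t, V=None, sigma=0.0):
    """Entrywise d_t = sqrt(-4t log H - sigma * 4t log V)."""
    squared = -4.0 * t * np.log(H)
    if V is not None and sigma != 0.0:
        # V is taken as given; how it is built is outside this sketch.
        squared = squared - sigma * 4.0 * t * np.log(V)
    # Clip negative values to zero before taking the square root.
    return np.sqrt(np.clip(squared, 0.0, None))


# Toy example: the path graph 0 - 1 - 2 - 3.
A = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ],
    dtype=float,
)
t = 1.0
H = heat_kernel(A, t)
D = heat_geodesic_dissimilarity(H, t)  # sigma = 0: no volume regularization
print(np.round(D, 3))
```

On this path graph the dissimilarity from node 0 grows with the number of hops, which is the behavior expected of a geodesic-like quantity. The diagonal of `D` is not zero, which is why \(d_t\) is called a dissimilarity rather than a distance.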
We are currently updating this repository to provide examples and improve the documentation.
Install
The package is available on PyPI. Install it by running
pip install heatgeo
To reproduce the results in experiments/ or to try the embeddings with different graph constructions, you need additional packages, which can be installed via the development version. In this case, run
pip install "heatgeo[dev]"
We provide an example below.
How to use
To create the embedding of a dataset data, run
from heatgeo.embedding import HeatGeo
emb_op = HeatGeo(knn=5)
emb = emb_op.fit_transform(data)
We provide a Google Colab example on the Swiss roll.
The directory experiments contains code to reproduce our main results. We used hydra; the parameters can be changed in config or directly in the CLI. In notebooks, we provide examples on toy datasets.
Contributing
We use nbdev for this package and its documentation. See this introduction to start using nbdev. Modify the code and documentation in the notebooks under nbs/, then run nbdev_prepare before committing. This command exports the notebooks to .py files in heatgeo, cleans the notebook metadata, and runs some tests. The documentation page is then deployed automatically through GitHub Actions.
Acknowledgements
This repository is a simplified version of a larger codebase used for development. It loses the original commit history, which contains contributions from other authors of the paper. This repository uses or modifies code from the PHATE implementation and from the Chebyshev polynomials implementation of the paper Fast Multiscale Diffusion on Graphs.