Commit

Update the documentation
Jonas Marcello committed Jul 27, 2023
1 parent 418b872 commit ad699e2
Showing 15 changed files with 485 additions and 62 deletions.
2 changes: 1 addition & 1 deletion Makefile
@@ -16,7 +16,7 @@ check:

docs:
@echo "Generating documentation"
sphinx-build -b html -d _build/doctrees docs _build/html/
sphinx-build -a -b html -d _build/doctrees docs _build/html/

build:
@echo "Building packages"
93 changes: 48 additions & 45 deletions README.rst
@@ -6,15 +6,17 @@ A Python library to work with, analyze, filter and inspect the `Human Phenotype

Visit the `PyHPO Documentation`_ for a more detailed overview of all the functionality.

.. _Human Phenotype Ontology: https://hpo.jax.org/
.. _PyHPO Documentation: https://pyhpo.readthedocs.io/en/latest/

Main features
=============

- Identify patient cohorts based on clinical features
- Cluster patients or other clinical information for GWAS
- Phenotype to Genotype studies
- HPO similarity analysis
- Graph based analysis of phenotypes, genes and diseases
* 👫 Identify patient cohorts based on clinical features
* 👨‍👧‍👦 Cluster patients or other clinical information for GWAS
* 🩻→🧬 Phenotype to Genotype studies
* 🍎🍊 HPO similarity analysis
* 🕸️ Graph based analysis of phenotypes, genes and diseases


**PyHPO** allows working on individual terms ``HPOTerm``, a set of terms ``HPOSet`` and the full ``Ontology``.
@@ -25,8 +27,45 @@ Internally the ontology is represented as a branched linked list, every term con

It provides an interface to create a ``Pandas Dataframe`` from its data, allowing integration into existing data analysis tools.

Examples
--------

Getting started
===============

The easiest way to install **PyHPO** is via pip

.. code:: bash

    pip install pyhpo

or, you can additionally install optional packages for extra functionality

.. code:: bash

    # Include pandas during install
    pip install pyhpo[pandas]

    # Include scipy
    pip install pyhpo[scipy]

    # Include all dependencies
    pip install pyhpo[all]

.. note::

    Some features of PyHPO require ``pandas`` and ``scipy``. The standard installation via pip will not include pandas or scipy and PyHPO will work just fine. (You will get a warning on the initial import though).

    Without installing ``pandas``, you won't be able to export the Ontology as a ``Dataframe``, everything else will work fine.

    Without installing ``scipy``, you won't be able to use the ``stats`` module, especially the enrichment calculations.
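
The note above describes optional packages whose absence produces a warning rather than an error. As a rough, standard-library-only sketch of that general pattern (the helper names ``optional_dependency_available`` and ``warn_if_missing`` are hypothetical and are not part of PyHPO), a script could check for the extras before relying on them:

```python
import importlib.util
import warnings


def optional_dependency_available(module_name: str) -> bool:
    """Return True if the top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


def warn_if_missing(module_name: str, feature: str) -> bool:
    """Warn (instead of failing) when an optional package is absent.

    Hypothetical helper for illustration only; returns whether the
    package is available so callers can skip the dependent feature.
    """
    available = optional_dependency_available(module_name)
    if not available:
        warnings.warn(f"{module_name} is not installed, {feature} is unavailable")
    return available
```

A caller might then guard the DataFrame export with ``if warn_if_missing("pandas", "DataFrame export"): ...`` and continue without it otherwise.
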


Usage example
=============

Basic use cases
---------------

Some examples for basic functionality of PyHPO

How similar are the phenotypes of two patients
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -85,40 +124,6 @@ How close are two HPO terms
"""
Getting started
===============

The easiest way to install **PyHPO** is via pip

.. code:: bash

    pip install pyhpo

or, you can additionally install optional packages for extra functionality

.. code:: bash

    # Include pandas during install
    pip install pyhpo[pandas]

    # Include scipy
    pip install pyhpo[scipy]

    # Include all dependencies
    pip install pyhpo[all]

.. note::

    Some features of PyHPO require ``pandas`` and ``scipy``. The standard installation via pip will not include pandas or scipy and PyHPO will work just fine. (You will get a warning on the initial import though).

    Without installing ``pandas``, you won't be able to export the Ontology as a ``Dataframe``, everything else will work fine.

    Without installing ``scipy``, you won't be able to use the ``stats`` module, especially the enrichment calculations.


Usage example
=============

HPOTerm
-------
An ``HPOTerm`` contains various metadata about the term, as well as pointers to its parent and child terms. You can access its information content, calculate similarity scores to other terms, find the shortest or longest connection between two terms, list all associated genes or diseases, etc.
@@ -308,7 +313,6 @@ It can be reused across several modules, e.g:
return Ontology.get_hpo_object(term)
HPOSet
------
An ``HPOSet`` is a collection of ``HPOTerm`` and can be used to represent e.g. a patient's clinical information. It provides APIs for filtering, comparisons to other ``HPOSet`` and term/gene/disease enrichments.
@@ -406,7 +410,8 @@ Examples:
*(This script is complete, it should run "as is")*


For a more detailed description of how to use PyHPO, visit the `PyHPO Documentation`_.
For a more detailed description of how to use PyHPO, visit the `PyHPO Documentation <https://pyhpo.readthedocs.io/en/latest/>`_.



Contributing
@@ -424,6 +429,4 @@ PyHPO is using the Human Phenotype Ontology. Find out more at http://www.human-p

Sebastian Köhler, Leigh Carmody, Nicole Vasilevsky, Julius O B Jacobsen, et al. Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nucleic Acids Research. (2018) doi: 10.1093/nar/gky1105

.. _PyHPO Documentation: https://centogene.github.io/pyhpo/
.. _MIT license: http://www.opensource.org/licenses/mit-license.php
.. _Human Phenotype Ontology: https://hpo.jax.org/
5 changes: 3 additions & 2 deletions docs/conf.py
@@ -57,7 +57,7 @@

# General information about the project.
project = 'PyHPO'
copyright = '2021, CENTOGENE GmbH'
copyright = '2023, Jonas Marcello'
author = pyhpo.__author__

# The version info for the project you're documenting, acts as replacement for
@@ -74,7 +74,7 @@
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None
language = "en"

# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
@@ -296,3 +296,4 @@
autodoc_member_order = 'bysource'

napoleon_use_param = True
autodoc_typehints = "both"
34 changes: 26 additions & 8 deletions docs/index.rst
@@ -1,10 +1,29 @@
#################################
Welcome to PyHPO's documentation!
#################################
###################
PyHPO documentation
###################

.. toctree::
:maxdepth: 1
:caption: API documentation:
:caption: 🚀 Getting started:

tutorial/installation
tutorial/basics
tutorial/data

.. toctree::
:maxdepth: 1
:caption: 🖥️ Examples:

tutorial/examples
tutorial/terms
tutorial/ontology
tutorial/sets
tutorial/enrichment

.. toctree::
:maxdepth: 1
:hidden:
:caption: 📄 API documentation:

hpoterm
ontology
@@ -16,17 +35,16 @@ Welcome to PyHPO's documentation!
data
parser

*************
Introduction:
*************


.. include:: ../README.rst
:end-before: Getting started


##################
Indices and tables
##################

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

165 changes: 165 additions & 0 deletions docs/tutorial/basics.rst
@@ -0,0 +1,165 @@
Basics
------

**PyHPO** provides an easy interface to work with the Human Phenotype Ontology. The main interface is the :doc:`ontology` object, which must be instantiated once. ``Ontology`` is designed as a singleton, so the same instance can be used across different modules.

Ontology
~~~~~~~~

``Ontology`` can be instantiated with the default master data, simply by calling

.. code:: python

    from pyhpo import Ontology
    _ = Ontology()

It can now be used across all modules. Imagine the following source code:

::

    /mymodule
    |- foo.py
    |- bar.py
    |- main.py


``foo.py``:

.. code:: python

    from pyhpo import Ontology

    def ontology_len():
        print(len(Ontology))

``bar.py``:

.. code:: python

    from pyhpo import Ontology

    def get_term_name(term_id: int) -> str:
        try:
            term = Ontology[term_id]
        except KeyError:
            print("Term not present in Ontology")
            return ""
        # Return the name of the term that was found
        return term.name

``main.py``:

.. code:: python

    import foo
    import bar
    from pyhpo import Ontology

    # This is the only time where the Ontology is instantiated.
    _ = Ontology()

    foo.ontology_len()
    # ==> Prints the number of HPO terms in the Ontology

    bar.get_term_name(118)  # ==> "Phenotypic abnormality"

This code works as expected: the ``Ontology`` singleton is shared across all modules and submodules. It must be instantiated only once; other modules only need to import the ``Ontology`` object.
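
One way to picture this shared-instance behaviour is a single module-level object that loads its data when it is called. The sketch below is only an illustration under that assumption: ``ToyOntology``, ``toy_ontology`` and the two hard-coded terms are hypothetical stand-ins and do not show how PyHPO itself is implemented.

```python
from typing import Dict


class ToyOntology:
    """Hypothetical stand-in for a shared, load-once ontology object."""

    def __init__(self) -> None:
        # Empty until the instance is called for the first time
        self._terms: Dict[int, str] = {}

    def __call__(self) -> "ToyOntology":
        # Calling the instance loads the data, similar in spirit to `_ = Ontology()`
        if not self._terms:
            self._terms = {1: "All", 118: "Phenotypic abnormality"}
        return self

    def __len__(self) -> int:
        return len(self._terms)

    def __getitem__(self, term_id: int) -> str:
        # Raises KeyError for unknown IDs in this toy version
        return self._terms[term_id]


# One module-level instance: every module that imports this name
# receives the very same object, so loading it once is enough.
toy_ontology = ToyOntology()
```

Because Python caches imported modules, any module that imports ``toy_ontology`` sees the data as soon as one caller has run ``toy_ontology()``.
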


By default, ``Ontology()`` will load the HPO version provided along with the library (see the :doc:`data` section for details about how to update or change the master data).


The Ontology holds references to all HPO terms, genes and diseases. Since terms are the most common use-case, ``Ontology`` allows easy subsetting to retrieve terms. For this, use the integer form of the HPO-Term ID:

.. code:: python

    from pyhpo import Ontology
    _ = Ontology()

    term = Ontology[118]  # ==> returns term `HP:0000118`

Alternatively, terms can be retrieved by using the full HPO-Term ID:

.. code:: python

    from pyhpo import Ontology
    _ = Ontology()

    term = Ontology.get_hpo_object("HP:0000118")  # ==> returns term `HP:0000118`

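
The two lookups above use the same identifier in two forms: the integer ``118`` and the string ``HP:0000118``. A small helper pair for converting between them might look like the following sketch. The function names are hypothetical (they are not PyHPO functions), and the seven-digit zero padding is inferred from the IDs shown in this document.

```python
def hpo_id_to_int(hpo_id: str) -> int:
    """Convert an ID such as 'HP:0000118' to its integer form (118)."""
    prefix, _, digits = hpo_id.partition(":")
    if prefix != "HP" or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"Not an HPO term ID: {hpo_id!r}")
    return int(digits)


def int_to_hpo_id(term_id: int) -> str:
    """Convert an integer such as 118 to the padded string form 'HP:0000118'."""
    return f"HP:{term_id:07d}"
```

With these helpers, ``hpo_id_to_int("HP:0000118")`` gives ``118`` and ``int_to_hpo_id(118)`` gives ``"HP:0000118"``.
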
The ``Ontology`` can also be used as an iterator. It iterates over all HPO-Terms in no particular order:

.. code:: python

    from pyhpo import Ontology
    _ = Ontology()

    for term in Ontology:
        print(term)

HPOTerm
~~~~~~~

The :doc:`terms` are another key part of **PyHPO**. HPOTerms are the building blocks of the ontology and provide a lot of relevant functionality. They hold references to all their ancestor and child terms, allowing fast traversal of individual branches of the ontology.

.. code:: python

    from pyhpo import Ontology
    _ = Ontology()

    term = Ontology[118]

    for child in term.children:
        print(f"{child}")

    for parent in term.parents:
        print(f"{parent}")

    # You can also iterate over all parents and their parents and grandparents etc.
    for ancestor in term.all_parents:
        print(f"{ancestor}")

Do not instantiate ``HPOTerm`` objects manually. Terms created this way would lack all the important links to parents, children, genes, diseases etc.
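
To make the idea of "all parents and their parents" concrete, here is a toy traversal over a hand-written hierarchy. The ``PARENTS`` mapping and the ``all_ancestors`` function are hypothetical and use plain strings instead of ``HPOTerm`` objects. The sketch only shows the general technique of collecting ancestors when a term can have more than one parent.

```python
from typing import Dict, Set

# Hypothetical toy hierarchy: each term maps to its direct parents
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "C": {"A", "B"},  # a term with two parents
    "D": {"C"},
}


def all_ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Collect every ancestor of ``term`` by walking parent links upwards."""
    seen: Set[str] = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current in seen:
            # Already visited through another branch
            continue
        seen.add(current)
        stack.extend(parents[current])
    return seen
```

For the toy data, ``all_ancestors("D", PARENTS)`` contains ``C``, ``A``, ``B`` and ``root``, while the root itself has no ancestors.
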


HPOSet
~~~~~~

:doc:`sets` are an important feature of **PyHPO** for doing patient or disease based data analysis. An HPOSet is primarily just that: a set of HPOTerms. You can use it to document the clinical information or full phenotype of a patient or to describe a disease. ``HPOSet`` builds on Python's standard ``set`` (``Set[HPOTerm]``) and can easily be built from one. It does, however, provide a lot of additional functionality.

HPOSets can be compared to each other to identify similar patients or diseases. The similarity comparisons can be used for clustering patient cohorts.

.. code:: python

    from pyhpo import Ontology, HPOSet
    _ = Ontology()

    ci_1 = HPOSet.from_queries([
        'HP:0002943',
        'HP:0008458',
        'HP:0100884',
        'HP:0002944',
        'HP:0002751'
    ])

    ci_2 = HPOSet.from_queries([
        'HP:0002650',
        'HP:0010674',
        'HP:0000925',
        'HP:0009121'
    ])

    # Determine the similarity
    ci_1.similarity(ci_2)  # ==> 0.7593552670152157

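
To illustrate the general idea of scoring two term sets against each other, the sketch below uses plain Jaccard overlap on sets of term IDs. This is a deliberately simpler measure swapped in for illustration; it is not what ``HPOSet.similarity`` computes, and the two patient sets are made up for the example.

```python
from typing import Set


def jaccard_similarity(a: Set[str], b: Set[str]) -> float:
    """Share of terms common to both sets, relative to all terms in either set."""
    if not a and not b:
        # Two empty sets: define the score as 0.0 to avoid dividing by zero
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical patients described by HPO term IDs
patient_a = {"HP:0002943", "HP:0008458", "HP:0002650"}
patient_b = {"HP:0002650", "HP:0010674", "HP:0002943", "HP:0009121"}

score = jaccard_similarity(patient_a, patient_b)
```

A pure overlap score like this treats closely related terms as entirely different, so it only conveys the shape of a set-to-set comparison.
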
Enrichment
~~~~~~~~~~

**PyHPO** includes statistical tests to determine the hypergeometric enrichment of linked diseases or genes in a set of HPOTerms. You can use this to find genes that are relevant for the phenotype of a patient. More examples are documented in :doc:`enrichment`.
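
As background for what a hypergeometric enrichment test measures, the following standard-library sketch computes the upper-tail probability of seeing at least ``k`` annotated items in a sample. The function and parameter names are illustrative assumptions. It is not PyHPO's ``stats`` module, which per the README relies on ``scipy``.

```python
from math import comb


def hypergeom_upper_tail(k: int, population: int, successes: int, draws: int) -> float:
    """P(X >= k) when drawing ``draws`` items without replacement.

    ``population`` is the total number of items and ``successes`` is how
    many of them carry the annotation of interest.
    """
    lower = max(k, 0)
    upper = min(successes, draws)
    total = comb(population, draws)
    # Sum the probabilities of every outcome with at least k annotated items
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(lower, upper + 1)
    )
    return tail / total
```

For example, with 10 items of which 3 are annotated, drawing 4 and asking for at least 2 annotated items gives 70 favourable outcomes out of 210, a probability of one third. A small tail probability suggests the annotation occurs more often in the sample than chance alone would explain.
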