Release/0.10.0 #62

Merged
merged 19 commits into from
Jun 11, 2024
19 commits
ad3ef80
refactor: Improve error handling in parser
anergictcell Apr 2, 2024
9ec4a19
Update documentation on which gene_to_phenotype files to download
anergictcell Apr 2, 2024
d73d132
refactor: Add better error messages in SimilarityCombiner trait
anergictcell Apr 2, 2024
bad8fcf
Add error handling to parsing of phenotype.hpoa file
anergictcell Apr 2, 2024
128b81b
refactor: Add error handling to parsing gene_pheno file
anergictcell Apr 3, 2024
a9f17e9
Modify Ontology.add_gene due improve parsing error handling
anergictcell Apr 4, 2024
b7b343e
Update compare-ontologies example to also compare transitive vs normal
anergictcell Apr 5, 2024
504b3c7
Minor formatting fix
anergictcell Apr 6, 2024
b4dc6af
Add more tests to
anergictcell Apr 6, 2024
03a91cb
Reorder methods to simplify the documentation
anergictcell Apr 6, 2024
ea9b191
Version bump to 0.10.0
anergictcell Apr 6, 2024
2a283a1
Merge pull request #58 from anergictcell/refactor/error-handling
anergictcell Apr 6, 2024
9e98edb
Remove parser from benchmarks
anergictcell Apr 6, 2024
c98d088
Merge pull request #59 from anergictcell/refactor/reorder-ontology
anergictcell Apr 6, 2024
e4929a6
Internalize code from statrs crate to reduce dependencies
anergictcell Mar 29, 2024
f9faf20
Add tests for hypergeom enrichment
anergictcell Apr 7, 2024
49cebbd
Merge pull request #60 from anergictcell/refactor/remove-statrs
anergictcell Apr 7, 2024
9619a32
Merge branch 'main' into release/0.10.0
anergictcell Jun 11, 2024
e1b193e
Fixed merge request coming from parser refactoring on multiple branches
anergictcell Jun 11, 2024
10 changes: 9 additions & 1 deletion CHANGELOG.md
@@ -4,20 +4,28 @@ All notable changes to this project will be documented in this file.

## Unreleased

## [0.10.0]

### Feature

- Add Orphanet diseases (`OrphaDisease`) to Ontology
- Filter gene and disease annotations in subontology based on association with phenotypes
- Add binary version 3
- Add new example ontology

### Documentation

- Change the order of methods in `Ontology` to clean up the documentation.

### Refactor

- Improve the OBO parser with better error handling
- [**breaking**] Add `Disease` trait that is needed to work with `OmimDisease` and `OrphaDisease`
- Update example ontology
- Update unit- and doctests to align with updated example ontology

## [0.9.0] - 2024-03-27

## [0.9.1] - 2024-03-30

### Bugfix

3 changes: 1 addition & 2 deletions Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "hpo"
-version = "0.9.1"
+version = "0.10.0"
edition = "2021"
authors = ["Jonas Marcello <[email protected]>"]
description = "Human Phenotype Ontology Similarity"
@@ -15,7 +15,6 @@ categories = ["science", "data-structures", "parser-implementations"]
[dependencies]
thiserror = "1.0"
aquamarine = "0" # used in Docs
-statrs = "0.16.0"
tracing = "0.1"
smallvec = "1"

16 changes: 10 additions & 6 deletions examples/compare_ontologies.rs
@@ -17,7 +17,7 @@ fn ontology(path_arg: &str) -> Ontology {
}
}

fn ontology2(path_arg: &str) -> Ontology {
fn ontology_transitive(path_arg: &str) -> Ontology {
let path = Path::new(path_arg);

match path.is_file() {
@@ -226,17 +226,21 @@ fn print_replacement_diff(replacements: Option<(Option<HpoTermId>, Option<HpoTer
fn main() {
let mut args = std::env::args();

if args.len() != 3 {
if args.len() < 2 {
println!("Compare two Ontologies to each other and print the differences\n\n");
println!("Usage:\ncompare_ontologies </PATH/TO/ONTOLOGY> </PATH/TO/OTHER-ONTOLOGY>");
println!("Usage:\ncompare_ontologies </PATH/TO/ONTOLOGY> [</PATH/TO/OTHER-ONTOLOGY>]");
println!("e.g.:\ncompare_ontologies tests/ontology.hpo tests/ontology_v2.hpo:\n");
println!("Alternatively compare transitive vs non-transitive:\ncompare_ontologies tests/ontology.hpo\n");
process::exit(1)
}
let arg_old = args.nth(1).unwrap();
let arg_new = args.next().unwrap();

let lhs = ontology(&arg_old);
let rhs = ontology2(&arg_new);

let rhs = if let Some(arg_new) = args.next() {
ontology(&arg_new)
} else {
ontology_transitive(&arg_old)
};

let diffs = lhs.compare(&rhs);
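
The argument handling in this hunk (one path compares transitive against non-transitive loading, two paths compare two ontologies) can be looked at in isolation. The following is a minimal sketch, not code from the PR: the `Comparison` enum and `select_comparison` function are hypothetical names introduced only for illustration, and no ontology is actually loaded.

```rust
/// Which comparison the example should run.
/// Hypothetical type, introduced only for this sketch.
#[derive(Debug, PartialEq)]
enum Comparison {
    /// Two paths were given: compare two ontology sources.
    TwoOntologies { old: String, new: String },
    /// One path was given: compare normal vs. transitive loading of it.
    TransitiveVsNormal { path: String },
}

/// Decide the comparison mode from the raw argument list.
///
/// `args` must include the program name as its first item,
/// the same way `std::env::args()` does.
/// Returns `None` if no path argument is present.
fn select_comparison<I: Iterator<Item = String>>(mut args: I) -> Option<Comparison> {
    // `nth(1)` skips the program name and yields the first real argument.
    let old = args.nth(1)?;
    match args.next() {
        // A second path is present: classic two-ontology comparison.
        Some(new) => Some(Comparison::TwoOntologies { old, new }),
        // No second path: fall back to transitive vs. non-transitive.
        None => Some(Comparison::TransitiveVsNormal { path: old }),
    }
}
```

Called as `select_comparison(std::env::args())`, a `None` result corresponds to the case where the example prints its usage text and exits.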

15 changes: 10 additions & 5 deletions src/annotations/gene.rs
@@ -119,11 +119,6 @@ impl Gene {
&self.name
}

/// Connect another [HPO term](`crate::HpoTerm`) to the gene
pub fn add_term<I: Into<HpoTermId>>(&mut self, term_id: I) -> bool {
self.hpos.insert(term_id)
}

/// The set of connected HPO terms
pub fn hpo_terms(&self) -> &HpoGroup {
&self.hpos
@@ -193,6 +188,16 @@ impl Gene {
pub fn to_hpo_set<'a>(&self, ontology: &'a Ontology) -> HpoSet<'a> {
HpoSet::new(ontology, self.hpos.clone())
}

/// Connect another [HPO term](`crate::HpoTerm`) to the gene
///
/// # Note
///
/// This method does **not** add the [`Gene`] to the [HPO term](`crate::HpoTerm`).
/// Clients should not use this method, unless they are creating their own Ontology.
pub fn add_term<I: Into<HpoTermId>>(&mut self, term_id: I) -> bool {
self.hpos.insert(term_id)
}
}
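
The note added to `add_term` says the link is one-sided: the gene learns about the term, but the term does not learn about the gene. A toy model may make the consequence clearer. Everything below (`ToyOntology`, `add_term_to_gene`, `link`, `genes_of`, and the integer ID aliases) is hypothetical and only mimics the idea; it does not use or reproduce the crate's types.

```rust
use std::collections::{BTreeSet, HashMap};

// Toy stand-ins for the crate's ID types; hypothetical names.
type TermId = u32;
type GeneId = u32;

/// A toy ontology that stores both directions of the gene/term relation.
#[derive(Default)]
struct ToyOntology {
    /// gene -> connected terms (the side a one-sided `add_term` updates)
    gene_terms: HashMap<GeneId, BTreeSet<TermId>>,
    /// term -> connected genes (the side a one-sided `add_term` leaves untouched)
    term_genes: HashMap<TermId, BTreeSet<GeneId>>,
}

impl ToyOntology {
    /// One-sided link, analogous to calling `add_term` on its own.
    /// Returns `true` if the term was not yet linked to the gene.
    fn add_term_to_gene(&mut self, gene: GeneId, term: TermId) -> bool {
        self.gene_terms.entry(gene).or_default().insert(term)
    }

    /// Two-sided link: what someone building their own ontology has to do.
    fn link(&mut self, gene: GeneId, term: TermId) {
        self.add_term_to_gene(gene, term);
        self.term_genes.entry(term).or_default().insert(gene);
    }

    /// All genes that the given term knows about, in ascending order.
    fn genes_of(&self, term: TermId) -> Vec<GeneId> {
        self.term_genes
            .get(&term)
            .map(|genes| genes.iter().copied().collect())
            .unwrap_or_default()
    }
}
```

After `add_term_to_gene(1, 118)` alone, `genes_of(118)` is still empty in this model; only `link(1, 118)` makes the relation visible from both sides.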

impl PartialEq for Gene {
81 changes: 74 additions & 7 deletions src/annotations/omim_disease.rs
@@ -6,8 +6,7 @@ use std::hash::Hash;
use crate::annotations::disease::DiseaseIterator;
use crate::annotations::{AnnotationId, Disease};
use crate::term::HpoGroup;
use crate::HpoError;
use crate::HpoTermId;
use crate::{HpoError, HpoSet, HpoTermId, Ontology};

/// A set of OMIM diseases
///
@@ -93,15 +92,83 @@ impl Disease for OmimDisease {
&self.name
}

/// Connect another [HPO term](`crate::HpoTerm`) to the disease
fn add_term<I: Into<HpoTermId>>(&mut self, term_id: I) -> bool {
self.hpos.insert(term_id)
}

/// The set of connected HPO terms
fn hpo_terms(&self) -> &HpoGroup {
&self.hpos
}

/// Returns a binary representation of the `OmimDisease`
///
/// The binary layout is defined as:
///
/// | Byte offset | Number of bytes | Description |
/// | --- | --- | --- |
/// | 0 | 4 | The total length of the binary data blob as big-endian `u32` |
/// | 4 | 4 | The `OmimDiseaseId` as big-endian `u32` |
/// | 8 | 4 | The length of the `OmimDisease` Name as big-endian `u32` |
/// | 12 | n | The `OmimDisease` name as u8 vector |
/// | 12 + n | 4 | The number of associated HPO terms as big-endian `u32` |
/// | 16 + n | x * 4 | The [`HpoTermId`]s of the associated terms, each encoded as big-endian `u32` |
///
/// # Examples
///
/// ```
/// use hpo::annotations::{Disease, OmimDisease};
///
/// let mut disease = OmimDisease::new(123.into(), "FooBar");
/// let bytes = disease.as_bytes();
///
/// assert_eq!(bytes.len(), 4 + 4 + 4 + 6 + 4);
/// assert_eq!(bytes[4..8], [0u8, 0u8, 0u8, 123u8]); // ID of disease => 123
/// assert_eq!(bytes[8..12], [0u8, 0u8, 0u8, 6u8]); // Length of Name => 6
/// ```
fn as_bytes(&self) -> Vec<u8> {
fn usize_to_u32(n: usize) -> u32 {
n.try_into().expect("unable to convert {n} to u32")
}
let name = self.name().as_bytes();
let name_length = name.len();
let size = 4 + 4 + 4 + name_length + 4 + self.hpos.len() * 4;

let mut res = Vec::new();

// 4 bytes for total length
res.append(&mut usize_to_u32(size).to_be_bytes().to_vec());

// 4 bytes for OMIM Disease-ID
res.append(&mut self.id.to_be_bytes().to_vec());

// 4 bytes for Length of OMIM Disease Name
res.append(&mut usize_to_u32(name_length).to_be_bytes().to_vec());

// OMIM Disease name (n bytes)
for c in name {
res.push(*c);
}

// 4 bytes for number of HPO terms
res.append(&mut usize_to_u32(self.hpos.len()).to_be_bytes().to_vec());

// HPO terms
res.append(&mut self.hpos.as_bytes());

res
}

/// Returns an [`HpoSet`] from the `OmimDisease`
fn to_hpo_set<'a>(&self, ontology: &'a Ontology) -> HpoSet<'a> {
HpoSet::new(ontology, self.hpos.clone())
}

/// Connect another [HPO term](`crate::HpoTerm`) to the disease
///
/// # Note
///
/// This method does **not** add the [`OmimDisease`] to the [HPO term](`crate::HpoTerm`).
/// Clients should not use this method, unless they are creating their own Ontology.
fn add_term<I: Into<HpoTermId>>(&mut self, term_id: I) -> bool {
self.hpos.insert(term_id)
}
}
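
The byte layout documented for `as_bytes` can be checked by reading it back. The sketch below decodes that layout with the standard library only. It is an illustration under assumptions, not the crate's own deserializer: `DecodedDisease`, `read_u32`, and `decode` are hypothetical names, the name is assumed to be valid UTF-8, and the term IDs are returned as plain `u32` values rather than `HpoTermId`.

```rust
/// Fields recovered from the documented byte layout.
/// Hypothetical struct for this sketch, not a type from the crate.
#[derive(Debug, PartialEq)]
struct DecodedDisease {
    id: u32,
    name: String,
    term_ids: Vec<u32>,
}

/// Read a big-endian `u32` at `offset`, or `None` if the slice is too short.
fn read_u32(bytes: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let b = bytes.get(offset..end)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Decode one record: total length, ID, name length, name, term count, terms.
fn decode(bytes: &[u8]) -> Option<DecodedDisease> {
    // Offset 0: total length of the blob, including these 4 bytes.
    let total = read_u32(bytes, 0)? as usize;
    if total != bytes.len() {
        return None;
    }
    // Offset 4: disease ID. Offset 8: length of the name.
    let id = read_u32(bytes, 4)?;
    let name_len = read_u32(bytes, 8)? as usize;

    // Offset 12: the name itself (n bytes).
    let name_end = 12usize.checked_add(name_len)?;
    let name = String::from_utf8(bytes.get(12..name_end)?.to_vec()).ok()?;

    // Offset 12 + n: number of terms, followed by 4 bytes per term.
    let n_terms = read_u32(bytes, name_end)? as usize;
    let mut term_ids = Vec::new();
    for i in 0..n_terms {
        term_ids.push(read_u32(bytes, name_end + 4 + i * 4)?);
    }
    Some(DecodedDisease { id, name, term_ids })
}
```

For the `FooBar` record from the doc example (ID 123, no terms), the input is 22 bytes long and `decode` would return ID `123`, name `"FooBar"`, and an empty term list.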

impl PartialEq for OmimDisease {
4 changes: 4 additions & 0 deletions src/lib.rs
@@ -21,6 +21,7 @@ pub mod utils;
pub use ontology::comparison;
pub use ontology::Ontology;
pub use set::HpoSet;
#[doc(inline)]
pub use term::{HpoTerm, HpoTermId};

const DEFAULT_NUM_PARENTS: usize = 10;
@@ -59,6 +60,9 @@ pub enum HpoError {
/// Failed to convert an integer to a float
#[error("cannot convert int to float")]
TryFromIntError(#[from] std::num::TryFromIntError),
/// Failed to parse a line of input data from the JAX obo
#[error("invalid input data: {0}")]
InvalidInput(String),
}
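
A variant like `InvalidInput(String)` lets a parser hand the offending line back to the caller instead of panicking. The following is a rough sketch of that pattern, assuming a simplified three-column, tab-separated line (`<gene id>`, `<gene symbol>`, `<HPO id>`). The format, the `ParseError` enum, and the `parse_line` function are all hypothetical; the enum is written by hand so the block does not depend on `thiserror`, and it only mirrors the error message shown in the diff.

```rust
use std::fmt;

/// Hand-written stand-in for the new variant; hypothetical type.
#[derive(Debug, PartialEq)]
enum ParseError {
    InvalidInput(String),
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Same message shape as the `#[error(...)]` attribute in the diff.
            ParseError::InvalidInput(line) => write!(f, "invalid input data: {}", line),
        }
    }
}

impl std::error::Error for ParseError {}

/// Build the error for a line that could not be parsed.
fn invalid(line: &str) -> ParseError {
    ParseError::InvalidInput(line.to_string())
}

/// Parse one line of the assumed format `<gene id>\t<symbol>\tHP:<number>`.
fn parse_line(line: &str) -> Result<(u32, &str, u32), ParseError> {
    let mut cols = line.trim_end().split('\t');

    // Column 1: numeric gene ID.
    let gene_id = cols
        .next()
        .and_then(|c| c.parse::<u32>().ok())
        .ok_or_else(|| invalid(line))?;

    // Column 2: non-empty gene symbol.
    let symbol = cols
        .next()
        .filter(|c| !c.is_empty())
        .ok_or_else(|| invalid(line))?;

    // Column 3: HPO term ID with the `HP:` prefix.
    let term_id = cols
        .next()
        .and_then(|c| c.strip_prefix("HP:"))
        .and_then(|c| c.parse::<u32>().ok())
        .ok_or_else(|| invalid(line))?;

    Ok((gene_id, symbol, term_id))
}
```

Returning the whole line inside the error keeps the message useful for large input files, where a bare "parse failed" would give no hint about which record is malformed.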

impl From<ParseIntError> for HpoError {