OMIM PMID, UMLS, and Orphanet mappings out of date #192

Open
joeflack4 opened this issue Feb 5, 2025 · 13 comments · May be fixed by #194

Comments

@joeflack4
Contributor

Overview

@twhetzel I don't know how important these are, if we're using them at all, but I just noticed some very old code: https://github.com/monarch-initiative/omim/blob/main/omim2obo/parsers/omim_txt_parser.py#L433

You can see the output of that function used here: https://github.com/monarch-initiative/omim/blob/main/omim2obo/main.py#L417-L424

This is hard coded to a local file called updated_01_2020_to_08_2021.json, but there's a todo about getting this from the OMIM API.

@joeflack4 joeflack4 self-assigned this Feb 5, 2025
@joeflack4 joeflack4 added the bug Something isn't working label Feb 5, 2025
@twhetzel
Contributor

twhetzel commented Feb 5, 2025

does anything call that method?

@joeflack4
Contributor Author

Yeah. You can see it in the second code block I linked. That code block, and the code directly below it, shows the IAO:0000142 and skos:exactMatch annotations that are created from this data.

@twhetzel
Contributor

twhetzel commented Feb 6, 2025

@joeflack4 can you summarize the consequences of that code block as it currently is, point out what data is affected by it, and say whether this relates to any data we are actually using?

@joeflack4
Contributor Author

The consequence, in summary, is the title itself: "OMIM's PMID, UMLS, and Orphanet mappings out of date".

Basically, that file is hard coded to a specific date: updated_01_2020_to_08_2021.json.

What's needed is to get this data through the API instead.

I don't know if these mappings are used for anything or how they might be used. I was wondering if you might know. If not, Nico should know.

@matentzn
Member

matentzn commented Feb 6, 2025

Are these mappings making their way into omim.owl? If not, delete the file to avoid confusion.

@twhetzel
Contributor

twhetzel commented Feb 6, 2025

"I don't know if these mappings are used for anything or how they might be used."

OK, this is what I was asking about: whether and how these mappings are used. @joeflack4 can you follow up on Nico's question and check whether these mappings are making their way into the omim.owl file and whether this other information is needed?

@joeflack4
Contributor Author

@matentzn I just want to be really clear before I start removing stuff. So, it is true that updated_01_2020_to_08_2021.json is currently being used to add these mappings to omim.owl. Also there is a legacy_omim.ttl that Dazhi created which also adds mappings. They're both old.

So when you say to delete the file, I take that to mean I can do either of these:
a. Remove these old files, as well as the code that processes them.
b. Remove these files, but leave the code, should we need it later. And add exception handling so that it doesn't error out if these files are missing.

Do you have a preference?
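
For option b, the exception handling could look something like the minimal sketch below. The function name `load_optional_mappings` is hypothetical and not taken from the omim repo; the only assumption is that the mapping file is plain JSON.

```python
import json
from pathlib import Path


def load_optional_mappings(path):
    """Return the parsed JSON mapping file, or an empty dict if it is absent.

    Hypothetical helper for illustration; not the repo's actual code.
    """
    try:
        with Path(path).open(encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        # Option b: a missing legacy file just means no extra mappings,
        # so the build continues instead of erroring out.
        return {}


# A missing file yields an empty mapping rather than an exception.
mappings = load_optional_mappings("updated_01_2020_to_08_2021.json")
print(len(mappings))
```

Only `FileNotFoundError` is caught here, so a file that exists but is corrupt would still fail loudly, which is probably what we want.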

@matentzn
Member

matentzn commented Feb 7, 2025

If they affect omim.owl they affect our alignment system, so no, in this case I think it's a bug, and these files need to be updated (ideally as part of the build). Does OMIM provide any up-to-date mappings?

@twhetzel
Contributor

twhetzel commented Feb 7, 2025

@matentzn how is this affecting the alignment system? The code looks like it's using the static file (updated_01_2020_to_08_2021.json) to add additional annotations to an OMIM class for PubMed, UMLS, and Orphanet. Where are these annotations used in downstream processing of the omim.owl file?


@matentzn
Member

matentzn commented Feb 7, 2025

The lexical matching will use the mappings for matching (lexmatch).

@twhetzel
Contributor

twhetzel commented Feb 7, 2025

Thanks, I see them now in the omim.sssom.tsv file

OMIM:615119	mitochondrial complex 4 deficiency, nuclear type 6	skos:exactMatch	MONDO:0014051		semapv:UnspecifiedMatching
OMIM:615119	mitochondrial complex 4 deficiency, nuclear type 6	skos:exactMatch	Orphanet:1561		semapv:UnspecifiedMatching
OMIM:615119	mitochondrial complex 4 deficiency, nuclear type 6	skos:exactMatch	UMLS:C3554534		semapv:UnspecifiedMatching
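
To check which external sources show up for a given OMIM class, rows like the ones above can be grouped by the prefix of the mapped ID. This is a rough sketch that reads columns by position (subject, label, predicate, object), since the header names of omim.sssom.tsv are not shown in this thread; `objects_by_prefix` is a hypothetical name.

```python
import csv
import io
from collections import defaultdict

# The three rows quoted above, tab-separated, with the empty fifth column kept.
ROWS = (
    "OMIM:615119\tmitochondrial complex 4 deficiency, nuclear type 6\t"
    "skos:exactMatch\tMONDO:0014051\t\tsemapv:UnspecifiedMatching\n"
    "OMIM:615119\tmitochondrial complex 4 deficiency, nuclear type 6\t"
    "skos:exactMatch\tOrphanet:1561\t\tsemapv:UnspecifiedMatching\n"
    "OMIM:615119\tmitochondrial complex 4 deficiency, nuclear type 6\t"
    "skos:exactMatch\tUMLS:C3554534\t\tsemapv:UnspecifiedMatching\n"
)


def objects_by_prefix(tsv_text):
    """Group (subject, predicate, object) triples by the object's CURIE prefix."""
    grouped = defaultdict(list)
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        subject, _label, predicate, obj = row[:4]
        prefix = obj.split(":", 1)[0]
        grouped[prefix].append((subject, predicate, obj))
    return dict(grouped)


for prefix, triples in sorted(objects_by_prefix(ROWS).items()):
    print(prefix, len(triples))
```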

FWIW, the OMIM API has a limit of 5,000 calls per day.
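
As a back-of-the-envelope check on that limit, here is a small sketch that estimates how many days a full refresh would take. The entry count and the `entries_per_call` batching are assumptions for illustration; whether the API accepts several entries per request would need to be checked against the OMIM API docs.

```python
import math

DAILY_CALL_LIMIT = 5000  # the per-day limit mentioned above


def days_to_refresh(entry_count, entries_per_call=1, daily_limit=DAILY_CALL_LIMIT):
    """Estimate the days needed to fetch every entry under a daily call limit."""
    if entries_per_call < 1 or daily_limit < 1:
        raise ValueError("entries_per_call and daily_limit must be positive")
    calls = math.ceil(entry_count / entries_per_call)
    return math.ceil(calls / daily_limit)


# Hypothetical entry count, one entry per call.
print(days_to_refresh(26000))
# Same count, assuming 20 entries could be batched into a single call.
print(days_to_refresh(26000, entries_per_call=20))
```

If the refresh does not fit in a single day, the build would need to cache earlier results or run incrementally.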

Also, IAO:0000142 is 'mentions'.

@joeflack4
Contributor Author

@twhetzel @matentzn I can figure it out from here you guys. There's already some starter code for calling the API for this.

@twhetzel I'll let you know if I run into any API rate limit issues.
@twhetzel Just need to know what you think the relative priority is for this. March board may be a no go as I have some vacation time next week.

@twhetzel
Contributor

twhetzel commented Feb 7, 2025

I left in Backlog for March and it's fine if it carries over to April. Thanks for checking.

@joeflack4 joeflack4 linked a pull request Feb 8, 2025 that will close this issue
3 participants