Ensure that now-present chain identifiers in the GOA GPIs are correctly propagated #123

Open
2 tasks
kltm opened this issue Feb 4, 2025 · 7 comments

Comments

@kltm
Member

kltm commented Feb 4, 2025

Recently, GOA started supplying chain identifiers in their GPIs. This ticket is to ensure that they:

  • are being correctly propagated into the NEO load and
  • do not cause any issues with Noctua curation

An example would be: UniProtKB:A0A3B3IS91-PRO_0000454209
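To make the shape of such an identifier concrete, here is a minimal Python sketch that splits it into prefix, accession, and chain. The pattern is an assumption inferred only from the single example above (`<prefix>:<accession>-PRO_<digits>`), and `parse_entity_id` is a hypothetical helper, not part of any GO or GOA tooling. Other suffix styles (for example isoforms) are deliberately rejected rather than guessed at.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape, inferred from the one example in this ticket:
#   <prefix>:<accession>            (no chain)
#   <prefix>:<accession>-PRO_<n>    (with chain)
CURIE_RE = re.compile(
    r"^(?P<prefix>[^:\s]+):(?P<accession>[^-\s]+)(?:-(?P<chain>PRO_\d+))?$"
)


class ParsedId(NamedTuple):
    prefix: str
    accession: str
    chain: Optional[str]  # None when the identifier carries no chain


def parse_entity_id(curie: str) -> ParsedId:
    """Split an identifier into prefix, accession, and optional chain (hypothetical helper)."""
    m = CURIE_RE.match(curie.strip())
    if m is None:
        # Anything outside the assumed shape is rejected, not guessed at.
        raise ValueError(f"unrecognized identifier shape: {curie!r}")
    return ParsedId(m["prefix"], m["accession"], m["chain"])


parsed = parse_entity_id("UniProtKB:A0A3B3IS91-PRO_0000454209")
print(parsed.accession, parsed.chain)
```

If the parent accession and the chain both need to survive the load, comparing `parsed.accession` and `parsed.chain` before and after propagation would be one way to check it.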

tagging @pgaudet @vanaukenk

@kltm
Member Author

kltm commented Feb 4, 2025

@pgaudet These identifiers do not seem to appear in my load created on 2025-01-31. Could you confirm the date and location, just to be sure?

@kltm
Member Author

kltm commented Feb 4, 2025

@pgaudet @vanaukenk I'm not sure what project this should belong to; I've temporarily added it to the pipeline/data QC rolling project. Please feel free to move it.

@pgaudet

pgaudet commented Feb 4, 2025

> These identifiers do not seem to appear in my load created on 2025-01-31. Could you confirm the date and location, just to be sure?

That seems right. The latest 'official' file, according to our yaml, is timestamped 2024-12-21 17:57.

The new files for the future GOC-GOA data exchange pipeline do contain the chains, if you want to test.

Thanks, Pascale

@kltm
Member Author

kltm commented Feb 4, 2025

@pgaudet Okay, I had assumed that these were in the "main" GAFs and GPIs we were picking up, but this looks like it's a little more in-progress, for the future.
It looks like the files at https://ftp.ebi.ac.uk/pub/contrib/goa/panther_proteomes/ would be for the post-GOA derivative production at the GO site. I'll download the current version there and try it in our derivatives-from-goa pipeline.

kltm added a commit to geneontology/pipeline that referenced this issue Feb 4, 2025
kltm added a commit to geneontology/pipeline-from-goa that referenced this issue Feb 4, 2025
kltm added a commit to geneontology/pipeline-from-goa that referenced this issue Feb 4, 2025
kltm added a commit to geneontology/pipeline that referenced this issue Feb 4, 2025
@pgaudet

pgaudet commented Feb 5, 2025

Sounds good!

@kltm
Member Author

kltm commented Feb 6, 2025

Well, thinking this through: even if I go through the new pipeline, it doesn't examine GPI files, so there is no real test. Can I confirm with you that this only affects GPI files? Is it only new entities being added, with no change to current entities (give or take)?
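In the absence of a pipeline-level test, one rough way to check a GPI file directly is to scan its tab-separated fields for chain-style identifiers. The sketch below assumes that header and comment lines start with `!` and that a chain identifier occupies a whole field, with or without a `prefix:` in front. `find_chain_ids` is a hypothetical helper, and the sample rows are made up for illustration (only the chain identifier is taken from this ticket).

```python
import re
from typing import Iterable, List

# Assumed shape of a chain-bearing field: optional "<prefix>:" and then
# "<accession>-PRO_<digits>", filling the whole tab-separated field.
CHAIN_FIELD_RE = re.compile(r"^(?:[^:\s]+:)?[A-Za-z0-9]+-PRO_\d+$")


def find_chain_ids(lines: Iterable[str]) -> List[str]:
    """Return every field that looks like a chain identifier (hypothetical helper)."""
    found: List[str] = []
    for line in lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        for field in line.rstrip("\n").split("\t"):
            if CHAIN_FIELD_RE.match(field):
                found.append(field)
    return found


# Made-up sample rows; a real run would iterate over an open GPI file instead.
sample = [
    "!gpi-version: 2.0\n",
    "UniProtKB:A0A3B3IS91-PRO_0000454209\texample-symbol\n",
    "UniProtKB:P12345\texample-symbol\n",
]
print(find_chain_ids(sample))
```

Running the same scan on the input GPI and on the loaded output, then comparing the two lists, would show whether any chain identifiers were dropped along the way.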

@pgaudet

pgaudet commented Feb 6, 2025

The only existing entities that I can think of are the SARS-CoV genes, for which we had somehow loaded chains back in 2020.
These should be included here too, I assume in the same format.
