make life easier #171
base: master
calculate tfidf using the summary instead of the entire text; calculate tfidf using the old vocab to make training faster; refresh all tfidf data on the first day of the month to fix the vocab; calculate similarity with an updating strategy instead of recalculating, to save time; change similarity object structure
use an infinite loop to sync all papers; fix some bugs with PDF download errors; add a log of the time left; make code more readable
fetch all papers by time; add some more ai repositories such as cat:eess.AS, cat:eess.IV, etc.; loop infinitely to download data until success; ordinary sync every day, starting 10 days before the last paper's updated date; monthly sync tracking the last 3 months of published data, because of arxiv data errors; make code more readable
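The "updating strategy" for similarity can be sketched as follows. This is a minimal illustration, not the PR's actual code: the names `cosine` and `add_papers`, the sparse `{term: weight}` vector layout, and the `(score, paper_id)` neighbor lists are all assumptions made for the example. The idea is that only pairs involving a new paper are scored, while scores between already-known papers are reused.

```python
import heapq
import math


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {term: weight} dicts."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def add_papers(vectors, neighbors, new_vectors, k=3):
    """Add new papers and update the top-k neighbor lists in place.

    vectors:     {paper_id: sparse vector} for papers already indexed
    neighbors:   {paper_id: [(score, other_id), ...]} sorted best first
    new_vectors: {paper_id: sparse vector} for papers to add (hypothetical layout)
    """
    for pid, vec in new_vectors.items():
        scored = []
        for other, ovec in vectors.items():
            s = cosine(vec, ovec)
            scored.append((s, other))
            # The new paper may displace an entry in the old paper's list.
            neighbors[other] = heapq.nlargest(k, neighbors[other] + [(s, pid)])
        neighbors[pid] = heapq.nlargest(k, scored)
        vectors[pid] = vec
```

With `n` indexed papers and `m` new ones, this scores roughly `n * m` pairs instead of recomputing all `(n + m)^2`, which is where the time saving would come from under these assumptions.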
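A rough sketch of the two sync ideas described above: retrying a download until it succeeds, and starting the daily sync 10 days before the last paper's updated date. The function names, the exponential backoff, and the injectable `sleep` parameter are assumptions added for illustration; the PR's own loop may differ.

```python
import time
from datetime import datetime, timedelta


def sync_start(last_updated, overlap_days=10):
    """Return the date to start syncing from: a few days before the
    last paper's updated date, so late corrections are picked up."""
    return last_updated - timedelta(days=overlap_days)


def fetch_until_success(fetch, delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call fetch() in an infinite loop until it stops raising OSError.

    The doubling delay is an assumption of this sketch, added so that a
    failing server is not hammered with back-to-back requests.
    """
    while True:
        try:
            return fetch()
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            print(f"fetch failed ({exc}); retrying in {delay:.0f}s")
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

Passing `sleep` as a parameter keeps the loop easy to exercise without actually waiting.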
do not use anymore
update because similarity structure changed
add some time utilities; remove unused txt_dir
remove things about parse_pdf_to_text.py
change code because of tfidf meta structure change
used by download_pdfs.py and thumb_pdf.py
update commands
add more repositories
add more repositories
change tittxt style
update ui with new repositories
change repositories
add more utilities
codes run much faster
codes run faster and more readable
add 403 check
update because make_cache.py changed
move constants to utils
a single file that handles fetching, downloading, analyzing, etc.
update information about all_in_one.py
misspelling fix
fetch contents add cat:eess.SP,cat:eess.SY
add cat:eess.SP,cat:eess.SY
add eess.[SP|SY]
add eess.[SP|SY]
add categories
fix incorrect display
simplify the list-to-set conversion code; update vocab every half year; add run method for outer call
make code more readable; add run method for outer call
add run method for outer call
add file suffix to not ready files
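One common way to mark files that are not ready is to write under a temporary suffix and rename once the write completes. The sketch below assumes a `.part` suffix and a helper named `write_when_ready`; both are hypothetical and not taken from the PR.

```python
import os


def write_when_ready(path, data, suffix=".part"):
    """Write data to path + suffix, then rename to path once complete.

    Readers that only look for the final name never see a half-written
    file; an interrupted write leaves only the suffixed file behind.
    """
    tmp = path + suffix
    with open(tmp, "wb") as f:
        f.write(data)
    os.replace(tmp, path)  # overwrites an existing file on all platforms
    return path
```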
add run method for outer call; remove repository of cs.CY, eess.SP and eess.SY
remove repository of cs.CY,eess.SP and eess.SY
remove repository of cs.CY,eess.SP and eess.SY
make code more readable
remove repository of cs.CY,eess.SP and eess.SY
remove repository of cs.CY,eess.SP and eess.SY
remove repository of cs.CY,eess.SP and eess.SY
tittxt revert to 600px
change log content: convert to magick; simplify the set-to-list conversion code
remove user warning
fix bug of waiting time
delete the code that downloads at 14:00
add listener to change state
fix some bugs
fix some bugs
change timeout
+1 for a functionality update and sanity pass that cleans the code up a bit. I have not technically reviewed this code, only skimmed it to check the high-level temperature of the changes. The get_time function at line 40 of the proposed get_papers.py is a little worrying to me, though, as a canary for the kind of wholesale changes being made to the codebase. That said, my feel for the rest of the changes is positive: they fill in a lot of glaring feature gaps in the original codebase, and much of the code does seem to simplify the original in several areas, so this is a net positive. In any case, +1 for this kind of effort, and thanks for sharing it (not an admin or mod, just someone with a vested interest in ASV's upkeep :D).
Thanks for your approval, I really appreciate it :)
fix bug of infinite loop
replace the cmd call with a python method to account for platform differences
make it more clear
fix misspellings
fix bug of not removing
convert is too slow, so change to a threaded version
decide thread count by cpu count
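Sizing a thread pool from the CPU count can be sketched like this. The helper name `run_threaded` and its parameters are assumptions for illustration; `os.cpu_count()` and `ThreadPoolExecutor` are standard library calls. Threads suit this kind of job when each task mostly waits on an external process or on I/O.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_threaded(task, items, workers=None):
    """Run task(item) for every item on a pool sized by the CPU count.

    os.cpu_count() can return None, so fall back to a single worker.
    Results come back in the same order as items.
    """
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(task, items))
```

For example, `run_threaded(make_thumbnail, pdf_paths)` would process a list of paths, where `make_thumbnail` is a hypothetical per-file function.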
fix a bug
make code more readable and faster