-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow using an alternative tokenizer #65
base: main
Are you sure you want to change the base?
Allow using an alternative tokenizer #65
Conversation
21a9eda
to
0c69542
Compare
|
||
def __init__( | ||
self, tokengen: Iterator[tokenize.TokenInfo], *, path: str = "", verbose: bool = False | ||
self, | ||
tokengen: Iterator[TokenInfo[TokenType]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not name it token_generator
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This predate this work so I did not feel like changing it.
src/pegen/tokenizer.py
Outdated
line: str | ||
|
||
|
||
class TokenTypes(Generic[TokenType]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the annotations of abstract methods in this class don't result in an exception when they are not implemented as the class doesn't derive from ABC
. if that is intended, it should be mentioned, or the methods can raise a NotImplementedError
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in latest commit
This needs rebasing 😅 |
0c69542
to
8e2449a
Compare
Rebased but will #71 to pass on 3.10 |
7172b48
to
cfb6a74
Compare
Rebased again so that CIs should pass @pablogsal once the 3.11beta5 is live could you review this ? |
8ec1174
to
6251ba8
Compare
@pablogsal I resolved the conflict and hopefully fixed the 3.8 failure. Could you please review ? |
5aae014
to
3cefc85
Compare
It would be nice to finally merge this |
if type in exact_token_types: | ||
if tok.type == exact_token_types[type]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be nicer to use hasattr and getattr? Not all objects have dict
I am not sure this approach is the nicest and it looks a bit brittle in places but it would allow to use alternate tokenizers with a well defined interface which would be nice.
Comments welcome !