python.gram Rust/C++/Borgo compatible match syntax: parse without delimiters #107
Comments
Replacing Here's the fundamental issue:
works fine. However,
fails to parse because there is a So after parsing How can I change the grammar so that INDENT means newline + n spaces and DEDENT means newline - n spaces? In this example n=4. If it's not possible because this is baked deeply into the Python grammar, I could give up, accept the least intrusive delimiter, and move on.
I spent some time looking into how DEDENT is handled in the tokenizer. My reading of the code is that you're using Python's C tokenizer and that this behavior comes from there. I understand the motivation is to maximize compatibility. Would it make sense to have another tokenizer that handles DEDENT explicitly, for cases where fine-grained control matters more than compatibility?
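To make the DEDENT behavior concrete, here is a small sketch using the standard library's `tokenize` module. It is not this project's tokenizer, and I'm assuming it matches the C tokenizer for this simple case. The sample source is made up for illustration. It shows that when indentation drops by two levels at once, two DEDENT tokens are emitted back to back, after the NEWLINE that ends the last indented line.

```python
import io
import tokenize

# Hypothetical sample: two nested blocks closed at once by the last line.
SOURCE = (
    "if x:\n"
    "    if y:\n"
    "        a = 1\n"
    "b = 2\n"
)


def layout_tokens(source):
    """Return the names of the NEWLINE/INDENT/DEDENT tokens, in order."""
    wanted = {tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT}
    readline = io.StringIO(source).readline
    return [
        tokenize.tok_name[tok.type]
        for tok in tokenize.generate_tokens(readline)
        if tok.type in wanted
    ]


names = layout_tokens(SOURCE)
print(names)
```

The two consecutive DEDENT entries near the end of the printed list are the tokens a grammar rule has to account for when a nested block ends without an explicit delimiter.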
https://github.com/adsharma/python-grammar/blob/main/tokenizer.py has a pure-Python implementation of the tokenizer. I haven't tested it with the generated parser yet, but it serves as a useful starting point for discussing how the tokenizer should handle DEDENT.
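As a starting point for that discussion, here is a rough sketch of what "INDENT means newline + n spaces, DEDENT means newline - n spaces" could look like with a fixed width. This is not the code in the linked tokenizer.py; `indent_events` and its event tuples are hypothetical names made up for illustration, and the sketch ignores tabs, comments, and line continuations.

```python
def indent_events(lines, width=4):
    """Yield ("INDENT", ""), ("DEDENT", "") and ("LINE", text) events.

    Assumes indentation is always a multiple of `width` spaces and that a
    block can only get deeper by one level at a time.
    """
    level = 0
    for raw in lines:
        text = raw.strip()
        if not text:
            continue  # blank lines do not change the indentation level
        spaces = len(raw) - len(raw.lstrip(" "))
        if spaces % width:
            raise ValueError(f"indentation is not a multiple of {width}: {raw!r}")
        new_level = spaces // width
        if new_level > level + 1:
            raise ValueError(f"indented by more than one level: {raw!r}")
        if new_level == level + 1:
            yield ("INDENT", "")
        # One DEDENT per level closed, so a two-level drop yields two DEDENTs.
        for _ in range(level - new_level):
            yield ("DEDENT", "")
        level = new_level
        yield ("LINE", text)
    # Close any blocks still open at end of input.
    for _ in range(level):
        yield ("DEDENT", "")


events = list(indent_events(["a:", "    b:", "        c", "d"]))
print(events)
```

With explicit events like these, the decision about where a block ends lives in one place, which is the kind of fine-grained control I'm asking about.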
Test case
Modified Grammar
If I remove the endpmatch delimiter from both the test case and the grammar, we fail to parse.

I've tried running parser.py -vv test3.py and tried to understand why it fails, but it isn't entirely clear.

Is there a solution to this problem such as: invalid_foo rules

I would really like to avoid using endpmatch.