Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for enforcing upper/lower case #836

Open
ctdunc opened this issue Jan 16, 2025 · 6 comments
Open

Add support for enforcing upper/lower case #836

ctdunc opened this issue Jan 16, 2025 · 6 comments

Comments

@ctdunc
Copy link
Contributor

ctdunc commented Jan 16, 2025

First off, thank you for developing this awesome tool. I went from fearing the process of writing a code formatter to getting a prototype up in an afternoon.

Is your feature request related to a problem? Please describe.
I am trying to write a formatter for a proprietary dialect of SQL, and would like to be able to enforce that identifiers be either lower or upper case depending on the users configuration. As far as I can tell, this is not supported.

Describe the solution you'd like
Something like

(some_loud_symbol) @upper_case
(some_quiet_symbol) @lower_case

would take the text captured in some_{loud,quiet}_symbol and make it upper/lower case respectively.

Describe alternatives you've considered
We were using sqlfluff before, but it doesn't like the dsl-specific features of this language.

I would be interested in trying to contribute this if it's something that seems like a good fit for the project.

@Xophmeister
Copy link
Member

This seems like an interesting idea.

My immediate thoughts are that its usefulness will:

  1. ...require languages that aren't case sensitive. These are quite rare; SQL is probably the "go to" example. These captures would therefore have limited utility.

  2. ...depend on the Tree-sitter grammar. Anecdotally, many grammars don't use nodes to mark keywords, so to apply the case changing properly, one would have to target an appropriate anonymous node. In a fictitious SQL grammar, for example, we may have:

    (select_statement
      "select" @upper_case
    )

    However, in SQL's case, I presume the anonymous node is sufficiently specified to allow any casing in the keyword. As such, the correct query would probably be closer to:

    (select_statement
      .
      _ @upper_case
    
      (#not-eq? @upper_case "SELECT")
    )

    (Note: We have to use @upper_case in the predicate because there's currently no other way; see Using Tree-sitter predicates requires us to use Topiary capture names #824.)

That's not to say that these are barriers -- not at all -- but just to demonstrate that its usage mightn't be so clear-cut.

@ctdunc
Copy link
Contributor Author

ctdunc commented Jan 17, 2025

Thanks! I think these barriers are probably globally solved by adding a "watch out for potential footguns" note in the documentation.

As for the specifics:

  1. I still think its useful for case sensitive languages. e.g. for python, one may want to rename constangs declared at the beginning of a module to be upper case. This use case is out of scope since youd need a way to find all instances of the variable throughout a project, but still might be useful in a #warn! directive.

  2. For my use case, the keywords are all their own nodes, so this is a really basic alternation of those nodes.

Also, offtopic before I open a new issue (dont want to spam), but do you support (or plan to support) neovim style ;extends to merge query files? Would be helpful for query distribution.

Thanks again!

@Xophmeister
Copy link
Member

Also, offtopic before I open a new issue (dont want to spam), but do you support (or plan to support) neovim style ;extends to merge query files? Would be helpful for query distribution

I'm not familiar with what NeoVim does here, but we're using Nickel as our configuration language and provide the -M/--merge-configuration option to access Nickel's configuration merging functionality. Does this achieve the same as your NeoVim case?

@ctdunc
Copy link
Contributor Author

ctdunc commented Jan 17, 2025

I am not super familiar with nickel, but I don't think so. I am talking about merging the .scm files, not language configs.

For example, if I have by default that all queries should be uppercase in my distributed formatter query, but want to add the ability for a user to override that by including their own query, it might be like

; $NEOVIM_PLUGIN_INSTALL_LOCATION/sql/sql.scm
; ...
[
(... keyword_nodes)
] @upper_case
; ...
;extends
; $NEOVIM_USER_CONFIG/queries/sql/sql.csm

[
 (...keyword_nodes)
] @lower_case

would prefer matches from the second file. I am thinking about distributing this formatter through conform, and want to be able to add some user configuration.

See https://neovim.io/doc/user/treesitter.html#treesitter-query-modeline-extends

LMK if I should open a new issue for this one.

@Xophmeister
Copy link
Member

Right, I understand. No, Topiary doesn't currently have the ability to modularise the query files in any way. Again, this is just my gut impression -- please don't let me put you off! -- but query files are parsed by Tree-sitter verbatim and, AFAIK, Tree-sitter doesn't have any merge/modularisation functionality. As such, we'd have to parse the query files twice to achieve this; even if the first parse is relatively inexpensive, I think it's something we'd be keen to avoid.

You can certainly add this as a feature request, but this is just something to bear in mind. Indeed, given that we're leaning on Tree-sitter to do this, it might be worthwhile adding it as a feature request to Tree-sitter itself, instead.

@ctdunc
Copy link
Contributor Author

ctdunc commented Jan 17, 2025

I have another branch with the SQL-specific logic on it, but felt that it was out of scope for this PR.

https://github.com/ctdunc/topiary/tree/feature/sql if you would like to test the features in a language where case doesn't matter

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants