Q: Possible to derive from this a Scala based lexer? #204

Closed
Sciss opened this issue May 29, 2021 · 4 comments

Sciss commented May 29, 2021

Hi there. Since I'm not very familiar with VS Code or the regex mechanisms of JS / TS, I wanted to ask whether the definitions provided in this project could somehow be 'converted' for use in a pure Scala project, so that the project can create a syntax highlighting token stream from Scala 2 or Scala 3 sources.

Are there any entry points for such a 'conversion'? I guess I would have to construct java.util.regex.Matcher objects from the different branches and traverse the tree of matchers in some manner? Or perhaps use joni or tm4e.core?

Sciss commented May 29, 2021

Ok, I tried tm4e.core now, and it does seem to accept the Scala.tmLanguage.json file.

How does the selection between Scala 2 and Scala 3 happen? I don't see any obvious language version branching in the file.

Sciss commented May 29, 2021

Demo

Sciss closed this as completed May 29, 2021

MaximeKjaer (Contributor) commented

The project defines TextMate grammars for both Scala 2 and Scala 3. You can read more about how VS Code uses these in these docs.

There are multiple implementations of highlighters that use TextMate grammars; for instance, GitHub has one, VS Code has one, and of course TextMate has one too. So it would definitely be possible to write another one in pure Scala.
To implement such a highlighter, you would have to implement a pushdown automaton that takes a description of the automaton in JSON format and uses regexes to evaluate the transitions between states.
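
To make the pushdown idea concrete, here is a minimal sketch in Java (callable from Scala as-is). It is only an illustration under several assumptions: the two-state grammar is hard-coded rather than loaded from JSON, it uses java.util.regex instead of Oniguruma, and the names `TinyHighlighter`, `Rule`, `Token` and the scope names are all made up for this example. A real TextMate engine also handles captures, includes and state carried across lines.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy highlighter: a regex-driven pushdown automaton, not a full TextMate engine.
final class TinyHighlighter {

    /** One transition: a regex, the scope for the matched text, and a stack action. */
    static final class Rule {
        final Pattern pattern; final String scope; final String push; final boolean pop;
        Rule(String regex, String scope, String push, boolean pop) {
            this.pattern = Pattern.compile(regex);
            this.scope = scope; this.push = push; this.pop = pop;
        }
    }

    static final class Token {
        final String text; final String scope;
        Token(String text, String scope) { this.text = text; this.scope = scope; }
        @Override public String toString() { return scope + ":" + text; }
    }

    // Made-up grammar: "source" knows two keywords and can enter "string"; "string" can leave again.
    static final List<Rule> SOURCE_RULES = Arrays.asList(
        new Rule("\\b(?:val|def)\\b", "keyword", null, false),
        new Rule("\"", "string", "string", false));
    static final List<Rule> STRING_RULES = Arrays.asList(
        new Rule("\\\\.", "escape", null, false),
        new Rule("\"", "string", null, true));

    static List<Rule> rulesFor(String state) {
        return state.equals("string") ? STRING_RULES : SOURCE_RULES;
    }

    static List<Token> tokenize(String line) {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("source");
        List<Token> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < line.length()) {
            String state = stack.peek();
            // Pick the rule of the current state whose match starts earliest (first rule wins ties).
            Rule best = null;
            int bestStart = -1;
            int bestEnd = -1;
            for (Rule rule : rulesFor(state)) {
                Matcher m = rule.pattern.matcher(line);
                if (m.find(pos) && m.end() > m.start() && (best == null || m.start() < bestStart)) {
                    best = rule; bestStart = m.start(); bestEnd = m.end();
                }
            }
            if (best == null) {                       // no transition applies: rest of line keeps the state's scope
                tokens.add(new Token(line.substring(pos), state));
                break;
            }
            if (bestStart > pos) {                    // unmatched text before the transition
                tokens.add(new Token(line.substring(pos, bestStart), state));
            }
            tokens.add(new Token(line.substring(bestStart, bestEnd), best.scope));
            if (best.pop) stack.pop();                // leave the current state
            if (best.push != null) stack.push(best.push); // enter a nested state
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("val s = \"val\""));
    }
}
```

The point of the stack is visible in the example input: the second `val` sits inside a string, so the keyword rule is never tried there and the text is reported with the string scope.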

Do note, however, that to be fully compliant with the TextMate grammar spec, you have to use a regex engine that supports a specific regex variant called Oniguruma regular expressions. Not everybody does this correctly; for instance, GitHub gets it wrong. It is possible, however, that this is a conscious design decision; if you cannot use an Oniguruma regex engine for whatever reason, perhaps another engine can get you close to being correct.
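
As one small illustration of how the dialects can diverge, consider `\h`: Oniguruma documents it as a hexadecimal digit, while java.util.regex (Java 8 and later) treats it as horizontal whitespace. The sketch below only shows the Java side; the class name `RegexDialectCheck` is made up, and I am not claiming the Scala grammar relies on this particular escape.

```java
import java.util.regex.Pattern;

// Hypothetical check: shows how java.util.regex interprets \h.
final class RegexDialectCheck {

    // Oniguruma reads \h as [0-9a-fA-F]; java.util.regex reads it as horizontal whitespace.
    static final Pattern H_RUN = Pattern.compile("\\h+");

    /** True if the whole input matches \h+ under java.util.regex semantics. */
    static boolean javaMatches(String input) {
        return H_RUN.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(javaMatches("cafe")); // false in Java; a hex-digit reading would accept it
        System.out.println(javaMatches(" \t"));  // true in Java: spaces and tabs are horizontal whitespace
    }
}
```

A grammar pattern using such a construct would compile without error under the wrong engine and simply highlight differently, which is why a port to another engine needs checking pattern by pattern.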

Having a quick look at both joni and tm4e, it seems that tm4e basically implements the above.

We make no distinction between Scala 2 and Scala 3; the highlighter aims to work for both. It would be very difficult for our grammar to distinguish one from the other, so we make no attempt to do so.

Sciss commented May 31, 2021

Thank you for the explanation. Yes, it seems I can use tm4e now.
