
Your observation seems plausible at first. However, on second thought, I have concluded otherwise.

In natural language, a part of speech is actually determined syntactically. By comparison, the "type" of a token in a formal language is also syntactic. Any highlighter must recognize the syntactic structure of its target language and work like a parser, that is, not like a tokenizer. You have to "push" something to the stack, if you know what I mean.

To give a bad, but intuitive example, if a highlighter assigned "gray" to every "=" character it comes across, it would not be able to highlight the same characters inside a quotation mark.
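A minimal sketch of that bad example, assuming a hypothetical `naive_highlight` function that colours one character at a time and keeps no state about where it is in the source:

```python
def naive_highlight(source: str) -> list[tuple[str, str]]:
    """Colour each character independently, with no notion of context."""
    return [(ch, "gray" if ch == "=" else "default") for ch in source]


spans = naive_highlight('x = "a=b"')

# Indices of the characters coloured gray. The "=" inside the string
# literal is treated exactly like the assignment operator.
gray_positions = [i for i, (_, colour) in enumerate(spans) if colour == "gray"]
```

Here the function has no way to tell the two "=" characters apart, because it never records that a quotation mark has been opened.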



> Any highlighter must recognize the syntactic structure of its target language and work like a parser, that is, not like a tokenizer. You have to "push" something to the stack, if you know what I mean.

How often does syntax matter in "syntax highlighting"? Most highlighters I've used don't even try to recognise syntactic structure.

> To give a bad, but intuitive example, if a highlighter assigned "gray" to every "=" character it comes across, it would not be able to highlight the same characters inside a quotation mark.

The problem with this example is that strings are a single token.
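To illustrate, here is a rough sketch of a tokenizer-only highlighter, with no stack and no parser. The names `TOKEN_RE`, `COLOURS` and `highlight`, and the colour choices, are made up for this example; real highlighters differ in the details.

```python
import re

# One regular expression, one alternative per token kind.
# Inner groups are non-capturing so that m.lastgroup is the token kind.
TOKEN_RE = re.compile(r'''
    (?P<string>"(?:\\.|[^"\\])*")   # a whole double-quoted literal, escapes included
  | (?P<operator>=)                 # a bare "=" outside any string
  | (?P<other>[^"=]+|")             # anything else, including a stray quote
''', re.VERBOSE)

COLOURS = {"string": "green", "operator": "gray", "other": "default"}


def highlight(source: str) -> list[tuple[str, str]]:
    """Split source into tokens and pair each token with a colour."""
    return [(m.group(), COLOURS[m.lastgroup]) for m in TOKEN_RE.finditer(source)]


result = highlight('x = "a=b"')
```

The string alternative consumes `"a=b"` in one match, so the "=" inside it is never offered to the operator rule. Only the assignment "=" ends up gray.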



