Description
In order to address concerns in #11, #12, wycats/language-reporting#6, and probably others, I've been experimenting with merging the APIs of `codespan`/`language-reporting`/`annotate-snippets`, and the API surface below is what I think makes the most sense.
NOTE: the suggested API has changed multiple times from feedback, see conversation starting at this comment for the most recent API and discussion.
Original Proposal
An experimental implementation of the API based on #12 is at CAD97/retort#1 (being pushed within 24 hours of posting, I've got one last bit to "port" but I've got to get to bed now but I wanted to get this posted first).
EDIT: I've reconsidered this API, though the linked PR does implement most of it. I'm sketching a new slightly lower-level design from this one, and the diagnostic layout of this current API will probably be a wrapper library around `annotate-snippets`. (I get to use the `retort` name!)
API
```rust
use termcolor::WriteColor;

trait Span: fmt::Debug + Copy {
    type Origin: ?Sized + fmt::Debug + Eq;

    fn start(&self) -> usize;
    fn end(&self) -> usize;
    fn new(&self, start: usize, end: usize) -> Self;
    fn origin(&self) -> &Self::Origin;
}

trait SpanResolver<Sp> {
    fn first_line_of(&mut self, span: Sp) -> Option<SpannedLine<Sp>>;
    fn next_line_of(&mut self, span: Sp, line: SpannedLine<Sp>) -> Option<SpannedLine<Sp>>;
    fn write_span(&mut self, w: &mut dyn WriteColor, span: Sp) -> io::Result<()>;
    fn write_origin(&mut self, w: &mut dyn WriteColor, origin: Sp) -> io::Result<()>;
}

#[derive(Debug, Copy, Clone)]
pub struct SpannedLine<Sp> {
    line_num: usize,
    char_count: usize,
    span: Sp,
}

impl Span for (usize, usize) {
    type Origin = ();
}

impl<Sp: Span<Origin=()>> Span for (&'_ str, Sp) {
    type Origin = str;
}

impl<Sp: Span> SpanResolver<Sp> for &str
where Sp::Origin: fmt::Display;

mod diagnostic {
    #[derive(Debug, Clone)]
    struct Diagnostic<'a, Sp: Span> {
        pub primary: Annotation<'a, Sp>,
        pub code: Option<Cow<'a, str>>,
        pub secondary: Cow<'a, [Annotation<'a, Sp>]>,
    }

    #[derive(Debug, Clone)]
    struct Annotation<'a, Sp: Span> {
        pub span: Sp,
        pub level: Level,
        pub message: Cow<'a, str>,
    }

    #[derive(Debug, Copy, Clone, Eq, PartialEq, Hash)]
    enum Level {
        Err,
        Warn,
        Info,
        Hint,
    }

    impl<Sp: Span> Diagnostic<'_, Sp> {
        pub fn borrow(&self) -> Diagnostic<'_, Sp>;
        pub fn into_owned(self) -> Diagnostic<'static, Sp>;
    }

    impl<Sp: Span> Annotation<'_, Sp> {
        pub fn borrow(&self) -> Annotation<'_, Sp>;
        pub fn into_owned(self) -> Annotation<'static, Sp>;
    }

    impl fmt::Display for Level;
}

mod style {
    #[derive(Debug, Copy, Clone, Eq, PartialEq, Hash)]
    enum Mark {
        None,
        Start,
        Continue,
        End,
    }

    #[non_exhaustive]
    #[derive(Debug, Copy, Clone)]
    pub enum Style {
        Base,
        Code,
        Diagnostic(Level),
        LineNum,
        TitleLine,
        OriginLine,
    }

    trait Stylesheet {
        fn set_style(&mut self, w: &mut dyn WriteColor, style: Style) -> io::Result<()>;
        fn write_marks(&mut self, w: &mut dyn WriteColor, marks: &[Mark]) -> io::Result<()>;
        fn write_divider(&mut self, w: &mut dyn WriteColor) -> io::Result<()>;
        fn write_underline(
            &mut self,
            w: &mut dyn WriteColor,
            level: Level,
            len: usize,
        ) -> io::Result<()>;
    }

    struct Rustc;
    impl Stylesheet for Rustc;
    // other styles in the future
}

mod renderer {
    fn render<'a, Sp: Span>(
        w: &mut dyn WriteColor,
        stylesheet: &dyn Stylesheet,
        span_resolver: &mut dyn SpanResolver<Sp>,
        diagnostic: &'a Diagnostic<'a, Sp>,
    ) -> io::Result<()>;

    fn lsp<'a, Sp: Span + 'a>(
        diagnostics: impl IntoIterator<Item = Diagnostic<'a, Sp>>,
        source: Option<&'_ str>,
        span_resolver: impl FnMut(Sp) -> lsp_types::Location,
    ) -> Vec<lsp_types::PublishDiagnosticsParams>;
}
```
Notes:
- I've skipped imports and implementation bodies for clarity. All definitions are exported where I've written them.
- I've liberally used `dyn Trait`, so the only monomorphization should be over the `Span` type.
- I'm not particularly attached to any of the organization of exports, things can move around.
- `Span::new` is only used for `impl SpanResolver<impl Span> for &str`; making that impl more specific can get rid of that trait method.
- `SpanResolver` takes `&mut` for its methods primarily because it can, in order to allow use of a single-threaded DB that requires `&mut` access for caching as a span resolver.
- `Span` resolution is passed through `SpanResolver` at the last moment such that a `SpanResolver` can supply syntax highlighting for errors.
- `SpanResolver::write_origin` only gets `io::Write` because styling is done ahead of time by `Stylesheet`. Because `WriteColor` does not have an upcast method, this means we can't use `dyn WriteColor` anywhere that will end up calling `SpanResolver::write_origin`. This can be changed to take `WriteColor` if desired.
- `Diagnostic`'s layout is tuned to have similar layout to the language server protocol's `Diagnostic`.
- `Diagnostic` is set up so that `Diagnostic<'_, Sp>` can be borrowed but also an owned `Diagnostic<'static, Sp>` can be produced by using `Cow`s. This eases use with constructed diagnostics.
- Potential style improvement: extend `Style::Code` to be an enum of general code token types (e.g. the list from pygments), `SpanResolver::write_span` just gets the ability to set the style to one of those, which goes through the `Stylesheet` for styling.
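To make the `borrow`/`into_owned` note concrete, here is a minimal sketch of the maybe-owned pattern using only `std`. It is an illustration under assumptions, not the code from the linked PR: `Annotation` is cut down to a concrete `(usize, usize)` span instead of a generic `Sp: Span`, and `collect_todo_notes` is a made-up analysis pass.

```rust
use std::borrow::Cow;

/// Simplified stand-in for the proposal's `Annotation` (assumption: a
/// concrete byte span replaces the generic `Sp: Span` parameter).
#[derive(Debug, Clone, PartialEq)]
pub struct Annotation<'a> {
    pub span: (usize, usize),
    pub message: Cow<'a, str>,
}

impl Annotation<'_> {
    /// Cheap view: the returned value borrows the message from `self`.
    pub fn borrow(&self) -> Annotation<'_> {
        Annotation {
            span: self.span,
            message: Cow::Borrowed(&*self.message),
        }
    }

    /// Detach from the input lifetime. Allocates only if the message was
    /// still borrowed; an already-owned `String` is moved, not copied.
    pub fn into_owned(self) -> Annotation<'static> {
        Annotation {
            span: self.span,
            message: Cow::Owned(self.message.into_owned()),
        }
    }
}

/// Hypothetical analysis pass: builds up owned annotations over time, so no
/// parallel "owned" mirror type is needed before rendering.
pub fn collect_todo_notes(source: &str) -> Vec<Annotation<'static>> {
    source
        .match_indices("TODO")
        .map(|(start, hit)| Annotation {
            span: (start, start + hit.len()),
            message: Cow::Owned(format!("unresolved TODO at byte {}", start)),
        })
        .collect()
}
```

A renderer written against `Annotation<'_>` would accept both the collected `'static` values and short-lived borrowed ones, which is the point of using `Cow` as maybe-owned rather than as copy-on-write.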
Activity
CAD97 commented on Oct 12, 2019
cc list of potentially interested parties:
- (`codespan`)
- (`language-reporting`)
- (`annotate-snippets`)
- (`rowan` + `codespan`)
zbraniecki commented on Oct 12, 2019
Hi @CAD97 !
Thanks for taking a look at this and I'm excited to work together, if we end up deciding that we're aiming for the same shape of the API.
I did a first read of your proposal and only have very rough, unsorted initial thoughts to share so far. Most of them are critical, but please do not read them as general criticism; it's just easier to focus on what I see as potentially incompatible. The code generally looks good!
Foundational Crate
You seem to use dependencies liberally. I'm of the strong opinion that this functionality should be treated as a "foundational crate", per raphlinus's terminology, and as such should aim to maintain a minimal, or even zero if possible, dependency tree. Your proposal has 22 dependencies, `annotate-snippets` has zero.
I'm open to adding some, if we see value in doing so, but I'd like to keep it at the very minimum and, if possible, look for dependencies that don't themselves introduce a long chain of dependencies in return.
API
Your API seems to resemble an imperative approach much more closely than what I'm aiming for with `annotate-snippets`. The chained operations `Diagnostics::build().code().level().primary().secondary()` feel fairly awkward to me, I must admit. I was working at TC39 on JavaScript at the time when that model was very popular (jQuery!) and I'm not very convinced that it leads to clean API use and highly maintainable code (I'm not talking about our code, but the code that uses the API).
In principle, I see what I call the `Snippet` struct as a data structure. Rust doesn't have a very good way to provide a complex and/or optional parameter list to a constructor, but I came to the conclusion that in this case we likely don't need it.
So, instead of making people write `Snippet::new(title, description, level, code, &[secondary], ...)` or your `Snippet::build().title().description().level().code()`, we can just expose the ability to do `Snippet { title: None, description: Some(...), level: Some("E203"), ... }`.
It's very clean, well maintained, allows for omitting optional fields with the `Default` trait, and, what's most powerful, can then be wrapped in many different ways to construct an instance.
If I'm not mistaken, the only shortcoming of that approach is the lack of inter-field validation, which allows one to define a snippet with 5 lines but an annotation on a sixth.
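As a rough illustration of the struct-literal style and of the shortcoming just mentioned, here is a small sketch. The types and field names are hypothetical and deliberately simplified; they do not mirror the real `annotate-snippets` structs, and `first_out_of_range` is just one way a later step might detect the inconsistency.

```rust
/// Hypothetical, simplified input type: plain data, all fields public.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Snippet<'a> {
    pub title: Option<&'a str>,
    pub code: Option<&'a str>,
    pub source: &'a str,
    pub annotations: &'a [Annotation<'a>],
}

#[derive(Debug, Clone, PartialEq)]
pub struct Annotation<'a> {
    /// 1-based line of `source` that the label is attached to.
    pub line: usize,
    pub label: &'a str,
}

/// Usage: only the interesting fields are spelled out; the rest (`code`
/// here) are filled in by `Default`.
pub fn demo<'a>(notes: &'a [Annotation<'a>]) -> Snippet<'a> {
    Snippet {
        title: Some("mismatched types"),
        source: "line 1\nline 2\nline 3\nline 4\nline 5",
        annotations: notes,
        ..Default::default()
    }
}

/// Nothing in the literal prevents an annotation on a line that does not
/// exist, so a later step has to check. Returns the first offender, if any.
pub fn first_out_of_range<'a>(snippet: &Snippet<'a>) -> Option<&'a Annotation<'a>> {
    let line_count = snippet.source.lines().count();
    snippet
        .annotations
        .iter()
        .find(|a| a.line == 0 || a.line > line_count)
}
```

With a five-line source, an annotation on line 6 compiles fine and is only caught by the check, which is exactly the trade-off being described.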
Initially, that led me to try to squeeze in one or more constructors, because I dislike the ability to produce internally inconsistent data, but eventually I decided that it's not necessary to fix it. In the `annotate-snippets` model the struct gets passed to a function which generates a `DisplayList` out of it. It's at this step that validation of the input (`Snippet`) takes place, and the input can be rejected.
I really like this model for the foundational functionality of a foundational crate. I'm sure it's possible to build different, nicer high-level APIs to aid people in constructing a `Snippet` or `Slice`, but I think we should focus on exposing everything we can now and letting others or ourselves add sugar later.
Your model seems much closer to what `codespan` and, as a result, `language-reporting` are doing. I would prefer us to avoid starting with that API, while I'm open to getting to it later.
Flexibility
`annotate-snippets` is intentionally very vague about its core concepts. It is meant to be a vague API which can be used for displaying errors, but also for tutorials, helpers, explainers etc. In particular, I believe that the range of annotations in them can be very vast, and I'd love to end up with something flexible and extensible.
For that reason I'd like to minimize the number of places in our API that we name after some function. Due to the nature of your proposed API you not only do that, but also add API methods like "primary", "secondary", etc., while at the same time making it a bit less visible what the relation between them is, and whether it's possible to specify multiple or just one (can I specify multiple "secondary()"? I can! But can I do the same with "primary()"? If so, what does it mean? And can I specify multiple "level()"? Or will the next one overwrite the previous one?), and so on.
For me, the difference between:
and
is that in the latter, the only meaning is assigned to annotations via `annotation_type`, and there may be many different ones, including custom ones added by the user.
In the former, the title is decided by a `primary` and thus cannot be different from an in-source annotation, the level is defined per slice, not per annotation, and we use the concept of `primary/secondary`, which would only be extensible via some `tertiary`?
`annotate-snippets` allows you to define a title different from any source annotations, or a footer, or multiple footers, multiple titles, multiple annotations, which may or may not overlap. It seems like a fairly low-level approach, but as a result very flexible.
Your API seems much more constrained and intended for getting just one style of annotation snippets.
Cow
You use a lot of `Cow` but I'm not sure how valuable it is. I hold this position less strongly, but I think that in all cases I've been able to find, `&str` works well, and I'm not sure we need owned messages in the annotation/slice/snippet.
Performance
I focus a lot on performance in my rewrite of `annotate-snippets` in #12. I was unable to compile your PR so I can't measure performance, but I think it'd be important to compare.
API discrepancy
Finally, and I struggle to ask this since it borders on NIH bias, which I'd like to avoid, I'm wondering why you feel the need to design a new API. I asked the authors of `codespan` and `language-reporting` if they see any shortcomings in my crate; they stated that they don't, and that the plan to converge on the `annotate-snippets` API seems reasonable unless we find a limitation in it.
Have you encountered a limitation? Do you dislike the `annotate-snippets` API? Any other reason?
===
I've been on vacation over this week, but I plan to get back and finish #12 now. If you believe that there's a value in diverging from its API, I'm happy to discuss the above differences (or any other that you see!) and compare the results!
I don't want to be attached to my API but I have not seen or heard of any problem with it yet, and I find it the most flexible, robust and extendible of all I've seen.
In your code I noticed several ideas which I like, but they're internal rather than API surface, and I'd rather see them as PRs against `annotate-snippets` than as a full new API.
Let me know what you think!
CAD97 commented on Oct 12, 2019
Big apologies: I forgot that I had an out-of-date example in the repository when I posted this; I didn't mean to mislead. I've dropped the builder API (if you look at the API overview in the OP, it's not present) and that's why the example won't compile. There were a few other issues as well because I pushed a WIP checkpoint commit. The implementation is still not quite finished for performance testing as I still need to port one final bit first.
So let me address the points some:
Dependencies
`lsp-types` is the only heavy dependency, and should be completely opt-in for the LSP target. The PR now correctly marks the dependency as optional. The LSP conversion could also be pulled out-of-tree if really desired, but the diagnostic layout should be LSP-friendly. Of the other three:
- `termcolor` (+ `wincolor`, `winapi-util`, `winapi` + friends) is I think the best option for an abstracted color-capable sink (`codespan`/`language-reporting` agree with me here). @brendanzab says the current use of `ansi_term` (which also depends on `winapi` + friends) is what currently keeps them from going all-in on `annotate-snippets` (being the lack of injectable custom writer and global state).
- `scopeguard`: highly used leaf crate; could be inlined if really desired.
- `bytecount`: I included it because clippy yelled at me not to do a naive byte count for newline characters. It's also a leaf crate. Given expected source sizes, it might be reasonable to drop it. It is only used in `impl SpanResolver for &str`.
For some reason I'm having trouble installing `cargo-tree` so I can't give a better overview currently.
API
The builder API used in the example was old and discarded; I'm just allowing record creation much closer to `annotate-snippets` now:
Flexibility
Yeah, the API I've proposed as-is is quite targeted at diagnostics and being compatible with the LSP diagnostic API. I'm perfectly happy to generalize it some more, though.
The intent as currently designed is that the `primary` annotation is the "short" annotation (i.e. the one that shows up as the red squiggly before asking for more information), and the `secondary` annotations are any related information. Note that the `secondary` annotations do not need to be within the `primary` span or even from the same `Span.origin`; that's just a limitation of my current implementation trying to cobble something together to demonstrate the API.
Cow
The main reason I've used `Cow` is to help the use case where a consumer wants to build a `Diagnostic`/`Annotation` list up during some analysis. If they're purely borrowed, the user has to implement a similar structure that's owned to build up the list, then borrow it when pushing it to the sink `annotate-snippets` renderer. I'd like to avoid that necessity, but I can be talked down if it's a sticking point, especially if we provide a separate owned variant instead of mushing them together with `Cow`. (Basically, I'm not using `Cow` as copy-on-write but as maybe-owned.)
Performance
Should be on par with `cleanup`, as the real work was directly ported from it. Again, the port is not quite finished, so it can't be measured yet.
NIH
I'm happy to find a middle ground; this is mainly to share the results of my experimentation so that we can try to find the best end result. The big parts of this I'd really like to see adopted in some form: 1) an LSP target is practical, 2) delayed `Span` resolution to support source highlighting, and 3) a generic `WriteColor` target rather than baking in ANSI or no color as the only target implicitly.
matklad commented on Oct 12, 2019
I didn't read this past the first paragraph (I hope to do so once I have more time), but I'd advise against using lsp-types as a dependency, even an optional one. It changes way too often, and I don't think it's worth it to make it non-breaking. Rather, I think it's better to vendor just the diagnostics-related bits of the LSP, which should be stable. See, for example, how I've hard-coded setup/tear-down messages in lsp-server. It seems like a case where depending on a 3rd-party lib doesn't actually buy that much.
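For a sense of scale, vendoring the diagnostics-related bits could look roughly like the sketch below. This is an assumption-laden illustration, not code from `lsp-server` or `lsp-types`: the type names are local stand-ins, and `position_at` is a hypothetical helper. The parts taken from the LSP specification are that positions are zero-based, that `character` counts UTF-16 code units by default, and that severities are numbered 1 (error) to 4 (hint).

```rust
/// Zero-based line and UTF-16 column, following the LSP `Position` shape.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Position {
    pub line: u32,
    pub character: u32,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Range {
    pub start: Position,
    pub end: Position,
}

/// Numeric values match the LSP `DiagnosticSeverity` constants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Error = 1,
    Warning = 2,
    Information = 3,
    Hint = 4,
}

/// Local stand-in for the handful of LSP diagnostic fields a renderer needs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LspDiagnostic {
    pub range: Range,
    pub severity: Severity,
    pub code: Option<String>,
    pub message: String,
}

/// Hypothetical helper: convert a byte offset into `source` to a position.
/// Panics if `offset` is out of bounds or not on a char boundary.
pub fn position_at(source: &str, offset: usize) -> Position {
    let before = &source[..offset];
    let line = before.matches('\n').count() as u32;
    // Byte index just past the last newline before `offset` (0 on line 0).
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let character = before[line_start..].encode_utf16().count() as u32;
    Position { line, character }
}

/// Convert a `start..end` byte span into an LSP-style range.
pub fn range_of(source: &str, start: usize, end: usize) -> Range {
    Range {
        start: position_at(source, start),
        end: position_at(source, end),
    }
}
```

Serialization is left out on purpose; whether that is hand-written or behind an optional feature is a separate decision from the data layout.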
zbraniecki commented on Oct 12, 2019
Thanks for the response!
I'm just responding to your points, I'll review more tomorrow:
Dependencies
Yes, I can understand why you aim for `lsp-types`. Maybe we want it, I'll have to look deeper.
As for `termcolor` vs `ansi_term` - I'm not strongly opinionated. I can see us switching if `termcolor` is better. I want it to be optional though.
Others - I'd like all `extra` functionality to be optional. I can see an argument for, say, a unicode crate for character boundary counting and breaking, but I believe it should be optional because a foundational crate is likely to be used often without that extra piece.
I saw things like `serde` and `serde_json` compiled as part of your PR. I think they should not be necessary.
If they're needed by `lsp-types` I will question whether we need `lsp-types` :)
API
Oh, I'm sorry for not reviewing the example vs. code. As I said, I just got back from PTO.
As for your example, it looks much better to me! I still would like to separate title from in-source annotations, because it's easier to reference/copy one from another than to separate if you need them to be different.
Flexibility
As with above - it's easier to specialize later, if the crate allows for generic behavior. I'd like to not dictate what behavior happens on the level of our crate, but rather on the level of some higher level API. Then you can have an API that uses the foundational crate and is specific to generating errors and it picks the "primary" and sets it as the only title, and as the primary annotation and so on.
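A rough sketch of that layering, with entirely hypothetical types: a meaning-agnostic low-level `Snippet`, and a diagnostics-specific layer on top whose `lower` function picks the primary note and uses it both as the title and as the first annotation. None of these names come from either crate.

```rust
/// Low-level, generic input: the crate assigns no meaning beyond the
/// per-annotation type tag.
#[derive(Debug, Clone, PartialEq)]
pub struct Snippet {
    pub title: Option<String>,
    pub annotations: Vec<SourceAnnotation>,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AnnotationType {
    Error,
    Info,
}

#[derive(Debug, Clone, PartialEq)]
pub struct SourceAnnotation {
    pub range: (usize, usize),
    pub label: String,
    pub annotation_type: AnnotationType,
}

/// Higher-level, error-reporting-specific input.
pub struct Diagnostic {
    pub primary: Note,
    pub secondary: Vec<Note>,
}

pub struct Note {
    pub range: (usize, usize),
    pub message: String,
}

/// The policy lives here, not in the foundational crate: the primary note
/// becomes the only title and the first (error) annotation; secondary notes
/// follow as informational annotations.
pub fn lower(d: &Diagnostic) -> Snippet {
    let primary = SourceAnnotation {
        range: d.primary.range,
        label: d.primary.message.clone(),
        annotation_type: AnnotationType::Error,
    };
    let secondary = d.secondary.iter().map(|n| SourceAnnotation {
        range: n.range,
        label: n.message.clone(),
        annotation_type: AnnotationType::Info,
    });
    Snippet {
        title: Some(d.primary.message.clone()),
        annotations: std::iter::once(primary).chain(secondary).collect(),
    }
}
```

A tutorial or explainer tool could skip this layer and build the `Snippet` directly with a different title, which is the flexibility argument in a nutshell.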
Cow
I can be talked into incorporating `Cow`. My main concern about it is that I'm torn on whether the API should be constructable. One way to think about it is that it should. Another is that there should be some higher level that constructs it (maybe over time) and then spawns the `Diagnostics` struct.
As I said, I can definitely be talked into the idea of making the API buildable.
One argument against is that I'm trying to minimize allocations, and one way to do that was to cut out all `Vec`s, replacing them with `&[]`. That would make the code simpler at the cost that, when you need to build, you do this beforehand. My initial position is that in many cases you would be able to avoid that step, so there's a tangible win. But maybe I'm wrong here! I'll investigate!
Performance
Cool! Let's both finish our PRs and compare! :)
NIH
Oh, awesome! I'm so happy to see actual points listed out this way. This is very helpful, thank you!
`annotate-snippets` supports syntax highlighting, and if it doesn't, it's a bug and I'm open to look for ways to fix it!
`cleanup` branch now. I want it to be agnostic of the styling, and able to support different `stylesheets` and even different `themes` (think: you build a different DisplayList for a terminal and a different one for a web browser or some other rich GUI, which is different from just styling it).
The last step is the last piece of my cleanup, so I'd like to finish my proposal. I like your updated example much more (d'uh! it's closer to annotate-snippets ;)) and I'm really excited to see you bringing your experience and perspective! Let's finish our PRs, compare them, and figure out how to approach it.
From your response I feel that we're aiming for close enough goal that we should end up with a single crate, which is awesome (the alternative is also awesome, but less attractive for the bus-factor removal which I care about deeply!).
Onwards! :)
CAD97 commented on Oct 12, 2019
Just a few more notes:
`annotate-snippets`'s more general snippet annotation.
`&mut dyn WriteColor` sink is probably my most desired part of this proposal.
`SpanResolver` to paint the span.
The rest is a question of what layout decisions should be at what layer.
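To show what an injectable sink buys in practice, here is a dependency-free sketch. It is only an approximation of the idea: `std::io::Write` stands in for `termcolor::WriteColor` (so there is no color handling at all), and `render_line` is a hypothetical helper, not part of any of the crates discussed.

```rust
use std::io::{self, Write};

/// Hypothetical helper: write one source line followed by a caret underline
/// for the byte range `start..end` within `line`, plus a label.
/// The caller picks the sink; nothing here touches stdout or global state.
/// Panics if `start`/`end` are not char boundaries of `line`.
pub fn render_line(
    w: &mut dyn Write,
    line_num: usize,
    line: &str,
    start: usize,
    end: usize,
    label: &str,
) -> io::Result<()> {
    let gutter = line_num.to_string();
    writeln!(w, "{} | {}", gutter, line)?;
    // Pad by characters, not bytes, so multi-byte text before the span
    // does not shift the underline (display width is ignored here).
    let pad = line[..start].chars().count();
    let len = line[start..end].chars().count().max(1);
    writeln!(
        w,
        "{} | {}{} {}",
        " ".repeat(gutter.len()),
        " ".repeat(pad),
        "^".repeat(len),
        label
    )
}
```

Because the sink is a parameter, the same function can write to a `Vec<u8>` in tests, to a file, or to a terminal handle, which is what a hard-wired ANSI printer cannot offer.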
zbraniecki commented on Oct 12, 2019
Sounds good! I'll look into `&mut dyn WriteColor` tomorrow, and then into `SpanResolver`. Give me several days and I should have something tangible (either code or opinion at least!)
kevinmehall commented on Oct 12, 2019
Is the intention that you could `impl Span` for types like `codemap::Span` and `codespan::Span` that do the "index into concatenation of all files" trick? I don't see how they could provide `origin()` without reference to `codemap::CodeMap`/`codespan::Files`. Alone, these span types can't provide a filename to display, or even test whether two spans refer to the same file for the `Eq` bound.
The requirement that a span's start and end are `usize` also precludes implementing `Span` for types that store positions as a separate line and column number.
What if the interface exposed column numbers for the library to do its layout computations, but kept `Span` opaque? Here's a rough sketch:
`Line` is generic and not `usize` because it's O(n) to index a plain string by line number. An implementation of `Line` for `str` could cache the byte index of the start of line.
`codemap` kevinmehall/codemap-diagnostic#7
CAD97 commented on Oct 12, 2019
@kevinmehall yes, the intent was that it would be possible to implement `Span` for types that map multiple files into one-dimensional space. I overlooked that resolution to the origin would have to go through the resolver as well for that to work. Of course, now with `annotate-snippets` being targeted lower-level than this sketch originally aimed, I don't think a single render should need to deal with spans from multiple origins at all. (That would be the next level up.)
kevinmehall commented on Oct 13, 2019
`rustc` has some diagnostics where the primary and secondary spans are in different files. How would that be handled? For example:
zbraniecki commented on Oct 13, 2019
In `annotate-snippets` that's handled by https://github.com/rust-lang/annotate-snippets-rs/blob/master/examples/multislice.rs
CAD97 commented on Oct 13, 2019
Let me drop this here while it's fresh in my mind, though I should admit it's half-baked, as I need to run off to bed again.
Here's the input structure I'm experimenting with for a much reduced `annotate-snippets` responsibility from the OP, now with documentation!:
API
I suspect adding a "collection of snippets" type (which, if I'm not mistaken, is closer to the current `Snippet` than mine, which is closer to `Slice`) would be desirable on top of this, and as the argument for the render function. That "snippet collection" would also hold the front matter. Or we could even just say that we only care about snippet annotation and the caller should handle the other lines.
Marwes commented on Oct 13, 2019
@matklad
Did you see gluon-lang/lsp-types#117 ? Would let lsp-types go to 1.0 at the cost of having a slightly more awkward API (though in a way, a more honest one).
brendanzab commented on Oct 14, 2019
I just want to say I really appreciate the time being put into this! Would be exciting to converge on something nice.