Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more information to Link nodes (for round-trip rendering) #98

Open
robinst opened this issue Aug 29, 2017 · 6 comments
Open

Add more information to Link nodes (for round-trip rendering) #98

robinst opened this issue Aug 29, 2017 · 6 comments

Comments

@robinst
Copy link
Collaborator

robinst commented Aug 29, 2017

For #12, we need to preserve link reference definitions. We also need to be able to distinguish between the different link types (inline links, autolinks, reference links).

There is an old PR which might serve as a starting point: #20 (comment)

I'm not sure yet what the final Node types should look like. Ideally we'd have an interface that all the links can implement for code that doesn't care about what the source representation was.

If someone is interested in working on this, please comment here first so we can figure out what the end result should look like.

@jonsampson
Copy link

jonsampson commented Sep 16, 2017

Hi @robinst,

I've been looking this over since last night and I'll have some time on various Saturdays in the next few months that might serve to implement this. In the mean time: talk is cheap! :)

I've reviewed #20, #12 and the appropriate spec. Regarding your notes in #20

Do you think it would make sense to have

[foo]: /url "title"

be its own Node type, like ReferenceDefinition? Something like this maybe:

[foo](/uri "title")

is a Link, which has getDestination, getTitle (not sure about getLabel)

[foo][foo], [foo][] and [foo]

are ReferenceLink extends Link, which have an additional getDefinition which returns the ReferenceDefinition

plus the need for a generic link interface, I was thinking this:

public interface GenericLink {
	String getLabel();
	String getLinkDestination();
	String getTitle();
}

Autolinks only have linkDestinations, which calls into question a functional need for inheritance to Link.

public class Autolink extends Node implements GenericLink {
	private String linkDestination;
	interface impl just refers to linkDestination...
}

ReferenceDefinitions do break down to exactly the same components as Link, so inheritance is worthwhile from a reduction of code duplication standpoint but I don't know if a ReferenceDefinition is a Link. So straight inheritance, a class just for type purposes:

public class ReferenceDefinition extends Link implements GenericLink {
	not much here...
}

or compositional inheritance:

public class ReferenceDefinition extends Node implements GenericLink {
private Link linkReferenced;
methods call through to linkReferenced..
}

^edit: nope. Don't like that idea after I've thought about it.

or a complete duplicate/copy of Link's functionality. An inheritance decision here might also influence Autolink's implementation.

Then, we'll need something that references the ReferenceDefinition. From the spec:

A link reference definition does not correspond to a structural element of a document. Instead, it defines a label which can be used in reference links...

Emphasis mine, because I've thought of this as a ReferenceLabel

public class ReferenceLabel extends Node {
	private ReferenceDefinition referenceDefinition;
	...
}

It's not a link itself, just a pointer.

Thanks for reading!

@robinst
Copy link
Collaborator Author

robinst commented Sep 20, 2017

Hey! Thanks for picking this up :).

So, a couple of thoughts:

  • I think Link itself should be the name of the interface (not GenericLink)

  • Link should be implemented by all nodes that result in a rendered link. So that means the following:

    Link within text: [foo]
    
    [foo]: /url "title"
    

    Would result in: [foo] is a ReferenceLink ... implements Link. The last line would be a LinkReferenceDefinition which does not implement Link.

  • I'm not sure getLabel belongs on the interface. Link labels are only used for reference links, but not inline links.

Does that make sense?

@jonsampson
Copy link

I'm happy to help, bud! Thanks for doing lots of the heavy lifting on this project.

  • Link as name of interface - I have no problem with that. The purpose of GenericLink was solely to disambiguate it from org.commonmark.node.Link.
  • Link is implemented by all nodes that result in a rendered link - Good note; makes sense.
  • getLabel in the interface - Link labels are optional in inline links - Link spec refers to them as link text. Currently in commonmark-java, a Link node has a destination and title and optionally contains a Text child node for link text. A LinkReferenceDefinition as proposed would be implemented similarly.

So, to distill: do you think that the link label of a link reference definition equates to the link text of an inline link?

Inline Link Example 459 - link is link text
[link](/uri "title")

yields <p><a href="/uri" title="title">link</a></p>

Ref Def Link Example 159 - foo is link label
[foo]: /url "title"

[foo]

yields <p><a href="/url" title="title">foo</a></p>

Proposal: link = foo

getLabel may not be the best choice for the interface method name since we're talking about building a link from this object. getLinkText or getText might be more in line with the nature of the work.

So now, having written this, I've stumbled across http://spec.commonmark.org/0.28/#reference-link, a sub part of the Link section, which could further inform on how I should visualize this. I hadn't realized there are full reference links, collapsed reference links, and shortcut reference links (which appear to be what I was modelling this stuff based upon). I've got some more reading to do!

@robinst
Copy link
Collaborator Author

robinst commented Oct 21, 2017

Yeah, I see what you're saying. I think the ReferenceLink would have foo as child node(s). But it would also be useful to just have a String representation of it, which allows you to look up the corresponding LinkReferenceDefinition from a Map. Calling that getLabel would be fine. But it wouldn't be on the generic Link interface because not all links have a label.

Anyway, I think as soon as someone starts implementing this, it should become clear and we can discuss details on the PR.

robinst added a commit that referenced this issue Apr 15, 2019
The spec was updated to clarify that they can also be part of setext
headings. The way we make these work in our parser is to be able to
"replace" the active paragraph block. In that case, we still need to
collect link reference definitions from it.

In order to implement this, I had to separate it from InlineParser. The
new way is cleaner, and allows us to add a Node to the document as well
(for #98).

I think there's still room for improvement: We could parse the
definition as we go in ParagraphParser and collect them earlier. That
might eliminate the current "double parsing" that we do in some cases.
robinst added a commit that referenced this issue Jul 12, 2019
This is part of #98 and was enabled by the refactoring of definition
parsing for spec 0.29.
@robinst robinst changed the title Preserve link reference definitions as Node & add more information to Link Add more information to Link nodes (for round-trip rendering) Jul 12, 2019
@robinst
Copy link
Collaborator Author

robinst commented Jul 12, 2019

The first part of this is now done, see cb43ae9.

@robinst
Copy link
Collaborator Author

robinst commented Apr 6, 2024

With the MarkdownRenderer now merged:

The goal of this issue is now nice and clear: Preserve enough details about links in nodes to be able to render them back in their original format.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants