Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

composebox_typeahead: Avoid generating broken links. #30071

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kuv2707
Copy link
Collaborator

@kuv2707 kuv2707 commented May 13, 2024

The #**stream>topic** syntax generates broken links for topics containing two backticks or starting/ending with *, because of architectural flaws in the backend markdown processor. So we avoid generating the syntax for such topics and instead generate the normal link syntax in markdown.

Fixes #19873

CZO discussion

Screenshots and Screen Captures

chrome_M0x0tB29yL

Self-review checklist
  • Self-reviewed the changes for clarity and maintainability
    (variable names, code reuse, readability, etc.).

Communicate decisions, questions, and potential concerns.

  • Explains differences from previous plans (e.g., issue description).
  • Highlights technical choices and bugs encountered.
  • Calls out remaining decisions and concerns.
  • Automated tests verify logic where appropriate.

Individual commits are ready for review (see commit discipline).

  • Each commit is a coherent idea.
  • Commit message(s) explain reasoning and motivation for changes.

Completed manual review and testing of the following:

  • Visual appearance of the changes.
  • Responsiveness and internationalization.
  • Strings and tooltips.
  • End-to-end functionality of buttons, interactions and flows.
  • Corner cases, error conditions, and easily imagined bugs.

@zulipbot zulipbot added size: S area: markdown (mentions) Mentions for users/groups/stream/topic/time. bug rust community request Issues of interest to the Rust community labels May 13, 2024
Copy link
Collaborator Author

@kuv2707 kuv2707 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made a preliminary implementation and highlighted the reasons behind some technical choices.
Once we decide the correct logic to detect faulty topic names, I'll refactor the functions and put them in a sensible place.

@@ -995,8 +997,29 @@ export function content_typeahead_selected(item, query, input_element, event) {
// Stream + topic mention typeahead; close the stream+topic mention syntax
// with the topic and the final **. Note that token.length can be 0
// if we are completing from `**streamname>`.
function will_produce_broken_link(topic) {
Copy link
Collaborator Author

@kuv2707 kuv2707 May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, I have considered only two cases which produced buggy links:

  • When there are two or more backticks
  • When the topic name begins or ends with asterisks.

There can still be topic names which cause broken links. I had earlier tried to parse the syntax text using the frontend markdown parser and see if the link in the parsed html points to the correct stream and topic. But double-backtick topic names gave different urls through frontend and backend markdown processors (in fact, the frontend parser correctly parsed the double-backtick topic name, while the backend processor generated an incorrect url).

This means that checking if the frontend markdown processor parses the syntax correctly gives no guarantee about the correctness of the link in the final message. So I thought the best option would be to check for the individual reported cases which produce broken urls.

I infer from CZO discussions that we will eventually move to a different markdown processor, and this will no longer be an issue then.

web/src/composebox_typeahead.js Outdated Show resolved Hide resolved
@alya
Copy link
Contributor

alya commented May 14, 2024

@timabbott Do you want to take a look and provide some feedback here?

@alya alya added the integration review Added by maintainers when a PR may be ready for integration. label May 14, 2024
beginning.slice(0, syntax_start_index) +
url_syntax(get_stream_name(beginning), item.topic) +
" ";
}
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you extract this logic to a function and/or add unit tests for it? It's complicated enough that it's very much worth having a test for.

Copy link
Collaborator Author

@kuv2707 kuv2707 May 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have extracted the logic and added some unit tests.

@kuv2707
Copy link
Collaborator Author

kuv2707 commented May 20, 2024

@timabbott Addressed the suggestion. Please take a look.

);
assert.equal(
ct.stream_topic_link_syntax("#**stream>t", "*asterisk"),
"[#stream>*asterisk](#narrow/stream/-1-unknown/topic/*asterisk) ",
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do all of these have -1 as the stream ID in the URLs? The first couple don't, which makes me think you've got some sort of weird bug in either one test case or the other.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stream ids are only required when we generate the markdown link syntax (since we need to generate the url ourselves).
That is why the first two didn't need them.

stream_topic_hash: hash_util.by_stream_topic_url,
get_stream_by_name() {
return "Rome";
},
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mock seems unnecessary; can't we just use add the stream to the stream_data database and avoid the hackery?

Copy link
Collaborator Author

@kuv2707 kuv2707 May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I was trying to mimic the code path followed by the frontend markdown processor (which uses markdown_config.ts), which in retrospect, seems unnecessary.

I have removed these and used the streams already present for the tests. This simplified the code a lot.

@@ -1180,6 +1185,31 @@ export function content_typeahead_selected(
return beginning + rest;
}

function will_produce_broken_stream_topic_link(topic_name: string): boolean {
return /(\*+)|(.*`.*`.*)/.test(topic_name);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The links for topic names with double backticks are now generated correctly, but the double backticks cause the enclosed part of the topic name to render as an inline code block.
I don't know what to do in this case since backslash escaping is disabled.

The #**stream>topic** syntax generates broken links for
topics containing two backticks or ending with *, because of
architectural flaws in the backend markdown processor.
So we avoid generating the syntax for such topics and instead
generate the normal link syntax in markdown.

Fixes zulip#19873
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: markdown (mentions) Mentions for users/groups/stream/topic/time. bug integration review Added by maintainers when a PR may be ready for integration. rust community request Issues of interest to the Rust community size: L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix linking to topic names containing multiple backticks with in-message markdown
4 participants