How to safely escape OnClick linkUrl without breaking URL?

Hi. I’ve been following some MDN examples to write a simple addon to add an extra context menu option to open links without passing the referer. The addon copies a link and opens a new tab to the selected link.

It mostly works, but I’ve noticed some servers give 404 errors when a plain “&” ampersand symbol is replaced by “& amp;”. Some sites resolve to the correct page, but others don’t see “& amp;” as equivalent to & in the URL. (Space added between “&” and “amp;” only because the forum or browser is replacing the escaped character code “&”.)

The code below borrowed from the MDN examples uses the escapeHTML function to replace potential XSS characters. I’m not sure how to approach this. Even if I split the parameters and concatenate them back together with plain “&” symbols in-between, it seems like that would still open up the potential for unwanted special character codes in the reassembled query string.

What’s the best way to handle sanitizing a link, when it seems like many sites will not resolve pages with “& amp;” substitutions in the URL? Is this escaping even necessary in this case of opening a selected link in a new tab?

Does the linkUrl property of menus.OnClickData already have XSS protection built-in? How does the built-in context menu handle sanitizing the links selected for the default “Open Link in New Tab” without breaking links?

browser.contextMenus.onClicked.addListener((info, tab) => {
 	
    if (info.menuItemId === "open-link-in-new-tab") {

        // Always HTML-escape external input to avoid XSS.
        const safeUrl = escapeHTML(info.linkUrl);

		browser.tabs.create({url: safeUrl, active: false});
// https://gist.github.com/Rob--W/ec23b9d6db9e56b7e4563f1544e0d546
function escapeHTML(str) {

    return String(str)
        .replace(/&/g, "&")
        .replace(/"/g, """).replace(/'/g, "'")
        .replace(/</g, "&lt;").replace(/>/g, "&gt;");

Why do you want to escape the url? browser.tabs.create expects the url as you would type it into the browser. No need to escape it.

Thanks for the reply, Lusito. I think that answers my question. If it’s not necessary in this case, then removing the “& amp;” replacement will easily solve my issue of broken links.

I was escaping here because it was done in the MDN code example I followed that also copied a link. I thought it was generally good practice to avoid potentially malicious XSS from user input. In this case, the user input would be a selected link on a page. Although the add-on only uses the selected link input to paste into a new tab destination, I wasn’t sure whether or not this was a case where input escaping was required.

Before I mark this [Solved], does anyone else have a comment on the necessity of escaping or sanitizing a user selected link in this case? Is there a good rule of thumb to help to figure out when escaping is absolutely necessary to avoid XSS vulnerabilities on user input?

If you want to avoid XSS you should be weary of data: and blob: Links instead. You are not inserting something into HTML here, so no need to remove HTML tags per se, or ensure that the links can’t escape from an <a> tag, you just need to ensure that it is actually an URL that should be loaded.

Thanks, Martin. Is there a built-in function to validate linkUrl and exclude data/blob links in the Web Extension API? Or should I use something similar to the javascript RegEx solutions found here:

Or am I over-thinking things here and really don’t need this extra layer for a simple copy link and paste into tab operation?

isURL(str) {
  var pattern = new RegExp('^(https?:\\/\\/)?'+ // protocol
  '((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.)+[a-z]{2,}|'+ // domain name and extension
  '((\\d{1,3}\\.){3}\\d{1,3}))'+ // OR ip (v4) address
  '(\\:\\d+)?'+ // port
  '(\\/[-a-z\\d%@_.~+&:]*)*'+ // path
  '(\\?[;&a-z\\d%@_.,~+&:=-]*)?'+ // query string
  '(\\#[-a-z\\d_]*)?$','i'); // fragment locator
  return pattern.test(str);
}

I mean, if all you want to do is to ensure that you’re getting a valid HTTP(S) url, you do something like

isValidURL(str) => {
  try {
    const url = new URL(str);
    return url.protocol.startsWith("http");
  }
  catch(e) {
    return false;
  }
};

Ah, I see. Thanks again for your help, Martin. Sorry for asking such basic questions.

I’ll use your example to test the selected linkUrl, then get rid of the escapeHTML function. It sounds like anything more than that would be excessive for this simple “copy link to a new tab” operation.