I’m trying to port a userscript to a WebExtension, since the script no longer works on GM4+ because of the API changes. I have read some of the WebExtension policies on MDN, and I want to check how to port one piece of its functionality:
The function generates the feed list (timeline). The userscript fetches the feed lists of everything the user follows from multiple first-party sources and merges the feeds from those lists by the time they were posted. To do so, it uses XHR to fetch the feed list pages, extracts the feeds from them, sorts them by time, and inserts them into the current page. Every URL fetched by XHR is a first-party page (same domain).
I’m trying to port this piece of the userscript to a WebExtension page script (or content script).
I think it is secure enough because: It only loads first-party content. It doesn’t give the website any additional privileges. It can’t be tracked by any third party. (The first party can always track the use of extensions that modify its pages’ content.) It just makes the page work the way it should.
But I wonder whether it will be rejected simply for inserting remote content. If so, what’s the best alternative for me?
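The merge step described in the question can be sketched roughly as follows. This is only an illustration, assuming each fetched page has already been parsed into entries that carry a numeric timestamp; the entry shape `{ time, html }` and the name `mergeFeedsByTime` are hypothetical and do not come from the userscript.

```javascript
// Hypothetical entry shape: { time: <timestamp in ms>, html: <markup string> }.
// Takes one array of entries per fetched page and returns a single array,
// newest entry first. Entries without a usable timestamp are dropped.
function mergeFeedsByTime(feedLists) {
  return [].concat(...feedLists)
    .filter(feed => Number.isFinite(feed.time))
    .sort((a, b) => b.time - a.time);
}

// Example: two pages worth of entries merged into one timeline.
const merged = mergeFeedsByTime([
  [{ time: 300, html: '<li>c</li>' }, { time: 100, html: '<li>a</li>' }],
  [{ time: 200, html: '<li>b</li>' }, { time: NaN, html: '<li>?</li>' }],
]);
```

Keeping the merge free of any DOM or network access means it behaves the same in a content script as in the original userscript; only the fetching and inserting around it need porting.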
You shouldn’t insert remote content (HTML/scripts) into privileged pages (i.e. the background page and your extension’s pages). In content pages it is generally acceptable.
You still shouldn’t eval anything you XHRed that wasn’t meant as a script, and you should sanitize everything you insert into the page’s DOM (better not to rely on the server doing that).
Depending on the HTML content you want to process, that can be pretty simple:
/**
 * Sanitizes untrusted HTML by escaping everything that is not recognized as a whitelisted tag or entity.
 * @param {string} html Untrusted HTML input.
 * @return {string} Sanitized HTML that won't contain any tags but those whitelisted by `rTag` below.
 * @author Niklas Gollenstede
 * @license MIT
 */
function sanitize(html) {
    // Splitting on a regex with one capturing group puts the matched (whitelisted) parts at the odd indices.
    const parts = (html ? html +'' : '').split(rTag);
    // Keep whitelisted tags/entities as they are, escape everything in between.
    return parts.map((s, i) => i % 2 ? s : s.replace(rEsc, c => oEsc[c])).join('');
}
// `rTag` must not contain any capturing groups except the one wrapping the entire expression
const rTag = /(&(?:[A-Za-z]+|#\d+|#x[0-9A-Fa-f]+);|<\/?(?:a|abbr|b|br|code|details|em|i|p|pre|kbd|li|ol|ul|small|samp|span|strong|summary|sup|sub|tt|var)(?: download(?:="[^"]*")?)?(?: href="(?!(?:javascript|data):)[^\s"]*?")?(?: title="[^"]*")?>)/;
// Characters to escape and the entities that replace them
const oEsc = { '&': '&amp;', '<': '&lt;', '>': '&gt;', "'": '&#39;', '"': '&quot;', '/': '&#47;', };
const rEsc = new RegExp('['+ Object.keys(oEsc).join('') +']', 'g');
That is not so easy in my situation. I cannot sanitize the content, otherwise it will just break. I only know that the server responds with a list of feeds; I have no idea of the DOM details of the feeds. I want to merge feeds from other pages into the current one. What I currently do is simply insert them into the webpage. Any sanitizing tool would just break them and make them unusable.
cannot sanitize the content, otherwise it will just break
Oh, certainly you can. Everything can be sanitized. I once did it on entire websites. It involved a full HTML parser and a JSON object of about 300 lines whitelisting tags and attributes, but it made sure the code couldn’t contain anything but the (many) whitelisted tags and attributes.
Anyway, from your code it is pretty obvious that the stuff you are inserting is really meant as HTML markup. I’d additionally check that the Content-Type of the response is text/html (or whatever exactly they serve).
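To illustrate the whitelist idea, here is a much smaller sketch. It is not the roughly 300 line whitelist mentioned above: the `WHITELIST` contents, the `filterNode` name, and the node shape (a string for text, or `{ tag, attrs, children }` for an element, as some HTML parser might produce) are all assumptions made for the example.

```javascript
// Hypothetical whitelist: tag name -> attribute names allowed on that tag.
const WHITELIST = {
  a: ['href', 'title'],
  p: [],
  span: [],
  img: ['src', 'alt'],
};

// Returns a filtered copy of `node`, or null if the element is not whitelisted.
// Non-whitelisted elements are dropped together with their whole subtree.
function filterNode(node, whitelist) {
  if (typeof node === 'string') { return node; } // text nodes pass through
  // hasOwnProperty avoids matching inherited keys such as "constructor"
  if (!Object.prototype.hasOwnProperty.call(whitelist, node.tag)) { return null; }
  const allowed = whitelist[node.tag];
  const attrs = {};
  for (const name of Object.keys(node.attrs || {})) {
    if (allowed.includes(name)) { attrs[name] = node.attrs[name]; }
  }
  const children = (node.children || [])
    .map(child => filterNode(child, whitelist))
    .filter(child => child !== null);
  return { tag: node.tag, attrs, children };
}
```

Note that this only filters tag and attribute names. Attribute values (for example javascript: URLs in href) and the escaping of text when serializing back to HTML would still need separate handling.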
If their own markup contains javascript: URLs, they really can’t complain about XSS -.-
It doesn’t depend on how many lines of sanitizing code I would need. There is no list of what may and may not be there. There are many special types of feeds, from posts with pictures, posts with videos, and posts with articles to posts with whatever they “invented”. And the code for the feed list is updated very frequently. I don’t think trying to sanitize is a feasible idea. That would be more complex than packing the whole of ViolentMonkey into my extension. (Or may I choose that solution?)
Btw, there was an XSS attack on this site some years ago.
Btw, there was an XSS attack on this site some years ago.
Well, apparently that did not worry them enough to build a site that can have a CSP, which is one of the best defenses against XSS.
[Sanitizing] would be more complex than packing the whole of ViolentMonkey into my extension.
Possibly, yes (but I am always very careful with saying that things are “impossible”).
But my main argument is this: there is no benefit to sanitizing. There aren’t any security issues here at all.
I agree insofar as there are no additional security issues introduced by what you do.
You are basically just moving content (which may include code) from one visitable page to another. If the site were using CSPs, that might make a difference if the pages in question used different rules, but I don’t think they do (see above).
But as I said, I’d check the Content-Type to make sure what you fetched is actually meant to be HTML (otherwise it’s an XSS risk).
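A minimal sketch of such a Content-Type check, assuming the site serves its feed pages as text/html; the function name `isHtmlContentType` is made up for the example, and the accepted type should be adjusted to whatever the server actually sends.

```javascript
// Returns true only if a Content-Type header value denotes text/html.
// Parameters such as "; charset=utf-8" are ignored, and so is letter case.
function isHtmlContentType(headerValue) {
  if (typeof headerValue !== 'string') { return false; } // header missing
  const mediaType = headerValue.split(';')[0].trim().toLowerCase();
  return mediaType === 'text/html';
}
```

With XHR the header value would come from `xhr.getResponseHeader('Content-Type')`; the response would only be inserted into the page if the check returns true.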