Add-on to print / save an active tab directly as PDF or PNG in a specific folder

learning

(Stageinfo) #1

Hi my name is Anthony.
I’m currently on a training placement, and my employer asked me to develop an add-on for Firefox (version 60 or later) that can print / download a web page as a PDF or PNG file. The main requirement is that the user can do it in one click. The PDF / PNG file must be saved directly into a predefined folder (set in the add-on’s options menu) that is separate from the downloads folder.

Can it be done ?


(Niklas Gollenstede) #2

Well, you have the browserAction icon, which can either open a panel or have a (single) click action.

tabs.saveAsPDF() “Saves the current page as a PDF file. This will open a dialog, supplied by the underlying operating system, asking the user where they want to save the PDF file.” (which isn’t quite what you want)

tabs.captureTab() “Creates a data URI encoding an image of the visible area of the given tab. You must have the <all_urls> permission to use this method.” So basically a screenshot that you can do whatever you want with, e.g. pass it to

downloads.download() which accepts a filename “A string representing a file path relative to the default downloads directory […] Absolute paths, empty paths, and paths containing back-references ( ../ ) will cause an error.” (which may also not quite be what you are looking for)


If that is not enough, you will have to serialize the page (including all assets), render it (via an online or local service), and either offer it as a normal download or use native messaging to place it anywhere on the local disk.
As it happens, I have already done those exact things. Please ask if you want to know more.
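
The filename constraints quoted above for downloads.download() can be checked before calling the API. This is only a rough sketch: the helper name isAcceptableDownloadPath is made up for illustration, and it approximates the documented rules (no absolute paths, no empty paths, no ../ back-references) rather than reproducing Firefox’s own validation.

```javascript
// Hypothetical helper (not part of any WebExtension API): approximates the
// documented restrictions on the `filename` option of downloads.download().
function isAcceptableDownloadPath(path) {
  // Empty paths are rejected.
  if (typeof path !== "string" || path.length === 0) return false;
  // Absolute paths are rejected: "/x", "\x" or a drive prefix such as "C:\x".
  if (/^([a-zA-Z]:)?[\\/]/.test(path)) return false;
  // Back-references ("..") anywhere in the path are rejected.
  if (path.split(/[\\/]/).includes("..")) return false;
  return true;
}

console.log(isAcceptableDownloadPath("DossierCapture/test.png")); // true
console.log(isAcceptableDownloadPath("../test.png"));             // false
```

A path that passes this check is still resolved relative to the default downloads directory, so it cannot reach a folder outside of it.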


(Stageinfo) #3

Thanks a lot NilkasG
That helps me very much to clarify my task and what I have to do.
I have already used tabs.saveAsPDF() and downloads.download(), but not tabs.captureTab().
I will try that and ask my employer if it’s enough.
Otherwise I’ll go for the next step.
Thank you again


(Stageinfo) #4

Hello,
I’m currently trying to use downloads.download() with tabs.captureTab().
But I can’t work out how to combine the URI I get from tabs.captureTab() with the downloads API, which requires a URL. I’ve tried many things, but I can’t read the file I get (when I get one). Can I have a clue?


(Niklas Gollenstede) #5

Well, tabs.captureTab() resolves to a data URL, presumably base64, starting with data:image/jpg;base64,<data>.
Given that that is a URL, you should be able to use it as the url parameter in downloads.download(). If that doesn’t work with data URLs (maybe because Firefox would save the URL, which in this case is the entire download data, in the history), you have to use the body parameter. It’s “A string representing the post body of the request.”, which is somewhat vague, because it doesn’t specify the encoding. If just punching in the base64-encoded <data> portion of the URL directly results in corrupt output (likely), you need to decode it first. The global atob function (no idea why it is called that) decodes base64 into strings, but I am not sure if it works on any data or only on data that results in 7-bit ASCII strings. If it is the latter, you need to either decode it manually or use a library that does it for you.
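
On the atob question: atob returns a “binary string” in which each character code is one decoded byte (0 to 255), so it is not limited to 7-bit ASCII. A minimal sketch of turning a base64 data URL into raw bytes follows; the function name dataUrlToBytes is made up for illustration, and the sample URL contains only the first four bytes of a PNG signature.

```javascript
// Hypothetical helper: decodes the payload of a base64 data URL into bytes.
function dataUrlToBytes(dataUrl) {
  const comma = dataUrl.indexOf(",");
  // Everything before the first comma is the header, e.g. "data:image/png;base64".
  if (comma === -1 || !dataUrl.slice(0, comma).endsWith(";base64")) {
    throw new Error("expected a base64 data URL");
  }
  // atob yields one character per byte; charCodeAt recovers the byte value.
  const binary = atob(dataUrl.slice(comma + 1));
  return Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
}

const bytes = dataUrlToBytes("data:image/png;base64,iVBORw==");
console.log(Array.from(bytes)); // [137, 80, 78, 71]
```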


(Stageinfo) #6

I’ve found a way to download the captured tab after several attempts.
tabs.captureTab() (or tabs.captureVisibleTab()) gives me a URI that I use like this:

var binary = atob(imageUri.split(',')[1]);
var array = [];
for (var i = 0; i < binary.length; i++) {
    array.push(binary.charCodeAt(i));
}
var blob = new Blob([new Uint8Array(array)], {type: 'image/png'});
var urlTest = URL.createObjectURL(blob);

Then I pass that URL in the options given to downloads.download().
Next step is to find a way to capture the full Web page.
I think I will have to use native messaging to move the file into the root directory.


(Niklas Gollenstede) #7

Ah, yes. I guess a blob: URL should work as well.

But the way you chose to convert it is rather inefficient.
This is non-blocking and should be faster:

const blob = await (await fetch(imageUri)).blob();
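
Putting the fetch conversion and the blob: URL together could look roughly like this. It is a sketch, not a tested add-on: the function name saveDataUrl is made up, and the downloads parameter exists only so the function can be exercised outside the browser (inside an add-on it falls back to browser.downloads, which needs the “downloads” permission).

```javascript
// Hypothetical helper: converts a data URL to a blob: URL and starts a download.
async function saveDataUrl(dataUrl, filename, downloads = browser.downloads) {
  // fetch() understands data: URLs, so no manual base64 decoding is needed.
  const blob = await (await fetch(dataUrl)).blob();
  const objectUrl = URL.createObjectURL(blob);
  // downloads.download() resolves with the id of the new download item.
  const id = await downloads.download({
    url: objectUrl,
    filename: filename,
    conflictAction: "uniquify"
  });
  // The object URL is returned so the caller can release it later.
  return { id, objectUrl };
}
```

The caller can release the blob with URL.revokeObjectURL(objectUrl) once the download has completed; revoking it earlier will probably interrupt the download.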

“Next step is to find a way to capture the full Web page.”

tabs.captureTab() won’t give you that. tabs.saveAsPDF() would, but you can’t control where that gets saved.

You can either:

  • scroll the page, take multiple screenshots and combine them on a canvas
  • serialize the page (by inlining images, fonts, style links, maybe even elements’ computed styles) and render it elsewhere (maybe that works on a canvas, otherwise use a headless browser and/or an online service)
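
For the first option, the scroll positions and crop offsets can be planned up front. The sketch below only does that arithmetic; the function name computeScrollSteps and the field names are made up, and it ignores device pixel ratio, horizontal scrolling, and fixed or sticky page elements, all of which a real implementation would have to deal with.

```javascript
// Hypothetical helper: plans the screenshots needed to cover a page of
// `pageHeight` CSS pixels with a viewport of `viewportHeight` CSS pixels.
function computeScrollSteps(pageHeight, viewportHeight) {
  if (!(pageHeight > 0) || !(viewportHeight > 0)) return [];
  // A page cannot be scrolled further than this.
  const maxScroll = Math.max(0, pageHeight - viewportHeight);
  const steps = [];
  for (let destY = 0; destY < pageHeight; destY += viewportHeight) {
    const scrollY = Math.min(destY, maxScroll);
    steps.push({
      scrollY: scrollY,           // where to scroll before capturing
      sourceY: destY - scrollY,   // rows to skip at the top of that capture
      destY: destY,               // where the strip goes on the big canvas
      height: Math.min(viewportHeight, pageHeight - destY)
    });
  }
  return steps;
}

console.log(computeScrollSteps(2500, 1000));
// [ { scrollY: 0,    sourceY: 0,   destY: 0,    height: 1000 },
//   { scrollY: 1000, sourceY: 0,   destY: 1000, height: 1000 },
//   { scrollY: 1500, sourceY: 500, destY: 2000, height: 500 } ]
```

Each step would then be drawn with something like ctx.drawImage(img, 0, sourceY, width, height, 0, destY, width, height), where img is the decoded screenshot for that step.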

(Juraj Masiar) #8

There is one working example of such an add-on here:


(jscher2000) #9

For pages that aren’t huge, this is an example of grabbing an image up to 32767 pixels tall and wide:


(Niklas Gollenstede) #10

I didn’t know Firefox had a ctx.drawWindow() function, that could indeed make this a lot easier.

One alternative I thought of was to simply increase the size of the window, but the limits here are even lower than with the canvas. I tested it on Windows 10, and the following restrictions seem to apply:

  • setting a window width or height of more than 2^14 (16384) will be clipped to that value
  • setting a width that results in an .innerWidth of more than 2^13 (8192) will resize the native window, but neither the content nor the UI, resulting in a blank bar at the right
  • setting a height of more than 8198 (don’t know where this restriction comes from) will still flow the content correctly, but hide any additional pixel rows under a black bar
  • for me that means a practical maximum of 8204 by 8198 px, but Windows doesn’t handle it particularly well if you drag or minimize windows of that size

Bottom line I am actually rather disappointed by this, that’s just 600px wider than my current monitor setup (if I include the laptop screen, which I usually don’t use). But two 5k monitors are already wider than this. That isn’t really future-proof …


(Stageinfo) #11

Thank you all for your help

I tried :
const blob = await (await fetch(imageUri)).blob();
but I had some problems using asynchronous functions in my code, so I used this instead:

fetch(imageUri)
  .then(function (response) {
    return response.blob();
  })
  .then(function (myBlob) {
    var urlTest = URL.createObjectURL(myBlob);
    // Returning the promise lets later .then()/.catch() handlers see the result
    return browser.downloads.download({
      url: urlTest,
      filename: "DossierCapture/test.png",
      conflictAction: "uniquify"
    });
  });

I tried “Take Screenshot” on Chrome, but I didn’t manage to download a page. It scrolls down the page and opens a new tab (about:blank), but I didn’t get anything after that.
The “Save Screenshot” add-on helped me a lot, especially this part:

function SaveScreenshot(aLeft, aTop, aWidth, aHeight) {

    // Maximum size is limited!
    // https://dxr.mozilla.org/mozilla-central/source/dom/canvas/CanvasRenderingContext2D.cpp#5517
    // https://dxr.mozilla.org/mozilla-central/source/gfx/2d/Factory.cpp#316

    if (aHeight > 32767) aHeight = 32767;
    if (aWidth > 32767) aWidth = 32767;

    var canvas = document.createElementNS("http://www.w3.org/1999/xhtml", "html:canvas");
    canvas.height = aHeight;
    canvas.width = aWidth;

    var ctx = canvas.getContext("2d");
    ctx.drawWindow(window, aLeft, aTop, aWidth, aHeight, "rgb(0,0,0)");

    var imgdata = canvas.toDataURL("image/png");

    sendMessage(imgdata);
}

It took me some time to understand how to make the content script and the background script communicate.
I use:
browser.tabs.query and browser.tabs.sendMessage in the background script.
browser.runtime.onMessage and browser.runtime.sendMessage in the content script.
browser.runtime.onMessage again, this time in the background script.
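
Those three hops can be sketched as follows. The message type names ("capture-request", "capture-result") and the function names are made up for illustration, and the api parameter exists only so the flow can be tried outside the browser; in the add-on it defaults to the global browser object and each handler would be registered with browser.runtime.onMessage.addListener in its own script.

```javascript
// Background script: find the active tab and ask its content script to capture.
async function requestCapture(api = browser) {
  const [tab] = await api.tabs.query({ active: true, currentWindow: true });
  await api.tabs.sendMessage(tab.id, { type: "capture-request" });
}

// Content script: handler for runtime.onMessage. `capture` is whatever
// produces the data URL (for example a canvas-based function).
function onContentMessage(message, capture, api = browser) {
  if (message.type !== "capture-request") return;
  api.runtime.sendMessage({ type: "capture-result", dataUrl: capture() });
}

// Background script again: handler for runtime.onMessage. `save` is whatever
// stores the image (for example a function that calls downloads.download()).
function onBackgroundMessage(message, save) {
  if (message.type !== "capture-result") return;
  save(message.dataUrl);
}
```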

Thanks again


(Juraj Masiar) #12

I would highly recommend using the await versions. For that you need to mark your function as async, for example:

async function fn(imageUri) { const blob = await (await fetch(imageUri)).blob(); }

This function returns a Promise, just like any function that is declared as async.
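
A tiny illustration of that point (the function name answer is made up): the value you return inside an async function is what the Promise resolves to, so the caller can use either await or .then().

```javascript
// An async function always returns a Promise, even for a plain return value.
async function answer() {
  return 42;
}

const result = answer();
console.log(result instanceof Promise); // true

// The returned value is what the Promise resolves to.
result.then((value) => console.log(value)); // 42
```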

Once you start using these, you realize how much nicer your code can be.

Also make sure to read the API (it’s worth it)



(Stageinfo) #13

Thanks juraj.masiar
I read the API documentation again, and I will surely have to use await and asynchronous functions more often. I found a way to make my code work. I think my mistake was trying to download directly inside the async function.

function downloadPage(urlToDownload){
    var dl = browser.downloads.download({
	url : urlToDownload,
	filename : "DossierCapture/test.png",
	conflictAction : 'uniquify'
	});
    dl.then(onStartedDownload, onFailed);
}

async function DataURLtoURL(aDataURL) {
	const blob = await (await fetch(aDataURL)).blob();
	var urlToDownload = URL.createObjectURL(blob); 
	downloadPage(urlToDownload);
}

Last step is to find a way to move the downloaded file into a folder located in the root directory. I’m currently studying Native messaging.


(Stageinfo) #15

I’ve tried a very very simple form of the native messaging method (directly with a .bat or a .sh file to move the downloaded file).

  1. In onStartedDownload(), I put:
var navigatorPlatform = navigator.platform;
// No "g" flag: combined with .test() it would make the regexes stateful (lastIndex)
var regexWin = /win/i;
var regexLinux = /lin/i;
var port;

if (regexWin.test(navigatorPlatform)) {
    port = browser.runtime.connectNative("MoveBat");
}
else if (regexLinux.test(navigatorPlatform)) {
    port = browser.runtime.connectNative("MoveSh");
}
else {
    console.log("neither Linux nor Windows");
}
  2. The apps’ manifests have been written and put in the right place (including the Windows registry key).

  3. The apps’ manifests (Windows and Linux) point to a very simple .bat or .sh file containing this command:
    windows :
    move C:\Users\%username%\Downloads\*(filename or part of the filename)* C:\(Folder)\
    linux :
    mv ~/Downloads/*(filename or part of the filename)* ~/(Folder)/
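
As an alternative to matching navigator.platform with regular expressions, the platform can be read with browser.runtime.getPlatformInfo(), whose os field is "win" on Windows and "linux" on Linux. A minimal sketch, assuming the same native application names as above; the function names pickNativeApp and connectMover are made up, and the api parameter is only there so the code can be tried outside the browser.

```javascript
// Maps the `os` value from runtime.getPlatformInfo() to a native app name.
function pickNativeApp(os) {
  const apps = { win: "MoveBat", linux: "MoveSh" };
  return Object.prototype.hasOwnProperty.call(apps, os) ? apps[os] : null;
}

// Opens a native messaging port for the current platform, or returns null.
async function connectMover(api = browser) {
  const { os } = await api.runtime.getPlatformInfo();
  const name = pickNativeApp(os);
  if (name === null) {
    console.log("neither Linux nor Windows");
    return null;
  }
  return api.runtime.connectNative(name);
}
```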

I’m well aware that’s beginner’s work, since the .bat / .sh command can’t be changed by the WebExtension and no message is exchanged between the extension and the application (which is empty). It was enough to give my employer a sample of the native messaging method, and they decided not to use it.

I will work on the WebExtension’s Options page so the user can choose / create a folder inside the Downloads folder. Thanks for your help
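
One possible shape for that option, as a sketch only: the storage key folder, the default value, and both function names are made up for illustration. It assumes the setting is kept in browser.storage.local (which needs the “storage” permission); the storage parameter is only there so the code can be tried outside the browser.

```javascript
// Reads the configured sub-folder; the object passed to get() supplies the
// default that is returned when nothing has been saved yet.
async function loadCaptureFolder(storage = browser.storage.local) {
  const { folder } = await storage.get({ folder: "DossierCapture" });
  return folder;
}

// Builds a filename relative to the Downloads folder from the option value.
function buildFilename(folder, name) {
  // Strip leading/trailing slashes so the path stays relative.
  const clean = folder.replace(/^[\\/]+|[\\/]+$/g, "");
  return clean ? clean + "/" + name : name;
}

console.log(buildFilename("/Captures/", "test.png")); // "Captures/test.png"
```

The Options page would write the value with storage.set({ folder: ... }), and the result of buildFilename would go into the filename option of downloads.download().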


(Stageinfo) #16

Work done.
My add-on’s name is CapturePage.
I put my source code on https://github.com/stageinfo/CapturePage.
Thanks for your help