One of my Manifest V3 addons is not waking up when it should

OK, thanks for the help!
I’ll go sleep soon so good night and have a nice weekend :slight_smile:.

1 Like

Assuming the issue is still reproducing on your end, I have a pair of follow up questions for you, @juraj.masiar.

  1. Could you run the following snippet in Browser Toolbox and share the result? It should be a short string.
WebExtensionPolicy.getByID("admin@fastaddons.com_SendTabToDeviceEnhancer").extension.backgroundState
  1. Could you share the addonStartup.json.lz4 file from your profile directory with me? You can find the path to your profile directory by visiting about:profiles. I just sent you a message with my address.

Hello,
Yes, my Firefox is still running with addon still broken.

WebExtensionPolicy.getByID("admin@fastaddons.com_SendTabToDeviceEnhancer").extension.backgroundState
"stopped" 

I’m sending you the " addonStartup" now.

Thanks, @juraj.masiar! From what I’ve heard, so far everything looks pretty normal, so unfortunately we’re still digging. I’ll let you know if the team has any more questions for you.

I’m almost sure now this affects more addons.

I have another addon that I’ve migrated to non-persistent background script and since than, I’ve noticed a strange bug.

Sometimes, when I update storage (by marking a message as “read”), the badge number doesn’t update (it shows number of un-read messages).

The badge is updated in the background script and there is a event handler for it:

browser.storage.local.onChanged.addListener(updateBadge);

BUT the badge will update the moment I open popup window (I suspect this wakes up the background script). And then the issue doesn’t happen again, even when I wait for the background script to stop again.

I would like to test this with this broken addon - to see, if waking it up, even temporarily, fixes it.

Is there a way to wake up the background script of this addon? It doens’t have a popup and inspecting background script when it’s unloaded doesn’t wake it up.

@dotproto Good news!
I’ve just managed to reproduce the issue I’ve mentioned above (where the bade doesn’t update).

It’s super random, I had to wait for the script to stop and then update storage and repeat like 10 times and NOW I have a stack trace error:

Text version:

WebExtension context not found! ExtensionParent.sys.mjs:1362:13
    getContextById resource://gre/modules/ExtensionParent.sys.mjs:1362
    recvAPICall resource://gre/modules/ExtensionParent.sys.mjs:1181
    _recv resource://gre/modules/ConduitsChild.sys.mjs:90
    receiveMessage resource://gre/modules/ConduitsParent.sys.mjs:470
    (Async: JSActor query)
    _doSend resource://gre/modules/ConduitsChild.sys.mjs:64
    _send resource://gre/modules/ConduitsChild.sys.mjs:125
    callParentAsyncFunction resource://gre/modules/ExtensionChild.sys.mjs:900
    get chrome://extensions/content/child/ext-storage.js:318
    callAsyncFunction resource://gre/modules/ExtensionCommon.sys.mjs:1196
    callAsyncFunction resource://gre/modules/ExtensionChild.sys.mjs:726
    callAndLog resource://gre/modules/ExtensionChild.sys.mjs:706
    callAsyncFunction resource://gre/modules/ExtensionChild.sys.mjs:725
    stub resource://gre/modules/Schemas.sys.mjs:2954
    O moz-extension://6329fc15-83a7-467b-bd3d-f1a4a343afe0/background/background_sntd.js:1
    ae moz-extension://6329fc15-83a7-467b-bd3d-f1a4a343afe0/background/background_sntd.js:1
    InterpretGeneratorResume self-hosted:1417
    AsyncFunctionNext self-hosted:804
    (Async: async)
    <anonymous> moz-extension://6329fc15-83a7-467b-bd3d-f1a4a343afe0/background/background_sntd.js:1
    <anonymous> moz-extension://6329fc15-83a7-467b-bd3d-f1a4a343afe0/background/background_sntd.js:1

Something throws when the event should fire and that will fail to wake up the background script!

Please pass this to the devs, I’m still running 128.0b5 version.

I’ll go sleep now, good night :slight_smile:.

1 Like

Oh dang! Thanks for the update. Passing this on now.

Can you share the source of the extension that produced this error?

I’ve audited the code base to see if there is any way to end up in the state described in this thread. And the only explanation I can find is if the extension process has somehow crashed during the startup of the background script, which can be triggered by any of your extensions. When the extension process crashes, Firefox automatically tries to recover by starting it again, but gives up if there are more than 5 crashes within 30 seconds. After this, the extension process will only resume if there is a document load that needs to be hosted in the extension process, such as extension tabs, action popups, options pages.

When the extension process is revived successfully after the crash threshold had exceeded, the background page will still not restart automatically, with a few exceptions. One such exception is that the next runtime.sendMessage call (e.g. from an extension popup or a content script) will wake up the background page. This can explain your observed behavior with the Send Note to Device. The stack trace itself could be interpreted as the background page being terminated very suddenly, before Firefox has had a chance to process the browser.storage.sync.get() call (crxviewer link).

To check whether this is the case, could you report the outputs of the following three commands from the browser toolbox?

// Whether any crash has happened this browser session.
ChromeUtils.importESModule("resource://gre/modules/Extension.sys.mjs").ExtensionProcessCrashObserver.lastCrashedProcessChildID

// Whether there have been recent crashes.
ChromeUtils.importESModule("resource://gre/modules/Extension.sys.mjs").ExtensionProcessCrashObserver.lastCrashTimestamps

// Whether we stopped reviving the extension process due to too many crashes.
ChromeUtils.importESModule("resource://gre/modules/Extension.sys.mjs").ExtensionProcessCrashObserver.processSpawningDisabled

And also check about:crashes for any crashes, and if there are recent crashes, submit them and share the crash IDs here.

1 Like

Hello Rob!
Here you go:

There have been no crashes this year.
And the commands execution using “Multiprocess Browser Toolbox” can be seen above, but it looks like there are no crashes either.

Isn’t there only one extension process for all extensions? That would also break all of my other extension, but everything else works here just fine.

Let me make a short video, maybe it will help you see something important…

@rob here the video where I managed to reproduce it again, what a luck! :smiley:
https://mega.nz/file/kHZxkQoa#QcdnNXbAPGCLCYVZUngY_Kf0OWaz6d-jD2Io3NSih2c

Not a crash? Interesting… Does your global Browser Console contain any errors from near the startup of your Firefox instance that could potentially relate to the non-started background page? If in doubt, share everything. A video of you scrolling through the console would also work. Feel free to share it privately by mail with me or Simeon.

Other than the original issue (of the background being asleep), I don’t see anything in the video that shows that the popup thing is the same issue. To show that it is a similar issue, you would have to show that the background script was stopped when the badge text should have been updated. If this is the case, please run the snippet that Simeon shared before in comment 10 to retrieve the internal persistentListeners, and check whether it looks similar to your original case (i.e. listeners is missing added key).

The stack trace at your previous comment (24) shows the “WebExtension context not found!” error, with a call originating from background_sntd.js. This shows that the background did live before, but that it was closed before the storage.sync.get API call was processed.

In the video I also saw a bunch of error messages. Most of them are likely unrelated to the original issue, but for completeness I list them along with their meaning:

  • “sendRemoveListener on closed conduit …” - bug 1633315 - often triggered when an extension context (including content scripts) was unloaded.
  • “AbortError: Actor ‘Conduits’ destroyed before query ‘RunListener’ was resolved” - This happens when an event is dispatched to an extension context, but the extension context unloads before it was able to acknowledge that the event has been processed.
  • “WebExtension context not found” - bug 1447806 - when an API call from the child process for a specific extension context is received for an unknown context (often due to the context having been unloaded).
  • “FrameData missing for [extension ID] page moz-extension://[uuid]/popup/popup.html” - bug 1749116 - when the popup closed while the content was initializing.

I’m almost sleeping here (it’s midnight :slight_smile:), so I’m not sure I can think straight anymore.
But let’s see:

  1. sadly, the console has only today’s logs. I’ve started Firefox like a week ago.

  2. you are right, these are most likely two distinct issues.
    The issue with my “Send Tab to Device Enhancer” is interesting, because it’s super simple code, yet broken.
    But I don’t think I will be able to reproduce it after the restart.
    Is there anything else I can run to find out how it got it’s current state?
    If not, then I can try to reload it using the developer tool “Reload button”. Maybe it will fail with some new error, that would be nice :slight_smile:.

  3. now, about the second issue, I’ve just tested it more now and I did managed to reproduce it again, and when it throws (when the badge is not updated), the background script is NOT woken up:


    The interesting thing is, when I reproduced it again, I’ve actually noticed, that the state may have changed to “Running” for a few milliseconds.
    So I think it will wake it up, but also stop it right away, which then throws the stack trace.
    UPDATE:
    I’ve seen it now, the background script state changes to “Running” for a blink of an eye and then it’s “Stopped” again. So for sure a different issue, but still a bad one :slight_smile:.

Let’s not try to reload the extension. If you do that, the extension will work as usual and then go away.

To determine the exact state of the extension, could you run the following snippets and share the result?

Run the following snippet and take a screenshot of the expanded object. You only need to expand once, you don’t have to expand the sub objects:

WebExtensionPolicy.getByID('admin@fastaddons.com_SendTabToDeviceEnhancer').extension

Run the following snippet, expand the object and mentioned properties, and take a screenshot of the expanded object:

{let extension = WebExtensionPolicy.getByID('admin@fastaddons.com_SendTabToDeviceEnhancer').extension;
extension.apiManager.apis.get(extension).get('backgroundPage');}
// Expand: backgroundBuilder
// Expand: backgroundBuilder.backgroundContextOwner
// Expand: backgroundBuilder.backgroundContextOwner.bgInstance
// Expand: backgroundBuilder.backgroundContextOwner.context
// Expand: backgroundBuilder.idleManager

This snippet will copy the internal set of listeners to the clipboard (share as a file since this can contains thousands of characters):

{
let extension, LISTENERS, ONCE_MAP, onceMapReverse;
extension = WebExtensionPolicy.getByID('admin@fastaddons.com_SendTabToDeviceEnhancer').extension;
LISTENERS = Object.getOwnPropertySymbols(extension.emitter).find(s=>s.description === 'listeners');
ONCE_MAP = Object.getOwnPropertySymbols(extension.emitter).find(s=>s.description === 'onceMap');
onceMapReverse = new Map(ChromeUtils.nondeterministicGetWeakMapKeys(extension.emitter[ONCE_MAP]).map(k => [extension.emitter[ONCE_MAP].get(k), k]));
copy(JSON.stringify(
Array.from(extension.emitter[LISTENERS], ([eventName, listeners], i, emitter) => {
  return [eventName, Array.from(listeners, l => String(onceMapReverse.get(l) ?? "/**/" + l))];
})
, null, 2));
}

The third is not that long:

[
  [
    "add-permissions",
    [
      "/**/(ignoreEvent, permissions) => {\n      for (let perm of permissions.permissions) {\n        this.permissions.add(perm);\n      }\n      this.policy.permissions = Array.from(this.permissions);\n\n      updateAllowedOrigins(this.policy, permissions.origins, /* isAdd */ true);\n      this.allowedOrigins = this.policy.allowedOrigins;\n\n      if (this.policy.active) {\n        this.setSharedData(\"\", this.serialize());\n        Services.ppmm.sharedData.flush();\n        this.broadcast(\"Extension:UpdatePermissions\", {\n          id: this.id,\n          origins: permissions.origins,\n          permissions: permissions.permissions,\n          add: true,\n        });\n      }\n\n      this.cachePermissions();\n      this.updatePermissions();\n    }"
    ]
  ],
  [
    "remove-permissions",
    [
      "/**/(ignoreEvent, permissions) => {\n      for (let perm of permissions.permissions) {\n        this.permissions.delete(perm);\n      }\n      this.policy.permissions = Array.from(this.permissions);\n\n      updateAllowedOrigins(this.policy, permissions.origins, /* isAdd */ false);\n      this.allowedOrigins = this.policy.allowedOrigins;\n\n      if (this.policy.active) {\n        this.setSharedData(\"\", this.serialize());\n        Services.ppmm.sharedData.flush();\n        this.broadcast(\"Extension:UpdatePermissions\", {\n          id: this.id,\n          origins: permissions.origins,\n          permissions: permissions.permissions,\n          add: false,\n        });\n      }\n\n      this.cachePermissions();\n      this.updatePermissions();\n    }"
    ]
  ],
  [
    "shutdown",
    [
      "(what, isAppShutdown) => {\n      if (this.onShutdown) {\n        this.onShutdown(isAppShutdown);\n      }\n      this.extension = null;\n    }",
      "/**/() => {\n        extension.off(\"background-script-started\", done);\n        extension.off(\"background-script-aborted\", done);\n        extension.off(\"shutdown\", done);\n        resolve();\n      }",
      "(what, isAppShutdown) => {\n      if (this.onShutdown) {\n        this.onShutdown(isAppShutdown);\n      }\n      this.extension = null;\n    }"
    ]
  ],
  [
    "background-script-started",
    [
      "() => {\n      extension.emit(\"background-first-run\");\n    }",
      "/**/() => {\n        extension.off(\"background-script-started\", done);\n        extension.off(\"background-script-aborted\", done);\n        extension.off(\"shutdown\", done);\n        resolve();\n      }",
      "() => {\n        this.idleManager.resetTimer();\n      }"
    ]
  ],
  [
    "background-script-aborted",
    [
      "/**/() => {\n        extension.off(\"background-script-started\", done);\n        extension.off(\"background-script-aborted\", done);\n        extension.off(\"shutdown\", done);\n        resolve();\n      }"
    ]
  ],
  [
    "background-script-reset-idle",
    [
      "/**/(event, { reason }) => {\n      if (\n        extension.backgroundState == BACKGROUND_STATE.SUSPENDING &&\n        // After we begin suspending the background, parent API calls from\n        // runtime.onSuspend listeners shouldn't cancel the suspension.\n        reason !== \"parentapicall\"\n      ) {\n        extension.backgroundState = BACKGROUND_STATE.RUNNING;\n        extension.emit(\"background-script-suspend-canceled\");\n      }\n\n      this.idleManager.resetTimer();\n\n      if (this.isWorker) {\n        // TODO(Bug 1790087): record similar telemetry for service workers.\n        return;\n      }\n      if (reason === \"event\" || reason === \"parentapicall\") {\n        // Bug 1868960: not recording these because too frequent.\n        return;\n      }\n\n      // Keep in sync with categories in WEBEXT_EVENTPAGE_IDLE_RESULT_COUNT.\n      let KNOWN = [\"nativeapp\", \"streamfilter\", \"listeners\"];\n      ExtensionTelemetry.eventPageIdleResult.histogramAdd({\n        extension,\n        category: `reset_${KNOWN.includes(reason) ? reason : \"other\"}`,\n      });\n    }"
    ]
  ],
  [
    "background-script-idle-waituntil",
    [
      "/**/(_, { promise, reason }) => {\n      this.idleManager.waitUntil(promise, reason);\n    }"
    ]
  ],
  [
    "start-background-script",
    [
      "async () => {\n      if (!this.extension || this.extension.hasShutdown) {\n        // Extension was shut down. Don't build the background page.\n        // Primed listeners have been cleared in onShutdown.\n        return;\n      }\n      await this.build();\n    }"
    ]
  ],
  [
    "background-script-event",
    [
      "async () => {\n      await ExtensionParent.browserPaintedPromise;\n      extension.emit(\"start-background-script\");\n    }"
    ]
  ]
]

I’ve released new version 3.9.1 with disabled minification.
Now, let me see if I can still reproduce it…

UPDATE:
Yes, I can still reproduce it. The background script is woken up and stopped right away, with the “WebExtension context not found!” exception.
It’s pretty tough to reproduce though. I’ve tried to write an automatic script that randomly updates storage every 40 seconds, but doesn’t reproduce. I guess something needs to be “un-cached” to slow it down to trigger the race condition…

Thanks for all information that you provided. It encouraged me to take another look at the implementation, this time without trusting that anything behaves like it should. By doing so, I found one way to trigger your issue that doesn’t require a crash of the extension process. I filed this bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1905153 (“Non-functional extension APIs at startup due to race condition, when asyncLoadModule is called before loadModule”)

1 Like

Thank you Rob for all this hard work!
I’m happy to help any time :slight_smile:.

But reading your amazingly written bug report, it sounds pretty bad. Race conditions are the worst (and so common).

Funny thing, my extension process just crashed moments ago, I think I’ve discovered a memory leak/infinite loop in my speed dial addon. So after a year, I have a new crash report :slight_smile:…, it’s karma for bragging that I had none this year :smiley: .

So… I hate to ask this so soon, but what about the second issue? :slight_smile:
(background script being terminated right after waking up)

And anything I can do to help?

Following your statement “background script terminated right after waking up”, I audited the code base and found the root cause of the issue, filed as https://bugzilla.mozilla.org/show_bug.cgi?id=1905505. Thanks for bringing it to our attention!

1 Like

Great job Rob! Thank you!
And I’m glad I could help :slight_smile:.

I wanted to say that I’m happy I’m not the only one working on weekend, but it’s likely not good thing so I wish you less weekend jobs :slight_smile:.

Have a nice weekend and restful summer Rob :slight_smile: (and Simeon, you too!).