Screeching to a halt

At midnight last night, all three of my instances of the Things Gateway stopped working. There’s nothing logged beyond midnight. The user interface times out. External software cannot connect to it.

I’ve lost control of my home lighting and my heating. I woke to a cold house at 3am.

Did a software update go seriously wrong? Or is this just the ignominious end of the line?

Yikes! We did a fair amount of testing with OTA updates and didn’t run into any issues. The biggest change this time around was a switch to Node 10, so there was a Node version update plus some forced add-on updates.

Is there anything in the log? You can attempt to roll back as follows:

cd ~/mozilla-iot
/home/pi/mozilla-iot/gateway/tools/rollback.sh

The rollback doesn’t give any errors.

However, I can see in /var/log/syslog this error:

Error: /lib/arm-linux-gnueabihf/libc.so.6: version `GLIBC_2.28' not found (required by /home/pi/mozilla-iot/gateway/node_modules/sqlite3/lib/binding/node-v64-linux-arm/node_sqlite3.node)

Thereafter, I can see:

mozilla-iot-gateway.service: Unit entered failed state.

It tries to roll back, but never recovers.

Whoa. Not sure how that hadn’t happened until now. I’m guessing you flashed these gateways a long time ago, with something based on Raspbian Stretch.

You should be able to do something like this:

  1. Manually update:
    cd ~/mozilla-iot
    /home/pi/mozilla-iot/gateway/tools/upgrade.sh
    
  2. Stop the gateway:
    sudo systemctl stop mozilla-iot-gateway
    
  3. Rebuild the node modules:
    cd ~/mozilla-iot/gateway
    npm install --production
    
  4. Start the gateway:
    sudo systemctl start mozilla-iot-gateway
    

Alternatively, if the rollback worked (couldn’t tell from your response), you can disable automatic updates in Settings -> Updates.

These three installations were bog-standard Things Gateway images downloaded from Mozilla in March 2019. I think this is the first automatic upgrade that failed.

In your step 1 above, the upgrade.sh script requires two parameters. What are the appropriate values for [gateway_archive_url] [node_modules_archive_url]?

Oh, sorry. check-for-update.sh, not upgrade.sh.

No change in behavior. Watching syslog, I see the GLIBC_2.28 not found error.

Apr  7 12:00:26 cottage run-app.sh[4640]: Error: /lib/arm-linux-gnueabihf/libc.so.6: version `GLIBC_2.28' not found (required by /home/pi/mozilla-iot/gateway/node_modules/sqlite3/lib/binding/node-v64-linux-arm/node_sqlite3.node)
...
Apr  7 12:00:26 cottage systemd[1]: Started Mozilla WebThings Gateway Client Update Rollback.
Apr  7 12:00:36 cottage systemd[1]: mozilla-iot-gateway.service: Service hold-off time over, scheduling restart.
Apr  7 12:00:36 cottage systemd[1]: Stopped Mozilla WebThings Gateway Client.
Apr  7 12:00:36 cottage systemd[1]: Started Mozilla WebThings Gateway Client.

It’s an endless loop of error/rollback/stop/start.
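
To see which glibc version the failing module is asking for without scrolling through the loop, something like the following might help. It is only a sketch: the required_glibc function name is made up for illustration, and it assumes the error lines look like the one quoted above and that syslog lives at /var/log/syslog.

```shell
#!/bin/sh
# Hypothetical helper (not part of the gateway): read log lines on stdin and
# print each distinct glibc version that is reported as "not found".
required_glibc() {
  sed -n 's/.*version [^G]*GLIBC_\([0-9.]*\)[^ ]* not found.*/\1/p' | sort -u
}

# Example usage on the Pi (log path is an assumption):
#   required_glibc < /var/log/syslog
```

If this prints a version newer than the one installed on the system, the module was built against a newer base image than the one the gateway is running on.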

Try this:

sudo systemctl stop mozilla-iot-gateway
cd ~/mozilla-iot/gateway
sudo rm -rf node_modules
npm ci --production
sudo systemctl start mozilla-iot-gateway

You may also have to reinstall certain add-ons.

Lots of #fail.

It looks like the Zigbee and ZWave adapters need to be rebuilt as they say they’re built with NODE_MODULE_VERSION 57 and NODE_MODULE_VERSION 64 is required. How do I do that?

There was also this ominous sounding error: (node:13552) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 propertyChanged listeners added. Use emitter.setMaxListeners() to increase limit

Suggestions?

Removing the Zigbee and ZWave adapters and then re-adding them resolved that problem. I’ve regained control over the lights in this building.

Now I’m doing these same steps on the other gateways in other buildings to see if I can regain control over their respective heat sources.

The memory leak thing has existed for a long time. It’s a non-issue.

That worked for me with a gateway from mid-2018, thanks!

Same issue with my home automation system since the update. I tried the steps here, but it freezes at npm ci and never completes… or maybe I’m not patient enough?

I wasn’t patient enough… npm ci completed and I was able to connect over HTTPS. It looks like the GPIO add-on was lost, though. I removed and reinstalled it, and now I’m back up and running!

I’m afraid my gateway failed as well.

Thanks for the tips @mstegeman. SSH is disabled on my gateway so I’ll have to plug in a monitor and keyboard later and try the steps above.

This worked for me too (needed to do sudo rm -rf node_modules).

I also needed to remove and re-add the Zigbee adapter add-on to get devices to work.

You’ll need to reinstall the GPIO add-on.

Apologies for the issues with this upgrade, everyone. Let me offer a quick explanation, for anyone who cares.

There are a few things at play here:

  1. We switched from Raspbian Stretch to Raspbian Buster about a year and a half ago, in order to support the Raspberry Pi 4, but continued issuing a single OTA update. That was bound to catch up to us.

  2. The build process for this release (0.12) was changed. The previous build process was tedious, consisting of:

    • A series of manual steps on a first-generation Raspberry Pi.
    • A series of automated steps via Travis CI/GitHub Actions.
    • A series of manual steps on a Linux workstation.

    The process is now a single script, which can be done via automation.

  3. We had to switch Node versions (8 -> 10) in this release, as 8 is no longer supported and several of our dependencies were already dropping support for it.

So, the primary issue here is that the gateway itself (and its dependencies) are now built as a part of the Raspbian Buster image creation process, rather than being cross-compiled in a Debian Stretch environment and then copied into the image.

Since Buster uses a newer version of glibc, the OTA update went awry for people using the old Stretch images.
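
For anyone wanting to check whether their image is affected: Debian/Raspbian Stretch ships glibc 2.24, while Buster ships 2.28, which is the version named in the error. A rough sketch of the comparison follows. The names version_ge and check_glibc are invented for this example, and it relies on sort -V and getconf GNU_LIBC_VERSION being available, which should be the case on Raspbian.

```shell
#!/bin/sh
# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Prints "ok", "too-old", or "unknown" for required version $1 and
# installed version $2.
check_glibc() {
  if [ -z "$2" ]; then
    echo "unknown"
  elif version_ge "$2" "$1"; then
    echo "ok"
  else
    echo "too-old"
  fi
}

# getconf prints something like "glibc 2.24"; keep only the version number.
installed="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')"
check_glibc 2.28 "$installed"
```

A result of "too-old" would suggest the gateway is still on a Stretch-based image and will hit the error above with prebuilt 0.12 modules.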

When those upgrades failed, the part of the upgrade process that pulls down the proper add-on updates for the new Node version also failed, leading to the multi-stage failure.
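
The NODE_MODULE_VERSION numbers reported earlier in the thread are Node’s native-module ABI versions: 57 corresponds to Node 8 and 64 to Node 10. Below is a small sketch for checking what the installed Node expects. The abi_to_node helper is hypothetical and only covers the two versions relevant here.

```shell
#!/bin/sh
# Hypothetical helper: map a NODE_MODULE_VERSION (ABI) number to a Node major
# version. Only the two values seen in this thread are covered.
abi_to_node() {
  case "$1" in
    57) echo "8" ;;
    64) echo "10" ;;
    *)  echo "unknown" ;;
  esac
}

# Ask the installed Node for its ABI version, if node is on the PATH.
if command -v node >/dev/null 2>&1; then
  abi="$(node -p 'process.versions.modules')"
  echo "Node ABI $abi (Node major: $(abi_to_node "$abi"))"
fi
```

An add-on that reports being built for 57 while the gateway runs on 64 needs to be reinstalled or rebuilt for the new Node version.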

Again, I apologize for this. On the bright side, the gateway should be in a much better state now, such that anyone can build an image by running one script.

For anyone jumping directly to this comment, there are two solutions:

  1. Rebuild your Node modules and reinstall broken add-ons, as instructed here.
  2. Back up your /home/pi/.mozilla-iot directory, flash a fresh 0.12 image, then restore that directory. You’ll probably still have to reinstall some add-ons.
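
For the second option, the backup and restore could be done with tar, roughly as sketched below. The function names and the archive path are placeholders chosen for this example. Stopping the gateway first avoids archiving the database while it is being written to.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the gateway tooling.

# Archive directory $1 (e.g. /home/pi/.mozilla-iot) into tarball $2.
backup_profile() {
  tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# Unpack tarball $1 into parent directory $2 (e.g. /home/pi).
restore_profile() {
  tar -xzf "$1" -C "$2"
}

# Example usage (paths are assumptions).
# On the old image:
#   sudo systemctl stop mozilla-iot-gateway
#   backup_profile /home/pi/.mozilla-iot /tmp/mozilla-iot-backup.tgz
# Copy the archive off the Pi, flash the 0.12 image, copy it back, then:
#   sudo systemctl stop mozilla-iot-gateway
#   restore_profile /tmp/mozilla-iot-backup.tgz /home/pi
#   sudo systemctl start mozilla-iot-gateway
```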

Thanks - all fixed now!

Partial success – I was able to reinstall gpio and zwave-adapter, but not zigbee-adapter.

I keep getting:
2020-04-15 00:28:31.618 INFO : Fetching add-on https://s3-us-west-2.amazonaws.com/mozilla-gateway-addons/zigbee-adapter-0.11.0-linux-arm-v10.tgz as /tmp/GGduwV/zigbee-adapter.tar.gz
2020-04-15 00:28:32.201 INFO : Expanding add-on /tmp/GGduwV/zigbee-adapter.tar.gz
2020-04-15 00:28:39.949 INFO : Loading add-on: zigbee-adapter
2020-04-15 00:28:41.586 INFO : zigbee-adapter: Opening database: /home/pi/.mozilla-iot/config/db.sqlite3
2020-04-15 00:28:42.336 INFO : zigbee-adapter: Loading add-on zigbee-adapter from /home/pi/.mozilla-iot/addons/zigbee-adapter
2020-04-15 00:28:44.223 ERROR : zigbee-adapter: Error: /lib/arm-linux-gnueabihf/libc.so.6: version `GLIBC_2.28' not found (required by /home/pi/.mozilla-iot/add

@Pete_Skeggs See here.

If you don’t like the solution in that thread, you can fix this manually as follows:

sudo systemctl stop mozilla-iot-gateway
cd ~/.mozilla-iot/addons/zigbee-adapter
sudo rm -rf node_modules
npm ci --production
mkdir .git
sudo systemctl start mozilla-iot-gateway

You should note that you’ll continue running into issues while running the older Raspbian version, though.
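
To confirm which Raspbian release a gateway is running, /etc/os-release can be inspected. Here is a minimal sketch, assuming the usual VERSION_ID values (9 for Stretch, 10 for Buster). The raspbian_release name is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the release name for the os-release file at $1
# (defaults to /etc/os-release), based on its VERSION_ID field.
raspbian_release() {
  file="${1:-/etc/os-release}"
  id="$(sed -n 's/^VERSION_ID="\{0,1\}\([0-9]*\).*/\1/p' "$file" 2>/dev/null)"
  case "$id" in
    9)  echo "stretch" ;;
    10) echo "buster" ;;
    *)  echo "unknown" ;;
  esac
}

raspbian_release
```

If this prints "stretch", the manual rebuild steps are likely to be needed again after future updates, and reflashing with a Buster-based image is the more durable fix.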