System unresponsive in some cases

Logging more than a few days of temperature data makes the system unresponsive for several minutes when I try to open the log page. I tried this with logging the Pi temperature as well as with a few Z-Wave sensors; both have the same issue.
I also use a few Ikea Zigbee buttons. Clicking them quickly several times makes the whole system unresponsive too, although it remembers the clicks and executes them eventually, mostly after 30 seconds or so.

Any ideas why this is happening? I use a Raspberry Pi 4 with 2 GB of RAM, so pretty standard.

Are you using an SD card?

If yes, then the amount of data logged by overly precise (2 decimal places?) and not necessarily accurate thermometers is likely overloading the SQLite database on the SD card. The time needed to repaint that much data in JavaScript may be a problem too.

I handled this issue with 2 different approaches.

  1. I submitted a PR to configure the precision of the thermometer I use (RuuviTag), and
  2. I created a virtual machine on my main PC to run Influx and Grafana, which are purpose-built tools for capturing and graphing high volumes of data. There is an extension to log data to Influx.
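To illustrate why the first approach helps, here is a minimal Python sketch. It is not the code from the RuuviTag PR; the function name `filter_readings` and the assumption that a value is only logged when it differs from the previously logged one are hypothetical and used purely for illustration.

```python
def filter_readings(readings, decimals=1):
    """Round each reading and keep it only if it differs from the last kept value.

    Hypothetical helper: it assumes a value is logged only when it changes.
    """
    logged = []
    last = None
    for value in readings:
        rounded = round(value, decimals)
        if rounded != last:
            logged.append(rounded)
            last = rounded
    return logged


# Made-up sample readings from a sensor that reports 2 decimal places.
readings = [21.41, 21.42, 21.44, 21.43, 21.52, 21.51, 21.49]

print(len(filter_readings(readings, decimals=2)))  # every jitter step is kept
print(filter_readings(readings, decimals=1))       # only real changes are kept
```

With two decimals every small fluctuation counts as a new value, while with one decimal most of the jitter collapses into the same value and far fewer rows would need to be written.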

Additional point:

  • a small, cheap SSD will improve performance vastly compared to an SD card

Thanks for explaining. I understand it (although client-side JavaScript will not impact the performance of the WebThings gateway itself).
Logging is not that important for me, it’s just nice to have. A stable running system is what I’m looking for.

Does anyone have an idea why the Zigbee buttons make the system “freeze” sometimes?

This could be caused by the queuing of the temperature updates to the SQLite database. While the gateway is massively asynchronous, I’m not sure whether the writes to that database are inline or in a separate process. If they are inline, there is definitely an immediate performance impact. If they are asynchronous and the rate of updates even slightly exceeds the I/O capacity of the SD card for a short period, then there will be occasional stalls.
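The stall scenario described above can be sketched with a toy model in Python. This is not how the gateway handles its writes (which the post itself is unsure about); the function `simulate_backlog` and all numbers are made up to show how a short burst above the write capacity leaves a queue that takes a while to drain.

```python
def simulate_backlog(arrivals_per_tick, capacity_per_tick):
    """Return the number of pending writes after each tick.

    Toy model: each tick, new writes arrive and at most
    `capacity_per_tick` of the pending writes are completed.
    """
    backlog = 0
    history = []
    for arrivals in arrivals_per_tick:
        backlog = max(0, backlog + arrivals - capacity_per_tick)
        history.append(backlog)
    return history


# Hypothetical load: normally 3 writes per tick, with a 3-tick burst of 8.
arrivals = [3, 3, 8, 8, 8, 3, 3, 3, 3]

print(simulate_backlog(arrivals, capacity_per_tick=5))
```

In this model the burst lasts three ticks, but the queue is still not empty four ticks after the burst ends, which is the kind of delayed catch-up that would look like a temporary freeze.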

I turned logging off completely, so it should no longer have any impact on performance, as there are no more writes (unless SQLite is also used for other things?).

The other uses are very lightweight, I can’t imagine an impact.

I did the same thing with a GPIO button, and that works flawlessly. It must have something to do with the Zigbee adapter. (I see no errors in the logs when it freezes.)
To be more precise, on average 9 clicks will cause a freeze, and it gets worse if I do not wait long enough after it happens.
So it is probably not related to SQLite.

Perhaps this is the reason for the temporary unresponsiveness of a Zigbee button? I did not see this error before.

2020-12-10 22:26:42.994 ERROR : zigbee-adapter: Confirm Status: 240: MAC_TRANSACTION_EXPIRED , addr: 680ae2fffea4a873 f1bc

The logging was changed early this month to name these errors. Previously these messages were just a number named ‘undefined’ and were a bit confusing.

Of course, it is possible that this error message could be associated with a delay in ZigBee devices responding.

Is this error a ZigBee issue or a WebThings issue? I searched for the error and found some posts, but they didn’t help much…

As I understand it, it is a very low level ZigBee messaging error, where a response was not received within the expected time.

However, I don’t know much more than that: I just noticed that some unknown message numbers matched the entries in the MAC table, so I added those names into the error reporting.

Does nobody have an idea what could possibly solve this? Because this would actually make my whole setup practically useless…
I also have the same issue with another type of button (also from Ikea).

From what I can make out, this appears to be a ‘normal’ situation, unless it occurs each time that a message is sent to a particular device. In that case, the issue is likely to be the specific device rather than the WebThings software or the ZigBee adapter.

I’ll chime in with a situation I see when using cURL to toggle a thing’s “on” property between true and false (turning a Z-Wave device on or off).

In my case, when the property changes to on=true the action is mostly instantaneous (< .5s) and reliable, no matter the previous state of the thing.

When the property changes to on=false (off) the action is usually fast (< 2s), the extra time being the expected slow dim-to-off behaviour.

Occasionally, button presses that change the property to on=false just hang for up to 30 seconds and then eventually complete. Note that the cURL command includes a 30-second timeout to prevent it from blocking indefinitely (as in this case).

At the end of 30 seconds, the thing almost always transitions to on=false (off) as I have pressed the button that initiated cURL a second time. I’ll have to avoid button-pressing off a second time to debug the flow of this…

So, what I have on my system is the GW receiving cURL commands that then block on the GW. At this point I’m not sure if the issue is with the GW, the Z-Wave dongle, or the communication link to/from the Z-Wave device. I just know that it’s not 100% reliable when using cURL to set a property (see: curl_examples/setProperty.sh).
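For readers who want to reproduce this kind of test without cURL, here is a rough Python sketch of a property-setting request with a 30-second timeout. It is not a translation of curl_examples/setProperty.sh, which is not shown in this thread. The URL layout, the JSON body, and the bearer token header are assumptions modelled on the WebThings REST API, and the host, thing id, and token values are placeholders.

```python
import json
import urllib.request


def build_set_property_request(base_url, thing_id, prop, value, token):
    """Build a PUT request that sets one property of one thing.

    Assumed endpoint layout: <base_url>/things/<thing_id>/properties/<prop>
    """
    url = f"{base_url}/things/{thing_id}/properties/{prop}"
    body = json.dumps({prop: value}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def set_property(base_url, thing_id, prop, value, token, timeout=30.0):
    """Send the request; raises an error if no response arrives within `timeout` seconds."""
    request = build_set_property_request(base_url, thing_id, prop, value, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Placeholder values; nothing is sent here, the request is only constructed.
example = build_set_property_request(
    "http://gateway.local:8080", "zwave-1", "on", False, "TOKEN"
)
print(example.get_method(), example.full_url)
```

Calling `set_property(...)` with real values and timing how long it takes to return would show whether a given on=false request completes quickly or runs into the timeout.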

Scheduled rules and manual toggling of the Z-Wave device on/off using the GUI are 100% reliable, which hints that something fishy is going on that blocks API commands.

I found the issue… It is caused by the Ikea round button. They all seem to have this issue: even when you connect one directly to a lamp, the connection freezes when you click the button multiple times quickly. (I also have the smaller dimmer button, and this one does not have the issue.)

I also tested it with Home Assistant and had the same issue, but with Home Assistant the whole interface did not freeze completely when something like this happened, whereas with WebThings it does. Also, it would be nice if WebThings could detect when a bulb is offline.

Thanks for the responses.
FWIW, I think WebThings is by far superior, but less mature :slight_smile: