The LiquidEOS team has been working on strategies to significantly increase the stability of the EOS mainnet. Gaps in block production should not happen and below we propose some initial solutions. We have identified three main issues around EOS block production reliability:

Current Issues

1. There is a six-second “downtime” for the entire network when a block producer who is supposed to be on schedule is not producing. This may occur for a number of reasons: misconfiguration, a crashed computer or process, a BP transitioned from a standby position and wasn’t ready, or a node which is not yet synced with the blockchain.

2. There is a lot of manual syncing between block producers when updating blacklist entries and other configurations that are vital for reliable block production. Syncing manually creates room for error, and the effects of missing a blacklist entry could be worse than not producing.

3. The handoff between BPs is sometimes slow because of the arbitrary nature of the schedule, which doesn’t take into account the geographical distances or network latencies between BPs. This is part of the reason producing BPs occasionally miss some blocks at the beginning of their six-second cycles.

Proposed Stability Solutions
The LiquidEOS team recently created a working group to evaluate these issues and has contributed two solutions that could greatly reduce them if adopted by the block producers.
The Watchdog: https://github.com/bancorprotocol/eos-bp-watchdog
This is a simple but important tool that limits the damage to the mainnet when a BP stops producing. A block producer who runs the Watchdog is automatically removed from the schedule if the tool detects a problem with their block production.
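To make the idea concrete, here is a minimal Python sketch of the kind of check such a watchdog could perform. It is not the code from the eos-bp-watchdog repository. The class name `Watchdog`, the `observe` method, the `max_missed_rounds` threshold and the injected `unregister` callback are all hypothetical names chosen for illustration. The sketch assumes the monitored BP is in the active schedule, and that a round consists of 21 producers each producing 12 blocks at half-second intervals.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumed EOS mainnet timing: 12 blocks of 0.5 s per producer turn
# (the six-second cycle), with 21 active producers per round.
BLOCK_INTERVAL_S = 0.5
BLOCKS_PER_TURN = 12
ACTIVE_PRODUCERS = 21
ROUND_S = BLOCK_INTERVAL_S * BLOCKS_PER_TURN * ACTIVE_PRODUCERS


@dataclass
class Watchdog:
    """Hypothetical watchdog: unregisters a BP that has stopped producing."""

    producer: str
    # Callback that removes the producer from the schedule (an assumption:
    # in practice this would push an unregister action to the chain).
    unregister: Callable[[str], None]
    max_missed_rounds: int = 1
    last_produced_at: Optional[float] = None
    tripped: bool = False

    def observe(self, head_producer: str, now: float) -> bool:
        """Record who produced the head block; return True if we unregistered."""
        if head_producer == self.producer:
            # Our BP is producing: refresh the timestamp and re-arm.
            self.last_produced_at = now
            self.tripped = False
            return False
        if self.last_produced_at is None:
            # First observation: start the clock from here.
            self.last_produced_at = now
            return False
        silent_for = now - self.last_produced_at
        if not self.tripped and silent_for > self.max_missed_rounds * ROUND_S:
            # More than a full round has passed without a block from our BP.
            self.tripped = True
            self.unregister(self.producer)
            return True
        return False


if __name__ == "__main__":
    removed = []
    dog = Watchdog(producer="examplebp111", unregister=removed.append)
    dog.observe("examplebp111", now=0.0)   # our BP produced a block
    dog.observe("otherbp11111", now=60.0)  # still within one round
    dog.observe("otherbp11111", now=130.0) # over one round of silence
    print(removed)
```

In a live setup, `observe` would be fed from periodic polling of a node, and `unregister` would submit the transaction that takes the BP out of the schedule. Both are left out here so the decision logic stays self-contained.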
Once the problem is fixed, the block producer can re-register themselves and return to the schedule, all with minimal impact on the continuity of the c...