The Web POS supports offline operation through several HTML5 technologies that allow it to persistently store both masterdata and transactional data. Offline support is essential for most customers because sales operations can never be stopped and must happen immediately. The Web POS should therefore handle situations such as loss of Internet connectivity or backend server issues in such a way that the cashier never loses the ability to operate the application and register new sales tickets.
However, the current implementation of offline mode has been problematic in some cases. After analysing experiences of real customers, and comparing their feedback with our internal experiences, we have concluded the following:
- The Web POS works correctly when the connection is available at all times.
- The Web POS also works correctly when the connection to the backend server is completely lost. In this case, the Web POS accurately reports the event, and the user can still register new sales without issues.
- The Web POS, however, does not work correctly when the connection is intermittent. This happens at customers with poor connectivity due to weak Wi-Fi or 3G reception, and at customers whose stores have unreliable connectivity. When the connection is intermittent, the behavior of the Web POS is confusing and inefficient for cashiers.
- The Web POS also does not work correctly when the server is experiencing difficulties. In some cases, when the server is struggling, requests sometimes succeed and sometimes fail, which is similar to the intermittent network case. In other cases the server does not respond at all, and every request made from the Web POS has to wait for its timeout, making the Web POS very unresponsive.
This project aims to carry out the refactoring needed to prevent the last two cases, and to ensure that the Web POS application is always available and ready to register new sales in all possible circumstances.
Causes of the current problems
The main reason the Web POS doesn't behave correctly in a partially or intermittently offline state is that it treats every remote request as fully independent. Every remote request is always executed, even if the previous ones didn't succeed. This has several bad consequences:
- In complex processes which rely on multiple requests succeeding (such as the login process), the failure in an intermediate request is currently not being handled properly, which means that these processes fail in the event of a connectivity failure in the middle of the process.
- The user experience when the connection comes and goes is very poor. Because the Web POS always attempts every request, the connection may be reported as "lost" while synchronizing a ticket and then as "recovered" after the next request, which may happen just one second later. The user ends up seeing the connection "lost" and "recovered" several times within a few seconds.
- Because of this confusing behavior, users often force a refresh of the page. This exacerbates the problem: a refresh while the server is overloaded or the connection is poor is likely to fail, since login and the Web POS reload both depend on multiple requests succeeding (see the first point).
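The "lost"/"recovered" flapping described above can be illustrated with a small model. This is only a sketch: it assumes the displayed status simply mirrors the outcome of the most recent request, and the function name `countStatusFlips` is hypothetical rather than part of the Web POS code.

```typescript
// Hypothetical model: each entry is the outcome of one remote request,
// true = succeeded, false = failed (timeout, connection loss, server error).
function countStatusFlips(requestOutcomes: boolean[]): number {
  let flips = 0;
  let online = true; // assumption: the session starts online
  for (const succeeded of requestOutcomes) {
    // The displayed status follows the last request, so every change in
    // outcome becomes a visible "lost" or "recovered" event for the cashier.
    if (succeeded !== online) {
      online = succeeded;
      flips += 1;
    }
  }
  return flips;
}
```

For five requests that alternate success, failure, success, failure, success, the function returns 4: the cashier sees four status changes even though the requests may span only a few seconds.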
The main change we propose is in the way we manage the offline state. Currently, the Web POS is aware of the current state (as determined by the last request) and even shows it in the UI, although it is somewhat hidden in the menu.
This state, however, is only used to inform the user. Every remote request is independent and is always triggered.
Specifically, we propose global management of the online/offline status, based on the following fundamental ideas:
- The Web POS will, as it does today, store the current online/offline status.
- This status will initially depend on how the login was done (an offline login means the initial status is offline; otherwise the initial status is online).
- The Web POS transitions to offline or to online in a controlled, managed way.
- The transition to offline will occur automatically (once the application detects that it is offline).
- The transition to online can also occur automatically, but the requirements for coming back online will be more demanding than they are today. The automatic transition attempt starts right after the application goes offline and tries, during a defined default time, to determine whether the application can move to online. If the connection is still not stable after this time, the transition time is increased (up to a configurable maximum) to make sure that the application is really ready to move to online.
- The user will have a way to force an attempt to transition to online. This forced attempt is a short and not fully reliable transition. Its purpose is to let the user proceed in a situation where a request to the backend is urgently needed. If the forced transition fails, the default transition-to-online attempt starts.
- Remote requests will not be launched while the application is in offline status. Instead, they will immediately be considered "failed" from a technical point of view, with the same consequences as a technical failure due to timeout, connection loss or server unresponsiveness.
- The criteria to transition to offline status should be configurable.
- The current proposal is for the transition to offline to happen the moment a remote request fails due to a timeout or any other reason. However, some customers may want a higher number of remote requests to fail before offline mode is triggered.
- The criteria to transition to online status should be configurable, complete and extensible.
- The interval between attempts to transition to online status should be configurable.
- The transition to online should only happen when both network and server are stable again. Therefore, the criteria to transition to online should check both sides:
- Several requests will be tried against the server, with a demanding timeout. The purpose of these requests is to ensure that both the server and the network are stable over some period of time.
- Additionally, the server should be able to indicate in its response to these requests that it is struggling because its activity is too high, using the same mechanism as the recently developed concurrent access control.
- Furthermore, it would be nice to have a way for customers to be able to add their own backend-side requirements to verify the responsiveness of every sub-system that their custom environments may have.
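The core of the ideas above (a single stored status, requests that fail immediately while offline, and a configurable failure threshold) could be sketched as follows. Every name here (`StatusManager`, `run`, `markOnline`, `failuresBeforeOffline`) is hypothetical and not taken from the Web POS code. Requests are modelled as synchronous functions that throw on failure purely to keep the sketch short; real remote requests would be asynchronous.

```typescript
// Hypothetical names throughout; this is not the actual Web POS API.
type PosStatus = "online" | "offline";

type RemoteResult<T> =
  | { ok: true; value: T }
  | { ok: false; reason: "offline" | "failed" };

class StatusManager {
  private status: PosStatus;
  private consecutiveFailures = 0;
  private readonly failuresBeforeOffline: number;

  // The initial status depends on how the login was done. The default
  // threshold of 1 means that a single failed request triggers offline mode.
  constructor(loggedInOffline: boolean, failuresBeforeOffline = 1) {
    this.status = loggedInOffline ? "offline" : "online";
    this.failuresBeforeOffline = failuresBeforeOffline;
  }

  getStatus(): PosStatus {
    return this.status;
  }

  // Runs a request only while online. While offline the request is not
  // launched at all and is reported as failed without waiting for a timeout.
  run<T>(request: () => T): RemoteResult<T> {
    if (this.status === "offline") {
      return { ok: false, reason: "offline" };
    }
    try {
      const value = request();
      this.consecutiveFailures = 0;
      return { ok: true, value };
    } catch {
      this.consecutiveFailures += 1;
      if (this.consecutiveFailures >= this.failuresBeforeOffline) {
        this.status = "offline";
      }
      return { ok: false, reason: "failed" };
    }
  }

  // Intended to be called only by the controlled transition-to-online
  // process, never as a side effect of one ordinary request succeeding.
  markOnline(): void {
    this.status = "online";
    this.consecutiveFailures = 0;
  }
}
```

The design choice worth noting is that a successful request never moves the application back to online by itself. Only `markOnline` does, which is what removes the rapid "lost"/"recovered" switching.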
These changes need to be combined with some changes in the way complex processes with multiple requests (specifically the login process) are done, so that they always follow the same pattern:
- The main online version of the process should still require all remote requests to succeed for the process to be considered successful.
- However, every potential request failure should be correctly handled, and should initiate the "fallback" (offline) version of the process, so that the user is never stuck in a wrong, intermediate state which is neither offline nor online.
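The pattern described above might look roughly like this. It is a minimal sketch under assumptions: `LoginStep`, `LoginOutcome` and `runLogin` are hypothetical names, and each step is modelled as a synchronous function that throws on failure, whereas the real login requests would be asynchronous.

```typescript
// Hypothetical types; not the actual Web POS login implementation.
type LoginStep = { name: string; run: () => void }; // run throws on failure

type LoginOutcome =
  | { mode: "online" }
  | { mode: "offline"; failedStep: string };

function runLogin(steps: LoginStep[], offlineFallback: () => void): LoginOutcome {
  for (const step of steps) {
    try {
      step.run();
    } catch {
      // Any failure abandons the online path and starts the offline version
      // of the process, so the user never stays in a half-completed state.
      offlineFallback();
      return { mode: "offline", failedStep: step.name };
    }
  }
  // The online version succeeds only if every request succeeded.
  return { mode: "online" };
}
```

Remaining steps are skipped as soon as one fails, so the process always ends in exactly one of the two controlled states.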
- Should a single request failure always trigger the transition to offline?
- This is the "safest" approach if we want to ensure that the user always arrives at a controlled state.
- It risks causing many annoying false positives for current customers who sometimes have trouble with some of our timeouts.
- We would need to review the timeouts and make some of them less "aggressive", which would sometimes cause a loss of responsiveness.
Conclusion: This will be configurable via a preference, as it depends on the customer's situation and expectations. By default, every request failure will move the Web POS to offline.
- Is runSyncProcess success enough to consider transitioning to online?
- Maybe there are no models to synchronize, so it would succeed very quickly?
- Proposal from support to make it extensible, so that customers can add their own additional requirements
Conclusion: By default, the transition to online will be a process that requires several requests, triggered over a longer period of time. It will also be configurable.
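One possible shape for this transition check is sketched below. All names (`ProbeResult`, `OnlineTransitionConfig`, `ExtraCheck`, `isStable`, `windowForAttempt`) are hypothetical. Doubling the transition time after each failed attempt is only an assumption; the proposal says that the time is increased up to a maximum without fixing the growth rule.

```typescript
// All names are hypothetical; the document does not define this API.
type ProbeResult = {
  responded: boolean;        // false on timeout or connection loss
  elapsedMs: number;         // how long the probe request took
  serverOverloaded: boolean; // server reported that it is struggling
};

interface OnlineTransitionConfig {
  probesRequired: number;  // consecutive good probes needed
  probeTimeoutMs: number;  // "demanding" limit for each probe
  initialWindowMs: number; // default transition time
  maxWindowMs: number;     // maximum transition time
}

// A customer-supplied backend-side requirement, e.g. a sub-system health check.
type ExtraCheck = () => boolean;

// Online is allowed only if every probe is fast and healthy and every
// additional customer check passes. The first bad probe aborts the attempt.
function isStable(
  probe: () => ProbeResult,
  cfg: OnlineTransitionConfig,
  extraChecks: ExtraCheck[]
): boolean {
  for (let i = 0; i < cfg.probesRequired; i++) {
    const r = probe();
    if (!r.responded || r.serverOverloaded || r.elapsedMs > cfg.probeTimeoutMs) {
      return false;
    }
  }
  return extraChecks.every((check) => check());
}

// Transition time for the n-th attempt (0-based): assumed to double after
// each failed attempt, capped at the configured maximum.
function windowForAttempt(attempt: number, cfg: OnlineTransitionConfig): number {
  return Math.min(cfg.initialWindowMs * 2 ** attempt, cfg.maxWindowMs);
}
```

With an initial time of 10000 ms and a maximum of 50000 ms, attempts 0, 1 and 3 get 10000, 20000 and 50000 ms respectively. Keeping `extraChecks` as a plain list of functions is one simple way to provide the extensibility requested by support.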