3.1 Testing for Function and Robustness

Criteria To Assess

In my success criteria, I had 4 criteria which were related to the function and robustness of the project, three of which were in relation to the node software and one the webportal (website). Since development of the project has now halted, it is time to see how the project fits or doesn't fit into these criteria.

Criterion
Description

14

The node software should not crash.

15

The software should be capable of receiving multiple messages per second.

16

The software should attempt to retry any failed processes (such as loading a file) and should only exit after multiple failed attempts.

24

The web portal should run without crashing under any circumstance

Criterion 14

Although the software does not crash in most circumstances, under a large load it will still crash and since part of the testing involved putting the software under considerable load for a long period of time and it routinely crashed, this criterion has sadly not been met.

This is due to the software running directly on the cpu it is operating on and VLang not having an overhead, therefore if the software crashes it cannot be recovered so although most errors are handled internally through my code, some errors are outside of the scope of the program and cannot be saved once they occur. This effectively means that the only way to stop the program crashing with these kind of errors would be to prevent them from happening, instead of the traditional method to "wrap" the parts of the program that have these crashes with a error handling function since that doesn't exist in VLang.

General usage demo shown running without problems

More evidence for these crashes being a combined cause of both my code and VLang itself is how it behaves differently depending on the operating system it runs on. When using macOS the software requires a pretty significant load of multiple messages per second for 10-20 seconds, generating processes with ~4 threads running at any given time (although these are "green" threads that are based upon software and don't run in a separate hardware thread each). However, when using Linux the software crashes every time that it tries to send a web-socket message which shows that VLang is not as consistent as it would ideally be. Obviously some variations would be expected and, once again, optimising/refactoring my code and how it handles multiple threads would likely increase the stability of the software, but the fact that the exact same code compiled natively on different systems can run so fundamentally differently, without any form of compiler errors or warnings goes to show how VLang still has a long way to mature. Evidence for these claims can be found in the video below.

A demo showcasing how Vlang performs differently depending on different systems

Criterion 15

The testing for this criterion involved setting up a basic js script that simulated the api requests being made by the dashboard to the broadcast route so as to produce messages, this sent the message "jumagi" paired with the current epoch timestamp to send messages once to twice per second for a prolonged period of time.

This was completed using two separate nodes setup on a local network such that no external issues could be the cause of any issues. This resulted in the nodes successfully communicating for approximately 10-20 seconds before the one that sent the messages (no matter which node was used) crashing with a panic from within the built in "database" module. This means that the issue was probably caused by the Vlang database module not expecting to having to get and send messages multiple times per second and panicking because of that.

The below images show the dashboard during the test and the error message.

The evidence from a prolonged test in cycle 14

This shows that this criterion was not met, and it is once again, partially due to Vlang's limitations as a language and how my code was written.

Criterion 16

In the demo video below the configuration file is renamed to 'hide' it from the program such that the software wishes to regenerate this file upon startup. The software is then run so that it asks the user if they want to generate a new configuration file, at which point the user is prompted with an input field.

The video then proceeds to show two scenarios:

  1. [0:12] The first where the user inputs an incorrect response every time they are prompted - 6 attempts - until the system exits since it can be determined that either something is wrong with the input field or the user may wish to not answer the prompt;

  2. [0:32] The second where the user inputs an incorrect piece of data once and then inputs a correct piece of data, showing failing a process and then passing it on an alternative attempt does not cause any issues and allows the program to continue as if nothing happened. Following this up, the video then shows the file generated in response to this and how it is valid.

The second half of the video simply shows the deletion of this newly generated file and replacement of it with the old config file through renaming the original file back to config.json to show that this also allows for config files to be loaded in and out easily without causing any issues. This is not necessarily relevant for this criterion but is still a useful feature.

This clearly shows an example of the program reattempting a failed process multiple times until it has either succeeded or failed enough times for the program to expect it not to succeed without user intervention. This demo clearly passes this criterion and since most other parts of the program that can fail in such a way in which the program can handle them itself are also given this treatment then the criterion is met.

Criterion 24

Since the initialisation of the website on 'Vercel' (a website hosting service) in cycle 1 of the development, the web portal has never crashed publicly. This is due to using a 'linting' service 'Eslint' which scans all the code for the site before it is pushed to the public server and does not allow the website to update unless it is all valid and passes all required tests.

Failed build logs on Vercel

This can be seen above through the shown 'failed' deployments which were caught by this 'linting' service and prevents them from being pushed to the public website server. For reference I have also included the past 7 build deployments, all of which have passed and were then pushed to the public server.

To prove this I have included a video below of me attempting to crash the webportal by interacting with it, which obviously doesn't phase the site.

A graph of the requests the webportal has received in the past 7 days.
A graph of the unique visitors the webportal has received in the past 7 days

To prove that the webportal has never crashed I have also included some of the traffic data over the last 7 days to show that it has a steady stream of users visiting the site and interacting with it.

All of this evidence shows that the webportal has never crashed past an extent that couldn't be solved simply by refreshing the page and hence it has passed this criterion.

Last updated