At Anzyz, we do automated tagging of textual data. This tagging is made possible by our data analysts, who fine-tune the results of a self-learning, proprietary machine-learning algorithm and store them as small packets of data in JSON, a popular, human-readable format.
These packets can be used independently of the machine-learning (ML) system, which makes our system portable. This has many advantages; a few important ones are listed below:
GDPR issues: Let us address the elephant in the room. GDPR is a huge issue for any company that deals with text data. We tackle this problem because the resulting data is small (our data analysts summarised 27 GB of data into just 3 KB of contextual data) and human-readable, so we can inspect it and confirm that it contains no identifiable data.
Simple Architecture: The data is small, so we can keep a copy in RAM without having to juggle data between RAM and persistent storage. Also, since we can use the built-in JSON libraries available in most languages, we don't have to download huge machine-learning libraries.
Scaling Services: Since we do not have to use huge ML libraries, we can keep the container image size down to a few MB, instead of the 1 GB-plus behemoth it would have been otherwise. This means that with an orchestration system such as Kubernetes, scaling becomes easy and fast.
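To make the "Simple Architecture" point concrete, here is a minimal Python sketch that uses only the standard-library json module. The packet layout (a tag name mapped to a list of trigger phrases), the function names and the plain substring matching are all assumptions made for illustration; the actual Anzyz packet schema and tagging logic are not described in this article.

```python
import json

# Hypothetical packet layout: tag name -> list of trigger phrases.
# This is an illustrative assumption, not the real Anzyz schema.
PACKET_JSON = """
{
  "billing": ["invoice", "refund", "overcharged"],
  "delivery": ["late", "courier", "tracking number"]
}
"""


def load_packet(raw: str) -> dict[str, list[str]]:
    """Parse the packet once; the resulting dict simply stays in RAM."""
    return json.loads(raw)


def tag_text(text: str, packet: dict[str, list[str]]) -> list[str]:
    """Return the sorted tags whose phrases occur in the text (case-insensitive)."""
    lowered = text.lower()
    return sorted(
        tag
        for tag, phrases in packet.items()
        if any(phrase.lower() in lowered for phrase in phrases)
    )


packet = load_packet(PACKET_JSON)
print(tag_text("My invoice was wrong and the courier was late.", packet))
# prints ['billing', 'delivery']
```

Substring matching is a deliberate simplification (for example, "late" would also match inside "plate"), but it shows the shape of the idea: the tagging service needs nothing beyond a JSON parser and a small in-memory dictionary.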
Because the system is portable, Anzyz tech can be used in scenarios where a slower, low-powered machine has to do automated tagging of data, or where there is a very high inflow of data and the system cannot afford to fail.