Hi! I’m a Java developer working on a sales data project that needs to process a few hundred invoices a day.
I'm developing a Java application that cleans the item descriptions. After that, I run some amount checks to identify outliers, then split each invoice by item and store the result in a bucket as a JSON file. This file will later be used to aggregate the mean and median amount per item description for further amount checks.
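To make the aggregation step concrete, here is a minimal sketch of computing the mean and median amount per item description. It is only an illustration under assumptions: the `ItemStats` class and the `ItemLine` and `Stats` records are hypothetical names, the JSON parsing is left out, all amounts are assumed to fit in memory, and Java 16+ is assumed for records.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ItemStats {

    // Hypothetical shape of one invoice line after the description is cleaned.
    record ItemLine(String description, double amount) {}

    // Mean and median amount for a single item description.
    record Stats(double mean, double median) {}

    // Arithmetic mean; NaN for an empty list.
    static double mean(List<Double> values) {
        return values.stream().mapToDouble(Double::doubleValue).average().orElse(Double.NaN);
    }

    // Median of a copy of the list; NaN for an empty list.
    static double median(List<Double> values) {
        if (values.isEmpty()) {
            return Double.NaN;
        }
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int mid = sorted.size() / 2;
        return sorted.size() % 2 == 1
                ? sorted.get(mid)
                : (sorted.get(mid - 1) + sorted.get(mid)) / 2.0;
    }

    // Groups the lines by description, then computes the stats per group.
    static Map<String, Stats> aggregate(List<ItemLine> lines) {
        Map<String, List<Double>> byDescription = new TreeMap<>();
        for (ItemLine line : lines) {
            byDescription
                    .computeIfAbsent(line.description(), k -> new ArrayList<>())
                    .add(line.amount());
        }
        Map<String, Stats> result = new TreeMap<>();
        byDescription.forEach((description, amounts) ->
                result.put(description, new Stats(mean(amounts), median(amounts))));
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data, not real invoice amounts.
        List<ItemLine> lines = List.of(
                new ItemLine("widget", 10.0),
                new ItemLine("widget", 12.0),
                new ItemLine("widget", 50.0),
                new ItemLine("gasket", 3.0),
                new ItemLine("gasket", 5.0));
        aggregate(lines).forEach((description, stats) ->
                System.out.println(description + " -> " + stats));
    }
}
```

The median is kept alongside the mean because a single outlier (the 50.0 in the made-up "widget" data) pulls the mean far more than the median, which is what makes the pair useful for later amount checks.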
I also save the invoice metadata as JSON so I can compare inbound with outbound quantities.
The challenge I’m facing is finding the best practices for storing and then processing this data, which I expect to keep growing over the years.
Any thoughts and suggestions are much appreciated, and if this kind of question doesn’t belong here, please delete it.