Today we finished testing the VAT's join functionality. It turns out the bug I described yesterday wasn't really a problem. I just forgot to do a "git pull". Great...
It works really well! It can join any entities in the database together. A few things are still missing, such as the actions embedded in the hits, but we doubt we'll have any trouble getting those working soon.
We need to start thinking about how to build the UAT (User Affinity Tool), which will be responsible for augmenting their recommender engine. Basically, this means finding a way to produce meaningful information for their Elasticsearch queries quickly enough that it doesn't impede the user experience. That may mean denormalizing data: storing affinity scores that can be read from one spot on each hit, rather than querying a user's entire history for every new hit. In any case, we'll need access to the internal user information. Right now that means tapping into their DNN system, which is the CMS they use.
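To make the denormalization idea concrete, here is a minimal sketch in Python. It is not the UAT design, just an illustration: the class name, the idea of scoring by tag, and the weights are all assumptions made up for the example. The point is that each hit updates a small per-user score table once, so a later lookup reads one spot instead of replaying the user's whole history.

```python
from collections import defaultdict


class AffinityStore:
    """Hypothetical in-memory store of denormalized affinity scores.

    A real system would keep these in a database or in Elasticsearch;
    a dict is used here only to keep the sketch self-contained.
    """

    def __init__(self):
        # user_id -> tag -> running score
        self._scores = defaultdict(lambda: defaultdict(float))

    def record_hit(self, user_id, tags, weight=1.0):
        """Update the stored scores once, at write time, for a single hit."""
        for tag in tags:
            self._scores[user_id][tag] += weight

    def top_affinities(self, user_id, n=3):
        """Read the precomputed scores; no history scan is needed."""
        scores = self._scores.get(user_id, {})
        # Highest score first; ties are broken alphabetically by tag.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:n]


def affinities_from_history(history, n=3):
    """The non-denormalized alternative: rescan every past hit per query."""
    totals = defaultdict(float)
    for tags, weight in history:
        for tag in tags:
            totals[tag] += weight
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


store = AffinityStore()
history = [(["python", "aws"], 1.0), (["aws"], 1.0), (["search"], 0.5)]
for tags, weight in history:
    store.record_hit("user-1", tags, weight)

print(store.top_affinities("user-1"))
```

Both approaches give the same ranking. The difference is where the work happens: the store pays a small cost on every write, while the history scan pays a growing cost on every read.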
We think we'll design this in an API style so that our code never accesses their database (SQL Server in this case) directly. Then, if they change their CMS in the future, they just need to make sure the API can reach the new system and return the user info to us in the same format we developed against. It will likely be a very simple API, so we think we'll use AWS Lambda. This may be its time to shine. It's great for simple APIs that are called business-to-business (not user-facing), because you know who's calling it, so there's much less need for elaborate authentication or rate limiting.
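As a rough idea of how small such an API could be, here is a sketch of a single Lambda function in Python. It assumes the function sits behind API Gateway's Lambda proxy integration (which supplies `pathParameters` and expects a `statusCode`/`body` response). The `fetch_user` helper, the field names, and the in-memory user table are hypothetical stand-ins for whatever actually talks to the CMS.

```python
import json

# Hypothetical stand-in for the CMS user data. In a real deployment this
# lookup would query the CMS (or its database) instead of a dict.
_FAKE_CMS_USERS = {
    "42": {"user_id": "42", "display_name": "Example User", "roles": ["member"]},
}


def fetch_user(user_id):
    """Return the user record, or None if the CMS has no such user."""
    return _FAKE_CMS_USERS.get(user_id)


def _response(status, payload):
    """Build a response in the shape API Gateway proxy integration expects."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def lambda_handler(event, context):
    """Entry point: look up one user by the user_id path parameter."""
    params = event.get("pathParameters") or {}
    user_id = params.get("user_id")
    if not user_id:
        return _response(400, {"error": "user_id is required"})

    user = fetch_user(user_id)
    if user is None:
        return _response(404, {"error": "user not found"})
    return _response(200, user)
```

Because callers only ever see the JSON shape returned by `lambda_handler`, swapping the CMS later would only mean rewriting `fetch_user`.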
Something this simple can fit into one function. Lambda can even be economical because AWS charges per 100ms of your function's execution time, so you don't need to keep an EC2 instance running 24/7. If your function is hit so often that you're incurring charges constantly, it might make sense to just roll an instance instead. We'll be crunching those numbers too, but we were told that AWS fees really aren't a big concern for us, so we're not too worried. If anything, I'd love to use Lambda because it'll give me some practice building systems with a serverless architecture.
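The number crunching could look something like the sketch below. It follows the billing model described above (execution time rounded up to 100ms blocks) and deliberately leaves out free tiers, memory-size pricing, and data transfer. All prices are parameters, and the values in the example call are placeholders chosen for easy arithmetic, not real AWS rates, so plug in current pricing before drawing conclusions.

```python
import math


def lambda_cost_per_request(avg_ms, price_per_100ms, price_per_request=0.0):
    """Cost of one invocation, with duration rounded up to 100ms blocks."""
    billed_blocks = math.ceil(avg_ms / 100)
    return billed_blocks * price_per_100ms + price_per_request


def lambda_monthly_cost(requests, avg_ms, price_per_100ms, price_per_request=0.0):
    """Total monthly cost for a given number of invocations."""
    return requests * lambda_cost_per_request(avg_ms, price_per_100ms, price_per_request)


def instance_monthly_cost(hourly_price, hours=730):
    """Cost of keeping one instance running all month (about 730 hours)."""
    return hourly_price * hours


def break_even_requests(avg_ms, price_per_100ms, hourly_price,
                        price_per_request=0.0, hours=730):
    """Smallest monthly request count at which the always-on instance
    costs no more than paying per invocation."""
    per_request = lambda_cost_per_request(avg_ms, price_per_100ms, price_per_request)
    return math.ceil(instance_monthly_cost(hourly_price, hours) / per_request)


# Placeholder prices only: substitute real rates from the AWS pricing pages.
requests_needed = break_even_requests(
    avg_ms=250, price_per_100ms=1, hourly_price=10, hours=100
)
print(requests_needed)
```

Below the break-even count, paying per invocation is cheaper. Above it, the always-on instance starts to win, ignoring the operational effort of maintaining it.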
I'm excited for the UAT portion of the project because I get to see how one goes about connecting various web services together and putting machine learning techniques and Elasticsearch's capabilities to work. These are definitely skills I can take with me into future jobs and my own projects.
Note: This was originally posted on the blog I used for my co-op term while at Seneca College (mswelke.wordpress.com) before being imported here.